University of Maryland University College STAT200  Assignment #1: Descriptive Statistics Data Analysis Plan
Scenario: Please write a few lines describing your scenario and the four variables (in addition to income) you have selected.
I selected a 33-year-old married head of household with no children, as that describes me. For the socioeconomic variables, I chose marital status and age of the head of household. Marital status has only two options, which allows for clear-cut comparisons, even in relation to other variables. I chose age of the head of household because its values vary widely across the sample. For the expenditure variables, I chose annual expenditures and housing because both cover a large range of values in the sample.
Use Table 1 to report the variables selected for this assignment. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative)
Variable 1: “Income” | Annual household income in USD. | Quantitative
Variable 2: “MaritalStatus” | Marital Status of Head of Household | Qualitative
Variable 3: “AgeHeadHousehold” | Age of the Head of Household | Quantitative
Variable 4: “AnnualExpenditures” | Total Amount of Annual Expenditures | Quantitative
Variable 5: “Housing” | Total Amount of Annual Expenditure on Housing | Quantitative
Reason(s) for Selecting the Variables and Expected Outcome(s):
1. Variable 1: “Income”: It was required by the assignment.
2. Variable 2: “Marital Status”: Having only two options allows for clear-cut comparisons, even in relation to other variables.
3. Variable 3: “Age of Head of Household”: Its values cover a large range across the sample.
4. Variable 4: “Annual Expenditures”: Its values cover a large range across the sample.
5. Variable 5: “Housing”: Its values cover a large range across the sample.
Data Set Description:
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Complete Table 2. Numerical Summaries of the Selected Variables and briefly explain why you chose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate
Variable 1: “Income” | Median; Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 2: “Marital Status” | Mode | I am using mode for the following reason: 1. Because the data is nominal, the median and mean cannot be determined, so the mode is the only applicable measure of central tendency.
Variable 3: “Age of Head of Household” | Mean; Median; Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 4: “Annual Expenditures” | Mean; Median; Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 5: “Housing Expenditures” | Mean; Median; Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
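As a rough illustration of the measures named in Table 2, the sketch below computes the mean, median, sample standard deviation, and mode using Python's standard `statistics` module. The values and the helper name `summarize_quantitative` are made up for this example and are not taken from the assignment data set.

```python
import statistics

# Hypothetical values for illustration only; NOT from the assignment data set.
income = [52000, 61000, 58000, 97000, 60500, 250000]
marital_status = ["Married", "Not Married", "Married",
                  "Married", "Not Married", "Married"]


def summarize_quantitative(values):
    """Return the mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # stdev() divides by n - 1, which is the sample standard deviation.
        "sample_sd": statistics.stdev(values),
    }


income_summary = summarize_quantitative(income)

# For nominal data such as marital status, only the mode is meaningful.
marital_mode = statistics.mode(marital_status)

print(income_summary)
print(marital_mode)
```

In this made-up sample, the single very large income pulls the mean well above the median, which is the situation the rationale describes when it prefers the median for data with outliers.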
Graphs and/or Tables
Complete Table 3. Type of Graphs and/or Table for Selected Variables and briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 3. Type of Graphs and/or Tables for Selected Variables
Variable Name | Graph and/or Table | Rationale for Why Appropriate
Variable 1: “Income” | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the shape of the distribution of quantitative data, including whether it is approximately normal or skewed.
Variable 2: “Marital Status” | Graph: I will use a pie chart. | A pie chart is appropriate for qualitative data. It is useful here because there are only two categories and the data can be presented as percentages.
Variable 3: “Age of Head of Household” | Graph: I will use a box plot. | A box plot is useful for summarizing large amounts of data and helps to identify any outliers in the data.
Variable 4: “Annual Expenditures” | Graph: I will use a histogram. | A histogram shows the shape of the distribution of quantitative data, including whether it is approximately normal, and is easier to read than a frequency table.
Variable 5: “Housing Expenditures” | Graph: I will use a histogram. | A histogram shows the shape of the distribution of quantitative data, including whether it is approximately normal, and is easier to read than a frequency table.
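To make the rationale in Table 3 more concrete, the sketch below computes the numbers that a box plot and a pie chart would display: the five-number summary with outliers flagged by the common 1.5 × IQR rule, and the percentage share of each category. The values and the function names `box_plot_summary` and `category_percentages` are hypothetical, and the quartile method (`"inclusive"`) is one of several conventions, so other software may report slightly different quartiles.

```python
import statistics
from collections import Counter

# Hypothetical values for illustration only; NOT from the assignment data set.
ages = [24, 29, 33, 35, 38, 41, 44, 47, 52, 95]
marital_status = ["Married", "Not Married", "Married", "Married"]


def box_plot_summary(values):
    """Five-number summary plus outliers flagged by the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "outliers": outliers,
    }


def category_percentages(labels):
    """Percentage share of each category, as a pie chart would show it."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * count / total for label, count in counts.items()}


age_summary = box_plot_summary(ages)
status_shares = category_percentages(marital_status)

print(age_summary)
print(status_shares)
```

With these made-up ages, the value 95 falls above the upper fence and is flagged as an outlier, which is the kind of point a box plot draws separately from the whiskers.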
