University of Maryland University College
STAT200 - Assignment #1: Descriptive Statistics Data Analysis Plan
Identifying Information
Student (Full Name):
Class:
Instructor:
Date:
Scenario: Please write a few lines describing your scenario and the four variables (in addition to income) you have selected.
I selected a 33-year-old married head of household with no children, as that describes me. For the socioeconomic variables, I chose marital status and age of the head of household. Marital status has only two categories, which allows clear-cut comparisons, including against the other variables. I chose age of the head of household because its values vary widely across the sample. For the expenditure variables, I chose annual expenditures and housing because both span a wide range of values in the sample.
Use Table 1 to report the variables selected for this assignment. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative)
Variable 1: “Income” | Annual household income in USD. | Quantitative
Variable 2: “MaritalStatus” | Marital Status of Head of Household | Qualitative
Variable 3: “AgeHeadHousehold” | Age of the Head of Household | Quantitative
Variable 4: “AnnualExpenditures” | Total Amount of Annual Expenditures | Quantitative
Variable 5: “Housing” | Total Amount of Annual Expenditure on Housing | Quantitative
Reason(s) for Selecting the Variables and Expected Outcome(s):
1. Variable 1: “Income” - Required by the assignment.
2. Variable 2: “Marital Status” - Marital status has only two categories, which allows clear-cut comparisons, including against the other variables.
3. Variable 3: “Age of Head of Household” - Its values vary widely across the sample.
4. Variable 4: “Annual Expenditures” - Its values span a wide range across the sample.
5. Variable 5: “Housing” - Its values span a wide range across the sample.
Data Set Description:
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Complete Table 2. Numerical Summaries of the Selected Variables and briefly explain why you choose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate
Variable 1: “Income” | Median, Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 2: “Marital Status” | Mode | I am using mode for the following reason: 1. Because the data is nominal, the median and mean cannot be determined.
Variable 3: “Age of Head of Household” | Mean, Median, Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 4: “Annual Expenditures” | Mean, Median, Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 5: “Housing Expenditures” | Mean, Median, Sample Standard Deviation | I am using mean for the following reasons: 1. It is the most commonly used measure of central tendency. 2. The variable is quantitative. I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
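The summaries named in Table 2 can be computed in a few lines of Python. This is a minimal sketch using only the standard library. The income and marital status values are hypothetical placeholders, not values from the course data set, and `summarize_quantitative` is an illustrative helper name.

```python
import statistics

# Hypothetical values for illustration only; NOT taken from the course data set.
income = [52000, 61000, 58000, 97000, 64000, 250000]
marital_status = ["Married", "Not Married", "Married", "Married", "Not Married", "Married"]


def summarize_quantitative(values):
    """Return the mean, median, and sample standard deviation (n - 1 denominator)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample_sd": statistics.stdev(values),
    }


income_summary = summarize_quantitative(income)

# The mode is the only one of these measures that applies to nominal data.
status_mode = statistics.mode(marital_status)

print(income_summary)
print(status_mode)
```

In this made-up sample the single large income pulls the mean well above the median, which is the situation the Income rationale describes when it prefers the median.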
Graphs and/or Tables
Complete Table 3. Type of Graphs and/or Table for Selected Variables and briefly explain why you choose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 3. Type of Graphs and/or Tables for Selected Variables
Variable Name | Graph and/or Table | Rationale for Why Appropriate
Variable 1: “Income” | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the shape of the distribution of quantitative data.
Variable 2: “Marital Status” | Graph: I will use a pie chart. | A pie chart is useful because there are only two categories and the data can be presented as percentages. A pie chart is appropriate for qualitative data.
Variable 3: “Age of Head of Household” | Graph: I will use a box plot. | A box plot is useful for large amounts of data and helps to identify any outliers in the data.
Variable 4: “Annual Expenditures” | Graph: I will use a histogram. | A histogram is useful because it shows the shape of the distribution of quantitative data and is easier to read than a frequency table.
Variable 5: “Housing Expenditures” | Graph: I will use a histogram. | A histogram is useful because it shows the shape of the distribution of quantitative data and is easier to read than a frequency table.
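A box plot flags outliers with a numeric rule that can be checked by hand. The sketch below applies the common 1.5 × IQR fences to a hypothetical list of ages (not values from the course data set). `iqr_outliers` is an illustrative helper name, and the quartile convention is whatever `statistics.quantiles` uses by default, so other software may report slightly different quartiles.

```python
import statistics

# Hypothetical ages for illustration only; NOT taken from the course data set.
ages = [24, 29, 33, 35, 38, 41, 44, 47, 52, 95]


def iqr_outliers(values):
    """Return (q1, median, q3, outliers) using the 1.5 * IQR fences."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, med, q3, outliers


q1, med, q3, outliers = iqr_outliers(ages)
print(q1, med, q3, outliers)
```

Values outside the fences are the points a box plot would draw individually beyond the whiskers.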
Scenario: Please write a few lines describing your scenario and the four variables (in addition to income) you have selected.
For my scenario, I will look at the difference in housing and food costs between single households and married households. Using this data, I can come up with a household budget plan. The question is: do single households spend more or less money on housing and food than married households? The four variables I have chosen are: Marital Status, Family Size, Housing, and Food.
Use Table 1 to report the variables selected for this assignment. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative)
Variable 1: “Income” | Annual household income in USD. | Quantitative
Variable 2: “MaritalStatus” | Marital Status of Head of Household. | Qualitative
Variable 3: “FamilySize” | Total Number of People in Family (Both Adults and Children) | Quantitative
Variable 4: “Housing” | Total Amount of Annual Expenditure on Housing | Quantitative
Variable 5: “Food” | Total Amount of Annual Expenditure on Food | Quantitative
Reason(s) for Selecting the Variables and Expected Outcome(s):
1. Variable 1: “Income” – This variable is chosen automatically because the budget is based on the household income.
2. Variable 2: “MaritalStatus” – I chose marital status because being married typically means that the family is living on two incomes.
3. Variable 3: “FamilySize” - I chose this variable because I would like to know whether larger family size is correlated with higher housing and food costs.
4. Variable 4: “Housing” - This variable is important to the budget because it is one of the largest expenses.
5. Variable 5: “Food” - I chose this variable because I believe it will be the determining factor in my scenario.
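The correlation mentioned for FamilySize can be quantified with the Pearson correlation coefficient. The following is a minimal sketch, assuming Pearson's r is the intended measure (the plan itself does not name one). The family sizes and food costs are hypothetical placeholders, not values from the data set, and `pearson_r` is an illustrative helper name.

```python
import math

# Hypothetical values for illustration only; NOT taken from the course data set.
family_size = [1, 2, 2, 3, 4, 5]
food = [3000, 4500, 5000, 6200, 7800, 9100]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(family_size, food)
print(round(r, 3))
```

A value of r near +1 would support the expectation that food costs rise with family size, while a value near 0 would indicate little linear relationship.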
Data Set Description: The data is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures.
Proposed Data Analysis: Using my chosen variables, I plan to use graphs and charts to examine whether housing and food expenses are higher in married households (which I assume are larger) than in single households (which I assume are smaller).
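The comparison proposed here amounts to computing a mean per marital status group. The sketch below shows one way to do that with the standard library. The records, the category labels "Married" and "Not Married", and the helper name `mean_by_group` are all assumptions made for illustration, not values or names confirmed by the data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for illustration only; NOT taken from the course data set.
households = [
    {"MaritalStatus": "Married", "Housing": 21000, "Food": 9000},
    {"MaritalStatus": "Married", "Housing": 19000, "Food": 8000},
    {"MaritalStatus": "Married", "Housing": 23000, "Food": 10000},
    {"MaritalStatus": "Not Married", "Housing": 14000, "Food": 5000},
    {"MaritalStatus": "Not Married", "Housing": 16000, "Food": 6000},
]


def mean_by_group(rows, group_key, value_key):
    """Return {group: mean of value_key} for each distinct value of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


housing_means = mean_by_group(households, "MaritalStatus", "Housing")
food_means = mean_by_group(households, "MaritalStatus", "Food")
print(housing_means)
print(food_means)
```

The same helper can group by "FamilySize" instead, which would give the per-size averages that a bar chart of cost against family size displays.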
Measures of Central Tendency and Dispersion
Complete Table 2. Numerical Summaries of the Selected Variables and briefly explain why you choose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 2. Numerical Summaries of the Selected Variables
Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate
Variable 1: “Income” | Median, Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative.
Variable 2: “MaritalStatus” | Mode | I chose mode as the measure of central tendency because it can be used with all levels of data and the variable is qualitative.
Variable 3: “FamilySize” | Mean, Variance | I am using mean as the measure of central tendency because I am looking for the average family size within single households versus married households. I am using variance as the measure of dispersion because it is calculated as the average squared deviation of each value from the mean of the data set.
Variable 4: “Housing” | Mean, Variance | I am using mean as the measure of central tendency because I am looking for the average housing cost within single households versus married households. I am using variance as the measure of dispersion because it is calculated as the average squared deviation of each value from the mean of the data set.
Variable 5: “Food” | Mean, Variance | I am using mean as the measure of central tendency because I am looking for the average food cost within single households versus married households. I am using variance as the measure of dispersion because it is calculated as the average squared deviation of each value from the mean of the data set.
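The description of variance as an average squared deviation matches the population formula exactly. For a sample, the usual convention divides by n - 1 instead of n. The sketch below shows both on a hypothetical list of family sizes (not values from the data set), along with the standard deviation, which is the square root of the variance and is expressed in the original units.

```python
import math
import statistics

# Hypothetical family sizes for illustration only; NOT taken from the course data set.
family_size = [1, 2, 2, 3, 4, 6]

mean_size = statistics.mean(family_size)
squared_deviations = [(x - mean_size) ** 2 for x in family_size]

# Population variance: average squared deviation (divide by n).
population_variance = sum(squared_deviations) / len(family_size)

# Sample variance: divide by n - 1, the usual choice when the data is a sample.
sample_variance = sum(squared_deviations) / (len(family_size) - 1)

# Sample standard deviation: same units as the data (people, dollars, ...).
sample_sd = math.sqrt(sample_variance)

print(population_variance, sample_variance, sample_sd)
```

Because the data set is described as a random sample, the n - 1 version (`statistics.variance` in the standard library) is the one that corresponds to the sample standard deviation used for Income.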
Graphs and/or Tables
Complete Table 3. Type of Graphs and/or Table for Selected Variables and briefly explain why you choose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.
Table 3. Type of Graphs and/or Tables for Selected Variables
Variable Name | Graph and/or Table | Rationale for Why Appropriate
Variable 1: “Income” | Graph: I will use a histogram to show the distribution of the data. | A histogram is one of the best plots for showing the shape of the distribution of quantitative data.
Variable 2: “MaritalStatus” | Graph: Pie chart. | I will use the pie chart to show the percentage of each marital status.
Variable 3: “FamilySize” | Graph: Bar chart. | I will use the bar chart to show how often each family size occurs.
Variable 4: “Housing” | Graph: Bar chart. | I will use the bar chart to show housing costs relative to family size.
Variable 5: “Food” | Graph: Bar chart. | I will use the bar chart to show food costs relative to family size.
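The pie chart and the family size bar chart are both drawn from simple tallies. The sketch below computes those tallies with the standard library. The lists, the category labels, and the helper name `percentages` are hypothetical and used for illustration only.

```python
from collections import Counter

# Hypothetical values for illustration only; NOT taken from the course data set.
marital_status = ["Married"] * 6 + ["Not Married"] * 4
family_size = [1, 2, 2, 3, 4, 2, 1, 4, 2, 3]


def percentages(values):
    """Return {category: percent of observations} for a list of categories."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * count / total for category, count in counts.items()}


# Pie chart input: share of households in each marital status category.
status_pct = percentages(marital_status)

# Bar chart input: number of households at each family size, in size order.
size_counts = dict(sorted(Counter(family_size).items()))

print(status_pct)
print(size_counts)
```

Each slice of the pie chart corresponds to one entry of `status_pct`, and each bar corresponds to one entry of `size_counts`.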