MAT 211 Chamberlain Statistics, Frequency Tables, Central Limit Theorem & Estimation Discussion

I’m working on a Statistics question and need guidance to help me study.

Discussion B: Frequency Tables

1. Initial Post (300+ words):

Think of some discrete random variable you observe on a regular basis. For example, it could be the (rounded) number of hours you sleep, how many gallons of gas are in your car when you get into it, how many boxes of cereal are in your house, how many days between grocery shopping, etc. (just make sure it takes only integer values). Try to list all of the possible values that this discrete random variable can take. If you can, collect some frequency data – give the relative frequency table and use this as an estimate of the probability distribution. Calculate the expected value and the standard deviation for this probability distribution. Interpret these parameters, and discuss whether they make sense based on your experience.
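As a minimal sketch of the calculation described above, the following Python builds a relative frequency table from raw observations and treats it as an estimated probability distribution. The data (rounded hours of sleep over 10 nights) and the function name `distribution_summary` are hypothetical and used only for illustration.

```python
from collections import Counter
from math import sqrt

def distribution_summary(observations):
    """Return (relative_freq, expected_value, std_dev) for integer observations."""
    counts = Counter(observations)
    n = len(observations)
    # Relative frequency of each value, used as an estimate of P(X = x)
    rel_freq = {x: c / n for x, c in sorted(counts.items())}
    # Expected value: sum of x * P(X = x)
    expected = sum(x * p for x, p in rel_freq.items())
    # Variance: sum of (x - mu)^2 * P(X = x); the standard deviation is its square root
    variance = sum((x - expected) ** 2 * p for x, p in rel_freq.items())
    return rel_freq, expected, sqrt(variance)

# Hypothetical data: rounded hours of sleep on 10 nights
hours = [6, 7, 7, 8, 6, 7, 8, 7, 5, 9]
rel_freq, expected, sd = distribution_summary(hours)

for value, p in rel_freq.items():
    print(f"{value} hours: relative frequency {p:.2f}")
print(f"Expected value: {expected:.2f}, standard deviation: {sd:.2f}")
```

Because the relative frequencies are treated as probabilities, the standard deviation here divides by n rather than n - 1; a sample standard deviation computed by a calculator may differ slightly.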

2. Responses (100+ words, x2): (Found in PDF file)

Look at your classmates’ distribution. Is there any well-known distribution that could be used to model their random phenomenon? Some well-known discrete distributions are: Uniform, Bernoulli, Binomial, Geometric, and Poisson (but there are others). Explain why this distribution might be appropriate. Post a picture of the discrete distribution and a histogram of the frequency data from the original post, and comment on what is similar and different. Are there any outliers that, if removed, would make the frequency data match the distribution really well?
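One possible way to make the comparison, sketched here under the assumption that a Poisson model is the candidate: compute the Poisson probabilities with the rate set to the observed mean and list them next to the observed relative frequencies. The frequency table below is invented for illustration, and `poisson_pmf` is a hypothetical helper written from the standard Poisson formula.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)

# Hypothetical relative frequencies from a classmate's post
# (e.g., number of boxes of cereal in the house)
observed = {0: 0.10, 1: 0.30, 2: 0.30, 3: 0.20, 4: 0.10}

# Use the observed mean as the Poisson rate
lam = sum(k * p for k, p in observed.items())

for k, p_obs in observed.items():
    p_model = poisson_pmf(k, lam)
    print(f"k={k}: observed {p_obs:.2f}, Poisson {p_model:.2f}, "
          f"difference {p_obs - p_model:+.2f}")
```

Large differences concentrated at one value can point to an outlier or to a poor choice of model.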

*****Within the pdf file are the peer response posts to respond to!*****


Discussion C: Central Limit Theorem

Initial Post (300+ words):

Collect some quantitative data (if your data from week 1 is quantitative, you can use it). Find the sample mean and standard deviation. Plot the data in a histogram. Does the data seem to follow the bell curve of the normal distribution? What features of the data do or do not fit the shape of the normal curve? How much deviation from the curve is to be expected?
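A rough sketch of these first steps, using only the Python standard library: it computes the sample mean and standard deviation, tallies histogram bins, and checks what share of the data falls within one standard deviation of the mean (about 68% for a normal distribution). The commute times and the helper names `histogram_counts` and `share_within` are hypothetical.

```python
from math import floor
from statistics import mean, stdev

def histogram_counts(data, bin_width):
    """Count observations per bin; each bin is [left, left + bin_width)."""
    counts = {}
    for x in data:
        left = floor(x / bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

def share_within(data, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

# Hypothetical data: commute times in minutes
times = [22, 25, 27, 24, 30, 26, 28, 25, 23, 35, 26, 27]

print(f"mean = {mean(times):.2f}, sd = {stdev(times):.2f}")
for left, count in histogram_counts(times, 5).items():
    print(f"{left:>3}-{left + 5:<3} {'#' * count}")
print(f"share within 1 sd: {share_within(times, 1):.2f}")
```

With a sample this small, the bar heights and the one-standard-deviation share can differ noticeably from the normal curve even when the underlying distribution is normal.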

Now perform a normality test on your data (Shapiro-Wilk test: http://sdittami.altervista.org/shapirotest/ShapiroTest.html or http://www.brianreedpowers.com/MAT240/stats/descriptiveStats.html). The test will give you a p-value. A small p-value (for example, below 0.05) is evidence against normality; a larger p-value means the test found no significant departure from the normal distribution, although it does not prove that the data are normal. Based on the test, do you think your data could have been drawn from a normal distribution?

Responses (100+ words x2): (Found in PDF file)

Choose two of your classmates’ data sets. Take 30 random samples of 5 data points each (one way: paste the data at http://www.randomizelist.com/, randomize the list, and take the first 5 numbers; or use the sampling feature at http://www.brianreedpowers.com/MAT240/stats/descriptiveStats.html), and calculate the average for each of these samples. You will now have 30 sample means. Create and post a histogram of your sample means. What is the mean of these means? What is the standard deviation? Does this make sense based on the Central Limit Theorem? Do the sample means follow a normal distribution? What p-value does the normality test give? How and why does this differ from the original data?
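The resampling procedure can also be done in code instead of with the websites. The sketch below is one way to do it, assuming a hypothetical classmate data set: each sample draws 5 points without replacement, which mirrors randomizing the list and taking the first 5. The function name `sample_means` and the fixed seed are illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(data, n_samples=30, sample_size=5, seed=0):
    """Means of n_samples random samples, each drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    return [mean(rng.sample(data, sample_size)) for _ in range(n_samples)]

# Hypothetical classmate data: commute times in minutes
data = [22, 25, 27, 24, 30, 26, 28, 25, 23, 35, 26, 27, 29, 31, 24, 26]

means = sample_means(data)
print(f"mean of original data:    {mean(data):.2f}")
print(f"mean of the sample means: {mean(means):.2f}")
print(f"sd of the sample means:   {stdev(means):.2f}")
# The Central Limit Theorem suggests roughly sd / sqrt(n) for samples of size n
print(f"sd of data / sqrt(5):     {stdev(data) / sqrt(5):.2f}")
```

Because the 5 points in each sample are drawn without replacement from a small data set, the standard deviation of the sample means will tend to come out somewhat below sd / sqrt(5).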



Discussion D: Estimation

Initial Post (300+ words):

Give an example of an interval estimate of an average or proportion you may use in your daily life. For instance, you may say that you are pretty sure your average commute time is between 25-30 minutes, or you are fairly confident that between 60-65% of the population love dogs. Collect some data to see how well your intuition is working. First, does your sample data meet all assumptions necessary to construct the confidence interval of the type you need? Even if it doesn’t, construct and interpret the confidence interval.
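As an illustration of constructing both kinds of interval, here is a minimal sketch that uses the normal (z) critical value from the Python standard library. This is an assumption made to keep the example dependency-free: for a small sample mean, a t critical value would give a somewhat wider interval. The data and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def mean_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    m = mean(data)
    margin = z_critical(confidence) * stdev(data) / sqrt(len(data))
    return m - margin, m + margin

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 12 commute times in minutes
times = [22, 25, 27, 24, 30, 26, 28, 25, 23, 35, 26, 27]
low, high = mean_interval(times)
print(f"95% interval for mean commute: {low:.1f} to {high:.1f} minutes")

# Hypothetical survey: 62 of 100 people say they love dogs
p_low, p_high = proportion_interval(62, 100)
print(f"95% interval for proportion: {p_low:.3f} to {p_high:.3f}")
```

If the intuitive range (for example, 25 to 30 minutes) lies mostly inside the computed interval, the data are consistent with the guess.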

Week 8 Responses (100+ words x2):

Choose two of your classmates’ posts. Collect your own data and find your own confidence interval and compare. If, for example, your point estimate is less than your classmate’s point estimate, can you be sure or confident that the corresponding parameter is less? Why or why not, and what could you do to try to figure it out? What impact do the bounds of the intervals have?
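One way to explore whether a smaller point estimate reflects a smaller parameter is to build an interval for the difference between the two means and check whether it contains zero. The sketch below assumes two independent samples and uses a normal approximation; both data sets and the function name `difference_interval` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def difference_interval(a, b, confidence=0.95):
    """Normal-approximation interval for mean(a) - mean(b), independent samples."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = mean(a) - mean(b)
    # Standard error of the difference: sqrt(s_a^2 / n_a + s_b^2 / n_b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff - z * se, diff + z * se

# Hypothetical commute times (minutes): my data vs. a classmate's data
mine = [25, 27, 30, 26, 28, 29, 24, 31]
theirs = [28, 30, 33, 29, 31, 32, 27, 34]

low, high = difference_interval(mine, theirs)
print(f"95% interval for the difference in means: {low:.2f} to {high:.2f}")
if low <= 0 <= high:
    print("The interval contains 0, so the data do not clearly show a difference.")
else:
    print("The interval excludes 0, which suggests the means differ.")
```

With samples this small, a t-based interval would be wider, so the conclusion from this sketch should be read as approximate.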