
Data distributions

Grade 10
Sep 16, 2022

Key Concepts

  • Understand how to find measures of center and spread.
  • Understand how to use appropriate statistics to compare data sets.
  • Understand how to recognize a normal distribution.
  • Understand how to classify a data distribution.

Critique and Explain 

Chen and Dakota were asked to estimate the mean and median of the following data set. 

Chen said, “The middle value is 11. Both the mean and median are approximately 11.” Dakota said, “Most of the data are on the left. I think the mean and median will be about 9, with the mean slightly larger.”

  1. Is either Chen or Dakota correct? Explain.
  1. What strategies could you use to approximate the mean and median?
  1. Which measure of center is more representative in this case, the mean or the median? Explain. 

Solution: 

  1. Neither Chen nor Dakota is correct, because the mean of the histogram is approximately 11 and the median is approximately 8.
  1. The following formulas for grouped data can be used to approximate the mean and median from the histogram:

$\text{Mean} = \dfrac{\sum x_i f_i}{n}$, where

$x_i$ = midpoint of each class interval

$f_i$ = frequency of each class

$n$ = total frequency

$\text{Median} = l + \dfrac{\frac{n}{2} - cf}{f} \times h$, where

$l$ = lower boundary of the median class

$n$ = total frequency

$cf$ = cumulative frequency of the classes before the median class

$f$ = frequency of the median class

$h$ = size (width) of the median class

  1. The median is more representative in this case. Most of the data are on the left, so the distribution is skewed and the mean is pulled toward the long tail. The median, the middle value when the data are written in ascending order, is less affected by the skew.
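The grouped-data formulas from the solution can be sketched in Python. The histogram from this lesson is not reproduced here, so the class intervals below are hypothetical and chosen only to illustrate the arithmetic; the function names are also assumptions made for this sketch.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
# These classes are made up for illustration; they are NOT the lesson's histogram.
classes = [(0, 5, 4), (5, 10, 9), (10, 15, 5), (15, 20, 2)]


def grouped_mean(classes):
    """Mean = sum(x_i * f_i) / n, where x_i is the midpoint of each class."""
    n = sum(f for _, _, f in classes)
    total = sum(((lo + hi) / 2) * f for lo, hi, f in classes)
    return total / n


def grouped_median(classes):
    """Median = l + ((n/2 - cf) / f) * h for the class containing the n/2-th value."""
    n = sum(f for _, _, f in classes)
    cf = 0  # cumulative frequency of the classes before the current one
    for lo, hi, f in classes:
        if cf + f >= n / 2:  # this is the median class
            return lo + ((n / 2 - cf) / f) * (hi - lo)
        cf += f
    raise ValueError("no data")


print(grouped_mean(classes))
print(grouped_median(classes))
```

Both results are estimates, because grouping hides the exact values inside each class.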

Example 1: Find measures of centre and spread 

  1. What are the mean and standard deviation of the following data set? 

4, 12, 15, 9, 14, 13, 6, 7, 6, 25, 3, 13, 17, 22, 4, 16

The mean, or average, of a data set is the sum of the values in the data set divided by the number of values in the data set. The standard deviation is a measure of how much the values in a data set vary, or deviate, from the mean. It is a measure of the variability, or spread, of the data.

You can use a spreadsheet to calculate the mean and standard deviation. 

  • The mean and the standard deviation are used together to measure the centre and spread of the data.

The mean is $\bar{x} \approx 11.6$, and the standard deviation is $\sigma \approx 6.3$.
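As an alternative to a spreadsheet, the same two values can be checked with Python's standard `statistics` module. This is a minimal sketch that assumes the data set has the 16 values that appear in the sorted list used for the five-number summary, and that the stated $\sigma$ is the population standard deviation (division by $n$).

```python
import statistics

# The 16 data values (assumed to match the sorted list in the five-number summary).
data = [4, 12, 15, 9, 14, 13, 6, 7, 6, 25, 3, 13, 17, 22, 4, 16]

mean = statistics.mean(data)     # sum of the values divided by their count
sigma = statistics.pstdev(data)  # population standard deviation (divides by n)

print(round(mean, 1), round(sigma, 1))  # 11.6 6.3
```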

  1. What is the five-number summary of the data set? 

The five-number summary includes the minimum value, first quartile, median, third quartile, and maximum value.

Step 1: Rearrange the data in ascending numerical order. 

3, 4, 4, 6, 6, 7, 9, 12, 13, 13, 14, 15, 16, 17, 22, 25 

Step 2: Note the minimum and maximum values: 

Minimum = 3  

Maximum = 25 

Step 3: Calculate the median, the number in the middle of the data set. Since there is an even number of values, the median is the average of the two middle values, 12 and 13, or 12.5.

Step 4: Calculate the first and third quartiles. The quartiles show how the data are distributed. The first quartile is the median of the lower half of the data, 6. The third quartile is the median of the upper half of the data, 15.5.

These data can be represented in a box-and-whisker plot. Notice that one quartile is closer to the median than the other.

The five-number summary of this data set is: minimum = 3, 1st quartile = 6, median = 12.5, 3rd quartile = 15.5, maximum = 25.
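The steps above can be sketched as a short Python function. It follows the convention used in this example, where each quartile is the median of one half of the sorted data and the middle value is left out of both halves when the count is odd. The function name is an assumption made for this sketch, and other quartile conventions (for instance, some spreadsheet functions) can give slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # lower half (middle value excluded when the count is odd)
    upper = data[-half:]  # upper half
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])


data = [3, 4, 4, 6, 6, 7, 9, 12, 13, 13, 14, 15, 16, 17, 22, 25]
print(five_number_summary(data))  # (3, 6.0, 12.5, 15.5, 25)
```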

Try it 

  1. List the mean, standard deviation, and five-number summary of the following data set 3, 4, 9, 12, 12, 14, 15, 19, 30, 32, 33, 34, 34, 35 

Solution: 

$\text{Mean} = \dfrac{3+4+9+12+12+14+15+19+30+32+33+34+34+35}{14} = \dfrac{286}{14} \approx 20.4$

Here the sample formula, with $n - 1$ in the denominator, is used for the standard deviation.

$\text{Standard deviation} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\dfrac{1883.43}{13}} \approx 12.04$

Five number summary: 

Minimum value = 3 

Maximum value = 35 

Median = mean of the $\frac{n}{2}$th and $\left(\frac{n}{2} + 1\right)$th observations

= mean of the 7th and 8th observations

$= \dfrac{15 + 19}{2} = 17$

First quartile = 12 

Third quartile = 33 

Then, 

The five-number summary of this data set is: minimum = 3, 1st quartile = 12, median = 17, 3rd quartile = 33, maximum = 35.
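The standard deviation in this solution divides by $n - 1$, which is the sample formula, whereas a population standard deviation divides by $n$. The sketch below, using Python's standard `statistics` module, computes both for this data set so the difference is visible; which one a course expects is an assumption to confirm with the course material.

```python
import statistics

data = [3, 4, 9, 12, 12, 14, 15, 19, 30, 32, 33, 34, 34, 35]

s = statistics.stdev(data)       # sample standard deviation (divides by n - 1)
sigma = statistics.pstdev(data)  # population standard deviation (divides by n)

# The sample value is always a little larger than the population value.
print(round(s, 2), round(sigma, 2))
```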

Example 2 : Use appropriate statistics to compare data sets  

  1. How can you describe different types of distributions? 

To compare the different types of distributions, look at the shape, the center, and the spread of the distributions. 

The standard deviation, range, and interquartile range are three measures of spread. The range of a data set is the difference between the maximum and minimum values. The interquartile range is the difference between the third quartile and the first quartile.

  • When measuring centre and spread, median and interquartile range are used together, and mean and standard deviation are used together.

A skewed distribution is one with a shape that is stretched out in either the positive or negative direction. A symmetrical distribution has a shape that looks roughly the same when it is reflected across the mean.

The shape of a distribution can affect the measures of center and spread and determine which measures of center and spread best describe the data.

The mean, median, and mode are all about the same in a symmetric distribution. You can use the mean and the standard deviation to describe the center and spread. 

  1. What measures of center and spread would you use for the following data set? 

10, 13, 16, 21, 22, 26, 29, 29, 30, 32, 33, 33, 33, 35, 37

You can use a histogram to determine the shape. Since the mean is more affected than the median by a skewed data distribution, it is better to use the median and interquartile range as the measures of center and spread. Also, the quartiles show how the data are distributed differently on either side of the center.

The data are already in numerical order.

The range is 37 – 10 = 27, and the interquartile range is 33 – 21 = 12 
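The range and interquartile range for this data set can be checked with a few lines of Python. The sketch assumes the same quartile convention as Example 1 (each quartile is the median of one half of the data, with the middle value left out). It also prints the mean and median for comparison.

```python
import statistics

# Already in ascending order, as noted above.
data = [10, 13, 16, 21, 22, 26, 29, 29, 30, 32, 33, 33, 33, 35, 37]

half = len(data) // 2
q1 = statistics.median(data[:half])   # median of the lower half
q3 = statistics.median(data[-half:])  # median of the upper half

data_range = max(data) - min(data)
iqr = q3 - q1
print(data_range, iqr)  # 27 12

# In a skewed data set the mean and median differ noticeably.
print(statistics.mean(data), statistics.median(data))
```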

Try it  

  1. What are the better measures of center and spread of the following data sets? 
  1. 55, 55, 57, 57, 57, 58, 58, 59, 59, 61, 61 
  1. 110, 110, 110, 120, 120, 130, 140, 150, 160, 170, 180, 190

Solution: 

  1. Step 1: Make a histogram of the data set.

The histogram is skewed to the left. So, it is better to use the median and interquartile range as the measures of center and spread.

  1. 110, 110, 110, 120, 120, 130, 140, 150, 160, 170, 180, 190 

Solution: 

Step 1: Make the histogram for the data. 


The histogram is skewed to the right. So, it is better to use the median and interquartile range as the measures of center and spread. 
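When no graphing tool is at hand, a rough text histogram is enough to see the shape. The sketch below bins the second data set; the bin width of 20 and the starting edge of 110 are assumptions made for this illustration, and a different choice of bins can change how the shape looks.

```python
from collections import Counter

data = [110, 110, 110, 120, 120, 130, 140, 150, 160, 170, 180, 190]
bin_width = 20  # assumed bin width; the lesson's histogram may use another
start = 110     # assumed left edge of the first bin

# Map each value to the index of the bin it falls in, then count per bin.
counts = Counter((x - start) // bin_width for x in data)

for i in range(max(counts) + 1):
    lo = start + i * bin_width
    print(f"{lo}-{lo + bin_width - 1}: {'*' * counts[i]}")
```

The tallest bar is on the left and the bars shrink toward the right, which is the pattern of a distribution that is skewed right.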

Example 3: Recognize a normal distribution 

Are the following variables likely to have a normal distribution?

  1. The heights of all people in a large group. 

A normal distribution can be modeled by a particular bell-shaped curve that is symmetric about the mean. This is called the normal curve.

Approximately normal distributions can be found in many real-world situations where the data are symmetric and mostly clustered near the mean.  

The heights of people in a large group are likely to be normally distributed. 

  1. The probability of landing on each of 8 equal parts of a spinner. 

This data set is not normally distributed because each outcome has the same probability of occurring as any other. 

  1. The scores on any test. 

The scores on any test are often skewed to the left and not normally distributed, because more students will receive higher scores. 

  1. The number of children in a family.  

The number of children in a family is not normally distributed. The distribution is skewed to the right because many families have 0, 1, 2, or 3 children, but very few families have 10 or more children. 


Example 4: Classify a data distribution 

How would you classify the following data set? Describe the shape of the distribution and the center and spread of the data.

106, 96, 86, 120, 98, 76, 112, 64, 99, 72, 119, 115, 76, 120, 97

Step 1. Make a histogram of the data. 


Step 2: Analyze the shape of the histogram.  

Since the data are bunched to the right and have a long tail to the left, the data are skewed left. 

Step 3: Determine the center and spread of the data. Use the median and interquartile range.

64, 72, 76, 76, 86, 96, 97, 98, 99, 106, 112, 115, 119, 120, 120

1st quartile = 76, median = 98, 3rd quartile = 115

The interquartile range is 115 – 76 = 39. Notice that the 3rd quartile is closer to the median than the first quartile. This is characteristic of a distribution that is skewed left.

The distribution is skewed left with median 98 and interquartile range 39. 
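The quartile comparison in Step 3 can be turned into a rough check in Python. This is only a rule-of-thumb sketch: the function name and the simple comparison are assumptions made here, and a histogram remains the main way to judge the shape. Quartiles are computed as medians of the lower and upper halves, as in Example 1.

```python
import statistics


def describe_shape(values):
    """Return (shape, median, interquartile range) using a quartile-distance rule of thumb."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])
    med = statistics.median(data)
    q3 = statistics.median(data[-half:])
    if q3 - med < med - q1:
        shape = "skewed left"    # upper quartile sits closer to the median
    elif q3 - med > med - q1:
        shape = "skewed right"   # lower quartile sits closer to the median
    else:
        shape = "roughly symmetric"
    return shape, med, q3 - q1


data = [106, 96, 86, 120, 98, 76, 112, 64, 99, 72, 119, 115, 76, 120, 97]
print(describe_shape(data))  # ('skewed left', 98, 39)
```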

Try it  

  1. What is the type of distribution and the center and spread of the data?

20, 17, 17, 12, 18, 21, 19, 18, 13, 14, 17, 23, 25

Solution:

Step 1: Write the data set in ascending order and make a histogram.

12, 13, 14, 17, 17, 17, 18, 18, 19, 20, 21, 23, 25

The histogram is skewed right. So, it is better to use the median and interquartile range as the measures of center and spread.

Step 2: Determine the center and spread of the data. Use the median and interquartile range.

Median = $\left(\frac{13+1}{2}\right)$th observation = 7th observation = 18

1st quartile $= \dfrac{14 + 17}{2} = 15.5$

3rd quartile $= \dfrac{20 + 21}{2} = 20.5$

Interquartile range = 20.5 – 15.5 = 5

The distribution is skewed right with median 18 and interquartile range 5.

Concept Summary 

Data Distributions 

Shapes  

For distributions that are approximately normal, use mean and standard deviation to describe the data. For skewed distributions, use median and quartiles to describe the data. 

Graphs 


Let’s check our knowledge: 

  1. Determine the mean, standard deviation, and five-number summary of the following data set.

5, 8, 5, 9, 6, 14, 9, 3, 8, 7, 10, 12 

  1. For each data set, describe the shape of the distribution and determine which measures of center and spread best represent the data.
  1. 28, 13, 23, 34, 55, 38, 44, 65, 49, 33, 50, 59, 67, 45 
  1. 12, 2, 14, 4, 1, 6, 11, 7, 8, 5, 9, 10, 8, 15 

Answers: 

  1. Determine the mean, standard deviation, and five-number summary of the following data set.

5, 8, 5, 9, 6, 14, 9, 3, 8, 7, 10, 12 

Solution: 

$\text{Mean} = \dfrac{5+8+5+9+6+14+9+3+8+7+10+12}{12} = \dfrac{96}{12} = 8$

Standard deviation:

$\text{Standard deviation} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\dfrac{106}{11}} \approx 3.10$

Five number summary: 

Minimum = 3 

Maximum = 14 

Ascending order of the data set: 3, 5, 5, 6, 7, 8, 8, 9, 9, 10, 12, 14 

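Median $= \dfrac{8 + 8}{2} = 8$

First quartile = median of 3, 5, 5, 6, 7, 8 $= \dfrac{5 + 6}{2} = 5.5$

Third quartile = median of 8, 9, 9, 10, 12, 14 $= \dfrac{9 + 10}{2} = 9.5$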

  1. For each data set, describe the shape of the distribution and determine which measures of centre and spread best represent the data.
  1. 28, 13, 23, 34, 55, 38, 44, 65, 49, 33, 50, 59, 67, 45 
  1. 12, 2, 14, 4, 1, 6, 11, 7, 8, 5, 9, 10, 8, 15 

Solution: 

  1. Step 1: Make the histogram for the data set  

The shape of the histogram is symmetric. 

So, it is better to use standard deviation and mean to describe center and spread. 

  1. 12, 2, 14, 4, 1, 6, 11, 7, 8, 5, 9, 10, 8, 15 

Solution: 

Step 1: Make the histogram of the data set  


The shape of the histogram is symmetric. 

So, it is better to use standard deviation and mean to describe the center and spread. 

Exercise

Determine if each situation is likely to be uniformly distributed, normally distributed, skewed left or skewed right.

  1. The age at which people die in the United States.
  2. Number of pets owned by students at your school.
  3. Selling price of cars in 2018.
  4. The test scores from a history test are 88, 95, 92, 60, 86, 78, 95, 98, 92, 96, 70, 80, 89, and 96
  5. Find the mean and the standard deviation
  6. Find the five number summary of the test scores
  7. Describe the type of distribution
