Investigating the data
Now that we have our data, we want to start to understand it.
Let’s start by looking at the Width of shell variable. I’ve rearranged the data so that the smallest width is given first and the width increases in order up to the largest width.
Width of shell (cm) |
9.4 |
9.6 |
9.7 |
9.8 |
9.9 |
10.0 |
10.0 |
10.1 |
10.1 |
10.2 |
10.2 |
10.2 |
10.3 |
10.4 |
10.6 |
We can plot these measurements. In the plot below, called a dot plot, each dot represents one measurement. The value of each measurement is shown by the numbers on the x-axis. For example, there are three measurements of 10.2 cm. These are represented by the three dots stacked up at 10.2 cm on the plot. Looking at the plot gives us an idea of the middle of the data, somewhere around 10 cm, and the spread of the data, from around 9.5 cm to around 10.5 cm.
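If you would like to draw a rough dot plot yourself, here is a minimal sketch in Python. It is a sideways, text-only version (one row per distinct width, one star per measurement) rather than the stacked plot described above, and the names `widths` and `dot_plot_rows` are my own choices for illustration.

```python
from collections import Counter

# The 15 shell widths (cm) from the table above.
widths = [9.4, 9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1,
          10.1, 10.2, 10.2, 10.2, 10.3, 10.4, 10.6]


def dot_plot_rows(values):
    """Return one text row per distinct value, with one '*' per measurement."""
    counts = Counter(values)
    return [f"{value:5.1f} | {'*' * counts[value]}" for value in sorted(counts)]


rows = dot_plot_rows(widths)
for row in rows:
    print(row)
```

The row for 10.2 has three stars, matching the three measurements of 10.2 cm.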

Let’s look at some descriptive statistics.
We know we measured 15 crabs so the count of crabs is 15. The count is also called n (the number of observations in the sample).
n = 15
Let’s add up or sum all of the widths. (You can do these calculations by hand or using a calculator to check my results.)
sum = 9.4 + 9.6 + 9.7 + 9.8 + 9.9 + 10.0 + 10.0 + 10.1 + 10.1 + 10.2 + 10.2 + 10.2 + 10.3 + 10.4 + 10.6 = 150.5
The mean is the sum divided by the count. The mean is one of three averages that we will look at. An average is a measure of the centre of the data.
mean = 150.5/15 = 10.03 (rounded to two decimal places)
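If you prefer to check the count, sum and mean with code rather than a calculator, a short Python sketch could look like this. The variable names are my own, and the sum is rounded to one decimal place because computers store decimals like 9.4 only approximately.

```python
# The 15 shell widths (cm) from the table above.
widths = [9.4, 9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1,
          10.1, 10.2, 10.2, 10.2, 10.3, 10.4, 10.6]

n = len(widths)                 # count of observations
total = round(sum(widths), 1)   # sum, rounded to the precision of the data
mean = total / n                # mean = sum divided by count

print("n =", n)
print("sum =", total)
print("mean =", round(mean, 2))
```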
Another measure of the centre is called the median. The median is the middle value. The word median comes from the Latin word medianus which means middle.
Here are the 15 measurements in order. We need to find the middle value.

We can start crossing out the values at each end, the smallest and largest values.

Then cross out the next two.

And so on, until you are left with the value in the middle, which is the median.

median = 10.1
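The crossing-out steps above can be sketched in Python. The function name `median_by_crossing_out` is hypothetical, and the handling of an even number of values (averaging the two that remain) is the usual convention rather than something we needed for our 15 crabs. The result is compared with `statistics.median` from Python's standard library.

```python
import statistics

# The 15 shell widths (cm) from the table above.
widths = [9.4, 9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1,
          10.1, 10.2, 10.2, 10.2, 10.3, 10.4, 10.6]


def median_by_crossing_out(values):
    """Cross out the smallest and largest values until the middle is left."""
    remaining = sorted(values)
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # drop one value from each end
    # One value is left for an odd count; two are left for an even count,
    # in which case we take their average (an assumed, standard convention).
    return sum(remaining) / len(remaining)


median = median_by_crossing_out(widths)
print("median =", median)
print("library check =", statistics.median(widths))
```

With 15 values, seven are crossed out at each end, leaving the eighth value as the median.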
The last measure of the centre of the data we are going to look at is called the mode. The mode is the most frequently occurring value in the data. We can see this in the dot plot above as well as in the highlighted numbers below.

mode = 10.2
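One way to find the mode with code is to count how often each width appears and pick the most frequent one. This is a small sketch using `collections.Counter` from Python's standard library; the variable names are my own.

```python
from collections import Counter

# The 15 shell widths (cm) from the table above.
widths = [9.4, 9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1,
          10.1, 10.2, 10.2, 10.2, 10.3, 10.4, 10.6]

counts = Counter(widths)                     # how many times each width occurs
mode, frequency = counts.most_common(1)[0]   # the most frequent width and its count

print("mode =", mode, "which occurs", frequency, "times")
```

Note that this sketch reports only one value, so it suits data like ours where a single width occurs most often.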
We will also record the minimum (min) and maximum (max) values in the data.
min = 9.4
max = 10.6
The range of the data is max – min. This shows how spread out, or wide, the data is.
range = 10.6 – 9.4 = 1.2
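The minimum, maximum and range can be checked the same way. In this sketch the name `data_range` is my own choice (it avoids clashing with Python's built-in `range`), and the result is rounded to one decimal place because decimals are stored only approximately.

```python
# The 15 shell widths (cm) from the table above.
widths = [9.4, 9.6, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1,
          10.1, 10.2, 10.2, 10.2, 10.3, 10.4, 10.6]

smallest = min(widths)
largest = max(widths)
data_range = round(largest - smallest, 1)   # range = max - min

print("min =", smallest)
print("max =", largest)
print("range =", data_range)
```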
Descriptive Statistic | Value |
n | 15 |
sum | 150.5 |
mean | 10.03 |
median | 10.1 |
mode | 10.2 |
min | 9.4 |
max | 10.6 |
range | 1.2 |