First, open the data set FISH, which you can obtain by logging into TritonEd, then selecting "Content" from the menu on the left and opening the folder "Data for the Labs".

We have data on 159 fish which were caught from Lake Laengelmavesi in Finland. The data were obtained from the Journal of Statistics Education Data Archive. The data set contains 159 rows (one for each fish) and the following three columns:

| Variable Name | Description |
| --- | --- |
| Species | The species of fish |
| Weight | The weight of the fish in grams. (Note: two measurements are missing) |
| Length | The length of the fish in centimeters from the nose to the beginning of the tail |

A categorical variable is a variable that sorts cases into categories. If we had data on students enrolled in Math 11, variables such as gender, class year, and major would be categorical. In your data set, the variable called "Species" is categorical, as it sorts fish into several species. Categorical data can be displayed in a bar chart or a pie chart.
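Both a pie chart and a bar chart are drawn from the same underlying tally: the number of cases in each category and the share of the total that each category represents. The following is a minimal Python sketch of that tally, not part of the Minitab workflow. The species labels are made up for illustration (they are not the FISH data), and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Made-up species labels for illustration only; these are NOT the FISH data.
species = ["Perch", "Common Bream", "Perch", "Other",
           "Perch", "Common Bream", "Other", "Perch"]

def frequency_table(values):
    """Return {category: (count, percent)}, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.most_common()}

table = frequency_table(species)
for cat, (n, pct) in table.items():
    # One row per category: name, count, percentage of all cases.
    print(f"{cat:<13}{n:>3}{pct:7.1f}%")
```

The counts correspond to the bar heights in a bar chart, and the percentages correspond to the slice sizes in a pie chart.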

Begin by making a pie chart of the "Species" variable. Go to

If you are using Minitab Express, you make a pie chart by going to

Now explore some different options for labeling your graph. You can change the title of the graph by double-clicking on the title. If you go to

In Minitab Express, you can still change the title by double-clicking on it. Next, try clicking inside the graph, then clicking on the plus sign inside a circle that appears to the right of the graph near the top. You can then click on the triangle beside "Slice Labels" to display the frequency or the percentage of times that each category appears in the data.

Another option is to display these data in a bar chart. Go to

In Minitab Express, you make a bar chart by going to

- Present one graph (either a pie chart or a bar chart) that you think gives a good depiction of the data. To accompany this graph, write a paragraph of two or three sentences describing the data. Mention how many species there are and which is the most common. (Note that this question, like many questions in these labs, does not have a single correct answer. There are two different charts you could choose from, and there are different features of the data that you could choose to focus on in your description.)

- How many Common Bream were caught from the lake? What percentage of the fish caught from the lake were Common Bream?

In Minitab Express, you copy the graph by clicking inside the graph and going to

A quantitative variable records numerical measurements about the cases. If we had data on students enrolled in Math 11, variables such as GPA or SAT scores would be quantitative variables. In the data set we are working with, the weights and lengths of the fish are quantitative variables. Quantitative data are most commonly displayed using a histogram.
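A histogram divides the range of a quantitative variable into equal-width bins and counts how many values fall into each one. The sketch below shows that counting step in Python under some stated assumptions: the weights are made-up values (not the FISH data), the bin width of 100 grams is an arbitrary choice, and `histogram_counts` is a hypothetical helper, not a Minitab function.

```python
import math

def histogram_counts(values, bin_width, start=None):
    """Count values in equal-width bins [left, left + bin_width).

    Returns {left_edge: count} for the non-empty bins only.
    """
    if start is None:
        # Align the first bin to a multiple of the bin width.
        start = math.floor(min(values) / bin_width) * bin_width
    counts = {}
    for v in values:
        k = math.floor((v - start) / bin_width)  # index of the bin holding v
        left = start + k * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Made-up weights in grams, for illustration only.
weights = [120, 145, 150, 200, 260, 270, 300, 340, 390, 1000]
bins = histogram_counts(weights, bin_width=100)
for left, n in bins.items():
    # Text version of a histogram: one star per value in the bin.
    print(f"{left:>5} to {left + 100:<5} {'*' * n}")
```

Changing `bin_width` can change the apparent shape of the distribution, which is worth keeping in mind when you describe a histogram.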

Start by making a histogram of the weights of the fish. To do this, go to

In Minitab Express, you make a histogram by going to

- Present a histogram for the weights of the fish and a histogram for the lengths of the fish. As always, make sure the axes of your graphs are labeled appropriately. Supplement your graphs with a few sentences describing the distributions.

- Would you describe the distributions as symmetric (which would mean that the left and right sides of the histogram are approximately mirror images) or skewed?

- Are there any values that you would describe as outliers among either the weights or the lengths?
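If you want a numerical check to go with your visual judgment of shape, one common rule of thumb compares the mean with the median: a long right tail tends to pull the mean above the median, and a long left tail tends to pull it below. The Python sketch below applies that rule of thumb. It is only a rough guide and no substitute for looking at the histogram. The data are made up, `skew_hint` is a hypothetical helper, and the 5% tolerance is an arbitrary assumption.

```python
from statistics import mean, median

def skew_hint(values, tolerance=0.05):
    """Rough shape hint based on the gap between the mean and the median.

    The gap is compared with a fraction (tolerance) of the data range;
    the default of 5% is an arbitrary choice for illustration.
    """
    spread = max(values) - min(values)
    gap = mean(values) - median(values)
    if spread == 0 or abs(gap) <= tolerance * spread:
        return "roughly symmetric"
    return "right-skewed" if gap > 0 else "left-skewed"

# Made-up weights in grams; one large value stretches the right tail.
weights = [120, 145, 150, 200, 260, 270, 300, 340, 390, 1000]
print(mean(weights), median(weights), skew_hint(weights))
```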

(including the quotation marks). After you click "OK" three times, you will get a histogram of the lengths of the Common Bream.

In Minitab Express, you have to do this by hand. Go to

- Present histograms of the lengths of the Common Bream and the lengths of the Perch. Summarize what you find in a few sentences. Discuss the center of the distributions (whether the Common Bream or the Perch tend to be longer), the spread of the distributions (whether there is more variability in the lengths of the Common Bream or the Perch), and the shape of the distributions.

Boxplots are also useful for displaying quantitative data. Although histograms are usually the best choice for displaying the distribution of one variable, boxplots can be useful for comparing two variables or for comparing one variable across several categories.

To make a boxplot just of the "Length" variable, go to

- The line in the middle of the box is the median.
- The top of the box is the upper (third) quartile and the bottom of the box is the lower (first) quartile.
- The top of the vertical line (whisker) is the largest value within 1.5 interquartile ranges (IQRs) of the upper quartile, while the bottom of the whisker is the smallest value within 1.5 IQRs of the lower quartile.
- Any data value more than 1.5 IQRs above the upper quartile or more than 1.5 IQRs below the lower quartile is plotted separately as an asterisk.
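The rules above can be worked through by hand or sketched in a few lines of Python. The example below is an illustration under stated assumptions: the lengths are made-up values (not the FISH data), `boxplot_stats` is a hypothetical helper, and the quartiles come from Python's `statistics.quantiles` with its default method. Statistical packages use slightly different quartile conventions, so the numbers may not match Minitab's output exactly.

```python
from statistics import quantiles

def boxplot_stats(values):
    """Compute the pieces of a boxplot using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4)  # lower quartile, median, upper quartile
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1, "median": med, "q3": q3, "iqr": iqr,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Made-up lengths in centimeters, for illustration only.
lengths = [20, 22, 23, 25, 26, 28, 30, 31, 33, 34, 60]
stats = boxplot_stats(lengths)
for name, value in stats.items():
    print(f"{name:<13}{value}")
```

In this made-up sample, the value 60 lies more than 1.5 IQRs above the upper quartile, so it would be drawn as an asterisk, and the upper whisker would stop at the next-largest value.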

Now compare the weights and lengths of different species of fish. Start with the weights. To do this, go to

In Minitab Express, you do this by going to

Present the boxplots comparing the weights and lengths for different species of fish. Then use these boxplots to answer the following questions. In each case, explain briefly how you are able to obtain your answer from the boxplots.

- Which of the species of fish tends to be the lightest? Which tends to be the shortest?

- Which of the species has the highest median weight? Which of the species has the highest median length?

- For which of the species of fish do the weights have the highest interquartile range?
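The quantities these questions ask you to read off the boxplots (the median and the interquartile range for each species) can also be computed directly. The Python sketch below groups made-up (species, weight) pairs, skips missing weights, and summarizes each group. The data are not the FISH data, `summarize_by_group` is a hypothetical helper, and the quartile convention is Python's default, which may differ slightly from Minitab's.

```python
from collections import defaultdict
from statistics import median, quantiles

def summarize_by_group(rows):
    """Return {group: {"n", "median", "iqr"}} from (group, value) pairs."""
    groups = defaultdict(list)
    for group, value in rows:
        if value is not None:  # skip missing measurements
            groups[group].append(value)
    summary = {}
    for group, values in groups.items():
        q1, _, q3 = quantiles(values, n=4)  # needs at least two values
        summary[group] = {"n": len(values),
                          "median": median(values),
                          "iqr": q3 - q1}
    return summary

# Made-up (species, weight in grams) pairs; None marks a missing weight.
rows = [("Perch", w) for w in [100, 120, 150, 200, 260, 300, 400]]
rows += [("Common Bream", w) for w in [400, 450, 500, None, 600, 650, 700, 900]]

summary = summarize_by_group(rows)
widest = max(summary, key=lambda g: summary[g]["iqr"])
for group, s in summary.items():
    print(f"{group:<13} n={s['n']} median={s['median']} IQR={s['iqr']}")
print("Largest IQR:", widest)
```

On a boxplot, the same comparison amounts to checking which box has the highest middle line (median) and which box is tallest (IQR).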

