Introducing Fathom

Adapted by permission from David Arnold:
http://online.redwoods.cc.ca.us/instruct/darnold/math15/aliaga/Chapter04/FathomHtml/index.html
by Rick Castrapel, Imperial Valley College 2007

Obtaining the Class Data Set

To obtain the data for this activity, right click on this link (ClassData.ftm) and save it to your desktop. Note that this data was gathered from a previous statistics class. You will get a chance to perform the same analysis with your class's data in the first exercise below.

Opening Fathom and the Class Data Set


To start Fathom, double-click the Fathom icon on the desktop. If no desktop icon is present on your machine, then open the Start Menu, select Programs, then select Fathom from the program list. If all goes well, Fathom will open on your machine as shown in Figure 1.

Figure 1.

Select Open from the File menu as shown in Figure 2.

Figure 2.

This will open the Open file box, shown in Figure 3.

Figure 3.

Use the usual procedure on your operating system to browse to the folder containing the downloaded data set ClassData.ftm, then select the file and open it. The data set will appear as a collection box in your Fathom screen as shown in Figure 4.

Figure 4.

Note that the collection ClassData is "selected." If yours is not selected, then click it once with your mouse to select it. Now, with the collection box selected, click the "New Case Table" icon on the toolbar and drag a Case Table into the Fathom window. Use the mouse to resize the Case Table as shown in Figure 5. This is done by clicking and dragging a border or corner of the Case Table with your mouse.

Figure 5.

Remember that each individual row is a case, or in statistics jargon, an observation. For example, the first row represents the responses a student in our statistics class filled in on her Data Collection Form. There were 21 letters in her name, with a Scrabble score of 43. The age of her mother at birth was 33 and the age of her father at birth was 30. Responses for her height, gender, the value of the coins on her person in cents, the number of keys on her person, the number of states she has visited, and her estimate of her instructor's age appear in the remaining columns.

The second row is another case or observation. It contains the responses of another person in our class.

Note that the first column is labeled NameLe... . If you use the mouse to expand the column to the right, you will see that the entire name of the column is NameLetters. In statistics parlance, NameLetters is a variable. For each subject in our population (students in our class) the variable NameLetters gives the number of letters in that subject's name. Thus, the first column of the Case Table contains the number of letters in the name of each subject or student in our class. Note that it varies: some students have names with 21 letters, others with 17 letters, and so on. Hence the name variable.

Thus, each column in the Case Table is a variable and the entries in the column are the values taken on by that variable for each observation (row). Fathom doesn't use the word variable, using instead the word attribute as a synonym for variable. Thus, each column of the case table holds a new attribute. The name of each attribute appears as a "header" for each column. In the work that follows, you will be asked to "drag the attribute" onto a graph. This is taken to mean that you should click and hold the mouse pointer on the name (header) of the attribute you want, then drag the attribute to a location in a graph, summary table, etc.
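For readers who think in code, the row and column layout of a Case Table can be sketched in Python. This is only an illustration, not a description of how Fathom stores data. The value 21 comes from the first row described above, the value 17 is assumed here to belong to the second case, and attribute_values is a hypothetical helper name.

```python
# Each case (row) is a dictionary; each key is an attribute (column).
# Only NameLetters is shown. 21 is the first row's value from the text;
# 17 is ASSUMED to be the second case's value, purely for illustration.
cases = [
    {"NameLetters": 21},
    {"NameLetters": 17},
]

def attribute_values(cases, attribute):
    """Return the column of values one attribute takes across all cases."""
    return [case[attribute] for case in cases]

name_letters = attribute_values(cases, "NameLetters")
print(name_letters)  # [21, 17]
```

Reading across one dictionary gives a single case; collecting one key from every dictionary gives a single attribute.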
 

Summarizing Data Graphically

One of the best ways to analyze a set of data is to construct graphs from the numbers in your data set (in Fathom, a data set is called a collection). We will explore several different types of graphs in this activity, including the bar graph, the dot plot, the histogram, and the scatterplot. Let's begin with a bar graph, which is useful for describing the distribution of a qualitative (categorical) variable.

Bar Graphs

Let's begin with a simple bar graph that determines the proportion of men and women in this statistics class. First, use your mouse to shrink the Case Table a bit so that we can fit a graph on the Fathom screen. Next, click the Graph icon on the toolbar, hold down the mouse button, and drag a graph window into the main Fathom window, as shown in Figure 6.

Figure 6.

Next, click and drag the "scroll bar" at the bottom of the Case Table until the attribute Gender is visible in the Case Table, as shown in Figure 7.

Figure 7.

Finally, to construct our bar graph for gender, simply drag the attribute Gender to the graph and drop it where it says "Drop an attribute here." When your mouse is correctly positioned over "Drop an attribute here," you will see a black rectangle highlight the drop region. This is your signal to release the mouse button, which will produce the result shown in Figure 8.

Figure 8.

Note that the vertical axis contains a "count" of the number of occurrences of "m" and "f", the two possible values for the qualitative, or categorical variable gender. It would appear from the scale on the vertical axis that the class consists of 13 female students and 13 male students. But how can we be absolutely sure? A summary table is a nice way to get a definitive answer.

Use your mouse to resize or "shrink" the Case Table to provide a bit of room for the Summary Table we will create in the Fathom window. Then, click and drag a Summary Table from the toolbar and place it in the position shown in Figure 9.

Figure 9.

Now, drag the attribute Gender from the Case Table and "drop" it onto the downward-pointing arrow in the Summary Table. This will result in a Summary Table detailing the count of male and female students in our class, as shown in Figure 10.

Figure 10.

Note that the Summary Table indicates that there are a total of 26 students in our statistics class, 13 of whom are female and the remaining 13 male.
 

Proportion Instead of Count

Often, we prefer to have a proportion on the vertical axis instead of a count, as is shown in our bar graph in Figure 10. This is a simple matter to remedy. Right-click count() in the bar graph. This will "pop up" a so-called "context menu" where you need to select Edit formula, as shown in Figure 11.

Figure 11.

This will open the Expression calculator, shown in Figure 12.

Figure 12.

Explore the Expression calculator. See what values and functions are available to you. Delete the function count() and replace it with the function count()/grandTotal, as shown in Figure 13.

Figure 13.

Select OK to accept the change and close the Expression calculator. The result is that the count on the vertical axis of the bar graph has been replaced with a proportion, as shown in Figure 14.

Figure 14.

It would appear that the proportion of female students in our statistics class is 50%. But how can we be sure? Simply right-click the Summary Table to pop up the context menu shown in Figure 15. Note that we've selected "Add Formula" from the context menu.

Figure 15.

Selecting Add Formula from this context menu brings up the Expression calculator, as shown in Figure 16, where we've again entered the function count()/grandTotal.

Figure 16.

Select OK to close the Expression calculator and update the Summary Table as shown in Figure 17. Note that we've used the mouse to shrink the Case Table to an icon and move it out of the way to allow us room to resize the Summary Table.

Figure 17.

This resizing allows us to see all of the entries in the Summary Table. Note that the proportion of female students in our statistics class is exactly 0.5.
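The arithmetic behind count()/grandTotal can be checked outside Fathom. The following Python sketch is an analogy, not Fathom's own implementation: it rebuilds the Gender column from the counts reported in the Summary Table (13 "f" and 13 "m") and divides each count by the grand total. The variable names are chosen for illustration.

```python
from collections import Counter

# Gender values for the 26 cases: 13 "f" and 13 "m", as in the Summary Table.
gender = ["f"] * 13 + ["m"] * 13

counts = Counter(gender)            # plays the role of count()
grand_total = sum(counts.values())  # plays the role of grandTotal

# Proportion of cases in each category.
proportions = {value: n / grand_total for value, n in counts.items()}
print(proportions)  # {'f': 0.5, 'm': 0.5}
```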

One last comment is in order before moving on to histograms. In Figure 17, take your mouse and click the bar representing the females in our statistics class. Several things will happen. The bar will be highlighted in red, as shown in Figure 18, the status bar at the bottom of the window will give you a summary of the count, all of the females in the Case Table will be selected, and the female summary values will be highlighted.

Figure 18.

Note the lower left end of the status bar at the bottom of the window. It says that there are 13 female members of the class and that represents 50% of the total roster.
 
 

Histograms

One of the most important graph types that we will use in statistics is the histogram. Histograms can be quite time consuming to draw by hand, particularly for large data sets. Fortunately, Fathom makes the drawing of histograms a simple endeavor.

We begin by deleting the summary table and graph. This is accomplished by clicking on the frame at the top of the graph window and then pressing the "Delete" key on your keyboard. Similarly, click the Summary Table so the frame appears, then click the frame at the very top of the Summary Table window, then press the "Delete" key on your keyboard. Drag and resize the Case Table, then drag down a new Graph as shown in Figure 19.

Figure 19.

Click the Case Table so that the scroll bar is visible at the bottom of the Case Table, then use it to bring the attribute Instruct... into view. Now, drag the Instruct... attribute to the graph, dropping it where it says "Drop an attribute here." This will result in the Dot Plot shown in Figure 20.

Figure 20.

In the upper right corner of the Graph window, where it says Dot Plot, note the downward-pointing arrow. This means that this is a "Drop Down" list box. Click on the arrow and a list of choices for graphs will appear, as shown in Figure 21.

Figure 21.

Select "Histogram" from the drop down list. This will change the graph in Figure 21 from a Dot Plot to a Histogram, as shown in Figure 22.

Figure 22.

In Figure 22, the width of each bar in the histogram is calculated automatically by Fathom. On the vertical axis, a count of the number of students in our class falling in this "bin" is given. Fathom can give you more detailed information on these "bins." Take your mouse and click the first bar. It will highlight in red, as shown in Figure 23.

Figure 23.

A number of other important things happen when you highlight a bar of the histogram like this. First, consider the lower left corner of Figure 23. In the status bar, Fathom informs us that data falling in this bin lie between 27 and 29. Thus, the "width" of the bin is 2. Further, Fathom informs us that there is one case that falls in this bin.
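To make the idea of a "bin" concrete, here is a small Python sketch that finds the bin containing a given value. It rests on assumptions not stated in the text: that the first bin starts at 27, that every bin has width 2, and that each bin includes its lower edge but not its upper edge. Fathom's actual rule for values that land exactly on an edge may differ, and bin_bounds is a hypothetical helper name.

```python
import math

def bin_bounds(value, first_edge=27, width=2):
    """Return the (lower, upper) edges of the bin that contains value.

    ASSUMPTION: bins are [lower, upper), aligned so one bin starts at first_edge.
    """
    index = math.floor((value - first_edge) / width)
    lower = first_edge + index * width
    return lower, lower + width

print(bin_bounds(27))  # (27, 29)
print(bin_bounds(28))  # (27, 29)
```

Under these assumptions, an estimate of 27 or 28 lands in the bin running from 27 to 29, which matches the range the status bar reports for the first bar.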

It is important to understand that Fathom is "dynamical" software. All objects on a Fathom screen are interconnected in real time. When a change is made in one object, all related objects are updated. In this case, clicking the first bar of the histogram highlights the bar in red and reports statistics in the status bar. But note that the case falling in this bin is also highlighted in the Case Table. The 21st observation or "case" in the Case Table estimated the instructor's age at 27 (extra credit for you!). If you resize the Case Table so that all observations are showing, then click the first "bin" of the histogram, you will be able to see every case in that bin highlighted in the Case Table.
 

By Gender

Suppose that you wish to know whether the female members of the class have a different estimate of the instructor's age than do the male members. Click the Case Table and use the horizontal scroll bar to make the Gender attribute visible, as shown in Figure 24.

Figure 24.

Now, drag the gender attribute onto the vertical axis. When properly placed, a black rectangle will indicate that you should "drop" the attribute (release the mouse button). This will result in two histograms, one containing estimates of the instructor's age by the female members of our class, the other by male members, as shown in Figure 25.

Figure 25.

When you view the histograms in Figure 25, which group appears to have a higher "mean" (average) estimate of the instructor's age? The female students or the male students? Let's have Fathom help us with this question. Right click with your mouse in the graph window. The context menu shown in Figure 26 will "pop up."

Figure 26.

Select Plot Value from this menu and the Expression calculator will open. Enter the expression mean(InstructorAge, Gender="f"), as shown in Figure 27. This can be a bit tricky, as entering expressions in the Expression calculator involves a bit of a learning curve. In this instance, for example, you will discover that as soon as you type an opening quotation mark, the closing quotation mark is automatically added. This is just one of many things you will get used to the more you use the Expression calculator.
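The expression mean(InstructorAge, Gender="f") computes an average over only those cases that pass the filter in its second argument. A rough Python analogue is sketched below. The four cases are made-up values for illustration, NOT the contents of ClassData.ftm, and filtered_mean is a hypothetical helper name.

```python
from statistics import mean

# HYPOTHETICAL cases as (Gender, InstructorAge) pairs; not the class data.
cases = [("f", 40), ("m", 50), ("f", 44), ("m", 46)]

def filtered_mean(cases, gender):
    """Average InstructorAge over the cases whose Gender matches."""
    return mean(age for g, age in cases if g == gender)

print(filtered_mean(cases, "f"))  # 42
print(filtered_mean(cases, "m"))  # 48
```

Changing the filter value from "f" to "m" reuses the same calculation on the other group, just as the second Plot Value expression does.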

Figure 27.

Press OK and note that a vertical bar appears on the histogram of female estimates of the instructor's age, as shown in Figure 28.

Figure 28.

Note that the mean estimate of the instructor's age by female students is approximately 52.1941 years. Now, right click the graph window again and select Plot Value from the context menu. When the Expression calculator opens, enter mean(InstructorAge, Gender="m"), as shown in Figure 29.

Figure 29.





Click the OK button in the Expression calculator and a vertical line will be drawn representing the mean (average) estimate of the instructor's age by the male members of the class, as shown in Figure 30.

Figure 30.

Note that the mean estimate of the instructor's age, as judged by the male members of the class, is higher, at approximately 45.6923 years. Of course, there is that one "outlier" who seems to think the instructor's age lies between 70 and 75 years. Select this outlier with your mouse and it will become highlighted in red, as shown in Figure 31.

Figure 31.

In the lower left corner of the Fathom window, note that in the status bar it is reported that there is one case that thinks the instructor's age lies between 71 and 73 years. Perhaps this "outlier" is strongly influencing the mean and if it weren't for this one "outlier," the male and female mean estimate would be the same? Let's find out.

Remember, Fathom is "dynamical software." Here comes a demonstration that makes the "dynamic" nature of Fathom clear. Click on the "outlier," highlighted in red in Figure 31, and "drag" the outlier to the left, watching everything update in Fathom as you change the value of the "outlier" dynamically, as shown in Figure 32.

Figure 32.

What would happen if we deleted this "outlier" from consideration? Then how would the mean estimates of the instructor's age compare? In Figure 32, right-click on the outlier, and a context menu will "pop up" as shown in Figure 33.

Figure 33.

Select Delete Case from this context menu and the "outlier" will be removed, as shown in Figure 34.

Figure 34.

Note that the male mean estimate of the instructor's age is now at 43.5833. There is less of a disparity than we see in Figure 31. This is a valuable lesson about "outliers."
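The drop in the male mean can be reproduced with a little arithmetic. The Python sketch below rests on two assumptions that the text does not state outright: that all 13 male students supplied an estimate, so the sum of their estimates is 13 times the reported mean of 45.6923, and that the deleted outlier's estimate was 71, a value inside the 71 to 73 bin reported in the status bar.

```python
n_male = 13                  # male count from the Summary Table
mean_with_outlier = 45.6923  # male mean reported before the deletion

# ASSUMPTION: every male student gave an estimate, so the total is n * mean.
total = round(n_male * mean_with_outlier)   # 594

# ASSUMPTION: the outlier's estimate was 71 (inside the reported 71-73 bin).
outlier = 71

mean_without_outlier = (total - outlier) / (n_male - 1)
print(round(mean_without_outlier, 4))  # 43.5833
```

With these assumptions the result agrees with the value of 43.5833 that Fathom reports once the case is deleted, which shows how strongly a single extreme value can pull on a mean computed from only 13 cases.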
 

Exercises

  1. Download the data set NewClassData.ftm, which contains the data gathered from your classmates, and open the file in Fathom. Use this data to repeat the histogram analysis of the instructor's age exactly as shown above.

  2. Download the census data AZ_Central112.ftm and open the file in Fathom. Double click the collection (or type Ctrl + I) to open the inspector, then select the Comment tab and read the description of the data set.

  3. Delete the graph and summary table, then drag a new graph into the Fathom window. This time, craft a histogram with income on the horizontal axis. Separate income into two groups, male and female, by dragging sex onto the vertical axis. Follow the lead in this activity to place the mean of both groups on the graph. Arrange objects in your window, Preview, then print if the preview looks good.

  4. Download the Fathom file SATGPA.ftm. Provide an analysis on SAT math scores, grouped by sex.

  5. Using SATGPA.ftm again, provide an analysis of First Year GPA, grouped by sex. Follow the lead of the previous exercise (Histogram, means, Summary Table) and obtain a printout.