Class 17 – Thursday, April 6, 2011

From Maura – I taught both sections since Ethan is out of town.

We had the exam on Tuesday and it turned out to be more difficult than we expected.  I know the students can do the problems, but something about the wording and the length of the exam threw them off track.  And they had to print from Excel, which is always a challenge in our labs with a central printing system.  So it was disappointing for both the professors and the students.  We’ll see how they do with the take-home version, where they had a chance to re-do problems or finish the exam.

Ethan’s class went a bit smoother than mine, possibly because it was my second time through the material, so I’ll focus on that.  First we reviewed linear functions and worked through the first exam problem.  The point I wanted to make was that this is an example of a linear model where the relationship between x (miles driven) and y (total cost) is completely understood.  If we drive more miles, we know exactly how much extra we add to the total cost of the car.  We also noted that in this case, there is a direct relationship (we’ll return to that later).
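
As a rough sketch of what "completely understood" means here, the Python snippet below models total cost as a fixed charge plus a per-mile rate. The 30-dollar charge and the 25-cent rate are made-up numbers for illustration, not the figures from the exam problem.

```python
# Hypothetical rates for illustration only; the exam problem's
# actual numbers are not given in this post.
FIXED_COST = 30.00     # dollars charged no matter how far we drive
COST_PER_MILE = 0.25   # dollars added for each mile driven


def total_cost(miles):
    """Linear model: y = FIXED_COST + COST_PER_MILE * x."""
    return FIXED_COST + COST_PER_MILE * miles


# Each extra mile adds exactly the slope, wherever we start from.
extra_from_100 = total_cost(101) - total_cost(100)
extra_from_500 = total_cost(501) - total_cost(500)
print(total_cost(100), extra_from_100, extra_from_500)
```

Because every point sits exactly on the line, the slope alone tells us what one more mile costs.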

Then we looked at the leaning tower of Pisa data from 1975 to 1987. This data set gives the tower’s lean, measured in meters, for each year in that period. While people located the file, we talked a bit about the tower itself: how old it is, why it leans, why it’s famous, Galileo, etc. We made a scatter plot of the data and talked about the linear trend. The most basic interpretation is: “as the years go by, the tower leans more.” Can we make this more precise? If we could find a line that passed through every point, then we’d be back to our car example and we would know everything. But no single line completely captures this data, so the best we can do is find a line that captures the trend. We drew one by hand, then had Excel do the work by fitting a trendline to the data. Easy enough, and my point, as always, is that Excel does the computations but we do the thinking.

We looked at the linear function and talked about what the slope meant (“for every additional year, there is about 0.0009 meters of additional lean”). After 10 years, the tower would lean about 0.009 meters more; after 10,000 years, it would lean 9 meters more. Right? Nonsense, and so we had the first of a list of cautions: be careful when predicting with trend lines. You should be very cautious about how far into the future (or the past) you predict.

Next we went back in time to 1970 and predicted what the lean was then; an important assumption here is that the trend already held in 1970. Ethan’s class noticed right away that the number we got didn’t fit with the rest of the data. We set that aside and tried a prediction into the future: what would the lean be in 1990, assuming the trend continues? This time the answer really didn’t make sense (both classes noticed it right away). This led to the next caution: Excel may not always give you the answer you need.
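
For anyone who wants to see what the trendline computation involves, here is a minimal Python sketch of a least-squares fit. The lean values are an assumption: they are the figures usually quoted for this data set in statistics textbooks, and the file used in class may differ. The function name `fit_line` is my own.

```python
# Assumed data: lean in meters for 1975-1987, as usually quoted
# for this data set. The file used in class may differ.
years = list(range(1975, 1988))
leans = [2.9642, 2.9644, 2.9656, 2.9667, 2.9673, 2.9688, 2.9696,
         2.9698, 2.9713, 2.9717, 2.9725, 2.9742, 2.9757]


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(years, leans)
print(f"slope: {slope:.6f} meters per year")
print(f"extra lean after 10 years: {slope * 10:.4f} m")
print(f"extra lean after 10,000 years: {slope * 10_000:.1f} m")
```

The arithmetic for 10,000 years is perfectly valid; it is the assumption that the trend holds that long that fails.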

So what went wrong?  In this case it’s a subtle point: Excel rounded the slope in the equation it displayed, and when we used the rounded number in our calculation, it threw off the answer.  We had Excel calculate the slope using the SLOPE function, then used that answer to re-do our calculations, and the predictions made a lot more sense.  The caution could be revised to read:  don’t believe everything Excel tells you; you still need to think.
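
The effect of the rounding can be reproduced outside Excel. The Python sketch below makes two assumptions: the lean values are the figures usually quoted for this data set (the class file may differ), and the displayed equation is taken to show four decimal places, which is a guess on my part about how the chart was formatted.

```python
# Assumed data: lean in meters for 1975-1987, as usually quoted
# for this data set. The file used in class may differ.
years = list(range(1975, 1988))
leans = [2.9642, 2.9644, 2.9656, 2.9667, 2.9673, 2.9688, 2.9696,
         2.9698, 2.9713, 2.9717, 2.9725, 2.9742, 2.9757]


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(years, leans)

# Assumption: the displayed equation shows four decimal places.
shown_slope = round(slope, 4)
shown_intercept = round(intercept, 4)

# Predict the lean in 1990 with the rounded and the unrounded line.
rounded_1990 = shown_slope * 1990 + shown_intercept
full_1990 = slope * 1990 + intercept

print(f"displayed equation: y = {shown_slope}x + {shown_intercept}")
print(f"1990 from rounded numbers: {rounded_1990:.4f} m")
print(f"1990 from full precision:  {full_1990:.4f} m")
print(f"smallest lean in the data: {min(leans):.4f} m")
```

Dropping a few digits from the slope looks harmless, but the slope is multiplied by a year near 2000, so the small error grows into several centimeters and the "future" prediction lands below every measured value.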

When we built the trendline (which I said was more commonly called a regression line), we displayed both the equation and the R-squared value. The students wondered what R-squared was so we talked about how well the line fit the data.  In this case R-squared is very close to 1, so the correlation is very strong.  In other words, it’s a very good fit for the data. We did not take the square root to talk about R, as we were running low on time.
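
For the curious, R-squared can be computed directly as the fraction of the variation in lean that the line accounts for. The Python sketch below again uses the lean values usually quoted for this data set, which is an assumption since the class file may differ, and it also takes the square root we skipped in class.

```python
import math

# Assumed data: lean in meters for 1975-1987, as usually quoted
# for this data set. The file used in class may differ.
years = list(range(1975, 1988))
leans = [2.9642, 2.9644, 2.9656, 2.9667, 2.9673, 2.9688, 2.9696,
         2.9698, 2.9713, 2.9717, 2.9725, 2.9742, 2.9757]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(leans) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, leans))
sxx = sum((x - mean_x) ** 2 for x in years)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation left over after the line vs. total variation in lean.
ss_res = sum((y - (slope * x + intercept)) ** 2
             for x, y in zip(years, leans))
ss_tot = sum((y - mean_y) ** 2 for y in leans)

r_squared = 1 - ss_res / ss_tot
# R carries the sign of the slope (positive here: lean grows over time).
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R-squared: {r_squared:.4f}")
print(f"R: {r:.4f}")
```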

It is important in this subject to talk about the dangers of over-interpreting (or mis-interpreting) regression and correlation.  This led to the next caution:  correlation is not the same as causation.  I gave them some rather absurd examples to think about, including the Flying Spaghetti Monster example (on the web at www.venganza.net).  A link on the website talks about how the decrease in the number of pirates has led to global warming. They make a clear and compelling connection!

We also looked at the crime rate vs. fear index graphs from the book.  I have a harder time using the examples here to make a convincing argument about correlation – the pirate example seems to capture their attention better.

