One important statistic for examining the relationship between two factors is the Pearson product-moment correlation (indicated by the symbol r), which is a measure of the strength of the relationship between the two. The r value can be within the range ____, with a higher value indicating a stronger relationship and a lower value indicating a weaker relationship.
For this homework assignment, you will take the relevant parts of your previous assignment (DataStatistics.java) and incorporate them into PearsonCorrelation.java, along with some new code. You will be expected to calculate and print six statistics for this small data set:
1. Sum(X)
2. Sum(Y)
3. Sum(X^2)
4. Sum(Y^2)
5. Sum(X*Y)
6. Pearson-r
Let the following serve as a guide to the meanings of the notations (some omitted where they can be easily inferred):
n | the number of data points (i.e., the number of x,y pairs)
ΣX | the sum of all X values, corresponding to numbers 1 and 2 above
ΣXY | the sum of the products of X and Y values, corresponding to number 5 above
ΣX² | the sum of the squares of all X values, corresponding to numbers 3 and 4 above
(ΣX)² | the squared sum of all X values (and likewise for Y values)
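The quantities in the guide above are the ingredients of the standard computational formula for Pearson's r: r = (n·ΣXY − ΣX·ΣY) / sqrt((n·ΣX² − (ΣX)²) · (n·ΣY² − (ΣY)²)). The following is a minimal sketch of that formula, not the assignment's reference solution. The class name PearsonSketch, the helper method pearsonR, and the five-point data set are all hypothetical and chosen only for illustration.

```java
public class PearsonSketch {

    // Hypothetical helper: computes Pearson's r from the five running sums.
    // Assumes both arrays have the same, non-zero length. If every X value
    // (or every Y value) is identical, the denominator is zero and the
    // result is NaN.
    static double pearsonR(long[] x, long[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }

        long n = x.length;
        long sumX = 0, sumY = 0, sumXY = 0, sumXsq = 0, sumYsq = 0;

        // One pass over the data accumulates all five sums.
        for (int i = 0; i < x.length; i++) {
            sumX   += x[i];
            sumY   += y[i];
            sumXY  += x[i] * y[i];
            sumXsq += x[i] * x[i];
            sumYsq += y[i] * y[i];
        }

        // The numerator stays in whole numbers; the denominator is cast to
        // double before multiplying so the final division is not an
        // integer division.
        long numerator = n * sumXY - sumX * sumY;
        double denominator = Math.sqrt(
                (double) (n * sumXsq - sumX * sumX)
              * (double) (n * sumYsq - sumY * sumY));

        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Illustrative data only, not the assignment's data set.
        long[] x = { 1, 2, 3, 4, 5 };
        long[] y = { 2, 4, 5, 4, 5 };
        System.out.printf("Pearson-r = %.4f%n", pearsonR(x, y));
    }
}
```

For this illustrative data the sums are n = 5, ΣX = 15, ΣY = 20, ΣXY = 66, ΣX² = 55 and ΣY² = 86, so r = 30 / sqrt(50 · 30), which is about 0.7746. Note the difference between ΣX² (square each value, then add) and (ΣX)² (add the values, then square the total); mixing the two up is an easy mistake to make in the denominator.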
public class PearsonCorrelation {
    public static void main (String[] args) {
        double pearsonR = 0.0;

        // Data source: http://www.math.hope.edu/swanson/data/tests2.txt
        long[] xValues = { 85, 99, 99, 81, 69, 84, 79, 94, 85, 64,
                           69, 67, 53, 71, 70, 74, 89, 77, 90, 83,
                           92, 89, 70, 78, 96, 99, 77, 85, 57, 74,
                           94, 81, 53, 60, 73, 87, 82, 44, 82, 54 };
        long[] yValues = { 94, 82, 95, 79, 90, 93, 81, 95, 91, 89,
                           92, 89, 82, 87, 42, 68, 84, 84, 95, 69,
                           84, 82, 70, 88, 69, 81, 98, 82, 59, 85,
                           97, 79, 69, 91, 62, 91, 98, 81, 99, 98 };

        // See comments from DataStatistics.java
        long len = xValues.length;

        // See comments from DataStatistics.java
        /*
        for (int i = 0; i < xValues.length; i++)
            System.out.printf("DP #%2d: X = %d, Y = %d%n", i+1, xValues[i], yValues[i]);
        */

        long sumX = 0, sumY = 0, sumXY = 0, sumXsq = 0, sumYsq = 0;

        // YOUR CODE GOES HERE
        // If you declare any more numeric variables, let
        // them all be of the long type (for whole numbers)
        // or the double type (for decimals). This does not
        // apply to for-loop variables, which should still
        // be of type int

        System.out.println("Sum(X) = " + sumX);
        System.out.println("Sum(Y) = " + sumY);
        System.out.println("Sum(X^2) = " + sumXsq);
        System.out.println("Sum(Y^2) = " + sumYsq);
        System.out.println("Sum(X*Y) = " + sumXY);
        System.out.printf( "Pearson-r = %.4f%n", pearsonR);
    }
}
Sum(X) = 3109
Sum(Y) = 3344
Sum(X^2) = 249187
Sum(Y^2) = 285494
Sum(X*Y) = 261410
Pearson-r = 0.2239

Question: How did the process of editing this program go for you? What were your main challenges and how did you overcome them? What did you learn that may be of use as you move along in this class?