Analysis

One variable analysis

Average income in Canada (1976-2013)

For the average income in Canada between 1976 and 2013, the mean is $69,797, the median is $67,800 and the mode is $66,400. The range is $17,900: the maximum average income was $81,400 in 2013, while the minimum was $63,500 in 1993. The mean and median are close in value, but the mean is slightly higher than the median because outliers increase the value of the mean. Looking at Figure 4, the most significant outliers are found when 1994 ≤ x ≤ 1998 and in the years 1984 and 1985. The distribution (Figure 3) is skewed right. This is supported by the box and whisker plot (Figure 2), which shows that 75% of the data is found when 63500 ≤ x ≤ 73375 (Q3 = 73375). This tells us that the data is heavily skewed to one side, so the most appropriate measure of central tendency is the median. The standard deviation is 5302, which is large, so the data is dispersed about the mean. It also tells us that most of the average income values lie within 5302 of the mean, meaning most points will be approximately between $64,500 and $75,099. The IQR covers the middle 50% of the data: IQR = Q3 - Q1 = 73375 - 65400 = 7975.

Canada’s suicide rates (1976-2013)

For the suicide rate, the mean is 12.56, the median is 12.81 and the mode is 11.30. The mean and median are close in value, but the mean is slightly smaller than the median. This is because of the mode, which sits well below both: by definition the mode is the most frequent value in the data set, so it is weighted heavily in the mean and brings its value down. In this data set, the most appropriate measure of central tendency is the median, as it is not as affected by other values as the mean is.
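The arithmetic behind the one variable figures above can be checked with a short Python sketch. It uses only the summary values reported in the text, not the underlying data set, and the function and variable names are illustrative rather than part of the original analysis.

```python
# Summary values for average income, as reported in the text (dollars).
INCOME_MEAN = 69_797
INCOME_MEDIAN = 67_800
INCOME_SD = 5_302
INCOME_MIN, INCOME_MAX = 63_500, 81_400
INCOME_Q1, INCOME_Q3 = 65_400, 73_375


def one_sd_interval(mean, sd):
    """Return the interval one standard deviation either side of the mean."""
    return mean - sd, mean + sd


def mean_median_gap(mean, median):
    """Positive when the mean is above the median, negative when below."""
    return mean - median


income_range = INCOME_MAX - INCOME_MIN
income_iqr = INCOME_Q3 - INCOME_Q1
income_low, income_high = one_sd_interval(INCOME_MEAN, INCOME_SD)

# Mean above median for income, mean below median for the suicide rate.
income_gap = mean_median_gap(INCOME_MEAN, INCOME_MEDIAN)
suicide_gap = mean_median_gap(12.56, 12.81)

print("Range:", income_range)
print("IQR:", income_iqr)
print("One SD interval:", income_low, "to", income_high)
print("Mean minus median (income):", income_gap)
print("Mean minus median (suicide rate):", round(suicide_gap, 2))
```

The interval comes out as 64,495 to 75,099, which the text rounds to $64,500 and $75,099. The sign of the mean-minus-median gap only hints at the direction of skew; the histogram and box plot remain the better evidence.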
The distribution is approximately normal: the histogram (Figure 6) has a bell-like shape and, in the box and whisker plot (Figure 5), the box is in the middle of the graph, with Q1 and Q3 at 11.4 and 13.6. The mean and median are also close in value (mean = 12.56, median = 12.81), which means the data is roughly symmetrical, consistent with the bell-shaped histogram (Figure 6). The standard deviation tells us whether the data is clustered or dispersed about the mean: the greater the standard deviation, the more dispersed the data; the smaller it is, the more clustered. In this case the standard deviation is small (1.40), so the data is clustered about the mean and there is little dispersion. In other words, most of the data lies within 1.40 of the mean. For a normal distribution, about 68% of the data falls within one standard deviation of the mean, here between 11.16 and 13.96, a very small range containing a large portion of the data points. The IQR covers the middle 50% of the data: IQR = Q3 - Q1 = 13.6 - 11.4 = 2.2, so 50% of the data is between 11.4 and 13.6.

Two variable analysis

Suicide rates vs. Average income in Canada (1976-2013)

According to the coefficient of correlation (r = -0.785), there is a strong negative linear correlation between suicide rates and average income in Canada from 1976 to 2013. This means that as the average income increases, suicide rates decrease, and vice versa. The correlation is negative because the r value is negative and the line of best fit on the scatterplot (Figure 1) has a negative slope. We know that the correlation is strong because the r value is close to -1. The coefficient of determination (r2) measures the strength of the relationship between the two variables: it is the proportion of the variation in one variable that is explained by the variation in the other.
In this case, the r2 value is 0.616, so approximately 62% of the variation in Canada’s suicide rates is explained by the variation in Canada’s average income. This means that 38% of the variation (100% - 62% = 38%) is caused by external factors (discussed in conclusion).

y = -(20.7/100,000)x + 27

This is the equation of the line of best fit in Figure 1, where x is the average income and y is the suicide rate. The line of best fit goes through as many points on the scatterplot as possible or through the center of the data points (Line of best fit, n.d). How closely the points follow the line reflects the strength of the relationship between the two variables, which the coefficient of correlation and the coefficient of determination measure. For example, if the data points are close to the line of best fit, then the coefficient of correlation (r) will be close to 1 or -1 and the correlation will be strong, but if the points are dispersed and scattered around the line, then r will be close to 0 and the correlation will be weak. If the slope of the line of best fit is negative, then the coefficient of correlation is negative; if the slope is positive, the coefficient is positive. The y-intercept is the value of y when x = 0, which is 27 here. The equation allows us to predict values that are not on the graph (extrapolation). If you substitute a point (x, y) into y - (mx + b) and the value you get is 0, then the point is on the line of best fit; if the value is less than zero the point is below the line, and if the value is greater than zero the point is above the line (Re: Detecting whether a point is above or below a slope, 2013). In this particular equation, the numerator of the slope is small compared to the denominator because of the scales of the two variables: the x-axis increases in steps of 5,000, while the y-axis increases in steps of 2. The slope is negative because the correlation is negative.
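To make the use of the line of best fit concrete, the following Python sketch evaluates the equation and applies the above/below test described above. It assumes x is the average income in dollars and y is the suicide rate, as the axis scales suggest; the function names and the sample points are hypothetical and chosen only for illustration.

```python
# Line of best fit from Figure 1: y = -(20.7/100,000)x + 27
SLOPE = -20.7 / 100_000
INTERCEPT = 27.0


def predict_rate(income):
    """Predicted suicide rate for a given average income (dollars)."""
    return SLOPE * income + INTERCEPT


def residual(income, rate):
    """Observed rate minus predicted rate, i.e. y - (mx + b)."""
    return rate - predict_rate(income)


def position(income, rate, tol=1e-9):
    """Say whether a point lies on, above or below the line of best fit."""
    r = residual(income, rate)
    if abs(r) <= tol:
        return "on the line"
    return "above the line" if r > 0 else "below the line"


# Hypothetical points, not taken from the data set.
print(round(predict_rate(70_000), 2))
print(position(70_000, 13.0))
print(position(70_000, 12.0))
```

Evaluating the line at the mean income of $69,797 gives roughly 12.55, close to the mean suicide rate of 12.56 reported in the one variable analysis, which is what we would expect from a line of best fit given the rounding of the coefficients.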
Canada’s suicide rates (1976-2013)

Looking at Figure 7, we can see that there is a strong negative linear correlation between year and suicide rate (r = -0.885). The r value is very close to -1, so the correlation is strong, and it is negative, which matches the negative slope of the line of best fit. The negative correlation means that as years go by (i.e. time increases), suicide rates decrease. The coefficient of determination (r2 = 0.783) tells us that 78% of the variation in suicide rates is explained by the variation in year.

Average income in Canada (1976-2013)

By analysing Figure 4, we can see that there is a strong positive correlation (r = 0.801) between “year” and “average income”. This means that as the years go by (i.e. time increases), the average income in Canada increases. The r value is positive, so the correlation is positive, and because it is close to 1 the correlation is strong. The r2 value is 0.642, which tells us that 64.2% of the variation in the average income in Canada is explained by the variation in year.
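The three pairs of r and r2 values quoted in the two variable analysis can be cross-checked by squaring each r. The sketch below is only a consistency check on the reported figures; the dictionary labels and the function name are illustrative.

```python
# (r, reported r2) for each relationship discussed in the text.
REPORTED = {
    "suicide rate vs. average income": (-0.785, 0.616),
    "suicide rate vs. year": (-0.885, 0.783),
    "average income vs. year": (0.801, 0.642),
}


def describe_correlation(r):
    """Return the direction of the correlation and r squared (3 d.p.)."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, round(r ** 2, 3)


summary = {label: describe_correlation(r) for label, (r, _) in REPORTED.items()}

for label, (direction, r_squared) in summary.items():
    print(f"{label}: {direction}, r2 = {r_squared}")
```

Each computed r2 agrees with the value given in the text to three decimal places.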