Graphing tips: Linear regression

Graphing the regression line

When Prism performs linear regression, it automatically superimposes the line on the graph. If you need to create additional graphs, or change which line is plotted on which graph, keep in mind that Prism treats the line generated by linear regression as a data set. You can add lines to a graph or remove lines from a graph on the 'Data sets on graph' tab of the Format Graph dialog.

Confidence and prediction bands

If you check the option box on the linear regression dialog, Prism will calculate and graph either the 95% confidence band or the 95% prediction band of the regression line.

Confidence bands

Two confidence bands surrounding the best-fit line define the confidence interval of the best-fit line.
The dashed confidence bands are curved. This does not mean that the confidence band includes the possibility of curves as well as straight lines. Rather, the curved lines are the boundaries of all possible straight lines. The figure below shows four possible linear regression lines (solid) that lie within the confidence band (dashed).
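The curvature can be seen in the standard textbook expression for the pointwise confidence band of a fitted straight line. This is offered as an illustration only; the text above does not state the exact formula Prism uses. The symbols are assumptions introduced here: n is the number of points, x̄ is the mean of the X values, ŷ(x) is the fitted line, s is the residual standard deviation, and t* is the two-tailed 95% critical value of Student's t with n - 2 degrees of freedom.

```latex
\hat{y}(x) \;\pm\; t^{*}_{n-2}\, s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^{2}}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}(x_i)\bigr)^{2}}{n - 2}}
```

The half-width is smallest at x = x̄ and grows as x moves away from x̄ in either direction, so the boundaries bend outward even though every candidate line inside them is straight.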
Given the assumptions of linear regression, you can be 95% confident that the two curved confidence bands enclose the true best-fit linear regression line, leaving a 5% chance that the true line is outside those boundaries.

Many data points will be outside the 95% confidence bands. The confidence bands are 95% sure to contain the true best-fit regression line. This is not the same as saying that they will contain 95% of the data points.

Prediction bands

Prism can also plot the 95% prediction bands. The prediction bands are further from the best-fit line than the confidence bands, and a lot further if you have many data points. The 95% prediction band is the area in which you expect 95% of all data points to fall. In contrast, the 95% confidence band is the area that has a 95% chance of containing the true regression line. This graph shows both prediction and confidence intervals (the curves defining the prediction intervals are further from the regression line).
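The difference between the two bands can be checked numerically. The following is a minimal Python sketch using the usual textbook formulas, not Prism's own code. The data, the function name `band_half_widths`, and the constant `T_CRIT_DF8` are hypothetical and chosen only for illustration. The critical t value is hard-coded from a standard t table for 8 degrees of freedom, which matches the 10 example points.

```python
import math

# Hypothetical example data (10 points); not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

# Two-tailed 95% critical value of Student's t for n - 2 = 8 degrees of
# freedom, from a standard t table (hard-coded to avoid dependencies).
T_CRIT_DF8 = 2.306


def band_half_widths(x, y, x0, t_crit):
    """Return (fitted Y, confidence half-width, prediction half-width) at x0."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual standard deviation with n - 2 degrees of freedom.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))
    # Uncertainty of the fitted line at x0; grows with distance from x_mean.
    line_term = 1.0 / n + (x0 - x_mean) ** 2 / sxx
    conf = t_crit * s * math.sqrt(line_term)
    # The prediction band adds the scatter of individual points (the "1 +").
    pred = t_crit * s * math.sqrt(1.0 + line_term)
    return intercept + slope * x0, conf, pred


for x0 in (1.0, 5.5, 10.0):
    fit, conf, pred = band_half_widths(x, y, x0, T_CRIT_DF8)
    print(f"x={x0:4.1f}  fit={fit:6.2f}  confidence ±{conf:.3f}  prediction ±{pred:.3f}")
```

The only difference between the two half-widths is the extra `1.0 +` under the square root. As the number of points grows, the line term shrinks toward zero and the confidence band narrows, while the prediction band stays roughly as wide as the scatter of the data.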
When to plot confidence and prediction bands

The confidence bands combine the confidence intervals of the slope and intercept in a visual way. Use confidence bands to learn how precisely your data define the best-fit line. Prediction bands are wider because they also include the scatter of the data. Use prediction bands when your main goal is to show the variation in your data.

Fine-tuning the appearance of the confidence and prediction bands

If you check the option on the Linear regression dialog, Prism will automatically superimpose the confidence or prediction band on the graph. To adjust the appearance of the confidence or prediction bands, go to the Format Graph dialog, select the data set that represents the best-fit curve, and adjust the error bars and area fill settings. You can also choose to fill the area enclosed by the confidence or prediction bands.
Residuals

If you check an option on the linear regression dialog, Prism will create a results table with residuals, which are the vertical distances of each point from the regression line. The X values in the residual table are identical to the X values you entered. The Y values are the residuals. A residual with a positive value means that the point is above the line; a residual with a negative value means the point is below the line.

When Prism creates the table of residuals, it also automatically makes a new graph containing the residuals and nothing else. You can treat the residuals table like any other table, and do additional analyses or make additional graphs.

If the assumptions of linear regression have been met, the residuals will be randomly scattered above and below the line at Y=0. The scatter should not vary with X. You also should not see large clusters of adjacent points that are all above or all below the Y=0 line.

See an example of residuals from nonlinear regression.
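A residual table of the kind described above can be reproduced outside Prism with a few lines of Python. This is a minimal sketch under stated assumptions: the data and the helper names `fit_line`, `residual_table`, and `longest_same_sign_run` are hypothetical, and the run-length check is only one simple way to look for clusters of adjacent same-sign residuals, not a test that Prism performs. It assumes the points are already sorted by X.

```python
# Hypothetical example data, sorted by X; not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]


def fit_line(x, y):
    """Ordinary least-squares (slope, intercept) for a straight line."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def residual_table(x, y):
    """Return (X, residual) rows, where residual = observed Y - fitted Y."""
    slope, intercept = fit_line(x, y)
    return [(xi, yi - (intercept + slope * xi)) for xi, yi in zip(x, y)]


def longest_same_sign_run(residuals):
    """Length of the longest run of adjacent residuals sharing a sign."""
    longest = current = 0
    previous_sign = 0
    for r in residuals:
        sign = (r > 0) - (r < 0)
        if sign != 0 and sign == previous_sign:
            current += 1
        else:
            current = 1 if sign != 0 else 0
        longest = max(longest, current)
        previous_sign = sign
    return longest


rows = residual_table(x, y)
for xi, r in rows:
    side = "above" if r > 0 else "below" if r < 0 else "on"
    print(f"X={xi:4.1f}  residual={r:+.3f}  ({side} the line)")
print("longest same-sign run:", longest_same_sign_run([r for _, r in rows]))
```

The X column of the result matches the input X values, and for a least-squares line with an intercept the residuals sum to zero apart from rounding. A long run of same-sign residuals relative to the number of points is a hint that a straight line may not describe the data well.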