Laboratory Informatics Consultant, Data Science Graduate Student
18h • Edited
Linear Regression for prediction in R:
r = cor(X, y)
slope = r * sd(y) / sd(X)
intercept = mean(y) - slope * mean(X)
r is the correlation coefficient and represents how closely two variables are linearly related. It ranges from -1 to 1. An r of 0 means that there is no linear relationship between the two variables, a negative r tells us that as one variable gets bigger the other tends to get smaller, and a positive r says that both variables tend to get bigger together.
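As a quick illustration of how the sign of r behaves, here is a minimal sketch on made-up vectors (the data and the names `r_pos`, `r_neg`, and `r_zero` are assumptions for the example, not from the post). The last case shows that r only picks up straight-line association: y is completely determined by x there, yet r comes out at zero.

```r
# Toy data, invented for illustration only
x <- 1:10

# y rises in a perfect straight line with x, so r is 1
r_pos <- cor(x, 2 * x + 1)

# y falls in a perfect straight line as x rises, so r is -1
r_neg <- cor(x, -3 * x + 5)

# A symmetric parabola: y depends entirely on x, but not linearly,
# so the correlation coefficient is 0 (up to floating-point error)
x_sym <- -3:3
r_zero <- cor(x_sym, x_sym^2)

print(c(r_pos = r_pos, r_neg = r_neg, r_zero = r_zero))
```

This is why a scatter plot is worth a look alongside r: a value near 0 rules out a linear trend, not every kind of relationship.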
A regression line allows us to predict y based on the relationship between X and y. This is only really helpful if X and y are linearly related quantitative variables. If there is no correlation between X and y, then r ~ 0 and slope ~ 0, so the intercept is ~ mean(y) and the best prediction of y for any X is just mean(y). This is why we want to check whether X and y are related before fitting: if they are independent, X adds nothing to the prediction.
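To see the formulas in action, here is a small sketch on invented numbers (the data and the helper `predict_y` are hypothetical, added only for illustration). It builds the line by hand, checks it against R's built-in `lm()`, and then repeats the calculation on data with zero correlation to show the prediction collapsing to mean(y).

```r
# Made-up, roughly linear data for illustration
X <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Slope and intercept from the correlation-based formulas
r <- cor(X, y)
slope <- r * sd(y) / sd(X)
intercept <- mean(y) - slope * mean(X)

# Cross-check against the least-squares fit from lm()
fit <- lm(y ~ X)
same_as_lm <- isTRUE(all.equal(unname(coef(fit)), c(intercept, slope)))

# Hypothetical helper: predict y for a new X value
predict_y <- function(x_new) intercept + slope * x_new
print(predict_y(6))

# Zero-correlation case: the slope vanishes and the line is flat at mean(y)
X0 <- -3:3
y0 <- X0^2
slope0 <- cor(X0, y0) * sd(y0) / sd(X0)
intercept0 <- mean(y0) - slope0 * mean(X0)
print(c(slope0 = slope0, intercept0 = intercept0, mean_y0 = mean(y0)))
```

The line always passes through the point (mean(X), mean(y)), so as the slope shrinks toward 0 the prediction for every X drifts toward mean(y).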