Brian P. answered 04/21/15
PhD Student Available to Tutor Math & Statistics - All Levels
You're right that logically the price could never be negative, but a house also would not have 0 bathrooms or 0 bedrooms. I wonder if square feet is another predictor - in that case, 0 square feet would be just as meaningless.
The y-intercept is the predicted price when every predictor equals zero, so it is only meaningful if it makes logical sense for all predictor variables to be zero at once.
You could make a change to your X variables - instead of using bathrooms and bedrooms, use "bathrooms beyond 1" and "bedrooms beyond 1" - so you subtract 1 from each of these and re-run your regression. The slopes stay the same; only the y-intercept changes, and it becomes the predicted price of a house with 1 bedroom and 1 bathroom. My hunch is that if you do this "re-centering" of your data, you will get a y-intercept that corresponds to a "basic, budget, minimal" house, and it will not be negative anymore.
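To make the re-centering concrete, here is a minimal sketch in Python with NumPy. The data are invented for illustration (they are not from the original question) and are built to be exactly linear so the result is easy to check; the helper name `fit` is also just an assumption of this sketch. It shows that subtracting 1 from each predictor leaves the slopes alone and shifts the intercept by the sum of the slopes.

```python
import numpy as np

# Hypothetical data, invented for illustration only.
# Prices are in thousands of dollars and follow
# price = -20 + 30*bedrooms + 40*bathrooms exactly.
bedrooms = np.array([2, 3, 3, 4, 4, 5], dtype=float)
bathrooms = np.array([1, 1, 2, 2, 3, 3], dtype=float)
price = np.array([80, 110, 150, 180, 220, 250], dtype=float)

def fit(x1, x2, y):
    """Ordinary least squares with an intercept.

    Returns the coefficients as (intercept, slope for x1, slope for x2).
    """
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Original predictors: the intercept is the prediction at 0 bedrooms, 0 bathrooms.
a, b_bed, b_bath = fit(bedrooms, bathrooms, price)

# Re-centered predictors ("beyond 1"): the intercept is now the
# prediction for a house with 1 bedroom and 1 bathroom.
a_c, b_bed_c, b_bath_c = fit(bedrooms - 1, bathrooms - 1, price)

print("original intercept:   ", round(a, 2))
print("re-centered intercept:", round(a_c, 2))
print("slopes unchanged:     ", np.allclose([b_bed, b_bath], [b_bed_c, b_bath_c]))
```

With these made-up numbers the intercept moves from about -20 to about 50, because the new intercept equals the old one plus the two slopes. Whether it turns positive for real data depends on the fitted slopes, so this is not guaranteed.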
It could also be the case that your regression equation is better at predicting the price for the majority of your data points, which are in the middle of your data set and not at its extremes. We are always cautioned against using a regression model to predict Y values for points that lie outside the range of the data used to fit it - extrapolating is a dangerous game, and the model is often not well equipped to do it. A house with 0 bedrooms and 0 bathrooms is exactly that kind of point.
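One rough way to see whether a prediction is an extrapolation is to compare the point with the observed range of each predictor. The sketch below uses invented data, and the function name `extrapolated_predictors` is hypothetical. It only checks each predictor separately, so a point can pass this check and still be far from the bulk of the data.

```python
import numpy as np

# Hypothetical predictor values, invented for illustration only.
# Columns are bedrooms and bathrooms.
X = np.array([[2, 1], [3, 1], [3, 2], [4, 2], [4, 3], [5, 3]], dtype=float)
names = ["bedrooms", "bathrooms"]

def extrapolated_predictors(X, x_new, names):
    """Return the names of predictors whose value in x_new lies outside
    the min..max range observed in the matching column of X."""
    x_new = np.asarray(x_new, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    outside = (x_new < lo) | (x_new > hi)
    return [name for name, flag in zip(names, outside) if flag]

# The y-intercept is the prediction at all zeros.
at_intercept = extrapolated_predictors(X, [0, 0], names)

# A house in the middle of the data.
typical = extrapolated_predictors(X, [3, 2], names)

print("outside the data at the intercept:", at_intercept)
print("outside the data for a typical house:", typical)
```

Here the all-zero point is flagged on both predictors, while the 3-bedroom, 2-bathroom house is not flagged at all.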
Those are the two explanations that come to mind. Hope it helps!
Brian P.
You would want to add a new column, "Bathrooms beyond 1", and it should be 1 less than the number in bathrooms. So if bathrooms is in column G, insert a column H; the formula in, say, H2 should be =G2-1, copied down for every row.
If I recall, though, Excel's regression tool requires all the predictor (X) columns to be adjacent, so you'd have to move the columns around. It might be a bit of a hassle to do this.
04/21/15