Using The Past To Predict The Future


3. Homoscedasticity in the Residuals

# Plot residuals against predictions to see whether the model is off for homes in certain price ranges
plt.scatter(df_predictions["price_Predicted"], df_Regression_No_Outliers_With_ScaledData["price_Residuals"])
plt.xlabel("Predicted Price")
plt.xticks(ticks=(300000, 500000, 700000, 900000, 1200000), labels=('$300k', '$500k', '$700k', '$900k', '$1.2M'))
plt.yticks(ticks=(-300000, -100000, 0, 300000, 500000), labels=('-$300k', '-$100k', '$0', '$300k', '$500k'))
# Red reference line at a residual of zero
plt.axhline(0, color="r");
# Breusch-Pagan test: the null hypothesis is homoscedasticity, so a p-value < .05 rejects it and suggests heteroscedasticity
from statsmodels.compat import lzip
import statsmodels.stats.api as sms
name = ['LM Statistic', 'LM-Test p-value', 'F-Statistic', 'F-Test p-value']
test = sms.het_breuschpagan(model.resid, predictors_int)
# Pair each label with its test value
lzip(name, test)

4. Normality in Distribution of Residuals

# Import appropriate libraries
import statsmodels.api as sm
import scipy.stats as stats
import pylab
# Plot a histogram and a Q-Q plot of the residuals side by side
fig, axes = plt.subplots(1, 2)
sns.histplot(model.resid, ax=axes[0])
sm.qqplot(model.resid, dist=stats.norm, line='45', fit=True, ax=axes[1])
axes[0].set_title('Histogram of Residuals')
axes[1].set_title('QQP of Residuals')
axes[1].set_ylabel("Residual Z Scores")
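
If you want a number to go with the Q-Q plot, one option is the correlation between the ordered residuals and the theoretical normal quantiles, which `scipy.stats.probplot` returns alongside the plot coordinates. The sketch below uses made-up residuals rather than `model.resid`, and the helper name `qq_correlation` is an assumption for illustration. A value very close to 1 means the points sit near a straight line.

```python
import numpy as np
from scipy import stats

# Synthetic "residuals", purely illustrative
rng = np.random.default_rng(0)
normal_resid = rng.normal(0, 1, size=1000)             # roughly normal
skewed_resid = rng.exponential(1.0, size=1000) - 1.0   # right-skewed, mean near 0

def qq_correlation(resid):
    """Correlation of the Q-Q points for a normal reference (hypothetical helper)."""
    (theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(resid, dist="norm")
    return r

r_normal = qq_correlation(normal_resid)
r_skewed = qq_correlation(skewed_resid)

print("Q-Q correlation, normal residuals:", r_normal)
print("Q-Q correlation, skewed residuals:", r_skewed)
```

The skewed series should show a visibly lower correlation, matching the curved tail you would see in its Q-Q plot.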




Russell Pihlstrom

Innovation Leader and Insight Enthusiast !