Linear_Regressions
Exploring Boston Housing Data Set
- The first step is to import the required Python libraries into the IPython Notebook.
- This data set is available in the scikit-learn (sklearn) Python module, so I will access it from there. I am going to import the Boston data set into the IPython Notebook and store it in a variable called boston.
The object boston is a dictionary-like object (a scikit-learn Bunch), so you can explore its keys.
I am going to print the feature names of the Boston data set.
- I will look at the description of this data set to learn more about it. The data set has 506 instances (rows) and 13 attributes, or parameters (columns). The goal of this exercise is to predict housing prices in the Boston region using the given features.
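The loading and exploring steps so far can be sketched as follows. This is a minimal sketch, assuming the loader meant here is `load_boston`, which only exists in scikit-learn releases before 1.2. So that the snippet still runs on newer releases, it falls back to a placeholder object of the same shape filled with random numbers. The placeholder is my own assumption and holds no real housing data.

```python
import numpy as np
from sklearn.utils import Bunch

try:
    # Only available in scikit-learn < 1.2 (removed in later releases).
    from sklearn.datasets import load_boston
    boston = load_boston()
except ImportError:
    # Assumption: a random placeholder with the same layout (506 rows,
    # 13 features). The numbers are NOT real housing data.
    names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
             "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]
    rng = np.random.default_rng(0)
    boston = Bunch(
        data=rng.random((506, 13)),
        target=rng.random(506),
        feature_names=np.array(names),
        DESCR="Placeholder description (random stand-in data).",
    )

print(boston.keys())          # keys of the dictionary-like object
print(boston.feature_names)   # names of the 13 features
print(boston.data.shape)      # (rows, columns)
print(boston.DESCR[:300])     # beginning of the data set description
```

If the placeholder branch is taken, the shapes and keys match the description above, but any model fitted on it is meaningless.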
I am going to convert boston.data into a pandas data frame called bos.
As you can see, the column names are just numbers, so I am going to replace those numbers with the feature names.
boston.target contains the housing prices.
I am going to add these target prices to the bos data frame.
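The data frame steps above might look like the sketch below. To keep it runnable on its own, it uses small made-up arrays in place of `boston.data`, `boston.feature_names` and `boston.target`. The numbers and the column name `PRICE` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up stand-ins for boston.data, boston.feature_names and boston.target.
data = np.array([[0.02, 6.5, 5.0],
                 [0.03, 6.4, 9.1],
                 [0.09, 7.1, 4.0],
                 [0.07, 6.0, 12.4]])
feature_names = np.array(["CRIM", "RM", "LSTAT"])
target = np.array([24.0, 21.6, 34.7, 18.9])

bos = pd.DataFrame(data)       # columns start out as the numbers 0, 1, 2
print(bos.columns.tolist())

bos.columns = feature_names    # replace the numbers with the feature names
bos["PRICE"] = target          # add the target prices as a new column
print(bos.head())
```

With the real data set, the same three statements apply to `boston.data`, `boston.feature_names` and `boston.target` directly.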
Scikit Learn
In this section I am going to fit a linear regression model and predict the Boston housing prices. I will use the least squares method as the way to estimate the coefficients.
Y = Boston housing prices (also called the “target” data in scikit-learn)
and
X = all the other features (or independent variables)
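To make the least squares idea concrete, here is a minimal sketch on made-up toy numbers, not the Boston data. It uses NumPy's `np.linalg.lstsq` to find the intercept and coefficients that minimise the sum of squared differences between Y and the values predicted from X. The toy data are an assumption chosen so that the answer is known in advance.

```python
import numpy as np

# Made-up toy data: y is exactly 1 + 2*x1 + 3*x2.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

# Add a column of ones so that the first coefficient acts as the intercept.
A = np.column_stack([np.ones(len(X)), X])

# Solve min ||A @ coef - y||^2 for coef.
coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
print(coef)  # close to [1. 2. 3.]
```

Because the toy Y is an exact linear function of X, the recovered values match the intercept and slopes used to build it.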
First, I am going to import LinearRegression from the scikit-learn module. Then I am going to drop the price column, as I want only the parameters as my X values. I am going to store the linear regression object in a variable called lm.
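A minimal sketch of these three steps is shown below. The small data frame is a made-up stand-in for bos, and the column name `PRICE` is an assumption.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-in for the bos data frame (features plus a PRICE column).
bos = pd.DataFrame({
    "CRIM": [0.02, 0.03, 0.09, 0.07],
    "RM": [6.5, 6.4, 7.1, 6.0],
    "LSTAT": [5.0, 9.1, 4.0, 12.4],
    "PRICE": [24.0, 21.6, 34.7, 18.9],
})

# Keep only the features as X by dropping the price column.
X = bos.drop("PRICE", axis=1)

# Create the linear regression object (not yet fitted).
lm = LinearRegression()
print(X.columns.tolist())
print(lm)
```

Note that `drop` returns a new data frame, so bos itself still contains the price column for later use as Y.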
- If you want to look inside the linear regression object, you can do so by typing LinearRegression. and then pressing the Tab key. This will give a list of the functions available inside the linear regression object.
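Outside an interactive session, where completion is not available, a similar list can be produced with Python's built-in `dir`. This is a sketch of an alternative, not the method used in the notebook. It filters out names starting with an underscore, which are meant for internal use.

```python
from sklearn.linear_model import LinearRegression

# List the public attributes and methods of the LinearRegression class.
public_names = [name for name in dir(LinearRegression)
                if not name.startswith("_")]
print(public_names)
```

The list includes methods such as `fit`, `predict` and `score`.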
To learn more, visit this link.