Figure 4-1. Models for regression and classification
This section contains the following details about the models:
• Theory of the models.
• Implementation in Scikit-learn or Keras.
• Grid search for hyperparameter tuning of the models.
• Pros and cons of the models.
NOTE
In finance, a key focus is on models that extract signals from previously observed data in order to predict future values of the same time series. This family of time series models predicts continuous output and is more aligned with supervised regression models. Time series models are covered separately in the chapter on supervised regression (Chapter 5).
Linear Regression (Ordinary Least Squares)
Linear regression (Ordinary Least Squares Regression, or OLS Regression) is perhaps one of the most well-known and best-understood algorithms in statistics and machine learning. Linear regression is a linear model, i.e., a model that assumes a linear relationship between the input variables (x) and the single output variable (y). The goal of linear regression is to train a linear model to predict a new y given a previously unseen x with as little error as possible.
Our model will be a function that predicts y given $x_1, x_2, \dots, x_i$:

$y = \beta_0 + \beta_1 x_1 + \dots + \beta_i x_i$

where $\beta_0$ is called the intercept and $\beta_1, \dots, \beta_i$ are the coefficients of the regression.
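As a quick illustration of this formula, with made-up numbers rather than estimated coefficients, a prediction is just the intercept plus a dot product:

import numpy as np

beta_0 = 1.5                  # intercept (made-up value)
beta = np.array([0.8, -0.3])  # coefficients (made-up values)
x = np.array([2.0, 4.0])      # one observation with two features
y_hat = beta_0 + beta @ x     # y = beta_0 + beta_1*x_1 + beta_2*x_2
print(y_hat)                  # 1.5 + 1.6 - 1.2 = 1.9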
Implementation in Python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X, Y)
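Here X is the matrix of input variables and Y the vector of outputs. Once fitted, the model exposes the estimated intercept and coefficients, and predict() applies the formula above to unseen data. A minimal sketch, assuming X has two feature columns:

print(model.intercept_)             # estimated intercept (beta_0)
print(model.coef_)                  # estimated coefficients (beta_1, beta_2)
print(model.predict([[5.0, 5.0]]))  # prediction for a previously unseen observation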
In the following sections, we cover the training of a linear regression model and the grid search over its hyperparameters. However, the overall concepts and related approaches are applicable to all other supervised learning models.
Training a model
As we mentioned in Chapter 3, training a model basically means retrieving the model parameters by minimizing the cost (loss) function. The two steps for training a linear regression model are:
Define a cost function (or loss function)
Measures how inaccurate the model’s predictions are. The sum of squared residuals (RSS), as defined in Equation 4-1, measures the sum of the squared differences between the actual and predicted values and is the cost function for linear regression.
Equation 4-1. Sum of squared residuals

$RSS = \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{n} \beta_j x_{ij} \right)^2$
In this equation, $\beta_0$ is the intercept; $\beta_1, \dots, \beta_j$ are the coefficients of the regression; and $x_{ij}$ represents the $i$th observation of the $j$th variable.
Find the parameters that minimize loss
That is, make our model as accurate as possible. Graphically, in two dimensions, this results in a line of best fit, as shown in Figure 4-2. In higher dimensions, we would have higher-dimensional hyperplanes. Mathematically, we look at the difference between each real data point ($y$) and our model’s prediction ($\hat{y}$). These differences are squared to avoid negative numbers and to penalize larger differences, and are then summed and averaged. This is a measure of how well our data fits the line.
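To make Equation 4-1 concrete, the RSS of a fitted model can be computed directly from its predictions. A short sketch, assuming the X, Y, and model objects from the earlier snippets:

import numpy as np

y_pred = model.predict(X)        # model predictions for the training data
rss = np.sum((Y - y_pred) ** 2)  # sum of squared residuals (Equation 4-1)
print(rss)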
Grid search
The overall idea of grid search is to create a grid of all possible hyperparameter combinations and to train the model using each one of them. Hyperparameters are the external characteristics of the model; they can be considered the model’s settings and, unlike the model parameters, are not estimated from the data. These hyperparameters are tuned during grid search to achieve better model performance.
Due to its exhaustive search, a grid search is guaranteed to find the optimal parameter combination within the grid. The drawback is that the size of the grid grows exponentially as more parameters or more candidate values are added.
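For instance, a grid over three hyperparameters with four candidate values each already requires $4^3 = 64$ model fits, multiplied by the number of cross validation folds. A toy sketch of how a grid enumerates combinations (the hyperparameter names here are illustrative and not all applicable to linear regression):

from itertools import product

# Hypothetical grid: 2 x 2 x 3 = 12 hyperparameter combinations
demo_grid = {
    'fit_intercept': [True, False],
    'positive': [True, False],
    'alpha': [0.1, 1.0, 10.0],
}
names = list(demo_grid)
combos = [dict(zip(names, values)) for values in product(*demo_grid.values())]
print(len(combos))  # 12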
The GridSearchCV class in the model_selection module of the sklearn package facilitates the systematic evaluation of all combinations of the hyperparameter values that we would like to test.
The first step is to create a model object. We then define a dictionary where the keywords name the hyperparameters and the values list the parameter settings to be tested. For linear regression, the hyperparameter is fit_intercept, which is a boolean variable that determines whether or not to calculate the intercept for this model. If set to False, no intercept will be used in calculations:
model = LinearRegression()
param_grid = {'fit_intercept': [True, False]}
The second step is to instantiate the GridSearchCV object and provide the estimator object and parameter grid, as well as a scoring method and cross validation choice, to the initialization method. Cross validation is a resampling procedure used to evaluate machine learning models, and the scoring parameter defines the evaluation metric of the model.
With all settings in place, we can fit GridSearchCV:
from sklearn.model_selection import GridSearchCV, KFold

kfold = KFold(n_splits=10)  # cross validation scheme (10 folds here as an example)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring='r2', cv=kfold)
grid_result = grid.fit(X, Y)
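With the search complete, the grid_result object reports the best hyperparameter combination and its cross-validated score:

print(grid_result.best_params_)  # e.g., {'fit_intercept': True}
print(grid_result.best_score_)   # mean cross-validated R2 of the best setting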
Advantages and disadvantages
In terms of advantages, linear regression is easy to understand and interpret. However, it may not work well when there is a nonlinear relationship between the predicted and predictor variables. Linear regression is prone to overfitting (which we will discuss in the next section), and when a large number of features are present, it may not handle irrelevant features well. Linear regression also requires the data to follow certain assumptions, such as the absence of multicollinearity. If these assumptions fail, then we cannot trust the results obtained.
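One quick way to screen for multicollinearity before trusting the fitted coefficients is to inspect the pairwise correlations between the features. A minimal sketch using NumPy (a fuller diagnosis would use variance inflation factors):

import numpy as np

# Correlation matrix of the feature columns of X; off-diagonal entries
# near +1 or -1 flag strongly collinear feature pairs
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))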
