Scikit-Learn SVR Model

Support Vector Regression (SVR) is a powerful regression algorithm capable of modeling complex, non-linear relationships. It extends the concept of Support Vector Machines (SVM) to handle continuous target variables.

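The key hyperparameters of SVR are the kernel (the function that implicitly maps the inputs into a higher-dimensional feature space), C (the regularization parameter; regularization strength is inversely proportional to C, so larger values penalize errors more heavily), and epsilon (the half-width of the tube around the prediction within which errors incur no penalty).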
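The role of epsilon can be made concrete with the epsilon-insensitive loss that SVR minimizes. The following is a minimal sketch, not scikit-learn's internal code: the helper name epsilon_insensitive_loss and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample epsilon-insensitive loss: max(0, |y - f(x)| - epsilon)."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    # residuals smaller than epsilon fall inside the tube and cost nothing
    return np.maximum(0.0, residual - epsilon)

# illustrative values, not taken from the example dataset
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 2.0])
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)
```

The first residual (0.05) is smaller than epsilon, so it contributes no loss; the other two are penalized only by the amount they exceed epsilon.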

SVR is suitable for regression problems, particularly when dealing with non-linear relationships between features and the target variable.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# generate regression dataset
X, y = make_regression(n_samples=100, n_features=1, noise=0.1, random_state=1)

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# create SVR model
model = SVR(kernel='rbf', C=100, epsilon=0.1)

# fit the model
model.fit(X_train, y_train)

# evaluate model
yhat = model.predict(X_test)
mse = mean_squared_error(y_test, yhat)
print('MSE: %.3f' % mse)

# make a prediction
row = [[0.20]]
yhat = model.predict(row)
print('Prediction: %.3f' % yhat[0])

Running the example gives an output like:

MSE: 0.143
Prediction: 16.119

The steps are as follows:

1. A synthetic regression dataset is generated using the make_regression() function. This creates a dataset with a specified number of samples (n_samples), features (n_features), noise level (noise), and a fixed random seed (random_state) for reproducibility. The dataset is split into training and test sets using train_test_split().
2. An SVR model is instantiated with an RBF kernel (kernel='rbf'), a regularization parameter of 100 (C=100), and an epsilon value of 0.1 (epsilon=0.1). The model is then fit on the training data using the fit() method.
3. The performance of the model is evaluated by comparing the predictions (yhat) to the actual values (y_test) using the mean squared error metric.
4. A single prediction can be made by passing a new data point to the predict() method.
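With 100 samples and a 20% hold-out, the evaluation step rests on only 20 test points. One way to get a more stable estimate is k-fold cross-validation. The sketch below reuses the dataset and model settings from the example; the choice of 5 folds is an assumption made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# same synthetic dataset and model settings as the example above
X, y = make_regression(n_samples=100, n_features=1, noise=0.1, random_state=1)
model = SVR(kernel='rbf', C=100, epsilon=0.1)

# scikit-learn maximizes scores, so MSE is exposed as a negated value
neg_mse = cross_val_score(model, X, y, scoring='neg_mean_squared_error', cv=5)
mse_per_fold = -neg_mse

print('MSE per fold:', mse_per_fold)
print('Mean MSE: %.3f' % mse_per_fold.mean())
```

Each fold refits a fresh copy of the model, so the spread across folds gives a rough sense of how much the score depends on the particular split.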

This example demonstrates how to create and use an SVR model with an RBF kernel for regression tasks. Note that make_regression() generates a target that is a linear function of the features plus noise, so the example exercises the SVR workflow with one fixed hyperparameter setting rather than a genuinely non-linear fit.
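To exercise the RBF kernel on a target that is non-linear by construction, the sketch below fits a linear-kernel and an RBF-kernel SVR to a noisy sine curve and compares their test error. This setup is an illustration and not part of the example above: the sine target, the sample count, the input range, and C=10 are all assumptions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# hypothetical non-linear dataset: y = sin(x) plus Gaussian noise
rng = np.random.RandomState(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# fit one model per kernel and record the test MSE
results = {}
for kernel in ('linear', 'rbf'):
    model = SVR(kernel=kernel, C=10, epsilon=0.1)
    model.fit(X_train, y_train)
    results[kernel] = mean_squared_error(y_test, model.predict(X_test))
    print('%s kernel MSE: %.3f' % (kernel, results[kernel]))
```

On data like this the linear kernel can only fit a straight line through the curve, so its error should be noticeably higher than the RBF kernel's.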

The choice of hyperparameters, such as the kernel function, regularization parameter (C), and epsilon value, can significantly influence the model’s behavior and performance. Adjusting these hyperparameters allows for fine-tuning the SVR model to the specific characteristics of the dataset and the desired balance between model complexity and generalization.
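One common way to tune these hyperparameters is a cross-validated grid search. The sketch below uses GridSearchCV on the same synthetic dataset as the example; the grid values are hypothetical and picked only to keep the search small, not recommended defaults.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=100, n_features=1, noise=0.1, random_state=1)

# hypothetical grid; values chosen for illustration only
param_grid = {
    'kernel': ['linear', 'rbf'],
    'C': [1, 10, 100],
    'epsilon': [0.01, 0.1, 1.0],
}

# 5-fold cross-validation, scored by negated MSE (higher is better)
search = GridSearchCV(SVR(), param_grid, scoring='neg_mean_squared_error', cv=5)
search.fit(X, y)

print('Best parameters:', search.best_params_)
print('Best CV MSE: %.3f' % -search.best_score_)
```

After fitting, search.best_estimator_ holds a model refit on all of the data with the best-scoring combination, and it can be used with predict() like any other SVR.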
