
Configure KNeighborsRegressor "metric" Parameter

The metric parameter in scikit-learn’s KNeighborsRegressor specifies the distance metric used to measure how far apart points are when identifying nearest neighbors.

KNeighborsRegressor is a non-parametric method that predicts the target for a given query point based on the average of the target values of its k nearest neighbors. The metric parameter affects the distance calculations and, consequently, the predictions made by the model.

The default value for metric is ‘minkowski’ with p=2, which is equivalent to the Euclidean distance. Common alternatives include ‘euclidean’, ‘manhattan’, and ‘chebyshev’.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different metric values
metric_values = ['euclidean', 'manhattan', 'chebyshev', 'minkowski']
mse_scores = []

for metric in metric_values:
    knn = KNeighborsRegressor(metric=metric, n_neighbors=5)
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"metric={metric}, Mean Squared Error: {mse:.3f}")

Running the example gives an output like:

metric=euclidean, Mean Squared Error: 3728.344
metric=manhattan, Mean Squared Error: 4261.118
metric=chebyshev, Mean Squared Error: 4363.928
metric=minkowski, Mean Squared Error: 3728.344
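
The ‘euclidean’ and ‘minkowski’ rows score identically because Minkowski distance with the default power parameter p=2 is exactly the Euclidean distance. As a minimal sketch, reusing the data and imports from the example above, the p parameter can be varied directly (the values below are arbitrary illustrations):

# Minkowski distance: p=1 is Manhattan, p=2 is Euclidean
for p in [1, 2, 3]:
    knn = KNeighborsRegressor(metric='minkowski', p=p, n_neighbors=5)
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    print(f"p={p}, Mean Squared Error: {mean_squared_error(y_test, y_pred):.3f}")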

The key steps in this example are:

  1. Generate a synthetic regression dataset with features and noise.
  2. Split the data into training and testing sets.
  3. Train KNeighborsRegressor models with different metric values.
  4. Evaluate the mean squared error of each model on the test set.
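
Rather than looping manually, metric can be treated as a hyperparameter and tuned with cross-validation. A minimal sketch using GridSearchCV on the same training data (the grid values are illustrative choices, not recommendations):

from sklearn.model_selection import GridSearchCV

# Search over metric and n_neighbors jointly, scoring by negative MSE
param_grid = {
    'metric': ['euclidean', 'manhattan', 'chebyshev'],
    'n_neighbors': [3, 5, 7]
}
grid = GridSearchCV(KNeighborsRegressor(), param_grid,
                    scoring='neg_mean_squared_error', cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, -grid.best_score_)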

Some tips and heuristics for setting metric:

  * ‘euclidean’ (or the default ‘minkowski’ with p=2) is a sensible starting point for continuous features.
  * ‘manhattan’ sums absolute differences per feature and is often a reasonable choice when individual features contain outliers.
  * Distance calculations are sensitive to feature scales, so standardize or normalize features before fitting (see the sketch after this list).
  * Treat metric as a hyperparameter and tune it with cross-validation alongside n_neighbors.
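
Because all of these metrics operate on raw feature values, scaling matters. A minimal sketch that scales features before the distance computation, using a Pipeline with StandardScaler (on this synthetic dataset the features are already on similar scales, so this is purely illustrative):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Standardize features so no single feature dominates the distance
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('knn', KNeighborsRegressor(metric='euclidean', n_neighbors=5))
])
pipe.fit(X_train, y_train)
print(f"scaled euclidean, MSE: {mean_squared_error(y_test, pipe.predict(X_test)):.3f}")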

Issues to consider:

  * In high-dimensional feature spaces, distances between points become less informative (the “curse of dimensionality”), which weakens nearest-neighbor predictions regardless of the metric chosen.
  * Some metrics suit particular data types; for example, ‘hamming’ and ‘jaccard’ are intended for boolean or categorical features rather than continuous ones.
  * metric also accepts a callable, but a Python-level distance function is evaluated pair by pair and can be much slower than the built-in metrics (see the sketch after this list).
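
A minimal sketch of passing a callable as metric (a hand-written Manhattan distance, purely for illustration):

import numpy as np

# A callable metric receives two 1D arrays and returns a single distance value
def manhattan_distance(a, b):
    return np.abs(a - b).sum()

knn = KNeighborsRegressor(metric=manhattan_distance, n_neighbors=5)
knn.fit(X_train, y_train)
print(f"callable manhattan, MSE: {mean_squared_error(y_test, knn.predict(X_test)):.3f}")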


