Configure KNeighborsRegressor "metric_params" Parameter

The metric_params parameter in scikit-learn’s KNeighborsRegressor allows passing additional parameters to the distance metric used for finding nearest neighbors.

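This is particularly useful when using a custom distance metric that accepts parameters, or a built-in metric that requires additional arguments, such as 'seuclidean' (a per-feature variance vector V) or 'mahalanobis' (a covariance matrix V or its inverse VI). Older material often cites 'wminkowski' here; that metric is no longer available in recent scikit-learn releases, where feature weights are instead passed to 'minkowski' as w.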
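As a minimal sketch of the built-in case, the snippet below passes per-feature variances to the 'seuclidean' metric through metric_params. The dataset, the feature scaling factors, and the variable names are illustrative assumptions, not part of the original example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Small synthetic dataset with deliberately different feature scales (illustrative)
X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=0)
X = X * np.array([1.0, 10.0, 100.0, 0.1, 5.0])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# 'seuclidean' needs the per-feature variances, supplied via metric_params
variances = X_train.var(axis=0)
knn_seuclidean = KNeighborsRegressor(
    n_neighbors=5,
    metric="seuclidean",
    metric_params={"V": variances},
)
knn_seuclidean.fit(X_train, y_train)
pred_seuclidean = knn_seuclidean.predict(X_test)
print(pred_seuclidean[:3])
```

Because the standardized Euclidean distance divides each squared difference by the matching variance, this should behave like plain Euclidean k-NN on features divided by their standard deviations.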

By default, metric_params is set to None, indicating that no additional parameters are passed to the metric.
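To see what the default looks like in practice, the following sketch fits an estimator without metric_params and inspects the fitted attributes effective_metric_ and effective_metric_params_. The dataset is an illustrative assumption, and the exact contents of effective_metric_params_ may vary between scikit-learn versions.

```python
from sklearn.datasets import make_regression
from sklearn.neighbors import KNeighborsRegressor

# Illustrative data; any regression dataset would do
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=0)

knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)

print(knn.metric_params)             # constructor argument, left at its default
print(knn.effective_metric_)         # metric actually used after fitting
print(knn.effective_metric_params_)  # keyword arguments actually passed to that metric
```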

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=5, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define a custom distance metric that takes a parameter
def my_metric(x, y, pp=1):
    return np.sum(np.abs(x - y) ** pp) ** (1 / pp)

# Instantiate KNeighborsRegressor with custom metric and metric_params
knn_custom = KNeighborsRegressor(n_neighbors=5, metric=my_metric, metric_params={'pp': 2})
knn_custom.fit(X_train, y_train)
y_pred_custom = knn_custom.predict(X_test)
mse_custom = mean_squared_error(y_test, y_pred_custom)
print(f"Custom metric with pp=2, MSE: {mse_custom:.3f}")

# Compare with the default configuration (Minkowski metric with p=2, i.e. Euclidean; metric_params=None)
knn_default = KNeighborsRegressor(n_neighbors=5)
knn_default.fit(X_train, y_train)
y_pred_default = knn_default.predict(X_test)
mse_default = mean_squared_error(y_test, y_pred_default)
print(f"Default metric, MSE: {mse_default:.3f}")

Running the example gives an output like:

Custom metric with pp=2, MSE: 261.960
Default metric, MSE: 261.960

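The two scores are identical because the custom metric with pp=2 computes the Euclidean distance, which is also what the default configuration uses (Minkowski with p=2). A different value of pp would select different neighbors and would generally give a different score.

The key steps in this example are: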

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Define a custom distance metric that takes a parameter pp
  4. Instantiate KNeighborsRegressor with the custom metric and metric_params={'pp': 2}
  5. Fit the model and evaluate its performance on the test set using mean squared error
  6. Compare with the performance of KNeighborsRegressor without custom metric_params
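The steps above can be repeated for several values of pp to see that metric_params changes which neighbors are selected. The sketch below does this on a smaller dataset, to keep the Python-level metric fast, and checks each result against the built-in 'minkowski' metric with the same exponent. The helper fit_and_score and the dataset size are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Same custom metric as in the example: Minkowski distance with exponent pp
def my_metric(x, y, pp=1):
    return np.sum(np.abs(x - y) ** pp) ** (1 / pp)

# Smaller dataset than the main example so the callable metric runs quickly
X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Hypothetical helper: fit a 5-neighbor regressor and return its test MSE
def fit_and_score(**knn_kwargs):
    model = KNeighborsRegressor(n_neighbors=5, **knn_kwargs)
    model.fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))

exponents = (1, 2, 3)
custom_scores = {pp: fit_and_score(metric=my_metric, metric_params={"pp": pp}) for pp in exponents}
builtin_scores = {p: fit_and_score(metric="minkowski", p=p) for p in exponents}

for pp in exponents:
    print(f"pp={pp}: custom MSE={custom_scores[pp]:.3f}, built-in MSE={builtin_scores[pp]:.3f}")
```

Since both variants compute the same distance, their scores should agree for each exponent. When a built-in metric covers the use case, passing its name as a string is generally the faster option, because a Python callable is evaluated pair by pair.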

Some tips and heuristics for using metric_params:

Issues to consider:



See Also