The max_iter parameter in scikit-learn’s SVC class controls the maximum number of iterations the solver is allowed to run before terminating.
SVC uses an iterative solver to find the optimal separating hyperplane for the given data. The max_iter parameter sets an upper bound on the number of iterations the solver will perform.
If the solver reaches the maximum number of iterations without converging, it will terminate and issue a ConvergenceWarning. This can indicate that the model has not found an optimal solution.
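One way to detect this situation programmatically is to record warnings during fitting. The snippet below is a minimal sketch, not part of the original example: the dataset size and the deliberately tiny max_iter=5 are assumptions chosen only to force early termination.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.svm import SVC

# Small synthetic dataset (sizes are arbitrary, for illustration only)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    clf = SVC(max_iter=5).fit(X, y)  # far too few iterations on purpose

# True if the solver stopped because it ran out of iterations
hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
print("hit iteration limit:", hit_limit)

# fit_status_ is 0 for a normal fit and 1 when the warning was raised
print("fit_status_:", clf.fit_status_)
```

Checking fit_status_ after fitting gives the same information without having to capture warnings.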
The default value for max_iter is -1, which means there is no limit on the number of iterations.
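As a quick check, the default can be read directly from an unconfigured estimator. This is a small sketch on an assumed toy dataset; the n_iter_ attribute is only available in more recent scikit-learn releases, so it is read defensively here.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic dataset (sizes are arbitrary, for illustration only)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

clf = SVC()  # max_iter is left at its default
print("default max_iter:", clf.max_iter)

clf.fit(X, y)  # with no iteration cap, the solver runs until it converges
print("fit_status_:", clf.fit_status_)

# n_iter_ may be missing on older versions, so fall back to None
print("iterations used:", getattr(clf, "n_iter_", None))
```

Where n_iter_ is available, the iteration count from an uncapped run gives a rough reference point for choosing a finite limit.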
In practice, values between 1000 and 10000 are commonly used depending on the size and complexity of the dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import time
# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_iter values
max_iter_values = [100, 1000, 5000, 10000]
accuracies = []
train_times = []
for max_iter in max_iter_values:
    start_time = time.time()
    svc = SVC(max_iter=max_iter, random_state=42)
    svc.fit(X_train, y_train)
    train_time = time.time() - start_time
    y_pred = svc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    train_times.append(train_time)
    print(f"max_iter={max_iter}, Accuracy: {accuracy:.3f}, Training Time: {train_time:.2f}s")
Running the example gives an output like:
max_iter=100, Accuracy: 0.709, Training Time: 0.07s
max_iter=1000, Accuracy: 0.943, Training Time: 0.54s
max_iter=5000, Accuracy: 0.944, Training Time: 0.64s
max_iter=10000, Accuracy: 0.944, Training Time: 0.65s
The key steps in this example are:
- Generate a synthetic binary classification dataset with informative and redundant features
- Split the data into train and test sets
- Train SVC models with different max_iter values
- Evaluate each model's accuracy on the test set and record its training time
Some tips and heuristics for setting max_iter:
- Increase max_iter if the solver is terminating due to reaching the iteration limit
- Higher values allow the solver to run longer, which can lead to better solutions
- Consider the trade-off between model performance and computational cost
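The first tip can be sketched as a small loop that raises the limit only while the solver keeps hitting it. The helper name fit_with_growing_budget, the starting value, and the cap are all hypothetical choices for illustration, not a scikit-learn API or a recommended setting.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.svm import SVC


def fit_with_growing_budget(X, y, start=100, cap=100_000):
    """Hypothetical helper: double max_iter until the solver converges or the cap is hit."""
    budget = start
    while True:
        clf = SVC(max_iter=budget)
        with warnings.catch_warnings():
            # Silence the warning; convergence is checked through fit_status_ instead
            warnings.simplefilter("ignore", category=ConvergenceWarning)
            clf.fit(X, y)
        if clf.fit_status_ == 0 or budget >= cap:
            return clf, budget
        budget = min(budget * 2, cap)


# Small synthetic dataset (sizes are arbitrary, for illustration only)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
clf, budget = fit_with_growing_budget(X, y)
print(f"stopped at max_iter={budget}, converged={clf.fit_status_ == 0}")
```

Each retry refits from scratch, so this trades extra training time for not having to guess a limit up front; the cap keeps the total cost bounded.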
Issues to consider:
- Setting max_iter too low can result in premature termination and suboptimal solutions
- Very high values can lead to long training times, especially for large datasets
- The optimal value depends on the size and complexity of the dataset