The max_iter parameter in scikit-learn’s SGDClassifier controls the maximum number of iterations for the stochastic gradient descent algorithm.
Stochastic Gradient Descent (SGD) is an optimization algorithm used to find the parameters that minimize the loss function of a model. It updates the parameters iteratively, estimating the gradient from one training sample at a time rather than from the full dataset.
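The max_iter parameter sets an upper limit on the number of passes over the training data (epochs). If the algorithm hasn’t converged within this limit, it stops, issues a ConvergenceWarning, and may not have found the optimal solution. Training can also stop earlier than max_iter when the stopping criterion controlled by tol is met.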
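To make the role of the epoch cap concrete, here is a minimal NumPy sketch of an SGD loop for a linear classifier with hinge loss. It is an illustration under simplifying assumptions (constant learning rate, a simplified one-epoch stopping rule), not scikit-learn's actual implementation; the function name sgd_epochs and its parameters eta and alpha are hypothetical.

```python
import numpy as np

def sgd_epochs(X, y, max_iter=1000, tol=1e-3, eta=0.01, alpha=1e-4, seed=0):
    """Toy SGD for a linear hinge-loss classifier; labels y must be in {-1, +1}.

    Returns the weights, the bias, and the number of epochs actually run.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    best_loss = np.inf
    n_epochs = 0
    for epoch in range(max_iter):                # max_iter caps the number of epochs
        n_epochs = epoch + 1
        epoch_loss = 0.0
        for i in rng.permutation(n_samples):     # one update per training sample
            margin = y[i] * (X[i] @ w + b)
            epoch_loss += max(0.0, 1.0 - margin)
            w *= 1.0 - eta * alpha               # L2 penalty shrinks the weights
            if margin < 1.0:                     # hinge loss has a nonzero gradient
                w += eta * y[i] * X[i]
                b += eta * y[i]
        epoch_loss /= n_samples
        # Simplified rule: stop at the first epoch that fails to improve by tol
        if epoch_loss > best_loss - tol:
            break
        best_loss = min(best_loss, epoch_loss)
    return w, b, n_epochs

# Small synthetic, linearly separable problem
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 5))
y_demo = np.where(X_demo[:, 0] + 0.5 * X_demo[:, 1] > 0, 1, -1)

w, b, n_epochs = sgd_epochs(X_demo, y_demo, max_iter=50)
train_acc = np.mean(np.sign(X_demo @ w + b) == y_demo)
print(f"epochs run: {n_epochs} (cap: 50), training accuracy: {train_acc:.3f}")
```

The outer loop is where the cap applies: the inner loop always visits every sample once per epoch, and training ends either when the loss stops improving or when the epoch count reaches max_iter.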
The default value for max_iter is 1000. In practice, values between 100 and 10000 are commonly used, depending on the complexity of the problem and the size of the dataset.
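As a quick check, the sketch below reads the default from the installed scikit-learn version and shows what happens when the limit is deliberately set too low. It assumes a recent scikit-learn release and uses a synthetic dataset chosen only for illustration; the variable names default_max_iter and hit_limit are hypothetical.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Read the default from the installed version instead of assuming it
default_max_iter = SGDClassifier().get_params()["max_iter"]
print(f"default max_iter: {default_max_iter}")

# Deliberately low cap: record any warnings raised while fitting
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf = SGDClassifier(max_iter=5, tol=1e-3, random_state=42)
    clf.fit(X, y)

hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
print(f"n_iter_={clf.n_iter_}, ConvergenceWarning raised: {hit_limit}")
```

A ConvergenceWarning together with n_iter_ equal to max_iter is the usual sign that the limit, not the stopping criterion, ended training.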
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different max_iter values
max_iter_values = [10, 100, 1000, 5000]
results = []
for max_iter in max_iter_values:
    sgd = SGDClassifier(max_iter=max_iter, random_state=42, tol=1e-3)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    # Treat the run as converged if it stopped before reaching the epoch limit
    converged = sgd.n_iter_ < max_iter
    results.append((max_iter, accuracy, converged))
    print(f"max_iter={max_iter}, Accuracy: {accuracy:.3f}, Converged: {converged}")
Running the example gives an output like:
max_iter=10, Accuracy: 0.720, Converged: False
max_iter=100, Accuracy: 0.770, Converged: True
max_iter=1000, Accuracy: 0.770, Converged: True
max_iter=5000, Accuracy: 0.770, Converged: True
The key steps in this example are:
- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Train SGDClassifier models with different max_iter values
- Evaluate the accuracy and convergence status of each model
- Display results comparing the effect of different max_iter values
Some tips and heuristics for setting max_iter:
- Start with the default value of 1000 and adjust based on whether the model converges
- Increase max_iter if the model hasn’t converged (fitting raises a ConvergenceWarning) and performance is still improving
- Use early stopping with early_stopping=True to hold out a validation set and stop training once the validation score stops improving, instead of tuning the number of iterations by hand
- Monitor convergence using the n_iter_ attribute after fitting, which reports how many epochs were actually run
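The early stopping tip can be sketched as follows. The dataset and the specific values of validation_fraction and n_iter_no_change are illustrative assumptions (they match the library defaults as far as I know), and the stopping point found this way is a heuristic, not a guaranteed optimum.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

clf = SGDClassifier(
    max_iter=1000,             # upper bound; early stopping may end training sooner
    tol=1e-3,                  # minimum improvement that counts as progress
    early_stopping=True,       # hold out part of the training data for validation
    validation_fraction=0.1,   # fraction of the training data held out
    n_iter_no_change=5,        # epochs without improvement before stopping
    random_state=42,
)
clf.fit(X, y)

# n_iter_ reports how many epochs were actually run
hit_cap = clf.n_iter_ >= clf.max_iter
print(f"epochs run: {clf.n_iter_} of at most {clf.max_iter}, hit cap: {hit_cap}")
```

With early stopping enabled, max_iter acts mainly as a safety limit, so it can be left generous without paying for epochs that are never run.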
Issues to consider:
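- Setting max_iter too low may result in underfitting if the model hasn’t converged
- Very high max_iter values mainly cost computation time: with tol set, training usually stops at convergence well before the limit, but with tol=None every epoch is run, which may also contribute to overfitting
- The optimal max_iter value depends on the dataset size, complexity, and learning rate
- Consider using the tol parameter in conjunction with max_iter to control the convergence criteria: training stops when the loss fails to improve by at least tol for n_iter_no_change consecutive epochs, or when max_iter is reached, whichever comes first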
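The interaction between tol and max_iter can be seen by fitting the same model twice, once with the stopping check disabled. This is a small sketch on synthetic data; the variable names fixed and adaptive are hypothetical, and the exact epoch count of the second model depends on the data and the scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# tol=None disables the loss-based stopping check, so all max_iter epochs are run
fixed = SGDClassifier(max_iter=200, tol=None, random_state=42).fit(X, y)

# With tol set, training may stop before the cap is reached
adaptive = SGDClassifier(max_iter=200, tol=1e-3, n_iter_no_change=5,
                         random_state=42).fit(X, y)

print(f"tol=None: {fixed.n_iter_} epochs")
print(f"tol=1e-3: {adaptive.n_iter_} epochs")
```

When tol is None, max_iter directly sets the training cost, so it is worth choosing deliberately; when tol is set, max_iter is only the fallback limit.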