
Scikit-Learn ExtraTreeClassifier Model

ExtraTreeClassifier is an extremely randomized tree algorithm used for classification problems. It builds a single decision tree in which split thresholds are drawn at random for the candidate features instead of being searched exhaustively. This keeps the model simple and fast to train, but it can also result in high variance.

The key hyperparameters of ExtraTreeClassifier include criterion (the function that measures the quality of a split, "gini" by default), splitter (the strategy used to choose the split at each node, "random" by default), and max_depth (the maximum depth of the tree, unlimited by default).
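As a minimal sketch of how these hyperparameters might be set, the snippet below passes them to the constructor. The specific values (entropy, a depth limit of 3, a seed of 1) are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.tree import ExtraTreeClassifier

# small synthetic dataset, used only to have something to fit
X, y = make_classification(n_samples=100, n_features=5, n_classes=2, random_state=1)

# illustrative (assumed) settings for the three hyperparameters
model = ExtraTreeClassifier(
    criterion="entropy",   # split quality measure
    splitter="random",     # draw split thresholds at random
    max_depth=3,           # cap the depth of the tree
    random_state=1,        # fix the seed so the tree is repeatable
)
model.fit(X, y)

# the fitted tree never exceeds the requested depth
print("Tree depth:", model.get_depth())
```

Limiting max_depth is one common way to rein in the variance of a single randomized tree, at the cost of a coarser model.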

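The algorithm is appropriate for binary and multi-class classification tasks. In practice, a single extremely randomized tree is most often used as a building block for ensembles such as ExtraTreesClassifier.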
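For instance, the same estimator can be fit on a problem with more than two classes. The sketch below assumes a three-class synthetic dataset; the dataset parameters are chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.tree import ExtraTreeClassifier

# three-class synthetic dataset; n_informative=3 is needed so that
# make_classification can place three classes (assumed settings)
X, y = make_classification(
    n_samples=150, n_features=5, n_informative=3, n_classes=3, random_state=1
)

model = ExtraTreeClassifier(random_state=1)
model.fit(X, y)

# class labels seen during fit
print("Classes:", model.classes_)
```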

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import ExtraTreeClassifier
from sklearn.metrics import accuracy_score

# generate a synthetic binary classification dataset
X, y = make_classification(n_samples=100, n_features=5, n_classes=2, random_state=1)

# split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# create the ExtraTreeClassifier model
model = ExtraTreeClassifier()

# fit the model on the training set
model.fit(X_train, y_train)

# make predictions on the test set
yhat = model.predict(X_test)
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)

# make a prediction on a new sample
row = [[-1.10325445, -0.49821356, -0.05962247, -0.89224592, -0.70158632]]
yhat = model.predict(row)
print('Predicted: %d' % yhat[0])

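Running the example gives output like the following. The model is created without a random_state, so the exact accuracy can vary between runs: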

Accuracy: 0.900
Predicted: 0
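One way to see how much the result depends on the random splits is to refit the model with several different seeds. This is a minimal sketch, not part of the original example; the seed range and the variable names are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import ExtraTreeClassifier

# same dataset and split as in the example
X, y = make_classification(n_samples=100, n_features=5, n_classes=2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# refit the tree with ten different seeds and record the test accuracy
scores = []
for seed in range(10):
    model = ExtraTreeClassifier(random_state=seed)
    model.fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))

print('Min accuracy: %.3f' % min(scores))
print('Max accuracy: %.3f' % max(scores))
```

Passing a fixed random_state to ExtraTreeClassifier makes a single run repeatable.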

The steps are as follows:

  1. Generate a synthetic binary classification dataset using make_classification() with a fixed random_state for reproducibility. The dataset is split into training and test sets using train_test_split().

  2. Instantiate an ExtraTreeClassifier model with default hyperparameters. The model is then fit on the training data using the fit() method.

  3. Evaluate the model’s performance by predicting the test set and calculating the accuracy score.

  4. Make a single prediction on a new data sample using the predict() method.
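With only 20 test samples, the accuracy from step 3 is a coarse estimate. As a hedged alternative, the sketch below uses k-fold cross-validation on the same synthetic dataset; the choice of 5 folds and a seed of 1 is an assumption made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import ExtraTreeClassifier

# same synthetic dataset as in the example
X, y = make_classification(n_samples=100, n_features=5, n_classes=2, random_state=1)

# seeded model so the evaluation is repeatable
model = ExtraTreeClassifier(random_state=1)

# accuracy on each of 5 folds (assumed fold count)
cv_scores = cross_val_score(model, X, y, cv=5, scoring='accuracy')
print('Mean accuracy: %.3f' % cv_scores.mean())
```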

This example demonstrates the basic steps to set up and use ExtraTreeClassifier for a simple binary classification task: generating data, fitting the model, evaluating accuracy on held-out samples, and predicting the class of a new sample.


