How to Use the Train/Test Split Method in scikit-learn

Updated: August 22, 2023

To use the train/test split method in scikit-learn, you can follow these steps:

1. First, import `train_test_split` and load your dataset. This example uses the iris dataset that ships with scikit-learn.

from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

iris = load_iris() # load the iris dataset
X = iris.data # feature matrix
y = iris.target # target vector


2. Next, split your data into training and testing sets using the `train_test_split()` function.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In this example, `test_size=0.2` holds out 20% of the samples for testing (30 of the 150 iris samples), and `random_state=42` fixes the shuffling seed so that the same split is produced on every run.
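To confirm what the split produced, a minimal sketch like the following prints the shapes of the two subsets and checks that a second call with the same seed gives the same test set. The names `X_train_again`, `X_test_again`, and `same_split` are illustrative only and are not part of the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# With test_size=0.2, 30 of the 150 samples are held out for testing.
print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)

# Repeating the call with the same random_state reproduces the same split.
# (X_train_again / X_test_again / same_split are hypothetical names.)
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=42
)
same_split = np.array_equal(X_test, X_test_again)
print("Same split:", same_split)
```

Changing or omitting `random_state` would generally give a different partition, which is worth keeping in mind when comparing results between runs.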

3. Now train a machine learning model on the training set, for example a `DecisionTreeClassifier`:

from sklearn.tree import DecisionTreeClassifier

classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)


4. Finally, evaluate the performance of your model on the testing set:

accuracy = classifier.score(X_test, y_test)
print("Accuracy:", accuracy)
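For a classifier, `score()` reports mean accuracy on the data it is given. The sketch below makes that explicit by predicting on the test set and computing the same number with `accuracy_score`. Passing `random_state=0` to the classifier is an added assumption for repeatable results and is not part of the original steps; `manual_accuracy` and `score_accuracy` are illustrative names.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: random_state=0 is added here only to make the tree repeatable.
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X_train, y_train)

# Predict on the held-out samples and compare against the true labels.
y_pred = classifier.predict(X_test)
manual_accuracy = accuracy_score(y_test, y_pred)

# score() computes the same mean accuracy in a single call.
score_accuracy = classifier.score(X_test, y_test)

print("Accuracy (accuracy_score):", manual_accuracy)
print("Accuracy (score):", score_accuracy)
```

Keeping the predictions in `y_pred` is handy when you later want other metrics from the same test set.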



By following these steps, you can use the train/test split method in scikit-learn to estimate how well a model performs on data it did not see during training.
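One optional refinement, not used in the steps above, is the `stratify` parameter of `train_test_split`, which keeps the class proportions the same in both subsets. The following is a minimal sketch on the same iris data; `train_counts` and `test_counts` are illustrative names.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 3 classes, 50 samples each

# stratify=y asks for the same class proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count how many samples of each class landed in each subset.
train_counts = Counter(y_train.tolist())
test_counts = Counter(y_test.tolist())
print("Train class counts:", dict(train_counts))
print("Test class counts:", dict(test_counts))
```

Stratifying tends to matter most for small or imbalanced datasets, where a purely random split could leave a class under-represented in the test set.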
