In Python, there are several ways to tune the hyperparameters of a machine learning model. Some commonly used ones are:
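sklearn.model_selection.GridSearchCV
Implements grid search: every combination in the parameter grid is evaluated with cross-validation.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Candidate values: 3 * 4 * 3 = 36 combinations, each scored with 5-fold CV.
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10]
}
rf = RandomForestClassifier()
# X_train and y_train are assumed to be existing training data.
grid_search = GridSearchCV(estimator=rf, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)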
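The grid search snippet assumes that X_train and y_train already exist and stops at fit(). As a minimal sketch of the full round trip, the example below substitutes synthetic data from make_classification (an assumption, not part of the original answer) and uses a deliberately smaller grid so that it runs quickly. It then reads the results back from the fitted search object.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for the real X_train / y_train (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Deliberately small grid: 2 * 2 = 4 candidates, each scored on 3 folds.
param_grid = {
    "n_estimators": [10, 50],
    "max_depth": [None, 5],
}

grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=3,
)
grid_search.fit(X_train, y_train)

print("Best parameters:", grid_search.best_params_)
print("Best mean CV accuracy:", grid_search.best_score_)
# With the default refit=True, the search object scores with the best model.
print("Held-out accuracy:", grid_search.score(X_test, y_test))
```

The same attributes (best_params_, best_score_, cv_results_) are available on RandomizedSearchCV, so the inspection step carries over unchanged.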
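sklearn.model_selection.RandomizedSearchCV
Implements random search: a fixed number of parameter settings (n_iter) is sampled from the given distributions.
from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint

# randint(low, high) samples integers from low up to, but not including, high.
param_dist = {
    'n_estimators': randint(10, 200),
    'max_depth': randint(10, 50),
    'min_samples_split': randint(2, 20)
}
rf = RandomForestClassifier()
random_search = RandomizedSearchCV(estimator=rf, param_distributions=param_dist, n_iter=100, cv=5)
random_search.fit(X_train, y_train)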
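skopt.BayesSearchCV
Implements Bayesian optimization. This class comes from the scikit-optimize package (skopt), not from sklearn.model_selection.
from skopt import BayesSearchCV
from sklearn.ensemble import RandomForestClassifier

# Integer ranges are given as (low, high). None is not a valid bound,
# so max_depth starts at 1 here.
param_space = {
    'n_estimators': (10, 200),
    'max_depth': (1, 50),
    'min_samples_split': (2, 20)
}
rf = RandomForestClassifier()
bayes_search = BayesSearchCV(estimator=rf, search_spaces=param_space, cv=5, n_iter=100)
bayes_search.fit(X_train, y_train)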
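sklearn.model_selection.GridSearchCV
or sklearn.model_selection.RandomizedSearchCV
Tunes the learning rate together with the other parameters. This requires an estimator that has a learning_rate parameter, such as GradientBoostingClassifier.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingClassifier

param_grid = {
    'n_estimators': [10, 50, 100],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10]
}
# learning_rate must be a float, so it is left at its default and set by the grid.
gb = GradientBoostingClassifier()
grid_search = GridSearchCV(estimator=gb, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)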
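optuna and hyperopt
Dedicated hyperparameter optimization libraries that can be used instead of the scikit-learn search classes.
In short, which method to choose depends on your specific needs and problem. In practice, try several methods and compare their results to find the tuning approach that best suits your model.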
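One way to compare methods, sketched under assumptions: the example below runs a grid search and a random search on the same synthetic data and reports the cross-validation score and held-out score of each. The dataset, the small search spaces, and the `searches` and `results` names are illustrative choices, not part of the original answer.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)

# Synthetic data as a stand-in for a real dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Both searches get the same budget: 4 candidates, 3-fold CV.
searches = {
    "grid": GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid={"n_estimators": [10, 50], "max_depth": [None, 5]},
        cv=3,
    ),
    "random": RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        param_distributions={
            "n_estimators": randint(10, 60),
            "max_depth": randint(2, 10),
        },
        n_iter=4,
        cv=3,
        random_state=0,
    ),
}

results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_score": search.best_score_,
        # Score on data the search never saw during tuning.
        "test_score": search.score(X_test, y_test),
    }
    print(name, results[name])
```

Keeping the number of candidates and the folds equal makes the comparison about the search strategy rather than the compute budget. On a real problem the held-out score is the fairer basis for choosing between methods.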