[Machine Learning] Tuning Models with Optuna (LightGBM, XGBoost)
1. Introduction to Optuna
While going through Kaggle competition kernels, I kept running into the same question: everyone's kernels contain a large number of hyperparameter settings, but how were those values actually chosen? I raised this in the competition Discussion section and received answers from many experienced Kagglers.
Summarizing their replies, I found that they all pointed to the optuna library. So what is Optuna? Optuna is a hyperparameter optimization framework written in Python. A minimal Optuna program involves only three core concepts: the objective function (objective), a single trial (trial), and the study (study). The objective defines the function to be optimized and specifies the ranges of its hyperparameters; a trial corresponds to a single execution of the objective; and the study manages the optimization, controlling how the search is carried out, the total number of trials, the recording of results, and so on.
Here is a simple example to build intuition:
Given f(x, y) = (x + y)² with x, y ∈ (−10, 10), find the values of x and y at which f attains its maximum.
import optuna

def objective(trial):
    # Sample x and y uniformly from (-10, 10)
    x = trial.suggest_uniform('x', -10, 10)
    y = trial.suggest_uniform('y', -10, 10)
    return (x + y) ** 2

study = optuna.create_study(direction='maximize')  # we are maximizing f
study.optimize(objective, n_trials=100)            # run 100 trials
print(study.best_params)  # best (x, y) found
print(study.best_value)   # maximum value of f found
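One note on the API: in recent Optuna releases, suggest_uniform is deprecated in favor of suggest_float (same arguments, same behavior for a plain uniform range), so new code would sample x with:

x = trial.suggest_float('x', -10, 10)  # modern equivalent of suggest_uniform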
2. LGBM and XGBoost Tuning Summary
2.1 LGBM
2.1.1 Defining the Objective
from lightgbm import LGBMRegressor
import optuna
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split, KFold
import optuna.integration.lightgbm as oplgb

def objective(trial):
    # data / target: feature matrix and labels, defined beforehand
    X_train, X_test, y_train, y_test = train_test_split(data, target, train_size=0.3)  # note: trains on 30%; you may prefer test_size=0.3
    param = {
        'metric': 'rmse',
        'random_state': 48,
        'n_estimators': 20000,
        'reg_alpha': trial.suggest_loguniform('reg_alpha', 1e-3, 10.0),
        'reg_lambda': trial.suggest_loguniform('reg_lambda', 1e-3, 10.0),
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),
        'subsample': trial.suggest_categorical('subsample', [0.4, 0.5, 0.6, 0.7, 0.8, 1.0]),
        'learning_rate': trial.suggest_categorical('learning_rate', [0.006, 0.008, 0.01, 0.014, 0.017, 0.02]),
        'max_depth': trial.suggest_categorical('max_depth', [5, 7, 9, 11, 13, 15, 17, 20, 50]),
        'num_leaves': trial.suggest_int('num_leaves', 1, 1000),
        'min_child_samples': trial.suggest_int('min_child_samples', 1, 300),
        'cat_smooth': trial.suggest_int('cat_smooth', 1, 100)
    }
    lgb = LGBMRegressor(**param)
    lgb.fit(X_train, y_train, eval_set=[(X_test, y_test)], early_stopping_rounds=100, verbose=False)
    pred_lgb = lgb.predict(X_test)
    rmse = mean_squared_error(y_test, pred_lgb, squared=False)
    return rmse
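KFold is imported above but never used. If the single random split makes the objective noisy, one option is to average the validation RMSE across folds. Below is a minimal sketch, not the article's original code: it assumes the same data/target variables as above and keeps only a couple of the search dimensions for brevity.

import numpy as np

def objective_cv(trial):
    # Hypothetical cross-validated variant of the objective above
    param = {
        'metric': 'rmse',
        'random_state': 48,
        'n_estimators': 20000,
        'learning_rate': trial.suggest_categorical('learning_rate', [0.006, 0.008, 0.01, 0.014, 0.017, 0.02]),
        'num_leaves': trial.suggest_int('num_leaves', 1, 1000),
    }
    kf = KFold(n_splits=5, shuffle=True, random_state=48)
    scores = []
    for train_idx, val_idx in kf.split(data):
        X_tr, X_val = data.iloc[train_idx], data.iloc[val_idx]
        y_tr, y_val = target.iloc[train_idx], target.iloc[val_idx]
        model = LGBMRegressor(**param)
        model.fit(X_tr, y_tr, eval_set=[(X_val, y_val)],
                  early_stopping_rounds=100, verbose=False)
        scores.append(mean_squared_error(y_val, model.predict(X_val), squared=False))
    return np.mean(scores)  # average RMSE across the 5 folds

Passing objective_cv to study.optimize works exactly the same as with the single-split version.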
2.1.2 Running the Search
study = optuna.create_study(direction='minimize')
n_trials = 50  # run 50 trials
study.optimize(objective, n_trials=n_trials)
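Incidentally, the optuna.integration.lightgbm module imported as oplgb in 2.1.1 is never used in that snippet. It provides a stepwise tuner that wraps lightgbm.train and searches the main LightGBM parameters automatically. A rough sketch, assuming the older lightgbm.train signature this article already relies on (the Dataset variables are assumptions, not from the original code):

import lightgbm
dtrain = lightgbm.Dataset(X_train, label=y_train)
dvalid = lightgbm.Dataset(X_test, label=y_test)
booster = oplgb.train({'objective': 'regression', 'metric': 'rmse'},
                      dtrain, valid_sets=[dvalid],
                      early_stopping_rounds=100, verbose_eval=False)
print(booster.params)  # parameters chosen by the stepwise tuner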
2.1.3 Plotting
optuna.visualization.plot_optimization_history(study)  # objective value over trials
optuna.visualization.plot_parallel_coordinate(study)   # parallel-coordinate view of parameters vs. score
optuna.visualization.plot_param_importances(study)     # relative importance of each hyperparameter
2.1.4 Best Parameters
params = study.best_params
params['metric'] = 'rmse'
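study.best_params contains only the sampled parameters, which is why metric has to be added back by hand; the same goes for the other fixed settings before refitting a final model. A minimal sketch, assuming the data split from 2.1.1:

params['random_state'] = 48
params['n_estimators'] = 20000
final_model = LGBMRegressor(**params)
final_model.fit(X_train, y_train, eval_set=[(X_test, y_test)],
                early_stopping_rounds=100, verbose=False)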
2.2 XGBoost
2.2.1 Defining the Objective
import xgboost as xgb

def objective(trial):
    # train: DataFrame whose last column is dropped for features, with a `target` column for labels
    data = train.iloc[:, :-1]
    target = train.target
    train_x, test_x, train_y, test_y = train_test_split(data, target, test_size=0.3, random_state=42)
    param = {
        'lambda': trial.suggest_loguniform('lambda', 1e-3, 10.0),
        'alpha': trial.suggest_loguniform('alpha', 1e-3, 10.0),
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),
        'subsample': trial.suggest_categorical('subsample', [0.4, 0.5, 0.6, 0.7, 0.8, 1.0]),
        'learning_rate': trial.suggest_categorical('learning_rate', [0.008, 0.009, 0.01, 0.012, 0.014, 0.016, 0.018, 0.02]),
        'n_estimators': 4000,
        'max_depth': trial.suggest_categorical('max_depth', [5, 7, 9, 11, 13, 15, 17, 20]),
        'random_state': trial.suggest_categorical('random_state', [24, 48, 2020]),
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 300),
    }
    model = xgb.XGBRegressor(**param)
    model.fit(train_x, train_y, eval_set=[(test_x, test_y)], early_stopping_rounds=100, verbose=False)
    preds = model.predict(test_x)
    rmse = mean_squared_error(test_y, preds, squared=False)
    return rmse
2.2.2 Running the Search
study = optuna.create_study(direction='minimize')
n_trials = 1
study.optimize(objective, n_trials=n_trials)
print('Number of finished trials:', len(study.trials))
print("------------------------------------------------")
print('Best trial:', study.best_trial.params)
print("------------------------------------------------")
print(study.trials_dataframe())
print("------------------------------------------------")
2.2.3 Plotting
optuna.visualization.plot_optimization_history(study).show()
# plot_parallel_coordinate: interactively visualizes the hyperparameters and scores
optuna.visualization.plot_parallel_coordinate(study).show()
# plot_slice: shows the evolution of the search; you can see where in the
# hyperparameter space your search went and which parts were explored more
optuna.visualization.plot_slice(study).show()
optuna.visualization.plot_contour(study, params=['alpha',
                                                 # 'max_depth',
                                                 'lambda',
                                                 'subsample',
                                                 'learning_rate']).show()
# Visualize parameter importances
optuna.visualization.plot_param_importances(study).show()
# Visualize the empirical distribution function of the objective
optuna.visualization.plot_edf(study).show()
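All of these functions return Plotly figure objects, which is why .show() is needed outside a notebook (and why the plotly package must be installed). Optuna also ships matplotlib counterparts of the same plots, for example:

import matplotlib.pyplot as plt
import optuna.visualization.matplotlib as vis_mpl

vis_mpl.plot_optimization_history(study)  # same plot, rendered with matplotlib
plt.show()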
2.2.4 Best Parameters
params = study.best_params