指点成金-最美分享吧

登录

XGBoost 输出特征重要性以及筛选特征

佚名 举报

篇首语:本文由小编为大家整理,主要介绍了XGBoost 输出特征重要性以及筛选特征相关的知识,希望对你有一定的参考价值。

1.输出XGBoost特征的重要性

from matplotlib import pyplotpyplot.bar(range(len(model_XGB.feature_importances_)), model_XGB.feature_importances_)pyplot.show()

from matplotlib import pyplotpyplot.bar(range(len(model_XGB.feature_importances_)), model_XGB.feature_importances_)pyplot.show()

也可以使用XGBoost内置的特征重要性绘图函数

# plot feature importance using built-in functionfrom xgboost import plot_importanceplot_importance(model_XGB)pyplot.show()

# plot feature importance using built-in functionfrom xgboost import plot_importanceplot_importance(model_XGB)pyplot.show()

 

2.根据特征重要性筛选特征

from numpy import sortfrom sklearn.feature_selection import SelectFromModel# Fit model using each importance as a thresholdthresholds = sort(model_XGB.feature_importances_)for thresh in thresholds:  # select features using threshold  selection = SelectFromModel(model_XGB, threshold=thresh, prefit=True)  select_X_train = selection.transform(X_train)  # train model  selection_model = XGBClassifier()  selection_model.fit(select_X_train, y_train)# eval model  select_X_test = selection.transform(X_test)  y_pred = selection_model.predict(select_X_test)  predictions = [round(value) for value in y_pred]  accuracy = accuracy_score(y_test, predictions)  print("Thresh=%.3f, n=%d, Accuracy: %.2f%%" % (thresh, select_X_train.shape[1],      accuracy*100.0))

from numpy import sortfrom sklearn.feature_selection import SelectFromModel# Fit model using each importance as a thresholdthresholds = sort(model_XGB.feature_importances_)for thresh in thresholds:  # select features using threshold  selection = SelectFromModel(model_XGB, threshold=thresh, prefit=True)  select_X_train = selection.transform(X_train)  # train model  selection_model = XGBClassifier()  selection_model.fit(select_X_train, y_train)# eval model  select_X_test = selection.transform(X_test)  y_pred = selection_model.predict(select_X_test)  predictions = [round(value) for value in y_pred]  accuracy = accuracy_score(y_test, predictions)  print("Thresh=%.3f, n=%d, Accuracy: %.2f%%" % (thresh, select_X_train.shape[1],      accuracy*100.0))

 参考:https://blog.csdn.net/u011630575/article/details/79423162

以上是关于XGBoost 输出特征重要性以及筛选特征的主要内容,如果未能解决你的问题,请参考以下文章