hyperparameter_hunter

v3.0.0 (Artemis)

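HyperparameterHunter

HyperparameterHunter Overview

Automatically save and learn from Experiment results, leading to long-term, persistent optimization that remembers all your tests.

HyperparameterHunter provides a wrapper for machine learning algorithms that saves all the important data. Simplify the experimentation and hyperparameter tuning process by letting HyperparameterHunter do the hard work of recording, organizing, and learning from your tests, all while using the same libraries you already use. Don't let any of your experiments go to waste, and start doing hyperparameter optimization the way it was meant to be done.

Installation: pip install hyperparameter-hunter
Source: https://github.com/huntermcgushion/hyperparameter_hunter
Documentation: https://hyperparameter-hunter.readthedocs.io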

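Features

Automatically record Experiment results
Truly informed hyperparameter optimization that automatically uses past Experiments
Eliminate boilerplate code for cross-validation loops, predicting, and scoring
Stop worrying about keeping track of hyperparameters, scores, or re-running the same Experiments
Use the libraries and utilities you already love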

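How to Use HyperparameterHunter

Don't think of HyperparameterHunter as another optimization library that you bring out only when it's time to do hyperparameter optimization. Of course, it does optimization, but it's better to view HyperparameterHunter as your own personal machine learning toolbox/assistant.

The idea is to start using HyperparameterHunter immediately. Run all of your benchmark/one-off experiments through it.

The more you use HyperparameterHunter, the better your results will be. If you just use it for optimization, sure, it will do what you want, but that misses the point of HyperparameterHunter.

If you have been using it for experimentation and optimization along the entire course of your project, then when you decide to do hyperparameter optimization, HyperparameterHunter is already aware of all that you have done, and that is when HyperparameterHunter does something remarkable. It doesn't start optimization from scratch like other libraries. It starts from all of the Experiments and previous optimization rounds you have already run through it.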

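Getting Started

1) Environment:

Set up an Environment to organize Experiment and optimization results.
Any Experiments or optimization rounds we perform will use our active Environment.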

from hyperparameter_hunter import Environment, CVExperiment
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

data = load_breast_cancer()
df = pd.DataFrame(data=data.data, columns=data.feature_names)
df['target'] = data.target

env = Environment(
    train_dataset=df,  # Add holdout/test dataframes, too
    results_path='path/to/results/directory',  # Where your result files will go
    metrics=['roc_auc_score'],  # Callables, or strings referring to `sklearn.metrics`
    cv_type=StratifiedKFold,  # Class, or string in `sklearn.model_selection`
    cv_params=dict(n_splits=5, shuffle=True, random_state=32)
)
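
To make concrete what the Environment takes off your hands, here is a minimal pure-Python sketch of the kind of cross-validation loop it replaces. It is not HyperparameterHunter's implementation: `stratified_folds`, `majority_class_model`, `accuracy`, and the toy labels are hypothetical stand-ins for `StratifiedKFold`, a real model, a real metric, and a real dataset.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, seed):
    """Yield (train_idx, valid_idx) pairs that keep class ratios roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)

    for k in range(n_splits):
        valid_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, valid_idx


def majority_class_model(train_labels):
    """Stand-in 'model': always predicts the most common training label."""
    return max(set(train_labels), key=train_labels.count)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


labels = [0] * 30 + [1] * 20  # Toy target column (made up for this sketch)
fold_scores = []
for train_idx, valid_idx in stratified_folds(labels, n_splits=5, seed=32):
    prediction = majority_class_model([labels[i] for i in train_idx])
    y_valid = [labels[i] for i in valid_idx]
    fold_scores.append(accuracy(y_valid, [prediction] * len(y_valid)))

mean_score = sum(fold_scores) / len(fold_scores)
```

Every experiment would otherwise need some version of this splitting, fitting, predicting, and scoring code, plus the bookkeeping to store the results.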

2) Individual Experimentation:

Perform Experiments with your favorite libraries simply by providing model initializers and hyperparameters.

Keras

from keras.callbacks import ReduceLROnPlateau
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier

# Same format used by `keras.wrappers.scikit_learn`. Nothing new to learn
def build_fn(input_shape):  # `input_shape` calculated for you
    model = Sequential([
        Dense(100, kernel_initializer='uniform', input_shape=input_shape, activation='relu'),
        Dropout(0.5),
        Dense(1, kernel_initializer='uniform', activation='sigmoid')
    ])  # All layer arguments saved (whether explicit or Keras default) for future use
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

experiment = CVExperiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,  # We interpret your build_fn to save hyperparameters in a useful, readable format
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=5)],  # Use Keras callbacks
        batch_size=32, epochs=10, verbose=0  # Fit/predict arguments
    )
)

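SKLearn

from sklearn.svm import LinearSVC

experiment = CVExperiment(
    model_initializer=LinearSVC,  # (Or any of the dozens of other SK-Learn algorithms)
    # `dual=False` added because scikit-learn rejects penalty='l1' with the dual formulation
    model_init_params=dict(penalty='l1', C=0.9, dual=False)  # Default values used and recorded for kwargs not given
)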
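
The comment about default values being "used and recorded" can be pictured with the standard library's `inspect.signature`. The sketch below is an assumption about the general idea, not HyperparameterHunter's actual code; `ToyClassifier` and `full_hyperparameters` are hypothetical names.

```python
import inspect


class ToyClassifier:
    """Hypothetical model with a few defaults, standing in for a real estimator."""

    def __init__(self, penalty="l2", C=1.0, max_iter=1000):
        self.penalty, self.C, self.max_iter = penalty, C, max_iter


def full_hyperparameters(model_initializer, model_init_params):
    """Merge explicitly given params with the initializer's signature defaults."""
    signature = inspect.signature(model_initializer)
    recorded = {
        name: parameter.default
        for name, parameter in signature.parameters.items()
        if parameter.default is not inspect.Parameter.empty
    }
    unknown = set(model_init_params) - set(signature.parameters)
    if unknown:
        raise TypeError(f"Unexpected parameters: {sorted(unknown)}")
    recorded.update(model_init_params)  # Explicit values win over defaults
    return recorded


record = full_hyperparameters(ToyClassifier, dict(penalty="l1", C=0.9))
```

Recording the complete set means two runs that differ only in whether a default was spelled out end up with the same record, which is what makes later comparison between experiments possible.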

XGBoost

from xgboost import XGBClassifier

experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(objective='reg:linear', max_depth=3, n_estimators=100, subsample=0.5)
)

LightGBM

from lightgbm import LGBMClassifier

experiment = CVExperiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(boosting_type='gbdt', num_leaves=31, max_depth=-1, min_child_samples=5, subsample=0.5)
)

CatBoost

from catboost import CatBoostClassifier

experiment = CVExperiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(iterations=500, learning_rate=0.01, depth=7, allow_writing_files=False),
    model_extra_params=dict(fit=dict(verbose=True))  # Send kwargs to `fit` and other extra methods
)

RGF

from rgf.sklearn import RGFClassifier

experiment = CVExperiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(max_leaf=1000, algorithm='RGF', min_samples_leaf=10)
)

3) Hyperparameter Optimization:

Just like Experiments, but if you want to optimize a hyperparameter, use the classes imported below.

from hyperparameter_hunter import Real, Integer, Categorical
from hyperparameter_hunter import optimization as opt

Keras

def build_fn(input_shape):
    model = Sequential([
        Dense(Integer(50, 150), input_shape=input_shape, activation='relu'),
        Dropout(Real(0.2, 0.7)),
        Dense(1, activation=Categorical(['sigmoid', 'softmax']))
    ])
    model.compile(
        optimizer=Categorical(['adam', 'rmsprop', 'sgd', 'adadelta']),
        loss='binary_crossentropy', metrics=['accuracy']
    )
    return model

optimizer = opt.RandomForestOptPro(iterations=7)
optimizer.forge_experiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=Integer(5, 10))],
        batch_size=Categorical([32, 64]),
        epochs=10, verbose=0
    )
)
optimizer.go()

SKLearn

from sklearn.ensemble import AdaBoostClassifier

optimizer = opt.DummyOptPro(iterations=42)
optimizer.forge_experiment(
    model_initializer=AdaBoostClassifier,  # (Or any of the dozens of other SKLearn algorithms)
    model_init_params=dict(
        n_estimators=Integer(75, 150),
        learning_rate=Real(0.8, 1.3),
        algorithm='SAMME.R'
    )
)
optimizer.go()

XGBoost

optimizer = opt.BayesianOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        max_depth=Integer(low=2, high=20),
        learning_rate=Real(0.0001, 0.5),
        n_estimators=200,
        subsample=0.5,
        booster=Categorical(['gbtree', 'gblinear', 'dart']),
    )
)
optimizer.go()

LightGBM

optimizer = opt.BayesianOptPro(iterations=100)
optimizer.forge_experiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(
        boosting_type=Categorical(['gbdt', 'dart']),
        num_leaves=Integer(5, 20),
        max_depth=-1,
        min_child_samples=5,
        subsample=0.5
    )
)
optimizer.go()

CatBoost

optimizer = opt.GradientBoostedRegressionTreeOptPro(iterations=32)
optimizer.forge_experiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(
        iterations=100,
        eval_metric=Categorical(['Logloss', 'Accuracy', 'AUC']),
        learning_rate=Real(low=0.0001, high=0.5),
        depth=Integer(4, 7),
        allow_writing_files=False
    )
)
optimizer.go()

RGF

optimizer = opt.ExtraTreesOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(
        max_leaf=1000,
        algorithm=Categorical(['RGF', 'RGF_Opt', 'RGF_Sib']),
        l2=Real(0.01, 0.3),
        normalize=Categorical([True, False]),
        learning_rate=Real(0.3, 0.7),
        loss=Categorical(['LS', 'Expo', 'Log', 'Abs'])
    )
)
optimizer.go()
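
To see what a search space built from `Real`, `Integer`, and `Categorical` dimensions amounts to, here is a small standalone sketch that draws candidate hyperparameters at random. The classes `IntegerDim`, `RealDim`, and `CategoricalDim` and the helper `sample_candidate` are hypothetical and only mimic the idea; they are not the library's classes, and uniform random sampling is simply the most basic strategy, not a description of how any particular `OptPro` chooses its next point.

```python
import random
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class IntegerDim:
    low: int
    high: int

    def sample(self, rng):
        return rng.randint(self.low, self.high)  # Both bounds inclusive


@dataclass(frozen=True)
class RealDim:
    low: float
    high: float

    def sample(self, rng):
        return rng.uniform(self.low, self.high)


@dataclass(frozen=True)
class CategoricalDim:
    categories: Sequence[Any]

    def sample(self, rng):
        return rng.choice(list(self.categories))


def sample_candidate(space, rng):
    """Replace every dimension with a sampled value; keep constants as they are."""
    return {
        name: value.sample(rng) if hasattr(value, "sample") else value
        for name, value in space.items()
    }


space = dict(
    max_depth=IntegerDim(2, 20),
    learning_rate=RealDim(0.0001, 0.5),
    n_estimators=200,  # A constant, not searched
    booster=CategoricalDim(("gbtree", "gblinear", "dart")),
)
rng = random.Random(32)
candidates = [sample_candidate(space, rng) for _ in range(10)]
```

Each candidate is a plain dictionary of concrete values, which is the same shape as the `model_init_params` passed to a single experiment. Constants such as `n_estimators` pass through untouched.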

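Output File Structure

This is a simple illustration of the file structure you can expect your Experiments to generate. For an in-depth description of the directory structure and the contents of the various files, see the File Structure Overview section in the documentation. The essentials are as follows:

An Experiment adds a file to each HyperparameterHunterAssets/Experiments subdirectory, named by experiment_id
Each Experiment also adds an entry to HyperparameterHunterAssets/Leaderboards/GlobalLeaderboard.csv
Customize which files are created via the Environment's file_blacklist and do_full_save kwargs (see the documentation)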

HyperparameterHunterAssets
|   Heartbeat.log
|
└───Experiments
|   |
|   └───Descriptions
|   |   |   <Files describing Experiment results, conditions, etc.>.json
|   |
|   └───Predictions<OOF/Holdout/Test>
|   |   |   <Files containing Experiment predictions for the indicated dataset>.csv
|   |
|   └───Heartbeats
|   |   |   <Files containing the log produced by the Experiment>.log
|   |
|   └───ScriptBackups
|       |   <Files containing a copy of the script that created the Experiment>.py
|
└───Leaderboards
|   |   GlobalLeaderboard.csv
|   |   <Other leaderboards>.csv
|
└───TestedKeys
|   |   <Files named by Environment key, containing hyperparameter keys>.json
|
└───KeyAttributeLookup
    |   <Files linking complex objects used in Experiments to their hashes>
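
Given this layout, saved results can be inspected with nothing but the standard library. The sketch below assumes only the directory names shown in the tree and that description files are named by experiment id. The function name `load_descriptions`, the file name `example-id.json`, and its contents are made up for the demo, and no particular keys inside real description files are assumed.

```python
import json
import tempfile
from pathlib import Path


def load_descriptions(assets_dir):
    """Map each experiment id (the file stem) to its parsed description JSON."""
    descriptions_dir = Path(assets_dir) / "Experiments" / "Descriptions"
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(descriptions_dir.glob("*.json"))
    }


# Demo on a throwaway directory that mimics the layout above
with tempfile.TemporaryDirectory() as tmp:
    descriptions_dir = Path(tmp) / "Experiments" / "Descriptions"
    descriptions_dir.mkdir(parents=True)
    (descriptions_dir / "example-id.json").write_text(
        json.dumps({"placeholder": True}), encoding="utf-8"
    )
    loaded = load_descriptions(tmp)
```

Against a real results directory you would call `load_descriptions('HyperparameterHunterAssets')`. The sketch only reads files, which matters because the library relies on these files staying unmodified.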

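Installation

pip install hyperparameter-hunter

If you like being on the cutting edge and want all the latest developments, run:

pip install git+https://github.com/HunterMcGushion/hyperparameter_hunter.git

If you want to contribute to HyperparameterHunter, see the project's GitHub repository to get started.

I Still Don't Get It

That's ok. Don't feel bad. It's a bit weird to wrap your head around. Here's an example that illustrates how everything is related: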

from hyperparameter_hunter import Environment, CVExperiment, BayesianOptPro, Integer
from hyperparameter_hunter.utils.learning_utils import get_breast_cancer_data
from xgboost import XGBClassifier

# Start by creating an `Environment` - This is where you define how Experiments (and optimization) will be conducted
env = Environment(
    train_dataset=get_breast_cancer_data(target='target'),
    results_path='HyperparameterHunterAssets',
    metrics=['roc_auc_score'],
    cv_type='StratifiedKFold',
    cv_params=dict(n_splits=10, shuffle=True, random_state=32),
)

# Now, conduct an `Experiment`
# This tells HyperparameterHunter to use the settings in the active `Environment` to train a model with these hyperparameters
experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',
        max_depth=3
    )
)

# That's it. No annoying boilerplate code to fit models and record results
# Now, the `Environment`'s `results_path` directory will contain new files describing the Experiment just conducted

# Time for the fun part. We'll set up some hyperparameter optimization by first defining the `OptPro` (Optimization Protocol) we want
optimizer = BayesianOptPro(verbose=1)

# Now we're going to say which hyperparameters we want to optimize.
# Notice how this looks just like our `experiment` above
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',  # We're setting this as a constant guideline - Not one to optimize
        max_depth=Integer(2, 10)  # Instead of using an int like the `experiment` above, we provide a space to search
    )
)
# Notice that our range for `max_depth` includes the `max_depth=3` value we used in our `experiment` earlier

optimizer.go()  # Now, we go

assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
# Here we're verifying that the `experiment` we conducted first was found by `optimizer` and used as learning material
# You can also see via the console that we found `experiment`'s saved files, and used it to start optimization

last_experiment_id = optimizer.current_experiment.experiment_id
# Let's save the id of the experiment that was just conducted by `optimizer`

optimizer.go()  # Now, we'll start up `optimizer` again...

# And we can see that this second optimization round learned from both our first `experiment` and our first optimization round
assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
assert last_experiment_id in [_[2] for _ in optimizer.similar_experiments]
# It even did all this without us having to tell it what experiments to learn from

# Now think about how much better your hyperparameter optimization will be when it learns from:
# - All your past experiments, and
# - All your past optimization rounds
# And the best part: HyperparameterHunter figures out which experiments are compatible all on its own
# You don't have to worry about telling it that KFold=5 is different from KFold=10,
# Or that max_depth=12 is outside of max_depth=Integer(2, 10)
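
The compatibility rules described in the closing comments can be sketched in plain Python. This is only an illustration of the idea under simplifying assumptions, not the library's matching logic: `IntRange`, `is_compatible`, and the record dictionaries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Hypothetical stand-in for an integer search dimension (bounds inclusive)."""
    low: int
    high: int

    def __contains__(self, value):
        return isinstance(value, int) and self.low <= value <= self.high


def is_compatible(record, environment, space):
    """A saved record is reusable if its environment matches and its values fit the space."""
    if record["environment"] != environment:
        return False
    if set(record["params"]) != set(space):
        return False
    for name, wanted in space.items():
        value = record["params"][name]
        if isinstance(wanted, IntRange):
            if value not in wanted:
                return False
        elif value != wanted:  # Constant "guideline" hyperparameter must match exactly
            return False
    return True


environment = dict(cv_type="StratifiedKFold", n_splits=10)
space = dict(objective="reg:linear", max_depth=IntRange(2, 10))

saved = [
    dict(id="a", environment=environment,
         params=dict(objective="reg:linear", max_depth=3)),
    dict(id="b", environment=environment,
         params=dict(objective="reg:linear", max_depth=12)),  # Outside the range
    dict(id="c", environment=dict(cv_type="StratifiedKFold", n_splits=5),
         params=dict(objective="reg:linear", max_depth=3)),  # Different CV scheme
]
reusable = [record["id"] for record in saved if is_compatible(record, environment, space)]
```

Only record `a` survives the filter: `b` falls outside the `max_depth` range and `c` was scored under a different cross-validation scheme, so its results are not comparable.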

Tested Libraries

Keras
scikit-learn
LightGBM
CatBoost
XGBoost
rgf_python
... More on the way

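Gotchas/FAQs

These are some things that might "getcha"

General:

Can't provide initial search points to OptPro?
- This is intentional. If you want your optimization rounds to start with specific search points (that you haven't recorded yet), simply perform a CVExperiment before initializing your OptPro
- Assuming the two have the same guideline hyperparameters and the Experiment fits within the search space defined by your OptPro, the optimizer will locate and read in the results of the Experiment
- Keep in mind that you will probably want to remove the Experiment after you have run it once, since the results have been saved. Leaving it there will just execute the same Experiment over and over again
After changing things in my "HyperparameterHunterAssets" directory, everything stopped working
- Yeah, don't do that. Especially not with "Descriptions", "Leaderboards", or "TestedKeys"
- HyperparameterHunter figures out what's going on by reading these files directly.
- Removing them or changing their contents can break a lot of HyperparameterHunter's functionality

Keras:

Can't find similar Experiments with simple Dense/Activation neural networks?
- This is likely caused by switching between using a separate Activation layer and providing a Dense layer with the activation kwarg
- Each layer is treated as its own little set of hyperparameters (as well as being a hyperparameter itself), which means that as far as HyperparameterHunter is concerned, the following two examples are NOT equivalent:
  - Dense(10, activation='sigmoid')
  - Dense(10); Activation('sigmoid')
- We're working on this, but for now, the workaround is just to be consistent with how you add activations to your models
  - Either use separate Activation layers, or provide activation kwargs to other layers, and stick with it!
Can't optimize the model.compile arguments optimizer and optimizer_params at the same time?
- This happens because Keras' optimizers expect different arguments
- For example, when optimizer=Categorical(['adam', 'rmsprop']), there are two different possible dicts of optimizer_params
- For now, you can only optimize optimizer and optimizer_params separately
- A good way to do this might be to select a few optimizers you want to test, and not provide an optimizer_params value. That way, each optimizer will use its default parameters
  - Then you can select which optimizer was the best, set optimizer=<best optimizer>, and move on to tuning optimizer_params with arguments specific to the optimizer you selected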
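
For the first Keras question above, the mismatch is easy to picture if each layer is recorded as its own small set of hyperparameters. The representation below is hypothetical (plain dictionaries built by a made-up `describe` helper, not anything HyperparameterHunter actually stores), but it shows why the two spellings do not compare equal.

```python
def describe(*layers):
    """Record a model as an ordered list of per-layer hyperparameter dicts."""
    return [dict(class_name=name, **kwargs) for name, kwargs in layers]


# Dense(10, activation='sigmoid')
inline_activation = describe(
    ("Dense", dict(units=10, activation="sigmoid")),
)

# Dense(10); Activation('sigmoid')
separate_activation = describe(
    ("Dense", dict(units=10, activation=None)),
    ("Activation", dict(activation="sigmoid")),
)

same_network = inline_activation == separate_activation  # False: one layer vs. two
```

The two networks compute the same function, yet their recorded descriptions differ in both length and content, so a comparison based on the records treats them as different experiments.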

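CatBoost:

Can't find similar Experiments for CatBoost?
- This may be happening because the default values for the kwargs expected in CatBoost's model __init__ methods are defined somewhere else, and given placeholder values of None in their signatures
- Because of this, HyperparameterHunter assumes that the default value for an argument really is None if you don't explicitly provide a value for that argument
- This is obviously not the case, but I just can't seem to figure out where the actual default values used by CatBoost are located, so if anyone knows how to remedy this situation, I would love your help!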

Additional Information

Version: v3.0.0 (Artemis)
Type: Other source code
Updated: 2025-02-24
Size: 4.09MB
Source: GitHub
