hyperparameter_hunter

v3.0.0 (Artemis)

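HyperparameterHunter

Overview

Automatically save and learn from Experiment results, leading to long-term, persistent optimization that remembers all your tests.

HyperparameterHunter provides a wrapper for machine learning algorithms that saves all the important data. Simplify the experimentation and hyperparameter tuning process by letting HyperparameterHunter do the hard work of recording, organizing, and learning from your tests, all while using the same libraries you already do. Don't let any of your experiments go to waste, and start doing hyperparameter optimization the way it was meant to be.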

Installation: pip install hyperparameter-hunter
Source: https://github.com/HunterMcGushion/hyperparameter_hunter
Documentation: https://hyperparameter-hunter.readthedocs.io

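Features

Automatically record Experiment results
Truly informed hyperparameter optimization that automatically uses past Experiments
Eliminate boilerplate code for cross-validation loops, predicting, and scoring
Stop worrying about keeping track of hyperparameters, scores, or re-running the same Experiments
Use the libraries and utilities you already love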

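How to Use HyperparameterHunter

Don't think of HyperparameterHunter as another optimization library that you bring out only when it's time to do hyperparameter optimization. Of course, it does optimization, but it's better to view HyperparameterHunter as your own personal machine learning toolbox/assistant.

The idea is to start using HyperparameterHunter immediately. Run all of your benchmark/one-off experiments through it.

The more you use HyperparameterHunter, the better your results will be. If you just use it for optimization, sure, it'll do what you want, but that's missing the point of HyperparameterHunter.

If you've been using it for experimentation and optimization along the entire course of your project, then when you decide to do hyperparameter optimization, HyperparameterHunter is already aware of all that you've done, and that's when HyperparameterHunter does something remarkable. It doesn't start optimization from scratch like other libraries. It starts from all of the Experiments and previous optimization rounds you've already run through it.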

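Getting Started

1) Environment:

Set up an Environment to organize Experiments and Optimization results.
Any Experiments or Optimization rounds we perform will use our active Environment.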

from hyperparameter_hunter import Environment, CVExperiment
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

data = load_breast_cancer()
df = pd.DataFrame(data=data.data, columns=data.feature_names)
df['target'] = data.target

env = Environment(
    train_dataset=df,  # Add holdout/test dataframes, too
    results_path='path/to/results/directory',  # Where your result files will go
    metrics=['roc_auc_score'],  # Callables, or strings referring to `sklearn.metrics`
    cv_type=StratifiedKFold,  # Class, or string in `sklearn.model_selection`
    cv_params=dict(n_splits=5, shuffle=True, random_state=32)
)
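For a sense of what the Environment settings stand in for, here is a rough sketch of the same cross-validation scheme written by hand in plain scikit-learn. It is not HyperparameterHunter's internal code. The scaled `LogisticRegression` model and the `manual_cv_auc` helper are assumptions chosen only to make the loop runnable; the fold settings mirror `cv_params` above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def manual_cv_auc(n_splits=5, random_state=32):
    """Return the out-of-fold ROC AUC and the list of per-fold ROC AUC scores."""
    X, y = load_breast_cancer(return_X_y=True)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    oof = np.zeros(len(y))  # Out-of-fold predicted probabilities
    fold_scores = []
    for train_idx, valid_idx in folds.split(X, y):
        # Stand-in model (an assumption); any estimator with `predict_proba` works here
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        model.fit(X[train_idx], y[train_idx])
        oof[valid_idx] = model.predict_proba(X[valid_idx])[:, 1]
        fold_scores.append(roc_auc_score(y[valid_idx], oof[valid_idx]))
    return roc_auc_score(y, oof), fold_scores


oof_auc, fold_scores = manual_cv_auc()
print(f"OOF ROC AUC: {oof_auc:.4f} over {len(fold_scores)} folds")
```

This is the kind of loop, plus the bookkeeping of saving scores and predictions, that the feature list says you no longer have to write yourself.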

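2) Individual Experimentation:

Perform Experiments with your favorite libraries simply by providing model initializers and hyperparameters

Keras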

from keras.callbacks import ReduceLROnPlateau
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier

# Same format used by `keras.wrappers.scikit_learn`. Nothing new to learn
def build_fn(input_shape):  # `input_shape` calculated for you
    model = Sequential([
        Dense(100, kernel_initializer='uniform', input_shape=input_shape, activation='relu'),
        Dropout(0.5),
        Dense(1, kernel_initializer='uniform', activation='sigmoid')
    ])  # All layer arguments saved (whether explicit or Keras default) for future use
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

experiment = CVExperiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,  # We interpret your build_fn to save hyperparameters in a useful, readable format
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=5)],  # Use Keras callbacks
        batch_size=32, epochs=10, verbose=0  # Fit/predict arguments
    )
)

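SKLearn

from sklearn.svm import LinearSVC

experiment = CVExperiment(
    model_initializer=LinearSVC,  # (Or any of the dozens of other SK-Learn algorithms)
    # Default values used and recorded for kwargs not given
    # `dual=False` added here: scikit-learn only supports `penalty='l1'` with the primal formulation
    model_init_params=dict(penalty='l1', C=0.9, dual=False)
)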

XGBoost

from xgboost import XGBClassifier

experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(objective='reg:linear', max_depth=3, n_estimators=100, subsample=0.5)
)

LightGBM

from lightgbm import LGBMClassifier

experiment = CVExperiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(boosting_type='gbdt', num_leaves=31, max_depth=-1, min_child_samples=5, subsample=0.5)
)

CatBoost

from catboost import CatBoostClassifier

experiment = CVExperiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(iterations=500, learning_rate=0.01, depth=7, allow_writing_files=False),
    model_extra_params=dict(fit=dict(verbose=True))  # Send kwargs to `fit` and other extra methods
)

RGF

from rgf.sklearn import RGFClassifier

experiment = CVExperiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(max_leaf=1000, algorithm='RGF', min_samples_leaf=10)
)

3) Hyperparameter Optimization:

Just like Experiments, but if you want to optimize a hyperparameter, use the classes imported below

from hyperparameter_hunter import Real, Integer, Categorical
from hyperparameter_hunter import optimization as opt

Keras

def build_fn(input_shape):
    model = Sequential([
        Dense(Integer(50, 150), input_shape=input_shape, activation='relu'),
        Dropout(Real(0.2, 0.7)),
        Dense(1, activation=Categorical(['sigmoid', 'softmax']))
    ])
    model.compile(
        optimizer=Categorical(['adam', 'rmsprop', 'sgd', 'adadelta']),
        loss='binary_crossentropy', metrics=['accuracy']
    )
    return model

optimizer = opt.RandomForestOptPro(iterations=7)
optimizer.forge_experiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=Integer(5, 10))],
        batch_size=Categorical([32, 64]),
        epochs=10, verbose=0
    )
)
optimizer.go()

SKLearn

from sklearn.ensemble import AdaBoostClassifier

optimizer = opt.DummyOptPro(iterations=42)
optimizer.forge_experiment(
    model_initializer=AdaBoostClassifier,  # (Or any of the dozens of other SKLearn algorithms)
    model_init_params=dict(
        n_estimators=Integer(75, 150),
        learning_rate=Real(0.8, 1.3),
        algorithm='SAMME.R'
    )
)
optimizer.go()

XGBoost

optimizer = opt.BayesianOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        max_depth=Integer(low=2, high=20),
        learning_rate=Real(0.0001, 0.5),
        n_estimators=200,
        subsample=0.5,
        booster=Categorical(['gbtree', 'gblinear', 'dart']),
    )
)
optimizer.go()

LightGBM

optimizer = opt.BayesianOptPro(iterations=100)
optimizer.forge_experiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(
        boosting_type=Categorical(['gbdt', 'dart']),
        num_leaves=Integer(5, 20),
        max_depth=-1,
        min_child_samples=5,
        subsample=0.5
    )
)
optimizer.go()

CatBoost

optimizer = opt.GradientBoostedRegressionTreeOptPro(iterations=32)
optimizer.forge_experiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(
        iterations=100,
        eval_metric=Categorical(['Logloss', 'Accuracy', 'AUC']),
        learning_rate=Real(low=0.0001, high=0.5),
        depth=Integer(4, 7),
        allow_writing_files=False
    )
)
optimizer.go()

RGF

optimizer = opt.ExtraTreesOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(
        max_leaf=1000,
        algorithm=Categorical(['RGF', 'RGF_Opt', 'RGF_Sib']),
        l2=Real(0.01, 0.3),
        normalize=Categorical([True, False]),
        learning_rate=Real(0.3, 0.7),
        loss=Categorical(['LS', 'Expo', 'Log', 'Abs'])
    )
)
optimizer.go()

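Output File Structure

This is a simple illustration of the file structure you can expect your Experiments to generate. For an in-depth description of the directory structure and the contents of the various files, see the File Structure Overview section in the documentation. However, the essentials are as follows:

An Experiment adds a file to each HyperparameterHunterAssets/Experiments subdirectory, named by experiment_id
Each Experiment also adds an entry to HyperparameterHunterAssets/Leaderboards/GlobalLeaderboard.csv
Customize which files are created via Environment's file_blacklist and do_full_save kwargs (described in the documentation)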

 HyperparameterHunterAssets
|   Heartbeat.log
|
└───Experiments
|   |
|   └───Descriptions
|   |   |   <Files describing Experiment results, conditions, etc.>.json
|   |
|   └───Predictions<OOF/Holdout/Test>
|   |   |   <Files containing Experiment predictions for the indicated dataset>.csv
|   |
|   └───Heartbeats
|   |   |   <Files containing the log produced by the Experiment>.log
|   |
|   └───ScriptBackups
|       |   <Files containing a copy of the script that created the Experiment>.py
|
└───Leaderboards
|   |   GlobalLeaderboard.csv
|   |   <Other leaderboards>.csv
|
└───TestedKeys
|   |   <Files named by Environment key, containing hyperparameter keys>.json
|
└───KeyAttributeLookup
    |   <Files linking complex objects used in Experiments to their hashes>
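Since the results are ordinary files, they can be inspected with standard tools. The sketch below reads the global leaderboard and lists the saved Experiment descriptions. The helper names `load_leaderboard` and `list_experiment_ids` are made up for illustration, and no particular leaderboard columns are assumed; only the paths shown in the tree above are used.

```python
from pathlib import Path

import pandas as pd


def load_leaderboard(assets_dir):
    """Read GlobalLeaderboard.csv, or return an empty frame if it does not exist yet."""
    path = Path(assets_dir) / "Leaderboards" / "GlobalLeaderboard.csv"
    if not path.is_file():
        return pd.DataFrame()
    return pd.read_csv(path)


def list_experiment_ids(assets_dir):
    """List description file names (without `.json`), which are named by experiment_id."""
    descriptions = Path(assets_dir) / "Experiments" / "Descriptions"
    if not descriptions.is_dir():
        return []
    return sorted(p.stem for p in descriptions.glob("*.json"))


# Path is an assumption: point it at your own HyperparameterHunterAssets directory
board = load_leaderboard("HyperparameterHunterAssets")
print(board.head())
print(list_experiment_ids("HyperparameterHunterAssets"))
```

Treat these files as read-only; editing or deleting them can break the library's ability to find past Experiments.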

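Installation

pip install hyperparameter-hunter

If you like being on the cutting-edge, and you want all the latest developments, run:

pip install git+https://github.com/HunterMcGushion/hyperparameter_hunter.git

If you want to contribute to HyperparameterHunter, start with the contribution guidelines in the GitHub repository.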

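I Still Don't Get It

That's ok. Don't feel bad. It's a bit weird to wrap your head around. Here's an example that illustrates how everything is related: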

from hyperparameter_hunter import Environment, CVExperiment, BayesianOptPro, Integer
from hyperparameter_hunter.utils.learning_utils import get_breast_cancer_data
from xgboost import XGBClassifier

# Start by creating an `Environment` - This is where you define how Experiments (and optimization) will be conducted
env = Environment(
    train_dataset=get_breast_cancer_data(target='target'),
    results_path='HyperparameterHunterAssets',
    metrics=['roc_auc_score'],
    cv_type='StratifiedKFold',
    cv_params=dict(n_splits=10, shuffle=True, random_state=32),
)

# Now, conduct an `Experiment`
# This tells HyperparameterHunter to use the settings in the active `Environment` to train a model with these hyperparameters
experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',
        max_depth=3
    )
)

# That's it. No annoying boilerplate code to fit models and record results
# Now, the `Environment`'s `results_path` directory will contain new files describing the Experiment just conducted

# Time for the fun part. We'll set up some hyperparameter optimization by first defining the `OptPro` (Optimization Protocol) we want
optimizer = BayesianOptPro(verbose=1)

# Now we're going to say which hyperparameters we want to optimize.
# Notice how this looks just like our `experiment` above
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',  # We're setting this as a constant guideline - Not one to optimize
        max_depth=Integer(2, 10)  # Instead of using an int like the `experiment` above, we provide a space to search
    )
)
# Notice that our range for `max_depth` includes the `max_depth=3` value we used in our `experiment` earlier

optimizer.go()  # Now, we go

assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
# Here we're verifying that the `experiment` we conducted first was found by `optimizer` and used as learning material
# You can also see via the console that we found `experiment`'s saved files, and used it to start optimization

last_experiment_id = optimizer.current_experiment.experiment_id
# Let's save the id of the experiment that was just conducted by `optimizer`

optimizer.go()  # Now, we'll start up `optimizer` again...

# And we can see that this second optimization round learned from both our first `experiment` and our first optimization round
assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
assert last_experiment_id in [_[2] for _ in optimizer.similar_experiments]
# It even did all this without us having to tell it what experiments to learn from

# Now think about how much better your hyperparameter optimization will be when it learns from:
# - All your past experiments, and
# - All your past optimization rounds
# And the best part: HyperparameterHunter figures out which experiments are compatible all on its own
# You don't have to worry about telling it that KFold=5 is different from KFold=10,
# Or that max_depth=12 is outside of max_depth=Integer(2, 10)
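To make the idea of "compatible" a little more concrete, here is a toy sketch of the check described in the closing comments above: a past Experiment is reusable only if its constant (guideline) hyperparameters match and its searched values fall inside the space. This is not HyperparameterHunter's actual matching logic. `IntRange` and `is_compatible` are hypothetical stand-ins, and inclusive bounds plus identical parameter names on both sides are assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Hypothetical stand-in for an integer search dimension with inclusive bounds."""
    low: int
    high: int

    def contains(self, value):
        return isinstance(value, int) and self.low <= value <= self.high


def is_compatible(past_params, search_space):
    """Return True if a past experiment's hyperparameters fit the given search space."""
    if past_params.keys() != search_space.keys():
        return False  # Sketch assumption: both sides name exactly the same parameters
    for name, wanted in search_space.items():
        seen = past_params[name]
        if isinstance(wanted, IntRange):
            if not wanted.contains(seen):
                return False  # Searched value lies outside the space
        elif seen != wanted:
            return False  # Constant guideline hyperparameter differs
    return True


space = dict(objective='reg:linear', max_depth=IntRange(2, 10))
print(is_compatible(dict(objective='reg:linear', max_depth=3), space))   # True
print(is_compatible(dict(objective='reg:linear', max_depth=12), space))  # False
```

The first call mirrors the `experiment` above (`max_depth=3` sits inside the range), while the second mirrors the `max_depth=12` case that would be skipped.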

Tested Libraries

Keras
scikit-learn
LightGBM
CatBoost
XGBoost
rgf_python
... More on the way

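Gotchas/FAQs

These are some things that might "getcha"

General:

Can't provide initial search points to OptPro?
- This is intentional. If you want your optimization rounds to start with specific search points (that you haven't recorded yet), simply perform a CVExperiment before initializing your OptPro
- Assuming the two have the same guideline hyperparameters and the Experiment fits within the search space defined by your OptPro, the optimizer will locate and read in the results of the Experiment
- Keep in mind, you'll probably want to remove the Experiment after you've done it once, as the results have been saved. Leaving it there will just execute the same Experiment over and over again
After changing things in my "HyperparameterHunterAssets" directory, everything stopped working
- Yeah, don't do that. Especially not with "Descriptions", "Leaderboards", or "TestedKeys"
- HyperparameterHunter figures out what's going on by reading these files directly.
- Removing them, or changing their contents, can break a lot of HyperparameterHunter's functionality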

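Keras:

Can't find similar Experiments with simple Dense/Activation neural networks?
- This is likely caused by switching between using a separate Activation layer, and providing a Dense layer with the activation kwarg
- Each layer is treated as its own little set of hyperparameters (as well as being a hyperparameter, itself), which means that as far as HyperparameterHunter is concerned, the following two examples are NOT equivalent:
  - Dense(10, activation='sigmoid')
  - Dense(10); Activation('sigmoid')
- We're working on this, but for now, the workaround is just to be consistent with how you add activations to your models
  - Either use separate Activation layers, or provide activation kwargs to other layers, and stick with it!
Can't optimize the model.compile arguments: optimizer and optimizer_params at the same time?
- This happens because Keras' optimizers expect different arguments
- For example, when optimizer=Categorical(['adam', 'rmsprop']), there are two different possible dicts of optimizer_params
- For now, you can only optimize optimizer and optimizer_params separately
- A good way to do this might be to select a few optimizers you want to test, and don't provide an optimizer_params value. That way, each optimizer will use its default parameters
  - Then you can select which optimizer was the best, and set optimizer=<best optimizer>, then move on to tuning optimizer_params, with arguments specific to the optimizer you selected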
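A small sketch may help show why the two Dense/Activation spellings above fail to match. If every layer's arguments are recorded and the whole set is reduced to a key, two architectures that compute the same function but are written differently produce different keys. The hashing scheme below (canonical JSON plus SHA-256) and the `hyperparameter_key` helper are assumptions for illustration, not the library's real implementation.

```python
import hashlib
import json


def hyperparameter_key(params):
    """Hash a JSON-serializable hyperparameter structure into a stable hex key."""
    # Sorting keys makes the result independent of dict insertion order
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Dense(10, activation='sigmoid')
fused = [{"class": "Dense", "units": 10, "activation": "sigmoid"}]

# Dense(10); Activation('sigmoid') - Dense records its default `activation=None`
split = [
    {"class": "Dense", "units": 10, "activation": None},
    {"class": "Activation", "activation": "sigmoid"},
]

key_fused = hyperparameter_key({"layers": fused})
key_split = hyperparameter_key({"layers": split})
print(key_fused == key_split)  # False
```

Under a scheme like this, consistency in how you write your layers is what keeps past Experiments discoverable.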

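CatBoost:

Can't find similar Experiments for CatBoost?
- This may be happening because the default values for the kwargs expected in CatBoost's model __init__ methods are defined somewhere else, and given placeholder values of None in their signatures
- Because of this, HyperparameterHunter assumes that the default value for an argument really is None if you don't explicitly provide a value for that argument
- This is obviously not the case, but I just can't seem to figure out where the actual default values used by CatBoost are located, so if anyone knows how to remedy this situation, I would love your help!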

Additional Information

Type: other source code
Updated: 2025-02-24
Size: 4.09 MB
Source: GitHub
