
Automatically save and learn from Experiment results, leading to long-lasting, persistent optimization that remembers all your tests.
HyperparameterHunter provides wrappers for machine learning algorithms that save all the important data. Simplify the experimentation and hyperparameter tuning process by letting HyperparameterHunter do the hard work of recording, organizing, and learning from your tests, all while using the same libraries you already do. Don't let any of your experiments go to waste, and start doing hyperparameter optimization the way it was meant to be.
pip install hyperparameter-hunter
Don't think of HyperparameterHunter as just another optimization library that you bring out only when it's time to do hyperparameter optimization. Sure, it does optimization, but it's better to view HyperparameterHunter as your own personal machine learning toolbox/assistant.
The idea is to start using HyperparameterHunter immediately. Run all your benchmark/one-off experiments through it.
The more you use HyperparameterHunter, the better your results will be. If you only use it for optimization, sure, it'll do what you want, but that's missing the point of HyperparameterHunter.
If you've been using it for experimentation and optimization throughout your project, then when you decide it's time for hyperparameter optimization, HyperparameterHunter already knows everything you've done, and that's when it does something remarkable. It doesn't start optimization from scratch like other libraries. It starts from all of the Experiments and previous optimization rounds you've already run through it.
Set up an Environment to organize Experiments and optimization results.
Any Experiments or optimization rounds we perform will use our active Environment.
from hyperparameter_hunter import Environment, CVExperiment
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

data = load_breast_cancer()
df = pd.DataFrame(data=data.data, columns=data.feature_names)
df['target'] = data.target

env = Environment(
    train_dataset=df,  # Add holdout/test dataframes, too
    results_path='path/to/results/directory',  # Where your result files will go
    metrics=['roc_auc_score'],  # Callables, or strings referring to `sklearn.metrics`
    cv_type=StratifiedKFold,  # Class, or string in `sklearn.model_selection`
    cv_params=dict(n_splits=5, shuffle=True, random_state=32)
)
Do Experiments with your favorite libraries, simply by providing model initializers and hyperparameters:
# Same format used by `keras.wrappers.scikit_learn`. Nothing new to learn
# (Model classes such as KerasClassifier, LinearSVC, XGBClassifier, etc. are assumed to be imported)
def build_fn(input_shape):  # `input_shape` calculated for you
    model = Sequential([
        Dense(100, kernel_initializer='uniform', input_shape=input_shape, activation='relu'),
        Dropout(0.5),
        Dense(1, kernel_initializer='uniform', activation='sigmoid')
    ])  # All layer arguments saved (whether explicit or Keras default) for future use
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

experiment = CVExperiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,  # We interpret your build_fn to save hyperparameters in a useful, readable format
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=5)],  # Use Keras callbacks
        batch_size=32, epochs=10, verbose=0  # Fit/predict arguments
    )
)

experiment = CVExperiment(
    model_initializer=LinearSVC,  # (Or any of the dozens of other SK-Learn algorithms)
    model_init_params=dict(penalty='l1', C=0.9)  # Default values used and recorded for kwargs not given
)

experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(objective='reg:linear', max_depth=3, n_estimators=100, subsample=0.5)
)

experiment = CVExperiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(boosting_type='gbdt', num_leaves=31, max_depth=-1, min_child_samples=5, subsample=0.5)
)

experiment = CVExperiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(iterations=500, learning_rate=0.01, depth=7, allow_writing_files=False),
    model_extra_params=dict(fit=dict(verbose=True))  # Send kwargs to `fit` and other extra methods
)

experiment = CVExperiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(max_leaf=1000, algorithm='RGF', min_samples_leaf=10)
)
Just like Experiments, but if you want to optimize hyperparameters, use the classes imported below:
from hyperparameter_hunter import Real, Integer, Categorical
from hyperparameter_hunter import optimization as opt

def build_fn(input_shape):
    model = Sequential([
        Dense(Integer(50, 150), input_shape=input_shape, activation='relu'),
        Dropout(Real(0.2, 0.7)),
        Dense(1, activation=Categorical(['sigmoid', 'softmax']))
    ])
    model.compile(
        optimizer=Categorical(['adam', 'rmsprop', 'sgd', 'adadelta']),
        loss='binary_crossentropy', metrics=['accuracy']
    )
    return model

optimizer = opt.RandomForestOptPro(iterations=7)
optimizer.forge_experiment(
    model_initializer=KerasClassifier,
    model_init_params=build_fn,
    model_extra_params=dict(
        callbacks=[ReduceLROnPlateau(patience=Integer(5, 10))],
        batch_size=Categorical([32, 64]),
        epochs=10, verbose=0
    )
)
optimizer.go()

optimizer = opt.DummyOptPro(iterations=42)
optimizer.forge_experiment(
    model_initializer=AdaBoostClassifier,  # (Or any of the dozens of other SKLearn algorithms)
    model_init_params=dict(
        n_estimators=Integer(75, 150),
        learning_rate=Real(0.8, 1.3),
        algorithm='SAMME.R'
    )
)
optimizer.go()

optimizer = opt.BayesianOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        max_depth=Integer(low=2, high=20),
        learning_rate=Real(0.0001, 0.5),
        n_estimators=200,
        subsample=0.5,
        booster=Categorical(['gbtree', 'gblinear', 'dart']),
    )
)
optimizer.go()

optimizer = opt.BayesianOptPro(iterations=100)
optimizer.forge_experiment(
    model_initializer=LGBMClassifier,
    model_init_params=dict(
        boosting_type=Categorical(['gbdt', 'dart']),
        num_leaves=Integer(5, 20),
        max_depth=-1,
        min_child_samples=5,
        subsample=0.5
    )
)
optimizer.go()

optimizer = opt.GradientBoostedRegressionTreeOptPro(iterations=32)
optimizer.forge_experiment(
    model_initializer=CatBoostClassifier,
    model_init_params=dict(
        iterations=100,
        eval_metric=Categorical(['Logloss', 'Accuracy', 'AUC']),
        learning_rate=Real(low=0.0001, high=0.5),
        depth=Integer(4, 7),
        allow_writing_files=False
    )
)
optimizer.go()

optimizer = opt.ExtraTreesOptPro(iterations=10)
optimizer.forge_experiment(
    model_initializer=RGFClassifier,
    model_init_params=dict(
        max_leaf=1000,
        algorithm=Categorical(['RGF', 'RGF_Opt', 'RGF_Sib']),
        l2=Real(0.01, 0.3),
        normalize=Categorical([True, False]),
        learning_rate=Real(0.3, 0.7),
        loss=Categorical(['LS', 'Expo', 'Log', 'Abs'])
    )
)
optimizer.go()

This is a simple illustration of the file structure you can expect your Experiments to generate. For an in-depth description of the directory structure and the contents of its various files, see the File Structure Overview section in the documentation. The important things to note are as follows:
- Each Experiment adds a file to each subdirectory of HyperparameterHunterAssets/Experiments, named by experiment_id
- Each Experiment also adds an entry to HyperparameterHunterAssets/Leaderboards/GlobalLeaderboard.csv
- Customize which files are created via Environment's file_blacklist and do_full_save kwargs (documented here); see the sketch after the tree below

HyperparameterHunterAssets
| Heartbeat.log
|
└───Experiments
| |
| └───Descriptions
| | | <Files describing Experiment results, conditions, etc.>.json
| |
| └───Predictions<OOF/Holdout/Test>
| | | <Files containing Experiment predictions for the indicated dataset>.csv
| |
| └───Heartbeats
| | | <Files containing the log produced by the Experiment>.log
| |
| └───ScriptBackups
| | <Files containing a copy of the script that created the Experiment>.py
|
└───Leaderboards
| | GlobalLeaderboard.csv
| | <Other leaderboards>.csv
|
└───TestedKeys
| | <Files named by Environment key, containing hyperparameter keys>.json
|
└───KeyAttributeLookup
| <Files linking complex objects used in Experiments to their hashes>
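As a sketch of the file_blacklist and do_full_save kwargs mentioned above: the blacklist entry name and the structure of the result dict passed to do_full_save are assumptions here, so check the Environment documentation for the exact names.

# A minimal sketch, reusing the `df` from the earlier Environment example.
# `do_full_save` receives an Experiment's result description and returns whether
# to save the Experiment's full results; the dict keys below are illustrative
def do_full_save(experiment_result):
    return experiment_result['final_evaluations']['oof']['roc_auc_score'] > 0.75

env = Environment(
    train_dataset=df,
    results_path='HyperparameterHunterAssets',
    metrics=['roc_auc_score'],
    cv_type='StratifiedKFold',
    cv_params=dict(n_splits=5, shuffle=True, random_state=32),
    file_blacklist=['script_backup'],  # Skip the ScriptBackups files (entry name assumed)
    do_full_save=do_full_save,
)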
pip install hyperparameter-hunter
If you like being on the cutting edge and want all the latest developments, run:
pip install git+https://github.com/HunterMcGushion/hyperparameter_hunter.git
If you want to contribute to HyperparameterHunter, get started here.
That's ok. Don't feel bad. It's a bit weird to wrap your head around. Here's an example that illustrates how everything is related:
from hyperparameter_hunter import Environment, CVExperiment, BayesianOptPro, Integer
from hyperparameter_hunter.utils.learning_utils import get_breast_cancer_data
from xgboost import XGBClassifier

# Start by creating an `Environment` - This is where you define how Experiments (and optimization) will be conducted
env = Environment(
    train_dataset=get_breast_cancer_data(target='target'),
    results_path='HyperparameterHunterAssets',
    metrics=['roc_auc_score'],
    cv_type='StratifiedKFold',
    cv_params=dict(n_splits=10, shuffle=True, random_state=32),
)

# Now, conduct an `Experiment`
# This tells HyperparameterHunter to use the settings in the active `Environment` to train a model with these hyperparameters
experiment = CVExperiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',
        max_depth=3
    )
)

# That's it. No annoying boilerplate code to fit models and record results
# Now, the `Environment`'s `results_path` directory will contain new files describing the Experiment just conducted

# Time for the fun part. We'll set up some hyperparameter optimization by first defining the `OptPro` (Optimization Protocol) we want
optimizer = BayesianOptPro(verbose=1)

# Now we're going to say which hyperparameters we want to optimize.
# Notice how this looks just like our `experiment` above
optimizer.forge_experiment(
    model_initializer=XGBClassifier,
    model_init_params=dict(
        objective='reg:linear',  # We're setting this as a constant guideline - Not one to optimize
        max_depth=Integer(2, 10)  # Instead of using an int like the `experiment` above, we provide a space to search
    )
)
# Notice that our range for `max_depth` includes the `max_depth=3` value we used in our `experiment` earlier

optimizer.go()  # Now, we go
assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
# Here we're verifying that the `experiment` we conducted first was found by `optimizer` and used as learning material
# You can also see via the console that we found `experiment`'s saved files, and used it to start optimization

last_experiment_id = optimizer.current_experiment.experiment_id
# Let's save the id of the experiment that was just conducted by `optimizer`

optimizer.go()  # Now, we'll start up `optimizer` again...

# And we can see that this second optimization round learned from both our first `experiment` and our first optimization round
assert experiment.experiment_id in [_[2] for _ in optimizer.similar_experiments]
assert last_experiment_id in [_[2] for _ in optimizer.similar_experiments]
# It even did all this without us having to tell it what experiments to learn from

# Now think about how much better your hyperparameter optimization will be when it learns from:
# - All your past experiments, and
# - All your past optimization rounds
# And the best part: HyperparameterHunter figures out which experiments are compatible all on its own
# You don't have to worry about telling it that KFold=5 is different from KFold=10,
# Or that max_depth=12 is outside of max_depth=Integer(2, 10)

Here are a few things that might "getcha":
- Want to provide initial search points to an OptPro? Just run them through a CVExperiment before initializing your OptPro
  - Assuming the Experiment fits in the search space defined by your OptPro, the optimizer will locate and read in the Experiment's results
  - You can remove the CVExperiment after running it once, since its results have been saved. Leaving it there will just run the same Experiment over and over again
- Can't find similar Keras Experiments? This is often caused by switching between using a separate Activation layer and giving a Dense layer the activation kwarg
  - Dense(10, activation='sigmoid') is recorded differently from Dense(10); Activation('sigmoid')
  - Pick one format (a separate Activation layer, or the activation kwarg given to other layers) and stick with it!
- Be careful when optimizing both the optimizer and optimizer_params arguments of model.compile
  - Because different Keras optimizers expect different arguments, with optimizer=Categorical(['adam', 'rmsprop']) there are two different sets of possibilities for optimizer_params
  - Optimize optimizer and optimizer_params separately: first leave out optimizer_params, so each optimizer uses its default parameters
  - Then figure out which optimizer is best, set optimizer=<best optimizer>, and move on to tuning optimizer_params, with arguments specific to the optimizer you chose
- Some libraries define their real __init__ defaults elsewhere, and give those arguments a placeholder value of None in their signatures
  - The recorded hyperparameter will be None if you don't explicitly provide a value for such an argument
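To make the Dense/Activation gotcha concrete, here is a minimal sketch (the layer size and input shape are placeholders, not from the original) of two builds that produce equivalent networks but are recorded as different hyperparameter sets:

from keras.models import Sequential
from keras.layers import Activation, Dense

# Recorded as a single Dense layer whose hyperparameters include `activation`
model_a = Sequential([
    Dense(10, activation='sigmoid', input_shape=(30,)),
])

# Recorded as a Dense layer plus a separate Activation layer - a different
# set of layer hyperparameters, so Experiments built this way won't be
# matched against Experiments built like `model_a`
model_b = Sequential([
    Dense(10, input_shape=(30,)),
    Activation('sigmoid'),
])

Either format is fine on its own; the point is that mixing them across Experiments keeps HyperparameterHunter from recognizing the runs as comparable.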