
Auto-PyTorch

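While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring together the best of these two worlds, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).

Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting). The newest features of Auto-PyTorch for tabular data are described in the paper "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL" (see below for the BibTeX reference). Details about Auto-PyTorch for multi-horizon time series forecasting tasks can be found in the paper "Efficient Automated Deep Learning for Time Series Forecasting" (see below for the BibTeX reference).

See also the project documentation.

As of v0.1.0, Auto-PyTorch uses SMAC as its underlying optimization package, and the code structure has been changed to further improve usability, robustness, and efficiency. As a result, moving from v0.0.2 to v0.1.0 breaks compatibility. If you would like to use the old API, you can find it in the master_old branch.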

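Workflow

The following figure gives a rough overview of the Auto-PyTorch workflow.

In the figure, Data is provided by the user and Portfolio is a set of neural network configurations that work well on diverse datasets. The current version only supports the greedy portfolio described in the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL. This portfolio is used to warm-start the SMAC optimization. In other words, the portfolio configurations are evaluated on the provided data as the initial configurations. The API then starts the following procedure: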
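
The warm-start idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the portfolio entries, their keys, and the toy_loss function are hypothetical stand-ins, not Auto-PyTorch's internal data structures or API. The sketch only shows the principle that each portfolio configuration is evaluated once on the user's data, and that these results form the initial observations before the optimizer proposes anything new.

```python
# Hypothetical sketch: the portfolio entries, their keys, and toy_loss() are
# illustrative stand-ins, not Auto-PyTorch's internal API.
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def warm_start(portfolio: List[Config],
               evaluate: Callable[[Config], float]) -> List[Tuple[Config, float]]:
    """Evaluate every portfolio configuration once and return the
    (config, loss) pairs that seed the optimizer's run history."""
    return [(config, evaluate(config)) for config in portfolio]


def toy_loss(config: Config) -> float:
    # Stand-in for "train this network on the user's data and measure the loss".
    return (config["learning_rate"] - 0.01) ** 2 + 0.001 * config["num_layers"]


portfolio = [
    {"learning_rate": 0.1, "num_layers": 2},
    {"learning_rate": 0.01, "num_layers": 4},
    {"learning_rate": 0.001, "num_layers": 8},
]

history = warm_start(portfolio, toy_loss)
# The best portfolio entry becomes the first incumbent of the search.
incumbent, incumbent_loss = min(history, key=lambda pair: pair[1])
print(incumbent, incumbent_loss)
```

In a real run the evaluation would be a full training job, so the portfolio is kept small and is chosen to cover configurations that have worked on many datasets.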

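1. Validate input data: process each data type, e.g. encode categorical data, so that Auto-PyTorch can handle it.
2. Create dataset: create a dataset that this API can handle, with a choice of cross-validation or holdout splits.
3. Evaluate baselines
   - Tabular dataset *1: train each algorithm in the predefined pool with a fixed hyperparameter configuration, plus a dummy model from sklearn.dummy that represents the worst possible performance.
   - Time series forecasting dataset: train a dummy predictor that repeats the last observed value in each series.
4. Search by SMAC:
   a. Determine the budget and cut-off rules by Hyperband
   b. Sample a pipeline hyperparameter configuration *2 by SMAC
   c. Update the observations with the obtained results
   d. Repeat a. - c. until the budget runs out
5. Build the best ensemble for the provided dataset from the observations, using ensemble model selection.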
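
The search loop (sample, evaluate on a budget, update, repeat) can be sketched as a single successive-halving bracket, which is the building block that Hyperband repeats with different trade-offs. This is a simplified illustration under stated assumptions: random sampling stands in for SMAC's model-based proposals, toy_loss is a hypothetical objective, and none of the names below come from Auto-PyTorch or SMAC.

```python
import random
from typing import Dict, List, Tuple

Config = Dict[str, float]
Record = Tuple[Config, int, float]  # (configuration, budget, loss)


def sample_config(rng: random.Random) -> Config:
    # Assumption: random sampling stands in for SMAC's model-based proposal.
    return {"learning_rate": 10 ** rng.uniform(-4, -1)}


def toy_loss(config: Config, budget: int) -> float:
    # Hypothetical objective: the loss shrinks with more budget (e.g. epochs)
    # and is lowest for learning rates near 1e-2.
    return abs(config["learning_rate"] - 0.01) + 1.0 / budget


def successive_halving(n_configs: int, min_budget: int, max_budget: int,
                       eta: int, rng: random.Random) -> Tuple[Config, List[Record]]:
    """Run one bracket: evaluate all configurations on a small budget, keep
    the best 1/eta of them, multiply the budget by eta, and repeat."""
    configs = [sample_config(rng) for _ in range(n_configs)]
    history: List[Record] = []
    budget = min_budget
    while True:
        # Evaluate on the current budget and record the observations.
        scored = sorted(((toy_loss(c, budget), c) for c in configs),
                        key=lambda pair: pair[0])
        history.extend((c, budget, loss) for loss, c in scored)
        if budget >= max_budget or len(scored) == 1:
            return scored[0][1], history
        # Cut-off rule: only the best 1/eta survive to the next, larger budget.
        configs = [c for _, c in scored[: max(1, len(scored) // eta)]]
        budget = min(budget * eta, max_budget)


best, history = successive_halving(n_configs=9, min_budget=1, max_budget=9,
                                   eta=3, rng=random.Random(0))
print(best, len(history))
```

With nine configurations and eta = 3, the bracket evaluates nine configurations on budget 1, three on budget 3, and one on budget 9, so most of the compute goes to the promising candidates.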

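*1: Baselines are a predefined pool of machine learning algorithms, e.g. LightGBM and support vector machines, used to solve either the regression or the classification task on the provided dataset.

*2: A pipeline hyperparameter configuration specifies the choice of components in each step, e.g. the target algorithm or the shape of the neural network, together with their corresponding hyperparameters.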
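
The two dummy baselines from the baseline evaluation step can be sketched in plain Python. This is an illustration, not Auto-PyTorch's code: the majority-class predictor mirrors what scikit-learn's DummyClassifier does with strategy="most_frequent" (which strategy Auto-PyTorch actually configures is an assumption here), and the forecaster simply repeats the last observed value. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_class_predict(y_train: Sequence[int], n_samples: int) -> List[int]:
    """Predict the most frequent training label for every sample."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n_samples


def last_value_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value of a series for `horizon` steps."""
    return [series[-1]] * horizon


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_train = [0, 1, 1, 2, 1, 0]
y_test = [1, 1, 0, 2]
y_pred = majority_class_predict(y_train, len(y_test))
baseline_accuracy = accuracy(y_test, y_pred)
forecast = last_value_forecast([3.0, 4.0, 5.5], horizon=3)
print(baseline_accuracy, forecast)
```

Any model that the search returns should beat these scores; if it does not, the problem is more likely in the data or the setup than in the search budget.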
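
To make the notion of a pipeline hyperparameter configuration concrete, here is a minimal sketch of a conditional search space and a sampler for it. The step names, component names, and ranges are hypothetical and chosen only for illustration; they are not Auto-PyTorch's actual search space, and uniform random sampling stands in for SMAC.

```python
import random
from typing import Dict, Tuple, Union

Number = Union[int, float]
# step -> component -> hyperparameter -> (low, high); all names are hypothetical.
SearchSpace = Dict[str, Dict[str, Dict[str, Tuple[Number, Number]]]]

SEARCH_SPACE: SearchSpace = {
    "encoder": {"OneHotEncoder": {}, "NoEncoder": {}},
    "network_backbone": {
        "MLP": {"num_layers": (1, 6), "num_units": (16, 512)},
        "ResNet": {"num_blocks": (1, 4), "num_units": (16, 512)},
    },
    "optimizer": {
        "Adam": {"log10_lr": (-5.0, -1.0)},
        "SGD": {"log10_lr": (-4.0, 0.0), "momentum": (0.0, 0.99)},
    },
}


def sample_pipeline_config(space: SearchSpace,
                           rng: random.Random) -> Dict[str, Union[str, Number]]:
    """Pick one component per step, then sample only that component's
    hyperparameters (the space is conditional on the component choice)."""
    config: Dict[str, Union[str, Number]] = {}
    for step, components in space.items():
        component = rng.choice(sorted(components))
        config[step] = component
        for name, (low, high) in components[component].items():
            key = f"{step}:{component}:{name}"
            if isinstance(low, int):
                config[key] = rng.randint(low, high)   # integer range, inclusive
            else:
                config[key] = rng.uniform(low, high)   # continuous range
    return config


config = sample_pipeline_config(SEARCH_SPACE, random.Random(0))
print(config)
```

The flat "step:component:name" keys make it explicit that a hyperparameter is only active when its parent component is selected.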

Installation

PyPI Installation

pip install autoPyTorch

Auto-PyTorch for time series forecasting requires additional dependencies:

pip install autoPyTorch[forecasting]

Manual Installation

We recommend using Anaconda for development, as follows:

# The following commands assume you are in a cloned Auto-PyTorch directory

# We also need to initialize the automl_common repository as follows
# You can find more information about this here:
# https://github.com/automl/automl_common/
git submodule update --init --recursive

# Create the environment
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
python setup.py install

Similarly, to install all the dependencies for Auto-PyTorch time series forecasting:

git submodule update --init --recursive

conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
pip install -e ".[forecasting]"

Examples

In a nutshell:

from autoPyTorch.api.tabular_classification import TabularClassificationTask

# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics

X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

# initialise the Auto-PyTorch API
api = TabularClassificationTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Calculate test accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)

For time series forecasting tasks:

from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask

# data and metric imports
from sktime.datasets import load_longley
targets, features = load_longley()

# define the forecasting horizon
forecasting_horizon = 3

# A dataset optimized by APT-TS can be a list of np.ndarray / pd.DataFrame where each
# element of the list is one series, or a single pd.DataFrame that also records the
# series index information (to which series does each timestep belong?). This id can
# be stored as the DataFrame's index or as a separate column.
# Within each series, we take the last forecasting_horizon items as test targets and
# the items before them as training targets.
# Normally the values to be forecasted should follow the training sets.
y_train = [targets[:-forecasting_horizon]]
y_test = [targets[-forecasting_horizon:]]

# The same applies to features. For univariate models, X_train and X_test can be
# omitted and set to None.
X_train = [features[:-forecasting_horizon]]
# Here X_test holds the 'known future features': features whose future values are
# known in advance. Features that are unknown could be replaced with NaN or zeros
# (they will not be used by our networks). If no feature is known beforehand,
# X_test can also be omitted.
known_future_features = list(features.columns)
X_test = [features[-forecasting_horizon:]]

start_times = [targets.index.to_timestamp()[0]]
freq = '1Y'

# initialise the Auto-PyTorch API
api = TimeSeriesForecastingTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    optimize_metric='mean_MAPE_forecasting',
    n_prediction_steps=forecasting_horizon,
    memory_limit=16 * 1024,  # Currently, forecasting models use much more memory
    freq=freq,
    start_times=start_times,
    func_eval_time_limit_secs=50,
    total_walltime_limit=60,
    min_num_test_instances=1000,  # proxy validation sets. This only works for tasks with more than 1000 series
    known_future_features=known_future_features,
)

# our dataset can directly generate sequences for new datasets
test_sets = api.dataset.generate_test_seqs()

# Calculate the test score
y_pred = api.predict(test_sets)
score = api.score(y_pred, y_test)
print("Forecasting score", score)
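
The split used in the forecasting example (the last forecasting_horizon items of each series become the test targets) and a MAPE-style score can be reproduced on a toy series without any dependencies. This is a sketch for intuition only: the helper names are hypothetical, and whether Auto-PyTorch's mean_MAPE_forecasting metric is computed exactly like the mape function below is an assumption.

```python
from typing import List, Sequence, Tuple


def split_last_horizon(series: Sequence[float],
                       horizon: int) -> Tuple[List[float], List[float]]:
    """Use the last `horizon` values as test targets and the rest for training."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be positive and shorter than the series")
    return list(series[:-horizon]), list(series[-horizon:])


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, as a fraction; y_true must be non-zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


series = [100.0, 110.0, 120.0, 125.0, 130.0, 140.0]
forecasting_horizon = 3
y_train, y_test = split_last_horizon(series, forecasting_horizon)

# Naive forecast: repeat the last training value over the whole horizon.
y_pred = [y_train[-1]] * forecasting_horizon
score = mape(y_test, y_pred)
print(y_train, y_test, score)
```

Because the test targets directly follow the training values in time, no future information leaks into training, which a random split would not guarantee.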

For more examples, including customising the search space and parallelising the code, check out the examples folder:

$ cd examples/

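Code for the paper is available under examples/ensemble in the TPAMI.2021.3067763 branch.

Contributing

If you want to contribute to Auto-PyTorch, clone the repository and check out the current development branch: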

$ git checkout development

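License

This program is free software: you can redistribute it and/or modify it under the terms of the Apache License 2.0 (see the LICENSE file).

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the Apache License 2.0 along with this program (see the LICENSE file).

Reference

To reproduce the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL, please refer to the branch TPAMI.2021.3067763.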

  @article { zimmer-tpami21a ,
  author = { Lucas Zimmer and Marius Lindauer and Frank Hutter } ,
  title = { Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL } ,
  journal = { IEEE Transactions on Pattern Analysis and Machine Intelligence } ,
  year = { 2021 } ,
  note = { also available under https://arxiv.org/abs/2006.13799 } ,
  pages = { 3079 - 3090 }
}

 @incollection { mendoza-automlbook18a ,
  author    = { Hector Mendoza and Aaron Klein and Matthias Feurer and Jost Tobias Springenberg and Matthias Urban and Michael Burkart and Max Dippel and Marius Lindauer and Frank Hutter } ,
  title     = { Towards Automatically-Tuned Deep Neural Networks } ,
  year      = { 2018 } ,
  month     = dec,
  editor    = { Hutter, Frank and Kotthoff, Lars and Vanschoren, Joaquin } ,
  booktitle = { AutoML: Methods, Systems, Challenges } ,
  publisher = { Springer } ,
  chapter   = { 7 } ,
  pages     = { 141--156 }
}

 @inproceedings { deng-ecml22 ,
  author    = { Difan Deng and Florian Karl and Frank Hutter and Bernd Bischl and Marius Lindauer } ,
  title     = { Efficient Automated Deep Learning for Time Series Forecasting } ,
  year      = { 2022 } ,
  booktitle = { Machine Learning and Knowledge Discovery in Databases. Research Track
               - European Conference, {ECML} {PKDD} 2022 } ,
  url       = { https://doi.org/10.48550/arXiv.2205.05511 } ,
}