PyTorch Forecasting is a PyTorch-based package for forecasting with state-of-the-art deep learning architectures. It provides a high-level API and uses PyTorch Lightning to scale training on GPU or CPU, with automatic logging.
Documentation · Tutorials · Release Notes
Our article on Towards Data Science introduces the package and provides background information.
PyTorch Forecasting aims to ease state-of-the-art timeseries forecasting with neural networks for real-world cases and research alike. The goal is to provide a high-level API with maximum flexibility for professionals and reasonable defaults for beginners. Specifically, the package provides:

- a timeseries dataset class which abstracts handling of variable transformations, missing values, randomized subsampling, and multiple history lengths,
- a base model class which provides basic training of timeseries models along with logging in tensorboard and generic visualizations such as actuals vs. predictions,
- multiple neural network architectures for timeseries forecasting that have been enhanced for real-world deployment and come with built-in interpretation capabilities,
- multi-horizon timeseries metrics, and
- hyperparameter tuning with optuna.
The package is built on pytorch-lightning to allow training on CPUs, and on single and multiple GPUs out of the box.
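For example, here is a minimal sketch of how the Lightning Trainer could be configured for multi-GPU training; the device count is an illustrative assumption, and on a CPU-only machine `accelerator="auto"` works unchanged:

```python
import lightning.pytorch as pl

# illustrative assumption: a machine with two GPUs, trained with
# distributed data parallel; use accelerator="auto" to let Lightning decide
trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,
    strategy="ddp",
    max_epochs=100,
)
```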
If you are working on Windows, you need to first install PyTorch with

pip install torch -f https://download.pytorch.org/whl/torch_stable.html

Otherwise, you can proceed with

pip install pytorch-forecasting
Alternatively, you can install the package via conda

conda install pytorch-forecasting "pytorch>=1.7" -c pytorch -c conda-forge

PyTorch Forecasting is then installed from the conda-forge channel while PyTorch is installed from the pytorch channel.
To use the MQF2 loss (multivariate quantile loss), also run pip install pytorch-forecasting[mqf2]
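As a hedged sketch of how that loss could then be plugged into a model, assuming the `MQF2DistributionLoss` metric exposed by `pytorch_forecasting.metrics` and matching its `prediction_length` argument to the forecast horizon used below:

```python
from pytorch_forecasting.metrics import MQF2DistributionLoss

# illustrative: pass this as loss=... to a model's from_dataset() call,
# with prediction_length matching the dataset's max_prediction_length
loss = MQF2DistributionLoss(prediction_length=6)
```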
Visit https://pytorch-forecasting.readthedocs.io to read the documentation with detailed tutorials.
The documentation provides a comparison of available models.
To implement new models or other custom components, see the How to implement new models tutorial. It covers basic as well as advanced architectures; a minimal sketch of the basic pattern follows.
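To give a flavor of that pattern, below is a minimal, hedged sketch in the spirit of the tutorial's fully connected example. The class name and layer sizes are illustrative assumptions; `save_hyperparameters`, `transform_output`, and `to_network_output` follow the `BaseModel` conventions the tutorial describes:

```python
import torch
from torch import nn
from pytorch_forecasting.models import BaseModel


class FullyConnectedModel(BaseModel):
    """Toy model: maps the encoder window of a single target onto the horizon."""

    def __init__(self, input_size: int, output_size: int, hidden_size: int = 16, **kwargs):
        # store constructor arguments in self.hparams - required by BaseModel
        self.save_hyperparameters()
        # pass remaining arguments (e.g. the loss) on to BaseModel
        super().__init__(**kwargs)
        self.network = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size),
        )

    def forward(self, x: dict) -> dict:
        # x["encoder_cont"] has shape (batch, encoder_length, n_features);
        # this toy model assumes a single continuous feature, the target itself
        prediction = self.network(x["encoder_cont"].squeeze(-1))
        # rescale predictions into the original target space ...
        prediction = self.transform_output(prediction, target_scale=x["target_scale"])
        # ... and wrap them in the output format BaseModel expects
        return self.to_network_output(prediction=prediction)
```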
Networks can be trained with the PyTorch Lightning Trainer on pandas DataFrames, which are first converted to a TimeSeriesDataSet.
# imports for training
import lightning.pytorch as pl
from lightning.pytorch.loggers import TensorBoardLogger
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
from lightning.pytorch.tuner import Tuner
# import dataset, network to train and metric to optimize
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer, QuantileLoss
# load data: this is a pandas dataframe with at least a column for
# * the target (what you want to predict)
# * the timeseries ID (which should be a unique string to identify each timeseries)
# * the time of the observation (which should be a monotonically increasing integer)
data = ...

# define the dataset, i.e. add metadata to the pandas dataframe for the model to understand it
max_encoder_length = 36
max_prediction_length = 6
training_cutoff = "YYYY-MM-DD"  # day for cutoff
training = TimeSeriesDataSet(
    data[lambda x: x.date <= training_cutoff],
    time_idx=...,  # column name of time of observation
    target=...,  # column name of target to predict
    group_ids=[...],  # column name(s) for timeseries IDs
    max_encoder_length=max_encoder_length,  # how much history to use
    max_prediction_length=max_prediction_length,  # how far to predict into future
    # covariates static for a timeseries ID
    static_categoricals=[...],
    static_reals=[...],
    # covariates known and unknown in the future to inform prediction
    time_varying_known_categoricals=[...],
    time_varying_known_reals=[...],
    time_varying_unknown_categoricals=[...],
    time_varying_unknown_reals=[...],
)

# create validation dataset using the same normalization techniques as for the training dataset
validation = TimeSeriesDataSet.from_dataset(training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True)
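# from_dataset reuses the training dataset's parameters and fitted encoders,
# so validation data is normalized exactly like the training data;
# min_prediction_idx starts validation predictions right after the training cutoff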
# convert datasets to dataloaders for training
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=2)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=2)
# create PyTorch Lightning Trainer with early stopping
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=1, verbose=False, mode="min")
lr_logger = LearningRateMonitor()  # log the learning rate
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="auto",  # automatically choose CPU/GPU; for multiple GPUs, use strategy="ddp"
    gradient_clip_val=0.1,
    limit_train_batches=30,  # 30 batches per epoch
    callbacks=[lr_logger, early_stop_callback],
    logger=TensorBoardLogger("lightning_logs"),
)
# define network to train - the architecture is mostly inferred from the dataset,
# so that only a few hyperparameters have to be set by the user
tft = TemporalFusionTransformer.from_dataset(
    # dataset
    training,
    # architecture hyperparameters
    hidden_size=32,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=16,
    # loss metric to optimize
    loss=QuantileLoss(),
    # logging frequency
    log_interval=2,
    # optimizer parameters
    learning_rate=0.03,
    reduce_on_plateau_patience=4,
)
print(f"Number of parameters in network: {tft.size() / 1e3:.1f}k")
# find the optimal learning rate
res = Tuner(trainer).lr_find(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader, early_stop_threshold=1000.0, max_lr=0.3,
)
# and plot the result - always visually confirm that the suggested learning rate makes sense
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()

# fit the model on the data - redefine the model with the correct learning rate if necessary
trainer.fit(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader,
)
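After training, one might load the best checkpoint and forecast on the validation set. This is a minimal follow-up sketch, assuming a checkpoint callback is attached to the Trainer (Lightning adds one by default); `predict` is pytorch-forecasting's standard inference method:

```python
# load the best model according to the validation loss
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)

# forecast on the validation dataloader
predictions = best_tft.predict(val_dataloader)
```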