PyTorch Forecasting is a PyTorch-based package for forecasting with state-of-the-art deep learning architectures. It provides a high-level API and uses PyTorch Lightning for training on GPU or CPU, with automatic logging.
Our article on Towards Data Science introduces the package and provides background information.
PyTorch Forecasting aims to ease state-of-the-art time series forecasting with neural networks for real-world cases and research alike. The goal is to provide maximum flexibility for professionals and reasonable defaults for beginners. The package is built on PyTorch and allows training on CPU, on a single GPU, and on multiple GPUs out of the box.
If you are working on Windows, you need to install PyTorch first with

`pip install torch -f https://download.pytorch.org/whl/torch_stable.html`

Otherwise, you can proceed with

`pip install pytorch-forecasting`

Alternatively, you can install the package via conda

`conda install pytorch-forecasting pytorch -c pytorch>=1.7 -c conda-forge`

PyTorch Forecasting is now installed from the conda-forge channel, while PyTorch is installed from the pytorch channel.
To use the MQF2 loss (multivariate quantile loss), also run `pip install pytorch-forecasting[mqf2]`.
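As a minimal sketch of how the loss would then be used, assuming the `MQF2DistributionLoss` metric exported by `pytorch_forecasting.metrics` (the prediction length here is a placeholder that must match the dataset's `max_prediction_length`):

```python
from pytorch_forecasting.metrics import MQF2DistributionLoss

# assumption: prediction_length must equal max_prediction_length of the dataset;
# the resulting object can be passed as loss=... when creating a model from a dataset
mqf2_loss = MQF2DistributionLoss(prediction_length=6)
```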
Visit https://pytorch-forecasting.readthedocs.io to read the documentation with detailed tutorials. The documentation provides a comparison of available models. To implement new models or other custom components, see the How to implement new models tutorial; it covers basic as well as advanced architectures. A minimal sketch of a custom model follows below.
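The sketch assumes the `BaseModel` interface described in that tutorial: `save_hyperparameters()` and `super().__init__(**kwargs)` are mandatory calls, `transform_output` rescales predictions into target space, and `to_network_output` wraps the result. The two-layer network itself is a toy placeholder:

```python
import torch
from torch import nn
from pytorch_forecasting.models import BaseModel


class FullyConnectedModel(BaseModel):
    """Toy model: a fully connected network over the continuous encoder history."""

    def __init__(self, input_size: int, output_size: int, hidden_size: int, **kwargs):
        # save init arguments to self.hparams and initialize BaseModel (both mandatory)
        self.save_hyperparameters()
        super().__init__(**kwargs)
        self.network = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size),
        )

    def forward(self, x: dict) -> dict:
        # x is a dictionary of tensors produced by TimeSeriesDataSet;
        # this toy assumes a single continuous input variable
        network_input = x["encoder_cont"].squeeze(-1)
        prediction = self.network(network_input)
        # rescale predictions from normalized space into the target space
        prediction = self.transform_output(prediction, target_scale=x["target_scale"])
        # the returned dictionary must at least contain the prediction
        return self.to_network_output(prediction=prediction)
```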
Networks can be trained with the PyTorch Lightning Trainer on pandas DataFrames, which are first converted to a TimeSeriesDataSet.
```python
# imports for training
import lightning.pytorch as pl
from lightning.pytorch.loggers import TensorBoardLogger
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
from lightning.pytorch.tuner import Tuner

# import dataset, network to train and metric to optimize
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer, QuantileLoss

# load data: this is pandas dataframe with at least a column for
# * the target (what you want to predict)
# * the timeseries ID (which should be a unique string to identify each timeseries)
# * the time of the observation (which should be a monotonically increasing integer)
data = ...

# define the dataset, i.e. add metadata to pandas dataframe for the model to understand it
max_encoder_length = 36
max_prediction_length = 6
training_cutoff = "YYYY-MM-DD"  # day for cutoff

training = TimeSeriesDataSet(
    data[lambda x: x.date <= training_cutoff],
    time_idx=...,  # column name of time of observation
    target=...,  # column name of target to predict
    group_ids=[...],  # column name(s) for timeseries IDs
    max_encoder_length=max_encoder_length,  # how much history to use
    max_prediction_length=max_prediction_length,  # how far to predict into future
    # covariates static for a timeseries ID
    static_categoricals=[...],
    static_reals=[...],
    # covariates known and unknown in the future to inform prediction
    time_varying_known_categoricals=[...],
    time_varying_known_reals=[...],
    time_varying_unknown_categoricals=[...],
    time_varying_unknown_reals=[...],
)

# create validation dataset using the same normalization techniques as for the training dataset
validation = TimeSeriesDataSet.from_dataset(
    training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True
)

# convert datasets to dataloaders for training
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=2)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=2)

# create PyTorch Lightning Trainer with early stopping
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=1, verbose=False, mode="min")
lr_logger = LearningRateMonitor()  # log the learning rate
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="auto",  # run on CPU or GPU; if on multiple GPUs, use strategy="ddp"
    gradient_clip_val=0.1,
    limit_train_batches=30,  # 30 batches per epoch
    callbacks=[lr_logger, early_stop_callback],
    logger=TensorBoardLogger("lightning_logs"),
)

# define network to train - the architecture is mostly inferred from the dataset,
# so that only a few hyperparameters have to be set by the user
tft = TemporalFusionTransformer.from_dataset(
    # dataset
    training,
    # architecture hyperparameters
    hidden_size=32,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=16,
    # loss metric to optimize
    loss=QuantileLoss(),
    # logging frequency
    log_interval=2,
    # optimizer parameters
    learning_rate=0.03,
    reduce_on_plateau_patience=4,
)
print(f"Number of parameters in network: {tft.size() / 1e3:.1f}k")

# find the optimal learning rate
res = Tuner(trainer).lr_find(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader, early_stop_threshold=1000.0, max_lr=0.3,
)
# and plot the result - always visually confirm that the suggested learning rate makes sense
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()

# fit the model on the data - redefine the model with the correct learning rate if necessary
trainer.fit(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader,
)
```
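After training, the best model (by validation loss) can be loaded from its checkpoint and used for prediction. A minimal sketch, assuming the Lightning Trainer's default checkpointing:

```python
# load the best model according to the validation loss
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)

# predict on the validation set
predictions = best_tft.predict(val_dataloader)
```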