These scripts can help you monitor your GPU memory usage during training with PyTorch.
A blog post explaining this tool in detail: https://oldpan.me/archives/pytorch-gpu-memory-sage-track
Put modelsize_estimate.py or gpu_mem_track.py in your current working directory and import it.

Example output of modelsize_estimate.py:
Model Sequential : params: 0.450304M
Model Sequential : intermediate variables: 336.089600 M (without backward)
Model Sequential : intermediate variables: 672.179200 M (with backward)
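Both estimates come down to the same arithmetic: a float32 element occupies 4 bytes. A minimal sketch in plain Python (independent of the tool itself) that reproduces the per-layer figures reported in the gpu_mem_track log further below:

```python
from math import prod

def param_mem_mb(shape, bytes_per_elem=4):
    """Memory footprint in MB of a float32 tensor with the given shape."""
    return prod(shape) * bytes_per_elem / 1024 / 1024

# A (64, 64, 3, 3) conv weight: 64*64*3*3 elements * 4 bytes ≈ 0.1406 MB
print(round(param_mem_mb((64, 64, 3, 3)), 4))   # 0.1406
# Seven (512, 512, 3, 3) conv weights together: 7 * 9.0 MB = 63.0 MB
print(round(7 * param_mem_mb((512, 512, 3, 3)), 1))  # 63.0
```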
# 30-Apr-21-20:25:29-gpu_mem_track.txt
GPU Memory Track | 30-Apr-21-20:25:29 | Total Tensor Used Memory:0.0 Mb Total Used Memory:0.0 Mb
At main.py line 10: <module> Total Tensor Used Memory:0.0 Mb Total Allocated Memory:0.0 Mb
+ | 1 * Size:(64, 64, 3, 3) | Memory: 0.1406 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(128, 128, 3, 3) | Memory: 0.5625 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(256, 128, 3, 3) | Memory: 1.125 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(512, 256, 3, 3) | Memory: 4.5 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 3 * Size:(256, 256, 3, 3) | Memory: 6.75 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 8 * Size:(512,) | Memory: 0.0156 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 2 * Size:(64,) | Memory: 0.0004 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 7 * Size:(512, 512, 3, 3) | Memory: 63.0 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 4 * Size:(256,) | Memory: 0.0039 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(128, 64, 3, 3) | Memory: 0.2812 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 2 * Size:(128,) | Memory: 0.0009 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(64, 3, 3, 3) | Memory: 0.0065 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
At main.py line 12: <module> Total Tensor Used Memory:76.4 Mb Total Allocated Memory:76.4 Mb
+ | 1 * Size:(60, 3, 512, 512) | Memory: 180.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(40, 3, 512, 512) | Memory: 120.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(30, 3, 512, 512) | Memory: 90.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 18: <module> Total Tensor Used Memory:466.4 Mb Total Allocated Memory:466.4 Mb
+ | 1 * Size:(120, 3, 512, 512) | Memory: 360.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(80, 3, 512, 512) | Memory: 240.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 23: <module> Total Tensor Used Memory:1066.4 Mb Total Allocated Memory:1066.4 Mb
- | 1 * Size:(40, 3, 512, 512) | Memory: 120.0 M | <class 'torch.Tensor'> | torch.float32
- | 1 * Size:(120, 3, 512, 512) | Memory: 360.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 29: <module> Total Tensor Used Memory:586.4 Mb Total Allocated Memory:586.4 Mb

A simple example:
import torch
from torchvision import models
from gpu_mem_track import MemTracker

device = torch.device('cuda:0')

gpu_tracker = MemTracker()  # define a GPU tracker

gpu_tracker.track()  # call between the lines of code that use the GPU
cnn = models.vgg19(pretrained=True).features.to(device).eval()
gpu_tracker.track()

dummy_tensor_1 = torch.randn(30, 3, 512, 512).float().to(device)   # 30*3*512*512*4/1024/1024 = 90.00M
dummy_tensor_2 = torch.randn(40, 3, 512, 512).float().to(device)   # 40*3*512*512*4/1024/1024 = 120.00M
dummy_tensor_3 = torch.randn(60, 3, 512, 512).float().to(device)   # 60*3*512*512*4/1024/1024 = 180.00M
gpu_tracker.track()

dummy_tensor_4 = torch.randn(120, 3, 512, 512).float().to(device)  # 120*3*512*512*4/1024/1024 = 360.00M
dummy_tensor_5 = torch.randn(80, 3, 512, 512).float().to(device)   # 80*3*512*512*4/1024/1024 = 240.00M
gpu_tracker.track()

dummy_tensor_4 = dummy_tensor_4.cpu()
dummy_tensor_2 = dummy_tensor_2.cpu()
gpu_tracker.clear_cache()  # or torch.cuda.empty_cache()
gpu_tracker.track()

This writes a .txt file to the current directory; its contents are the printout shown above.
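The running totals in the log can be checked by hand with the same 4-bytes-per-element rule; a quick sketch, taking the 76.4 MB figure for the VGG19 feature weights from the first track() call in the log:

```python
# Per-tensor float32 footprints, matching the comments in the example above
t1 = 30 * 3 * 512 * 512 * 4 / 1024 / 1024    # 90.0 MB
t2 = 40 * 3 * 512 * 512 * 4 / 1024 / 1024    # 120.0 MB
t3 = 60 * 3 * 512 * 512 * 4 / 1024 / 1024    # 180.0 MB
t4 = 120 * 3 * 512 * 512 * 4 / 1024 / 1024   # 360.0 MB
t5 = 80 * 3 * 512 * 512 * 4 / 1024 / 1024    # 240.0 MB
vgg = 76.4  # VGG19 feature weights, as reported by the tracker

print(round(vgg + t1 + t2 + t3, 1))            # 466.4  (third track)
print(round(vgg + t1 + t2 + t3 + t4 + t5, 1))  # 1066.4 (fourth track)
print(round(vgg + t1 + t3 + t5, 1))            # 586.4  (after t2, t4 move to CPU)
```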
Total Allocated Memory is the peak memory usage. When you delete tensors, PyTorch does not release the space back to the device until you call gpu_tracker.clear_cache() (or torch.cuda.empty_cache()), as in the example script.
The CUDA context itself also takes some space; see pytorch/pytorch#12873.
Part of the code is adapted from:
http://jacobkimmel.github.io/pytorch_estimating_model_size/
https://gist.github.com/minner/8968b3b120c95d3f50b8a2a74bf66bc