Pytorch Memory Utils
1.0.0
This code helps you monitor GPU memory usage while training with PyTorch.
A blog post about this tool explains the details: https://oldpan.me/archives/pytorch-gpu-mory-usage-track
Put modelsize_estimate.py or gpu_mem_track.py in your current working directory and import it.
Model Sequential : params: 0.450304M
Model Sequential : intermedite variables: 336.089600 M (without backward)
Model Sequential : intermedite variables: 672.179200 M (with backward)
# 30-Apr-21-20:25:29-gpu_mem_track.txt
GPU Memory Track | 30-Apr-21-20:25:29 | Total Tensor Used Memory:0.0 Mb Total Used Memory:0.0 Mb
At main.py line 10: <module> Total Tensor Used Memory:0.0 Mb Total Allocated Memory:0.0 Mb
+ | 1 * Size:(64, 64, 3, 3) | Memory: 0.1406 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(128, 128, 3, 3) | Memory: 0.5625 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(256, 128, 3, 3) | Memory: 1.125 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(512, 256, 3, 3) | Memory: 4.5 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 3 * Size:(256, 256, 3, 3) | Memory: 6.75 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 8 * Size:(512,) | Memory: 0.0156 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 2 * Size:(64,) | Memory: 0.0004 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 7 * Size:(512, 512, 3, 3) | Memory: 63.0 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 4 * Size:(256,) | Memory: 0.0039 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(128, 64, 3, 3) | Memory: 0.2812 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 2 * Size:(128,) | Memory: 0.0009 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
+ | 1 * Size:(64, 3, 3, 3) | Memory: 0.0065 M | <class 'torch.nn.parameter.Parameter'> | torch.float32
At main.py line 12: <module> Total Tensor Used Memory:76.4 Mb Total Allocated Memory:76.4 Mb
+ | 1 * Size:(60, 3, 512, 512) | Memory: 180.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(40, 3, 512, 512) | Memory: 120.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(30, 3, 512, 512) | Memory: 90.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 18: <module> Total Tensor Used Memory:466.4 Mb Total Allocated Memory:466.4 Mb
+ | 1 * Size:(120, 3, 512, 512) | Memory: 360.0 M | <class 'torch.Tensor'> | torch.float32
+ | 1 * Size:(80, 3, 512, 512) | Memory: 240.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 23: <module> Total Tensor Used Memory:1066.4 Mb Total Allocated Memory:1066.4 Mb
- | 1 * Size:(40, 3, 512, 512) | Memory: 120.0 M | <class 'torch.Tensor'> | torch.float32
- | 1 * Size:(120, 3, 512, 512) | Memory: 360.0 M | <class 'torch.Tensor'> | torch.float32
At main.py line 29: <module> Total Tensor Used Memory:586.4 Mb Total Allocated Memory:586.4 Mb
Simple example:
import torch
from torchvision import models
from gpu_mem_track import MemTracker
device = torch.device('cuda:0')
gpu_tracker = MemTracker()  # define a GPU tracker
gpu_tracker.track()  # call track() between the lines of code that use the GPU
cnn = models.vgg19(pretrained=True).features.to(device).eval()
gpu_tracker.track()  # call track() again after the model has been moved to the GPU
dummy_tensor_1 = torch.randn(30, 3, 512, 512).float().to(device)  # 30*3*512*512*4/1024/1024 = 90.00M
dummy_tensor_2 = torch.randn(40, 3, 512, 512).float().to(device)  # 40*3*512*512*4/1024/1024 = 120.00M
dummy_tensor_3 = torch.randn(60, 3, 512, 512).float().to(device)  # 60*3*512*512*4/1024/1024 = 180.00M
gpu_tracker.track()
dummy_tensor_4 = torch.randn(120, 3, 512, 512).float().to(device)  # 120*3*512*512*4/1024/1024 = 360.00M
dummy_tensor_5 = torch.randn(80, 3, 512, 512).float().to(device)  # 80*3*512*512*4/1024/1024 = 240.00M
gpu_tracker.track()
dummy_tensor_4 = dummy_tensor_4.cpu()  # move tensors 4 and 2 back to the CPU
dummy_tensor_2 = dummy_tensor_2.cpu()
gpu_tracker.clear_cache()  # or torch.cuda.empty_cache()
gpu_tracker.track()
This writes a .txt file to the current directory; its contents are the output shown above.
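The numbers in the log can be checked by hand with the same arithmetic used in the example's comments (elements * 4 bytes / 1024 / 1024 for float32). The sketch below is not part of gpu_mem_track.py; the helper name tensor_mib and the list vgg19_param_rows are made up for illustration, and the shapes are simply copied from the log above.

```python
from math import prod

MIB = 1024 * 1024


def tensor_mib(shape, count=1, itemsize=4):
    """Memory of `count` tensors of `shape` in MiB (itemsize=4 assumes float32)."""
    return count * prod(shape) * itemsize / MIB


# (count, shape) pairs copied from the parameter rows of the log above.
vgg19_param_rows = [
    (1, (64, 64, 3, 3)), (1, (128, 128, 3, 3)), (1, (256, 128, 3, 3)),
    (1, (512, 256, 3, 3)), (3, (256, 256, 3, 3)), (8, (512,)),
    (2, (64,)), (7, (512, 512, 3, 3)), (4, (256,)),
    (1, (128, 64, 3, 3)), (2, (128,)), (1, (64, 3, 3, 3)),
]

# Total after the model is moved to the GPU (log entry "line 12").
after_model = sum(tensor_mib(shape, count) for count, shape in vgg19_param_rows)

# Add dummy tensors 1-3 (log entry "line 18").
after_first_batch = after_model + sum(
    tensor_mib((n, 3, 512, 512)) for n in (30, 40, 60)
)

# Add dummy tensors 4-5 (log entry "line 23").
after_second_batch = after_first_batch + sum(
    tensor_mib((n, 3, 512, 512)) for n in (120, 80)
)

# Tensors 4 and 2 moved back to the CPU and the cache cleared (log entry "line 29").
after_release = after_second_batch - tensor_mib((120, 3, 512, 512)) - tensor_mib((40, 3, 512, 512))

for label, value in [
    ("after model", after_model),
    ("after tensors 1-3", after_first_batch),
    ("after tensors 4-5", after_second_batch),
    ("after release", after_release),
]:
    print(f"{label}: {value:.1f} Mb")
```

The four printed totals should line up with the "Total Tensor Used Memory" values in the log, which is a quick way to confirm that the tracker is counting every live tensor on the device.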
Total Allocated Memory is the peak memory usage. When you delete tensors, PyTorch does not release the space back to the device until you call gpu_tracker.clear_cache(), as in the example script.
CUDA kernels also take up some space. See pytorch/pytorch#12873.
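To make the caching behaviour concrete, here is a deliberately simplified toy model in plain Python. It is not PyTorch's allocator and not code from this repository; the class ToyCachingAllocator and its methods are hypothetical and only mimic the idea that freed blocks stay reserved until the cache is emptied. In real PyTorch, the closest counters are torch.cuda.memory_allocated() and torch.cuda.memory_reserved().

```python
class ToyCachingAllocator:
    """Toy model of a caching allocator (illustration only, sizes in MiB)."""

    def __init__(self):
        self.live = {}      # name -> size of blocks owned by live tensors
        self.cached = 0.0   # space freed by tensors but still held from the device

    def alloc(self, name, size_mib):
        # Reuse cached space first; anything beyond that comes from the device.
        self.cached -= min(self.cached, size_mib)
        self.live[name] = size_mib

    def free(self, name):
        # The freed block goes into the cache, not back to the device.
        self.cached += self.live.pop(name)

    def empty_cache(self):
        # Only now is the cached space handed back to the device.
        self.cached = 0.0

    @property
    def allocated(self):
        return sum(self.live.values())

    @property
    def reserved(self):
        return self.allocated + self.cached


allocator = ToyCachingAllocator()
allocator.alloc("dummy_tensor_2", 120.0)
allocator.alloc("dummy_tensor_4", 360.0)

allocator.free("dummy_tensor_2")
allocator.free("dummy_tensor_4")
before_clear = (allocator.allocated, allocator.reserved)

allocator.empty_cache()
after_clear = (allocator.allocated, allocator.reserved)

print("before clearing the cache:", before_clear)
print("after clearing the cache: ", after_clear)
```

In this toy model the reserved figure stays at 480 MiB after both tensors are freed and only drops to zero once empty_cache() runs, which is why other processes and tools such as nvidia-smi can still see the memory as occupied until the cache is cleared.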
Part of the code is adapted from:
http://jacobkimmel.github.io/pytorch_estimating_model_size/
https.github.com/minner/8968b3b120c95d3f50b8a22a74bf66bc