Please visit https://vmprof.readthedocs.org for more information!
pip install vmprof
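python -m vmprof <your program> <your program args>
Our build system ships wheels to PyPI (Linux, Mac OS X). If you build from source, you need to install the CPython development headers and the libunwind headers (Linux only). On Windows, this means you need the Microsoft Visual C++ compiler for your Python version.
Setting up development can be done using the following commands: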
$ virtualenv -p /usr/bin/python3 vmprof3
$ source vmprof3/bin/activate
$ python setup.py develop
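You need to install the Python development packages. For example, on Debian or Ubuntu the packages you need are python3-dev and libunwind-dev.
Now it is time to write a test and implement your feature. If you want your changes to affect vmprof.com, head over to https://github.com/vmprof/vmprof-server and follow the setup instructions.
For more information, see the development section at https://vmprof.readthedocs.org.
vmprofshow is a command line tool that comes with VMProf. It reads profile files and produces formatted output.
Here is an example of how to use vmprofshow:
Run a small program that burns CPU cycles (with vmprof enabled):
$ pypy vmprof/test/cpuburn.py # you can find cpuburn.py in the vmprof-python repo
This produces a profile file, vmprof_cpuburn.dat. Now display the profile using vmprofshow. vmprofshow has several modes for showing data. We will start with the tree-based mode.
$ vmprofshow vmprof_cpuburn.dat tree
You will see (colored) output: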
$ vmprofshow vmprof_cpuburn.dat tree
100.0% <module> 100.0% tests/cpuburn.py:1
100.0% .. test 100.0% tests/cpuburn.py:35
100.0% .... burn 100.0% tests/cpuburn.py:26
99.2% ...... _iterate 99.2% tests/cpuburn.py:19
97.7% ........ _iterate 98.5% tests/cpuburn.py:19
22.9% .......... _next_rand 23.5% tests/cpuburn.py:14
22.9% ............ JIT code 100.0% 0x7fa7dba57a10
74.7% .......... JIT code 76.4% 0x7fa7dba57a10
0.1% .......... JIT code 0.1% 0x7fa7dba583b0
0.5% ........ _next_rand 0.5% tests/cpuburn.py:14
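0.0% ........ JIT code 0.0% 0x7fa7dba583b0
There is also an option --html to emit the same information as HTML for viewing in a browser. In that case, the tree branches can be expanded and collapsed interactively.
vmprof supports a line profiling mode, which collects and shows statistics for individual lines inside functions.
To enable collection of line statistics, add the --lines argument to vmprof:
$ python -m vmprof --lines -o <output-file> <your program> <your program args>
Alternatively, pass lines=True to the vmprof.enable function when calling vmprof from code.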
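
Enabling line statistics from code might look like the sketch below. It assumes that vmprof.enable takes a file descriptor as its first argument and that vmprof.disable stops the profiler. The line_profile helper and the burn function are hypothetical names used only for illustration. If vmprof is not installed, the helper simply runs the block unprofiled.

```python
import contextlib
import os
import tempfile

try:
    import vmprof  # third-party package: pip install vmprof
except ImportError:
    vmprof = None  # keeps this sketch runnable without vmprof


@contextlib.contextmanager
def line_profile(path):
    """Profile the enclosed block with line statistics, writing to `path`.

    Hypothetical helper; assumes vmprof.enable(fileno, lines=True).
    """
    if vmprof is None:
        yield  # no profiler available: just run the block
        return
    with open(path, "w+b") as out:
        vmprof.enable(out.fileno(), lines=True)
        try:
            yield
        finally:
            vmprof.disable()


def burn(n):
    """Small CPU-bound loop standing in for real work."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    profile_path = os.path.join(tempfile.mkdtemp(), "example_lines.dat")
    with line_profile(profile_path):
        burn(1_000_000)
    print("profile written to", profile_path)
```

The resulting file can then be passed to vmprofshow in lines mode, in the same way as a file produced with the -o option.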
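To see line statistics for all functions, use the lines mode of vmprofshow:
$ vmprofshow <output-file> lines
To see line statistics for a specific function, use the --filter argument with the function name:
$ vmprofshow <output-file> lines --filter <function-name>
You will see the result: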
$ vmprofshow vmprof_cpuburn.dat lines --filter _next_rand
Total hits: 1170 s
File: tests/cpuburn.py
Function: _next_rand at line 14
Line # Hits % Hits Line Contents
=======================================
14 38 3.2 def _next_rand(self):
15 # http://rosettacode.org/wiki/Linear_congruential_generator
16 835 71.4 self._rand = (1103515245 * self._rand + 12345) & 0x7fffffff
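17 297 25.4 return self._rand
vmprofshow also has a flat mode.
While the tree-based and line-based output styles of vmprofshow give a good view of where time is spent when seen from the 'root' of the call graph, sometimes it is desirable to get a view from the 'leaves' instead. This is particularly helpful when a function is called from many places, where each invocation does not consume much time, but all invocations taken together amount to a substantial cost.
$ vmprofshow vmprof_cpuburn.dat flat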
28.895% - _PyFunction_Vectorcall:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/call.c:389
18.076% - _iterate:cpuburn.py:20
17.298% - _next_rand:cpuburn.py:15
5.863% - <native symbol 0x563a5f4eea51>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:3707
5.831% - PyObject_SetAttr:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:1031
4.924% - <native symbol 0x563a5f43fc01>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:787
4.762% - PyObject_GetAttr:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:931
4.373% - <native symbol 0x563a5f457eb1>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:1071
3.758% - PyNumber_Add:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:957
3.110% - <native symbol 0x563a5f47c291>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:4848
1.587% - PyNumber_Multiply:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:988
1.166% - _PyObject_GetMethod:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:1139
0.356% - <native symbol 0x563a5f4ed8f1>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:3432
0.000% - <native symbol 0x7f0dce8cca80>:-:0
0.000% - test:cpuburn.py:36
0.000% - burn:cpuburn.py:27
Sometimes it may be desirable to exclude "native" functions:
$ vmprofshow vmprof_cpuburn.dat flat --no-native
53.191% - _next_rand:cpuburn.py:15
46.809% - _iterate:cpuburn.py:20
0.000% - test:cpuburn.py:36
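0.000% - burn:cpuburn.py:27
Note that the output represents the time spent in each function, exclusive of called functions. (In --no-native mode, native-code callees remain included in the total.)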
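
To make the exclusive accounting concrete, here is a minimal sketch of how such percentages could be derived from stack samples. It illustrates the idea only and is not vmprofshow's implementation. The sample data, the (name, is_native) frame layout, and the flat_exclusive function are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical stack samples, root first and leaf last.
# Each frame is a (function name, is_native) pair.
SAMPLES = [
    [("burn", False), ("_iterate", False), ("_next_rand", False)],
    [("burn", False), ("_iterate", False), ("_next_rand", False),
     ("PyNumber_Add", True)],
    [("burn", False), ("_iterate", False)],
    [("burn", False), ("_iterate", False), ("PyObject_GetAttr", True)],
]


def flat_exclusive(samples, include_native=True):
    """Return {function: percent of samples charged to that function}.

    Each sample is charged to its leaf frame only. With
    include_native=False, native frames are dropped first, so a native
    leaf's sample is charged to the nearest non-native caller.
    """
    hits = Counter()
    for stack in samples:
        frames = stack if include_native else [f for f in stack if not f[1]]
        if frames:
            hits[frames[-1][0]] += 1
    total = sum(hits.values())
    return {name: 100.0 * count / total for name, count in hits.items()}


if __name__ == "__main__":
    print(flat_exclusive(SAMPLES))
    print(flat_exclusive(SAMPLES, include_native=False))
```

With native frames dropped, the samples that ended in PyNumber_Add and PyObject_GetAttr are charged to _next_rand and _iterate. This matches the parenthetical note above about native-code callees remaining in the total.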
Sometimes it may also be desirable to get timings inclusive of called functions:
$ vmprofshow vmprof_cpuburn.dat flat --include-callees
100.000% - <native symbol 0x7f0dce8cca80>:-:0
100.000% - test:cpuburn.py:36
100.000% - burn:cpuburn.py:27
100.000% - _iterate:cpuburn.py:20
53.191% - _next_rand:cpuburn.py:15
28.895% - _PyFunction_Vectorcall:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/call.c:389
7.807% - PyNumber_Multiply:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:988
7.483% - <native symbol 0x563a5f457eb1>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:1071
6.220% - <native symbol 0x563a5f4eea51>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:3707
5.831% - PyObject_SetAttr:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:1031
4.924% - <native symbol 0x563a5f43fc01>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:787
4.762% - PyObject_GetAttr:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:931
3.758% - PyNumber_Add:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/abstract.c:957
3.110% - <native symbol 0x563a5f47c291>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:4848
1.166% - _PyObject_GetMethod:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/object.c:1139
0.356% - <native symbol 0x563a5f4ed8f1>:/home/conda/feedstock_root/build_artifacts/python-split_1608956461873/work/Objects/longobject.c:3432
This view is quite similar to the "tree" view, minus the nesting.
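
As an illustration of inclusive accounting, the sketch below counts a function once for every sample in which it appears anywhere on the stack. This is an assumption about how such a view can be computed, not a description of vmprofshow's internals. The sample stacks and the flat_inclusive function are hypothetical.

```python
from collections import Counter

# Hypothetical stack samples, root first and leaf last.
SAMPLES = [
    ["test", "burn", "_iterate", "_iterate", "_next_rand"],
    ["test", "burn", "_iterate", "_iterate"],
    ["test", "burn", "_iterate", "_next_rand"],
    ["test", "burn", "_iterate", "_iterate"],
]


def flat_inclusive(samples):
    """Return {function: percent of samples with it anywhere on the stack}."""
    hits = Counter()
    for stack in samples:
        # set() counts a recursive function once per sample, so its
        # share can never exceed 100%.
        for name in set(stack):
            hits[name] += 1
    if not samples:
        return {}
    return {name: 100.0 * count / len(samples)
            for name, count in hits.most_common()}


if __name__ == "__main__":
    for name, percent in flat_inclusive(SAMPLES).items():
        print(f"{percent:7.3f}% - {name}")
```

Every function on the path from the root reaches 100% here, just as test, burn, and _iterate do in the output above, while _next_rand gets only the share of samples in which it was active.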