3-6-对迭代的各种方法进行计时

发表于 2020-12-20 更新于 2022-02-04

1. 对迭代的各种方法进行计时

列表解析要比 for 循环语句有速度方面的性能优势，而且 map 会依据调用方法的不同表现出更好或更差的性能。生成器函数和表达式比列表解析速度慢一些，但是它们把内存需求降到了最小，并且不会延迟结果的生成。

1.1 对模块计时

获取多次调用函数的总时间的简单函数：

# timer0.py 文件
import time
def timer(func, *args):
    start = time.clock()
    for i in range(1000):               # 调用 1000 次
        func(*args)
    return time.clock() - start

1 2	from timer0 import timer timer(pow, 2, 1000)

0.0007808002220942854

1	timer(str.upper, 'spam')

8.50489130801435e-05

一个依旧简单，但是更有用的计时器函数：

# timer.py 文件
import time, sys
timer = time.clock if sys.platform[:3] == 'win' else time.time

def total(reps, func, *pargs, **kargs):
    """
    Total time to run func() reps times.  
    Return (total time, last result)  
    """
    repslist = list(range(reps))
    start = timer()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = timer() - start
    return (elapsed, ret)

def bestof(reps, func, *pargs, **kargs):
    """
    Quickest func() among reps runs.
    Returns (best time, last result)
    """
    best = 2 ** 32 
    for i in range(reps):                     # range 在这里不被计时
        start = timer()
        ret = func(*pargs, **kargs)
        elapsed = timer() - start 
        if elapsed < best: best = elapsed
    return (best, ret)

def bestoftotal(reps1, reps2, func, *pargs, **kargs):
    """
    Best of totals:  
    (best of reps1 runs of (total of reps2 runs of func))  
    """
    return bestof(reps1, total, reps2, func, *pargs, **kargs)

这个版本解决最初版本缺点的一些要点：

Python 的 time 模块获取当前的时间，精度随着每个平台而有所不同。在 Windows 上的 clock 函数调用能够达到微秒的精度。因为 time 函数在 Unix 上可能更好，这个脚本通过平台自动选择函数。
range 调用在 total 函数中被放到了计时循环之外，因此，它的构建成本不会计算到 Python 2.X 的计时函数中，在 Python 3.X 的 range 是一个迭代器，因此这个步骤是不需要也无害的。
reps 计数作为一个参数传入，在测试函数和参数之前，允许每次调用中重复不同次数。
任意数量的位置和关键字参数都会被带星号的参数收集，因此它们必须独立地传入，而不是以序列或字典的形式。如果需要，调用者可以将参数集合解包为独立参数。
第一个函数返回元组中所有调用的经过时间，和函数最终的返回值。
第二个函数类似，但是返回最优（最小）时间而不是总时间。
最后一个函数在最优测试中嵌套完全测试，得到完全最优时间。

1 2	import timer timer.total(1000, pow, 2, 1000)[0]

0.0007961602264572321

1	timer.total(1000, str.upper, 'spam')

(7.651557731946923e-05, 'SPAM')

1	timer.bestof(1000, str.upper, 'spam')

(0.0, 'SPAM')

1	timer.bestof(1000, pow, 2, 1000000)[0]

0.0029186852746647673

1	timer.bestof(50, timer.total, 1000, str.upper, 'spam')

(8.760891378756241e-05, (7.623113279464633e-05, 'SPAM'))

1	timer.bestoftotal(50, 1000, str.upper, 'spam')

(8.789335831238532e-05, (7.594668826982343e-05, 'SPAM'))

最后两个计算完全最优时间——在 50 次运行中的最优时间，每次计算调用 str.upper 1000 次的总时间。

使用基于生成器的 min 可以产生类似的效果：

1	min(timer.total(1000, str.upper, 'spam') for i in range(50))

(7.566224371657881e-05, 'SPAM')

1.2 计时脚本

测试五种构建结果列表的替代方法：

# timeseqs.py 文件
import sys, timer
reps = 10000
repslist = list(range(reps))

def forloop():
    res = []
    for x in repslist:
        res.append(abs(x))
    return res

def listComp():
    return [abs(x) for x in repslist]

def mapCall():
    return list(map(abs, repslist))

def genExpr():
    return list(abs(x) for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield abs(x)
    return list(gen())

print(sys.version)
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    (bestof, (total, result)) = timer.bestoftotal(5, 1000, test)
    print('%-9s: %.5f => [%s...%s]'%
          (test.__name__, bestof, result[0], result[-1]))

3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
forloop  : 1.02399 => [0...9999]
listComp : 0.57107 => [0...9999]
mapCall  : 0.25094 => [0...9999]
genExpr  : 0.91124 => [0...9999]
genFunc  : 0.97723 => [0...9999]

1.3 计时结果

map 比列表解析略微快一点，但二者都比 for 循环要快很多，生成器表达式和函数速度居中。

函数调用的影响：map
当在每次迭代上执行一个真正的操作，而不是调用内置函数时：

# timeseqs.py 文件
import sys, timer
reps = 10000
repslist = list(range(reps))

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map((lambda x: x + 10), repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    (bestof, (total, result)) = timer.bestoftotal(5, 1000, test)
    print('%-9s: %.5f => [%s...%s]'%
          (test.__name__, bestof, result[0], result[-1]))

3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
forloop  : 0.86569 => [10...10009]
listComp : 0.37758 => [10...10009]
mapCall  : 0.94133 => [10...10009]
genExpr  : 0.71446 => [10...10009]
genFunc  : 0.74132 => [10...10009]

如果需要针对 map 调用来调用一个用户定义的函数，会使它比 for 循环语句慢。

1.4 计时模块替代方案

前面的计时模块可以更加用户友好，最明显的，它的函数需要传递一个重复计数作为第一个参数，没有提供默认参数，也可以利用 min 技术来简化返回值。

如下的替代实现解决了这些问题，允许重复计数作为一个名为 _reps 的关键字参数传入：

# timer2.py 文件
"""
total(spam, 1, 2, a=3, b=4, _reps=1000) calls and times spam(1, 2, a=3, b=4)
_reps times, and return total time for all runs, with final result.

bestof(spam, 1, 2, a=3, b=4, _reps=5) runs best-of-N timer to attempt to
filter out system load variation, and returns best time among _reps tests.

bestoftotal(spam, 1, 2, a=3, b=4, _reps1=5, reps=1000) runs best-of-totals
test, which takes the best among _reps1 runs of (the total of _reps runs);
"""

import time, sys
timer = time.clock if sys.platform[:3] == 'win' else time.time

def total(func, *pargs, **kargs):
    _reps = kargs.pop('_reps', 1000)           # 传入或者默认 reps
    repslist = list(range(_reps))
    start = timer()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = timer() - start
    return (elapsed, ret)

def bestof(func, *pargs, **kargs):
    _reps = kargs.pop('_reps', 5)
    best = 2 ** 32
    for i in range(_reps):
        start = timer()
        ret = func(*pargs, **kargs)
        elapsed = timer() - start
        if elapsed < best: best = elapsed
    return (best, ret)

def bestoftotal(func, *pargs, **kargs):
    _reps1 = kargs.pop('_reps1', 5)
    return min(total(func, *pargs, **kargs) for i in range(_reps1))

使用字典的 pop 操作移除 _reps 参数，用于测试函数和提供默认值。

1 2	from timer2 import total, bestof, bestoftotal total(pow, 2, 1000)[0] # 使用默认重复次数

0.0007907557774160523

1	total(pow, 2, 1000, _reps=100000)[0] # 1M 次重复

0.08083116966008674

1	bestof(pow, 2, 100000, _reps=30)[0]

0.00022954673113417812

1	bestoftotal(str.upper, 'spam', _reps1=30, _reps=1000)

(7.566224303445779e-05, 'SPAM')

在 Python 3.X 中使用 keyword-only 参数
可以使用 Python 3.X 的 keyword-only 参数来简化计时器模块代码：

# timer3.py 文件
"""
Same usage as timer2.py, but uses 3.X keyword-only default arguments
instead of dict pops for simpler code. No need to hoist range() out
of tests in 3.X: always a generator in 3.X, and this can't run on 2.X."""
import time, sys
timer = time.clock if sys.platform[:3] == 'win' else time.time

def total(func, *pargs, _reps=1000, **kargs):
    start = timer()
    for i in range(_reps):
        ret = func(*pargs, **kargs)
    elapsed = timer() - start
    return (elapsed, ret)

def bestof(func, *pargs, _reps=5, **kargs):
    best = 2 ** 32
    for i in range(_reps):
        start = timer()
        ret = func(*pargs, **kargs)
        elapsed = timer() - start
        if elapsed < best: best = elapsed
    return (best, ret)

def bestoftotal(func, *pargs, _reps1=5, **kargs):
    return min(total(func, *pargs, **kargs) for i in range(_reps1))

2. 使用 timeit 对迭代和 Python 进行计时

标准函数提供了一个 timeit 模块对函数进行计时。

2.1 基本 timeit 用法

使用 timeit，测试方法由可调用对象或者语句字符串指定；对于后者，可以使用 ; 分隔符或者 \n 换行符来执行多个语句，使用空格或制表符在嵌套块中缩进语句。测试可以通过命令行和 API 调用执行。

交互式用法和 API 调用

1 2	import timeit min(timeit.repeat(stmt='[x ** 2 for x in range(1000)]', number=1000, repeat=5))

0.2696741744839528

命令行用法
在这个模式中，timeit 报告 n 次循环的平均用时，单位为微秒(usec)、毫秒(msec)或秒(sec)。

1	!E:\Anaconda\Lib\timeit.py -n 1000 "[x ** 2 for x in range(1000)]"

1	!Python -m timeit -n 1000 "[x ** 2 for x in range(1000)]"

1000 loops, best of 3: 262 usec per loop

1	!Python -m timeit -n 1000 -r 5 "[x ** 2 for x in range(1000)]"

1000 loops, best of 5: 274 usec per loop

对多行语句计时
使用换行符、制表符或者空格来满足 Python 的语法。

1
2
3

import timeit
min(timeit.repeat(number=10000, repeat=3,
    stmt='L = [1, 2, 3, 4, 5]\nfor i in range(len(L)): L[i] += 1'))

0.006462304362332779

1 2	min(timeit.repeat(number=10000, repeat=3, stmt='L = [1, 2, 3, 4, 5]\ni=0\nwhile i < len(L):\n\tL[i] += 1\n\ti += 1'))

0.008284458583247556

1 2	min(timeit.repeat(number=10000, repeat=3, stmt='L = [1, 2, 3, 4, 5]\nM = [x + 1 for x in L]'))

0.004370780792811502

在命令行模式中，每一个语句作为一个单独的参数，使用空格作为缩进。

1	!python -m timeit -n 1000 -r 3 "L = [1, 2, 3, 4, 5]" "i=0" "while i < len(L):" " L[i] += 1" " i += 1"

1000 loops, best of 3: 0.759 usec per loop

2.2 基准模块和脚本：timeit

"""
pybench.py: Test speed of one or more Pythons on a set of simple
code-string benchmarks. A function, to allow stmts to vary.
This system itself runs on both 2.X and 3.X, and may spawn both.
Uses timeit to test either the Python running this script by API
calls, or a set of Pythons by reading spawned command-line outputs
(os.popen) with Python's -m flag to find timeit on module search path.
Replaces $listif3 with a list() around generators for 3.X and an
empty string for 2.X, so 3.X does same work as 2.X. In command-line
mode only, must split multiline statements into one separate quoted
argument per line so all will be run (else might run/time first line
only), and replace all \t in indentation with 4 spaces for uniformity.
Caveats: command-line mode (only) may fail if test stmt embeds double
quotes, quoted stmt string is incompatible with shell in general, or
command-line exceeds a length limit on platform's shell--use API call
mode or homegrown timer; does not yet support a setup statement: as is,
time of all statements in the test stmt are charged to the total time.
"""

import sys, os, timeit
defnum, defrep = 1000, 5

def runner(stmts, pythons=None, tracecmd=False):
    """
    Main logic: run tests per input lists, caller handles usage modes.
    stmts: [(number?, repeat?, stmt-string)], replaces $listif3 in stmt
    pythons: None=this python only, or [(ispy3?, python-executable-path)]
    """
    print(sys.version)
    for (number, repeat, stmt) in stmts:
        number = number or defnum
        repeat = repeat or defrep
        
        if not pythons:
            # Run stmt on this python: API call
            # No need to split lines or quote here
            ispy3 = sys.version[0] == '3'
            stmt = stmt.replace('$listif3', 'list' if ispy3 else '')
            best = min(timeit.repeat(stmt=stmt, number=number, repeat=repeat))
            print('%.4f [%r]' % (best, stmt[:70]))
        
        else:
            # Run stmt on all pythons: command line
            # Split lines into quoted arguments
            print('-' * 80)
            print('[%r]' % stmt)
            for (ispy3, python) in pythons:
                stmt1 = stmt.replace('$listif3', 'list' if ispy3 else '')
                stmt1 = stmt1.replace('\t', ' ' * 4)
                lines = stmt1.split('\n')
                args = ' '.join('"%s"' % line for line in lines)
                cmd = '%s -m timeit -n %s -r %s %s' % (python, number, repeat, args)
                print(python)
                if tracecmd: print(cmd)
                print('\t' + os.popen(cmd).read().rstrip())

"""
pybench_cases.py: Run pybench on a set of pythons and statements.
Select modes by editing this script or using command-line arguments (in
sys.argv): e.g., run a "C:\python27\python pybench_cases.py" to test just
one specific version on stmts, "pybench_cases.py -a" to test all pythons
listed, or a "py −3 pybench_cases.py -a -t" to trace command lines too.
"""
import pybench, sys
pythons = [(1, "E:\Anaconda")]            # (ispy3?, path)
stmts = [                                        # (num,rpt,stmt)
    (0, 0, "[x ** 2 for x in range(1000)]"),         # Iterations
    (0, 0, "res=[]\nfor x in range(1000): res.append(x ** 2)"),    # \n=multistmt
    (0, 0, "$listif3(map(lambda x: x ** 2, range(1000)))"),        # \n\t=indent
    (0, 0, "list(x ** 2 for x in range(1000))"),                   # $=list or ''
    (0, 0, "s = 'spam' * 2500\nx = [s[i] for i in range(10000)]"), # String ops
    (0, 0, "s = '?'\nfor i in range(10000): s += '?'"),
]
tracecmd = '-t' in sys.argv                           # -t: trace command lines?
pythons = pythons if '-a' in sys.argv else None       # -a: all in list, else one?
pybench.runner(stmts, pythons, tracecmd)

3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
0.2585 ['[x ** 2 for x in range(1000)]']
0.2967 ['res=[]\nfor x in range(1000): res.append(x ** 2)']
0.3181 ['list(map(lambda x: x ** 2, range(1000)))']
0.2980 ['list(x ** 2 for x in range(1000))']
0.4162 ["s = 'spam' * 2500\nx = [s[i] for i in range(10000)]"]
2.5342 ["s = '?'\nfor i in range(10000): s += '?'"]

1	!python pybench_cases.py -a

3. 函数陷阱

3.1 本地变量是静态检测的

3.2 默认可变对象

默认参数是在 def 语句运行时评估并保存的，而不是在这个函数调用时。

从内部来讲，Python 会将每一个默认参数保存成一个对象，附加在这个函数本身。

3.3 没有 return 语句的函数

如果没有提供 return 语句，函数将自动返回 None 对象。