Strange memory growth with ThreadPoolExecutor

2021-06-18 02:20:37 +08:00
 Multicom

I came across the thread "It's 2021: has the requests memory leak been fixed? If not, how do you work around it?" and decided to test it myself.

from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED
import requests
from memory_profiler import profile

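s = requests.Session()  # one Session shared by all worker threads

def get(i):
    s.get('URL')  # 'URL' is a placeholder; the response is discarded

@profile
def test():
    executor = ThreadPoolExecutor(max_workers = 100)
    task = [executor.submit(get, (i)) for i in range(5000)]  # (i) is just i, not a tuple
    wait(task, return_when = ALL_COMPLETED)
    s.close()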

if __name__ == '__main__':
    test()

5000 tasks: 116.3 MiB

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    10     25.6 MiB     25.6 MiB           1   @profile
    11                                         def test():
    12     25.6 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
    13    115.9 MiB     90.3 MiB        5003       task = [executor.submit(get, (i)) for i in range(5000)]
    14    116.3 MiB      0.3 MiB           1       wait(task, return_when = ALL_COMPLETED)

500 tasks x 10 rounds: 141.3 MiB, growing gradually

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    10     25.7 MiB     25.7 MiB           1   @profile
    11                                         def test():
    12     25.7 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
    13     94.8 MiB     69.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    14    106.2 MiB     11.4 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    15    109.6 MiB      3.4 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    16    111.7 MiB      2.1 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    17    114.3 MiB      2.6 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    18    115.8 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    19    120.0 MiB      4.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    20    121.0 MiB      1.0 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    21    124.1 MiB      3.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    22    124.6 MiB      0.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    23    126.7 MiB      2.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    24    127.4 MiB      0.8 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    25    130.5 MiB      3.1 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    26    131.8 MiB      1.3 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    27    135.2 MiB      3.4 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    28    136.7 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    29    137.7 MiB      1.0 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    30    138.0 MiB      0.3 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    31    139.8 MiB      1.8 MiB         503       all_task = [executor.submit(get, (i)) for i in range(500)]
    32    141.3 MiB      1.5 MiB           1       wait(all_task, return_when=ALL_COMPLETED)
    33    141.3 MiB      0.0 MiB           1       s.close()
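
The script behind the 10-round run is not shown above, so here is a rough loop-based equivalent as a sketch, not the original code. It also records tracemalloc's view of the Python heap after each round. memory_profiler reports the memory of the whole process, while tracemalloc only counts allocations made through Python's allocator, so comparing the two may hint at whether the growth is held by live Python objects or sits elsewhere. The name `run_rounds` and its parameters are assumptions made for illustration.

```python
import tracemalloc
from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED


def run_rounds(fetch, rounds=10, tasks_per_round=500, max_workers=100):
    """Run `rounds` batches of `tasks_per_round` calls to `fetch(i)`.

    `fetch` stands in for the post's get(i); any one-argument callable works.
    Returns the traced Python heap size in bytes after each round.
    """
    traced = []
    tracemalloc.start()
    try:
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            for _ in range(rounds):
                tasks = [executor.submit(fetch, i) for i in range(tasks_per_round)]
                wait(tasks, return_when=ALL_COMPLETED)
                del tasks  # drop the Future objects before measuring
                current, _peak = tracemalloc.get_traced_memory()
                traced.append(current)
    finally:
        tracemalloc.stop()
    return traced
```

Calling `run_rounds(get)` with the `get` function from the post would give one number per round. If those numbers stay flat while the process figure keeps climbing, the growth is probably not in objects the script itself is holding on to.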
5 replies
0x0208v0
2021-06-18 07:59:48 +08:00
A requests Session is not thread-safe, so using it this way doesn't look quite right either.
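
One common way to act on this point is to give each worker thread its own Session through `threading.local`. The sketch below assumes that approach. The helper names `make_session` and `get_session` are made up for illustration, and nothing in this thread shows whether it changes the memory numbers. The sessions are never closed here, which a real script would want to handle.

```python
import threading

_local = threading.local()  # attributes set on this object are private to each thread


def make_session():
    """Hypothetical factory: build one requests.Session (imported on first call)."""
    import requests
    return requests.Session()


def get_session(factory=make_session):
    """Return the calling thread's own session, creating it on first use."""
    if not hasattr(_local, "session"):
        _local.session = factory()
    return _local.session


def get(i, url="URL"):
    # "URL" is the placeholder from the original post, not a real address
    get_session().get(url)
```

With this in place, `executor.submit(get, i)` works as before, but the 100 worker threads end up with 100 separate sessions instead of sharing one.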
ospider
2021-06-18 09:21:40 +08:00
requests is a memory-leaking piece of junk; I'd switch to httpx as soon as you can.
warcraft1236
2021-06-18 11:22:33 +08:00
@ospider Why would requests leak memory?
Multicom
2021-06-18 20:54:33 +08:00
@ospider After replacing requests.Session() with httpx.Client(), memory usage is lower, but it still keeps growing.
```
Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    10     25.2 MiB     25.2 MiB           1   @profile
    11                                         def test():
    12     25.2 MiB      0.0 MiB           1       executor = ThreadPoolExecutor(max_workers = 100)
    13     28.2 MiB      3.1 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    14     28.5 MiB      0.3 MiB           1       wait(task)
    15     28.8 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    16     28.8 MiB      0.0 MiB           1       wait(task)
    17     29.0 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    18     31.3 MiB      2.3 MiB           1       wait(task)
    19     31.6 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    20     35.9 MiB      4.3 MiB           1       wait(task)
    21     35.9 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    22     39.0 MiB      3.0 MiB           1       wait(task)
    23     39.2 MiB      0.3 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    24     41.0 MiB      1.8 MiB           1       wait(task)
    25     41.0 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    26     43.8 MiB      2.8 MiB           1       wait(task)
    27     43.8 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    28     45.8 MiB      2.0 MiB           1       wait(task)
    29     45.8 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    30     48.1 MiB      2.3 MiB           1       wait(task)
    31     48.1 MiB      0.0 MiB         503       task = [executor.submit(get, (i)) for i in range(500)]
    32     49.4 MiB      1.3 MiB           1       wait(task)
    33     49.4 MiB      0.0 MiB           1       s.close()
```
Multicom
2021-06-18 20:56:34 +08:00
@v2exblog I've only just started learning, so I didn't know. Turns out this is the wrong way to use it; lesson learned.

https://www.v2ex.com/t/784102