V2EX  ›  Python

Why does Celery become synchronous and blocking when deployed to the server?

lzjunika · 320 days ago · 1424 views
    This topic was created 320 days ago; the information in it may have changed since then.

    I recently put together a small application.

    Web service: gunicorn + flask

    Background service: celery

    The main job is file conversion and upload, which is time-consuming, so it runs as a Celery background task.

    Local (MacBook): runs asynchronously as expected, automatically spreading tasks across multiple workers with no blocking. This is the desired behavior:
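    The stack described above can be sketched as a minimal Celery app. This is a hedged reconstruction: the app and task names (pss, put_content_to_obs) follow the post's logs, but the Redis URL and the task body are placeholder assumptions, not the author's actual code.

    ```python
    # Minimal sketch of the setup described in this post (assumptions noted above).
    from celery import Celery

    celery_app = Celery(
        "pss",
        broker="redis://localhost:6379/6",   # placeholder URL; post masks the host
        backend="redis://localhost:6379/6",
    )

    @celery_app.task
    def put_content_to_obs(new_name, local_name):
        # The real task converts the file and uploads it to OBS; that slow
        # work is exactly what should run in a worker, not in the Flask request.
        return True
    ```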

    [2023-08-29 16:37:14,279: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[54e1286f-74c4-48d4-98e5-99937c65714c] received
    [2023-08-29 16:37:14,417: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[5b4d53fc-afc2-4e50-9b9c-905d5eddddde] received
    [2023-08-29 16:37:14,486: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[8fb95642-6900-4aaa-b666-11f98e3a0eea] received
    [2023-08-29 16:37:14,531: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[4df12aa2-458c-4946-8aad-3ed25e68c5e0] received
    [2023-08-29 16:37:14,583: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[93192a4d-7569-44c7-a09b-7035ea331901] received
    [2023-08-29 16:37:14,618: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[6897f139-bc7a-4b9f-aab8-22c6d7a07a85] received
    [2023-08-29 16:37:14,660: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[d301e702-accd-44b2-b85e-2f7d3c3a4e4f] received
    [2023-08-29 16:37:14,690: WARNING/ForkPoolWorker-8] requestId:
    [2023-08-29 16:37:14,693: WARNING/ForkPoolWorker-8] 0000018A4070950C5A0294A2CECAB8DF
    [2023-08-29 16:37:14,701: WARNING/ForkPoolWorker-8] [2023-08-29 16:37:14,701] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,701: WARNING/ForkPoolWorker-8] obs_upload_file:OK
    [2023-08-29 16:37:14,702: WARNING/ForkPoolWorker-8] test_1.png
    [2023-08-29 16:37:14,736: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[42c63363-9528-4f59-9e21-b2816907141f] received
    [2023-08-29 16:37:14,737: INFO/ForkPoolWorker-8] Task pss.api.offce.put_content_to_obs[54e1286f-74c4-48d4-98e5-99937c65714c] succeeded in 0.4250246670001161s: True
    [2023-08-29 16:37:14,755: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 16:37:14,756: WARNING/ForkPoolWorker-1] 0000018A407095555502052E5A386783
    [2023-08-29 16:37:14,763: WARNING/ForkPoolWorker-1] [2023-08-29 16:37:14,761] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,761: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 16:37:14,767: WARNING/ForkPoolWorker-1] test_2.png
    [2023-08-29 16:37:14,785: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[5b4d53fc-afc2-4e50-9b9c-905d5eddddde] succeeded in 0.3451121250000142s: True
    [2023-08-29 16:37:14,788: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[835c3945-2396-4490-94d0-421298d1813f] received
    [2023-08-29 16:37:14,890: WARNING/ForkPoolWorker-2] requestId:
    [2023-08-29 16:37:14,891: WARNING/ForkPoolWorker-2] 0000018A407095E8540AE00FC334E409
    [2023-08-29 16:37:14,892: WARNING/ForkPoolWorker-2] [2023-08-29 16:37:14,892] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:37:14,892: WARNING/ForkPoolWorker-2] obs_upload_file:OK
    [2023-08-29 16:37:14,893: WARNING/ForkPoolWorker-2] test_3.png
    [2023-08-29 16:37:14,895: INFO/ForkPoolWorker-2] Task pss.api.offce.put_content_to_obs[8fb95642-6900-4aaa-b666-11f98e3a0eea] succeeded in 0.3848593749999054s: True
    

    Server (CentOS 7.6): runs synchronously, with only one fixed worker handling everything, blocking:

    [2023-08-29 16:25:58,664: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[873c3f38-98b4-47cc-98e8-6f65a58c3269] received
    [2023-08-29 16:25:58,733: WARNING/ForkPoolWorker-7] requestId:
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] 0000018A406644C054084BB9021C6A9B
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] [2023-08-29 16:25:58,734] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] obs_upload_file:OK
    [2023-08-29 16:25:58,734: WARNING/ForkPoolWorker-7] test_8.png
    [2023-08-29 16:25:58,735: INFO/ForkPoolWorker-7] Task pss.api.offce.put_content_to_obs[873c3f38-98b4-47cc-98e8-6f65a58c3269] succeeded in 0.07009824365377426s: True
    [2023-08-29 16:26:00,287: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[7a4868f2-305f-4f6b-992c-6ea0791f3427] received
    [2023-08-29 16:26:00,370: WARNING/ForkPoolWorker-7] requestId:
    [2023-08-29 16:26:00,370: WARNING/ForkPoolWorker-7] 0000018A40664B17550A56827D6506B2
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] [2023-08-29 16:26:00,371] WARNING in offce: obs_upload_file:OK
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] obs_upload_file:OK
    [2023-08-29 16:26:00,371: WARNING/ForkPoolWorker-7] test_9.png
    [2023-08-29 16:26:00,372: INFO/ForkPoolWorker-7] Task pss.api.offce.put_content_to_obs[7a4868f2-305f-4f6b-992c-6ea0791f3427] succeeded in 0.08343333378434181s: True
    

    As the logs above show:

    Locally it runs normally, with 3 workers (ForkPoolWorker-8, ForkPoolWorker-1 and ForkPoolWorker-2) processing in parallel; the service does not block.

    On the remote server it only runs synchronously, with a single fixed worker (ForkPoolWorker-7); the service blocks.

    Is something misconfigured on the remote server? Its startup summary looks the same as the local one:

     -------------- celery@xg-003 v5.2.7 (dawn-chorus)
    --- ***** -----
    -- ******* ---- Linux-3.10.0-1062.1.2.el7.x86_64-x86_64-with-centos-7.7.1908-Core 2023-08-29 15:48:54
    - *** --- * ---
    - ** ---------- [config]
    - ** ---------- .> app:         pss:0x7f348eaec160
    - ** ---------- .> transport:   redis://***:6379/6
    - ** ---------- .> results:     redis://***:6379/6
    - *** --- * --- .> concurrency: 8 (prefork)
    -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
    --- ***** -----
     -------------- [queues]
                    .> celery           exchange=celery(direct) key=celery
    

    Could anyone advise whether something is misconfigured somewhere?

    19 replies · last reply 2023-09-04 14:10:21 +08:00
    kkk9 · #1 · 320 days ago
    Is Redis working normally?
    lzjunika (OP) · #2 · 320 days ago
    Everything runs; locally, since it's asynchronous, tasks finish quickly.
    On the server it simply never runs asynchronously. I switched the worker pool from the default prefork to eventlet and then gevent; neither helped.
    Has anyone run into this?
    lzjunika (OP) · #3 · 320 days ago
    @kkk9 Redis is fine; I switched it to the local machine.
    purensong · #4 · 320 days ago
    I think it's a parameter problem.
    kkk9 · #5 · 320 days ago
    First confirm the queue is healthy: Redis is alive, reachable, and the password is correct.

    Then look at the queues: the server shows only "celery". Is the Mac perhaps not using "celery" but some other queue specified elsewhere?
    kkk9 · #6 · 320 days ago
    Also worth checking the settings section.
    lzjunika (OP) · #7 · 320 days ago
    The server and the local machine run Celery with the same command:
    celery -A make_celery worker --loglevel INFO --logfile=/logs/celery.log
    lzjunika (OP) · #8 · 320 days ago
    @kkk9 How do I check the queues and settings?
    Anybfans · #9 · 320 days ago
    Install flower and take a look.
    celerysoft · #10 · 320 days ago
    @lzjunika #7 Looking at your startup command, it may be that concurrency isn't specified; the default equals the machine's CPU core count. Wild guess: is the server single-core? If so, specifying the concurrency parameter should fix it.
    lzjunika (OP) · #11 · 320 days ago
    Redis does contain keys, so the connection works:
    127.0.0.1:6379[6]> keys *
    1) "celery-task-meta-a144f43b-93eb-4047-bc01-6d0fdfe9b8f6"
    2) "celery-task-meta-865395d9-2226-4969-a269-a93d56ee3c4c"
    3) "celery-task-meta-2c44dafc-93e4-4792-8a40-7f747bbd063b"
    4) "celery-task-meta-0203b744-504b-414f-adda-41b45fe2aff9"
    5) "celery-task-meta-16d37b55-b645-4e05-b58b-55b87fbf4e37"
    6) "celery-task-meta-1e2fc20a-a31d-41a3-9003-5c7ffef30e42"
    7) "celery-task-meta-a819a02b-7c15-475d-907a-7ab5ed5221cd"
    8) "celery-task-meta-c2779805-d922-4423-b2bd-976317e5486d"
    9) "celery-task-meta-7a4868f2-305f-4f6b-992c-6ea0791f3427"
    10) "celery-task-meta-ff756f38-02c7-4e1f-8b20-39db4722fe83"
    11) "celery-task-meta-0e38860b-dd44-47c2-9e40-4a1f4a7c4bb4"
    12) "celery-task-meta-3187c555-d3a3-46b1-bf13-3bc38bc79fbd"
    13) "celery-task-meta-873c3f38-98b4-47cc-98e8-6f65a58c3269"
    14) "_kombu.binding.celery"
    15) "_kombu.binding.celery.pidbox"
    16) "celery-task-meta-bca09af8-14f4-4d00-84d1-baae7d233070"
    17) "celery-task-meta-4f2c9e67-86a8-410f-bbe4-1a408981fd1a"
    18) "celery-task-meta-cc93cd0f-f931-4a8c-a24e-795863531953"
    19) "celery-task-meta-53d64e39-c872-46d7-a392-57e8617b8751"
    20) "celery-task-meta-30efb54a-9f95-46e0-bd49-4190d5684f4c"
    21) "celery-task-meta-ca6a5f83-3cab-4111-92c8-f154c2e03766"
    22) "celery-task-meta-02a741d2-7426-4339-ad57-a5eea00c72e6"
    23) "_kombu.binding.celeryev"
    24) "celery-task-meta-94218c29-08b7-4982-ac15-2bc349767fa6"
    25) "celery-task-meta-2a9fd0de-2f14-4dbe-a26e-56d6a22c8466"
    26) "celery-task-meta-2c9da801-8383-4829-8db0-a0cf5fb8030b"
    27) "celery-task-meta-d3d0c01d-359d-45d2-809c-9cbc5072b73d"
    28) "celery-task-meta-71610058-15ea-4d5c-b282-226448300228"
    29) "celery-task-meta-ee4efe45-43c3-44e6-af0e-df843cb57ea6"
    30) "celery-task-meta-6ea9d50a-6b6e-4e28-a8cb-837c6517da54"
    lzjunika (OP) · #12 · 320 days ago
    @celerysoft
    It's an 8-core, 32 GB server, so the default concurrency is 8 (see the startup summary above). I tried that; specifying it manually doesn't help either.
    lzjunika (OP) · #13 · 320 days ago
    I manually specified a concurrency of 3.
    The only thing that changed is the worker id: celery -A make_celery worker --concurrency=3 --loglevel INFO

    But it's still synchronous:

    [2023-08-29 18:19:00,670: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[02ac662b-e0bd-4cc3-b659-6345a471505a] received
    [2023-08-29 18:19:00,756: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 18:19:00,756: WARNING/ForkPoolWorker-1] 0000018A40CDC0F9540ADCD7126FE0E9
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] [2023-08-29 18:19:00,757] WARNING in offce: obs_upload_file:OK
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 18:19:00,757: WARNING/ForkPoolWorker-1] test_8.png
    [2023-08-29 18:19:00,757: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[02ac662b-e0bd-4cc3-b659-6345a471505a] succeeded in 0.08660224080085754s: True

    [2023-08-29 18:19:02,301: INFO/MainProcess] Task pss.api.offce.put_content_to_obs[19d3c1aa-20be-4dcb-a819-360191532325] received
    [2023-08-29 18:19:02,400: WARNING/ForkPoolWorker-1] requestId:
    [2023-08-29 18:19:02,400: WARNING/ForkPoolWorker-1] 0000018A40CDC7595A03C83BB2923AA0
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] [2023-08-29 18:19:02,401] WARNING in offce: obs_upload_file:OK
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] obs_upload_file:OK
    [2023-08-29 18:19:02,401: WARNING/ForkPoolWorker-1] test_9.png
    [2023-08-29 18:19:02,402: INFO/ForkPoolWorker-1] Task pss.api.offce.put_content_to_obs[19d3c1aa-20be-4dcb-a819-360191532325] succeeded in 0.09988882020115852s: True
    deplivesb · #14 · 319 days ago
    Give me SSH access and I'll take a look.
    lzjunika (OP) · #15 · 319 days ago
    These past few days this has had me pretty confused. I've tried locally and in a local Docker container; both run asynchronously. On the server it doesn't.
    Today I tried running an additional Docker container on the server to run Celery, i.e. one Celery worker on the host and one in Docker, effectively two consumers.
    Still synchronous: I fired off 10 tasks, the host ran 4 and the Docker container ran 6, so work was distributed, but execution was still synchronous and the total run time didn't change.

    I tried two variants of the Docker command, with identical results:
    Dockerfile 1 (default prefork pool, concurrency 8):
    ...
    CMD ["celery", "-A", "docker_celery", "worker", "--loglevel", "INFO", "--logfile=logs/celery_docker.log"]

    Dockerfile 2 (eventlet pool, concurrency 5):
    ...
    CMD ["celery", "-A", "docker_celery", "worker", "--pool=eventlet", "--concurrency=5", "--loglevel", "INFO", "--logfile=logs/celery_docker.log"]

    Same result: synchronous, total run time unchanged. Frustrating...
    lzjunika (OP) · #16 · 319 days ago
    One producer sending tasks, two consumers (host service + Docker service). Task messages are processed fine; they just don't run asynchronously.
    The producer sends tasks normally:
    put_content_to_obs.delay(new_name, local_name)
    The producer doesn't wait for a result; it only sends.
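    A side note, not something the thread confirms as the cause: since the producer fires .delay() and never reads the result, the task can be declared with ignore_result=True, which stops Celery writing a celery-task-meta-* key to Redis for every call (the keys visible in the redis-cli dump in #11). A hedged sketch, with the broker URL as a placeholder:

    ```python
    # Hedged sketch: skip the result backend writes for fire-and-forget tasks.
    from celery import Celery

    celery_app = Celery("pss", broker="redis://localhost:6379/6")

    @celery_app.task(ignore_result=True)
    def put_content_to_obs(new_name, local_name):
        # placeholder body; the real task uploads to OBS
        return True

    # Producer side stays exactly as in the post:
    # put_content_to_obs.delay(new_name, local_name)
    ```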
    lzjunika (OP) · #17 · 319 days ago
    Where on earth is this misconfigured? Very strange. Should I just switch servers?
    lzjunika (OP) · #18 · 318 days ago
    Finally some hope. On the same server I deployed a minimal flask+celery service via docker-compose, and it runs asynchronously.
    So the server is fine and Redis is fine; something in the original program must be written wrong. I'll keep looking:

    [2023-09-01 11:08:45,531: INFO/MainProcess] Task app.controller.index.add_together[d4885e83-f346-46b9-98c2-a9f981d7d1de] received
    [2023-09-01 11:08:45,533: INFO/MainProcess] Task app.controller.index.add_together[0abc8808-5603-4c61-87de-f6bcd2747d53] received
    [2023-09-01 11:08:45,535: INFO/MainProcess] Task app.controller.index.add_together[e1211bbc-8a76-4d8c-94d6-e3904cc50bdc] received
    [2023-09-01 11:08:45,538: INFO/MainProcess] Task app.controller.index.add_together[3a099971-abc5-4c2c-b784-1a2aaba86a24] received
    [2023-09-01 11:08:45,539: INFO/MainProcess] Task app.controller.index.add_together[f1a6604d-2757-4742-b4b5-33c4b92bbbb8] received
    [2023-09-01 11:08:45,541: INFO/MainProcess] Task app.controller.index.add_together[d380858f-3e65-4569-bcea-54ea8db5e6cf] received
    [2023-09-01 11:08:45,542: INFO/MainProcess] Task app.controller.index.add_together[740fbfed-7074-49f1-8680-6ddc48bfc2da] received
    [2023-09-01 11:08:45,544: INFO/MainProcess] Task app.controller.index.add_together[78b6ee5f-15a0-409b-b41f-709b0fdcb818] received
    [2023-09-01 11:08:45,545: INFO/MainProcess] Task app.controller.index.add_together[a482a9d2-1ffd-47df-b421-0bfcd1b386e1] received
    [2023-09-01 11:08:45,546: INFO/MainProcess] Task app.controller.index.add_together[7baa35a0-d695-4010-8120-051d5eea9af7] received
    [2023-09-01 11:08:46,535: INFO/ForkPoolWorker-7] Task app.controller.index.add_together[d4885e83-f346-46b9-98c2-a9f981d7d1de] succeeded in 1.0014203377068043s: 231
    [2023-09-01 11:08:46,535: INFO/ForkPoolWorker-8] Task app.controller.index.add_together[0abc8808-5603-4c61-87de-f6bcd2747d53] succeeded in 1.001225769519806s: 647
    [2023-09-01 11:08:46,537: INFO/ForkPoolWorker-1] Task app.controller.index.add_together[e1211bbc-8a76-4d8c-94d6-e3904cc50bdc] succeeded in 1.001103661954403s: 308
    [2023-09-01 11:08:46,540: INFO/ForkPoolWorker-2] Task app.controller.index.add_together[3a099971-abc5-4c2c-b784-1a2aaba86a24] succeeded in 1.0009450502693653s: 735
    [2023-09-01 11:08:46,542: INFO/ForkPoolWorker-3] Task app.controller.index.add_together[f1a6604d-2757-4742-b4b5-33c4b92bbbb8] succeeded in 1.0019154399633408s: 554
    [2023-09-01 11:08:46,544: INFO/ForkPoolWorker-5] Task app.controller.index.add_together[740fbfed-7074-49f1-8680-6ddc48bfc2da] succeeded in 1.000898975878954s: 455
    [2023-09-01 11:08:46,545: INFO/ForkPoolWorker-4] Task app.controller.index.add_together[d380858f-3e65-4569-bcea-54ea8db5e6cf] succeeded in 1.0016995184123516s: 771
    [2023-09-01 11:08:46,546: INFO/ForkPoolWorker-6] Task app.controller.index.add_together[78b6ee5f-15a0-409b-b41f-709b0fdcb818] succeeded in 1.0007124096155167s: 281
    [2023-09-01 11:08:47,537: INFO/ForkPoolWorker-8] Task app.controller.index.add_together[7baa35a0-d695-4010-8120-051d5eea9af7] succeeded in 1.00179473310709s: 788
    [2023-09-01 11:08:47,538: INFO/ForkPoolWorker-7] Task app.controller.index.add_together[a482a9d2-1ffd-47df-b421-0bfcd1b386e1] succeeded in 1.0018408931791782s: 729
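    A minimal task matching the logs above might look like the following. This is a hedged reconstruction: the roughly 1.0s "succeeded in" timings suggest a task that sleeps one second, but the sleep and the addition are assumptions based only on the log output, not the OP's actual code.

    ```python
    # Hedged reconstruction of the minimal docker-compose test task.
    import time

    from celery import Celery

    app = Celery("app", broker="redis://localhost:6379/6")  # placeholder URL

    @app.task
    def add_together(a, b):
        time.sleep(1)  # simulate slow work; parallel workers overlap these sleeps
        return a + b
    ```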
    misoomang · #19 · 315 days ago
    A beginner's suggestion for verification: connect a method to @worker_process_init.connect() that writes a log line, to confirm whether multiple worker processes actually exist.

    A guess: in the server logs above, the test_8.png task ran in 0.08s, and there was a ~2s gap before test_9.png was received. Could slow I/O in receiving the images mean the first task finishes on worker 1 before the second even arrives, so the second one also lands on worker 1?
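    The verification suggested above can be sketched like this: worker_process_init fires once in each forked pool process, so counting these log lines at worker startup shows how many pool processes really exist on the server. The app name and broker URL are placeholders.

    ```python
    # Sketch: log one line per forked pool process at worker startup.
    import os

    from celery import Celery
    from celery.signals import worker_process_init

    app = Celery("probe", broker="redis://localhost:6379/6")  # placeholder

    @worker_process_init.connect
    def log_pool_process_start(**kwargs):
        # Runs in every child process of the prefork pool right after fork.
        print(f"pool process started: pid={os.getpid()}")
    ```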