
ScrapydWeb now supports custom default values for settings & arguments on the Run Spider page

    my8100 · 63 days ago · 616 views

    1. Install the update:

    pip install -U git+https://github.com/my8100/scrapydweb.git
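
To confirm that the update took effect, one option is to query the installed version from Python. This is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and that the installed distribution is named `scrapydweb`; the helper name `installed_version` is hypothetical and not part of ScrapydWeb.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "scrapydweb" is the assumed distribution name.
    print(installed_version("scrapydweb"))
```

A result of `None` would suggest the package was installed into a different environment than the one running the check.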
    

    2. If you were already using scrapydweb v1.2.0, add the following options to your existing config file:

    
    ############################## Run Spider #####################################
    # The default is False, set it to True to automatically
    # expand the 'settings & arguments' section in the Run Spider page.
    SCHEDULE_EXPAND_SETTINGS_ARGUMENTS = False
    
    # The default is 'Mozilla/5.0', set it to a non-empty string to customize the default value of `custom`
    # in the drop-down list of `USER_AGENT`.
    SCHEDULE_CUSTOM_USER_AGENT = 'Mozilla/5.0'
    
    # The default is None, set it to one of ['custom', 'Chrome', 'iPhone', 'iPad', 'Android']
    # to customize the default value of `USER_AGENT`.
    SCHEDULE_USER_AGENT = None
    
    # The default is None, set it to True or False to customize the default value of `ROBOTSTXT_OBEY`.
    SCHEDULE_ROBOTSTXT_OBEY = None
    
    # The default is None, set it to True or False to customize the default value of `COOKIES_ENABLED`.
    SCHEDULE_COOKIES_ENABLED = None
    
    # The default is None, set it to a non-negative integer to customize the default value of `CONCURRENT_REQUESTS`.
    SCHEDULE_CONCURRENT_REQUESTS = None
    
    # The default is None, set it to a non-negative number to customize the default value of `DOWNLOAD_DELAY`.
    SCHEDULE_DOWNLOAD_DELAY = None
    
    # The default is "-d setting=CLOSESPIDER_TIMEOUT=60\r\n-d setting=CLOSESPIDER_PAGECOUNT=10\r\n-d arg1=val1",
    # set it to '' or any non-empty string to customize the default value of `additional`.
    # Use '\r\n' as the line separator.
    SCHEDULE_ADDITIONAL = "-d setting=CLOSESPIDER_TIMEOUT=60\r\n-d setting=CLOSESPIDER_PAGECOUNT=10\r\n-d arg1=val1"
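
The `SCHEDULE_ADDITIONAL` value packs several `-d key=value` lines into one string joined by '\r\n'. The post does not say how ScrapydWeb interprets it internally, so the following is only an illustrative sketch. It assumes the lines follow the convention of Scrapyd's schedule.json parameters, where `setting=NAME=VALUE` passes a Scrapy setting and any other key is a spider argument. The function name `parse_additional` is hypothetical.

```python
SCHEDULE_ADDITIONAL = (
    "-d setting=CLOSESPIDER_TIMEOUT=60\r\n"
    "-d setting=CLOSESPIDER_PAGECOUNT=10\r\n"
    "-d arg1=val1"
)


def parse_additional(text):
    """Split a CRLF-separated string of '-d key=value' lines.

    Returns a (settings, arguments) pair of dicts. This is an assumed
    interpretation for illustration, not ScrapydWeb's own parser.
    """
    settings, arguments = {}, {}
    for line in text.split("\r\n"):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines and an empty string
        if not line.startswith("-d "):
            raise ValueError(f"expected a line starting with '-d ': {line!r}")
        key, sep, value = line[3:].strip().partition("=")
        if not sep:
            raise ValueError(f"expected key=value after '-d': {line!r}")
        if key == "setting":
            # 'setting=NAME=VALUE' carries a Scrapy setting override.
            name, sep, setting_value = value.partition("=")
            if not sep:
                raise ValueError(f"expected setting=NAME=VALUE: {line!r}")
            settings[name] = setting_value
        else:
            # Any other key is treated as a spider argument.
            arguments[key] = value
    return settings, arguments


if __name__ == "__main__":
    settings, arguments = parse_additional(SCHEDULE_ADDITIONAL)
    print(settings)
    print(arguments)
```

Running a check like this on a customized value before saving the config file can catch a missing '\r\n' separator or a malformed `setting=` line early.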
    
    

    3. GitHub

    https://github.com/my8100/scrapydweb
