A little Pixiv illustration crawler ( ̄ y▽ ̄)╭ Ohohoho.....

2016-10-24 17:03:57 +08:00
 pwcong
  1. python main.py
  2. Enter your username, password, and the folder to save into
  3. Choose which ranking's illustrations to download

Today's ranking really is full of treats (●ˇ∀ˇ●)

github: https://github.com/pwcong/PixivCrawler

If you like this little crawler, feel free to give me a Star

2990 views
Node: Python
8 replies
seewhy
2016-10-24 19:45:12 +08:00
Start~~
pwcong
2016-10-24 22:37:47 +08:00
@seewhy つ﹏⊂
402645707
2016-10-30 10:45:38 +08:00
Hi OP, I'd like to package this with Docker so I can use it on my Synology NAS.
It throws this at runtime:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/urllib/request.py", line 1254, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/usr/local/lib/python3.5/http/client.py", line 1106, in request
self._send_request(method, url, body, headers)
File "/usr/local/lib/python3.5/http/client.py", line 1151, in _send_request
self.endheaders(body)
File "/usr/local/lib/python3.5/http/client.py", line 1102, in endheaders
self._send_output(message_body)
File "/usr/local/lib/python3.5/http/client.py", line 934, in _send_output
self.send(msg)
File "/usr/local/lib/python3.5/http/client.py", line 877, in send
self.connect()
File "/usr/local/lib/python3.5/http/client.py", line 849, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/local/lib/python3.5/socket.py", line 693, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/local/lib/python3.5/socket.py", line 732, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "man.py", line 25, in <module>
query_tt = RankingCrawler.download_first(opener, RankingCrawler.query_mode[int(qmNo)], saveDir)
File "/usr/src/app/crawler/RankingCrawler.py", line 119, in download_first
with op.open(visit) as f:
File "/usr/local/lib/python3.5/urllib/request.py", line 466, in open
response = self._open(req, data)
File "/usr/local/lib/python3.5/urllib/request.py", line 484, in _open
'_open', req)
File "/usr/local/lib/python3.5/urllib/request.py", line 444, in _call_chain
result = func(*args)
File "/usr/local/lib/python3.5/urllib/request.py", line 1282, in http_open
return self.do_open( http.client.HTTPConnection, req)
File "/usr/local/lib/python3.5/urllib/request.py", line 1256, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

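I tried changing resolv.conf to the 114 DNS, but I still get the same error.
Pinging pixiv works too, so I don't think it's a network problem.
Could you point out which runtime library I haven't installed or have misconfigured?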
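
The last frame of the traceback above is `socket.getaddrinfo`, so one way to narrow this down is to make the same call by hand from inside the container. The following is only a minimal sketch: `can_resolve` and its `resolver` parameter are hypothetical names and not part of PixivCrawler, and the host name to test is whatever URL the crawler actually visits.

```python
import socket


def can_resolve(host, port=80, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False on a name-resolution failure.

    `resolver` is a hypothetical hook so the check can be exercised
    without network access; by default it is socket.getaddrinfo, the
    call that fails at the bottom of the traceback.
    """
    try:
        resolver(host, port, 0, socket.SOCK_STREAM)
        return True
    except socket.gaierror as err:
        # Same exception type as "[Errno -2] Name or service not known".
        print("cannot resolve {!r}: {}".format(host, err))
        return False
```

Running `can_resolve("www.pixiv.net")` with the container's own Python shows whether name resolution fails there, regardless of whether ping works on the host.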
402645707
2016-10-30 11:24:00 +08:00
After troubleshooting for ages I found that lxml wasn't set up. Solved it myself.
pwcong
2016-10-31 11:40:14 +08:00
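@402645707 (>人<;) Sorry about that, I've been out on long drives for my driving lessons these past two days and didn't see the message. The only external library is bs4 (with lxml as the parser).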
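
Since the only external dependencies mentioned are bs4 and lxml, a quick check that both are importable inside the image might look like the sketch below. `missing_modules` is a hypothetical helper, not something shipped with the crawler.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when no importable module has that name.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules(["bs4", "lxml"])` in the container should return an empty list once both packages are installed.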
402645707
2016-11-01 00:37:56 +08:00
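It worked fine when I debugged it locally, but once I put it on daodocker I'm stuck.
It looks like a dependency problem again, but this time my luck has run out.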

2016-11-01 00:36:26:start
2016-11-01 00:36:27:Traceback (most recent call last):
2016-11-01 00:36:27: File "all.py", line 23, in <module>
2016-11-01 00:36:27: opener = PixivLoginer.login(userid, password)
2016-11-01 00:36:27: File "/usr/src/app/api/PixivLoginer.py", line 56, in login
2016-11-01 00:36:27: data = utils.ungzip(data).decode()
2016-11-01 00:36:27:AttributeError: module 'utils' has no attribute 'ungzip'
2016-11-01 00:36:28:Traceback (most recent call last):
2016-11-01 00:36:28: File "man.py", line 23, in <module>
2016-11-01 00:36:28: opener = PixivLoginer.login(userid, password)
2016-11-01 00:36:28: File "/usr/src/app/api/PixivLoginer.py", line 56, in login
2016-11-01 00:36:28: data = utils.ungzip(data).decode()
2016-11-01 00:36:28:AttributeError: module 'utils' has no attribute 'ungzip'
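
An `AttributeError` like this can mean that Python imported a different module named `utils` than the one bundled with the project. That is only one possible cause and is not confirmed anywhere in this thread. A minimal sketch for checking which file was actually loaded, using a hypothetical helper `inspect_module`:

```python
import importlib


def inspect_module(name, attr):
    """Import `name`; return (path it was loaded from, whether it has `attr`)."""
    module = importlib.import_module(name)
    # __file__ is absent on built-in modules, hence the getattr default.
    origin = getattr(module, "__file__", None)
    return origin, hasattr(module, attr)
```

Running `print(inspect_module("utils", "ungzip"))` from the same working directory as the script shows where `utils` came from. If the path is not inside the project, another module with the same name is being picked up first.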
402645707
2016-11-01 00:38:38 +08:00
I mainly do embedded work and don't know Python well, so if you can, please give me some pointers.
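
For readers unfamiliar with the call in the traceback, `utils.ungzip(data).decode()` presumably decompresses a gzip-encoded response body before decoding it to text. The sketch below is an assumption about what such a helper could look like, written with the standard library only. It is not the author's implementation.

```python
import gzip


def ungzip(data):
    """Decompress gzip-encoded bytes; return the input unchanged otherwise."""
    try:
        return gzip.decompress(data)
    except OSError:
        # gzip raises OSError (BadGzipFile on Python 3.8+) when the
        # bytes do not start with a gzip header.
        return data
```

With this, `ungzip(gzip.compress(b"hello")).decode()` gives back the original text, and bytes that were never compressed pass through untouched.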
mistak1992
2016-11-02 09:20:22 +08:00
What on earth is the tag "小爬虫" (little crawler)~


https://www.v2ex.com/t/315075
