New here, asking about a bug...

2016-05-06 15:51:37 +08:00
 hqtc
import urllib.request

urllib.request.urlopen("http://hq.sinajs.cn/list=sh000001").read()

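Python version: 3.4.

This works fine on Windows, but on an Aliyun server it raises the exception below. On the same Aliyun server, swapping in a Baidu URL reads fine. What is going on?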

Traceback (most recent call last):
  File "/usr/lib/python3.4/urllib/request.py", line 1232, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib/python3.4/http/client.py", line 1065, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1103, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.4/http/client.py", line 1061, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.4/http/client.py", line 906, in _send_output
    self.send(msg)
  File "/usr/lib/python3.4/http/client.py", line 841, in send
    self.connect()
  File "/usr/lib/python3.4/http/client.py", line 819, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python3.4/socket.py", line 509, in create_connection
    raise err
  File "/usr/lib/python3.4/socket.py", line 500, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "price_and_deal_data.py", line 33, in <module>
    data_arr = urllib.request.urlopen(url).read().decode("gbk").split("=")[1].split(",")
  File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 455, in open
    response = self._open(req, data)
  File "/usr/lib/python3.4/urllib/request.py", line 473, in _open
    '_open', req)
  File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 1258, in http_open
    return self.do_open( http.client.HTTPConnection, req)
  File "/usr/lib/python3.4/urllib/request.py", line 1235, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
6068 views
Node: Python
9 replies
hging
2016-05-06 15:59:45 +08:00
My guess is that you crawled too much and Sina blocked you...
nellace
2016-05-06 16:14:49 +08:00
Looks like an encoding problem to me. Try removing the decode.
ldsink
2016-05-06 16:30:35 +08:00
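The error is "Connection timed out", meaning the connection attempt itself timed out. Something is wrong between Aliyun and the URL you are trying to download.

PS: don't use urllib, it's too cumbersome. Use requests, which has practically become a quasi-standard library: HTTP for Humans.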
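
For anyone who stays on urllib, here is a minimal sketch of catching this timeout instead of letting the traceback escape. The `fetch` helper, its `opener` parameter and the 10-second default are illustrative assumptions, not part of the original script.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout=10, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `fetch` and `opener` are hypothetical names used for this sketch;
    `opener` only exists so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.URLError as exc:
        # exc.reason holds the underlying error, e.g. the
        # TimeoutError(110, 'Connection timed out') seen in the traceback.
        print("could not reach", url, "->", exc.reason)
        return None
    except socket.timeout:
        # A read that stalls after the connection is open can surface here.
        print("timed out reading from", url)
        return None
```

Calling `fetch("http://hq.sinajs.cn/list=sh000001")` on the affected server would then presumably return `None` after the timeout rather than raising, which makes it easier to log the failure and carry on.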
tongle
2016-05-06 16:32:47 +08:00
Check whether your server can actually reach the URL.
star001007
2016-05-06 16:40:05 +08:00
Why are you scraping our Sina API?
thomasjiao
2016-05-06 16:47:42 +08:00
Try adding a Referer and a User-Agent header and so on. According to insiders, this kind of API can handle 10 million concurrent requests.
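
A minimal sketch of what adding those headers could look like with urllib. The header values and the `build_request` helper are placeholders chosen for illustration. Note that the traceback above fails inside `sock.connect`, before any headers are sent, so this may not help with that particular timeout.

```python
import urllib.request

URL = "http://hq.sinajs.cn/list=sh000001"  # URL from the original post


def build_request(url):
    """Build a Request with browser-like headers; nothing is sent yet."""
    return urllib.request.Request(
        url,
        headers={
            # Both values are assumptions for this sketch, not known requirements.
            "User-Agent": "Mozilla/5.0",
            "Referer": "http://finance.sina.com.cn/",
        },
    )


req = build_request(URL)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```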
cxbig
2016-05-06 18:37:29 +08:00
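Try curl or wget on the server to see whether the page can be fetched. That will tell you whether the problem is your program or the server's connectivity.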
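
If curl or wget is not at hand, roughly the same check can be sketched in Python by opening a bare TCP connection. The `can_connect` helper and its defaults are assumptions for illustration. It only tests reachability of the port, not whether the HTTP response is valid.

```python
import socket


def can_connect(host, port=80, timeout=5):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS lookup failures.
        return False
```

Running `can_connect("hq.sinajs.cn")` and `can_connect("www.baidu.com")` on the server would show whether only the first host is unreachable, which points at the network path or a block rather than at the script.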
hqtc
2016-05-06 18:40:38 +08:00
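Thanks, everyone. I just tried wget, and indeed my server can no longer reach this URL. I only fetch the values of the three major stock indices once a day at 5 pm. Does that really get you blocked? Unbelievable...

Is there any solution?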
xiahei
2016-05-07 14:36:33 +08:00
A sorrowful story!

https://www.v2ex.com/t/276786
