Crawler: UnicodeDecodeError when decoding a page's source code


This topic was created 2884 days ago; the information mentioned may have changed since then.

The JD.com page source declares <meta http-equiv="Content-Type" content="text/html; charset=gbk">, but both response.body.decode("utf-8") and response.body.decode("gbk") raise an error:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x81 in position 88852: illegal multibyte sequence

What causes this? In the end I can decode with response.text and get no error, but I don't understand why decoding with gbk fails. And which encoding does response.text use to decode the source?
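One way to reproduce this kind of error, offered as a sketch rather than a diagnosis of the JD.com page: a strict `bytes.decode("gbk")` raises as soon as it meets a single byte sequence that is not valid GBK, even if the rest of the page is fine. A page that declares `gbk` may still contain characters that only exist in the GB18030 superset, or stray corrupted bytes. The helper name `decode_html`, its fallback order, and the synthetic `sample` bytes below are all illustrative assumptions, not anything from the original post.

```python
def decode_html(body: bytes, declared: str = "gbk") -> str:
    """Decode page bytes, trying the declared charset first.

    The fallback order is an illustrative assumption, not a standard.
    """
    for encoding in (declared, "gb18030", "utf-8"):
        try:
            return body.decode(encoding)  # strict: raises on any invalid byte
        except UnicodeDecodeError:
            continue
    # Last resort: keep what can be decoded and mark the rest with U+FFFD.
    return body.decode(declared, errors="replace")


# Synthetic stand-in for a page body: GBK text plus one character that
# exists in GB18030 but not in GBK (GB18030 encodes it as four bytes).
sample = "京东".encode("gbk") + "𠀀".encode("gb18030")

try:
    sample.decode("gbk")
except UnicodeDecodeError as exc:
    # Same failure mode as in the question: one bad sequence aborts the decode.
    print("strict gbk failed:", exc.reason)

recovered = decode_html(sample)                   # succeeds via gb18030
lenient = sample.decode("gbk", errors="replace")  # never raises
```

If `response` here is a Scrapy `TextResponse` (an assumption, since the post does not name the framework), `response.text` is the body decoded to a Python `str` using `response.encoding`, so printing `response.encoding` would show which encoding was actually chosen for the page.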
