Python logging FileHandler raises an error when writing emoji

2021-02-28 22:38:49 +08:00
 chenqh

Code

import logging
from logging import config

LOGGING_CONFIG = {
        "version": 1,
        "formatters": {
            "default": {
                'format': '%(asctime)19.19s %(levelname)1.1s %(message)s',
            },
            "file": {
                'format': '%(asctime)19.19s %(filename)s %(lineno)s %(levelname)1.1s %(message)s',
            },
            "plain": {
                "format": "%(message)s",
            },
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "INFO",
                "formatter": "default",
            },

            "file": {
                "class": "logging.FileHandler",
                "level": 20,
                "filename": "./log.txt",
                "formatter": "default",
            },
            "rotate_file": {
                "class": "logging.handlers.RotatingFileHandler",
                "level": 20,
                "filename": "./log.txt",
                "formatter": "default",
                "maxBytes": 52428800,
                "backupCount": 7,
            }
        },
        "loggers": {

            "tmp": {
                "handlers": ["console", "rotate_file"],
                "level": "INFO",
                "propagate": False,
            },

        },
        "disable_existing_loggers": True,
    }

config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("tmp")


logger.info("吃\ud83d\udc3a")


Error message

2021-02-28 22:35:55 I 吃\ud83d\udc3a
--- Logging error ---
Traceback (most recent call last):
  File "/home/vagrant/.pyenv/versions/3.6.9/lib/python3.6/logging/__init__.py", line 996, in emit
    stream.write(msg)
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 23-24: surrogates not allowed
Call stack:
  File "/home/vagrant/code/xxx/tmp.py", line 55, in <module>
    logger.info("吃\ud83d\udc3a")
Message: '吃\ud83d\udc3a'
Arguments: ()

How do I fix this kind of problem?

2174 views
Node: Python
8 replies
Sunyanzi
2021-02-28 23:39:54 +08:00
Isn't it enough to just encode it ... Change the last line to this ...

logger.info("吃\ud83d\udc3a".encode('unicode-escape'))
lxy42
2021-02-28 23:48:14 +08:00
```
In [38]: print('\U0001f43a')
🐺
In [39]: hex(ord('🐺'))
Out[39]: '0x1f43a'
```
chenqh
2021-03-01 00:06:31 +08:00
@Sunyanzi Your version doesn't raise an error, but the log now looks like this

```
2021-03-01 00:05:07 I b'\\u5403\\ud83d\\udc3a'
```

The log is no longer human-readable, so it has lost its purpose as a log.
chenqh
2021-03-01 00:08:35 +08:00
@lxy42 With only StreamHandler there is no error either; the point is that FileHandler is what raises the error.
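
A likely reason for the difference, as a minimal sketch: `StreamHandler` writes to `sys.stderr` by default, and Python sets up `sys.stderr` with the `backslashreplace` error handler, whereas the file opened by `FileHandler` uses strict error handling. The snippet reproduces both behaviours with plain `str.encode`. The helper `encode_for_log`, the logger name `surrogate_demo` and the temporary log path are made up for illustration. The second half assumes Python 3.9 or later (not the 3.6.9 shown in the traceback), where `logging.FileHandler` accepts an `errors` argument.

```python
import logging
import os
import sys
import tempfile

text = "吃\ud83d\udc3a"  # same message as in the original post


def encode_for_log(message, errors):
    """Hypothetical helper: encode the way a UTF-8 text stream would."""
    return message.encode("utf-8", errors)


# Strict handling (the default for the file FileHandler opens) rejects surrogates.
try:
    encode_for_log(text, "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# backslashreplace (what sys.stderr uses) writes the surrogates as escapes instead.
escaped = encode_for_log(text, "backslashreplace")

# Assumption: Python 3.9+, where FileHandler gained the `errors` parameter.
if sys.version_info >= (3, 9):
    log_path = os.path.join(tempfile.mkdtemp(), "log.txt")  # illustrative path
    handler = logging.FileHandler(log_path, encoding="utf-8",
                                  errors="backslashreplace")
    demo_logger = logging.getLogger("surrogate_demo")  # hypothetical logger name
    demo_logger.propagate = False
    demo_logger.addHandler(handler)
    demo_logger.warning(text)  # written to the file as 吃\ud83d\udc3a
    demo_logger.removeHandler(handler)
    handler.close()
```

With `dictConfig` on 3.9+, the same effect should be reachable by adding `"encoding": "utf-8"` and `"errors": "backslashreplace"` to the handler entry, since extra keys are passed to the handler constructor. On older versions this option is not available.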
Sunyanzi
2021-03-01 00:12:28 +08:00
@chenqh So you only need the Chinese to stay human-readable ..? Then switch to repr instead ...

Change the argument to ("%r" % "吃\ud83d\udc3a") ...
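
A small sketch of what the `%r` suggestion produces, for anyone who wants to check it before touching the logging call. The variable names are only illustrative.

```python
text = "吃\ud83d\udc3a"  # message from the original post

# "%r" % text is the same as repr(text): printable characters such as 吃 are
# kept, while the surrogates are written out as backslash escapes.
readable = "%r" % text

# The result contains no surrogate code points any more, so UTF-8 can encode it.
encoded = readable.encode("utf-8")

print(readable)
```

Note that the repr keeps the surrounding quotes, so the log line shows the message in quotes with the emoji as escapes rather than as 🐺.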
Sylv
2021-03-01 04:20:31 +08:00
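\ud83d\udc3a is the UTF-16 surrogate pair for 🐺 (U+1F43A). In a Python str these are two separate surrogate code points, so encoding them again with the default UTF-8 codec is bound to fail.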
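
Building on this explanation, one possible fix is to merge the surrogate pair back into the real character before the record reaches any handler. This is a sketch, not something proposed in the thread: `merge_surrogates` and `SurrogateRepairFilter` are hypothetical names, and the filter only repairs `record.msg`, not the formatting arguments.

```python
import logging


def merge_surrogates(text):
    """Combine UTF-16 surrogate pairs in a str into the characters they encode."""
    # surrogatepass lets the surrogates through as raw UTF-16 code units;
    # decoding then pairs them up. Unpaired leftovers become U+FFFD.
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le", "replace")


class SurrogateRepairFilter(logging.Filter):
    """Hypothetical filter that repairs the message before handlers format it."""

    def filter(self, record):
        if isinstance(record.msg, str):
            record.msg = merge_surrogates(record.msg)
        return True  # keep every record


repaired = merge_surrogates("吃\ud83d\udc3a")
print(repaired.encode("utf-8"))  # encodes without raising
```

Attached with `logger.addFilter(SurrogateRepairFilter())` on the `tmp` logger from the post, this should let the file handler write the emoji itself instead of escapes.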
Sylv
2021-03-01 04:20:47 +08:00
0x0208v0
2021-03-01 16:02:58 +08:00
@Sylv
@Sunyanzi That's impressive, I learned something.

https://www.v2ex.com/t/757063