请教一个关于 PostgreSQL 连接被 Server 关闭的问题

195 天前
 Koril

背景

我在自己的阿里云服务器上( 2C2G ,3M ,Debian 12 )装了一个 PostgreSQL ( 15 ),安装后,仅仅做了以下配置改动:

pg_hba.conf:

添加:host all all 0.0.0.0/0 md5

postgresql.conf

开放端口:listen_addresses = '*'

给 postgres 设置了密码,sudo -i -u postgres -> psql -> \password


异常

无论是 Navicat 还是 Python 的 Psycopg2 在超过一定时间(大概 3-5 分钟),就会连接失效了。

Navicat 报错如下:

Server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.

Psycopg2 报错如下:

psycopg2.OperationalError: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request. 

已知

服务器没有太大的负载,cpu ,内存,磁盘,网络,占用率都很低,除了 Prometheus/Grafana/nginx 之外,这个机器上只有刚刚安装的 PostgreSQL 。

重新获取连接,能够正常访问,但是我使用的 Psycopg2 的 ThreadedConnectionPool 连接池,连接池没法保持连接么?代码如下:


def get_pg_pool(pg_config):
    pool = psycopg2.pool.ThreadedConnectionPool(
        minconn=1,
        maxconn=200,
        host=pg_config['PG_HOST'],
        port=pg_config['PG_PORT'],
        dbname=pg_config['PG_DB'],
        user=pg_config['PG_USER'],
        password=pg_config['PG_PASSWORD'],
        connect_timeout=5,
    )
    return pool
    
@flask_app.route('/task/log', methods=['GET'])
    def task_log():
        task_log_list = []
        # 在进程启动时,给 flask_app 初始化了一个 pg_pool 对象
        conn = flask_app.pg_pool.getconn()
        try:
            with conn.cursor() as cur:
                cur.execute('SELECT * FROM t_log')
                task_log_list = cur.fetchall()
        except psycopg2.Error as e:
            return JsonResult.failed('获取日志列表失败')
        finally:
            flask_app.pg_pool.putconn(conn)

        return JsonResult.successful(task_log_list)


问题

这个问题是和 Linux 服务器配置有关呢?还是跟 PostgreSQL 配置有关?请问如何排查和解决呢?

1589 次点击
所在节点    PostgreSQL
3 条回复
zbinlin
195 天前
你看下 pg 的日志里有什么报错信息
Koril
195 天前
@zbinlin /var/log/postgresql/postgresql-15-main.log 的日志里显示:

2025-03-11 11:47:12.722 CST [1370665] postgres@badminton LOG: could not receive data from client: Connection timed out
2025-03-11 11:47:12.722 CST [1370663] postgres@badminton LOG: could not receive data from client: Connection timed out
2025-03-11 11:49:03.314 CST [1370701] postgres@badminton LOG: could not receive data from client: Connection timed out
Nt6Z1g
195 天前
印象中阿里云服务器修改了 OS 的 tcp_keepalive 参数, pg 默认用系统的参数会导致这个问题. 你手动设置 pg 的 tcp_keepalive 参数试试. 比如

```
select name,setting from pg_settings where name like 'tcp_keepalives%';
ALTER SYSTEM set tcp_keepalives_idle = 600;
ALTER SYSTEM set tcp_keepalives_interval = 30;
ALTER SYSTEM set tcp_keepalives_count = 10;
SELECT pg_reload_conf();
```

这是一个专为移动设备优化的页面(即为了让你能够在 Google 搜索结果里秒开这个页面),如果你希望参与 V2EX 社区的讨论,你可以继续到 V2EX 上打开本讨论主题的完整版本。

https://www.v2ex.com/t/1117481

V2EX 是创意工作者们的社区,是一个分享自己正在做的有趣事物、交流想法,可以遇见新朋友甚至新机会的地方。

V2EX is a community of developers, designers and creative people.

© 2021 V2EX