A question about ab load testing

2019-11-17 23:12:44 +08:00
 whoami9894
Concurrency Level:      1000
Time taken for tests:   14.033 seconds
Complete requests:      10000
Failed requests:        9761
   (Connect: 0, Receive: 0, Length: 9761, Exceptions: 0)
Total transferred:      25209257 bytes
HTML transferred:       21404477 bytes
Requests per second:    712.61 [#/sec] (mean)
Time per request:       1403.301 [ms] (mean)
Time per request:       1.403 [ms] (mean, across all concurrent requests)
Transfer rate:          1754.32 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      198  798 312.5    727    3163
Processing:   172  539 141.7    530    1041
Waiting:       22  236 116.8    209     776
Total:        444 1337 353.1   1276    3957

Percentage of the requests served within a certain time (ms)
  50%   1276
  66%   1350
  75%   1433
  80%   1502
  90%   1739
  95%   2117
  98%   2376
  99%   2538
 100%   3957 (longest request)

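This service lets students at our school look up their information, and the request I'm load-testing performs one database query. If roughly 40,000 students visit within a single day to update their personal information (assume the 40,000 requests are spread evenly over 8 hours), is there a risk that the service can't cope?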
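
As a rough sanity check on the question above, the average load can be worked out directly. The sketch below assumes one request per student and a perfectly even spread over the 8 hours (both simplifications taken from the question), and compares that with the 712.61 requests per second ab reported.

```python
# Back-of-the-envelope load estimate. Assumes one request per student,
# spread evenly over the 8-hour window described in the question.
students = 40_000
window_seconds = 8 * 3600

average_rps = students / window_seconds   # requests per second needed on average
measured_rps = 712.61                     # "Requests per second" from the ab run above

print(f"average load: {average_rps:.2f} req/s")
print(f"headroom: about {measured_rps / average_rps:.0f}x")
```

Real traffic is burstier than this, so the average is only a lower bound on what the service has to handle.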

1594 views
Node: Q&A
18 replies
chenset
2019-11-17 23:39:43 +08:00
Isn't 712.61 [#/sec] already quite high? Judging by machine performance alone, it's enough.
richangfan
2019-11-18 00:03:02 +08:00
Failed requests: 9761
ClericPy
2019-11-18 00:05:18 +08:00
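I'm not that familiar with backends, but these results don't look too bad. Many responses take over 1 second, though: have you analyzed where it blocks? For example, the SQL statements may need optimizing. If it's just network transfer, it isn't really a problem.

Cache the GET requests properly, and for POST you could simulate a higher concurrency. Databases generally hold up far better than you'd expect, and the user base is small anyway. If it really can't cope, switch to an MQ-style design, add machines, split database reads and writes, or move hot and cold data into separate databases.

With the student system I used back in school, waiting a second or two when editing information was acceptable, as long as a failure didn't make me fill everything in again. The academic affairs MIS I used back then would even tell me the queue was congested and make me wait in line for some number of seconds... it felt like playing an online game. Still, that design is quite useful for limiting concurrency on low-spec hardware (it saves cost when high concurrency is the exception rather than the norm).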
Flasky
2019-11-18 00:08:32 +08:00
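This performance should be fine. Honestly, 40,000 students across 4 year groups would probably peak at only about 5000 requests/minute. Our school has 20,000 enrolled students, and course selection averages 1500/minute; the highest peak, 4000/minute, came from a student hammering the system with a script, and luckily monitoring caught it and cut off his network access in time.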
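
To compare these per-minute peaks with ab's per-second figure, divide by 60. A small sketch using only numbers quoted in this thread (the labels are paraphrases added for readability):

```python
# Convert the per-minute peaks mentioned above into requests per second.
peaks_per_minute = {
    "estimated peak, 40,000 students": 5000,
    "typical course selection, 20,000 students": 1500,
    "script-driven spike": 4000,
}
measured_rps = 712.61  # from the ab run in the original post

for label, per_minute in peaks_per_minute.items():
    rps = per_minute / 60
    print(f"{label}: {rps:.1f} req/s ({rps / measured_rps:.0%} of measured throughput)")
```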
lbp0200
2019-11-18 09:09:12 +08:00
Change the concurrency to 20 and test again.
phpdever
2019-11-18 10:35:44 +08:00
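Going by the load test results: concurrency was 1000, 10000 requests were made, and 9761 of them failed.

Complete requests: 10000

Failed requests: 9761

Requests per second: 712.61

I suspect something is wrong with the test setup. Watch the server load while the test runs.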
whoami9894
2019-11-18 13:17:55 +08:00
@chenset What I don't really understand is what an rps of 700 actually amounts to. If, say, 2000 people sent requests at the same moment, would it go down?
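
One way to put a number on this is Little's law: requests in flight = throughput × time each request spends in the system. The sketch below assumes throughput stays near the measured value as concurrency grows, which may not hold once the server saturates or runs into connection limits.

```python
def mean_latency_seconds(in_flight, throughput_rps):
    """Little's law rearranged: W = L / lambda."""
    return in_flight / throughput_rps

# Check against the run in the original post: 1000 in flight at 712.61 req/s.
print(round(mean_latency_seconds(1000, 712.61), 3))

# Projection for 2000 simultaneous requests at the same throughput (an assumption).
print(round(mean_latency_seconds(2000, 712.61), 3))
```

The first value comes out at about 1.4 s, in line with the 1403 ms ab reported. The second is about 2.8 s: slower responses rather than an outright crash, provided nothing else gives out first.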
whoami9894
2019-11-18 13:18:33 +08:00
@richangfan I hadn't even noticed until you pointed it out: over 90% of the requests failed...
whoami9894
2019-11-18 13:22:17 +08:00
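@ClericPy All this endpoint does is read the username from the session and then run a single SELECT against the database. Tested this way, I suspect the database cache is also boosting the numbers.

I switched to `-n 3000 -c 1000` and got the result below; there are still too many failures.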

```
Concurrency Level: 1000
Time taken for tests: 5.023 seconds
Complete requests: 3000
Failed requests: 2421
(Connect: 0, Receive: 0, Length: 2421, Exceptions: 0)
Total transferred: 6381837 bytes
HTML transferred: 5230257 bytes
Requests per second: 597.29 [#/sec] (mean)
Time per request: 1674.237 [ms] (mean)
Time per request: 1.674 [ms] (mean, across all concurrent requests)
Transfer rate: 1240.82 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 380 807 255.4 738 1678
Processing: 151 518 156.7 480 1094
Waiting: 28 264 151.7 223 830
Total: 594 1325 242.7 1277 2621

Percentage of the requests served within a certain time (ms)
50% 1277
66% 1387
75% 1457
80% 1531
90% 1642
95% 1769
98% 1994
99% 2087
100% 2621 (longest request)
```
whoami9894
2019-11-18 13:23:27 +08:00
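@Flasky I'm not hoping to survive course-grab levels of concurrency, lol. Our academic affairs system goes down every single time course selection opens; a course-grabbing script has to keep a pool of TCP connections open before selection starts.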
whoami9894
2019-11-18 13:24:42 +08:00
@lbp0200
Here is the result for `-c 20 -n 1000`:

```
Concurrency Level: 20
Time taken for tests: 1.383 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 2532000 bytes
HTML transferred: 2152000 bytes
Requests per second: 723.07 [#/sec] (mean)
Time per request: 27.660 [ms] (mean)
Time per request: 1.383 [ms] (mean, across all concurrent requests)
Transfer rate: 1787.92 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 7 19 5.4 18 39
Processing: 2 8 3.9 8 27
Waiting: 1 6 3.3 5 25
Total: 10 27 6.3 27 54

Percentage of the requests served within a certain time (ms)
50% 27
66% 29
75% 30
80% 31
90% 36
95% 40
98% 45
99% 46
100% 54 (longest request)
```
whoami9894
2019-11-18 13:33:45 +08:00
@phpdever
Server load during `-c 1000 -n 3000`:

```
top - 13:30:23 up 51 days, 16:50, 2 users, load average: 1.67, 0.46, 0.18
Tasks: 269 total, 3 running, 197 sleeping, 0 stopped, 0 zombie
%Cpu(s): 72.1 us, 7.5 sy, 0.0 ni, 18.5 id, 0.0 wa, 0.0 hi, 1.9 si, 0.0 st
KiB Mem : 16422300 total, 1691276 free, 1062368 used, 13668656 buff/cache
KiB Swap: 2097148 total, 2097148 free, 0 used. 15068552 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22322 root 20 0 120432 72864 11844 R 496.3 0.4 5:00.33 main
11740 root 20 0 115144 75736 5228 R 98.7 0.5 0:04.26 ab
4207 mysql 20 0 4790288 291612 15756 S 44.9 1.8 31:01.26 mysqld
```


```
Concurrency Level: 1000
Time taken for tests: 4.461 seconds
Complete requests: 3000
Failed requests: 2677
(Connect: 0, Receive: 0, Length: 2677, Exceptions: 0)
Total transferred: 6918669 bytes
HTML transferred: 5772209 bytes
Requests per second: 672.43 [#/sec] (mean)
Time per request: 1487.146 [ms] (mean)
Time per request: 1.487 [ms] (mean, across all concurrent requests)
Transfer rate: 1514.42 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 156 753 286.5 689 1852
Processing: 57 517 166.8 534 1167
Waiting: 9 243 132.5 208 904
Total: 326 1271 353.3 1284 2457

Percentage of the requests served within a certain time (ms)
50% 1284
66% 1355
75% 1384
80% 1481
90% 1696
95% 1965
98% 2169
99% 2209
100% 2457 (longest request)
```
chenset
2019-11-18 14:16:53 +08:00
What's going on with the Failed requests? Check ulimit -n: you may be exceeding the limit on open connections.
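
Each open socket consumes a file descriptor, so a process with a low `ulimit -n` (1024 is a commonly seen default) cannot hold 1000 simultaneous connections on top of its other files. A minimal sketch of that check using Python's standard `resource` module (Unix only); the `reserved` allowance is an illustrative assumption, and the thread does not say whether the limit was hit by ab or by the server:

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def can_hold(connections, soft_limit, reserved=64):
    """True if `connections` sockets plus `reserved` other descriptors fit under the soft limit.

    `reserved` is a hypothetical allowance for logs, listeners and database connections.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return connections + reserved <= soft_limit

soft, hard = nofile_limits()
print(f"soft={soft} hard={hard} fits 1000 connections: {can_hold(1000, soft)}")
```

With a soft limit of 1024, `can_hold(1000, 1024)` is false under this allowance, while a limit of 65535 leaves ample room.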
whoami9894
2019-11-18 16:17:15 +08:00
@chenset That was indeed it. After raising it to 2^16-1, things look fine.

```
Concurrency Level: 1000
Time taken for tests: 13.114 seconds
Complete requests: 10000
Failed requests: 421
(Connect: 0, Receive: 0, Length: 421, Exceptions: 0)
Total transferred: 24437163 bytes
HTML transferred: 20628743 bytes
Requests per second: 762.53 [#/sec] (mean)
Time per request: 1311.422 [ms] (mean)
Time per request: 1.311 [ms] (mean, across all concurrent requests)
Transfer rate: 1819.74 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 188 724 226.3 670 2185
Processing: 118 540 146.4 538 1997
Waiting: 22 232 137.8 188 1775
Total: 465 1265 265.6 1234 2689

Percentage of the requests served within a certain time (ms)
50% 1234
66% 1315
75% 1370
80% 1409
90% 1538
95% 1703
98% 2100
99% 2193
100% 2689 (longest request)
```
whoami9894
2019-11-18 16:22:31 +08:00
@ClericPy
Time per request: 1311.422 [ms] (mean)
Time per request: 1.311 [ms] (mean, across all concurrent requests)
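I looked it up: the figure above 1 s should be the time for all 1000 concurrent clients to each complete one request, which averages out to about 1 ms per request, so it seems about right.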
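
The relationship between the two "Time per request" lines can be reproduced from ab's raw counters. The sketch below recomputes the three headline figures for the run above; the function name is made up for illustration, and small differences from ab's output come from the rounded "Time taken" value.

```python
def ab_summary(concurrency, complete, seconds):
    """Recompute ab's three headline figures from its raw counters."""
    rps = complete / seconds
    per_request_ms = concurrency * seconds * 1000 / complete   # "(mean)"
    across_all_ms = seconds * 1000 / complete                  # "(mean, across all concurrent requests)"
    return rps, per_request_ms, across_all_ms

rps, per_request_ms, across_all_ms = ab_summary(1000, 10000, 13.114)
print(f"{rps:.2f} req/s, {per_request_ms:.1f} ms, {across_all_ms:.3f} ms")
```

So the 1311 ms figure is roughly what each of the 1000 simulated clients waited per request, while 1.311 ms is simply the inverse of throughput, not the time a single request takes to process.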
ClericPy
2019-11-18 16:32:04 +08:00
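@whoami9894 #9 If the Failed requests are all of the Length kind, you can generally ignore them. They are reported as failures because the length of the response varied between requests, which isn't necessarily a real error. If there were a real error, you'd see it in the status codes or at the connection level.
If it's just a single-table SELECT, this performance isn't quite normal. Is the query missing the index? (Check with EXPLAIN.)

Don't count too much on the database's cache. Consider putting an in-memory LRU cache in front of the function; if needed, a Redis cache layer can improve performance further and is more flexible. Given how much Time per request differs between concurrency levels, the bottleneck is probably on the database side, so you could also consider a larger connection pool.

Blindly cranking up concurrency in a test isn't much use. In practice the QPS will be fairly even and is unlikely to reach 700; 40,000 people over eight hours in a day isn't that much, unless you also run a course-grabbing system. That one is a real trap: if it falters even slightly, it effectively gets DoSed...

My understanding of testing isn't very deep either. TPS and QPS are only a reference and can't predict a complex environment under real high load. Besides, even if traffic isn't spread evenly over the eight hours, say students all make their edits over lunch, the concurrency is still limited.

In short, it looks like there is still time to be shaved off the database operations, in the query function or the table structure; at the very least, adding a cache would help. The current performance is already fine for students, so set up good logging and try it live on the real machine.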
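
The in-memory LRU cache suggested above could look roughly like this in Python. Everything here is an illustrative assumption: the thread does not say what language the service is written in, the table and function names are made up, and `sqlite3` stands in for MySQL only so the sketch runs anywhere.

```python
import sqlite3
from functools import lru_cache

# Stand-in database (the thread uses MySQL; this schema is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (username TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES ('s001', 'Alice')")

query_count = 0  # counts how often the database is actually hit

@lru_cache(maxsize=4096)
def get_student(username):
    """Return the row for one student, caching results in process memory."""
    global query_count
    query_count += 1
    return conn.execute(
        "SELECT username, name FROM students WHERE username = ?", (username,)
    ).fetchone()

def update_name(username, name):
    """Write to the database, then drop cached rows so later reads stay fresh."""
    conn.execute("UPDATE students SET name = ? WHERE username = ?", (name, username))
    get_student.cache_clear()

get_student("s001")
get_student("s001")              # served from the cache, no second query
print(query_count)               # 1
update_name("s001", "Alicia")
print(get_student("s001"))       # ('s001', 'Alicia')
```

Because students edit their own records, the cache has to be invalidated on writes; clearing the whole cache is the bluntest option, and a per-process cache like this would also go stale if several server processes share one database.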
ClericPy
2019-11-18 16:36:40 +08:00
@whoami9894 #15 You're right... it looks fine and needs no changes... add caching if you feel like it.
whoami9894
2019-11-18 17:34:12 +08:00
@ClericPy
Learned a lot, thanks!
