Environment:
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
macOS 10.13.6
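A friend asked whether a thread can recover from an OOM, and sent me an article found online. Its gist was that if one thread hits an OOM, the other threads keep running normally.
That got me thinking: if the main thread keeps creating threads like crazy, it will also throw an OOM. If I catch that error and then stop the threads I created, would that work? So I wrote the code below to test it.
JVM options: -Xms256m -Xmx256m -Xss1m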
import java.util.ArrayList;
import java.util.List;

public class TestThread {
    public static void main(String[] args) throws InterruptedException {
        List<Thread> list = new ArrayList<>();
        try {
            while (true) {
                Thread t = new Thread(() -> {
                    System.out.println(Thread.currentThread().getName());
                    try {
                        Thread.sleep(Integer.MAX_VALUE);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                });
                list.add(t);
                t.start();
            }
        } catch (Throwable e) {
            e.printStackTrace();
            System.out.println("list size =================" + list.size());
            System.out.println("thread active =================" + Thread.activeCount());
            list.forEach(testThread -> {
                testThread.stop();
            });
            System.out.println("stop thread over");
        }
        Thread.sleep(2000L);
        System.out.println("thread active =================" + Thread.activeCount());
        Thread.currentThread().getThreadGroup().list();
    }
}
Partial log output:
Thread-2024
Thread-2025
Thread-2026
Thread-2027
java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:717)
	at com.xzy.test.TestThread.main(TestThread.java:25)
list size =================2029
thread active =================2030
stop thread over
thread active =================2
java.lang.ThreadGroup[name=main,maxpri=10]
    Thread[main,5,main]
    Thread[Monitor Ctrl-Break,5,main]
Process finished with exit code 0
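A side note on the test above: Thread.stop() has been deprecated for a long time, and recent JDKs refuse to stop threads this way. Since the worker threads here are only sleeping, interrupting them should release them just as well. The following is a minimal sketch of that variation, not the code that produced the log above; the class name InterruptInsteadOfStop and the helper startAndInterrupt are made up for illustration, and the thread count is kept small so no OOM is involved.

```java
import java.util.ArrayList;
import java.util.List;

public class InterruptInsteadOfStop {

    // Hypothetical helper: starts `count` sleeping threads, interrupts them,
    // waits for them to finish, and returns how many are still alive.
    static int startAndInterrupt(int count) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(Integer.MAX_VALUE);
                } catch (InterruptedException e) {
                    // Interrupted while sleeping: simply let run() return.
                }
            });
            threads.add(t);
            t.start();
        }

        // Cooperative replacement for testThread.stop().
        threads.forEach(Thread::interrupt);

        int alive = 0;
        for (Thread t : threads) {
            try {
                t.join(1000L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (t.isAlive()) {
                alive++;
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive after interrupt =================" + startAndInterrupt(50));
    }
}
```

This only works because the task reacts to interruption; a thread stuck in a busy loop that never checks its interrupt flag would not be released this way.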
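The log shows that after the error was caught, the main thread ran the rest of the code and exited normally (exit code 0). Then I wondered: what if these were thread pools instead? This time the result was something I could not quite understand.
JVM options are the same: -Xms256m -Xmx256m -Xss1m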
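import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestThreadPool {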
    public static void main(String[] args) throws InterruptedException {
        List<ExecutorService> list = new ArrayList<>();
        try {
            while (true) {
                ExecutorService executorService = Executors.newFixedThreadPool(10);
                int i = 0;
                while (i++ <= 10) {
                    executorService.submit(() -> {System.out.println(Thread.currentThread().getName());});
                }
                list.add(executorService);
            }
        } catch (Throwable e) {
            System.out.println("catch throwable");
            e.printStackTrace();
            System.out.println("============" + Thread.currentThread().getName() + " collect thread start!");
            System.out.println("list size ============" + list.size());
            list.forEach(ExecutorService::shutdownNow);
            System.out.println("============" + Thread.currentThread().getName() + " shutdown success!");
        }
        list.forEach(executorService -> {
            if (!executorService.isShutdown()) {
                System.out.println("============ alive executorservice");
            }
        });
        Thread.sleep(2000L);
        System.out.println("thread active count ============" + Thread.activeCount());
        Thread.currentThread().getThreadGroup().list();
    }
}
Partial log output:
pool-203-thread-4
pool-203-thread-5
pool-203-thread-6
pool-203-thread-7
catch throwable
java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:717)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
	at com.xzy.test.TestThreadPool.main(TestThreadPool.java:20)
============main collect thread start!
list size ============202
============main shutdown success!
thread active count ============9
java.lang.ThreadGroup[name=main,maxpri=10]
    Thread[main,5,main]
    Thread[Monitor Ctrl-Break,5,main]
    Thread[pool-203-thread-1,5,main]
    Thread[pool-203-thread-2,5,main]
    Thread[pool-203-thread-3,5,main]
    Thread[pool-203-thread-4,5,main]
    Thread[pool-203-thread-5,5,main]
    Thread[pool-203-thread-6,5,main]
    Thread[pool-203-thread-7,5,main]
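The JVM did not exit. If I understand correctly, that is because the pool-203 thread pool is still running. But why is pool-203 stuck, still in the same state as before the error was caught (7 threads created)? I looked at the thread states in VisualVM and found that the pool-203-thread-* threads are all in the WAITING state because they called java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#await().
It is called from java.util.concurrent.ThreadPoolExecutor#awaitTermination, a method that, once a shutdown has been requested, blocks until all tasks have finished executing.
I'm not sure whether there is a mistake somewhere in the analysis above. If it is correct, my question is: why does it get stuck at java.util.concurrent.ThreadPoolExecutor#awaitTermination? Is it because creating too many threads caused the OOM before the last thread pool was fully set up, so that after the main thread recovered it could not completely reclaim that last pool?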
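One thing that may be worth checking, judging only from the code and log above: the pool is added to list after the submit loop, and the log reports a list size of 202 while the surviving threads belong to pool-203. If the OOM was thrown inside submit for pool-203, that pool never reached the list, so shutdownNow was never called on it. Also, the test code never calls awaitTermination; idle workers of a fixed pool normally block while waiting for the next task from the work queue, which would also show up as ConditionObject#await(). The sketch below tries to reproduce the ordering issue without a real OOM. Everything in it is an assumption for illustration: the class PoolRegistrationSketch, the helper limitedFactory, the limit of 25 threads, and the IllegalStateException that stands in for "unable to create new native thread".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolRegistrationSketch {

    // Hypothetical stand-in for native thread exhaustion: the factory refuses
    // to create more than `limit` threads in total.
    static ThreadFactory limitedFactory(AtomicInteger created, int limit) {
        return r -> {
            if (created.incrementAndGet() > limit) {
                throw new IllegalStateException("simulated: unable to create new thread");
            }
            return new Thread(r);
        };
    }

    // Returns the number of pools that were NOT shut down by the catch block.
    static int run(boolean registerFirst) {
        ThreadFactory factory = limitedFactory(new AtomicInteger(), 25);
        List<ExecutorService> registered = new ArrayList<>();
        List<ExecutorService> all = new ArrayList<>(); // bookkeeping for the check only
        try {
            while (true) {
                ExecutorService pool = Executors.newFixedThreadPool(10, factory);
                all.add(pool);
                if (registerFirst) {
                    registered.add(pool);
                }
                for (int i = 0; i < 10; i++) {
                    pool.submit(() -> {}); // each submit starts a new core thread
                }
                if (!registerFirst) {
                    registered.add(pool); // same ordering as in the test above
                }
            }
        } catch (Throwable t) {
            registered.forEach(ExecutorService::shutdownNow);
        }

        int leaked = 0;
        for (ExecutorService pool : all) {
            if (!pool.isShutdown()) {
                leaked++;
                pool.shutdownNow(); // clean up so the JVM can exit
            }
        }
        return leaked;
    }

    public static void main(String[] args) {
        System.out.println("register after submit, leaked pools: " + run(false));
        System.out.println("register before submit, leaked pools: " + run(true));
    }
}
```

With these assumptions the first call should report one leaked pool and the second none, which suggests that moving list.add(executorService) to right after the pool is created would let the catch block shut down the last pool as well.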