We have been hitting this bug quite often while running Tomcat 8.5 on Amazon Linux 2 with kernel 4.14.268-205.500.amzn2.x86_64.
I wanted to see whether the bug could be reproduced on a newer kernel, so I tried the server code and methodology provided by Mark Thomas on Ubuntu Server 21.10 (running on a Raspberry Pi 4 with 4GB RAM, kernel 5.13.0-1008-raspi) and was NOT able to repro the bug. I then installed Ubuntu Server 20.04 LTS on the same machine (kernel 5.4.0-1052-raspi) and WAS able to repro it. The bug was fairly easy to trigger and did not take multiple attempts. Since then I have been able to repro the bug using the server code on Amazon Linux 2 with the 4.14.268-205.500.amzn2.x86_64 kernel, but not on Amazon Linux 2 with a 5.10.109-104.500.amzn2.x86_64 kernel.

I think there is a slight problem with the server code used in the repro: it calls `pthread_create` with no thread attributes, which creates joinable threads instead of detached threads. The documentation for `pthread_create` says that "Only when a terminated joinable thread has been joined are the last of its resources released back to the system." Because the server code never joins its threads, I think the OS is prevented from releasing the thread resources. The server eventually runs out of memory and reports "pthread_create: Cannot allocate memory", as mentioned by Brooke Hedrick in their comment.

I was also not able to repro the bug on WSL (kernel 4.4.0-19041-Microsoft), but perhaps their underlying network drivers are different?

I ran into the same out-of-memory error when running the server code, so I made a slight modification to set the pthread attribute that creates new threads in a detached state. This seemed to solve the memory issue, and I was able to repro the bug with the modified server. I've attached the code.

Additionally, I found it useful to use `prlimit` to raise the maximum number of open files for the server process once it was running.
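For readers without the attachment, the two adjustments described above can be sketched roughly as follows. This is not the attached server.c; it is a minimal illustration, and the names `spawn_detached`, `raise_nofile_limit` and `handle_client` are hypothetical. It also assumes the open-file limit is raised from inside the process with `getrlimit`/`setrlimit`, which is a substitute for running the `prlimit` command against an already-running server.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical placeholder for the per-connection worker. */
static void *handle_client(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start a thread in the detached state so that its resources are
 * released when it exits, without anyone calling pthread_join.
 * Returns 0 on success or the error number from the pthread call. */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    assert(fn != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}

/* Raise the soft RLIMIT_NOFILE of the calling process to its hard limit.
 * Returns 0 on success, -1 on failure with errno set. */
static int raise_nofile_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```

In an accept loop, `spawn_detached(handle_client, ...)` would stand in for the bare `pthread_create` call, and `raise_nofile_limit()` would run once at startup. In my testing the limit was instead changed from outside, with `prlimit`, after the server had started.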
This made the server less likely to run into an EMFILE error when calling `accept`.

** Attachment added: "Updated server to demonstrate bug"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1924298/+attachment/5582247/+files/server.c

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1924298

Title:
  accept returns duplicate endpoints under load

Status in Linux:
  New
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  When accepting client connections under load, duplicate endpoints may be returned. These endpoints will have different (usually sequential) file descriptors but will refer to the same connection (same server IP, same server port, same client IP, same client port). Both copies of the endpoint appear to be functional.

  Reproduction requires:
  - compilation of the attached server.c program
  - wrk (https://github.com/wg/wrk) to generate load

  The steps to reproduce are:
  - run 'server' application in one console window
  - run 'for i in {1..50}; do /opt/wrk/wrk -t 2 -c 1000 -d 5s --latency --timeout 1s http://localhost:5555/post; done' in a second console window
  - run the same command in a third window to generate concurrent load

  You may need to run additional instance of the wrk command in multiple windows to trigger the issue.

  When the problem occurs the server executable will exit and print some debugging info. e.g.:

  accerror = 1950892, counter = 10683, port = 59892, clientfd = 233, lastClient = 232

  This indicates that the sockets with file descriptors 233 and 232 are duplicates.

  The issue has been reproduced on fully patched versions of Ubuntu 20.04 and 18.04. Other versions have not been tested.
  This issue was originally observed in Java and was reported against the Spring Framework:
  https://github.com/spring-projects/spring-framework/issues/26434

  Investigation from the Spring team and the Apache Tomcat team identified that it appeared to be a JDK issue:
  https://bugs.openjdk.java.net/browse/JDK-8263243

  Further research from the JDK team determined that the issue was at the OS level. Hence this report.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-71-generic 5.4.0-71.79
  ProcVersionSignature: Ubuntu 5.4.0-71.79-generic 5.4.101
  Uname: Linux 5.4.0-71-generic x86_64
  NonfreeKernelModules: nvidia_modeset nvidia
  ApportVersion: 2.20.11-0ubuntu27.16
  Architecture: amd64
  CasperMD5CheckResult: skip
  CurrentDesktop: ubuntu:GNOME
  Date: Thu Apr 15 12:52:53 2021
  HibernationDevice: RESUME=UUID=f5a46e09-d99b-4475-8ab6-2cd70da8418d
  InstallationDate: Installed on 2017-02-02 (1532 days ago)
  InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
  IwConfig:
   lo        no wireless extensions.
   docker0   no wireless extensions.
   eno1      no wireless extensions.
  MachineType: Gigabyte Technology Co., Ltd. Default string
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-71-generic root=/dev/mapper/ubuntu--vg-root ro text
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-71-generic N/A
   linux-backports-modules-5.4.0-71-generic  N/A
   linux-firmware                            1.187.10
  RfKill:
  SourcePackage: linux
  UpgradeStatus: Upgraded to focal on 2020-09-07 (219 days ago)
  dmi.bios.date: 06/13/2016
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: F22
  dmi.board.asset.tag: Default string
  dmi.board.name: X99-SLI-CF
  dmi.board.vendor: Gigabyte Technology Co., Ltd.
  dmi.board.version: x.x
  dmi.chassis.asset.tag: Default string
  dmi.chassis.type: 3
  dmi.chassis.vendor: Default string
  dmi.chassis.version: Default string
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrF22:bd06/13/2016:svnGigabyteTechnologyCo.,Ltd.:pnDefaultstring:pvrDefaultstring:rvnGigabyteTechnologyCo.,Ltd.:rnX99-SLI-CF:rvrx.x:cvnDefaultstring:ct3:cvrDefaultstring:
  dmi.product.family: Default string
  dmi.product.name: Default string
  dmi.product.sku: Default string
  dmi.product.version: Default string
  dmi.sys.vendor: Gigabyte Technology Co., Ltd.

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1924298/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp