http: Accept error: accept tcp [::]:9000: accept4: too many open files; retrying in 1s

The PID of the process is 1996291.

There are 65534 fds in /proc/1996291/fd, and most of them are sockets, like this:

lrwx------ 1 root root 64 Dec 30 13:59 10000 -> socket:[952574733]
lrwx------ 1 root root 64 Dec 30 13:59 10001 -> socket:[952566188]

I know that the number in brackets is the inode of the socket. There should be a matching inode in /proc/net/tcp for every socket. However, some inodes can be found there and some can't:

cat /proc/net/tcp | grep 952574733

When the inode is found, the output looks like this:

  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
 336: 4114C80A:271A 1914C80A:0CEA 01 00000000:00000000 02:0000BE1B 00000000     0        0 962759319 2 ffff88035a20cb00 20 4 30 10 16

This is a real connection.
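
To make the /proc/net/tcp line above easier to read, here is a minimal Go sketch that decodes the hex address and state fields. The helper names decodeAddr and stateName are hypothetical, the state table is deliberately partial, and the byte reversal assumes a little-endian machine such as x86.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// decodeAddr (hypothetical helper) converts an IPv4 "ADDR:PORT" field from
// /proc/net/tcp into "a.b.c.d:port". It assumes a little-endian host, where
// the four address bytes appear in reverse order.
func decodeAddr(field string) (string, error) {
	parts := strings.Split(field, ":")
	if len(parts) != 2 {
		return "", fmt.Errorf("malformed address field %q", field)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil || len(raw) != 4 {
		return "", fmt.Errorf("malformed IPv4 address %q", parts[0])
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return "", fmt.Errorf("malformed port %q", parts[1])
	}
	ip := net.IPv4(raw[3], raw[2], raw[1], raw[0])
	return fmt.Sprintf("%s:%d", ip, port), nil
}

// stateName (hypothetical helper) maps a few values of the "st" column to
// their TCP state names. The table is partial on purpose.
func stateName(st string) string {
	switch strings.ToUpper(st) {
	case "01":
		return "ESTABLISHED"
	case "06":
		return "TIME_WAIT"
	case "08":
		return "CLOSE_WAIT"
	case "0A":
		return "LISTEN"
	}
	return "other (" + st + ")"
}

func main() {
	// Fields copied from the sample line in the question.
	local, _ := decodeAddr("4114C80A:271A")
	remote, _ := decodeAddr("1914C80A:0CEA")
	fmt.Println(local, "->", remote, stateName("01"))
	// prints: 10.200.20.65:10010 -> 10.200.20.25:3306 ESTABLISHED
}
```

State 01 is ESTABLISHED, which matches the description of this entry as a real connection.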

I ran netstat -tnp to list the connections and got a great many in TIME_WAIT. I don't know whether they are related to my problem.

I also ran lsof -p 1996291. The output looks like this, with a great many sockets:

app    1996291 root *520u     sock       0,8      0t0 953021420 protocol: TCP
app    1996291 root *521u     sock       0,8      0t0 953027193 protocol: TCP
app    1996291 root *522u     sock       0,8      0t0 953021422 protocol: TCP
app    1996291 root *523u     sock       0,8      0t0 953038715 protocol: TCP

These three kernel options have been set to 1:

net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_recycle
net.ipv4.tcp_syncookies

I haven't been able to solve this problem for several days. Can anyone help me?

Every socket on your machine uses a file descriptor. When the process holds too many open connections, it reaches its open-file limit and accept starts failing with "too many open files", as in your log.

You can try to prevent this by limiting the number of connections that are open at the same time, or by making sure the fds are released, which means closing the body of every response you get back. Recycling sockets quickly may also help.

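Another hacky approach would be to raise the open-file limit in the shell that starts the process (the new limit applies only to that shell and the processes launched from it):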

ulimit -n [new limit]