What is the formula to determine how much memory a socket consumes under Linux?
I’m doing some capacity planning and I was wondering if there is a formula that I could use to predict (from a memory standpoint) how many TCP connections I could handle on my server. At the moment, I’m only concerned about memory requirements.
Some variables that I think will show up in the formula are:
- sysctl’s net.ipv4.tcp_wmem (min or default value)
- sysctl’s net.ipv4.tcp_rmem (min or default value)
- the size of the sock, sock_common, proto and other per-socket data structures
I’m not sure how much of the tcp_wmem and tcp_rmem are actually allocated and when that memory is allocated. At socket creation time? On demand?
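For reference, here is one way to peek at these values on a running box (the slab cache name and whether /proc/slabinfo is readable depend on your kernel and permissions):
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
grep '^TCP ' /proc/slabinfo    # the objsize column is roughly sizeof(struct tcp_sock)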
tcp_mem is more important because it defines how the TCP stack should behave when it comes to memory usage. IMO the send and receive buffers should be a multiple of tcp_mem. Here is a link to a formula for the receive buffer: http://www.acc.umu.se/~maswan/linux-netperf.txt. In short:
The overhead is: window/2^tcp_adv_win_scale (tcp_adv_win_scale default is 2)
So for the Linux default parameters for the receive window (tcp_rmem):
87380 - (87380 / 2^2) = 65536.
Given a transatlantic link (150 ms RTT), the maximum performance ends up at:
65536/0.150 = 436906 bytes/s or about 400 kbyte/s, which is really slow today.
With the increased default size:
(873800 - 873800/2^2)/0.150 = 4369000 bytes/s, or about 4 Mbytes/s, which
is reasonable for a modern network. And note that this is the default; if
the sender is configured with a larger window size it will happily scale
up to 10 times this (8738000*0.75/0.150 = ~40Mbytes/s), pretty good for
a modern network.
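To make the arithmetic above easy to re-run with your own buffer size and RTT, here is a tiny shell sketch using bc; the numbers match the quoted ones except that bc truncates, so you get 65535 where the text rounds to 65536:
rmem=87380          # tcp_rmem default value in bytes
adv_win_scale=2     # tcp_adv_win_scale default
rtt=0.150           # round-trip time in seconds
window=$(echo "$rmem - $rmem / 2^$adv_win_scale" | bc)
echo "usable window: $window bytes"
echo "max throughput: $(echo "$window / $rtt" | bc) bytes/s"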
Here is what the article says about tcp_mem:
What you remove is an artificial limit to tcp performance, without that limit
you are bounded by the available end-to-end bandwidth and loss. So you might
end up saturating your uplink more effectively, but tcp is good at handling
this.
IMO a bigger middle tcp_mem value speeds up connections at the cost of a smaller safety margin and slightly increased memory usage.
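One thing to keep in mind: tcp_mem is measured in pages, not bytes, and its three values are the low, pressure and high thresholds for the whole TCP stack, so compare it against tcp_rmem/tcp_wmem with the page size in mind:
sysctl net.ipv4.tcp_mem    # low / pressure / high thresholds, in pages
getconf PAGESIZE           # page size in bytes, typically 4096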
You can monitor the network stack with:
grep skbuff /proc/slabinfo
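On reasonably recent kernels you can also get a more direct view of TCP memory than the slab counters give you:
cat /proc/net/sockstat    # the "mem" field on the TCP line is counted in pages
ss -tm                    # per-socket skmem counters showing buffer memory actually allocated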
If you can modify the source code, then use rusage data to measure the RSS and record how many TCP connections are in play at the time of the measurement.
If the source code cannot be changed, then use the RSS of the network app as reported by top or ps, and get the number of network connections at the time of measurement from lsof -i.
Collect this data every minute while your application moves through peak load, and from that data you can come up with a formula that relates number of connections to RAM usage.
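As a sketch (PID and the log file name here are just placeholders), the collection loop can be as simple as:
# sample RSS (KiB) and connection count for process $PID once a minute
while sleep 60; do
    rss=$(ps -o rss= -p "$PID")
    conns=$(lsof -i -a -p "$PID" 2>/dev/null | tail -n +2 | wc -l)
    echo "$(date +%s) $rss $conns" >> conn-mem.log
done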
Of course there are a lot more things that you could measure; in particular, you might want to measure kernel RAM usage, although TCP data structures should be predictable and calculable in advance. In any case, have a look at this question https://serverfault.com/questions/10852/what-limits-the-maximum-number-of-connections-on-a-linux-server for more information on TCP tuning and how to get a clear view of what is happening in the network stack.
David has provided a very good answer to the question as asked; however, unless you’re exclusively using LFNs (long fat networks), even on an event-based server the TCP buffers are likely to be only a small part of the per-connection footprint.
For capacity planning there’s no substitute for testing the server and calculating the regression of memory usage against load.
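As a minimal sketch, assuming the three-column log format from the loop above (timestamp, RSS in KiB, connection count), an ordinary least-squares fit needs nothing more than awk:
awk '{ n++; x=$3; y=$2; sx+=x; sy+=y; sxx+=x*x; sxy+=x*y }
     END { b = (n*sxy - sx*sy) / (n*sxx - sx*sx); a = (sy - b*sx) / n;
           printf "RSS_KiB ~= %.1f + %.3f * connections\n", a, b }' conn-mem.log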