TCP Window Scaling(TCP socket buffer size, TCP window size)
/proc/sys/net/ipv4/tcp_window_scaling #1 is to enable window scaling
/proc/sys/net/ipv4/tcp_rmem - memory reserved for TCP rcv buffers. minimum, initial and maximum buffer size
/proc/sys/net/ipv4/tcp_wmem - memory reserved for TCP send buffers
/proc/sys/net/core/rmem_max - maximum receive window
/proc/sys/net/core/wmem_max - maximum send window
The following values (which are the defaults for 2.6.17 with more than 1 GByte of memory) would be reasonable for all paths with a 4MB BDP or smaller:
echo 1 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf #autotuning enabled. The receiver buffer size (and TCP window size) is dynamically updated (autotuned) for each connection. (Sender side autotuning has been present and unconditionally enabled for many years now).
echo 108544 > /proc/sys/net/core/wmem_max
echo 108544 > /proc/sys/net/core/rmem_max
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 16384 4194304" > /proc/sys/net/ipv4/tcp_wmem
Advanced TCP features
cat /proc/sys/net/ipv4/tcp_timestamps #more is here(allow more accurate RTT measurements for deriving the retransmission timeout estimator; protect against old segments from the previous incarnations of the TCP connection; allow detection of unnecessary retransmissions. But enabling it will also allow you to guess the uptime of a target system.)
Here are some background knowledge:
- The throughput of a communication is limited by two windows: the congestion window and the receive window(TCP congestion window is maintained by the sender, and TCP window size is maintained by the receiver). The former tries not to exceed the capacity of the network (congestion control) and the latter tries not to exceed the capacity of the receiver to process data (flow control). The receiver may be overwhelmed by data if for example it is very busy (such as a Web server). Each TCP segment contains the current value of the receive window. If for example a sender receives an ack which acknowledges byte 4000 and specifies a receive window of 10000 (bytes), the sender will not send packets after byte 14000, even if the congestion window allows it.
- TCP uses what is called the "congestion window", or CWND, to determine how many packets can be sent at one time. The larger the congestion window size, the higher the throughput. The TCP "slow start" and "congestion avoidance" algorithms determine the size of the congestion window. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which can be changed by the program using a system library call just before opening the socket. There is also a kernel enforced maximum buffer size. The buffer size can be adjusted for both the send and receive ends of the socket.
- To get maximal throughput it is critical to use optimal TCP send and receive socket buffer sizes for the link you are using. If the buffers are too small, the TCP congestion window will never fully open up. If the receiver buffers are too large, TCP flow control breaks and the sender can overrun the receiver, which will cause the TCP window to shut down. This is likely to happen if the sending host is faster than the receiving host. Overly large windows on the sending side is not usually a problem as long as you have excess memory; note that every TCP socket has the potential to request this amount of memory even for short connections, making it easy to exhaust system resources.
- More about TCP Buffer Sizing is here.
- More about /proc/sys/net/ipv4/* Variables is here.