Mec*_*cki 630 unix sockets linux windows portability
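The man pages and programmer documentation for the socket options SO_REUSEADDR and SO_REUSEPORT differ between operating systems and are often highly confusing. Some operating systems don't even have the option SO_REUSEPORT. The web is full of contradictory information on this subject, and often you can find information that is only true for one socket implementation of a specific operating system, which may not even be explicitly mentioned in the text.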
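So how exactly is SO_REUSEADDR different from SO_REUSEPORT?
Are systems without SO_REUSEPORT more limited?
And what exactly is the expected behavior if I use either one on different operating systems?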
Mec*_*cki 1526
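Welcome to the wonderful world of portability... or rather the lack of it. Before we start analyzing these two options in detail and take a deeper look at how different operating systems handle them, it should be noted that the BSD socket implementation is the mother of all socket implementations. Basically all other systems copied the BSD socket implementation (or at least its interfaces) at some point in time and then started evolving it on their own. Of course the BSD socket implementation evolved at the same time as well, so systems that copied it later got features that were lacking in systems that copied it earlier. Understanding the BSD socket implementation is the key to understanding all other socket implementations, so you should read about it even if you never intend to write code for a BSD system.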
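There are a couple of basics you should know before we look at these two options. A TCP/UDP connection is identified by a tuple of five values:
{<protocol>, <src addr>, <src port>, <dest addr>, <dest port>}
Any unique combination of these values identifies a connection. As a result, no two connections can have the same five values, otherwise the system would not be able to distinguish these connections any longer.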
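To make the uniqueness rule concrete, here is a toy model in Python (not kernel code). The names ConnTuple and ConnectionTable are made up for this illustration; the only thing the sketch shows is that a second connection with an identical five-tuple has to be rejected.

```python
from typing import NamedTuple


class ConnTuple(NamedTuple):
    """The five values that identify one connection."""
    protocol: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


class ConnectionTable:
    """Toy stand-in for the system's connection table (hypothetical)."""

    def __init__(self):
        self._active = set()

    def add(self, conn):
        """Register a connection; return False if the same tuple already exists."""
        if conn in self._active:
            return False
        self._active.add(conn)
        return True
```

Changing any single one of the five values, for example the destination port, is enough to make the tuple unique again.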
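The protocol of a socket is set when the socket is created with the socket() function. The source address and port are set with the bind() function. The destination address and port are set with the connect() function. Since UDP is a connectionless protocol, UDP sockets can be used without connecting them. Yet it is allowed to connect them, and in some cases doing so is very advantageous for your code and general application design. In connectionless mode, UDP sockets that were not explicitly bound when data is sent over them for the first time are usually bound automatically by the system, as an unbound UDP socket cannot receive any (reply) data. The same is true for an unbound TCP socket: it is automatically bound before it is connected.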
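If you explicitly bind a socket, it is possible to bind it to port 0, which means "any port". Since a socket cannot really be bound to all existing ports, the system has to choose a specific port itself in that case (usually from a predefined, OS-specific range of source ports). A similar wildcard exists for the source address, which can be "any address" (0.0.0.0 in case of IPv4 and :: in case of IPv6). Unlike in case of ports, a socket can really be bound to "any address", which means "all source IP addresses of all local interfaces". If the socket is connected later on, the system has to choose a specific source IP address, since a socket cannot be connected and at the same time be bound to any local IP address. Depending on the destination address and the content of the routing table, the system will pick an appropriate source address and replace the "any" binding with a binding to the chosen source IP address.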
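Both wildcards can be watched in action with a small Python sketch. The function name is hypothetical and the loopback destination (port 9) is an arbitrary choice. I use a UDP socket so that connect() does not need a listening peer; the behavior shown is what I would expect on Linux and BSD-style systems.

```python
import socket


def wildcard_then_connect():
    """Bind to "any address, any port", then connect and compare the local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", 0))        # wildcard address, wildcard port
        bound = sock.getsockname()       # address still wildcard, port now specific
        sock.connect(("127.0.0.1", 9))   # system has to pick a concrete source address
        connected = sock.getsockname()
        return bound, connected
```

After the bind, the port is already a concrete one, but the address stays 0.0.0.0 until connect() forces the system to choose a source address for that destination.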
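By default, no two sockets can be bound to the same combination of source address and source port. As long as the source port is different, the source address is actually irrelevant. Binding socketA to A:X and socketB to B:Y, where A and B are addresses and X and Y are ports, is always possible as long as X != Y holds true. However, even if X == Y, the binding is still possible as long as A != B holds true. E.g. if socketA belongs to an FTP server program and is bound to 192.168.0.1:21 and socketB belongs to another FTP server program and is bound to 10.0.0.1:21, both bindings will succeed. Keep in mind, though, that a socket may be locally bound to "any address". If a socket is bound to 0.0.0.0:21, it is bound to all existing local addresses at the same time, and in that case no other socket can be bound to port 21, regardless of which specific IP address it tries to bind to, as 0.0.0.0 conflicts with all existing local IP addresses.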
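Everything said so far is pretty much the same for all major operating systems. Things start to get OS-specific when address reuse comes into play. We start with BSD since, as said above, it is the mother of all socket implementations.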
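If SO_REUSEADDR is enabled on a socket prior to binding it, the socket can be bound successfully unless there is a conflict with another socket bound to exactly the same combination of source address and port. Now you may wonder how that is any different from before. The keyword is "exactly". SO_REUSEADDR mainly changes the way wildcard addresses ("any IP address") are treated when searching for conflicts.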
Without SO_REUSEADDR, binding socketA to 0.0.0.0:21 and then binding socketB to 192.168.0.1:21 will fail (with error EADDRINUSE), since 0.0.0.0 means "any local IP address", thus all local IP addresses are considered in use by this socket and this includes 192.168.0.1, too. With SO_REUSEADDR it will succeed, since 0.0.0.0 and 192.168.0.1 are not exactly the same address: one is a wildcard for all local addresses and the other one is a very specific local address. Note that the statement above is true regardless of the order in which socketA and socketB are bound; without SO_REUSEADDR it will always fail, with SO_REUSEADDR it will always succeed.
To give you a better overview, let's make a table here and list all possible combinations:
SO_REUSEADDR       socketA          socketB        Result
---------------------------------------------------------------------
  ON/OFF       192.168.0.1:21   192.168.0.1:21    Error (EADDRINUSE)
  ON/OFF       192.168.0.1:21      10.0.0.1:21    OK
  ON/OFF          10.0.0.1:21   192.168.0.1:21    OK
   OFF             0.0.0.0:21   192.168.1.0:21    Error (EADDRINUSE)
   OFF         192.168.1.0:21       0.0.0.0:21    Error (EADDRINUSE)
   ON              0.0.0.0:21   192.168.1.0:21    OK
   ON          192.168.1.0:21       0.0.0.0:21    OK
  ON/OFF           0.0.0.0:21       0.0.0.0:21    Error (EADDRINUSE)
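The table above assumes that socketA has already been successfully bound to the address given for socketA, then socketB is created, either gets SO_REUSEADDR set or not, and finally is bound to the address given for socketB. Result is the result of the bind operation for socketB. If the first column says ON/OFF, the value of SO_REUSEADDR is irrelevant to the result.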
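The table boils down to three rules, which can be written as a tiny Python function. This is only a model of the BSD behavior described here, not code that touches real sockets; the function name is made up and both sockets are assumed to use the same port.

```python
WILDCARD = "0.0.0.0"


def bsd_bind_succeeds(reuseaddr, addr_a, addr_b):
    """Can socketB bind to addr_b while socketA already holds addr_a (same port)?

    reuseaddr is the SO_REUSEADDR setting of socketB, the socket being bound.
    """
    if addr_a == addr_b:
        return False          # exactly the same address always conflicts
    if WILDCARD in (addr_a, addr_b):
        return reuseaddr      # wildcard vs. specific address needs SO_REUSEADDR
    return True               # two different specific addresses never conflict
```

For example, bsd_bind_succeeds(True, "0.0.0.0", "192.168.1.0") corresponds to the sixth row of the table.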
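Okay, SO_REUSEADDR has an effect on wildcard addresses, good to know. Yet that isn't its only effect. There is another well-known effect, which is also the reason why most people use SO_REUSEADDR in server programs in the first place. For the other important use of this option we have to take a deeper look at how the TCP protocol works.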
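A socket has a send buffer, and if a call to the send() function succeeds, it does not mean that the requested data has actually been sent out; it only means the data has been added to the send buffer. For UDP sockets, the data is usually sent pretty soon, if not immediately, but for TCP sockets there can be a relatively long delay between adding data to the send buffer and having the TCP implementation really send that data. As a result, when you close a TCP socket, there may still be pending data in the send buffer that has not been sent yet, but your code considers it as sent since the send() call succeeded. If the TCP implementation closed the socket immediately on your request, all of this data would be lost and your code wouldn't even know about it. TCP is said to be a reliable protocol, and losing data just like that is not very reliable. That's why a socket that still has data to send will go into a state called TIME_WAIT when you close it. In that state it will wait until all pending data has been successfully sent or until a timeout is hit, in which case the socket is closed forcefully.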
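The amount of time the kernel will wait before it closes the socket, regardless of whether it still has pending send data or not, is called the Linger Time. The Linger Time is globally configurable on most systems and by default rather long (two minutes is a common value you will find on many systems). It is also configurable per socket using the socket option SO_LINGER, which can be used to make the timeout shorter or longer, and even to disable it completely. Disabling it completely is a very bad idea, though, since closing a TCP socket gracefully is a slightly complex process and involves sending a couple of packets back and forth (as well as resending those packets in case they got lost), and this whole close process is also limited by the Linger Time. If you disable lingering, your socket may not only lose pending data, it is also always closed forcefully instead of gracefully, which is usually not recommended. The details of how a TCP connection is closed gracefully are beyond the scope of this answer; if you want to learn more about it, I recommend you have a look at this page. And even if you disabled lingering with SO_LINGER, if your process dies without explicitly closing the socket, BSD (and possibly other systems) will linger nonetheless, ignoring what you have configured. This will happen, for example, if your code just calls exit() (pretty common for tiny, simple server programs) or if the process is killed by a signal (which includes the possibility that it simply crashes because of an illegal memory access). So there is nothing you can do to make sure a socket will never linger under all circumstances.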
The question is, how does the system treat a socket in state TIME_WAIT? If SO_REUSEADDR is not set, a socket in state TIME_WAIT is considered to still be bound to the source address and port, and any attempt to bind a new socket to the same address and port will fail until the socket has really been closed, which may take as long as the configured Linger Time. So don't expect that you can rebind the source address of a socket immediately after closing it. In most cases this will fail. However, if SO_REUSEADDR is set for the socket you are trying to bind, another socket bound to the same address and port in state TIME_WAIT is simply ignored, after all it's already "half dead", and your socket can bind to exactly the same address without any problem. In that case it plays no role that the other socket may have exactly the same address and port. Note that binding a socket to exactly the same address and port as a dying socket in TIME_WAIT state can have unexpected, and usually undesired, side effects in case the other socket is still "at work", but that is beyond the scope of this answer and fortunately those side effects are rather rare in practice.
There is one final thing you should know about SO_REUSEADDR. Everything written above will work as long as the socket you want to bind has address reuse enabled. It is not necessary that the other socket, the one which is already bound or is in a TIME_WAIT state, also had this flag set when it was bound. The code that decides if the bind will succeed or fail only inspects the SO_REUSEADDR flag of the socket fed into the bind() call; for all other sockets inspected, this flag is not even looked at.
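In practice this is why server programs set the option right after creating the socket. A minimal Python sketch of that setup follows; the function name and the defaults (loopback address, port 0) are chosen just for the example. The one thing that matters is that setsockopt() is called before bind().

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket that may rebind past sockets in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(); only this socket's flag is inspected.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock
```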
SO_REUSEPORT is what most people would expect SO_REUSEADDR to be. Basically, SO_REUSEPORT allows you to bind an arbitrary number of sockets to exactly the same source address and port as long as all prior bound sockets also had SO_REUSEPORT set before they were bound. If the first socket that is bound to an address and port does not have SO_REUSEPORT set, no other socket can be bound to exactly the same address and port, regardless of whether this other socket has SO_REUSEPORT set or not, until the first socket releases its binding again. Unlike in case of SO_REUSEADDR, the code handling SO_REUSEPORT will not only verify that the socket currently being bound has SO_REUSEPORT set, it will also verify that the socket with a conflicting address and port had SO_REUSEPORT set when it was bound.
SO_REUSEPORT does not imply SO_REUSEADDR. This means if a socket did not have SO_REUSEPORT set when it was bound and another socket has SO_REUSEPORT set when it is bound to exactly the same address and port, the bind fails, which is expected, but it also fails if the other socket is already dying and is in TIME_WAIT state. To be able to bind a socket to the same address and port as another socket in TIME_WAIT state requires either SO_REUSEADDR to be set on that socket or SO_REUSEPORT must have been set on both sockets prior to binding them. Of course it is allowed to set both, SO_REUSEPORT and SO_REUSEADDR, on a socket.
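A short Python sketch of two sockets sharing one address and port via SO_REUSEPORT. The function name is made up, UDP on the loopback address is an arbitrary choice, and the sketch assumes a platform where Python exposes socket.SO_REUSEPORT (it does not on Windows).

```python
import socket


def bind_two_with_reuseport(host="127.0.0.1"):
    """Bind two UDP sockets to exactly the same address and port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Both sockets need the flag, and both need it before bind().
        for sock in (first, second):
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        first.bind((host, 0))              # let the system pick a port
        second.bind(first.getsockname())   # same address, same port
    except OSError:
        first.close()
        second.close()
        raise
    return first, second
```

If you remove the setsockopt() call for the first socket only, the second bind() should fail with EADDRINUSE according to the rules above.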
There is not much more to say about SO_REUSEPORT other than that it was added later than SO_REUSEADDR, which is why you will not find it in many socket implementations of other systems that "forked" the BSD code before this option was added, and that there was no way to bind two sockets to exactly the same socket address in BSD prior to this option.
Most people know that bind() may fail with the error EADDRINUSE. However, when you start playing around with address reuse, you may run into the strange situation that connect() fails with that error as well. How can this be? How can a remote address, after all that's what connect adds to a socket, be already in use? Connecting multiple sockets to exactly the same remote address has never been a problem before, so what's going wrong here?
As I said on the very top of my reply, a connection is defined by a tuple of five values, remember? And I also said, that these five values must be unique otherwise the system cannot distinguish two connections any longer, right? Well, with address reuse, you can bind two sockets of the same protocol to the same source address and port. That means three of those five values are already the same for these two sockets. If you now try to connect both of these sockets also to the same destination address and port, you would create two connected sockets, whose tuples are absolutely identical. This cannot work, at least not for TCP connections (UDP connections are no real connections anyway). If data arrived for either one of the two connections, the system could not tell which connection the data belongs to. At least the destination address or destination port must be different for either connection, so that the system has no problem to identify to which connection incoming data belongs to.
So if you bind two sockets of the same protocol to the same source address and port and try to connect them both to the same destination address and port, connect() will actually fail with the error EADDRINUSE for the second socket you try to connect, which means that a socket with an identical tuple of five values is already connected.
Most people ignore the fact that multicast addresses exist, but they do exist. While unicast addresses are used for one-to-one communication, multicast addresses are used for one-to-many communication. Most people got aware of multicast addresses when they learned about IPv6 but multicast addresses also existed in IPv4, even though this feature was never widely used on the public Internet.
The meaning of SO_REUSEADDR changes for multicast addresses, as it allows multiple sockets to be bound to exactly the same combination of source multicast address and port. In other words, for multicast addresses SO_REUSEADDR behaves exactly as SO_REUSEPORT does for unicast addresses. Actually, the code treats SO_REUSEADDR and SO_REUSEPORT identically for multicast addresses, which means you could say that SO_REUSEADDR implies SO_REUSEPORT for all multicast addresses and the other way round.
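For illustration, this is roughly how a multicast receiver that wants to share its group address and port with other receivers could be set up in Python. The function names are hypothetical and the sketch is untested across systems: binding directly to the group address works on BSD-style systems and Linux as far as I know, but not on Windows, and joining the group needs a multicast-capable interface.

```python
import socket
import struct


def make_membership_request(group, interface="0.0.0.0"):
    """Pack an IPv4 ip_mreq: group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_multicast_receiver(group, port):
    """Bind a UDP socket to a multicast group so that other receivers can do the same."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # For multicast addresses this acts like SO_REUSEPORT does for unicast ones.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((group, port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_membership_request(group))
    except OSError:
        sock.close()
        raise
    return sock
```

Every process that calls open_multicast_receiver() with the same group and port should then get its own copy of each datagram sent to that group.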
FreeBSD, OpenBSD, and NetBSD are all rather late forks of the original BSD code, which is why all three offer the same options as BSD and also behave the same way as BSD.
At its core, macOS is simply a BSD-style UNIX named "Darwin", based on a rather late fork of the BSD code (BSD 4.3), which was then later on even re-synchronized with the (at that time current) FreeBSD 5 code base for the Mac OS 10.3 release, so that Apple could gain full POSIX compliance (macOS is POSIX certified). Despite having a microkernel at its core ("Mach"), the rest of the kernel ("XNU") is basically just a BSD kernel, and that's why macOS offers the same options as BSD and they also behave the same way as in BSD.
iOS is just a macOS fork with a slightly modified and trimmed kernel, somewhat stripped down user space toolset and a slightly different default framework set. watchOS and tvOS are iOS forks, that are stripped down even further (especially watchOS). To my best knowledge they all behave exactly as macOS does.
Prior to Linux 3.9, only the option SO_REUSEADDR existed. This option behaves generally the same as in BSD, with two important exceptions:
As long as a listening (server) TCP socket is bound to a specific port, the SO_REUSEADDR option is entirely ignored for all sockets targeting that port. Binding a second socket to the same port is only possible if it was also possible in BSD without having SO_REUSEADDR set. E.g. you cannot bind to a wildcard address and then to a more specific one or the other way round; both are possible in BSD if you set SO_REUSEADDR. What you can do is bind to the same port and two different non-wildcard addresses, as that's always allowed. In this aspect Linux is more restrictive than BSD.
The second exception is that for client sockets, this option behaves exactly like SO_REUSEPORT in BSD, as long as both had this flag set before they were bound. The reason for allowing that was simply that it is important to be able to bind multiple sockets to exactly the same UDP socket address for various protocols, and as there used to be no SO_REUSEPORT prior to 3.9, the behavior of SO_REUSEADDR was altered accordingly to fill that gap. In that aspect Linux is less restrictive than BSD.
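The first exception is easy to probe on your own system. The following Python sketch (the function name is made up) binds a listening TCP socket to the wildcard address and then tries to bind a second socket to the loopback address on the same port, with SO_REUSEADDR set on both. Going by the description above, I would expect it to return False on Linux and True on BSD, but treat that as something to verify rather than a guarantee.

```python
import errno
import socket


def specific_bind_allowed_next_to_wildcard_listener():
    """Return True if 127.0.0.1:port can be bound while 0.0.0.0:port is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener, \
            socket.socket(socket.AF_INET, socket.SOCK_STREAM) as second:
        for sock in (listener, second):
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("0.0.0.0", 0))
        listener.listen()
        port = listener.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise
        return True
```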
Linux 3.9 added the option SO_REUSEPORT to Linux as well. This option behaves exactly like the option in BSD and allows binding to exactly the same address and port number as long as all sockets have this option set prior to binding them.
Yet, there are still two differences to SO_REUSEPORT on other systems:
To prevent "port hijacking", there is one special limitation: All sockets that want to share the same address and port combination must belong to processes that share the same effective user ID! So one user cannot "steal" ports of another user. This is some special magic to somewhat compensate for the missing SO_EXCLBIND/SO_EXCLUSIVEADDRUSE flags.
Additionally the kernel performs some "special magic" for SO_REUSEPORT sockets that isn't found in other operating systems: For UDP sockets, it tries to distribute datagrams evenly; for TCP listening sockets, it tries to distribute incoming connect requests (those accepted by calling accept()) evenly across all the sockets that share the same address and port combination. Thus an application can easily open the same port in multiple child processes and then use SO_REUSEPORT to get very inexpensive load balancing.
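To see how datagrams are spread over sockets sharing a port, you could run something like the following Python sketch. The function name and the numbers are arbitrary, everything runs in one process over the loopback interface for simplicity, and it assumes socket.SO_REUSEPORT is available. How the datagrams are split depends on the kernel, so the sketch only reports the per-socket counts and makes no claim about the distribution.

```python
import socket


def count_datagrams_per_worker(workers=2, datagrams=20):
    """Send datagrams to a shared UDP port and count how many each socket received."""
    receivers = []
    try:
        for _ in range(workers):
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            receivers.append(sock)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            # The first socket picks a port, all others bind to the very same address.
            address = receivers[0].getsockname() if len(receivers) > 1 else ("127.0.0.1", 0)
            sock.bind(address)
            sock.settimeout(0.2)
        target = receivers[0].getsockname()
        for _ in range(datagrams):
            # A fresh sender per datagram, so the source port varies.
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
                sender.sendto(b"x", target)
        counts = []
        for sock in receivers:
            received = 0
            try:
                while True:
                    sock.recv(16)
                    received += 1
            except socket.timeout:
                pass  # nothing left in this socket's queue
            counts.append(received)
        return counts
    finally:
        for sock in receivers:
            sock.close()
```

In a real server each worker would be a separate process or thread that opens its own socket on the shared port.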
Even though the whole Android system is somewhat different from most Linux distributions, at its core works a slightly modified Linux kernel, thus everything that applies to Linux should apply to Android as well.
Windows only knows the SO_REUSEADDR option, there is no SO_REUSEPORT. Setting SO_REUSEADDR on a socket in Windows behaves like setting SO_REUSEPORT and SO_REUSEADDR on a socket in BSD, with one exception: A socket with SO_REUSEADDR can always bind to exactly the same source address and port as an already bound socket, even if the other socket did not have this option set when it was bound. This behavior is somewhat dangerous because it allows an application "to steal" the connected port of another application. Needless to say, this can have major security implications. Microsoft realized that this might be a problem and thus added another socket option, SO_EXCLUSIVEADDRUSE. Setting SO_EXCLUSIVEADDRUSE on a socket makes sure that if the binding succeeds, the combination of source address and port is owned exclusively by this socket and no other socket can bind to it, not even if it has SO_REUSEADDR set.
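For portable server code, one possible approach is to pick the option per platform. The Python sketch below (function name and defaults are made up) relies on the fact that Python's socket module only defines SO_EXCLUSIVEADDRUSE on Windows; elsewhere it falls back to SO_REUSEADDR. Whether that trade-off suits your program is a design decision, not a rule.

```python
import socket


def make_portable_listener(host="127.0.0.1", port=0):
    """Listening TCP socket: exclusive bind on Windows, SO_REUSEADDR elsewhere."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if hasattr(socket, "SO_EXCLUSIVEADDRUSE"):
            # Windows: keep other sockets from binding to our address and port.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
        else:
            # BSD-style systems: allow rebinding past sockets in TIME_WAIT.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock
```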
For even more details on how the flags SO_REUSEADDR and SO_EXCLUSIVEADDRUSE work on Windows and how they influence binding/re-binding, Microsoft kindly provided a table similar to my table near the top of this reply. Just visit this page and scroll down a bit. Actually there are three tables: the first one shows the old behavior (prior to Windows 2003), the second one shows the behavior in Windows 2003 and up, and the third one shows how the behavior changes in Windows 2003 and later if the bind() calls are made by different users.
Solaris is the successor of SunOS. SunOS was originally based on a fork of BSD; SunOS 5 and later was based on a fork of SVR4. However, SVR4 is a merge of BSD, System V, and Xenix, so to some degree Solaris is also a BSD fork, and a rather early one. As a result Solaris only knows SO_REUSEADDR, there is no SO_REUSEPORT. SO_REUSEADDR behaves pretty much the same as it does in BSD. As far as I know there is no way to get the same behavior as SO_REUSEPORT in Solaris, which means it is not possible to bind two sockets to exactly the same address and port.
Similar to Windows, Solaris has an option to give a socket an exclusive binding. This option is named SO_EXCLBIND. If this option is set on a socket prior to binding it, setting SO_REUSEADDR on another socket has no effect if the two sockets are tested for an address conflict. E.g. if socketA is bound to a wildcard address and socketB has SO_REUSEADDR enabled and is bound to a non-wildcard address and the same port as socketA, this bind will normally succeed, unless socketA had SO_EXCLBIND enabled, in which case it will fail regardless of the SO_REUSEADDR flag of socketB.
In case your system is not listed above, I wrote a little test program that you can use to find out how your system handles these two options. Also if you think my results are wrong, please first run that program before posting any comments and possibly making false claims.
All that the code requires to build is a bit of POSIX API (for the network parts) and a C99 compiler (actually most non-C99 compilers will work as well, as long as they offer inttypes.h and stdbool.h; e.g. gcc supported both long before offering full C99 support).
All that the program needs to run is that at least one interface in your system (other than the local interface) has an IP address assigned and that a default route is set which uses that interface. The program will gather that IP address and use it as the second "specific address".
It tests all possible combinations you can think of:
SO_REUSEADDR
set on socket1,
小智 14
Mecki's answer is absolutely perfect, but it's worth adding that FreeBSD also supports SO_REUSEPORT_LB, which mimics Linux's SO_REUSEPORT behavior: it balances the load; see setsockopt(2).