Q & A on Blocking vs. Non-blocking Socket Calls
Q: I would like to know the difference between blocking and non-blocking sockets. The read() call is a blocking call: it returns only when data is available at its end. That is, there is a specified timeout value for this call, and it returns either when data becomes available or when the timeout expires.
When I make the socket non-blocking, we specify the timeout value for these calls. So we are safe only if data becomes available within the time limit we specify in the select() call; otherwise we must poll until the data is available.
So is the blocking/non-blocking concept merely a way of specifying a timeout, or does it have some other significance? For instance, I would like an event-driven approach in which the read() call fires automatically whenever data is available.
A: Although the select() call takes a timeout, operations such as read(), recv(), send(), write(), and accept() do not. Thus, the socket API is classified as "synchronous," which means that if a process calls a socket function, the process blocks until the call completes. Making a socket non-blocking makes the API asynchronous. If the call can complete immediately, it does. Otherwise, the call returns immediately with the error code EWOULDBLOCK (reported through errno; some systems use the equivalent code EAGAIN).
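To make the non-blocking behavior concrete, here is a minimal sketch in Python, whose socket module wraps the same calls. The helper name try_recv and the use of a local socket pair are assumptions made for illustration only. Python reports EWOULDBLOCK/EAGAIN by raising BlockingIOError.

```python
import socket

def try_recv(sock, nbytes=4096):
    """Attempt one non-blocking read; return the data, or None if none is ready."""
    sock.setblocking(False)  # mark the socket non-blocking
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # The call could not complete immediately (EWOULDBLOCK/EAGAIN)
        return None

# Demonstration on a connected local pair (hypothetical setup)
a, b = socket.socketpair()
first = try_recv(a)    # nothing has been sent yet, so the call cannot complete
b.sendall(b"hello")
second = try_recv(a)   # data is queued, so the call completes immediately
a.close()
b.close()
```

The first attempt returns at once without data rather than blocking; the second returns the queued bytes.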
As you suggest, a process that uses an asynchronous interface must retry the operation periodically (i.e., poll). Most programmers prefer to call select(), which blocks until a descriptor is ready, instead of polling.
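As a rough sketch of the select() alternative, again in Python: the helper name wait_and_read is hypothetical, and the membership test on the returned list plays the role that FD_ISSET() plays in C.

```python
import select
import socket

def wait_and_read(sock, nbytes=4096):
    """Block in select() until sock is readable, then read once."""
    # No timeout argument: select() sleeps until the descriptor is ready
    readable, _, _ = select.select([sock], [], [])
    if sock in readable:   # counterpart of the FD_ISSET() check
        return sock.recv(nbytes)
    return None

# Demonstration on a connected local pair (hypothetical setup)
a, b = socket.socketpair()
b.sendall(b"ping")
data = wait_and_read(a)
a.close()
b.close()
```

The process sleeps inside select() and consumes no CPU while waiting, which is the practical difference from retrying a non-blocking call in a loop.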
Q: Thanks for the reply, but I still have a doubt. In the select() call we specify the desired sockets and a timeout value, and then check those sockets with the FD_ISSET() macro. Even in this case, the FD_ISSET() check can succeed only if the descriptors become ready before the timeout we specified expires. So if the socket descriptors are not set within that timeout, we have to fall back to polling once again, right?
A: If one places select() in a loop and merely retries the operation when it times out, the result is indeed polling. However, the point of allowing a timeout is not to permit polling, but to allow the program to detect an abnormal condition. For example, suppose two programs need to establish a new TCP connection between them. One end uses a passive open and the other uses an active open. Without a timeout, the passive end waits silently until the connection is established, which may be forever. When select() is used with a timeout, the passive end can take action if the connection fails to arrive (e.g., display an error message for the user).
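The passive-open case might look like the following Python sketch. The helper name accept_with_timeout, the loopback address, and the timeout values are all assumptions chosen for the demonstration. A listening socket is reported as readable once a connection is pending.

```python
import select
import socket

def accept_with_timeout(listener, seconds):
    """Wait up to `seconds` for a connection; return the new socket or None."""
    readable, _, _ = select.select([listener], [], [], seconds)
    if not readable:
        return None        # timed out: the caller can report the failure
    conn, _addr = listener.accept()
    return conn

# Passive open on an ephemeral loopback port (hypothetical setup)
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

timed_out = accept_with_timeout(listener, 0.2)   # no client yet

# Active open from the other end, then wait again
client = socket.create_connection(listener.getsockname())
conn = accept_with_timeout(listener, 2.0)
got_connection = conn is not None

if conn is not None:
    conn.close()
client.close()
listener.close()
```

When the helper returns None, the program knows the connection did not arrive in time and can, for instance, display an error message instead of waiting indefinitely.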
Q: In short, I'm looking for a callback mechanism that notifies us whenever data is available for reading.
A: Few operating systems support real callback mechanisms. Instead, one creates a thread/process that calls select() for the set of descriptors that are being used for input. To perform concurrent processing while waiting, one needs another thread/process that performs the processing. Finally, the threads use an IPC mechanism to communicate (e.g., the thread waiting on select() sends a message to the processing thread).
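One possible shape for this arrangement, sketched in Python with two threads. The function names reader and processor are hypothetical, and an in-process queue.Queue stands in for the IPC mechanism; separate processes would need a pipe or a similar channel instead.

```python
import queue
import select
import socket
import threading

def reader(sock, inbox):
    """Wait in select() and forward each chunk of input to the processing thread."""
    while True:
        readable, _, _ = select.select([sock], [], [])
        if sock in readable:
            data = sock.recv(4096)
            inbox.put(data)    # the "message" sent to the processing thread
            if not data:       # empty read: the peer closed the connection
                return

def processor(inbox, results):
    """Handle each message as it arrives; this plays the role of the callback."""
    while True:
        data = inbox.get()     # sleeps until the reader sends something
        if not data:
            return
        results.append(data.upper())

# Demonstration on a connected local pair (hypothetical setup)
a, b = socket.socketpair()
inbox, results = queue.Queue(), []
threads = [
    threading.Thread(target=reader, args=(a, inbox)),
    threading.Thread(target=processor, args=(inbox, results)),
]
for t in threads:
    t.start()
b.sendall(b"hello")
b.close()                      # end of input lets both threads finish
for t in threads:
    t.join()
a.close()
```

The processing thread never touches the socket; it simply reacts to messages, which gives the event-driven behavior asked about without requiring operating-system callbacks.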
--------------------------------------------------------------------------------