Fix libuv adapter to handle polling errors
Currently, when an async connection is attached to a *libuv* event loop using the libuv adapter, a libuv poll handle is initialized using the connection's file descriptor.
All subsequent async reads / writes are done by polling the file descriptor associated with the poll handle (using *uv_poll_start()* and *uv_poll_stop()*).
When an event is available, the **redisLibuvPoll()** function is invoked as the callback with **status** set to 0 and the detected event on the **events** parameter.
In the event of a polling error, the **redisLibuvPoll()** callback function is invoked with **status < 0** (exact error status depends on the reason the polling failed). Currently, this function returns if an error status is encountered without taking any action.
This can be problematic in various error scenarios, two of which are detailed below -
The steps involved in creating an async connection are -
1. Call *redisAsyncConnect()*
2. Call *redisLibuvAttach()* to attach the async connection to the event loop
3. Set connect / disconnect callbacks for the async connection using *redisAsyncSetConnectCallback()* and *redisAsyncSetDisconnectCallback()*
4. Run the event loop to check if the connection was successful or not
In step 4, if the connection was successful, the callback set in *redisAsyncSetConnectCallback()* is called.
**However** if the connection is not successful, an error is generated in polling the file descriptor and no action is taken.
If the Redis server crashes, the connection breaks resulting in a polling error. Now subsequent Redis commands issued using the <b>redisAsyncCommand()</b> family of functions would internally follow the steps below -
1. Call *__redisAsyncCommand()* helper function
2. Call
```
EL_ADD_WRITE(ac);
```
which expands to calling *redisLibuvAddWrite()*
3. Call *redisLibuvPoll()* which returns without taking any action
As a result, their callbacks would never be invoked and writes will get queued up indefinitely on this broken connection **unless** the connection is disconnected by a call to some other function in hiredis.c (which you really can't rely on to happen in all use cases)
To address the above scenarios and similar ones, if a polling error is encountered, we mark the *err* and *errstr* fields with appropriate error flag and error message respectively.
We also invoke *redisAsyncHandleRead()* so that
1. If a connection is not successful, *__redisAsyncHandleConnect()* can be called
2. If any other error occurs on a valid connection, (like scenario 2 above) *redisBufferRead()* would return REDIS_ERR and *__redisAsyncDisconnect()* will be called.
Note that either *redisAsyncHandleRead()* or *redisAsyncHandleWrite()* could be called when an error was encountered, but I went with *redisAsyncHandleRead()* as it seemed to be cleaner (no *done* variable)
Example Gist Link - https://gist.github.com/rangan337/11a5e893ec41977b901b
This is a simple example that assumes that a Redis server is not running on the PORT it is querying.
* **Prior to the PR,** no connection errors are reported.
* **After the PR,** the error that the client failed to connect to the server is reported.
Example Gist Link (With Sample Output) - https://gist.github.com/rangan337/77da6f7069d0fb29d443
This example basically :
1. Creates an async connection to a Redis server and attaches it to an event loop (running on its own thread)
2. Following that, the server is populated with some data.
3. At this point, on a parallel thread a **DEBUG SEGFAULT** is sent to crash the server.
4. While 3 is going on, in parallel requests are sent over the async connection by a set of 20 or so threads. Note that -
* While 20 *run command* events are fired on the event loop, as there is no guarantee that each *fire* will trigger the event handler, its quite possible that < 20 commands are actually sent to the server
* As this is in parallel with the segfault, some commands will be executed before and some after the segfault
5. Sleep the main thread for 2 seconds to allow all the other threads to do their job
6. Print the number of pending queued callbacks on the async connection.
* **Prior to the PR,** this value would be >=0.
* **After the PR,** this value will always be 0 (as when the connection is disconnected all pending callbacks are executed with a NULL reply)
7. Send another wave of parallel requests over the connection to the server.
* **Prior to the PR,** this will silently queue up the callbacks on the broken connection.
* Note that nothing is done in the **after PR** case as when the connection is disconnected in the step above, it is freed and set to NULL.
8. Lastly, print out the queued callback count (this is just to validate that they get queued up prior to the PR).
Note that for this example, the gist also has the sample output of a run attached.
Introduction
Currently, when an async connection is attached to a libuv event loop using the libuv adapter, a libuv poll handle is initialized using the connection's file descriptor.
All subsequent async reads / writes are done by polling the file descriptor associated with the poll handle (using uv_poll_start() and uv_poll_stop()).
When an event is available, the redisLibuvPoll() function is invoked as the callback with status set to 0 and the detected event on the events parameter.
Issue
In the event of a polling error, the redisLibuvPoll() callback function is invoked with status < 0 (exact error status depends on the reason the polling failed). Currently, this function returns if an error status is encountered without taking any action.
This can be problematic in various error scenarios, two of which are detailed below -
Scenario 1 - When a connection fails
The steps involved in creating an async connection are -
1. Call redisAsyncConnect()
2. Call redisLibuvAttach() to attach the async connection to the event loop
3. Set connect / disconnect callbacks for the async connection using redisAsyncSetConnectCallback() and redisAsyncSetDisconnectCallback()
4. Run the event loop to check if the connection was successful or not
In step 4, if the connection was successful, the callback set in redisAsyncSetConnectCallback() is called.
However if the connection is not successful, an error is generated in polling the file descriptor and no action is taken.
Scenario 2 - Server crashes and the connection is broken
If the Redis server crashes, the connection breaks resulting in a polling error. Now subsequent Redis commands issued using the redisAsyncCommand() family of functions would internally follow the steps below -
1. Call __redisAsyncCommand() helper function
2. Call
EL_ADD_WRITE(ac);
which expands to calling redisLibuvAddWrite()
3. Call redisLibuvPoll() which returns without taking any action
As a result, their callbacks would never be invoked and writes will get queued up indefinitely on this broken connection unless the connection is disconnected by a call to some other function in hiredis.c (which you really can't rely on to happen in all use cases)
Fix / Resolution
To address the above scenarios and similar ones, if a polling error is encountered, we mark the err and errstr fields with appropriate error flag and error message respectively.
We also invoke redisAsyncHandleRead() so that
1. If a connection is not successful, __redisAsyncHandleConnect() can be called
2. If any other error occurs on a valid connection, (like scenario 2 above) redisBufferRead() would return REDIS_ERR and __redisAsyncDisconnect() will be called.
Note that either redisAsyncHandleRead() or redisAsyncHandleWrite() could be called when an error was encountered, but I went with redisAsyncHandleRead() as it seemed to be cleaner (no done variable)
Examples
Scenario 1 - When a connection fails
Example Gist Link - https://gist.github.com/rangan337/11a5e893ec41977b901b
This is a simple example that assumes that a Redis server is not running on the PORT it is querying.
Prior to the PR, no connection errors are reported.
After the PR, the error that the client failed to connect to the server is reported.
Scenario 2 - Server crashes and the connection is broken
Example Gist Link (With Sample Output) - https://gist.github.com/rangan337/77da6f7069d0fb29d443
This example basically :
Creates an async connection to a Redis server and attaches it to an event loop (running on its own thread)
Following that, the server is populated with some data.
At this point, on a parallel thread a DEBUG SEGFAULT is sent to crash the server.
While 3 is going on, in parallel requests are sent over the async connection by a set of 20 or so threads. Note that -
While 20 run command events are fired on the event loop, as there is no guarantee that each fire will trigger the event handler, its quite possible that < 20 commands are actually sent to the server
As this is in parallel with the segfault, some commands will be executed before and some after the segfault
Sleep the main thread for 2 seconds to allow all the other threads to do their job
Print the number of pending queued callbacks on the async connection.
Prior to the PR, this value would be >=0.
After the PR, this value will always be 0 (as when the connection is disconnected all pending callbacks are executed with a NULL reply)
Send another wave of parallel requests over the connection to the server.
Prior to the PR, this will silently queue up the callbacks on the broken connection.
Note that nothing is done in the after PR case as when the connection is disconnected in the step above, it is freed and set to NULL.
Lastly, print out the queued callback count (this is just to validate that they get queued up prior to the PR).
Note that for this example, the gist also has the sample output of a run attached.
This reverts commit 1db17f257b.
If the `REDIS_CONNECTED` flag is cleared,
the async onDisconnect callback function will never be called.
This causes problems as the disconnect is never reported back to the user.
Closes#359
This adds a new adapter and an example for using hiredis with the ivykis
async I/O library.
Signed-off-by: Gergely Nagy <algernon@madhouse-project.org>
Not all code using hiredis can compile using '-std=c99', and/or not all users are able to easily make that change to the build process of various open-source projects, so it is more pragmatic to choose a different identifier that does not impose this requirement.
When an asynchronous hiredis connection subscribes to a Pub/Sub channel
and gets an error, and in other related conditions, the function
redisProcessCallbacks() enters a code path where the link is
disconnected, however the function returns before freeing the allocated
reply object. This causes a memory leak. The memory leak was trivial to
trigger in Redis Sentinel, which uses hiredis, every time we tried to
subscribe to an instance that required a password, in case the Sentinel
was configured either with the wrong password or without password at
all. In this case, the -AUTH error caused the leaking code path to be
executed.
Originally implemented by @abedra as part of #306.
In case a write or read times out, we force an error state, because we
can't guarantuee that the next read will get the right data.
Instead we need to reconnect to have a clean-state connection, which is
now easily possible with this method.
Attempting to use the install target before the make target works fine,
except for the missing pkgconfig file. Adding that file to the
dependencies for the install target to make sure it gets created first.
Due to the various processors going over the command, we need more
escaping.
1) Make parses it, so $${libdir} becomes ${libdir}
2) 'shell' parses it for the 'echo command', whereas echo ${libdir}
would be an empty string; escape it as \${libdir} to ensure we get what
we want.
Closes#312
Major fix:
- `make install` now works properly
Minor fix:
- `make test` now works after `make 32bit` on a 64-bit platform
- added more automated travis tests