A Backend Agnostic Redis Interface

Josua Krause, PhD
Published in Stackademic · 7 min read · Nov 22, 2023


RediPy, the versatile Redis interface. (Logo)

Over the years, Redis [1] has become a trusted solution for managing distributed applications. Whether it is caching, queueing, or sharing data between different applications on a network, Redis has proven to be valuable. Its utility is so broad that it’s even tempting to use it for non-distributed, single-process applications. However, this approach comes with its own set of challenges. For instance, a standalone app can no longer run independently: the Redis server must be running before the app starts. Additionally, using Redis introduces some overhead. Data isn’t stored in the current process’ memory but in the memory of a separate process, accessible only via a (local) network connection. Wouldn’t it be great if we could keep Redis contained in our process for small projects and smoothly transition to a server-based approach as the project grows from threads to multiple distributed processes?

This is where RediPy [2] comes in. RediPy is a versatile backend-agnostic Redis interface. It provides an in-process, Redis-like storage that can be accessed through the same API as an actual Redis storage via a network connection. The library is available on PyPI [3] and can be installed via:

pip install redipy

The Why

So, why should you install another library that essentially provides the same API as the official Python Redis client [4]? Here are a few reasons:

1) Suppose you want to cache computation results using Redis so that all containers on the network can utilize the cache. This could include multiple apps or redundant versions of the same app. If you find that the same cache keys are frequently accessed from a given container, you might decide to add an additional caching layer local to the running process. Normally, this would require duplicating the caching code for a process-local storage using different data structures. With RediPy, you can use the same code for both the local cache and the remote Redis cache (a sketch of this pattern follows after this list).

2) RediPy is also useful for writing tests that involve Redis storage. While it is not difficult to make tests work with a test Redis server, with RediPy it is easier to set up clean initial states for tests. Especially when running tests in parallel using a test Redis server, great care must be taken to ensure properly separated test environments. Without a test-specific key prefix, it’s easy to accidentally have conflicting keys between tests that lead to unexpected behavior. Also, using an in-memory Redis avoids the need to remember to start a Redis server to be able to run tests. RediPy eliminates all these issues with its process-local backend.

3) As projects grow, so do their requirements. RediPy allows for proactive anticipation of distributed software requirements when a project is still in its infancy, running in a single process. When the time comes, the focus can be on separating the logic and turning threads into processes, instead of having to map all data structures to equivalent Redis types and commands. Moreover, RediPy makes it easy to implement apps that can be deployed both in a single process and in the cloud with containers.

4) Redis allows for complex transactions using Lua scripts. However, RediPy takes a different approach. Instead of bundling RediPy with a Lua interpreter, scripts are defined in Python using a symbolic API. These scripts are then compiled into Lua code when executing the script on the Redis backend and executed directly when using the in-memory backend. This method offers several benefits:

  • It eliminates the need to code in Lua (or other languages) when interfacing with Redis scripts. This is particularly beneficial as even enterprise Redis has been moving away from using Lua for scripting [5].
  • Lua has some behaviors that can be surprising, such as 1-based indexing or skipping nil values during iteration [6], and it can be challenging to debug when running on the Redis server.
  • Having a separate universal scripting API allows for behavior consistent with the main API. For instance, the Redis LPOP function returns false in Lua for an empty list instead of the nil (or None) that is documented and returned when calling the function through the regular API. This is one of several undocumented deviations from the Redis API specification made to accommodate Lua’s quirks.

By using RediPy, you can avoid these issues and enjoy a more streamlined and consistent experience when working with Redis.
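To make the first point concrete, here is a minimal sketch of the dual-layer caching pattern. The constructor arguments and method names are assumptions for illustration (they mirror the official client; check the RediPy docs for the exact signatures):

import json

from redipy import Redis

def cached(r, key, compute):
    # look up the key in the given Redis-like store; compute and fill on miss
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    value = compute()
    r.set(key, json.dumps(value))
    return value

# the process-local layer and the shared layer use the exact same helper
local = Redis(backend="memory")  # constructor arguments are assumptions
remote = Redis(backend="redis", host="localhost", port=6379)

# check the local cache first and fall through to the networked cache
result = cached(
    local, "result", lambda: cached(remote, "result", lambda: 42))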

The How

Okay, enough talking. Show us some code! In the following, we will implement a simple task queue that ensures no data loss if a worker fails. This is achieved by detecting the faulty worker and reinserting its task back into the queue. This code assumes you are familiar with Redis and the official Python Redis client. You can find the complete example code here [7].

First, let’s establish a connection to Redis:
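A minimal sketch, assuming RediPy’s constructor takes a backend name (the exact arguments are an assumption; consult the RediPy docs):

from redipy import Redis

# process-local, in-memory backend: no server required
r = Redis(backend="memory")

# the very same interface backed by an actual Redis server
# (constructor arguments here are assumptions; see the RediPy docs):
# r = Redis(backend="redis", host="localhost", port=6379)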

Now, let’s construct the queue. We use a priority queue with the ZADD command, which allows us to assign a score that determines the order of task retrieval. Additionally, we store the actual task payload using a hash with the HSET command. This provides more flexibility in defining task parameters. However, this also necessitates the creation of unique task IDs to link the queue with the hash (which we won’t cover here). We also must not forget to clean up after a task has been completed.

We use a pipeline to ensure that the payload and the queue entry become visible together, with neither accessible before the other. This prevents a data race when picking up tasks:
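A sketch of what enqueue_task might look like; the key names are made up for this example and the method signatures mirror the official client:

import json
import uuid

QUEUE_KEY = "tasks:queue"  # sorted set: task id -> priority score
DATA_KEY = "tasks:data"    # hash: task id -> serialized payload

def enqueue_task(r, payload, priority=1.0):
    task_id = uuid.uuid4().hex  # unique id linking queue entry and payload
    # both writes become visible together, so a worker can never observe
    # the queue entry without its payload (or vice versa)
    with r.pipeline() as pipe:
        pipe.zadd(QUEUE_KEY, {task_id: priority})
        pipe.hset(DATA_KEY, task_id, json.dumps(payload))
        pipe.execute()
    return task_id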

This enqueue_task function is executed by the task producer, which can be a thread or a process depending on the RediPy backend we choose to use. The workers, which can be threads or processes, can retrieve and execute a task as follows:
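A sketch of the worker side under the same assumptions (zpopmax and friends are the official client’s method names; RediPy’s may differ slightly):

import json
import time

def consume_task(r):
    # pop the queue entry with the highest score; an empty result
    # means there are no tasks
    popped = r.zpopmax(QUEUE_KEY)
    if not popped:
        return False
    task_id, _score = popped[0]
    payload = json.loads(r.hget(DATA_KEY, task_id))
    time.sleep(payload["sleep_s"])  # "execute" the task
    r.hdel(DATA_KEY, task_id)       # clean up the payload
    return True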

Here we use ZPOPMAX to get the queue entry with the highest score. Then we retrieve the payload with HGET, execute the task (which in our case is just sleeping for a number of seconds), and finally clean up with HDEL.

If we are using an actual Redis server and the workers are running in their own containers, there is always a risk of sudden worker failure (e.g., container eviction, node restarts, power loss, or network outages). If that happens while a worker is processing a task, we lose data: the task has already been removed from the queue, so no other worker will pick it up. The task is lost forever. To resolve this issue, we first need a way to detect when a worker has died. For this, we can implement a simple heartbeat that runs in a different thread in the worker process:
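A sketch of such a heartbeat. Here, mode="if_missing" is RediPy’s spelling of the NX flag; the remaining method names, signatures, and the truthy return of SET are assumptions mirroring the official client:

import threading
import time

HEARTBEAT_TTL = 5        # seconds of silence after which a worker counts as dead
HEARTBEAT_INTERVAL = 1.0  # how often the worker refreshes its key

def register_worker(r):
    # claim the first free worker slot; mode="if_missing" makes the SET
    # succeed only if the key does not exist yet (Redis' NX flag)
    ix = 0
    while not r.set(f"worker:{ix}", "alive", mode="if_missing"):
        ix += 1
    worker_id = f"worker:{ix}"

    def beat():
        while True:
            # keep pushing the expiration into the future; if the worker
            # (and with it this thread) dies, the key expires on its own
            r.expire(worker_id, HEARTBEAT_TTL)
            time.sleep(HEARTBEAT_INTERVAL)

    threading.Thread(target=beat, daemon=True).start()
    return worker_id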

We first use SET to create a unique worker ID (mode="if_missing" is the RediPy equivalent of the NX flag, which creates a key only if it doesn’t exist already). Then, we use repeated Redis key expirations to create a heartbeat. As long as the heartbeat is active, the key will not expire. However, if the worker dies (and with it the heartbeat), the key will expire. This can be detected on the producer side to re-enqueue the task the worker was handling. But before we can do that, we need a way for a worker to state which task it is currently working on.

Now we have a situation where we retrieve a value from Redis (the task ID) that we then use to store a different value on Redis (the mapping from task ID to worker ID). Both these operations need to happen in a single transaction. Otherwise, if we were really unlucky and the worker died exactly between the two commands, we would experience data loss and the task would be forgotten. Note that a pipeline is not enough here, as the transaction ends the moment we retrieve its results; we cannot use a result within the same transaction. We have to use a script:
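In the full example [7], this script is built symbolically in Python (in a helper called _pick_task) and compiled to Lua for the Redis backend. Rather than guess at the symbolic API’s exact calls here, the sketch below shows the equivalent transaction as hand-written Lua registered through the official client, which is roughly what RediPy generates under the hood (key names carried over from the sketches above):

import redis  # the official client, to illustrate the generated Lua

PICK_TASK_LUA = """
-- atomically pop the best task and record which worker owns it
local popped = redis.call('ZPOPMAX', KEYS[1], 1)
if popped[1] == nil then
    return nil  -- the queue was empty
end
local task_id = popped[1]
local payload = redis.call('HGET', KEYS[2], task_id)
redis.call('HSET', KEYS[3], task_id, ARGV[1])  -- busy[task_id] = worker_id
return {task_id, payload}
"""

client = redis.Redis()
pick_task = client.register_script(PICK_TASK_LUA)
# returns [task_id, payload] or None if the queue was empty
result = pick_task(keys=["tasks:queue", "tasks:data", "tasks:busy"],
                   args=["worker:0"])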

This new consume_task_loop function might seem intimidating at first due to its length, but let’s break it down. In _pick_task, we are creating the script. The initial statements set up the script arguments and define the expected type for each key. We then create a for loop that iterates over the result of ZPOPMAX, which returns either one result or zero (if the queue is empty). If a task is retrieved, we store the task ID and its payload in the script’s return value, and we create a mapping from the task ID to the worker ID in the hash key busy. The rest of the consume_task_loop function is essentially the same as before, except that we now also have to remove the field in busy. We can use a pipeline for this to ensure that the payload and the worker marker are removed at the same time.
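Under the assumptions above, the cleanup might then read:

BUSY_KEY = "tasks:busy"  # hash: task id -> worker id

# remove the payload and the busy marker in one atomic step
with r.pipeline() as pipe:
    pipe.hdel(DATA_KEY, task_id)
    pipe.hdel(BUSY_KEY, task_id)
    pipe.execute()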

Now that we have a way of detecting (in-)active workers and the tasks they are/were working on, we can implement task recovery in case a worker fails:
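A sketch of the recovery loop, again with task_check shown as the Lua that RediPy’s symbolic script would compile to (key names and signatures remain assumptions):

import time

import redis

TASK_CHECK_LUA = """
-- if the worker's heartbeat key is gone, put its task back in the queue
if redis.call('EXISTS', KEYS[1]) == 0 then
    redis.call('ZADD', KEYS[2], ARGV[2], ARGV[1])  -- re-enqueue the task
    redis.call('HDEL', KEYS[3], ARGV[1])           -- drop the busy mapping
end
"""

def stale_worker_loop(client, sweep_interval=5.0):
    task_check = client.register_script(TASK_CHECK_LUA)
    while True:
        # snapshot the busy mapping in one transaction...
        for task_id, worker_id in client.hgetall("tasks:busy").items():
            # ...then check each task in its own small transaction, so
            # Redis is never blocked for the whole sweep
            task_check(keys=[worker_id, "tasks:queue", "tasks:busy"],
                       args=[task_id, 1.0])
        time.sleep(sweep_interval)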

Here, we iterate over all fields in the busy hash. The task_check script then checks if the specified worker is still alive. If it is not, we put the task ID back into the queue and delete the task-to-worker mapping. Note that HGETALL and task_check run as two separate transactions. This is intentional, as we do not want to block Redis for the entire duration of all task checks.

Let’s run the code and see how it all works together:
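A driver that wires the sketches together might look like this (the actual example in [7] differs in detail). With the memory backend the workers are threads; with the Redis backend they can just as well be separate processes:

import threading
import time

def worker_main(r):
    register_worker(r)  # claims a slot and starts the heartbeat
    while True:
        if not consume_task(r):  # the full example uses the scripted variant
            time.sleep(0.1)      # queue empty: back off briefly

def main():
    r = Redis(backend="memory")  # swap in the "redis" backend unchanged
    # with a real Redis backend, also start stale_worker_loop(...) here

    for _ in range(3):
        threading.Thread(target=worker_main, args=(r,), daemon=True).start()

    for ix in range(30):
        enqueue_task(r, {"sleep_s": 1.0}, priority=float(ix))
        time.sleep(0.5)

if __name__ == "__main__":
    main()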

We start three workers and one producer and let them chug along. At one point, we terminate worker 2, which at the time is executing task 21. After a few seconds, the stale worker detection loop finds the orphaned task and adds it back to the queue. Eventually, task 21 is picked up again by worker 0, which finishes executing it. Works like a charm!

You can find the full code of this tutorial here [7]. Hopefully, it has piqued your interest in RediPy [2].

References

[1] https://redis.io/docs/about/
[2] https://github.com/JosuaKrause/redipy/
[3] https://pypi.org/project/redipy/
[4] https://pypi.org/project/redis/
[5] https://redis.com/events-and-webinars/dev-it-live-program-your-redis-database-with-javascript/
[6] https://stackoverflow.com/questions/35515705/lua-doesnt-go-over-nil-in-table/
[7] https://github.com/JosuaKrause/redipy/blob/main/examples/workers.py


Josua has led Data Science teams focused on deep representation learning, natural language processing, and adaptive learning. His PhD focused on explainable AI.