[WIP]: One Way to write your own Task Queue
I was reading a Hacker News post a while back about why you should implement a Task Queue. I’m not going to go there, as I think Dan Palmer’s article and the original Hacker News thread cover that well.
Since it’s a and rewrite - I imagine one could fork one of the Codecrafters Challenges and put together
You could read the source code on your own. Despite this, I decided to write this up for my own learning, and I hope that it makes reading the code faster for you too.
An Overview of the application as a whole
- Built using Redis as a backing store, together with croniter as a library.
Useful terms:
- Soft timeouts -> Raise an exception inside the task, which the task can catch to clean up before it is killed.
- Hard timeouts -> Similar to soft timeouts, but hard timeouts aren’t catchable and forcefully terminate the task.
- SIG_DFL -> The default signal handler. Passing it to signal.signal() restores a signal’s default behaviour in place of a custom handler.
- SIGCHLD -> The signal sent to a parent process when one of its child processes stops or terminates.
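To make the difference between the two timeouts concrete, here is a minimal sketch of how they could be wired up on a Unix system. It is not taken from the library's source: `SoftTimeout` and `run_with_timeouts` are names I made up, and I'm assuming the soft timeout is delivered through SIGALRM inside the child while the parent enforces the hard timeout with SIGKILL.

```python
import os
import signal
import time


class SoftTimeout(Exception):
    """Hypothetical exception raised inside the task when the soft timeout fires."""


def _raise_soft_timeout(signum, frame):
    raise SoftTimeout()


def run_with_timeouts(task, soft_seconds, hard_seconds):
    """Run task() in a forked child and return its exit code.

    A negative return value means the child was killed by that signal number.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the soft timeout arrives as a catchable exception via SIGALRM.
        signal.signal(signal.SIGALRM, _raise_soft_timeout)
        signal.setitimer(signal.ITIMER_REAL, soft_seconds)
        code = 0
        try:
            task()
        except SoftTimeout:
            code = 1  # the task let the soft timeout escape
        except Exception:
            code = 2  # the task failed for another reason
        finally:
            os._exit(code)

    # Parent: poll the child, and SIGKILL it once the hard deadline passes.
    deadline = time.monotonic() + hard_seconds
    while True:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            return os.waitstatus_to_exitcode(status)
        if time.monotonic() >= deadline:
            os.kill(pid, signal.SIGKILL)  # cannot be caught or ignored
            _, status = os.waitpid(pid, 0)
            return os.waitstatus_to_exitcode(status)
        time.sleep(0.01)
```

A task that catches `SoftTimeout` and returns gets a clean exit code, while a task that swallows the exception and keeps going is eventually killed by the parent, which shows up as a negative return value.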
Breakdown of a task
- How are tasks started?
- How are tasks stopped?
This is enabled by pub/sub on the broker, which allows signals to be propagated.
Breakdown of code within worker
- How are queues managed?
- How does the worker decide which tasks to run?
queues_by_name -> Passed to tasks so we can quickly identify the queues that tasks are associated with
What’s a ZRANGEPOP
To save you the hassle of looking it up, here’s the corresponding documentation for ZREMRANGEBYSCORE and ZRANGEBYSCORE. Keys are 1-indexed in Redis Lua scripts, so KEYS[1] is the first key.
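Here is a rough in-memory sketch of what I understand ZRANGEPOP to do, assuming it is simply ZRANGEBYSCORE followed by ZREMRANGEBYSCORE over the same score range, run as one atomic step. The `zrangepop` function and the dict standing in for the sorted set are my own illustration, not the real script.

```python
def zrangepop(sorted_set, max_score, min_score=0):
    """Return members whose score is in [min_score, max_score] and remove them.

    `sorted_set` is a plain dict of member -> score that stands in for a
    Redis sorted set. Members come back ordered by score, then by name,
    which mirrors how Redis orders sorted-set members.
    """
    due = sorted(
        (m for m, score in sorted_set.items() if min_score <= score <= max_score),
        key=lambda m: (sorted_set[m], m),
    )
    for member in due:
        del sorted_set[member]  # the ZREMRANGEBYSCORE half of the operation
    return due
```

For example, with members scored 100, 180 and 250, popping up to a score of 200 returns the first two and leaves only the third in the set.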
What happens within the scheduler
How are tasks scheduled?
Let’s take a look at the run function.
While starting off we have no upcoming tasks.
- Advance the schedules of all cronjobs to the next interval
- Find the next closest task which will be executed
- Add all tasks with the same minute precision to the queue and sleep until then
On the second pass, we push all tasks to their corresponding queues.
The scheduler runs at minute precision. Here’s what the scheduler would look like.
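The passes described above can be sketched roughly as follows. This is my own simplification rather than the real run function: `next_run` is a hypothetical stand-in for a cron iterator that only understands "every N minutes", and `plan_next_batch` and `run_scheduler` are names I made up. The clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
from datetime import datetime, timedelta


def next_run(every_minutes, after):
    """Hypothetical stand-in for a cron iterator: the next whole minute after
    `after` whose minute-of-hour is a multiple of `every_minutes`."""
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while candidate.minute % every_minutes != 0:
        candidate += timedelta(minutes=1)
    return candidate


def plan_next_batch(schedules, now):
    """Advance every schedule, then return (run_at, tasks due at that minute)."""
    upcoming = {name: next_run(every, now) for name, every in schedules.items()}
    run_at = min(upcoming.values())
    batch = sorted(name for name, when in upcoming.items() if when == run_at)
    return run_at, batch


def run_scheduler(schedules, enqueue, passes, now=datetime.now, sleep=time.sleep):
    """Run a fixed number of passes of the scheduler loop."""
    upcoming = []  # empty on the first pass
    for _ in range(passes):
        for name in upcoming:
            enqueue(name)  # later passes: push the tasks we just slept for
        current = now()
        run_at, upcoming = plan_next_batch(schedules, current)
        sleep(max(0.0, (run_at - current).total_seconds()))
```

Because every schedule is truncated to the minute, two tasks that fall in the same minute always end up in the same batch.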
waitpid does what you’d expect it to do: it waits for a child process to change state.
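As a small illustration of how a parent might use waitpid without blocking, here is a sketch that forks a couple of children and reaps whichever ones have finished. `spawn` and `reap_finished` are hypothetical helpers for this post, and the example assumes a Unix system.

```python
import os


def spawn(exit_code):
    """Fork a child that exits immediately with `exit_code`; return its pid."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)
    return pid


def reap_finished(children):
    """Non-blocking reap: remove exited pids from the `children` set and
    return a dict of pid -> exit code for the ones that finished."""
    finished = {}
    for pid in list(children):
        # WNOHANG makes waitpid return (0, 0) if the child is still running.
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            finished[pid] = os.waitstatus_to_exitcode(status)
            children.remove(pid)
    return finished
```

A parent loop could call `reap_finished` once per cycle, or whenever it receives SIGCHLD, so that exited children don't linger as zombies.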
State being passed around -> Queues by name,
What does the core architecture look like
Minute level task scheduler
How the ETA parameter helps with scheduling
- I wasn’t too sure about this initially, but it seems to stand for Estimated Time of Arrival
- When you want a task to run by a certain time you’d call
- This is an override by the parent. Recall that we previously enqueued tasks in a new eta queue with the timestamp param that we could use
Each queue has a corresponding eta queue, which can be used as an override. At each cycle, the parent checks for due eta tasks and dumps them at the front of the queue.
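Here is a minimal in-memory sketch of that idea, assuming the eta queue is a sorted set scored by timestamp and the main queue is a list. `QueueWithEta` and its method names are hypothetical, and the Redis commands named in the comments are my guess at the closest equivalents rather than a transcription of the source.

```python
from collections import deque


class QueueWithEta:
    """Hypothetical in-memory stand-in for a Redis list plus its eta sorted set."""

    def __init__(self):
        self.ready = deque()  # plays the role of the Redis list
        self.eta = {}         # plays the role of the sorted set: payload -> timestamp

    def enqueue(self, payload):
        self.ready.append(payload)  # assumed to behave like RPUSH

    def enqueue_at(self, payload, timestamp):
        self.eta[payload] = timestamp  # like ZADD with the timestamp as the score

    def promote_due(self, now):
        """Move every eta task whose timestamp has passed to the front of the queue."""
        due = sorted(
            (p for p, ts in self.eta.items() if ts <= now),
            key=lambda p: (self.eta[p], p),
        )
        for payload in due:
            del self.eta[payload]
            # Like LPUSH: each value goes to the head, so the last one pushed ends up first.
            self.ready.appendleft(payload)
        return due

    def pop(self):
        return self.ready.popleft() if self.ready else None  # like LPOP
```

The parent would call `promote_due` with the current timestamp once per cycle, so due eta tasks jump ahead of anything already waiting in the queue.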
- LPUSH -> Insert all the specified values at the head of the list stored at key.
- ZADD -> Adds all the specified members with the specified scores to the sorted set stored at key. It is possible to specify multiple score / member pairs. https://redis.io/commands/zadd/
- BLPOP -> A blocking list pop primitive. It is the blocking version of LPOP because it blocks the connection when there are no elements to pop from any of the given lists.
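To get a feel for the "any of the given lists" part of BLPOP, here is a small in-memory imitation built on a condition variable. `InMemoryLists` is a hypothetical class for illustration only. One difference from Redis: there a timeout of 0 means block forever, while in this sketch the timeout is always a finite number of seconds.

```python
import threading
import time
from collections import deque


class InMemoryLists:
    """Hypothetical stand-in for a handful of Redis lists."""

    def __init__(self):
        self._lists = {}
        self._changed = threading.Condition()

    def rpush(self, key, value):
        with self._changed:
            self._lists.setdefault(key, deque()).append(value)
            self._changed.notify_all()  # wake up any blocked poppers

    def blpop(self, keys, timeout):
        """Pop from the first non-empty list in `keys`, in the order given.

        Blocks for up to `timeout` seconds when all lists are empty.
        Returns (key, value), or None if the timeout expires.
        """
        deadline = time.monotonic() + timeout
        with self._changed:
            while True:
                for key in keys:
                    items = self._lists.get(key)
                    if items:
                        return key, items.popleft()
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None
                self._changed.wait(remaining)
```

Listing a high-priority key before a low-priority one means a worker always drains the former first, which is one simple way a worker could decide which queue to serve.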
- https://github.com/tomarrell/miniqueue
- https://github.com/beanstalkd/beanstalkd
- https://github.com/rq/rq