Multi-Leader Election with Rust and etcd
This is the 13th (unlucky?) article in my “Data Mesh” series, and the first where I start implementing the idea. Figuring out multi-leader election was a tough journey for me. It notched up my distributed systems skills, and I thought it would be worth writing up as a stand-alone article.
It took me roughly a week after my idea article to figure out that systems like Consul and etcd help with leader election. With Go in particular this seems easier, and there are several articles detailing the use case. For Rust, both the Consul and etcd crates seemed a little lacking. Among the Consul crates, I liked the ones from Roblox and HashiCorp, but my struggles with both were great enough that I dropped the Consul-based approach, in part for that reason. The other reason is that I realized I might be able to use etcd to build the distributed lock too, rather than running two different systems.
I decided to use this crate for an etcd-based approach. The crate has a leader-election API, which I had assumed was part of the etcd v3 concurrency API, but I had some difficulty getting it to work properly. It also turns out that this is not the use case the crate's authors had in mind. However, I did find a workaround using their lock API.
Anyhow, on to the code then, starting with the requirements.
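The original dependency list was embedded as a gist, so here is a sketch of what a setup like this typically pulls in; crate versions (and the exact feature flags) are my assumptions, not the article's manifest:

```toml
[dependencies]
# etcd client exposing the lock and lease APIs
etcd-client = "0.10"
# async runtime for the etcd client
tokio = { version = "1", features = ["full"] }
# structured logging/tracing
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# serializing ServiceNode into etcd values
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```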

And we trace through the code using the tracing framework.

Next I defined a couple of structs and types to arrange our data.

We use the ServiceNode struct to hold information about the leaders in the group, while ServerNode holds the node's own data. The Key type alias is there to make the code self-documenting.

I implemented Drop for ServerNode. The idea here is that when the program exits, any locks and leases are released.

There is a lot to unpack in the new method for ServerNode.
- We assume the pods are deployed as a StatefulSet, so we get pod names like rdb-election-0.
- This allows us to use the group size variable to decide which group a pod belongs to (lines 42–59).
- The election_key_prefixes are used to detect whether the group already has leaders, since these keys hold the locks. This should become clearer later.
- lease_ttl determines how long the lease on a lock is held.
- The leader_keys are the keys under which the leaders store their ServiceNode information.

This function runs whenever a node comes online. It continually checks the existence and liveness of the election keys, and it also makes sure that the same node does not claim more than one leader slot in the same group.

The campaign function makes the current node one of the group's leaders:
- Get a new lease.
- Create a new lock, identified by the argument to the function, and associate the lease with it.
- Put the node's ServiceNode into the key for the leader.
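With the `etcd-client` crate, those three steps look roughly like this. This is an illustration only, not the article's exact code: it needs a running etcd, and the error handling and value encoding are simplified.

```rust
use etcd_client::{Client, LockOptions};

// Sketch of campaigning for a leader slot via the lock API.
async fn campaign(
    client: &mut Client,
    election_key: &str,
    leader_key: &str,
    node_json: &str,
    lease_ttl: i64,
) -> Result<(), etcd_client::Error> {
    // 1. Get a new lease.
    let lease = client.lease_grant(lease_ttl, None).await?;
    // 2. Take the lock named by `election_key`, tied to that lease,
    //    so the lock is dropped automatically if the lease expires.
    let _lock = client
        .lock(election_key, Some(LockOptions::new().with_lease(lease.id())))
        .await?;
    // 3. Publish this node's ServiceNode under the leader key.
    client.put(leader_key, node_json, None).await?;
    Ok(())
}
```

Tying the lock to the lease is what gives us failure detection for free: if the leader dies and stops refreshing, the lease expires and the lock is released.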

The keep_alive function is run by the leaders to keep the lease alive.
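A sketch of what that refresh loop can look like with `etcd-client` and `tokio`; again an illustration requiring a live etcd, and refreshing at half the TTL is my choice of margin, not necessarily the article's:

```rust
use etcd_client::Client;
use std::time::Duration;

// Periodically refresh the lease so the lock never expires while
// the leader is healthy.
async fn keep_alive(
    client: &mut Client,
    lease_id: i64,
    ttl_secs: u64,
) -> Result<(), etcd_client::Error> {
    let (mut keeper, mut responses) = client.lease_keep_alive(lease_id).await?;
    loop {
        // Send a keep-alive request, then read the server's response.
        keeper.keep_alive().await?;
        if let Some(resp) = responses.message().await? {
            println!("lease {} refreshed, ttl {}s", lease_id, resp.ttl());
        }
        tokio::time::sleep(Duration::from_secs(ttl_secs / 2)).await;
    }
}
```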

The get_leaders
function is used by the non-leaders to collect and store information about the leaders.
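With `etcd-client` this amounts to a prefix scan over the leader keys; a hedged sketch (decoding the raw values back into ServiceNode is elided here):

```rust
use etcd_client::{Client, GetOptions};

// Fetch the current leaders' published values under a key prefix.
async fn get_leaders(
    client: &mut Client,
    leader_key_prefix: &str,
) -> Result<Vec<String>, etcd_client::Error> {
    let resp = client
        .get(leader_key_prefix, Some(GetOptions::new().with_prefix()))
        .await?;
    Ok(resp
        .kvs()
        .iter()
        .filter_map(|kv| kv.value_str().ok().map(str::to_owned))
        .collect())
}
```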

Finally, the main function. It boils down to the loop on lines 44 to 52: the node continually checks for the leaders or becomes one of them.
Testing can be done by deploying the service using server.yaml. Alternatively, it can be tested by running commands like these in different terminal windows:
TRACE_LEVEL=info NODE=Server-1 DBNAME=RDeeBee ADDRESS=192.168.10.10 ETCD=localhost:30327 LEASE_TTL=10 SLEEP_TIME=2 GROUP_SIZE=4 cargo run
TRACE_LEVEL=info NODE=Server-2 DBNAME=RDeeBee ADDRESS=192.168.10.11 ETCD=localhost:30327 LEASE_TTL=10 SLEEP_TIME=2 GROUP_SIZE=4 cargo run
TRACE_LEVEL=info NODE=Server-3 DBNAME=RDeeBee ADDRESS=192.168.10.13 ETCD=localhost:30327 LEASE_TTL=10 SLEEP_TIME=2 GROUP_SIZE=4 cargo run
This will start three processes mocking different hostnames and IP addresses.
Also, check out my article on distributed ordering.
Next I will start working on integrating this with the data mesh.