Mastodon relays broadcast public posts from one instance to all connected instances. This can cause a massive influx of federation traffic that overwhelms Sidekiq, the background job processor. When Sidekiq queues fill up, new local posts, notifications, and deliveries slow down or stop entirely. This article explains why relays create Sidekiq backlogs and how to diagnose the root cause.
You will learn how to identify relay-related queue pressure, measure the exact impact on your instance, and decide whether to disconnect or throttle the relay. The goal is to restore normal Sidekiq throughput without disabling federation entirely.
Key Takeaways: Diagnosing Relay-Induced Sidekiq Backlog
- Sidekiq queues page at /sidekiq/queues: Shows the backlog count for each queue, with push and pull queues being the most affected by relay traffic.
- Mastodon admin dashboard > Federation > Relays: Displays the relay status and the number of received messages, which helps correlate backlog spikes with relay activity.
- Sidekiq service logs via journalctl (or your container logs): Reveal repeated ActivityPub::ProcessingWorker errors or timeouts that indicate relay overload.
Why Relays Cause Sidekiq Backlogs
A Mastodon relay acts as a pub-sub hub. When any instance publishes a post that matches the relay’s rules, the relay forwards that post to every connected instance. For a small instance with a few hundred users, a single relay can deliver thousands of posts per minute. Each incoming post triggers Sidekiq workers to process, store, and potentially deliver the post further.
Sidekiq uses multiple queues with different priorities. The push queue handles outgoing deliveries to other instances. The pull queue handles incoming activities from relays and other instances. When a relay floods the pull queue, the workers become saturated. They cannot process local jobs such as sending notifications, updating timelines, or delivering your users’ posts. Because the workers are already running at their configured concurrency limit, the backlog keeps growing for as long as jobs arrive faster than they can be cleared, and the queued payloads steadily consume Redis memory.
The root cause is not the relay itself but the ratio of incoming traffic to available worker resources. A relay that sends 500 posts per minute to an instance with only 25 Sidekiq workers will create a persistent backlog. The backlog clears only when incoming traffic drops below the processing capacity.
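As a rough worked example: if each incoming activity takes about 3 seconds to process (remote account lookups, media fetches, database writes), then 25 workers can clear at most 25 × 60 / 3 = 500 jobs per minute, so a relay delivering 500 posts per minute consumes the entire worker pool and leaves nothing for notifications, timeline updates, or outgoing deliveries. The per-job time is an assumption and varies with hardware and federation latency, but the arithmetic shows why the ratio, not the relay itself, is what matters.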
Steps to Diagnose Relay-Related Backlog
- Check the Sidekiq dashboard: Open https://yourinstance.com/sidekiq/queues. Look at the pull and push queue sizes. A healthy queue shows 0 to 100 pending jobs. A relay-caused backlog shows thousands or tens of thousands of pending jobs. Note the exact count and whether it grows or stays stable.
- Identify the relay with the most traffic: Go to Preferences > Administration > Federation > Relays. Each relay row shows the number of received messages. Sort by received messages in descending order. The relay with the highest number is the likely culprit. Write down its URL and message count.
- Monitor queue growth over 5 minutes: Refresh the Sidekiq queues page every 60 seconds for 5 minutes. If the pull queue count increases by more than 200 jobs per minute, the relay is overwhelming your instance. Take a screenshot or note the timestamps and counts. (A command-line sketch for measuring this growth follows the list.)
- Check Sidekiq worker latency: On the Sidekiq dashboard, look at the Latency column. Latency above 10 seconds indicates the workers cannot keep up. Relay traffic is the most common cause of high latency on small to medium instances.
- Examine Sidekiq logs for relay errors: Run `journalctl -u sidekiq -n 200 --no-pager` on the server. Search for lines containing ProcessingWorker, ActivityPub::IncomingActivity, or Relay. Repeated timeout or connection errors suggest the relay is sending data faster than your instance can accept it. (A log-filtering sketch also follows the list.)
- Compare the backlog with the relay enable time: If you enabled the relay recently, check the Sidekiq queue history. A backlog that started immediately after enabling the relay confirms the cause. If the backlog existed before the relay, the problem may be something else, such as a federation misconfiguration or a DDoS attack.
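If you prefer measuring queue growth from the command line rather than refreshing the dashboard, Sidekiq stores each queue as a Redis list named queue:&lt;name&gt;. The following is a minimal sketch, assuming a local Redis on the default port with no password and no namespace prefix; adjust the redis-cli flags to match your .env.production:

```bash
#!/usr/bin/env bash
# Sample the pull queue length twice, one minute apart, and report the growth rate.
# Assumes local Redis, no AUTH, no REDIS_NAMESPACE prefix (adjust -h, -p, -a as needed).
before=$(redis-cli llen queue:pull)
sleep 60
after=$(redis-cli llen queue:pull)
echo "pull queue: ${before} -> ${after} jobs ($((after - before)) per minute)"

# Optional: print the current length of every Sidekiq queue.
for q in $(redis-cli smembers queues); do
  printf '%-12s %s\n' "$q" "$(redis-cli llen "queue:$q")"
done
```

If the growth rate stays in the hundreds of jobs per minute while the relay is enabled and drops to near zero after disabling it, the relay is the source of the pressure.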
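For the log check, the same journalctl output can be filtered down to the relay-relevant lines. This sketch assumes the service is named sidekiq as in the step above; on stock non-Docker installs the unit is often called mastodon-sidekiq, and on Docker you would use `docker compose logs sidekiq` instead:

```bash
# Keep only Sidekiq log lines that mention ActivityPub processing or relays,
# then count how many of those report errors or timeouts.
journalctl -u sidekiq -n 2000 --no-pager | grep -E 'ProcessingWorker|Relay' | tee /tmp/relay-lines.log | wc -l
grep -icE 'error|timeout' /tmp/relay-lines.log
```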
If the Backlog Persists After Diagnosis
Sidekiq queue stuck at high count even after disconnecting the relay
Disconnecting the relay stops new incoming traffic, but the existing queued jobs must still be processed. Sidekiq may take hours to clear a backlog of 10,000 jobs. To speed this up, temporarily increase the number of Sidekiq workers: edit docker-compose.yml or the systemd service file, raise the concurrency setting, and restart Sidekiq (see the sketch below).
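A rough sketch of that change, assuming a stock non-Docker install with a systemd unit called mastodon-sidekiq, or the standard docker-compose.yml sidekiq service; keep DB_POOL in .env.production at least as large as the concurrency value:

```bash
# Non-Docker installs: open the unit and raise the -c value on the ExecStart line,
# for example 'bundle exec sidekiq -c 40'; also raise DB_POOL in .env.production
# to at least the same number, then restart the service.
sudo systemctl edit --full mastodon-sidekiq
sudo systemctl restart mastodon-sidekiq

# Docker installs: set 'command: bundle exec sidekiq -c 40' on the sidekiq service
# in docker-compose.yml (and DB_POOL in .env.production), then recreate it.
docker compose up -d sidekiq
```

Revert the change once the backlog has drained; higher concurrency also means more PostgreSQL connections and more memory use.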
Relay shows zero messages but backlog remains
The relay may be misconfigured or the relay server may be offline. Remove the relay from Administration > Federation > Relays and add it again. If the backlog does not clear within 30 minutes, the relay server itself may be sending duplicate activities. Contact the relay operator and ask them to check their logs.
Sidekiq workers crash with out-of-memory errors
A large backlog consumes RAM: the queued job payloads accumulate in Redis, and each busy worker holds activity data in memory while processing. If your server has less than 2 GB of RAM, Sidekiq or Redis may run out of memory. Increase the server RAM or reduce the Sidekiq concurrency to 5 or fewer workers, then disconnect the relay and let the backlog drain slowly.
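To confirm where the memory is going, a quick sketch assuming a local Redis and a systemd-managed Sidekiq:

```bash
# Queued job payloads accumulate in Redis, so check its memory use first.
redis-cli info memory | grep used_memory_human

# Then check the resident memory (RSS, shown here in MB) of the Sidekiq processes.
ps aux | grep '[s]idekiq' | awk '{printf "%d MB  %s %s %s\n", $6/1024, $11, $12, $13}'
```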
The table below compares typical symptoms while a high-volume relay is connected with the same instance after the relay has been disconnected:

| Item | Relay Connected | Relay Disconnected |
|---|---|---|
| Pull queue growth rate | 200-1000 jobs per minute | 0-20 jobs per minute |
| Worker latency | 10-60 seconds | Below 2 seconds |
| Memory usage by Sidekiq | 1.5-3 GB | 200-500 MB |
| Time to clear backlog | Never clears | 30 minutes to 4 hours |
You can now identify whether a Mastodon relay is causing your Sidekiq queue backlog. Start by checking the Sidekiq dashboard and the relay message counts. If the backlog is confirmed, disconnect the relay and monitor the queue for 30 minutes. For persistent backlogs, increase Sidekiq worker concurrency or upgrade server memory. As an advanced step, configure a dedicated Sidekiq process for the pull queue so that relay traffic does not block local jobs.
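A minimal sketch of that dedicated-process setup, assuming a non-Docker install where you copy the stock mastodon-sidekiq unit to a new, hypothetical mastodon-sidekiq-pull.service; queue names vary slightly between Mastodon versions, so match the list to your own config/sidekiq.yml:

```bash
# Hypothetical second unit, mastodon-sidekiq-pull.service, copied from the stock unit
# with its ExecStart changed to handle only the pull queue, e.g.:
#   ExecStart=... bundle exec sidekiq -c 10 -q pull
#
# The main unit's ExecStart then lists every queue except pull, e.g.:
#   ExecStart=... bundle exec sidekiq -c 25 -q default -q push -q mailers -q ingress -q scheduler
sudo systemctl daemon-reload
sudo systemctl enable --now mastodon-sidekiq-pull
sudo systemctl restart mastodon-sidekiq
```

With this split, a relay flood can still back up the pull queue, but notifications, timelines, and outgoing deliveries keep flowing through the main process.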