Tags: bulk messaging system, scalable backend, system design, api quota management, backend engineering

Why Your Bulk Messaging System Needs a “Control Architecture”

Usama Amjid

January 4, 2026

When you build a real, production-scale fintech application, you eventually hit a hard reality: messaging limits.

Whether you are sending WhatsApp messages, emails, or SMS, there is always a daily quota. Most systems break at this point because they follow a dangerous logic: “Just send messages to everyone.”

While that works for small demos, it fails in real products with thousands of users.

The “All-or-Nothing” Trap

Imagine you have 50,000 users, but your provider allows only 3,000 messages per day. At that rate, reaching everyone takes at least 17 days. If you run a simple loop over the whole list, three things go wrong:

•Memory Overload: The system tries to load all 50,000 users at once, causing memory spikes and, in the worst case, server crashes.
•Quota Blindness: Once the provider starts rejecting requests, the code keeps running, wasting CPU and filling logs with errors.
•No Emergency Stop: If you click “Pause,” it’s often too late. A loop that never checks the campaign status between sends cannot be stopped cleanly.
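
To make the failure mode concrete, here is a minimal sketch of the naive loop, using the numbers from the scenario above. `FakeProvider`, `QuotaExceeded`, and `naive_send_all` are hypothetical names invented for this illustration; a real provider SDK will look different.

```python
from typing import List


class QuotaExceeded(Exception):
    """Raised by the fake provider once the daily quota is used up."""


class FakeProvider:
    """Hypothetical stand-in for a messaging API with a daily quota."""

    def __init__(self, daily_quota: int) -> None:
        self.daily_quota = daily_quota
        self.sent = 0

    def send(self, user_id: int) -> None:
        if self.sent >= self.daily_quota:
            raise QuotaExceeded(f"quota of {self.daily_quota} reached")
        self.sent += 1


def naive_send_all(user_ids: List[int], provider: FakeProvider) -> int:
    """Anti-pattern: loop over everyone and return the number of failed calls."""
    failures = 0
    for user_id in user_ids:  # the whole audience is already in memory
        try:
            provider.send(user_id)
        except QuotaExceeded:
            failures += 1  # keeps going: no quota check, no pause check
    return failures


provider = FakeProvider(daily_quota=3_000)
failures = naive_send_all(list(range(50_000)), provider)
print(provider.sent, failures)
```

With these numbers, 3,000 calls succeed and the remaining 47,000 are wasted requests that only produce errors.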

This is not just a technical problem; it’s a business risk.

Treat Messaging as a Managed Resource

Instead of asking, “How do I send messages fast?”, I asked, “How do I stay in control?” That’s when I designed a Campaign Engine based on the Producer-Consumer model.

1. The “Tick” System (Micro-Batching)

The "Producer" never looks at the full user list. Instead, it works in small, controlled batches called ticks. It fetches a few users, pushes jobs to the queue, and then stops to re-evaluate.

Memory usage is bounded by the batch size rather than the size of the audience, so the system stays calm and predictable even at massive scale.
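
A tick can be sketched roughly as follows. This is an illustration under assumptions, not the actual engine: `run_tick`, `fetch_pending`, and `BATCH_SIZE` are hypothetical names, and the user table and job queue are replaced by in-memory deques.

```python
from collections import deque
from typing import Callable, Deque, List

BATCH_SIZE = 3  # hypothetical value; real batches would be larger


def run_tick(
    fetch_pending: Callable[[int], List[int]],
    queue: Deque[int],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Run one producer tick and return the number of jobs enqueued."""
    user_ids = fetch_pending(batch_size)  # at most one batch is ever in memory
    queue.extend(user_ids)
    return len(user_ids)


# In-memory stand-ins for the user table and the job queue.
pending: Deque[int] = deque(range(1, 8))  # seven users waiting
job_queue: Deque[int] = deque()


def fetch_pending(limit: int) -> List[int]:
    """Return up to `limit` users that still need a job."""
    return [pending.popleft() for _ in range(min(limit, len(pending)))]


tick_sizes = []
while True:
    # Between ticks the producer is free to re-check status and quota.
    enqueued = run_tick(fetch_pending, job_queue)
    tick_sizes.append(enqueued)
    if enqueued == 0:  # nothing left to do
        break

print(tick_sizes)
```

In a real deployment the loop would more likely be a scheduler firing one tick at a time than a `while` loop, but the shape of a single tick stays the same.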

2. Quota Awareness

Before enqueuing a single job, the Producer checks the daily limit. If your quota is 3,000 and you have already sent 2,900, the Producer allows only 100 new jobs, then auto-pauses the campaign.

The system knows it is “out of fuel” before it even starts the engine. No crashes, no provider bans, and no panic.
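
The quota check itself is simple arithmetic. A minimal sketch, assuming a hypothetical `used_today` counter that includes jobs already enqueued today as well as messages already sent (otherwise the Producer could over-enqueue while jobs are still waiting in the queue):

```python
from typing import Tuple


def jobs_allowed(daily_quota: int, used_today: int, batch_size: int) -> int:
    """Return how many new jobs this tick may enqueue."""
    remaining = max(daily_quota - used_today, 0)
    return min(remaining, batch_size)


def plan_tick(daily_quota: int, used_today: int, batch_size: int) -> Tuple[int, bool]:
    """Return (jobs_to_enqueue, auto_pause) for the next tick."""
    allowed = jobs_allowed(daily_quota, used_today, batch_size)
    # Pause as soon as this tick would use up the rest of the quota.
    auto_pause = used_today + allowed >= daily_quota
    return allowed, auto_pause


print(plan_tick(3_000, 2_900, 500))  # quota nearly used up
print(plan_tick(3_000, 1_000, 500))  # plenty of quota left
```

For the example in the text (quota 3,000, 2,900 used, a hypothetical batch size of 500), the plan is 100 jobs followed by an auto-pause.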

3. Idempotency: Never Send Twice

To make campaigns safe to restart, I implemented a Delivery Log as the source of truth. Before selecting users, the system asks: "Give me users who match the filters AND who have not already received this message."

Because of this:

•You can pause anytime.
•You can restart anytime.
•Even if the server crashes, the system naturally continues from where it stopped, because progress lives in the Delivery Log and not in the memory of a running process.
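
The selection query could look something like the sketch below, which uses Python's built-in `sqlite3` module. The table names (`users`, `delivery_log`), the `country` filter, and the helper functions are assumptions made up for this example; the real schema and filters are not described in this article.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE delivery_log (
        campaign_id INTEGER NOT NULL,
        user_id     INTEGER NOT NULL,
        PRIMARY KEY (campaign_id, user_id)  -- one delivery per user per campaign
    );
    """
)
conn.executemany(
    "INSERT INTO users (id, country) VALUES (?, ?)",
    [(1, "PK"), (2, "PK"), (3, "US"), (4, "PK")],
)


def next_batch(conn: sqlite3.Connection, campaign_id: int, country: str, limit: int) -> List[int]:
    """Users who match the filter AND have no Delivery Log row for this campaign."""
    rows = conn.execute(
        """
        SELECT u.id FROM users u
        WHERE u.country = ?
          AND NOT EXISTS (
              SELECT 1 FROM delivery_log d
              WHERE d.campaign_id = ? AND d.user_id = u.id
          )
        ORDER BY u.id
        LIMIT ?
        """,
        (country, campaign_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


def record_delivery(conn: sqlite3.Connection, campaign_id: int, user_id: int) -> None:
    """Mark a message as delivered; a duplicate insert is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO delivery_log (campaign_id, user_id) VALUES (?, ?)",
        (campaign_id, user_id),
    )


first = next_batch(conn, campaign_id=1, country="PK", limit=2)
for user_id in first:
    record_delivery(conn, 1, user_id)
second = next_batch(conn, campaign_id=1, country="PK", limit=2)
print(first, second)
```

Because the query is re-run on every tick, a restart needs no saved cursor or offset: users who already have a log row simply never show up again.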

The Admin Control Layer

For admins, the campaign follows a simple lifecycle: Idle → Running → Paused → Completed, and a paused campaign can be resumed back to Running. But the real power is the Status Guard. Even if jobs already exist in the background queue, the "Consumer" checks the global campaign status before firing each API call. If the admin clicks “Pause,” any call already in flight finishes, but no queued job sends a new message. This is real-time control, not a delayed reaction.
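
A Status Guard can be as small as one check at the top of the job handler. The sketch below is an assumption-laden illustration: `Status`, `handle_job`, and the in-memory `campaign` dict are hypothetical, and a real consumer would read the status from a shared store such as a database or cache.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"


def handle_job(
    user_id: int,
    get_status: Callable[[], Status],
    send: Callable[[int], None],
) -> bool:
    """Send one message only if the campaign is still running."""
    if get_status() is not Status.RUNNING:
        return False  # guard tripped: skip the job, make no API call
    send(user_id)
    return True


campaign = {"status": Status.RUNNING}  # stand-in for the shared status store
sent: List[int] = []

results = []
for user_id in [1, 2, 3]:  # three jobs already sitting in the queue
    results.append(handle_job(user_id, lambda: campaign["status"], sent.append))
    if user_id == 1:
        campaign["status"] = Status.PAUSED  # admin clicks "Pause" mid-run

print(results, sent)
```

Skipped jobs do not need to be re-queued in this design: those users have no Delivery Log entry yet, so the Producer will select them again once the campaign resumes.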

Why This Matters for Business

This is not “over-engineering.” It directly impacts the stability of the product:

•No Crashes: The system pauses instead of failing.
•Safe Scaling: You can add more workers to send faster without changing the core logic.
•Full Transparency: Every action (manual pause, auto-pause, completion) is logged. Admins always know what happened and why.

Final Thought

In serious software, especially fintech, doing the work is not enough. What really matters is how you control the work. A system that stays calm under pressure is far more valuable than one that is fast but fragile.