Event-driven architecture (EDA) has a reputation problem. Mention it to a small team and they picture Kafka clusters, schema registries, complex consumer groups, and a distributed systems PhD on the payroll. The enterprise literature does not help — it assumes you have a platform team, a dedicated infrastructure budget, and services counted in dozens.
But event-driven patterns can dramatically simplify small applications too. The key is choosing the right level of complexity for your team size. A 5-person startup does not need Kafka. They need patterns that decouple their services, handle failures gracefully, and can scale when the time comes.
This article presents event-driven architecture at three levels of complexity, with real implementation patterns for each. Start at level one and graduate to level two or three only when you hit specific scaling walls.
Why Events Matter for Small Teams
Consider a typical SaaS application. A user signs up. What happens next?
- Create the user record in the database
- Send a welcome email
- Create a Stripe customer
- Set up default project and workspace
- Notify the sales team in Slack
- Track the event in analytics
In a synchronous architecture, all of this happens in the signup handler:
// The synchronous nightmare
async function handleSignup(req, res) {
  const user = await db.users.create(req.body); // 50ms
  await sendWelcomeEmail(user); // 800ms
  await stripe.customers.create({ email: user.email }); // 400ms
  await createDefaultWorkspace(user.id); // 200ms
  await notifySlack(`New signup: ${user.email}`); // 300ms
  await analytics.track("user.signup", user); // 150ms
  // Total: ~1900ms — user waits almost 2 seconds
  // If Stripe is down, signup fails entirely
  res.json({ user });
}
This has three critical problems: the user waits for all downstream operations, any single failure blocks the entire signup, and every new side effect makes the handler slower and more fragile.
The event-driven version is different: the signup handler creates the user, publishes an event, and returns. Everything else happens asynchronously:
// Event-driven: clean, fast, resilient
async function handleSignup(req, res) {
  const user = await db.users.create(req.body); // 50ms
  await events.publish("user.signup", { userId: user.id, email: user.email });
  res.json({ user }); // Total: ~60ms
}

// Each side effect is an independent handler
events.on("user.signup", sendWelcomeEmail);
events.on("user.signup", createStripeCustomer);
events.on("user.signup", createDefaultWorkspace);
events.on("user.signup", notifySlackSales);
events.on("user.signup", trackAnalytics);
Response time drops from 1900ms to 60ms. If Stripe is down, the user still signs up — the Stripe handler retries independently. Adding a new side effect is adding one handler, not modifying a critical code path.
Level 1: In-Process Events (0-3 Services)
For a single application or a monolith, you do not need a message broker. An in-process event emitter with a persistent job queue handles most event-driven patterns.
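Stripped to its essence, the emitter side is small. Here is a minimal, non-persistent sketch of the idea (the class and its names are ours, not from any library) — events vanish if the process crashes, which is exactly why the BullMQ implementation that follows persists jobs in Redis:

```typescript
// Minimal in-process event bus — illustration only, no persistence.
// Failures are isolated: one failing subscriber never blocks the others.
type Handler = (data: unknown) => Promise<void> | void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  on(eventName: string, handler: Handler): void {
    const list = this.handlers.get(eventName) ?? [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }

  // Run every handler, collecting failures instead of throwing on the first.
  async publish(
    eventName: string,
    data: unknown
  ): Promise<{ ok: number; failed: number }> {
    const list = this.handlers.get(eventName) ?? [];
    // async wrapper turns synchronous throws into rejected promises
    const results = await Promise.allSettled(list.map(async (h) => h(data)));
    const failed = results.filter((r) => r.status === "rejected").length;
    return { ok: results.length - failed, failed };
  }
}
```

This already buys the decoupling shown above; what it cannot do is retry a handler after a crash, which is the job queue's contribution.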
Implementation with BullMQ (Node.js)
// events/emitter.ts
import { Queue, Worker } from "bullmq";
import Redis from "ioredis";

// BullMQ workers use blocking Redis commands, so ioredis must be created
// with maxRetriesPerRequest: null (BullMQ rejects the connection otherwise)
const connection = new Redis(process.env.REDIS_URL!, {
  maxRetriesPerRequest: null,
});

// One queue per event type
const queues = new Map<string, Queue>();

export function getQueue(eventName: string): Queue {
  if (!queues.has(eventName)) {
    queues.set(eventName, new Queue(eventName, { connection }));
  }
  return queues.get(eventName)!;
}

export async function publish(eventName: string, data: Record<string, unknown>) {
  const queue = getQueue(eventName);
  await queue.add(eventName, data, {
    attempts: 3,
    backoff: { type: "exponential", delay: 1000 },
    removeOnComplete: { count: 1000 },
    removeOnFail: { count: 5000 },
  });
}

// Register a handler for an event
export function on(
  eventName: string,
  handler: (data: Record<string, unknown>) => Promise<void>,
  options?: { concurrency?: number }
) {
  const worker = new Worker(
    eventName,
    async (job) => {
      await handler(job.data);
    },
    {
      connection,
      concurrency: options?.concurrency ?? 5,
      limiter: { max: 10, duration: 1000 }, // Rate limit
    }
  );
  worker.on("failed", (job, err) => {
    console.error(`Handler for ${eventName} failed:`, {
      jobId: job?.id,
      attempt: job?.attemptsMade,
      error: err.message,
    });
  });
  return worker;
}
// handlers/signup.ts — Clean, isolated handlers
import { on } from "../events/emitter";

on("user.signup", async (data) => {
  const { userId, email } = data;
  await emailService.send({
    to: email,
    template: "welcome",
    data: { userId },
  });
}, { concurrency: 10 });

on("user.signup", async (data) => {
  const { userId, email } = data;
  const customer = await stripe.customers.create({
    email,
    metadata: { userId },
  });
  await db.users.update(userId, { stripeCustomerId: customer.id });
}, { concurrency: 3 }); // Lower concurrency for Stripe rate limits
This gives you retry logic, concurrency control, rate limiting, and failure tracking — all with Redis as the only infrastructure dependency (which you likely already have for caching).
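The retry behavior is worth making concrete. BullMQ's built-in "exponential" backoff computes the wait before attempt *n* as `delay * 2^(n - 1)`, so the settings above (three attempts, 1000ms base delay) retry after 1s and then 2s before the job lands in the failed set:

```typescript
// BullMQ's built-in exponential backoff: delay * 2^(attemptsMade - 1).
// With attempts: 3 and delay: 1000, a job is tried, retried after 1s,
// and retried once more after 2s before being marked failed.
function backoffDelay(attemptsMade: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attemptsMade - 1);
}
```

Tune the base delay per queue: a flaky email provider tolerates aggressive retries, while a rate-limited API like Stripe deserves a longer base.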
Level 2: Message Broker (3-10 Services)
When you have multiple services that need to react to events, an in-process emitter is not enough — you need a message broker. For small teams, two options cover nearly every case: Redis Streams to start, and RabbitMQ or NATS when routing gets complex.
Redis Streams
If you already run Redis, Redis Streams is the path of least resistance. It provides pub/sub with persistence, consumer groups, and acknowledgment — essentially a lightweight Kafka.
// Producer: publish events to a Redis Stream
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

async function publishEvent(stream: string, event: { type: string }) {
  const id = await redis.xadd(
    stream,
    "*", // Auto-generate ID
    "data", JSON.stringify(event),
    "type", event.type,
    "timestamp", Date.now().toString()
  );
  return id;
}
// Consumer group: each service gets its own group
async function createConsumerGroup(stream: string, group: string) {
  try {
    await redis.xgroup("CREATE", stream, group, "0", "MKSTREAM");
  } catch (err) {
    // Group already exists, that's fine
    if (!(err instanceof Error) || !err.message.includes("BUSYGROUP")) throw err;
  }
}
// Consumer: read events with acknowledgment
async function consumeEvents(
  stream: string,
  group: string,
  consumer: string,
  handler: (event: object) => Promise<void>
) {
  while (true) {
    const results = await redis.xreadgroup(
      "GROUP", group, consumer,
      "COUNT", 10,
      "BLOCK", 5000, // Wait up to 5s for new events
      "STREAMS", stream, ">"
    );
    if (!results) continue;
    for (const [, messages] of results) {
      for (const [id, fields] of messages) {
        const data = JSON.parse(fields[1]); // "data" field
        try {
          await handler(data);
          await redis.xack(stream, group, id);
        } catch (err) {
          console.error(`Failed to process ${id}:`, err);
          // Unacked messages stay in the group's pending list; a periodic
          // XAUTOCLAIM sweep is needed to reclaim and retry them
        }
      }
    }
  }
}
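One fragile spot above is `fields[1]`, which assumes `data` is always the first field in the entry. Redis returns stream fields as a flat `[key, value, key, value, ...]` array, so a small helper (ours, not part of ioredis) that folds it into an object is sturdier:

```typescript
// Redis Stream entries arrive as a flat array: ["data", "{...}", "type", ...].
// Fold the pairs into a plain object so consumers look fields up by name
// instead of by position.
function parseStreamFields(fields: string[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (let i = 0; i + 1 < fields.length; i += 2) {
    out[fields[i]] = fields[i + 1];
  }
  return out;
}
```

With this in place, the consumer reads `parseStreamFields(fields).data` and keeps working even if the producer reorders or adds fields.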
When to Graduate to RabbitMQ or NATS
Redis Streams works well for simple event routing. When you need:
- Complex routing (topic-based, header-based)
- Dead letter queues with sophisticated retry policies
- Priority queues
- Large message payloads (Redis caps any value at 512MB, and performance suffers long before that)
Then RabbitMQ or NATS is the right step. Both are dramatically simpler to operate than Kafka and handle millions of messages per day on a single node.
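To make "complex routing" concrete: RabbitMQ topic exchanges match dot-separated routing keys against binding patterns where `*` matches exactly one segment and `#` matches zero or more. A sketch of that matching logic (our own implementation of the rule, not RabbitMQ code):

```typescript
// RabbitMQ-style topic matching: "*" = exactly one segment,
// "#" = zero or more segments.
function topicMatches(pattern: string, routingKey: string): boolean {
  const pat = pattern.split(".");
  const key = routingKey.split(".");

  // match(i, j): does pat[i..] match key[j..]?
  function match(i: number, j: number): boolean {
    if (i === pat.length) return j === key.length;
    if (pat[i] === "#") {
      // "#" may match zero segments (skip it) or consume one key segment
      return match(i + 1, j) || (j < key.length && match(i, j + 1));
    }
    if (j === key.length) return false;
    if (pat[i] === "*" || pat[i] === key[j]) return match(i + 1, j + 1);
    return false;
  }
  return match(0, 0);
}
```

A service bound to `order.#` receives every order event, while one bound to `order.*.failed` sees only failures — routing logic that lives in the broker rather than in every consumer.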
Level 3: Event Streaming (10+ Services or Real-Time Needs)
Kafka or Redpanda enter the picture when you need:
- Event replay (reprocess historical events)
- Event sourcing (derive state from event log)
- Real-time stream processing
- Guaranteed ordering within partitions
For small teams that genuinely need streaming, Redpanda is the better choice. It is API-compatible with Kafka but ships as a single binary, with no JVM and no separate ZooKeeper ensemble to operate.
# docker-compose.yml for Redpanda (single-node, development)
services:
  redpanda:
    image: redpandadata/redpanda:v24.1.1
    command:
      - redpanda
      - start
      - --smp 1
      - --memory 1G
      - --overprovisioned
      - --kafka-addr PLAINTEXT://0.0.0.0:9092
    ports:
      - "9092:9092"
      - "8081:8081" # Schema registry
      - "9644:9644" # Admin API
Event Design Patterns
Regardless of which level you operate at, these patterns determine whether your event-driven architecture is maintainable:
1. Event Naming Conventions
// Use past-tense, domain-specific names
// Good:
"user.signup.completed"
"order.payment.failed"
"project.member.invited"
// Bad:
"createUser" // Imperative — this is a command, not an event
"USER_CREATED" // No domain context
"handle-signup" // Implementation detail in the name
2. Event Schema with Versioning
// Every event has a consistent envelope
interface EventEnvelope<T> {
  id: string; // Unique event ID (UUID)
  type: string; // Event name
  version: number; // Schema version
  timestamp: string; // ISO 8601
  source: string; // Which service produced this
  correlationId: string; // For tracing across services
  data: T; // The actual payload
}

// Example
const event: EventEnvelope<UserSignupData> = {
  id: "evt_abc123",
  type: "user.signup.completed",
  version: 2,
  timestamp: "2026-03-30T10:00:00Z",
  source: "auth-service",
  correlationId: "req_xyz789",
  data: {
    userId: "usr_456",
    email: "developer@example.com",
    plan: "starter",
  },
};
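The `version` field earns its keep when a schema changes: consumers can upcast old events to the current shape on read, so handlers only ever deal with one format. A sketch, assuming hypothetically that v1 signup events predate the `plan` field (both the scenario and the `"starter"` default are illustrative):

```typescript
interface UserSignupDataV2 {
  userId: string;
  email: string;
  plan: string;
}

// Upcast older payloads to the current (v2) schema on read. The "starter"
// default for missing plans is a made-up business rule for this sketch.
function upcastUserSignup(
  version: number,
  data: Record<string, unknown>
): UserSignupDataV2 {
  if (version >= 2) return data as unknown as UserSignupDataV2;
  // v1 events had no "plan" field — backfill the default
  return {
    userId: String(data.userId),
    email: String(data.email),
    plan: "starter",
  };
}
```

Upcasting on read means you never have to rewrite events already sitting in a stream — old producers and new consumers coexist.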
3. Idempotent Handlers
Events will be delivered more than once. Every handler must be idempotent:
// Idempotent handler using event ID deduplication
async function handleUserSignup(event: EventEnvelope<UserSignupData>) {
  // Check if we already processed this event
  const processed = await db.processedEvents.findUnique({
    where: { eventId: event.id },
  });
  if (processed) {
    console.log(`Event ${event.id} already processed, skipping`);
    return;
  }
  // Process the event
  await stripe.customers.create({
    email: event.data.email,
    metadata: { userId: event.data.userId },
  });
  // Mark as processed — a unique constraint on eventId closes the race
  // between two concurrent deliveries of the same event
  await db.processedEvents.create({
    data: { eventId: event.id, processedAt: new Date() },
  });
}
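Rather than repeating the check-then-act dance in every handler, the pattern can be wrapped once. An in-memory sketch of such a wrapper (our own helper; a production version would back the seen-set with the `processedEvents` table and its unique constraint):

```typescript
// Wrap any handler so a given event ID is processed at most once.
// In-memory Set for illustration only — it resets on restart, so real
// deployments need durable storage with a unique constraint.
function idempotent<T extends { id: string }>(
  handler: (event: T) => Promise<void>
): (event: T) => Promise<void> {
  const seen = new Set<string>();
  return async (event: T) => {
    if (seen.has(event.id)) return; // duplicate delivery — skip
    await handler(event);
    seen.add(event.id); // mark only after success, so failures still retry
  };
}
```

Note the ordering: marking the event *after* the handler succeeds means a crash mid-handler leads to a retry, not a silently dropped event.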
Monitoring Event-Driven Systems
The biggest operational challenge with EDA is visibility. When a signup handler fails silently, nobody knows until a user complains about not receiving their welcome email.
Essential metrics to track:
- Queue depth: How many unprocessed events are waiting? A growing queue indicates a consumer cannot keep up.
- Processing latency: Time from event publication to handler completion. Spikes indicate downstream service issues.
- Error rate per handler: Which handlers are failing, and how often?
- Dead letter queue size: How many events have exhausted all retries?
// Minimal monitoring with BullMQ
import { Queue } from "bullmq";

// "connection" is the shared ioredis instance from the emitter module,
// and "app" is the Express app
async function getQueueMetrics(queueName: string) {
  const queue = new Queue(queueName, { connection });
  const [waiting, active, completed, failed, delayed] = await Promise.all([
    queue.getWaitingCount(),
    queue.getActiveCount(),
    queue.getCompletedCount(),
    queue.getFailedCount(),
    queue.getDelayedCount(),
  ]);
  return { queueName, waiting, active, completed, failed, delayed };
}

// Expose as a /metrics endpoint for Prometheus
app.get("/metrics", async (req, res) => {
  const queues = ["user.signup", "order.created", "payment.processed"];
  const metrics = await Promise.all(queues.map(getQueueMetrics));
  let output = "";
  for (const m of metrics) {
    const name = m.queueName.replace(/\./g, "_");
    output += `queue_waiting{queue="${name}"} ${m.waiting}\n`;
    output += `queue_failed{queue="${name}"} ${m.failed}\n`;
    output += `queue_active{queue="${name}"} ${m.active}\n`;
  }
  res.type("text/plain").send(output);
});
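Metrics are only useful with thresholds attached. A simple health rule over the counts above can drive alerts before users notice anything (the helper and its default limits are illustrative, not recommendations — tune them per queue):

```typescript
interface QueueMetrics {
  queueName: string;
  waiting: number;
  active: number;
  failed: number;
}

// Flag queues that look unhealthy: a deep backlog means consumers can't
// keep up; accumulating failures mean a handler or downstream service is
// broken. Thresholds here are illustrative defaults.
function unhealthyQueues(
  metrics: QueueMetrics[],
  limits = { maxWaiting: 1000, maxFailed: 50 }
): string[] {
  return metrics
    .filter((m) => m.waiting > limits.maxWaiting || m.failed > limits.maxFailed)
    .map((m) => m.queueName);
}
```

Run this on a schedule and page when the list is non-empty — that one alert catches the "signup handler fails silently" scenario described above.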
When Not to Use Events
Event-driven architecture adds indirection. Not every operation benefits from it:
- Synchronous user flows: If the user needs an immediate response that depends on the result (e.g., payment confirmation before showing a receipt), keep it synchronous.
- Simple CRUD with no side effects: Creating a note in a note-taking app does not need an event system.
- Operations requiring strong consistency: If two operations must succeed or fail together, use a database transaction, not events.
The right amount of event-driven architecture for most small teams is selective: use events for side effects, keep core business logic synchronous. You get 80% of the benefits with 20% of the complexity.
