Architecture, senior software engineering, and technical execution for demanding products.

Event-driven architecture

Design event flows that are useful, observable, and proportionate.

LRJI helps teams use event-driven architecture to solve real integration, decoupling, and resilience problems without turning the platform into a distributed black box.

The work covers domain events, integration events, Kafka, RabbitMQ, contracts, ownership, retries, ordering, idempotency, and observability. The point is not to put events everywhere, but to choose the flows where asynchronous design actually pays for itself.

Signals

When event-driven design becomes an architecture topic

Event-driven architecture is useful when it clarifies real coordination. If it only hides coupling behind a queue, it becomes distributed debt.

01

Events exist, but no one really owns the flows

Producers, consumers, payloads, compatibility, and success criteria are not clearly assigned.

02

Retries, ordering, and idempotency become incidents

Failure cases are handled after the fact, when they should be part of the architecture contract from the start.

03

Services are decoupled in theory, coupled in production

Teams no longer call each other over HTTP, but remain blocked by opaque messages, weak schemas, or temporal dependencies.

04

Kafka or RabbitMQ arrives before the problem

The technology sets the direction before the business flows, responsibilities, and operating cost are understood.

05

Observability does not explain the business journey

Logs, metrics, and traces exist, but nobody can follow one event end to end with confidence.
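
One way to make "following one event end to end" concrete is to carry a correlation id through every message in a business journey. The sketch below is illustrative only: the `Event` envelope, the `caused_by` and `trace` helpers, and the event names are hypothetical, and the in-memory list stands in for whatever log or trace store the platform actually uses.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """Hypothetical envelope: field names are assumptions for illustration."""
    name: str
    correlation_id: str        # shared by every event in one business journey
    causation_id: str | None   # id of the event that directly caused this one
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def start_journey(name: str) -> Event:
    """First event of a journey: mint a fresh correlation id."""
    return Event(name=name, correlation_id=str(uuid.uuid4()), causation_id=None)


def caused_by(parent: Event, name: str) -> Event:
    """Follow-up event: keep the correlation id, record the direct cause."""
    return Event(name=name, correlation_id=parent.correlation_id,
                 causation_id=parent.event_id)


def trace(log: list[Event], correlation_id: str) -> list[str]:
    """Return the names of all events in one journey, in log order."""
    return [e.name for e in log if e.correlation_id == correlation_id]


# Two interleaved journeys written to the same in-memory log.
order_a = start_journey("OrderPlaced")
order_b = start_journey("OrderPlaced")
paid_a = caused_by(order_a, "PaymentCaptured")
shipped_a = caused_by(paid_a, "ShipmentRequested")
log = [order_a, order_b, paid_a, shipped_a]

journey_a = trace(log, order_a.correlation_id)
```

Filtering on one correlation id returns the three events of the first order and ignores the second order, even though both share the same log. The causation id additionally records which event triggered which.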

Scope

What needs to become explicit

The work connects the domain model, domain boundaries, integrations, and operations. An event flow is a product decision as much as a technical one.

Business flows and events

Identify real business events, integration events, commands, states, and transitions that deserve a contract.

Contracts and ownership

Define payloads, versioning, compatibility, producers, consumers, responsibilities, and change rules.

Resilience and operations

Specify how critical flows handle retries, dead letters, ordering, idempotency, and backpressure, and which alerting and runbooks support them.

Progressive introduction

Introduce asynchronous design on the right flows, with measurable rollout and no big-bang platform migration.
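
As an illustration of what an explicit contract can look like, here is a minimal sketch of a versioned event with a tolerant reader. Everything in it is an assumption made for the example: the event type `customer.email_changed`, the `identity-team` owner, the `verified` field added in version 2, and the rule that unknown fields are ignored.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

# Hypothetical contract: names, owner, and versions are illustrative only.
EVENT_TYPE = "customer.email_changed"
OWNER = "identity-team"            # team accountable for changes to this contract
SUPPORTED_VERSIONS = {1, 2}


@dataclass(frozen=True)
class CustomerEmailChanged:
    customer_id: str
    new_email: str
    verified: bool  # added in version 2; version 1 producers do not send it


def parse(message: dict[str, Any]) -> CustomerEmailChanged:
    """Tolerant reader: accept known versions, ignore unknown fields,
    and default fields that older producers do not send."""
    if message.get("type") != EVENT_TYPE:
        raise ValueError(f"unexpected event type: {message.get('type')!r}")
    version = message.get("version")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported contract version: {version!r}")
    payload = message["payload"]
    return CustomerEmailChanged(
        customer_id=payload["customer_id"],
        new_email=payload["new_email"],
        verified=payload.get("verified", False),
    )


v1 = {"type": EVENT_TYPE, "version": 1,
      "payload": {"customer_id": "c-1", "new_email": "a@example.com"}}
v2 = {"type": EVENT_TYPE, "version": 2,
      "payload": {"customer_id": "c-1", "new_email": "a@example.com",
                  "verified": True, "source": "web"}}

old = parse(v1)   # missing field falls back to its default
new = parse(v2)   # extra "source" field is ignored
```

The point of the sketch is the change rule, not the parsing code: adding an optional field is compatible, while an unknown version is rejected loudly instead of being half-processed.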

Position

The LRJI position

Event-driven architecture should make the system more legible for teams. If coordination becomes more opaque, the design has failed.

Start from the business flow

Do not start from the broker. Start from the journey, invariants, acceptable delay, and what must remain consistent.

Do not treat events as magic APIs

A message is still a public contract. It needs versioning, compatibility, responsibility, and a failure strategy.

Design failure as carefully as success

Retries, duplication, temporary loss, ordering, and replay must be discussed before the flow becomes critical.

Keep architecture proportionate

A modular monolith with a few well-chosen asynchronous flows beats a distributed platform nobody can reason about.
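
Designing failure explicitly can start very small. The following sketch, under stated assumptions, shows a consumer that retries transient failures a bounded number of times and routes everything else to a dead-letter list. The `RetryingConsumer` class, the `TransientError` type, and the in-memory dead-letter list are hypothetical; a real broker would provide its own redelivery and dead-letter mechanisms.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


class TransientError(Exception):
    """Failure worth retrying (timeout, temporary unavailability)."""


@dataclass
class RetryingConsumer:
    handler: Callable[[dict], None]
    max_attempts: int = 3
    dead_letters: list[tuple[dict, str]] = field(default_factory=list)

    def consume(self, message: dict) -> bool:
        """Return True if handled, False if the message was dead-lettered."""
        last_error = "no attempts made"
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.handler(message)
                return True
            except TransientError as exc:
                # A real consumer would back off here before the next attempt.
                last_error = f"attempt {attempt}: {exc}"
            except Exception as exc:
                # Permanent failures are not retried: retrying cannot fix them.
                self.dead_letters.append((message, f"permanent: {exc}"))
                return False
        self.dead_letters.append((message, last_error))
        return False


calls = {"n": 0}


def flaky(message: dict) -> None:
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("broker timeout")


def broken(message: dict) -> None:
    """Always fails with a non-retryable error."""
    raise ValueError("payload is missing 'order_id'")


ok = RetryingConsumer(flaky).consume({"order_id": "o-1"})
bad_consumer = RetryingConsumer(broken)
bad = bad_consumer.consume({})
```

Keeping the reason next to each dead-lettered message is a deliberate choice in this sketch: it gives the team something to act on in a runbook when deciding whether to replay or discard.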

Format

How the engagement works

The work starts from real flows and ends with contracts, decisions, and a first rollout path the team can execute.

  1. Map the flows

    Business journeys, systems, messages, databases, APIs, errors, and responsibilities are represented together.

  2. Choose the right contracts

    Useful events are separated from commands, notifications, derived state, and implementation details.

  3. Define reliability rules

    Idempotency, ordering, retries, dead letters, observability, and rollback criteria become explicit.

  4. Prove it on one flow

    One high-impact flow becomes the reference for conventions, tests, monitoring, and broader decisions.
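
To show what explicit idempotency and ordering rules might look like on a reference flow, here is a minimal sketch of a consumer that applies each event once and in sequence per account. The `AccountProjection` class, the event fields (`event_id`, `account_id`, `sequence`, `amount`), and the four outcomes are assumptions for the example. The in-memory state would not survive a restart; a real consumer would persist it together with the state change.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AccountProjection:
    """Hypothetical consumer state: balances keyed by account id."""
    balances: dict[str, int] = field(default_factory=dict)
    seen_event_ids: set[str] = field(default_factory=set)
    last_sequence: dict[str, int] = field(default_factory=dict)

    def apply(self, event: dict) -> str:
        """Apply an event once and in order; return what happened to it."""
        account = event["account_id"]
        if event["event_id"] in self.seen_event_ids:
            return "duplicate"   # idempotency: redelivery has no effect
        expected = self.last_sequence.get(account, 0) + 1
        if event["sequence"] < expected:
            return "stale"       # older than what was already applied
        if event["sequence"] > expected:
            return "gap"         # arrived early: park or retry, do not apply
        self.balances[account] = self.balances.get(account, 0) + event["amount"]
        self.seen_event_ids.add(event["event_id"])
        self.last_sequence[account] = event["sequence"]
        return "applied"


projection = AccountProjection()
e1 = {"event_id": "e-1", "account_id": "acc-1", "sequence": 1, "amount": 100}
e2 = {"event_id": "e-2", "account_id": "acc-1", "sequence": 2, "amount": -30}

# e1 is delivered twice, as an at-least-once broker may do.
results = [projection.apply(e) for e in (e1, e1, e2)]
early = projection.apply(
    {"event_id": "e-4", "account_id": "acc-1", "sequence": 4, "amount": 5}
)
```

Returning a named outcome instead of silently dropping events makes each rule testable and gives monitoring a counter to watch per outcome.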

Outputs

What the team gets

Outputs should reduce ambiguity and make flows usable by the teams that run them.

  • Business and technical flow map with producers, consumers, and responsibilities.
  • Event contracts: payloads, versioning, compatibility, and evolution rules.
  • Decisions on Kafka, RabbitMQ, REST, tRPC, or simpler alternatives based on the flows.
  • Resilience strategy: retries, idempotency, ordering, dead letters, and replay.
  • Observability model to follow critical journeys end to end.
  • Progressive rollout plan with first flows, risks, tests, and success criteria.

Proof

Relevant experience

The references show flows where reliability, contracts, and the level of distribution had to be handled together.

Kafka

Luxury: identity and customer-data synchronization

Kafka flows in a CIAM, MDM, and customer-data ecosystem with high expectations for consistency and reliability.

Read the luxury case

7 -> 1

Retail: distribution brought back to proportion

Migration of an over-distributed system toward a simpler architecture while preserving the right contracts and decoupling points.

Read the retail case

Identity

Banking: integration around a critical component

Backend architecture around Keycloak with provider abstraction, application boundaries, and more testable integrations.

Read the authentication case

Possible next steps

Depending on the dominant problem

Event-driven design can be the central topic or one part of broader work on boundaries, migration, or execution.

When legacy dictates the flows

Modernize integrations progressively without stopping production.

See legacy migration

FAQ

Should every distributed system become event-driven?

No. Event-driven design is useful when it solves a real coordination problem. By default, keep the simplest path that makes the flow legible, reliable, and operable.

Does this cover Kafka and RabbitMQ?

Yes. LRJI can work with Kafka, RabbitMQ, domain events, integration events, consumers, retries, ordering, dead letters, and the related observability.

Can this fix an already fragile event-driven architecture?

Yes. The work often starts with contracts, ownership, critical flows, and the failure cases already creating incidents or slowing delivery.

Next step

Bring a critical flow, its producers, its consumers, and its recurring failures.

LRJI turns that context into clearer contracts, explicit responsibilities, and a proportionate event-driven path.