Message - Queue - High Level System Design Document

We implemented a lightweight MySQL‑backed queue to meet MVP needs—low throughput, minimal overhead, and no extra cost—while retaining full control over schema and retry behavior. For local development, it runs entirely on a developer’s machine with no external dependencies, making setup and debugging simple.


📄 1. High Level System Design Document

1.1 Overview

The Echo platform is a modular Retrieval-Augmented Generation (RAG) and document analysis suite, deployed on cPanel with a microservices architecture. Each service is independently deployable, stateless, and secured using JWTs issued by an identity-backend service.

Major Services

  • API Gateway: User-facing, handles uploads, authentication, and orchestration.

  • Worker: Performs background document processing (e.g., parsing, embedding).

  • Message-Queue: Guarantees delivery of cross-service events using a durable MySQL-backed queue.

  • Logging-Backend: Centralized structured log/event sink.

  • Identity-Backend: OAuth2/OIDC authority, issues JWTs for users/services.

  • Notification Service (future): Sends emails, webhooks, etc. in response to events.


1.2 Principles

  • Loose coupling, strong contracts: All services interact via clean REST APIs, authenticated by JWT, never direct DB calls.

  • Event-driven, eventually consistent: Message-queue guarantees delivery and enables fan-out/eventual consistency.

  • Observability: All critical events, errors, and actions are logged centrally.

  • cPanel-friendly: All components are deployable as Flask apps with Passenger, using a shared MySQL instance where needed.


1.3 Data Flow

User → API Gateway → Worker → Message-Queue → Subscriber(s) (e.g., Notifier, Analytics, Logging)

  • All sensitive flows (login, uploads, polling) require JWTs issued by the identity-backend.

  • Non-blocking or async flows use the message-queue for reliable, decoupled delivery.


📊 2. Component & Deployment Diagram

2.1 Component Diagram (PlantUML)



@startuml
package "Echo Microservices" {
  [API Gateway] -down-> [Worker]
  [Worker] -down-> [Message-Queue]
  [API Gateway] -right-> [Message-Queue] : (direct enqueue OK)
  [Message-Queue] -down-> [Subscriber Service(s)]
  [Logging-Backend]
  [Identity-Backend]
}
database "MySQL (cPanel)" {
  [message_queue Table]
}

[API Gateway] ..> [Identity-Backend] : OIDC Login/JWT
[API Gateway] ..> [Logging-Backend] : Log events
[Worker] ..> [Logging-Backend] : Log events
[Message-Queue] ..> [Logging-Backend] : Log events
[Message-Queue] --> [message_queue Table]
[Logging-Backend] --> [MySQL (optional, for logs)]
@enduml

🔎 3. Major Sequence Diagrams

3.1 Synchronous Auth Flow



@startuml
actor User
participant "API Gateway"
participant "Identity-Backend"

User -> "API Gateway" : Login (OIDC)
"API Gateway" -> "Identity-Backend" : /authorize, /token
"Identity-Backend" --> "API Gateway" : id_token (JWT)
"API Gateway" --> User : Set session, allow actions
@enduml

3.2 Document Upload + Async Processing



@startuml
actor User
participant "API Gateway"
participant Worker
participant "Message-Queue"
database "message_queue Table"
participant "Logging-Backend"
participant "Subscriber Service(s)"

User -> "API Gateway" : Upload doc (JWT)
"API Gateway" -> Worker : Queue job (DB or POST)
Worker -> Worker : Parse doc
Worker -> "Message-Queue" : POST /enqueue (doc_processed event, JWT)
"Message-Queue" -> "message_queue Table" : INSERT (status=NEW)
"Message-Queue" -> "Logging-Backend" : Log enqueued event (JWT)
"Message-Queue" -> "Subscriber Service(s)" : POST webhook (event)
"Subscriber Service(s)" -> "Logging-Backend" : Log notification sent
@enduml

3.3 Guaranteed Delivery & Dead Letter Handling



@startuml
participant Worker
participant "Message-Queue"
database "message_queue Table"
participant "Subscriber Service"

Worker -> "Message-Queue" : POST /enqueue
"Message-Queue" -> "message_queue Table" : INSERT event (status=NEW)
"Message-Queue" -> "Subscriber Service" : POST webhook (event)
alt Success
  "Subscriber Service" --> "Message-Queue" : 200 OK
  "Message-Queue" -> "message_queue Table" : UPDATE status=DONE
else Failure
  "Subscriber Service" --> "Message-Queue" : Error/timeout
  "Message-Queue" -> "message_queue Table" : retry_count++
  alt Too many retries
    "Message-Queue" -> "message_queue Table" : UPDATE status=FAILED
  end
end
@enduml

📑 4. API Contract for message-queue

Base URL:

https://aurorahours.com/mq/

Endpoints:

Method Path Auth Description
POST /enqueue JWT Enqueue a new event/message
POST /poll JWT Claim a batch of NEW messages (lock them)
POST /ack JWT Mark a message as processed (DONE)
POST /fail JWT Mark a message as failed
GET /dead-letter JWT View messages that failed permanently
GET /log-test None Test logging to logging-backend
GET /healthz None Health check (no auth)

/enqueue

POST /enqueue
Headers: Authorization: Bearer <JWT>
Body: { "event_type": "doc_processed", "payload": { ... } }
Response: { "ok": true, "message_id": 42 }

/poll

POST /poll
Headers: Authorization: Bearer <JWT>
Body: { "limit": 10 }
Response: { "messages": [ ... ] }

/ack

POST /ack
Headers: Authorization: Bearer <JWT>
Body: { "message_id": 42 }
Response: { "ok": true }

/fail

POST /fail
Headers: Authorization: Bearer <JWT>
Body: { "message_id": 42, "error": "Description" }
Response: { "ok": true }

/dead-letter

GET /dead-letter
Headers: Authorization: Bearer <JWT>
Response: { "failed_messages": [ ... ] }

/log-test

GET /log-test
Response: { "ok": true, "msg": "Logged to centralized logging backend (check logs UI)!" }

/healthz

GET /healthz
Response: "OK"

📈 5. Security & Observability Patterns

  • JWT-Only Service-to-Service Auth: All calls must present a short-lived JWT from the identity-backend. Audience and issuer are strictly validated.

  • Centralized Logging: All key actions and failures are posted to logging-backend (via JWT) and also printed to stderr for local logs.

  • No DB cross-access: Only message-queue talks to its table; all access is via its REST API.


🗂️ 6. Data Model (message_queue Table)

CREATE TABLE message_queue (
    id INT AUTO_INCREMENT PRIMARY KEY,
    event_type VARCHAR(128) NOT NULL,
    payload TEXT NOT NULL,
    status ENUM('NEW','IN_PROGRESS','DONE','FAILED') DEFAULT 'NEW',
    retry_count INT DEFAULT 0,
    last_error TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;

📌 7. Operational Notes

  • Deployment: Each service can be deployed as a separate Flask app with passenger_wsgi.py.

  • Config: All secrets and config are loaded from .env (never hardcoded).

  • Migrations: Use dal/queue.py's init_schema() for safe bootstrapping.

  • Scaling: If poll frequency or event rate grows, you can swap the DAL for a real queue (RabbitMQ, SQS, etc.) with minimal code change.

  • Monitoring: Health checks are available at /healthz for every service.


📝 8. Example Integration Flow (API Gateway → Worker → Message-Queue → Logging)

  1. API Gateway receives user request, authenticates via JWT.

  2. Worker does background work, finishes, posts to /enqueue on Message-Queue with a JWT signed by identity-backend.

  3. Message-Queue writes event, logs action to Logging-Backend, and delivers event to subscribers.

  4. All logs and audit trails are viewable in Logging-Backend UI.


📚 Summary Table

Service Role Auth DB Scope Logs To
API Gateway User/API entrypoint JWT OIDC Own jobs table Logging-Backend
Worker Processing backend JWT Own tables Logging-Backend
Message-Queue Event bus JWT Own queue table Logging-Backend
Logging-Backend Log sink/viewer JWT (opt) Own logs table Itself
Identity-Backend Auth/JWT provider N/A Own Logging-Backend

You now have:

  • Executive summary (system doc)

  • Component & deployment diagram

  • Sequence diagrams for all major flows

  • Complete API contract and security notes

  • Table schema

  • Operational best practices

If you need any of these as standalone PDF/markdown files, or want editable PlantUML or draw.io diagram source, just say the word!

😂 That’s the energy! You want "one sequence diagram per real interaction" for a professional system review or onboarding—this is absolutely how senior/principal engineers document mature systems.

Below is a clear, labeled, and modular sequence diagram for each major interaction in your stack.
All are PlantUML format for easy copy/paste (e.g., into PlantUML Online).
I’ll keep things modular, so you (or anyone joining) can see at a glance what happens in each flow, including auth and logging steps.


📈 Key Sequence Diagrams (All Interactions)


1. User Login / Auth Code Flow (OIDC)



@startuml
actor User
participant "API Gateway"
participant "Identity-Backend"
participant "Logging-Backend"

User -> "API Gateway": Visit /login
"API Gateway" -> "Identity-Backend": /authorize (redirect, with client_id, state, etc.)
User -> "Identity-Backend": Enter username/password, submit login
"Identity-Backend" -> "Identity-Backend": Authenticate user
"Identity-Backend" -> "Logging-Backend": unified_log(INFO, "User authenticated")
"Identity-Backend" -> User: Redirect back to API Gateway with code+state

User -> "API Gateway": /callback?code=...
"API Gateway" -> "Identity-Backend": /token (exchange code for id_token/JWT)
"Identity-Backend" -> "Logging-Backend": unified_log(INFO, "Token issued")
"Identity-Backend" --> "API Gateway": id_token (JWT)
"API Gateway" -> "Logging-Backend": unified_log(INFO, "User logged in")
"API Gateway" --> User: Set session, show UI
@enduml

2. User Uploads Document



@startuml
actor User
participant "API Gateway"
participant "Logging-Backend"

User -> "API Gateway": POST /upload (id_token JWT)
"API Gateway" -> "Logging-Backend": unified_log(INFO, "Received upload")
"API Gateway" -> "API Gateway": Save file, enqueue job in DB
"API Gateway" --> User: Upload successful (job queued)
@enduml

3. Worker Polls for Job and Processes



@startuml
participant Worker
participant "API Gateway"
participant "Logging-Backend"

Worker -> "API Gateway": Poll for jobs (JWT)
"API Gateway" -> "Logging-Backend": unified_log(DEBUG, "Worker polled for jobs")
"API Gateway" --> Worker: Return queued job(s)
Worker -> Worker: Process document (parse/OCR/embed)
Worker -> "Logging-Backend": unified_log(INFO, "Doc processed")
@enduml

4. Worker Posts Event to Message-Queue



@startuml
participant Worker
participant "Message-Queue"
participant "Logging-Backend"
participant "Identity-Backend"

Worker -> "Identity-Backend": Request JWT for 'mq' audience (service-to-service)
"Identity-Backend" --> Worker: Return JWT

Worker -> "Message-Queue": POST /enqueue (event_type, payload, JWT)
"Message-Queue" -> "Logging-Backend": unified_log(INFO, "Event enqueued")
"Message-Queue" -> "Message-Queue": Write event to MySQL queue
"Message-Queue" --> Worker: {ok:true, message_id}
@enduml

5. Subscriber Service Polls/Receives Events



@startuml
participant "Subscriber Service"
participant "Message-Queue"
participant "Logging-Backend"
participant "Identity-Backend"

"Subscriber Service" -> "Identity-Backend": Request JWT for 'mq' audience
"Identity-Backend" --> "Subscriber Service": Return JWT

"Subscriber Service" -> "Message-Queue": POST /poll (JWT)
"Message-Queue" -> "Logging-Backend": unified_log(DEBUG, "Polled N messages")
"Message-Queue" --> "Subscriber Service": [messages]

"Subscriber Service" -> "Message-Queue": POST /ack (message_id, JWT)
"Message-Queue" -> "Logging-Backend": unified_log(INFO, "Message ack'd")
"Message-Queue" --> "Subscriber Service": {ok:true}
@enduml

6. Centralized Logging for Every Service



@startuml
participant "Any Service"
participant "Logging-Backend"
participant "Identity-Backend"

"Any Service" -> "Identity-Backend": Get JWT for 'logging-service' audience
"Identity-Backend" --> "Any Service": Return JWT

"Any Service" -> "Logging-Backend": POST /log (JWT)
"Logging-Backend" -> "Logging-Backend": Write log to DB
"Logging-Backend" --> "Any Service": {ok:true}
@enduml

7. Health Check Monitoring



@startuml
participant "Health Checker"
participant "API Gateway"
participant "Worker"
participant "Message-Queue"

"Health Checker" -> "API Gateway": GET /healthz
"API Gateway" --> "Health Checker": 200 OK

"Health Checker" -> "Worker": GET /healthz
"Worker" --> "Health Checker": 200 OK

"Health Checker" -> "Message-Queue": GET /healthz
"Message-Queue" --> "Health Checker": 200 OK
@enduml

8. Dead Letter Handling (Failed Delivery)



@startuml
participant "Message-Queue"
database "MySQL (queue)"
participant "Subscriber Service"
participant "Logging-Backend"

"Message-Queue" -> "Subscriber Service": POST webhook/event
alt Failure (timeout/error)
    "Subscriber Service" --x "Message-Queue": No response or error
    "Message-Queue" -> "MySQL (queue)": retry_count++
    alt retry_count > MAX
        "Message-Queue" -> "MySQL (queue)": status=FAILED
        "Message-Queue" -> "Logging-Backend": unified_log(ERROR, "Dead-letter message")
    end
end
@enduml

9. Service-to-Service Auth Failure



@startuml
participant "Caller Service"
participant "Target Service"
participant "Logging-Backend"

"Caller Service" -> "Target Service": POST (JWT)
alt Invalid JWT (expired, bad signature, wrong audience, etc)
    "Target Service" -> "Logging-Backend": unified_log(WARN, "Auth failed")
    "Target Service" --> "Caller Service": 401 Unauthorized
end
@enduml

10. Logging Your Mom (You Asked! 😆)



@startuml
participant "You"
participant "Logging-Backend"

"You" -> "Logging-Backend": POST /log (level=INFO, msg="My mom is awesome")
"Logging-Backend" -> "Logging-Backend": Write log
"Logging-Backend" --> "You": {ok:true}
@enduml

How to Use/Expand

  • Paste any of these PlantUML blocks into PlantUML Online for instant diagrams.

  • Add or request more flows (e.g., password reset, adding a new service, monitoring error recovery) as your system evolves.

  • Feel free to request editable .puml or .drawio files if you want to keep a living architecture repo!


Want these exported as a PDF, or have a specific sequence you want detailed further? Just say the word.
Want to document “integration with third-party webhook” or “analytics event flows”?
Just ask—your wish is my sequence diagram!

Absolutely, let’s diagram the living hell out of message-queue—every perspective, every key interaction, every integration.
This is what principal-level documentation looks like.
Below: Super-detailed diagrams you can use for onboarding, audits, or explaining to your CTO, with PlantUML source for each.


1. High-Level Message-Queue Ecosystem Map



@startuml
package "Echo Platform" {
  [API Gateway]
  [Worker]
  [Message-Queue]
  [Logging-Backend]
  [Identity-Backend]
  [Subscriber Service]
}
database "MySQL (mq)" as MQ

[API Gateway] --> [Message-Queue] : POST /enqueue (events, JWT)
[Worker] --> [Message-Queue] : POST /enqueue (events, JWT)
[Message-Queue] --> MQ : insert/read/update message_queue table
[Message-Queue] --> [Subscriber Service] : POST /webhook or /poll (JWT)
[Message-Queue] --> [Logging-Backend] : POST /log (JWT)
[API Gateway] ..> [Identity-Backend] : OIDC/JWT
[Worker] ..> [Identity-Backend] : OIDC/JWT
[Message-Queue] ..> [Identity-Backend] : JWT verify for inbound, and to log
[Logging-Backend] ..> MQ : (optional, logs in DB)
@enduml

2. Component View: Message-Queue Internals



@startuml
package "Message-Queue" {
    [Flask API] --> [Auth Middleware (JWT Required)]
    [Flask API] --> [DAL: MessageQueueDAL]
    [DAL: MessageQueueDAL] --> [MySQL: message_queue Table]
    [Flask API] --> [Logger Utils]
    [Logger Utils] --> [Logging-Backend] : POST /log (JWT)
}
@enduml

3. Full Life-Cycle: Enqueue to Delivery (Fan-Out)



@startuml
participant "Producer Service (API Gateway/Worker)" as Producer
participant "Message-Queue API"
database "MySQL (message_queue)"
participant "Subscriber Service"
participant "Logging-Backend"
participant "Identity-Backend"

Producer -> "Identity-Backend": Get JWT (aud=mq)
Producer -> "Message-Queue API": POST /enqueue (event_type, payload, JWT)
"Message-Queue API" -> "Identity-Backend": Validate JWT
"Message-Queue API" -> "MySQL (message_queue)": INSERT event (status=NEW)
"Message-Queue API" -> "Logging-Backend": unified_log(INFO, "Enqueued event")
...
"Subscriber Service" -> "Identity-Backend": Get JWT (aud=mq)
"Subscriber Service" -> "Message-Queue API": POST /poll (JWT)
"Message-Queue API" -> "Identity-Backend": Validate JWT
"Message-Queue API" -> "MySQL (message_queue)": SELECT * WHERE status=NEW
"Message-Queue API" -> "MySQL (message_queue)": UPDATE status=IN_PROGRESS
"Message-Queue API" -> "Subscriber Service": Return messages
"Subscriber Service" -> "Message-Queue API": POST /ack (message_id, JWT)
"Message-Queue API" -> "MySQL (message_queue)": UPDATE status=DONE
"Message-Queue API" -> "Logging-Backend": unified_log(INFO, "Message ack'd")
@enduml

4. Failure Handling & Dead-Letter Pattern



@startuml
participant "Message-Queue"
database "MySQL (message_queue)"
participant "Subscriber Service"
participant "Logging-Backend"

"Message-Queue" -> "Subscriber Service": POST /webhook or provide via /poll
alt Delivery Fails (timeout, error)
    "Subscriber Service" --x "Message-Queue": No response or error
    "Message-Queue" -> "MySQL (message_queue)": retry_count++
    alt retry_count > MAX
        "Message-Queue" -> "MySQL (message_queue)": UPDATE status=FAILED
        "Message-Queue" -> "Logging-Backend": unified_log(ERROR, "Dead-lettered message")
    else retry_count <= MAX
        "Message-Queue" -> "Logging-Backend": unified_log(WARN, "Retrying delivery")
    end
end
@enduml

5. Centralized Logging from Message-Queue



@startuml
participant "Message-Queue"
participant "Identity-Backend"
participant "Logging-Backend"

"Message-Queue" -> "Identity-Backend": Get JWT for logging-backend
"Identity-Backend" --> "Message-Queue": Return JWT
"Message-Queue" -> "Logging-Backend": POST /log (JWT, event data)
"Logging-Backend" -> "Logging-Backend": Write to logs DB
@enduml

6. API Security: JWT Validation on Inbound



@startuml
participant "Producer Service"
participant "Message-Queue"
participant "Identity-Backend"

"Producer Service" -> "Message-Queue": POST /enqueue (JWT)
"Message-Queue" -> "Identity-Backend": Validate JWT (issuer, aud, exp)
alt Valid
    "Identity-Backend" --> "Message-Queue": Valid
    "Message-Queue" -> ...: Continue processing
else Invalid
    "Identity-Backend" --> "Message-Queue": Invalid
    "Message-Queue" -> "Producer Service": 401 Unauthorized
end
@enduml

7. Subscriber as Webhook Receiver



@startuml
participant "Message-Queue"
participant "Subscriber Service"
participant "Logging-Backend"

"Message-Queue" -> "Subscriber Service": POST /webhook (event payload, JWT)
alt Success
    "Subscriber Service" -> "Message-Queue": 200 OK
    "Message-Queue" -> "Logging-Backend": unified_log(INFO, "Webhook delivered")
else Failure
    "Subscriber Service" --x "Message-Queue": Timeout/Error
    "Message-Queue" -> "Logging-Backend": unified_log(WARN, "Webhook delivery failed")
end
@enduml

8. Sequence: Log-Test Endpoint



@startuml
participant "User/HealthChecker"
participant "Message-Queue"
participant "Logging-Backend"

"User/HealthChecker" -> "Message-Queue": GET /log-test
"Message-Queue" -> "Logging-Backend": POST /log ("Log test from message-queue")
"Logging-Backend" --> "Message-Queue": {ok: true}
"Message-Queue" --> "User/HealthChecker": {ok: true, msg: ...}
@enduml

9. Sequence: Polling Pattern



@startuml
participant "Subscriber"
participant "Message-Queue"
database "MySQL (message_queue)"

loop Every N seconds
    "Subscriber" -> "Message-Queue": POST /poll (JWT, limit)
    "Message-Queue" -> "MySQL (message_queue)": SELECT * WHERE status=NEW LIMIT N
    "Message-Queue" -> "MySQL (message_queue)": UPDATE status=IN_PROGRESS
    "Message-Queue" --> "Subscriber": [messages]
end
@enduml

10. End-to-End System Use Case Map



@startuml
actor User
participant "API Gateway"
participant Worker
participant "Message-Queue"
database "MySQL (mq)"
participant "Subscriber"
participant "Logging-Backend"
participant "Identity-Backend"

User -> "API Gateway": POST /upload (JWT)
"API Gateway" -> "Logging-Backend": unified_log(INFO, "upload")
"API Gateway" -> Worker: Queue job (DB or POST)
Worker -> Worker: Parse/process file
Worker -> "Message-Queue": POST /enqueue (doc_processed, JWT)
"Message-Queue" -> "MySQL (mq)": INSERT event (status=NEW)
"Message-Queue" -> "Logging-Backend": unified_log(INFO, "Event enqueued")
"Subscriber" -> "Message-Queue": POST /poll (JWT)
"Message-Queue" -> "MySQL (mq)": SELECT * WHERE status=NEW
"Message-Queue" -> "MySQL (mq)": UPDATE status=IN_PROGRESS
"Message-Queue" -> "Subscriber": [messages]
"Subscriber" -> "Message-Queue": POST /ack (message_id, JWT)
"Message-Queue" -> "MySQL (mq)": UPDATE status=DONE
"Message-Queue" -> "Logging-Backend": unified_log(INFO, "Message ack'd")
@enduml

How to Use This

  • Paste each PlantUML block into PlantUML Online or VSCode extension for instant visuals.

  • Keep these diagrams in a /docs/architecture/ folder or as markdown in your monorepo for onboarding.

  • Request new use-case diagrams as your platform expands (“add new subscriber type,” “cross-tenant eventing,” “external integrations”).


Separation of concerns

Is it a job or a message?
  • A job = something to be processed asynchronously

  • A message = data payload representing that job in the queue

The difference is mostly naming + scope:

  • In Echo API Gateway, jobs table = uploaded documents awaiting parsing/embedding.

  • jobs table in API Gateway → evolves into a batch job tracker for document parsing, embedding, or any other asynchronous processing tasks.

  • In message-queue, message_queue table = generic queue for any event/job (not just documents).

  • message_queue table → dedicated to real-time message passing / pub-sub style events.

Keeping them separate (even though both hold “jobs”) lets you:

  • Scale the message queue independently (high-throughput, lightweight messages).

  • Grow the jobs table with richer metadata (owner, file references, retries, statuses).

  • Avoid coupling microservices — API Gateway can enqueue a message into message_queue without sharing schema.

Comments

Popular posts from this blog

Feature: Audit log for one login, and identity service

Getting started - Build your data science lab environment

QA - Run #1 - Results