Server Architecture Overview
This document outlines the architecture and operation of the WebSocket relay server found in api-relay-server/src/server.ts. The system acts as an intermediary between browser extensions and an HTTP-based API, routing JSON messages bi-directionally, and managing request flow to prevent overloading the browser extension.
🌐 Server Layers
graph TD
subgraph HTTP Layer
A[Express App] --> B{/v1/chat/completions}
end
subgraph Request Handling Logic
B -- Request --> C{Extension Busy?}
C -- No --> D[processRequest()]
C -- Yes --> E{Behavior: Drop or Queue?}
E -- Drop --> F[Respond 429]
E -- Queue --> G[requestQueue]
G -- Dequeue --> D
end
subgraph WebSocket Communication
D -- SEND_CHAT_MESSAGE --> H[activeConnections[0]]
H -- CHAT_RESPONSE_* --> I[WebSocket Server]
I -- Resolve/Reject --> D
end
D --> J[pendingRequests Map]
D --> K[finishProcessingRequest()]
subgraph Admin UI & Config
L[Express App] --> M[/admin & /v1/admin/*]
M <--> N[server-config.json]
end
📁 Core File
server.ts: Main file where the entire Express server, WebSocket infrastructure, request queuing, and admin interface logic is defined.
🧩 Components
1. Express HTTP API
- /v1/chat/completions: Accepts OpenAI-compatible requests.
  - Implements logic to check if a browser extension is busy (via activeExtensionProcessingId).
  - Based on the newRequestBehavior setting ('queue' or 'drop'):
    - Queue: Adds the incoming request to requestQueue if the extension is busy. The HTTP response is deferred.
    - Drop: Responds with 429 Too Many Requests if the extension is busy.
  - If the extension is free, directly calls processRequest().
- /v1/admin/server-info: Provides current server status and configuration, including port, requestTimeoutMs, and newRequestBehavior.
- /v1/admin/update-settings: Allows updating port, requestTimeoutMs, and newRequestBehavior. Changes are saved to server-config.json.
- /v1/admin/message-history: Retrieves recent message logs for the admin UI.
- /v1/admin/restart-server: Triggers a server restart.
- /admin: Serves the admin HTML interface.
- /health: Basic health check.
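The busy check for /v1/chat/completions can be sketched as a small pure function. This is a minimal illustration, not the code in server.ts: only activeExtensionProcessingId, newRequestBehavior, and the 429 status come from the description above, while decideAdmission and the Admission type are hypothetical names.

```typescript
// Hypothetical sketch: decideAdmission and Admission are illustrative names,
// not identifiers taken from server.ts.
type NewRequestBehavior = "queue" | "drop";
type Admission = "process" | "enqueue" | "reject429";

function decideAdmission(
  activeExtensionProcessingId: number | null,
  newRequestBehavior: NewRequestBehavior,
): Admission {
  // A null id means the extension is free, so the request runs immediately.
  if (activeExtensionProcessingId === null) return "process";
  // Busy: either defer the HTTP response (queue) or refuse with 429 (drop).
  return newRequestBehavior === "queue" ? "enqueue" : "reject429";
}
```

Keeping the decision separate from the Express handler makes the three outcomes easy to test without starting a server.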
2. WebSocket Server
- WebSocketServer: Accepts WebSocket connections from browser extensions.
- activeConnections: Array storing active WebSocket client connections. Currently, only the first connection (activeConnections[0]) is used for sending messages.
- Message Handling: Receives messages from the extension (e.g., CHAT_RESPONSE, CHAT_RESPONSE_CHUNK, CHAT_RESPONSE_ERROR) and resolves or rejects promises in the pendingRequests map.
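The message handling described above might look roughly like the following. The three message types come from this document, but the field names (requestId, payload, error) and the way chunks are buffered are assumptions made for illustration; the real message shape in server.ts may differ.

```typescript
// Assumed message shape: the field names below are hypothetical.
type ExtensionMessage =
  | { type: "CHAT_RESPONSE"; requestId: number; payload: string }
  | { type: "CHAT_RESPONSE_CHUNK"; requestId: number; payload: string }
  | { type: "CHAT_RESPONSE_ERROR"; requestId: number; error: string };

interface PendingHandlers {
  resolve: (text: string) => void;
  reject: (err: Error) => void;
  chunks: string[]; // assumption: chunks are buffered until the final message
}

const pendingRequests = new Map<number, PendingHandlers>();

function handleExtensionMessage(raw: string): void {
  const msg = JSON.parse(raw) as ExtensionMessage;
  const pending = pendingRequests.get(msg.requestId);
  if (!pending) return; // unknown or already-settled request: nothing to do

  switch (msg.type) {
    case "CHAT_RESPONSE_CHUNK":
      pending.chunks.push(msg.payload);
      break;
    case "CHAT_RESPONSE":
      pending.resolve(pending.chunks.join("") + msg.payload);
      break;
    case "CHAT_RESPONSE_ERROR":
      pending.reject(new Error(msg.error));
      break;
  }
}
```

Ignoring messages whose requestId is no longer in the map keeps a late reply (for example, one that arrives after a timeout) from touching an unrelated request.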
3. Queuing & Processing System
- activeExtensionProcessingId: number | null: Tracks the requestId of the message currently being processed by the extension. If null, the extension is considered free.
- newRequestBehavior: 'queue' | 'drop': Global variable determining how to handle new requests when the extension is busy. Loaded from server-config.json (defaults to 'queue').
- requestQueue: QueuedRequest[]: An in-memory array holding QueuedRequest objects when newRequestBehavior is 'queue' and the extension is busy.
- QueuedRequest Interface: Defines the structure for storing an original HTTP request (req, res) and its parameters, to be processed later.
- async function processRequest(queuedItem: QueuedRequest):
  - Sets activeExtensionProcessingId to the current queuedItem.requestId.
  - Logs CHAT_REQUEST_PROCESSING.
  - Sends the SEND_CHAT_MESSAGE to the extension via WebSocket.
  - Manages a Promise in pendingRequests for the response, including a timeout (currentRequestTimeoutMs).
  - On response/error/timeout, formats and sends the HTTP response using the stored queuedItem.res.
  - Calls finishProcessingRequest() in a finally block.
- function finishProcessingRequest(completedRequestId: number):
  - Clears activeExtensionProcessingId.
  - Removes the request from pendingRequests.
  - If newRequestBehavior is 'queue' and requestQueue is not empty, dequeues the next request and calls processRequest() for it.
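The interplay between processRequest and finishProcessingRequest can be sketched as below. This is a simplified model under stated assumptions, not the code in server.ts: the respond callback stands in for the stored Express req/res pair, sendToExtension is a hypothetical hook standing in for a write to activeConnections[0], and the 500 status and the timeout value are illustrative.

```typescript
// Sketch only: `respond` and `sendToExtension` are hypothetical stand-ins.
interface QueuedRequest {
  requestId: number;
  respond: (status: number, body: string) => void;
}

interface Settle {
  resolve: (text: string) => void;
  reject: (err: Error) => void;
}

let activeExtensionProcessingId: number | null = null;
let newRequestBehavior: "queue" | "drop" = "queue";
let currentRequestTimeoutMs = 1000; // illustrative value, not the real default
const requestQueue: QueuedRequest[] = [];
const pendingRequests = new Map<number, Settle>();

// Hypothetical transport hook, replaced by the caller (or a test).
let sendToExtension: (requestId: number) => void = () => {};

async function processRequest(queuedItem: QueuedRequest): Promise<void> {
  const { requestId } = queuedItem;
  activeExtensionProcessingId = requestId;
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    const text = await new Promise<string>((resolve, reject) => {
      pendingRequests.set(requestId, { resolve, reject });
      timer = setTimeout(
        () => reject(new Error("Request timed out")),
        currentRequestTimeoutMs,
      );
      sendToExtension(requestId); // SEND_CHAT_MESSAGE in the real server
    });
    queuedItem.respond(200, text);
  } catch (err) {
    // The failure status is an assumption; the source only says that an
    // error is sent to the client.
    queuedItem.respond(500, err instanceof Error ? err.message : String(err));
  } finally {
    clearTimeout(timer);
    finishProcessingRequest(requestId);
  }
}

function finishProcessingRequest(completedRequestId: number): void {
  if (activeExtensionProcessingId === completedRequestId) {
    activeExtensionProcessingId = null;
  }
  pendingRequests.delete(completedRequestId);
  if (newRequestBehavior === "queue" && requestQueue.length > 0) {
    const next = requestQueue.shift()!;
    void processRequest(next); // start the next request; do not await it here
  }
}
```

Because finishProcessingRequest runs in the finally block, the queue keeps draining whether a request succeeds, fails, or times out.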
4. State Management
- pendingRequests: A Map that stores Promise resolve/reject handlers, keyed by requestId. Used by processRequest to await responses from the WebSocket.
- requestCounter: Generates unique requestIds.
- adminMessageHistory: In-memory store for admin log entries.
🔄 Lifecycle Flow (with Queuing)
sequenceDiagram
participant User
participant ServerAPI
participant RequestLogic
participant ProcessRequestFunc
participant Extension
participant RequestQueue
User->>ServerAPI: POST /v1/chat/completions (req1)
ServerAPI->>RequestLogic: Handle req1
alt Extension is Free
RequestLogic->>ProcessRequestFunc: processRequest(req1)
ProcessRequestFunc->>Extension: SEND_CHAT_MESSAGE (req1)
Note over ProcessRequestFunc,Extension: activeExtensionProcessingId = req1.id
User->>ServerAPI: POST /v1/chat/completions (req2)
ServerAPI->>RequestLogic: Handle req2
RequestLogic->>RequestLogic: Extension Busy (req1.id)
alt newRequestBehavior == 'queue'
RequestLogic->>RequestQueue: Enqueue req2
Note over RequestLogic: HTTP Response for req2 deferred
else newRequestBehavior == 'drop'
RequestLogic-->>ServerAPI: Respond 429 for req2
ServerAPI-->>User: HTTP 429 (req2 dropped)
end
Extension-->>ProcessRequestFunc: CHAT_RESPONSE (req1)
ProcessRequestFunc-->>ServerAPI: Respond HTTP OK (req1)
ServerAPI-->>User: HTTP OK (req1)
ProcessRequestFunc->>RequestLogic: finishProcessingRequest(req1.id)
RequestLogic->>RequestLogic: activeExtensionProcessingId = null
alt Queue Not Empty and Behavior is 'queue'
RequestLogic->>RequestQueue: Dequeue req2
RequestLogic->>ProcessRequestFunc: processRequest(req2)
ProcessRequestFunc->>Extension: SEND_CHAT_MESSAGE (req2)
Note over ProcessRequestFunc,Extension: activeExtensionProcessingId = req2.id
Extension-->>ProcessRequestFunc: CHAT_RESPONSE (req2)
ProcessRequestFunc-->>ServerAPI: Respond HTTP OK (req2 via stored res)
ServerAPI-->>User: HTTP OK (req2)
ProcessRequestFunc->>RequestLogic: finishProcessingRequest(req2.id)
end
else Extension is Busy (initial state)
RequestLogic->>RequestLogic: Extension Busy
alt newRequestBehavior == 'queue'
RequestLogic->>RequestQueue: Enqueue req1
else newRequestBehavior == 'drop'
RequestLogic-->>ServerAPI: Respond 429 for req1
ServerAPI-->>User: HTTP 429 (req1 dropped)
end
end
🛡️ Error Handling
- If no browser extension is connected when a request arrives: Server responds with 503 Service Unavailable.
- If no browser extension is connected when processRequest attempts to send a message (e.g., after being dequeued): The request fails, and an error is sent to the original client if headers have not already been sent.
- If newRequestBehavior is 'drop' and the extension is busy: Server responds with 429 Too Many Requests.
- Request Timeout: Each request processed by processRequest has a timeout (currentRequestTimeoutMs, configurable). If the extension doesn't respond in time, the promise is rejected, and an error is sent to the client.
- Errors from extension (CHAT_RESPONSE_ERROR): Logged, and the corresponding request promise is rejected, leading to an error response to the client.
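Two of the rules above, the 503 for a missing extension and the "headers not already sent" guard, can be illustrated with a small helper. ResponseLike is a minimal structural stand-in for the parts of an Express response used here (headersSent, status, json); sendErrorOnce, rejectIfNoExtension, and the error body shape are hypothetical and not taken from server.ts.

```typescript
// Minimal structural stand-in for an Express response object.
interface ResponseLike {
  headersSent: boolean;
  status(code: number): ResponseLike;
  json(body: unknown): void;
}

// Sends an error only if nothing has been written yet; returns whether it did.
function sendErrorOnce(res: ResponseLike, status: number, message: string): boolean {
  if (res.headersSent) return false;
  res.status(status).json({ error: { message } }); // body shape is an assumption
  return true;
}

// Returns true when the request was rejected because no extension is connected.
function rejectIfNoExtension(res: ResponseLike, activeConnectionCount: number): boolean {
  if (activeConnectionCount > 0) return false;
  sendErrorOnce(res, 503, "No browser extension connected");
  return true;
}
```

The guard matters mostly for queued requests: by the time one is dequeued, the client may already have received a response or disconnected.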
⚙️ Configuration
The server's behavior can be configured via server-config.json located in the dist directory (created/managed by server.ts). The Admin UI also allows viewing and modifying these settings.
Key configurable options:
- port: The port on which the server listens. Requires server restart.
- requestTimeoutMs: Timeout in milliseconds for waiting for a response from the browser extension. Effective immediately.
- newRequestBehavior: Determines how new requests are handled if the extension is busy. Effective immediately. Can be:
  - 'queue' (default): New requests are queued and processed sequentially.
  - 'drop': New requests are rejected with a 429 error.
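As an illustration of how the three options might be read from server-config.json, the sketch below parses a JSON string and falls back to caller-supplied defaults for missing or invalid values. The function names and the validation rules are assumptions, not the logic in server.ts; the only default stated in this document is 'queue' for newRequestBehavior, so the port and timeout defaults are left to the caller.

```typescript
interface ServerConfig {
  port: number;
  requestTimeoutMs: number;
  newRequestBehavior: "queue" | "drop";
}

// Parses raw JSON into a plain object, or returns {} if it is unusable.
function readObject(raw: string): Record<string, unknown> {
  try {
    const value: unknown = JSON.parse(raw);
    return typeof value === "object" && value !== null
      ? (value as Record<string, unknown>)
      : {};
  } catch {
    return {}; // unreadable config: every field falls back to its default
  }
}

const isPositiveInt = (v: unknown): v is number =>
  typeof v === "number" && Number.isInteger(v) && v > 0;

// Hypothetical helper: validates each field independently.
function parseServerConfig(raw: string, defaults: ServerConfig): ServerConfig {
  const { port, requestTimeoutMs, newRequestBehavior } = readObject(raw);
  return {
    port: isPositiveInt(port) ? port : defaults.port,
    requestTimeoutMs: isPositiveInt(requestTimeoutMs)
      ? requestTimeoutMs
      : defaults.requestTimeoutMs,
    newRequestBehavior:
      newRequestBehavior === "queue" || newRequestBehavior === "drop"
        ? newRequestBehavior
        : defaults.newRequestBehavior,
  };
}
```

Validating field by field means one bad value (say, a negative timeout) does not discard the rest of the file.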
🔌 Connection Monitoring
- The server maintains an array of activeConnections.
- The ws library replies to incoming ping frames with pongs automatically, but keep-alive pings must be sent by the application. Explicit server-side ping logic is not currently implemented in server.ts.
- Disconnected clients are removed from activeConnections.
- pendingRequests are cleared on timeout or when a request completes (successfully or with an error) via finishProcessingRequest.
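Connection bookkeeping of this kind is typically a few lines. The sketch below uses SocketLike, a minimal structural stand-in for a WebSocket that emits a close event, and trackConnection, a hypothetical helper name; neither is taken from server.ts.

```typescript
// Minimal stand-in for a WebSocket: only the close event is needed here.
interface SocketLike {
  on(event: "close", listener: () => void): void;
}

const activeConnections: SocketLike[] = [];

// Hypothetical helper: registers a socket and removes it when it closes.
function trackConnection(socket: SocketLike): void {
  activeConnections.push(socket);
  socket.on("close", () => {
    const index = activeConnections.indexOf(socket);
    if (index !== -1) activeConnections.splice(index, 1);
  });
}
```

Since only activeConnections[0] is used for sending, removing a closed socket from the array means the next-oldest connection, if any, takes over as the send target.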
✅ Summary
This architecture creates a decoupled, resilient relay system. The new queuing/dropping mechanism ensures that the browser extension processes only one message at a time, preventing race conditions and allowing for configurable behavior when the extension is busy. The Admin UI provides visibility and control over key operational parameters.