Architecture Decisions
Architecture Decision Records (ADRs) document significant technical choices — what was decided, what alternatives were rejected, and what the real-world consequences turned out to be. Each record includes a "What we learned" section with the gotchas we actually hit.
Use async FastAPI over sync
✓ Accepted
The marketplace runs Python computation functions that take 5–10 seconds each. We needed the API to remain responsive during computation so that:
- Multiple simultaneous jobs could run without blocking the API
- The WebSocket endpoint could stream status updates while a job ran
- The health endpoint would still respond during heavy computation
Alternatives considered:
- Sync FastAPI with threading: thread management is error-prone and interacts poorly with asyncpg
- Celery + Redis: correct for production, but adds two infrastructure components for a single-user tool
- Flask + threading: same threading issues, no native async support
Use FastAPI with full async support throughout:
- asyncpg as the PostgreSQL driver (not psycopg2)
- SQLAlchemy 2 async ORM (create_async_engine, async_sessionmaker)
- BackgroundTasks for job execution: the job runs in a background coroutine without blocking the HTTP response
- asyncio.sleep() inside functions to simulate non-blocking computation delays
- The event loop stays responsive as long as job functions await their delays instead of blocking; multiple jobs can run concurrently
- WebSocket connections work cleanly alongside regular HTTP traffic
- No thread safety issues — asyncio's cooperative multitasking is simpler to reason about
- Every database call must be awaited — forgetting this causes a greenlet_spawn error that is easy to introduce and hard to debug
- Cannot use any synchronous library that does I/O inside an async route without run_in_executor
- New contributors with sync Python backgrounds need to learn the async model
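The trade-off around synchronous libraries can be illustrated with the standard library alone. The sketch below is not the marketplace's code: `simulated_job` and `blocking_io` are hypothetical stand-ins for an async job function and a synchronous library call. It shows two awaited jobs and one blocking call, offloaded with run_in_executor, finishing in roughly the time of one.

```python
import asyncio
import time


async def simulated_job(name: str, seconds: float) -> str:
    # Awaiting asyncio.sleep yields control to the event loop,
    # so other coroutines keep running while this job "computes".
    await asyncio.sleep(seconds)
    return f"{name} done"


def blocking_io(seconds: float) -> str:
    # Hypothetical stand-in for a synchronous library call.
    # Calling this directly inside a coroutine would stall the event loop.
    time.sleep(seconds)
    return "blocking call done"


async def main() -> tuple[list[str], float]:
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    results = await asyncio.gather(
        simulated_job("job-a", 0.2),
        simulated_job("job-b", 0.2),
        # Offload the synchronous call to the default thread pool.
        loop.run_in_executor(None, blocking_io, 0.2),
    )
    return list(results), time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Run sequentially, the three calls would take about 0.6 seconds. Here they overlap, because none of them holds the event loop while waiting.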
Store inputs and results as JSONB
✓ Accepted
Each marketplace app has a different set of inputs. The Mortality Simulator takes age and shock_rate; the Portfolio Pricer takes n_assets and volatility. Future apps will have entirely different schemas. We needed a database design that could store arbitrary input/output shapes without requiring a schema migration for every new app.
Alternatives considered:
- Typed columns per app: creates separate tables per app, impossible to join
- EAV (Entity-Attribute-Value): avoids schema migrations but makes queries painful and loses types
- JSONB in PostgreSQL: stores arbitrary JSON with index support (GIN and expression indexes)
Use JSONB for:
- apps.input_schema: the manifest's input field definitions
- job_runs.inputs: the actual values submitted for a run
- job_results.payload: the complete result ({columns, table, series, summary})
- Adding a new app type requires zero database migrations
- The full result payload is stored atomically
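- JSONB values arrive as Python dicts on read: SQLAlchemy's JSONB type handles deserialization over asyncpg, which on its own returns JSON as text unless a type codec is registered
- Can't efficiently query "all runs where age > 50" without an expression index on the extracted value; a GIN index speeds up containment and key-existence operators, not range comparisons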
- No type enforcement at the database level — the function contract is the only schema
- Payload size is unbounded; a function returning 10,000 rows would store all of them
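Because the database enforces no types on JSONB columns, any checking has to happen in application code before a run is stored. The sketch below assumes a hypothetical shape for input_schema (a list of field definitions with `name`, `type`, and an optional `required` flag); the real manifest format is not shown in this record. `TYPE_MAP`, `validate_inputs`, and `payload_size_bytes` are illustrative names.

```python
import json

# Hypothetical mapping from manifest type names to Python types.
TYPE_MAP = {"int": int, "float": (int, float), "str": str, "bool": bool}


def validate_inputs(schema: list[dict], inputs: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    known = {field["name"] for field in schema}
    for field in schema:
        name = field["name"]
        if name not in inputs:
            if field.get("required", True):
                problems.append(f"missing field: {name}")
            continue
        value = inputs[name]
        # bool is a subclass of int, so reject it for non-bool fields.
        if isinstance(value, bool) and field["type"] != "bool":
            problems.append(f"wrong type for {name}")
        elif not isinstance(value, TYPE_MAP[field["type"]]):
            problems.append(f"wrong type for {name}")
    for name in sorted(inputs.keys() - known):
        problems.append(f"unexpected field: {name}")
    return problems


def payload_size_bytes(payload: dict) -> int:
    # Size of the JSON text that would be written to the JSONB column.
    return len(json.dumps(payload).encode("utf-8"))
```

A caller could reject a run when `validate_inputs` returns any problems, and compare `payload_size_bytes` against a limit of its choosing before writing job_results.payload.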
Use WebSocket for job status updates
✓ Accepted
Jobs take 5–15 seconds to complete. The UI needs to show a loading state during execution and display results when done. We needed a mechanism for the frontend to learn when a job completes.
Alternatives considered:
- HTTP polling: simple, but wastes network requests (10 requests for a 20-second job)
- Server-Sent Events (SSE): simpler than WebSocket, but inconsistent browser support over HTTP/2
- WebSocket: bidirectional, persistent, well-supported everywhere
- Long polling: risk of timeout on slow jobs; complex retry logic needed
Use WebSocket at GET /ws/runs/{run_id}. The implementation uses server-side polling ("push via polling"): the WebSocket handler polls the database every 3 seconds using asyncio.sleep(3), then pushes the current status to the client.
Client connects → server queries DB → sends status → sleeps 3s → repeats
When status = SUCCESS, server sends full result and closes the WebSocket
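The "push via polling" loop can be sketched independently of any web framework. This is a minimal illustration, not the actual handler: `fetch_status` and `send` are hypothetical callables standing in for the database query and the WebSocket send, and the "FAILED" status is an assumption (only SUCCESS is named in this record).

```python
import asyncio
from typing import Awaitable, Callable

# "FAILED" is an assumed status name; only SUCCESS appears in the ADR.
TERMINAL_STATUSES = {"SUCCESS", "FAILED"}


async def push_status_until_done(
    fetch_status: Callable[[], Awaitable[dict]],
    send: Callable[[dict], Awaitable[None]],
    interval: float = 3.0,
) -> None:
    """Poll for the run's status and push each snapshot to the client."""
    while True:
        snapshot = await fetch_status()  # a DB query in a real handler
        await send(snapshot)             # a WebSocket send in a real handler
        if snapshot["status"] in TERMINAL_STATUSES:
            return                       # the caller closes the connection
        await asyncio.sleep(interval)    # yields to the event loop between polls
```

Passing the two callables in keeps the loop testable without a database or a live socket.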
- The frontend has a simple ws.onmessage handler — no retry logic, no polling loop
- Status updates reach the client within 3 seconds of the DB write
- No CORS issues: browsers do not apply CORS to WebSocket handshakes, so the Origin header is validated separately from the HTTP CORS configuration
- The 3-second sleep adds up to 3 seconds of latency: a job completing at 6.1s shows SUCCESS at ~9s
- WebSocket connections must be cleaned up on WebSocketDisconnect — not doing this causes leaks
- The ws:// URL must be configured separately from the http:// API URL
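The latency figure above follows from simple arithmetic. As a rough model, assume polls happen at 0, 3, 6, 9, ... seconds after the client connects and ignore the time each query takes; `first_poll_seeing` is a hypothetical helper written only to make that calculation explicit.

```python
import math


def first_poll_seeing(completed_at: float, interval: float = 3.0) -> float:
    """Time of the first poll at or after the job completes.

    Assumes polls at 0, interval, 2 * interval, ... and zero query time.
    """
    return math.ceil(completed_at / interval) * interval


if __name__ == "__main__":
    for t in (6.0, 6.1, 8.9):
        seen = first_poll_seeing(t)
        print(f"completed at {t}s, seen at {seen}s, delay {seen - t:.1f}s")
```

Under this model the added delay is always below one interval, and averages half an interval if completion times are spread evenly across it.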
Use MDX within Next.js over Docusaurus
✓ Accepted
The project needed documentation: architecture, API reference, build logs, and an AI guide. We needed a documentation system that:
- Is accessible at /docs within the same app (not a separate site)
- Supports rich content: code blocks, callouts, interactive components
- Doesn't require a separate deployment or a second frontend container
Alternatives considered:
- Docusaurus: excellent framework, but runs as a separate Node.js app on a different port
- Plain Markdown in Next.js: no component support; can't embed Callout or EndpointCard inline
- Notion / GitBook: external services, different auth models, can't be self-hosted alongside the app
- MDX via @next/mdx: pages integrate directly into the Next.js pages router
Use @next/mdx with the Next.js pages router. Each MDX file in frontend/pages/docs/ becomes a route at /docs/[filename]. Custom components (DocsLayout, Callout, CodeBlock, EndpointCard) are imported in each MDX file using the layout export pattern.
- Docs are in the same Git repo as the code — they change together
- No additional Docker service or port mapping needed
- Custom React components work inline in MDX content
- MDX v2 parses JSX in prose text — bare {...} outside code blocks causes build failures
- Cannot use ESM-only rehype plugins (like rehype-highlight) in CJS next.config.js
- Sub-directory MDX pages need fragile ../../../ import paths to reach components
Use importlib for app function dispatch
✓ Accepted
Apps in the marketplace are registered by cloning a GitHub repository. Each repo contains a function.py file at an arbitrary path on disk. When a job runs, we need to import and execute function.py without knowing its path at startup time.
Alternatives considered:
- Add function directory to sys.path: causes name collisions when two apps both have function.py
- exec() the file content: dangerous (no module scope, globals leak), security concern
- Package as installable Python package: too much friction for new app authors
- importlib.util.spec_from_file_location: load a module from arbitrary path with unique module name
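Use importlib to load function.py at runtime:
    import importlib.util

    # A unique module name per app avoids collisions between function.py files
    spec = importlib.util.spec_from_file_location("app_name", "/path/to/function.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes function.py's top-level code
    run_fn = module.run
    result = await run_fn(inputs)  # must be called from inside a coroutine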
- Any valid Python file at any disk path can be loaded and executed
- No sys.path pollution — each function loads into its own module namespace
- No infrastructure changes needed to add a new app — clone, register, run
- function.py must expose exactly async def run(inputs: dict) -> dict
- Errors in function.py (syntax errors, failed imports, a missing run function) surface when the job runs, not at startup
- Hot-reload not supported — file changes require a backend restart
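Since contract violations only show up when a job runs, a loader can at least turn them into one clear error type. The following is a sketch under assumptions, not the project's dispatcher: `FunctionLoadError` and `load_run_function` are hypothetical names, and the check covers only that run is an async function, not its argument or return types.

```python
import importlib.util
import inspect
from pathlib import Path


class FunctionLoadError(Exception):
    """Raised when function.py cannot be loaded or breaks the run() contract."""


def load_run_function(module_name: str, path: str):
    """Load function.py from `path` under a unique module name; return its run()."""
    if not Path(path).is_file():
        raise FunctionLoadError(f"no such file: {path}")
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise FunctionLoadError(f"cannot build an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)  # runs the file's top-level code
    except Exception as exc:  # SyntaxError, ImportError, NameError, ...
        raise FunctionLoadError(f"{path} failed to load: {exc!r}") from exc
    run_fn = getattr(module, "run", None)
    if not inspect.iscoroutinefunction(run_fn):
        raise FunctionLoadError(f"{path} must define 'async def run(inputs)'")
    return run_fn
```

Calling this once at registration time, in addition to job time, would move most of these failures earlier. Whether that fits the clone, register, run flow is a design choice this record does not settle.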