System Overview

                         ┌──────────────────┐
                         │     Caddy         │  ← HTTPS / reverse proxy
                         │   :80 / :443      │
                         └────┬────────┬─────┘
                              │        │
                    /api/*    │        │  /*
                              ▼        ▼
                    ┌──────────┐  ┌──────────┐
                    │ FastAPI  │  │  React   │
                    │ Backend  │  │ Frontend │
                    │  :8000   │  │   :80    │
                    └────┬─────┘  └──────────┘
                         │
            ┌────────────┼────────────┐
            ▼            ▼            ▼
     ┌───────────┐ ┌──────────┐ ┌──────────┐
     │ PostgreSQL│ │  Redis   │ │Gotenberg │
     │ + pgvector│ │  Cache   │ │ PDF Gen  │
     │   :5432   │ │  :6379   │ │  :3000   │
     └───────────┘ └──────────┘ └──────────┘

     ┌───────────────────────────────────────┐
     │        Optional Services              │
     │  SendCloak → Presidio (PII)           │
     │  Celery Media Worker (video/audio)    │
     └───────────────────────────────────────┘

Components

Caddy (Reverse Proxy)

Caddy receives all incoming traffic and routes it by path:
  • /api/* → FastAPI backend (port 8000)
  • /* → React frontend (port 80)
It also provides:
  • Automatic HTTPS certificate provisioning in production
  • SSE streaming support: flush_interval -1 disables response buffering, so events are flushed to the client immediately
In development, the frontend's Vite dev server proxies API requests directly to the backend.

FastAPI Backend

The Python backend is the core of ondoki. It provides:
Area        Details
API         20+ router groups under /api/v1/
Auth        Session cookies, OAuth 2.0 PKCE (desktop clients), API keys (MCP)
AI          LLM gateway (OpenAI, Anthropic, Ollama), 16 AI tools with function calling
MCP         Model Context Protocol server at /mcp (FastMCP, stateless HTTP)
Search      Hybrid FTS + semantic via PostgreSQL tsvector and pgvector
Export      PDF (Gotenberg), Markdown, HTML, DOCX
WebSocket   Real-time notifications via Redis pub/sub (multi-server)
Middleware  CORS, CSRF, GZip, rate limiting, request ID logging

React Frontend

Single-page application with 23+ pages:
Area     Technologies
Routing  React Router 7
State    Zustand (global), TanStack Query (server state)
Editor   TipTap 3 (block-based, extensible)
UI       Tailwind CSS + Radix UI primitives
Charts   Recharts
HTTP     Axios

PostgreSQL + pgvector

Primary datastore with 24 tables. Key capabilities:
  • Full-text search: tsvector columns on documents, workflows, and steps with plainto_tsquery and prefix matching
  • Vector search: pgvector extension stores 1536-dimensional embeddings for semantic similarity
  • Soft deletes: Documents and workflows use deleted_at timestamps
  • Materialized paths: Folders use path-based hierarchy for efficient tree queries
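To illustrate the materialized-path bullet above: when each folder stores its full ancestry as a string, finding a subtree becomes a prefix match instead of a recursive walk. The sketch below is a minimal in-memory version; the `Folder` class, the `/`-separated path format, and the sample rows are assumptions for illustration, not the actual ondoki schema.

```python
from dataclasses import dataclass


@dataclass
class Folder:
    id: int
    name: str
    path: str  # assumed format: ancestor ids joined by "/", ending in the folder's own id


def descendants(folders, root):
    """Return every folder below `root` using one prefix comparison per row."""
    # The trailing "/" keeps path "1" from matching an unrelated path such as "10".
    prefix = root.path + "/"
    return [f for f in folders if f.path.startswith(prefix)]


def depth(folder):
    """Depth is readable from the path itself, with no parent lookups."""
    return folder.path.count("/")


folders = [
    Folder(1, "Engineering", "1"),
    Folder(2, "Runbooks", "1/2"),
    Folder(3, "Deploys", "1/2/3"),
    Folder(4, "Sales", "4"),
    Folder(10, "Ops", "10"),
]

names = [f.name for f in descendants(folders, folders[0])]
print(names)
```

In SQL the same lookup would typically be a `LIKE 'prefix%'` filter on the path column, which can use an ordinary index.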

Redis

Used for three purposes:
  1. Caching — Session data and temporary state
  2. WebSocket pub/sub — Multi-server real-time notification delivery
  3. Celery broker — Task queue for async media processing jobs

Gotenberg

Headless Chrome service for converting HTML/documents to PDF. Used for workflow and document export.

Celery Media Worker

Asynchronous worker process that runs the video import pipeline:
  1. Extract audio from video
  2. Transcribe audio (Whisper)
  3. Extract key frames
  4. Analyze frames with AI
  5. Generate step-by-step guide
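The five stages above form a linear pipeline in which each stage consumes the output of the one before it. The sketch below shows only that ordering. Every function is a hypothetical stub: a real worker would run these as Celery tasks and call out to an audio extractor, Whisper, and an LLM, none of which appear here.

```python
from dataclasses import dataclass, field


@dataclass
class MediaJob:
    """State handed from one stage to the next (hypothetical structure)."""
    video_path: str
    audio_path: str = ""
    transcript: str = ""
    frames: list = field(default_factory=list)
    frame_notes: list = field(default_factory=list)
    guide: str = ""
    completed: list = field(default_factory=list)


# Placeholder stages: each one only fills in the field the next stage reads.
def extract_audio(job):
    job.audio_path = job.video_path.rsplit(".", 1)[0] + ".wav"


def transcribe_audio(job):
    job.transcript = f"(transcript of {job.audio_path})"


def extract_key_frames(job):
    job.frames = [f"frame_{i}.png" for i in range(3)]


def analyze_frames(job):
    job.frame_notes = [f"(description of {name})" for name in job.frames]


def generate_guide(job):
    steps = [f"{i}. {note}" for i, note in enumerate(job.frame_notes, start=1)]
    job.guide = "\n".join(steps)


STAGES = [extract_audio, transcribe_audio, extract_key_frames, analyze_frames, generate_guide]


def run_pipeline(job):
    """Run the stages in order; an exception in any stage aborts the run."""
    for stage in STAGES:
        stage(job)
        job.completed.append(stage.__name__)
    return job


job = run_pipeline(MediaJob(video_path="demo.mp4"))
print(job.completed)
```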

SendCloak + Presidio (Optional)

Privacy layer that obfuscates PII before data reaches AI providers:
  • Presidio — Microsoft’s NER-based PII detection engine
  • SendCloak — Proxy that intercepts AI requests, masks PII, and de-masks responses
  • Supports English and other European languages (en, de, fr, es, it)
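The mask and de-mask round trip described above can be sketched in a few lines. This is an illustration of the idea only: a simple email regex stands in for Presidio's NER-based detection, and `mask`, `unmask`, `fake_provider`, and the `<EMAIL_n>` placeholder format are all hypothetical names, not SendCloak's API.

```python
import re

# Stand-in detector: matches email addresses only. Real PII detection covers far more.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask(text):
    """Replace each detected value with a placeholder and remember the mapping."""
    mapping = {}

    def replace(match):
        value = match.group(0)
        # Reuse the placeholder when the same value appears more than once.
        for token, original in mapping.items():
            if original == value:
                return token
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = value
        return token

    return EMAIL_RE.sub(replace, text), mapping


def unmask(text, mapping):
    """Restore the original values in the provider's response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def fake_provider(prompt):
    """Pretend AI provider: it only ever sees the masked prompt."""
    return f"Summary: {prompt}"


prompt = "Contact anna@example.com or bob@example.org, then cc anna@example.com."
masked, mapping = mask(prompt)
reply = unmask(fake_provider(masked), mapping)
print(masked)
print(reply)
```

Keeping the mapping on the proxy side is what lets the response be restored without the provider ever receiving the original values.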

Data Flow

Workflow Recording

Desktop App / Chrome Extension
        │
        ▼
  POST /api/v1/process-recording/session/create
        │
        ├── Upload metadata (JSON)
        ├── Upload step screenshots (images)
        │
        ▼
  POST /api/v1/process-recording/session/{id}/finalize
        │
        ▼
  Auto-Processing Pipeline (async)
        ├── Generate title (LLM)
        ├── Generate summary (LLM)
        ├── Generate tags (LLM)
        ├── Annotate each step (LLM)
        ├── Generate guide markdown (LLM)
        ├── Index for full-text search (tsvector)
        └── Generate embeddings (pgvector)

Search Query

User Query: "how to deploy"
        │
        ▼
  GET /api/v1/search/unified-v2?q=how+to+deploy
        │
        ├── Full-text search (tsvector + tsquery)
        ├── Semantic search (pgvector cosine similarity)
        ├── Trigram fallback (for typos)
        │
        ▼
  RRF Fusion (Reciprocal Rank Fusion)
        │
        ├── Combine rankings from all sources
        ├── Apply boosts (view count, recency, type)
        │
        ▼
  Ranked Results with Highlighted Snippets
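Reciprocal Rank Fusion merges ranked lists without comparing their raw scores: each document earns 1 / (k + rank) from every list it appears in, and the contributions are summed. The sketch below assumes the commonly used constant k = 60 and models boosts as optional per-document multipliers. The function name, the `boosts` parameter, and the sample document ids are illustrative, not the actual ondoki implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, boosts=None):
    """Fuse several ranked lists of document ids into one ordering.

    rankings: list of lists, each ordered best-first.
    boosts:   optional {doc_id: multiplier} applied after fusion (assumed shape).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)

    for doc_id, multiplier in (boosts or {}).items():
        if doc_id in scores:
            scores[doc_id] *= multiplier

    # Highest score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


full_text = ["deploy-guide", "ci-setup", "rollback"]
semantic = ["rollback", "deploy-guide", "k8s-intro"]
trigram = ["deploy-guide"]

fused = rrf_fuse([full_text, semantic, trigram])
boosted = rrf_fuse([full_text, semantic, trigram], boosts={"k8s-intro": 1.5})
print(fused)
print(boosted)
```

Because only ranks enter the formula, full-text, vector, and trigram results can be combined even though their native scores are on unrelated scales.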

MCP Access

External AI Agent (Claude, Cursor, etc.)
        │
        ▼
  /mcp (FastMCP HTTP transport)
        │
        ├── API key validation
        ├── Project scoping
        │
        ▼
  Available Tools:
        ├── list_projects
        ├── search_pages
        ├── search_workflows
        ├── read_document
        ├── read_workflow
        ├── read_folder
        └── create_context_link
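The two gate steps in the flow above, API key validation and project scoping, can be sketched as follows. Everything here is an assumption made for illustration: the hashed key store, the sample documents, and this stand-in `search_pages` function show the general pattern rather than the server's real code.

```python
import hashlib
import hmac


def hash_key(raw_key):
    """Store and compare only a hash, never the raw key."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


# Hypothetical key store: hash of the key -> the project it is scoped to.
KEY_STORE = {hash_key("demo-key-alpha"): "project-a"}

# Hypothetical documents belonging to two different projects.
DOCUMENTS = [
    {"id": 1, "project": "project-a", "title": "Deploy runbook"},
    {"id": 2, "project": "project-b", "title": "Sales playbook"},
]


def resolve_project(raw_key):
    """Return the project for a valid key, or None if no stored hash matches."""
    candidate = hash_key(raw_key)
    for stored_hash, project in KEY_STORE.items():
        # Constant-time comparison avoids leaking how much of the hash matched.
        if hmac.compare_digest(candidate, stored_hash):
            return project
    return None


def search_pages(raw_key, query):
    """Stand-in tool: results are filtered to the key's project before matching."""
    project = resolve_project(raw_key)
    if project is None:
        raise PermissionError("invalid API key")
    needle = query.lower()
    return [
        doc for doc in DOCUMENTS
        if doc["project"] == project and needle in doc["title"].lower()
    ]


titles = [doc["title"] for doc in search_pages("demo-key-alpha", "deploy")]
print(titles)
```

Applying the project filter inside every tool, rather than trusting the caller to pass a project id, keeps a valid key from reading another project's data.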

Database Schema (Key Tables)

Table                    Purpose
user                     User accounts with email/password auth
project                  Team projects with ownership
project_members          M:N with roles (Viewer → Owner)
document                 Rich text documents (TipTap JSON content)
document_version         Document version history
folder                   Hierarchical folders (materialized path)
processrecordingsession  Workflow recordings with AI-generated metadata
processrecordingstep     Individual steps with screenshots and annotations
processrecordingfile     Uploaded screenshot files
embedding                pgvector embeddings for semantic search
knowledgesource          Uploaded knowledge files (PDF, DOCX, etc.)
knowledgelink            Relationships between resources
contextlink              URL/app pattern → resource mappings
gitsyncconfig            Git export configuration per project
auditlog                 Action tracking for compliance
llmusage                 Token/cost tracking per LLM call
mcpapikey                Project-scoped API keys for MCP
session                  Browser session tokens
refreshtoken             OAuth refresh tokens for desktop clients
comment                  Threaded comments on resources
resourceshare            Per-resource sharing with permissions
appsettings              Key-value config store (LLM settings)