Overview

The knowledge base lets you upload external documents (PDFs, Word files, plain text, Markdown) into ondoki. Uploaded content is extracted, indexed for full-text search, and embedded for semantic search, which makes it available to AI chat, the MCP server, and unified search.

Supported File Types

| Format | Extension | Extraction |
| --- | --- | --- |
| PDF | .pdf | Text extraction via PyMuPDF |
| Word | .docx | Text extraction via python-docx |
| Plain Text | .txt | Direct text ingestion |
| Markdown | .md | Direct text ingestion |
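
The mapping from file type to extraction method can be sketched as a small dispatch function. This is only an illustration of the table above, not ondoki's actual implementation: the helper name `extract_text` and the choice to join pages and paragraphs with newlines are assumptions. PyMuPDF and python-docx are imported lazily, so the plain-text paths run without them installed.

```python
from pathlib import Path

# Extensions taken from the Supported File Types table.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}


def extract_text(path: str) -> str:
    """Return the plain text of a supported file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")

    if suffix == ".pdf":
        import fitz  # PyMuPDF, imported only when a PDF is processed

        with fitz.open(path) as doc:
            # Joining pages with newlines is an assumption of this sketch.
            return "\n".join(page.get_text() for page in doc)

    if suffix == ".docx":
        from docx import Document  # python-docx

        return "\n".join(p.text for p in Document(path).paragraphs)

    # .txt and .md are ingested directly, with no format conversion.
    return Path(path).read_text(encoding="utf-8")
```

Checking the extension on the client before uploading avoids a round trip for files the server would not be able to extract.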

Uploading Knowledge

Upload files via Knowledge Base in the sidebar or via the API.

Endpoint: POST /api/v1/knowledge/upload

The upload process:
  1. File Upload: The file is uploaded and stored on disk.
  2. Text Extraction: Content is extracted from the file format into plain text.
  3. Full-Text Indexing: Extracted text is indexed for keyword search.
  4. Vector Embedding: Content is embedded in the background for semantic search.
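
An API upload might be prepared as follows. This is a minimal sketch using only the Python standard library. The endpoint path comes from this page; the multipart field name `file`, the Bearer-token header, and the helper name `build_upload_request` are assumptions, so check your instance's API reference before relying on them.

```python
import urllib.request
import uuid
from pathlib import Path

UPLOAD_PATH = "/api/v1/knowledge/upload"  # endpoint documented above


def build_upload_request(
    base_url: str,
    file_path: str,
    token: str,
    field_name: str = "file",  # assumed multipart field name
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    path = Path(file_path)
    boundary = uuid.uuid4().hex

    # One multipart part holding the raw file bytes.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{path.name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=base_url.rstrip("/") + UPLOAD_PATH,
        data=head + path.read_bytes() + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` sends it. Because embedding runs in the background, a successful response does not necessarily mean the content is already available to semantic search.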

Source Types

Knowledge sources are categorized by origin:

| Type | Description |
| --- | --- |
| UPLOAD | Manually uploaded files |
| WEB_CLIP | Content clipped from web pages |
| SLACK | Imported from Slack |
| MEETING | Meeting notes/transcripts |
| GIT_PR | Pull request content from Git |

Currently, UPLOAD is the primary supported source type. Other types are defined in the schema for future integrations.

ondoki supports linking resources together with typed relationships:

| Link Type | Description |
| --- | --- |
| RELATED | General relationship |
| DEPENDS_ON | Dependency relationship |
| SUPERSEDES | Newer version replaces older |
| PART_OF | Component of a larger whole |

Links can be created manually or auto-detected with a confidence score.
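
One way to picture a typed link is as a small record with an optional confidence value. The sketch below is illustrative only: the class and field names are hypothetical, and the 0.0 to 1.0 confidence range is an assumption that this page does not specify. Only the four link type names come from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LinkType(str, Enum):
    """Link types listed in the table above."""

    RELATED = "RELATED"
    DEPENDS_ON = "DEPENDS_ON"
    SUPERSEDES = "SUPERSEDES"
    PART_OF = "PART_OF"


@dataclass(frozen=True)
class ResourceLink:
    """Hypothetical model of a link between two resources."""

    source_id: str
    target_id: str
    link_type: LinkType
    # None marks a manually created link; a number marks an auto-detected
    # one. The 0.0-1.0 range is an assumption of this sketch.
    confidence: Optional[float] = None

    def __post_init__(self) -> None:
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    @property
    def auto_detected(self) -> bool:
        return self.confidence is not None
```

For example, `ResourceLink("doc-2", "doc-1", LinkType.SUPERSEDES)` would describe a manual link stating that one document replaces another.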

Knowledge Graph

The Knowledge Graph page provides a visual representation of relationships between documents, workflows, and knowledge sources. Navigate to it from the sidebar.

Managing Knowledge Sources

List sources: GET /api/v1/knowledge/sources

Delete a source: DELETE /api/v1/knowledge/sources/{source_id}

Deleting a knowledge source removes the file, extracted content, and associated embeddings.
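
The two management endpoints could be called from a script along these lines. This is a sketch under assumptions: the Bearer-token header and the helper names are hypothetical, and the response format is not described on this page, so the functions only build the requests and leave parsing to the caller.

```python
import urllib.request
from urllib.parse import quote

SOURCES_PATH = "/api/v1/knowledge/sources"  # endpoint documented above


def _request(base_url: str, path: str, method: str, token: str) -> urllib.request.Request:
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def list_sources_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the GET request that lists knowledge sources."""
    return _request(base_url, SOURCES_PATH, "GET", token)


def delete_source_request(base_url: str, token: str, source_id: str) -> urllib.request.Request:
    """Build the DELETE request for a single knowledge source."""
    # Percent-encode the ID so it cannot alter the URL path.
    return _request(base_url, f"{SOURCES_PATH}/{quote(source_id, safe='')}", "DELETE", token)
```

Send either request with `urllib.request.urlopen`. Since deletion also removes the stored file and embeddings, it may be worth listing sources and confirming the ID before deleting.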

How Knowledge Integrates with AI

Knowledge base content is available to:
  1. Search: appears in unified search results alongside documents and workflows
  2. AI Chat: the rag_search tool queries knowledge embeddings to provide context
  3. MCP: external AI agents can search and access knowledge content