Document Intelligence Pipeline
Extract, chunk, embed, search, and ask questions over your documents in one command. No API keys are required for basic use.
Point Duct at a PDF, DOCX, Markdown, HTML, image, or plain text file — or a whole directory — and it extracts text, splits it into searchable chunks, and lets you query them instantly.
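To make the "splits it into searchable chunks" step concrete, here is a minimal sketch of one common approach: fixed-size word windows with overlap. It is an illustration only, not Duct's actual chunker; the function name `chunk_text` and the parameters `max_words` and `overlap` are hypothetical.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word windows that share `overlap` words with their neighbor.

    Hypothetical helper for illustration; Duct's real chunking strategy may differ.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_text(sample, max_words=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated text in the index.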
BM25 keyword search works fully offline with zero configuration. Add OpenAI or Gemini for semantic vector search, hybrid retrieval, and full Q&A with source citations. Ollama is the default LLM provider for fully local Q&A.
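BM25 needs no model and no network access, which is why keyword search can run offline. The sketch below shows the standard Okapi BM25 scoring formula applied to a list of chunks, using only the Python standard library. It is an illustration under assumptions, not Duct's code: the names `tokenize` and `bm25_rank`, the simple regex tokenizer, and the defaults `k1=1.5` and `b=0.75` are all choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, chunks: list[str], k1: float = 1.5, b: float = 0.75):
    """Return (chunk_index, score) pairs for matching chunks, best first.

    Hypothetical helper for illustration; k1 and b are common defaults,
    not values taken from Duct.
    """
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    if n == 0:
        return []

    avgdl = sum(len(d) for d in docs) / n or 1.0  # guard against all-empty chunks
    # Document frequency: in how many chunks does each term appear?
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))

    scored = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative even for very common terms.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b normalizes by chunk length.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        if score > 0:
            scored.append((i, score))

    # Highest score first; ties broken by original chunk order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    chunks = [
        "the cat sat on the mat",
        "dogs chase cats in the park",
        "quarterly revenue grew",
    ]
    for index, score in bm25_rank("cat mat", chunks):
        print(index, round(score, 3), chunks[index])
```

Because the tokenizer does no stemming, "cat" does not match "cats" here. That kind of exact-term limitation is the gap semantic vector search and hybrid retrieval are meant to cover.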