Description:
In this AI Engineer 2026 session, Ash, a developer relations engineer at IBM, introduces OpenRAG — an open-source RAG stack assembled from three existing projects: Docling (IBM Research, document processing), OpenSearch (search indexing and vector storage), and LangFlow (visual drag-and-drop orchestration). The talk is framed as a response to “RAG is dead” claims, arguing instead that RAG is genuinely hard to do well and that teams need a high-quality, extensible baseline to build from.
Docling handles document ingestion across PDFs, HTML, Word, slides, audio, and video. For PDFs specifically, it offers two pipelines: a standard pipeline that uses small, focused models for layout analysis, table extraction, and OCR, and a newer VLM pipeline that uses IBM’s 258-million-parameter Granite-Docling vision-language model for all-in-one extraction. Output is converted to a hierarchical intermediate format (DocTags) that feeds into chunking and embedding.

On the search side, OpenRAG defaults to the JVector KNN plugin for OpenSearch, which supports live indexing and a disk-based architecture so the full vector index doesn’t need to fit in memory — a meaningful scaling advantage over HNSW.
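To make the k-NN setup concrete, here is a rough sketch of what an OpenSearch index mapping and query for vector search look like. The field name, dimension, and the `disk_ann`/`jvector` method identifiers are illustrative assumptions, not taken from the talk; consult the JVector plugin docs for the exact mapping parameters:

```json
{
  "settings": { "index.knn": true },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 3,
        "method": { "name": "disk_ann", "engine": "jvector" }
      }
    }
  }
}
```

A query against that field then supplies a query vector and a `k`:

```json
{ "query": { "knn": { "embedding": { "vector": [0.12, -0.03, 0.88], "k": 5 } } } }
```

The `"engine"` value is what selects the disk-based JVector index over the default in-memory HNSW implementation.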
On the generation side, OpenRAG uses agentic retrieval via LangFlow rather than fixed top-K lookup: the agent decides how many searches to run and what to do with results, supporting models from OpenAI, Anthropic, and Ollama. Ash demonstrates the system running locally, showing tool calls and multi-step retrieval in action.
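The agentic-retrieval idea can be sketched in a few lines of Python. This is a stub, not LangFlow or OpenRAG code: the retriever stands in for an OpenSearch k-NN query, and the follow-up decision stands in for an LLM tool-calling step; all names and the toy corpus are hypothetical.

```python
# Minimal sketch of agentic retrieval: instead of one fixed top-k lookup,
# a loop lets the "agent" decide whether retrieved context is sufficient
# or whether to issue a refined follow-up search. In OpenRAG these roles
# are played by an OpenSearch index and an LLM orchestrated in LangFlow.

def search(query: str, k: int = 3) -> list[str]:
    """Stub retriever standing in for an OpenSearch k-NN query."""
    corpus = {
        "docling pipelines": ["standard pipeline", "VLM pipeline"],
        "vlm pipeline": ["Granite-Docling vision-language model"],
    }
    return corpus.get(query.lower(), [])[:k]

def agentic_retrieve(question: str, max_steps: int = 3) -> list[str]:
    """Iteratively search until the (stubbed) agent stops refining."""
    context: list[str] = []
    query = question
    for _ in range(max_steps):
        hits = search(query)
        context.extend(hits)
        # Stub "agent decision": follow up once if the first pass
        # surfaced a sub-topic worth drilling into.
        if "VLM pipeline" in hits and query != "vlm pipeline":
            query = "vlm pipeline"  # refined follow-up search
        else:
            break
    return context

print(agentic_retrieve("docling pipelines"))
# → ['standard pipeline', 'VLM pipeline', 'Granite-Docling vision-language model']
```

The point of the loop is that the number of searches is a runtime decision, not a fixed hyperparameter, which is what distinguishes agentic retrieval from plain top-K RAG.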
📺 Source: AI Engineer · Published April 08, 2026
🏷️ Format: Hands-On Build
