Dynamic Context Management for LLMs. Provides intelligent context window management through a 'dynamic RAG' system.
Injects the most relevant knowledge into every LLM request instead of replaying a static conversation history.
Hybrid search combining semantic, keyword, recency, and tag-based retrieval.
Built-in telemetry for tracking retrieval times, token usage, and cache hit rates.
Advanced reranking and multi-level caching for reduced latency and costs.
Supports multiple vector backends including In-Memory, ChromaDB, and FAISS.
Provider-agnostic via LiteLLM; supports OpenAI, Anthropic, and 100+ other providers.
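To make the hybrid search concrete, here is a minimal sketch of how semantic, keyword, recency, and tag signals could be blended into a single retrieval score. The function name, field names, and default weights are illustrative assumptions, not this library's actual API.

```python
import math

def hybrid_score(query_terms, query_tags, doc, now, weights=(0.5, 0.2, 0.2, 0.1)):
    """Combine four retrieval signals into one score (illustrative, not the real API)."""
    w_sem, w_kw, w_rec, w_tag = weights

    # Semantic: assume a precomputed embedding cosine similarity in [0, 1].
    semantic = doc["semantic_sim"]

    # Keyword: fraction of query terms that appear in the document text.
    text = doc["text"].lower()
    keyword = sum(t.lower() in text for t in query_terms) / max(len(query_terms), 1)

    # Recency: exponential decay of document age, with a one-day half-life.
    age_days = (now - doc["created_at"]) / 86400
    recency = math.exp(-math.log(2) * max(age_days, 0.0))

    # Tags: Jaccard overlap between the query's tags and the document's tags.
    d_tags, q_tags = set(doc.get("tags", [])), set(query_tags)
    tag = len(d_tags & q_tags) / max(len(d_tags | q_tags), 1)

    return w_sem * semantic + w_kw * keyword + w_rec * recency + w_tag * tag

# Usage: a fresh document matching on every signal scores the full 1.0.
now = 1_000_000.0
doc = {"semantic_sim": 1.0, "text": "vector search cache", "created_at": now, "tags": ["rag"]}
print(hybrid_score(["vector", "cache"], ["rag"], doc, now))  # → 1.0
```

In practice a real implementation would normalize each signal and tune the weights per workload; the weighted-sum shape is the common baseline for fusing heterogeneous retrieval scores.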