Give your AI a memory that never leaves your machine.
Offline-first semantic memory engine โ single binary, zero config, 30ms recall.
๐ฌ๐ง English ยท ๐ฎ๐ฉ Bahasa Indonesia
# Install (macOS, Linux, Windows)
curl -sSL codecora.dev/install | sh
# Store a memory with metadata
uteke remember "Deploy v2.1 to staging" --tags deploy,staging \
--entity staging-server --category infrastructure
# Hybrid search (vector + FTS5, ranked by RRF)
uteke recall "when do we deploy?"
# Stats
uteke statsThat's it. No API keys. No Docker. No Python. First run downloads the embedding model (~188MB) and you're good to go.
๐ Install options ยท Pre-built binaries ยท Docker ยท Full docs
Listens on localhost only by default. See Docker docs for auth setup.
# One-liner (model pre-baked in image)
docker run -d --name uteke -p 127.0.0.1:8767:8767 -v uteke-data:/data \
ghcr.io/codecoradev/uteke:latest
# Or docker compose
docker compose up -d๐ Docker docs ยท Compose file
AI agents forget everything between sessions. Uteke gives them persistent, searchable memory โ entirely offline, in one binary.
| Uteke | Mem0 | Letta | Zep | |
|---|---|---|---|---|
| Setup | Single binary | pip + Docker + Qdrant | pip + Docker + Postgres | pip + Docker + Neo4j |
| API keys needed | โ None | โ OpenAI/LLM key | โ LLM key | โ LLM key |
| Offline | โ Fully | โ Cloud embedding | โ Needs LLM server | โ Needs LLM + vector DB |
| Semantic search | โ Local ONNX + FTS5 hybrid | โ Cloud embedding | โ GraphRAG | |
| Full-text search | โ FTS5 built-in | โ | โ | |
| Recall speed | ~30ms (library) | Network round-trip | Network round-trip | Network round-trip |
| Privacy | โ Data never leaves machine | |||
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Apache 2.0 |
- ๐ง Hybrid Search โ Vector similarity + FTS5 full-text search, merged by Reciprocal Rank Fusion (RRF)
- ๐ Rooms โ Group memories by context (meetings, projects) with author attribution
- โณ Time-travel queries โ Recall memories as they existed at any point in time
- ๐ Pluggable embeddings โ Swap ONNX/OpenAI/Ollama backends via config
- ๐ท๏ธ Metadata Enrichment โ Tag, entity, category, and key:value metadata on every memory
- ๐ Relationship graph โ Link memories with typed edges (supersedes, contradicts, references)
- ๐ Smart decay โ Composite importance scoring, pin critical memories
- โก Recall cache โ LRU cache eliminates redundant embedding for repeated queries
- ๐ Benchmarks โ Built-in
uteke benchfor perf testing + LongMemEval retrieval harness - ๐ฅ Multi-Agent Namespaces โ Fully isolated memory per agent, zero overhead
- ๐ฅ๏ธ Server Mode โ Persistent daemon with ~42ms warm recall (75x faster than CLI)
- ๐ฅ Tiered Memory โ Hot/Warm/Cold tracking with auto-cleanup of stale memories
- ๐ Fully Offline โ Local ONNX embeddings (768d), no telemetry, no cloud, no API calls
- ๐ Embed Fallback โ Gracefully falls back to cloud API if local embedder fails; MockEmbedder for testing
- ๐ Batch Import โ Import entire directories (
--batch-dir) with auto-strategy routing (document vs. memory extraction) - ๐ฆ Single Binary โ Zero dependencies. No Docker, no database server, no Python, no API keys
- ๐ฅ Import/Export โ JSONL-based backup and restore
- ๐งฉ Memory Types โ Typed categories (fact, procedure, decision, etc.) with auto-inference
- ๐ Backlinks โ Bidirectional memory edges โ references are automatically reciprocal
- ๐ Timeline Events โ Chronological audit log per memory (created, updated, superseded)
- ๐ Salience + Recency โ Dual-axis recall boost by memory type and age
- ๐ Dream Cycle โ One-command maintenance pipeline (lint โ backlinks โ dedup โ orphans)
- ๐ Orphan Detection โ Find disconnected, low-importance memories for cleanup
- ๐ Citations โ Source attribution on every memory (URL, file, user, import)
- ๐ MCP Server โ JSON-RPC over stdio + Streamable HTTP transport
- ๐ Document Engine โ Wiki/knowledge base with
uteke doc create/get/listand auto-chunking - ๐ค Cosine Auto-Linking โ Automatically creates
similar_toedges between related memories - ๐ Graph API โ
GET /graphendpoint returns nodes + edges JSON for visualization - ๐ View-Only API Keys โ Read-only tokens for safe GET-only access to the server
- ๐ Markdown Chunker โ Splits documents by headings, respects code blocks and token limits
๐ Full documentation ยท CLI reference ยท Configuration
Hybrid search pipeline:
- HNSW (usearch) โ vector similarity, finds by meaning
- FTS5 (SQLite) โ full-text keyword search, finds by exact terms
- Reciprocal Rank Fusion (k=60) โ merges both ranked lists โ best of both worlds
- Local ONNX (EmbeddingGemma Q4, 768d) โ embeddings computed on-device, no API calls
Everything runs in-process. No network. No cloud. No server required (unless you want server mode).
cargo build --workspace # Build
cargo test --workspace # Test (327 unit tests)
cargo clippy -- -D warnings # Lint
cargo fmt # FormatSee CONTRIBUTING.md for the full contribution guide.
Apache License 2.0 โ use it, fork it, ship it.


