Backend resilience: LLM retry/backoff, OpenAI-compatible embeddings, connection guards by shivswami · Pull Request #50 · nikmcfly/MiroFish-Offline

shivswami · 2026-06-22T07:16:54Z

What

Improves backend robustness for local deployments (Ollama / LMStudio / Neo4j in Docker) where services are flaky or use non-Ollama providers.

Why

LLM rate-limits / timeouts mid-extraction aborted the whole build with no retry.
Only Ollama's embedding format was supported; OpenAI-compatible local servers (e.g. LMStudio) couldn't be used.
A failed batch embedding was logged as a warning and silently produced empty vectors, so vector search quietly stopped working.
A build that extracted 0 entities (e.g. the LLM returned no JSON) still reported "completed".

Changes

llm_client.py — retry transient failures (429 / 5xx / timeout / connection) with exponential backoff (2/4/8…s, capped at 60s), honoring Retry-After; non-retryable errors fail fast. Configurable via LLM_MAX_RETRIES.
embedding_service.py — support OpenAI-compatible /v1/embeddings endpoints alongside Ollama; format auto-detected from EMBEDDING_BASE_URL.
neo4j_storage.py — health_check() + pre-flight _verify_connection() before create_graph / add_text; batch-embedding failures now logged as errors (empty vectors silently break vector search).
graph.py — a build that extracts 0 entities now fails the task explicitly with an actionable message instead of showing "completed".
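
As a rough illustration of the retry policy described for llm_client.py, here is a minimal sketch, not the code in this PR. The names `LLMHTTPError`, `backoff_delay`, and `call_with_retry` are hypothetical. The default of 3 retries and the choice to cap a Retry-After value at 60s are assumptions; the description only states the 2/4/8…s schedule, the 60s cap, and that Retry-After is honored.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class LLMHTTPError(Exception):
    """Hypothetical error carrying an HTTP status and an optional Retry-After (seconds)."""

    def __init__(self, status: int, retry_after: Optional[float] = None):
        super().__init__(f"LLM request failed with status {status}")
        self.status = status
        self.retry_after = retry_after


def is_retryable(exc: Exception) -> bool:
    """429, 5xx, timeouts and connection errors are treated as transient."""
    if isinstance(exc, LLMHTTPError):
        return exc.status == 429 or 500 <= exc.status < 600
    return isinstance(exc, (TimeoutError, ConnectionError))


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 2.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (1-based): 2, 4, 8, ... capped at `cap`.

    Assumption: a server-provided Retry-After takes precedence but is also capped.
    """
    if retry_after is not None:
        return min(max(retry_after, 0.0), cap)
    return min(base * 2 ** (attempt - 1), cap)


def call_with_retry(fn: Callable[[], T], max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying transient failures; non-retryable errors propagate at once."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc) or attempt >= max_retries:
                raise
            attempt += 1
            sleep(backoff_delay(attempt, getattr(exc, "retry_after", None)))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in the actual client the retry limit would come from LLM_MAX_RETRIES.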

Notes / limitations

Embedding format detection is heuristic (port :11434 → Ollama; /v1 in path → OpenAI). A non-default Ollama port behind a /v1 proxy could misdetect.
_verify_connection adds one RETURN 1 round-trip before each add_text.
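
To make the detection caveat concrete, here is a hedged sketch of a heuristic along the lines described above. The function name `detect_embedding_format`, the order of the two checks, and the fallback to Ollama are all assumptions; the PR text only states the two rules (port :11434 → Ollama; /v1 in path → OpenAI). The sketch also assumes the URL includes a scheme such as `http://`.

```python
from urllib.parse import urlparse


def detect_embedding_format(base_url: str) -> str:
    """Guess the embedding API format from a base URL (hypothetical helper)."""
    parsed = urlparse(base_url)
    # Rule 1: the default Ollama port.
    if parsed.port == 11434:
        return "ollama"
    # Rule 2: a "v1" path segment suggests an OpenAI-compatible server.
    if "v1" in parsed.path.strip("/").split("/"):
        return "openai"
    # Assumed fallback: Ollama, the only format supported before this change.
    return "ollama"
```

Under these assumptions, an Ollama instance exposed as `http://gateway.local:8080/v1` would be classified as OpenAI-compatible, which is the misdetection case noted above.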

Tested

Module imports verified (backend/venv).
Runtime behavior against Ollama / LMStudio / Neo4j confirmed locally; happy to add repro steps.

…guards
- llm_client: retry transient failures (429/5xx/timeout/connection) with exponential backoff honoring Retry-After; non-retryable errors fail fast
- embedding_service: support OpenAI-compatible /v1/embeddings servers (e.g. LMStudio) alongside Ollama, auto-detected from EMBEDDING_BASE_URL
- neo4j_storage: verify Neo4j reachable before create_graph/add_text; log batch-embedding failures as errors (empty vectors silently break vector search)
- graph build: fail the task explicitly when NER extracts 0 entities instead of reporting a misleading "completed" status

shivswami wants to merge 1 commit into nikmcfly:main from shivswami:pr1-backend-resilience
