Eleven-hash/README.md

👋 Hey, I'm Alan

class AlanMonsonChacko:
    role     = "AI / ML Engineer"
    company  = "PropMarker, UK (Remote)"
    location = "Kerala, India 🇮🇳"
    
    def what_i_build(self):
        return [
            "Production RAG chatbots",
            "Gradient-boosted ML pipelines",
            "Agentic LLM systems",
            "Enterprise NLP engines",
        ]
    
    def looking_for(self):
        return "Remote AI/ML role — available NOW"


📈 Production metrics — real numbers, live systems

| Metric | Value | System |
| --- | --- | --- |
| 🏆 Holdout ROC-AUC | 0.8628 | UK Sale Propensity Model |
| QA pass rate | 76.9% → 100% | Enterprise RAG Chatbot |
| 🛡️ Runtime crashes | 0% | NLP Tagging Engine (prod) |
| SHAP inference | < 100ms | FastAPI explainability endpoint |
| 📦 Records processed | 149,000+ | Land Registry + ONS + EPC + IMD |
| 🔒 Hallucination rate | 0% | RAG chatbot (strict prompt masks) |
| 🧩 Semantic chunks | 452 | From 42 proprietary documents |
| 🏷️ Tags extracted | 50+ | Per property listing, structured JSON |

🚀 Projects in production

🏠 UK Real Estate Sale Propensity Platform · ROC-AUC 0.8628

Problem: PropMarker needed to rank 149,000+ UK properties by 12-month sale likelihood — manual scoring was impossible at scale.

What I built:

  • 🎯 Tree-ensemble pipeline: gradient-boosted models (XGBoost, LightGBM, CatBoost) plus a Random Forest
  • 🔢 Feature engineering across 6 heterogeneous datasets, including Land Registry (149k transactions), ONS Census, EPC ratings, and IMD deprivation indices
  • 🔍 Temporal cross-validation strategy with zero future-data leakage
  • SHAP explainability served via FastAPI — feature contributions in <100ms
  • 🤖 Optuna Bayesian HPO — training in <10 seconds on 20k-row downsampled sets
  • 📊 Index-adjusted ECV algorithm projecting current property values from HPI records
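
The temporal cross-validation bullet above can be illustrated with an expanding-window split, similar in spirit to scikit-learn's `TimeSeriesSplit`. This is a minimal, dependency-free sketch and not the project's actual code: the function name `expanding_window_splits`, the fold count, and the sample dates are all assumptions made for illustration.

```python
from datetime import date


def expanding_window_splits(dates, n_folds=3):
    """Yield (train_idx, test_idx) pairs in which no training row is dated
    after any test row, so a model never trains on the future."""
    # Row indices sorted chronologically.
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    fold_size = len(order) // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough rows for the requested number of folds")
    for k in range(1, n_folds + 1):
        train = order[: k * fold_size]
        # The last fold absorbs any leftover rows.
        end = (k + 1) * fold_size if k < n_folds else len(order)
        test = order[k * fold_size : end]
        yield train, test


# Hypothetical sale dates, deliberately out of order.
sale_dates = [date(2021, m, 1) for m in (5, 1, 8, 3, 7, 2, 6, 4)]
splits = list(expanding_window_splits(sale_dates, n_folds=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each fold trains on everything up to a cut-off and validates on the block that follows, so the training window grows over time.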

Stack: Python · XGBoost · LightGBM · CatBoost · Optuna · SHAP · FastAPI · SQLite · scikit-learn · pandas
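
The index-adjusted ECV bullet above is not spelled out in this README. A common way to do this kind of adjustment is to scale the last sale price by the ratio of the current index to the index at the time of sale. The sketch below assumes that approach, and assumes "ECV" means an estimated current value; the function name and the numbers are illustrative only.

```python
def index_adjusted_value(last_sale_price, hpi_at_sale, hpi_now):
    """Project a past sale price to today by scaling it with the ratio of
    the current House Price Index to the index at the time of sale."""
    if hpi_at_sale <= 0 or hpi_now <= 0:
        raise ValueError("index values must be positive")
    return last_sale_price * (hpi_now / hpi_at_sale)


# Illustrative numbers only: sold for 200,000 when the index stood at 100,
# and the index is now 125.
estimate = index_adjusted_value(200_000, hpi_at_sale=100.0, hpi_now=125.0)
print(estimate)  # 250000.0
```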

| Metric | Value |
| --- | --- |
| Holdout ROC-AUC | 0.8628 |
| Training time (Optuna) | < 10 seconds |
| SHAP inference | < 100ms / prediction |
| Data leakage | 0% |
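
For context on the headline metric: ROC-AUC is the probability that a randomly chosen positive example (here, a property that sold) gets a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison on made-up labels and scores. It is an illustration of the metric, not the evaluation code behind the 0.8628 figure, and the pairwise loop is only practical for small samples.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked
    correctly, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up holdout labels (1 = sold within 12 months) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 4))  # 0.8333 (5 of 6 pairs ranked correctly)
```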

🤖 Enterprise RAG Conversational Chatbot · 100% QA pass rate

Problem: 42 proprietary platform documents were unsearchable — support staff wasted hours finding answers manually.

What I built:

  • 📄 Document ingestion pipeline — 452 semantic chunks via RecursiveCharacterTextSplitter with custom Markdown separators
  • 🧠 Local BAAI/bge-m3 embeddings (1024-dim) on CPU or GPU: documents are embedded locally, so no text is sent to a third-party embedding API and embedding carries no API cost
  • 🔄 History-aware pre-retriever reformulating follow-up questions from conversation history
  • 🚫 Strict negative-control prompt masks — absolute context reliance, zero hallucinations
  • 📎 Automated citation compiler — appends exact file sources and text snippets to every answer
  • 📱 Deployed as Streamlit + Gradio interfaces with full auditability
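
To show the idea behind the chunking step, here is a simplified, dependency-free stand-in for recursive splitting with Markdown-aware separators. It is not LangChain's `RecursiveCharacterTextSplitter` implementation, and the separator list, chunk size, and sample document are assumptions chosen for the example.

```python
def recursive_split(text, separators, chunk_size):
    """Split on the coarsest separator first, merge neighbouring pieces while
    they fit, and recurse with finer separators on pieces that are too long."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            chunks.extend(recursive_split(piece, finer, chunk_size))
    if current:
        chunks.append(current)
    # Simplification: a separator that falls on a chunk boundary is dropped.
    return [c for c in chunks if c.strip()]


# Hypothetical separators: Markdown headings first, then paragraphs, lines, words.
MARKDOWN_SEPARATORS = ["\n## ", "\n\n", "\n", " "]

doc = (
    "## Setup\nInstall the app.\n\n"
    "## Usage\nRun the app daily.\n\n"
    "## Support\nEmail the team."
)
chunks = recursive_split(doc, MARKDOWN_SEPARATORS, chunk_size=40)
for chunk in chunks:
    print(repr(chunk))
```

Putting heading separators first keeps each section together whenever it fits in one chunk, which tends to give retrieval more coherent context.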

Stack: LangChain · ChromaDB · HuggingFace · BAAI/bge-m3 · OpenAI API · Streamlit · Gradio · Python
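
The citation compiler described in the list above could look roughly like the following. This is a sketch under assumptions: the retrieved chunks are modelled as plain dicts with hypothetical `source` and `text` keys, and the output layout is invented for illustration rather than taken from the deployed chatbot.

```python
def append_citations(answer, retrieved, snippet_len=80):
    """Append one numbered source line per distinct file, each with a short
    snippet of the first chunk retrieved from that file."""
    lines, seen = [], set()
    for doc in retrieved:
        source = doc["source"]
        if source in seen:
            continue  # cite each file once
        seen.add(source)
        # Collapse whitespace so the snippet stays on one line.
        snippet = " ".join(doc["text"].split())[:snippet_len]
        lines.append(f'[{len(lines) + 1}] {source}: "{snippet}"')
    if not lines:
        return answer
    return answer + "\n\nSources:\n" + "\n".join(lines)


# Hypothetical retrieval results.
retrieved = [
    {"source": "billing.md", "text": "Invoices are  issued\nmonthly."},
    {"source": "billing.md", "text": "Late fees apply after 30 days."},
    {"source": "faq.md", "text": "Contact support."},
]
cited = append_citations("Invoices are issued monthly.", retrieved)
print(cited)
```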

| Metric | Value |
| --- | --- |
| QA pass rate | 76.9% → 100% |
| Hallucination rate | 0% |
| Documents indexed | 42 (452 chunks) |
| Data privacy | 100% local embeddings |

🏷️ LLM Semantic NLP Tagging Engine · 50+ tags · 0% crashes

Problem: UK property listings contained unstructured text that needed 50+ structured tags extracted reliably at scale.

What I built:

  • 🏗️ Type-safe Pydantic schema mapping LLM output to binary feature flags (0/1) with citation strings
  • 🛡️ Rate-limit handling + graceful all-zero fallback — 0% runtime crashes in continuous production
  • 🎯 Prompt-level disambiguation guardrails — e.g. distinguishing Notice of Offer vs In Receipt of Offer
  • 📊 Visual evaluation dashboard — TP/FP/FN colour-coding, live API cost tracking (USD)
  • 🔄 Dual-model architecture — GPT-4o-mini + Gemini 2.0 Flash for cost/performance tradeoffs
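
The rate-limit handling and all-zero fallback from the list above can be sketched as follows. The real engine uses Pydantic for validation; to stay dependency-free this example validates a plain dict instead. The tag names, the `RateLimitError` class, and the `call_llm` callable are hypothetical stand-ins, not the project's API.

```python
import json
import time

TAGS = ("garden", "garage", "chain_free")  # hypothetical tag names


class RateLimitError(Exception):
    """Stand-in for whatever rate-limit exception the LLM client raises."""


def all_zero():
    return {tag: 0 for tag in TAGS}


def parse_tags(raw):
    """Validate raw LLM output as a JSON object of 0/1 flags.
    Tags missing from the output default to 0."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    tags = {}
    for tag in TAGS:
        value = data.get(tag, 0)
        if type(value) is not int or value not in (0, 1):
            raise ValueError(f"{tag} must be 0 or 1")
        tags[tag] = value
    return tags


def tag_listing(text, call_llm, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Return validated tags, or an all-zero result if the call keeps being
    rate-limited or the output cannot be validated."""
    for attempt in range(max_retries):
        try:
            return parse_tags(call_llm(text))
        except RateLimitError:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
        except ValueError:
            break  # malformed output: stop retrying
    return all_zero()


# Usage with a fake LLM that returns valid JSON.
result = tag_listing("Lovely garden, no chain.", lambda _t: '{"garden": 1, "chain_free": 1}')
print(result)  # {'garden': 1, 'garage': 0, 'chain_free': 1}
```

Returning all zeros means a failed extraction yields no tags instead of an exception, at the cost of some missed tags, so fallback events are worth logging.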

Stack: LangChain · Pydantic · GPT-4o-mini · Gemini 2.0 Flash · Python · HTML/CSS


🔍 AI Job Search Automation Tool · Claude API · 4-stage agentic pipeline

Problem: Job seekers waste hours manually tailoring resumes, writing cover letters, and preparing for interviews.

What I built:

  • 📊 Stage 1 — ATS Scorer: Keyword match analysis, gap identification, score 0–100
  • ✍️ Stage 2 — Resume Tailorer: Rewrites bullets and summary using JD language
  • 📝 Stage 3 — Cover Letter Generator: Personalised with candidate metrics + company context
  • 🎤 Stage 4 — Interview Coach: Mock Q&A graded 1–10 with actionable improvement feedback
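
In the tool itself, Stage 1 is driven by the Claude API. As a rough illustration of what keyword matching and gap identification mean, here is a deterministic keyword-overlap stand-in written in plain Python. The stopword list, the tokenisation, and the scoring formula are all assumptions and do not reflect the tool's actual prompts or scoring.

```python
import re

# Small illustrative stopword list; a real scorer would use a fuller one.
STOPWORDS = {"and", "with", "the", "for", "that", "this", "are", "you"}


def keywords(text):
    """Lowercased alphanumeric tokens longer than two characters, minus stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if len(t) > 2 and t not in STOPWORDS}


def ats_score(resume, job_description):
    """Return (score 0-100, missing keywords) based on how many
    job-description keywords also appear in the resume."""
    wanted = keywords(job_description)
    if not wanted:
        return 0, []
    have = keywords(resume)
    missing = sorted(wanted - have)
    score = round(100 * len(wanted & have) / len(wanted))
    return score, missing


jd = "Python engineer with LangChain and FastAPI experience"
resume = "Built FastAPI services in Python"
score, gaps = ats_score(resume, jd)
print(score, gaps)  # 40 ['engineer', 'experience', 'langchain']
```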

Stack: Claude API · React · Anthropic · Structured JSON · Prompt Engineering


🛠️ Full tech stack

🤖 LLM & Agentic AI

LangChain · LangGraph · OpenAI · Gemini · HuggingFace · Claude · GPT-4o

🔍 RAG & Vector Databases

ChromaDB · BAAI · RAG · Embeddings

📊 ML & Ensemble Modeling

XGBoost · LightGBM · CatBoost · Optuna · SHAP · scikit-learn · TensorFlow · Keras

⚙️ Backend & APIs

Python · FastAPI · Pydantic · SQLite · Streamlit · Gradio · REST API

☁️ Cloud & Data Engineering

AWS · Azure · Databricks · Apache Spark · Delta Lake · PySpark · Airflow · dbt

👁️ Computer Vision & Automation

YOLO · OpenCV · n8n · Relevance AI


🎯 Right now

alan = {
    "🔭 working_on"  : "Production AI systems @ PropMarker UK",
    "🌱 learning"    : ["FastAPI + Docker deployment", "MCP servers", "vLLM"],
    "👯 open_to"     : "Remote AI/ML collaborations",
    "💬 ask_me"      : "LangChain · RAG · XGBoost · Prompt Engineering",
    "📫 reach_me"    : "alanmonson44@gmail.com",
    "⚡ fun_fact"    : "My RAG chatbot has a 0% hallucination rate in prod 🎯",
    "🚀 available"   : True,   # hire me!
}

💡 My engineering philosophy

┌─────────────────────────────────────────────────┐
│                                                 │
│   "If it's not in production, it doesn't count" │
│                                                 │
│   Every project I build ships to real users,    │
│   handles real data, and has real metrics.      │
│                                                 │
└─────────────────────────────────────────────────┘

⭐ Star a repo if it helped you   •   🤝 Open to collabs & remote roles   •   📧 alanmonson44@gmail.com


Pinned

  1. AI-Chatbot-24-7-1 (Public, Python)
  2. Data-Intelligence-Suite (Public, Python)
  3. Local-RAG-with-Ollama (Public, Python)
  4. Enterprise-RAG-Chatbot (Public, Jupyter Notebook): Production-ready Enterprise RAG Chatbot leveraging LangChain, advanced document indexing, and semantic search retrieval.
  5. Semantic-Property-Key-Tag-Tagger-Evaluator (Public, HTML): Enterprise-grade LangChain & Pydantic-powered engine for semantic key-tag extraction from UK residential property descriptions.