Alan Monson Chacko Eleven-hash

👋 Hey, I'm Alan

class AlanMonsonChacko:
    role     = "AI / ML Engineer"
    company  = "PropMarker, UK (Remote)"
    location = "Kerala, India 🇮🇳"
    
    def what_i_build(self):
        return [
            "Production RAG chatbots",
            "Gradient-boosted ML pipelines",
            "Agentic LLM systems",
            "Enterprise NLP engines",
        ]
    
    def looking_for(self):
        return "Remote AI/ML role — available NOW"

📈 Production metrics — real numbers, live systems

	Metric	Value	System
🏆	Holdout ROC-AUC	0.8628	UK Sale Propensity Model
✅	QA pass rate	76.9% → 100%	Enterprise RAG Chatbot
🛡️	Runtime crashes	0%	NLP Tagging Engine (prod)
⚡	SHAP inference	< 100ms	FastAPI explainability endpoint
📦	Records processed	149,000+	Land Registry + ONS + EPC + IMD
🔒	Hallucination rate	0%	RAG chatbot (strict prompt masks)
🧩	Semantic chunks	452	From 42 proprietary documents
🏷️	Tags extracted	50+	Per property listing, structured JSON

🚀 Projects in production

🏠 UK Real Estate Sale Propensity Platform — ROC-AUC 0.8628

Problem: PropMarker needed to rank 149,000+ UK properties by 12-month sale likelihood — manual scoring was impossible at scale.

What I built:

🎯 Gradient-boosted ensemble pipeline (XGBoost + LightGBM + CatBoost + Random Forest)
🔢 Feature engineering across 6 heterogeneous datasets, including Land Registry (149k transactions), ONS Census, EPC ratings, and IMD deprivation indices
🔍 Temporal cross-validation strategy with zero future-data leakage
⚡ SHAP explainability served via FastAPI — feature contributions in <100ms
🤖 Optuna Bayesian HPO — training in <10 seconds on 20k-row downsampled sets
📊 Index-adjusted ECV algorithm projecting current property values from HPI records
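The temporal cross-validation bullet above can be illustrated with a minimal sketch: an expanding-window split in plain Python, where each fold trains only on rows dated no later than its test rows. The function name, fold count, and sample dates are assumptions for illustration, not the PropMarker pipeline itself.

```python
from datetime import date


def expanding_window_splits(dates, n_folds):
    """Yield (train_idx, test_idx) pairs in chronological order.

    Each fold trains on everything before its test window, so no
    training row is dated later than any test row in the same fold.
    """
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    # Row indices sorted by date, oldest first.
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    fold_size = len(order) // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough rows for the requested number of folds")
    for k in range(1, n_folds + 1):
        train_idx = order[: k * fold_size]
        test_idx = order[k * fold_size : (k + 1) * fold_size]
        yield train_idx, test_idx


# Hypothetical sale dates, deliberately out of order.
sale_dates = [date(2020, m, 1) for m in (5, 1, 9, 3, 7, 2, 10, 4, 8, 6)]
splits = list(expanding_window_splits(sale_dates, n_folds=4))
```

Rows that share the same date can still land on both sides of a fold boundary, so a real pipeline would likely split on date cut-offs rather than row counts.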

Stack: Python · XGBoost · LightGBM · CatBoost · Optuna · SHAP · FastAPI · SQLite · scikit-learn · pandas

Metric	Value
Holdout ROC-AUC	0.8628
Training time (Optuna)	< 10 seconds
SHAP inference	< 100ms / prediction
Data leakage	0%
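
The index-adjusted valuation idea from the list above can be sketched as a simple ratio: scale the last sale price by how much a house price index has moved since the sale. This is an assumed, simplified form; the actual ECV algorithm is not shown here, and the function name and numbers are hypothetical.

```python
def index_adjusted_value(sale_price, hpi_at_sale, hpi_now):
    """Project a past sale price to today using the ratio of two HPI readings."""
    if hpi_at_sale <= 0:
        raise ValueError("hpi_at_sale must be positive")
    return sale_price * hpi_now / hpi_at_sale


# Illustrative numbers only: sold for 200,000 when the index stood at 120,
# and the index now reads 150.
projected = index_adjusted_value(200_000, hpi_at_sale=120, hpi_now=150)
print(projected)  # 250000.0
```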

🤖 Enterprise RAG Conversational Chatbot — 100% QA pass rate

Problem: 42 proprietary platform documents were unsearchable — support staff wasted hours finding answers manually.

What I built:

📄 Document ingestion pipeline — 452 semantic chunks via RecursiveCharacterTextSplitter with custom Markdown separators
🧠 Local BAAI/bge-m3 embeddings (1024-dim) on CPU/GPU, so documents are embedded locally with zero third-party embedding API cost
🔄 History-aware pre-retriever reformulating follow-up questions from conversation history
🚫 Strict negative-control prompt masks — absolute context reliance, zero hallucinations
📎 Automated citation compiler — appends exact file sources and text snippets to every answer
📱 Deployed as Streamlit + Gradio interfaces with full auditability
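
To show the idea behind the chunking step in the list above, here is a simplified, standard-library stand-in for recursive character splitting with Markdown-style separators. It is not LangChain's `RecursiveCharacterTextSplitter`: it drops the separator at chunk boundaries, has no overlap, and the separators and size limit are assumptions.

```python
def recursive_split(text, separators, max_chars):
    """Split on the coarsest separator first, recursing into finer ones
    only for pieces that are still longer than max_chars."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # Nothing left to split on: fall back to fixed-width slices.
        return [text[i : i + max_chars] for i in range(0, len(text), max_chars)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small pieces together
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= max_chars:
            current = piece
        else:
            chunks.extend(recursive_split(piece, finer, max_chars))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks


# Hypothetical document; headings are tried before newlines and spaces.
doc = (
    "# Title\nIntro text.\n"
    "## Section A\nAlpha alpha alpha.\n"
    "## Section B\nBeta beta beta beta beta beta."
)
chunks = recursive_split(doc, ["\n## ", "\n", " "], max_chars=40)
```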

Stack: LangChain · ChromaDB · HuggingFace · BAAI/bge-m3 · OpenAI API · Streamlit · Gradio · Python

Metric	Value
QA pass rate	76.9% → 100%
Hallucination rate	0%
Documents indexed	42 (452 chunks)
Data privacy	100% local embeddings
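
The citation step described in the list above could look roughly like the sketch below, which appends each distinct source file and a short snippet to the answer text. The chunk fields (`source`, `text`), the output layout, and the sample data are hypothetical; the production compiler may differ.

```python
def append_citations(answer, retrieved_chunks, snippet_chars=80):
    """Return the answer followed by a de-duplicated list of sources and snippets."""
    lines = [answer, "", "Sources:"]
    seen = set()
    for chunk in retrieved_chunks:
        key = (chunk["source"], chunk["text"])
        if key in seen:
            continue  # the same chunk can be retrieved more than once
        seen.add(key)
        snippet = chunk["text"][:snippet_chars].strip()
        lines.append(f'- {chunk["source"]}: "{snippet}"')
    return "\n".join(lines)


# Hypothetical retrieval results.
retrieved = [
    {"source": "billing.md", "text": "Invoices are issued on the first of each month."},
    {"source": "billing.md", "text": "Invoices are issued on the first of each month."},
    {"source": "accounts.md", "text": "Admins can add users from the settings page."},
]
cited = append_citations("Invoices go out monthly.", retrieved)
```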

🏷️ LLM Semantic NLP Tagging Engine — 50+ tags · 0% crashes

Problem: UK property listings contained unstructured text that needed 50+ structured tags extracted reliably at scale.

What I built:

🏗️ Type-safe Pydantic schema mapping LLM output to binary feature flags (0/1) with citation strings
🛡️ Rate-limit handling + graceful all-zero fallback — 0% runtime crashes in continuous production
🎯 Prompt-level disambiguation guardrails — e.g. distinguishing Notice of Offer vs In Receipt of Offer
📊 Visual evaluation dashboard — TP/FP/FN colour-coding, live API cost tracking (USD)
🔄 Dual-model architecture — GPT-4o-mini + Gemini 2.0 Flash for cost/performance tradeoffs
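
The schema-plus-fallback pattern from the list above can be sketched with the standard library. This uses a `dataclass` as a stand-in for the Pydantic model, covers only the all-zero fallback (not rate-limit handling), and the tag names and JSON shape are assumptions for illustration.

```python
import json
from dataclasses import dataclass, field

# Hypothetical tag names; the real engine extracts 50+.
TAG_NAMES = ("has_garden", "has_parking", "chain_free")


@dataclass
class ListingTags:
    flags: dict = field(default_factory=lambda: {name: 0 for name in TAG_NAMES})
    citations: dict = field(default_factory=dict)


def parse_tags(raw_llm_output):
    """Validate LLM JSON into 0/1 flags; return all zeros on any malformed output."""
    try:
        payload = json.loads(raw_llm_output)
        flags, citations = {}, {}
        for name in TAG_NAMES:
            entry = payload[name]
            value = entry["value"]
            if isinstance(value, bool) or value not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1")
            flags[name] = int(value)
            citations[name] = str(entry.get("citation", ""))
        return ListingTags(flags, citations)
    except (KeyError, TypeError, ValueError, AttributeError):
        # json.JSONDecodeError is a ValueError subclass, so bad JSON lands here too.
        return ListingTags()


good = json.dumps({
    "has_garden": {"value": 1, "citation": "south-facing garden"},
    "has_parking": {"value": 0},
    "chain_free": {"value": 1, "citation": "no onward chain"},
})
parsed = parse_tags(good)
fallback = parse_tags("Sorry, I cannot help with that.")
```

Returning a zeroed object instead of raising keeps a batch job running, at the cost of hiding failures, so a real system would presumably log each fallback.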

Stack: LangChain · Pydantic · GPT-4o-mini · Gemini 2.0 Flash · Python · HTML/CSS

🔍 AI Job Search Automation Tool — Claude API · 4-stage agentic pipeline

Problem: Job seekers waste hours manually tailoring resumes, writing cover letters, and preparing for interviews.

What I built:

📊 Stage 1 — ATS Scorer: Keyword match analysis, gap identification, score 0–100
✍️ Stage 2 — Resume Tailorer: Rewrites bullets and summary using JD language
📝 Stage 3 — Cover Letter Generator: Personalised with candidate metrics + company context
🎤 Stage 4 — Interview Coach: Mock Q&A graded 1–10 with actionable improvement feedback
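
As a rough illustration of the Stage 1 idea above (keyword match, gap list, 0 to 100 score), here is a plain-Python sketch based on set overlap. The tool itself uses the Claude API, so this is only an assumed baseline; the stopword list, function names, and sample texts are hypothetical.

```python
import re

# Small illustrative stopword list, not an exhaustive one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "or", "the", "to", "with"}


def keywords(text):
    """Lowercase the text and return its distinct non-stopword tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower())) - STOPWORDS


def ats_score(resume, job_description):
    """Return (score 0-100, sorted list of job keywords missing from the resume)."""
    wanted = keywords(job_description)
    if not wanted:
        return 0, []
    matched = wanted & keywords(resume)
    score = round(100 * len(matched) / len(wanted))
    return score, sorted(wanted - matched)


score, gaps = ats_score(
    resume="Built FastAPI services in Python with SQL storage",
    job_description="Python FastAPI Docker SQL",
)
print(score, gaps)  # 75 ['docker']
```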

Stack: Claude API · React · Anthropic · Structured JSON · Prompt Engineering

🛠️ Full tech stack

🤖 LLM & Agentic AI · 🔍 RAG & Vector Databases · 📊 ML & Ensemble Modeling · ⚙️ Backend & APIs · ☁️ Cloud & Data Engineering · 👁️ Computer Vision & Automation

🎯 Right now

alan = {
    "🔭 working_on"  : "Production AI systems @ PropMarker UK",
    "🌱 learning"    : ["FastAPI + Docker deployment", "MCP servers", "vLLM"],
    "👯 open_to"     : "Remote AI/ML collaborations",
    "💬 ask_me"      : "LangChain · RAG · XGBoost · Prompt Engineering",
    "📫 reach_me"    : "alanmonson44@gmail.com",
    "⚡ fun_fact"    : "My RAG chatbot has a 0% hallucination rate in prod 🎯",
    "🚀 available"   : True,   # hire me!
}

💡 My engineering philosophy

┌─────────────────────────────────────────────────┐
│                                                 │
│   "If it's not in production, it doesn't count" │
│                                                 │
│   Every project I build ships to real users,    │
│   handles real data, and has real metrics.      │
│                                                 │
└─────────────────────────────────────────────────┘

⭐ Star a repo if it helped you • 🤝 Open to collabs & remote roles • 📧 alanmonson44@gmail.com
