Final-year B.Tech. Information Technology student at VIT Vellore specializing in ML pipelines, data engineering, and cloud computing. Experienced in building end-to-end data workflows, deploying ML applications, and developing cloud-native solutions using AWS, with foundational exposure to Azure and Docker.
Languages & Web: Java, Python, SQL, HTML, CSS
Cloud: AWS (Glue, Athena, Lambda, EC2, S3, SageMaker, IAM), Azure (Functions, Event Grid, Application Insights, Log Analytics, Azure Monitor, Blob Storage)
Backend & Databases: Express.js, FastAPI, PostgreSQL
Frameworks & Libraries: Flutter, Pandas
Tools: Power BI, Postman, Git, Android Studio
Certifications: AWS Certified Cloud Practitioner - Credential, Generative AI using IBM Watsonx - Credential
Production-grade ML pipeline (including ETL) that ingests flight-delay data, implements a Medallion Architecture (Bronze-Silver-Gold) on S3, queries the data through AWS Glue and Athena, and exposes it in Power BI for reliable analytics alongside real-time inference.
Highlights
- Medallion Architecture: Structured Bronze, Silver, and Gold layers on S3 for progressive data refinement and quality control
- Cost-Efficient Querying: Athena CTAS/UNLOAD writing partitioned Parquet for faster and cheaper queries
- Modeling & Serving: XGBoost model deployed via FastAPI, containerized with Docker and served through Nginx on EC2
- Analytics Integration: Gold layer exposed to Power BI via Athena (ODBC) for self-service BI
- Serverless Variant
Tech Stack
AWS S3, Athena, EC2, XGBoost, FastAPI, Docker, Nginx, Power BI
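As a rough illustration of the cost-efficient querying highlight, the sketch below builds an Athena CTAS statement that writes partitioned Parquet. The helper `build_ctas`, the table names, the column names, and the S3 path are all hypothetical and are not taken from the project; only the CTAS properties (`format`, `external_location`, `partitioned_by`) are standard Athena options.

```python
def build_ctas(target_table, source_table, columns, partition_cols, s3_location):
    """Return an Athena CTAS statement that writes partitioned Parquet.

    All arguments are illustrative placeholders, not the project's real names.
    """
    # Athena expects partition columns to come last in the SELECT list,
    # in the same order as they appear in partitioned_by.
    select_cols = [c for c in columns if c not in partition_cols] + list(partition_cols)
    partitions = ", ".join(f"'{c}'" for c in partition_cols)
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{s3_location}',\n"
        f"  partitioned_by = ARRAY[{partitions}]\n"
        f")\n"
        f"AS SELECT {', '.join(select_cols)}\n"
        f"FROM {source_table}"
    )


# Hypothetical Silver-to-Gold promotion, partitioned by year and month.
sql = build_ctas(
    target_table="gold.flight_delays",
    source_table="silver.flights",
    columns=["year", "carrier", "month", "dep_delay"],
    partition_cols=["year", "month"],
    s3_location="s3://example-bucket/gold/flight_delays/",
)
print(sql)
```

Partitioning on columns that queries commonly filter by lets Athena skip whole S3 prefixes, and Parquet's columnar layout means only the selected columns are read, which is where the lower scan cost comes from.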
ML-powered cybersecurity pipeline that analyzes PCAP files using Zeek, engineers behavioral DNS features, and detects covert DNS exfiltration through a Random Forest classifier. Containerized with Docker and deployed for browser-based analysis.
Highlights
- Packet Analysis: Extracts DNS telemetry from PCAP files using Zeek for deep packet inspection
- Behavioral Detection: Engineers 11 DNS features and classifies tunneling, DGA callbacks, and encoded payloads using a Random Forest model
- Threat Scoring: Prioritizes suspicious queries using a composite severity score based on ML confidence, entropy, and subdomain depth
- Containerized Deployment: Dockerized the complete detection pipeline and exposed it through a FastAPI endpoint
Tech Stack
Zeek, FastAPI, Python, scikit-learn, Docker
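The threat-scoring idea above can be sketched as a weighted blend of ML confidence, subdomain entropy, and subdomain depth. This is a minimal sketch, not the project's actual formula: the weights, the normalization caps (`max_entropy`, `max_depth`), and the assumption that the registered domain is the last two labels are all illustrative choices.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def subdomain_depth(query: str, registered_labels: int = 2) -> int:
    """Number of labels left of the registered domain (assumed: last two labels)."""
    labels = [p for p in query.strip(".").split(".") if p]
    return max(len(labels) - registered_labels, 0)


def severity_score(ml_confidence: float, query: str,
                   weights=(0.6, 0.25, 0.15),
                   max_entropy: float = 5.0, max_depth: int = 5) -> float:
    """Composite score in [0, 1]; weights and caps are assumed values."""
    labels = [p for p in query.strip(".").split(".") if p]
    subdomain = "".join(labels[:-2])  # characters left of the registered domain
    entropy_norm = min(shannon_entropy(subdomain) / max_entropy, 1.0)
    depth_norm = min(subdomain_depth(query) / max_depth, 1.0)
    w_conf, w_ent, w_depth = weights
    return w_conf * ml_confidence + w_ent * entropy_norm + w_depth * depth_norm


# A long, random-looking subdomain with high model confidence should outrank
# an ordinary lookup with low model confidence.
suspicious = severity_score(0.9, "x9f3k2q8.data.example.com")
benign = severity_score(0.1, "www.example.com")
print(suspicious > benign)
```

Normalizing each signal to the 0 to 1 range before weighting keeps any single feature from dominating and makes the resulting scores comparable across queries.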
Official backend system for Hackulus’25, SIAM-VIT’s flagship hackathon with 150+ participants.
Highlights
- REST API Design: Built secure endpoints for tracks, submissions, and admin workflows
- High Reliability: Maintained >99% uptime with minimal runtime failures
- Authentication & Validation: JWT-based auth, structured validation, and error handling
- Deployment: Hosted on Render with stable performance under real-time load
Tech Stack
Express.js, PostgreSQL, REST APIs, JWT
Data-centric ML pipeline that preprocesses large-scale chemical reaction datasets, standardizes reaction SMILES, engineers molecular representations using RDKit, and serves catalyst predictions through a FastAPI application.
Highlights
- Data Processing Pipeline: Built a multi-stage preprocessing workflow to clean, normalize, validate, and balance large-scale reaction datasets from the ORDerly benchmark
- Chemical Data Engineering: Standardized reaction SMILES, handled malformed records, normalized reagent metadata, and generated molecular fingerprints using RDKit
- ML Inference: Trained a catalyst prediction model on processed reaction data and exposed predictions through a FastAPI backend
- Interactive Interface: Developed a lightweight web frontend for real-time catalyst prediction from reaction SMILES
Tech Stack
Python, RDKit, Pandas, scikit-learn, FastAPI, NumPy
Visit my repositories for more projects.
- Email: dshryng@gmail.com
- LinkedIn: Aryan Deshpande

