Harsha Vardhan

Home

Experience

Research

Projects

Education

Harsha Vardhan

A machine learning engineer passionate about building AI systems that solve real-world problems. Recently completed a Master's in Data Science from Rutgers University, where I focused on applied machine learning, backend systems, and end-to-end product development. Prior to that, I founded Autodub, a speech translation startup that was acqui-hired by Khabri (YC W19), where I worked as a founding ML engineer building large-scale audio intelligence tools. My interests lie in AI agents, foundation models, and the evolving role of intelligent systems in everyday workflows. When I'm not working, you'll probably find me exploring authentic restaurants around town or hunting down the next great meal.

Email / CV / Twitter / Github / LinkedIn

Experience

	Data Analyst Internship June 2024 - Dec 2024 LS Direct Marketing Software: Pytorch, Python, SQL, VectorDB
	Machine Learning Engineer Nov 2021 - June 2023 Vedantu Learning Software: Pytorch, Python, HF-Transformers, SQL, Tensorflow, Docker
	Founding Engineer Dec 2020 - Nov 2021 Khabri Audio (YC W19) Software: Pytorch, Python, HF-Transformers, SQL, Tensorflow, Docker
	Founder April 2018 - Dec 2020 Autodub Software: Pytorch, Python, HF-Transformers, SQL, Tensorflow, Docker

Research

Large Language Models: A survey on Use-case-Mapping & Evaluations
Zihan Xu, Harsha Vardhan, Mhamed Bettaieb,
Research ongoing — Paper in preparation

Projects

	MR MOOLAH — PERSONAL FINANCE AI AGENT Track your spending, compare with peers, and invest smarter with real-time, LLM-powered insights. More details Fine-tuned Llama 3.1-70B with LoRA adapters on AWS SageMaker distributed training (FSDP, 8 × A100 80 GB), cutting memory footprint by 65 % and improving domain-specific perplexity by 18 %. Served the model behind a FastAPI gateway on a SageMaker Large-Model-Inference (Inf2.48xlarge) endpoint, delivering streaming responses (< 150 ms for 512 tokens). Built a LangChain RAG agent that classifies transactions, pulls peer benchmarks from a Pinecone vector store, and generates personalized budgeting + asset-allocation advice. Integrated Plaid OAuth to fetch live bank/credit-card data; implemented PII hashing and AWS KMS encryption-at-rest across PostgreSQL + S3. Website
	CSHORTS – AI NEWS SUMMARIZATION PLATFORM Real-time, category-based news digests delivered in under 60 seconds. More details Fine-tuned & 8-bit-quantized a T5 model, slicing model size by 45 % while preserving ROUGE-L. Deployed on AWS SageMaker with autoscaling & CloudWatch monitoring. Built a FastAPI micro-service to ingest content, clean HTML, and store articles + summaries in MongoDB Atlas. Streamlit front-end groups stories into Trending, Business, Tech & AI, Entertainment, Sports, USA, and India tabs; end-to-end latency < 60 s. Containerised with Docker & CI/CD via GitHub Actions. Code / Live demo
	ENTERPRISE TICKET TRIAGE AUTOMATION SYSTEM Data & Statistical Services Lab, Princeton University — funded by Chase Bank AI-driven router that assigns and prioritizes support tickets with > 90 % accuracy. More details Fine-tuned MiniLM embeddings + CatBoost classifier to route tickets to the correct resolver teams and predict priority levels. Achieved over 90 % macro-F1 on both classification tasks across a 1 M-ticket dataset. Built a real-time FastAPI service that scores incoming tickets and publishes results to internal queues (Kafka). Developed a lightweight Streamlit dashboard for Support Ops to search, monitor, and override triage decisions in-flight. Implemented role-based access, audit logging, and encrypted data-at-rest to meet Chase's security requirements.

Education

	MS in Data Science Sep 2023 - April 2024 Rutgers University
	Bachelor of Technology (B.Tech) in CSE Aug 2016 - May 2020 Mahindra University Ecole Centrale School of Engineering