Pulkit Saxena

Data Scientist & ML Engineer

Building production ML systems, LLM pipelines, and MLOps infrastructure across fintech, telecom, and research.

$50K
saved / year
500K+
users served
5TB+
data analyzed
40%
MoM growth

About Me

Background & Expertise

I'm an ML Engineer & Data Scientist currently at the University of Rochester, where I build production data systems, LLM pipelines, and anomaly-detection tools that deliver measurable ROI. With an MS in Data Science from Rochester Institute of Technology, I specialize in bridging the gap between experimental AI research and production-grade systems.

My experience spans fintech (RedCarpetUp, YC S15), telecom (T-Mobile), and academic research (University of Rochester). I've led teams of 4–5 engineers, owned end-to-end ML product development for 500K+ users, and built MLOps infrastructure that reduced pipeline friction. I care deeply about reproducibility, observability, and shipping things that actually work in production.

I'm IEEE-published and work at the intersection of rigorous statistics and pragmatic software engineering.

Key Achievements

Production Impact Across Organizations

University of Rochester
Data Scientist · 2024–Present
$50K/yr

Isolation Forest anomaly detection on vendor invoices, eliminating manual review overhead and saving $50K annually across university procurement.

800GB+ data · LLM pipelines · dbt · MLflow

T-Mobile USA
Data Science Intern · 2023
75% faster

Automated Risk Strategy Simulation Tool with alerts that include root cause analysis (RCA), cutting policy-deviation investigation time by 75% and boosting team productivity by 40%.

40% productivity gain · Snowflake · Databricks

RedCarpetUp (YC S15)
Data Scientist · 2019–2021
500K+

End-to-end XGBoost credit risk platform — from 5TB feature engineering to Docker + Flask production deployment — serving 500K+ users with 40% MoM growth.

40% MoM growth · ₹5.2M revenue · A/B Testing

Technical Expertise

Full-Stack Data Science & MLOps

AI & ML

Ollama · Gemma 3 · LangChain · Agentic AI · scikit-learn · MLflow · XGBoost · Isolation Forest · A/B Testing

Data Engineering

dbt · Airflow · Dagster · ETL/ELT · SQLAlchemy · Alembic · Spark · PySpark

Cloud & Big Data

GCP/BigQuery · AWS SageMaker · Snowflake · Databricks · PostgreSQL

Infrastructure & Tools

Docker · DevContainers · GitLab CI/CD · Flask · Dash · Posit Connect · Tableau · Linux

Languages

Python · SQL · R · C++ · Bash

Let's Work Together

Open to ML Engineering & Data Science Roles