Hi, I'm Nafis (na-fees) Abrar. I'm a Senior Tech Lead in Core ML at Meta, working on Ranking and Foundational AI. I also founded Edutechs, a classroom management platform, where I serve as CTO. I majored in Computer Science at the University of British Columbia.
Timeline
Leading ML efforts in ranking and foundational AI systems at Meta's Core ML organization.
Founded Edutechs and led the engineering team, designing end-to-end roadmaps for batch and real-time ML systems and building distributed microservices for 30,000+ paid users. Built the organization's end-to-end ML infrastructure (data engineering, development, inference, monitoring) with CI/CD, halving time to production and cutting costs by ~47%. Trained and deployed a retrieval-augmented system using a novel retriever–ranker–summarizer pipeline that reduced hallucinated responses by 63% compared to GPT-4 and improved search relevance by 55%. Optimized model inference throughput and reduced p99 latency by ~3x using efficient attention, ONNX, and quantization. Drove ~18% weekly user growth in the first year through customer segmentation (k-means++, UMAP, XGBoost). Raised multiple funding rounds from angel investors and VCs, earned recognition in HolonIQ's South Asia EdTech 100 (2022 & 2023), and secured a place in Accelerating Asia (top 10 from 600+ startups).
Led research on LLMs for long-text classification under Dr. Kevin Leyton-Brown. Implemented distributed parallel training pipelines for LLaMA-2 (70B), Vicuna, and DeBERTa-v2. Experimented with gradual unfreezing, in-context learning (CoT, ToT, Self-Consistency), and QLoRA fine-tuning, achieving a 30.6% improvement over baseline. Instruction-tuned Llama models with Bayesian hyperparameter search for an additional ~15% gain.
Co-authored "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model," trained on 46 languages and 13 programming languages by a collaboration of 1,000+ researchers worldwide.
Worked on a team of 7 data scientists that owned the model pipelines for the bank's fraud detection system. Created ETL pipelines from multiple data sources using large-scale data processing and pipeline orchestration, and added ~100 features to a cheque fraud XGBoost model using Spark. Designed and led an end-to-end NLP system for monitoring customer consent, delivering the project ahead of schedule. Built two fraud detection training pipelines using XGBoost, Docker, and GCP.
Worked in Dr. Sohrab Shah's computational oncology lab, co-authoring papers on scalable single-cell genome sequencing of 40,000 cells and on clonal decomposition and DNA replication states via scaled single-cell genome sequencing.
Majored in Computer Science at UBC.
Outside of work, I've been playing guitar since I was ten and love reading about philosophy.