Nafis Abrar

Hi, I'm Nafis (na-fees) Abrar. I'm a Senior Tech Lead in Core ML at Meta, working on Ranking and Foundational AI. I also founded Edutechs, a classroom management platform, where I serve as CTO. I majored in Computer Science at the University of British Columbia.

Timeline

2023 – Present

Leading ML efforts in ranking and foundational AI systems at Meta's Core ML organization.

2019 – Present

Founded Edutechs and led the engineering team, designing end-to-end roadmaps for batch and real-time ML systems and building distributed microservices for 30,000+ paid users. Built the organization's end-to-end ML infrastructure (data engineering, development, inference, monitoring) with CI/CD, halving time to production and cutting costs by ~47%. Trained and deployed a retrieval-augmented system using a novel retriever–ranker–summarizer pipeline that reduced hallucinated responses by 63% compared with GPT-4 and improved search relevance by 55%. Increased model inference throughput and reduced p99 latency by ~3x using efficient attention, ONNX, and quantization. Drove ~18% weekly user growth in the first year through customer segmentation (k-means++, UMAP, XGBoost). Raised multiple funding rounds from angel investors and VCs, earned recognition in HolonIQ's South Asia EdTech 100 (2022 & 2023), and was selected for Accelerating Asia (top 10 of 600+ startups).

2021 – 2022

Led research on LLMs for long-text classification under Dr. Kevin Leyton-Brown. Implemented distributed parallel training pipelines for LLaMA-2 (70B), Vicuna, and DeBERTa-v2. Experimented with gradual unfreezing, in-context learning (CoT, ToT, Self-Consistency), and QLoRA fine-tuning, achieving a 30.6% improvement over baseline. Instruction-tuned LLaMA models with Bayesian hyperparameter search for an additional ~15% gain.

2021

Co-authored "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model," a model trained on 46 natural languages and 13 programming languages by a collaboration of 1,000+ researchers worldwide.

Jan 2019 – Aug 2019

Worked on a team of seven data scientists owning model pipelines for the bank's fraud detection system. Created ETL pipelines from multiple data sources using large-scale data processing and pipeline orchestration, and added ~100 features to a cheque fraud XGBoost model using Spark. Designed and led development of an end-to-end NLP system for monitoring customer consent, delivering the project ahead of schedule. Built two fraud detection training pipelines using XGBoost, Docker, and GCP.

Jan 2018 – Aug 2018

Worked in Dr. Sohrab Shah's computational oncology lab, co-authoring two papers: one on scalable single-cell genome sequencing of 40,000 cells, and one on clonal decomposition and DNA replication states via scaled single-cell genome sequencing.

2019

Majored in Computer Science at UBC.

Outside of work, I've been playing guitar since I was ten and love reading about philosophy.