Experience

New York University

Data Science Project Lead
May 2023 – May 2024 | New York, US
  • Led the development of the publicly operational Carbon Compass tool for NYC Local Law 97, promoting energy compliance analysis across numerous buildings and significantly aiding the city's sustainability efforts.
  • Designed and published a Tableau dashboard integrating energy benchmarking and mortgage data from top banks, offering a comprehensive, detailed, and reliable view of fines totaling around $450 million for major financiers under NYC's LL97.
  • Engineered and documented data workflows using Python, SQL, Tableau Prep, and AWS services (RDS, S3, and Glue) to manage and analyze data from over 25,000 properties, ensuring robust data integration and accessibility.
  • Integrated financial analytics to identify sustainable investment trends among key financiers of NYC’s LL97 carbon emissions.

Memorial Sloan Kettering Cancer Center

Graduate Student Researcher
June – Dec 2023 | New York, US
  • Led a cancer research initiative, employing Large Language Models and Named Entity Recognition (NER) to automate gene annotation in research articles. Streamlined updates to the OncoKB database by developing a BioMed-BERT-powered model that accelerated gene annotation and reduced time-intensive manual effort.
  • Engineered an end-to-end pipeline that fetches new research from PubMed, performs predictions, labels diverse genes, and seamlessly updates the OncoKB database, ensuring a continuous and precise flow of annotated genetic information.

Logitix

Data Science Intern
June – Dec 2023 | Florida, US
  • Trained an ensemble machine learning model (XGBoost and SVM) to predict ticket tiers with 94% accuracy, securing lucrative partnerships with multiple prestigious sports venues and directly generating $100K in revenue.
  • Leveraged continuous integration and continuous delivery (CI/CD) practices, including test automation and monitoring, to ensure successful deployment of ML models and application code, while maintaining communication with the app development team.
  • Reframed the dynamic pricing problem as a price forecasting problem and developed custom explainable models using SHAP, generating insights that helped the pricing team reduce decision-making time by 15 minutes.
  • Collaborated with data analytics team to enhance clustering algorithms, focusing on business objectives and model accuracy.
  • Developed business solutions dashboard to convey technical insights to non-technical stakeholders through data storytelling.

Persistent Systems

Machine Learning Intern
Jan – April 2022 | Pune, India
  • Accelerated manual classification of cells in histopathological images, increasing efficiency by 80%, by building Image Segmentation Models to detect and count different types of cells.
  • Enhanced keyword extraction accuracy by 15% and reduced preprocessing time by 40% (down to 3 seconds) by streamlining data pipeline with Apache Airflow and incorporating Deep Learning model for text analysis post speech-to-text conversion.
  • Engineered pipeline to perform face-matching post-enhancement of government IDs and portrait photos using GANs.

AkzoNobel

Data Science Intern
Aug 2021 – Mar 2022 | Netherlands (Remote)
  • Improved color classification model accuracy by 20% by implementing an ensemble of Random Forest and Light Gradient Boosting Models using a Voting classifier, enhancing the model's ability to classify colors based on reflection values.
  • Simplified color recipe-generating process by building Machine Learning models to generate color recipes using solid colors.
  • Rationalized the relationships between colors and toners by analyzing large-scale color recipe datasets and performing ETL processes.

Kenmark ITAN Solutions

Junior Data Science Associate
April – July 2020 | Mumbai, India
  • Led development of text cleaning pipeline, reducing processing time by 40% (down to 7 seconds) and expediting integration of data.
  • Implemented a baseline recommendation system using sentiment analysis for a client's social media application, increasing user retention time by 50 seconds as validated through A/B testing. Authored end-to-end documentation.
  • Facilitated knowledge transfer by hosting a tutoring session for 11 full-time staff members.

Sapio Analytics

Data Analyst Intern
April – June 2020 | Mumbai, India
  • Maximized supply chain efficiency of COVID-19 vaccine deliveries by spearheading the development of a collaborative dashboard (Tableau & Dash), leveraging AWS to extract key metrics. Presented it to three Andhra Pradesh government leaders.
  • Analyzed historical data and market trends through ad hoc queries to predict demand for essential supplies at a hyper-granular level in India.
  • Managed SQL database (over 40 tables with 100,000 rows) for COVID-19 Project, integrated with mobile and web applications.