Sarang Patil

PhD Student, Department of Data Science, New Jersey Institute of Technology

I am a second-year PhD student in the Department of Data Science at the New Jersey Institute of Technology (NJIT), where I am advised by Dr. Mengjia Xu as a member of the Xu Lab. My research explores how hyperbolic geometry, state-space models, and graph neural networks can be used to build efficient large language models and dynamic graph embeddings.

Recent work includes the Hierarchical Mamba framework, which combines efficient Mamba2 state-space models with hyperbolic spaces to capture hierarchical relationships in language data, and a comparative study of dynamic graph embedding approaches using transformers and the Mamba architecture. Beyond these projects, I continue to develop hyperbolic models for other domains, exploring how curvature-aware embeddings can benefit a wide range of applications. I also helped survey and organize the rapidly growing body of work on hyperbolic large language models, which forms the basis of my recent survey accepted for publication in SIAM Review.

Outside of research, I enjoy playing chess ♟️, hiking 🥾, exploring new cuisines 🍜🌍, and playing video games 🎮.

Education

New Jersey Institute of Technology

Ph.D. in Data Science (2024 – Present)

University of Maryland Baltimore County

Master of Professional Studies in Data Science (2021 – 2022)

Savitribai Phule Pune University, India

Bachelor of Engineering in Computer Engineering (2016 – 2020)

Interests

  • Hyperbolic geometry and non-Euclidean representation learning
  • State-space models (SSMs) and efficient sequence modeling
  • Large language models and hierarchy-aware embeddings
  • Graph neural networks and dynamic graph embedding
  • Curvature-aware optimization and geometric deep learning

Experience

Research Assistant, New Jersey Institute of Technology, NJ

September 2024 – Present

Research Assistant, University of Maryland Baltimore County, MD

January 2022 – December 2022

Data Science Intern, CoReCo Technologies, Pune, India

August 2019 – June 2020

Project Intern, Aalborg University, Copenhagen, Denmark

January 2018 – February 2018

Publications

Sarang Patil, Zeyong Zhang, Yiran Huang, Tengfei Ma, Mengjia Xu
SIAM Review (accepted, 2025)
Large language models excel at many tasks but often fail to capture the non-Euclidean hierarchies present in real-world data. This survey reviews recent progress on Hyperbolic LLMs (HypLLMs), categorizing existing models into four groups: models using exponential/logarithmic maps, hyperbolic fine-tuned models, fully hyperbolic models, and hyperbolic state-space models. It also discusses applications across language, vision, and multimodal domains. A companion repository of papers, code, and datasets is maintained on GitHub.
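For context on the first of these categories, the exponential and logarithmic maps at the origin of the Poincaré ball with curvature -c are the standard way to move between Euclidean (tangent-space) vectors and hyperbolic points; the formulas below are the usual ones and are not tied to any single surveyed model:

\[
\exp_0^{c}(v) = \tanh\bigl(\sqrt{c}\,\lVert v\rVert\bigr)\,\frac{v}{\sqrt{c}\,\lVert v\rVert},
\qquad
\log_0^{c}(y) = \operatorname{artanh}\bigl(\sqrt{c}\,\lVert y\rVert\bigr)\,\frac{y}{\sqrt{c}\,\lVert y\rVert}.
\]
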
Sarang Patil, Ashish Parmanand Pandey, Ioannis Koutis, Mengjia Xu
arXiv preprint, 2025
This work introduces the Hierarchical Mamba (HiM) model, which integrates efficient Mamba2 state-space models with hyperbolic representations (Poincaré and Lorentz manifolds) to learn hierarchy-aware language embeddings. HiM projects Mamba2 outputs into hyperbolic space with learnable curvature and hyperbolic loss functions, capturing relational distances across levels of a hierarchy. Experiments on linguistic and medical datasets show that HiM outperforms Euclidean baselines and highlight the trade-offs between the Poincaré and Lorentz variants. Source code is available online.
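As a rough, hypothetical sketch of the projection step described above (not the released HiM code), the snippet below maps Euclidean sequence-model outputs onto the Poincaré ball via the exponential map at the origin, keeping the curvature as a learnable parameter; module and variable names are illustrative only.

import torch
import torch.nn as nn

class PoincareProjection(nn.Module):
    # Hypothetical sketch: project Euclidean features onto the Poincare ball
    # via the exponential map at the origin, with learnable curvature c > 0.
    def __init__(self, init_curvature=1.0):
        super().__init__()
        # Parametrize curvature through its log so c = exp(log_c) stays positive.
        self.log_c = nn.Parameter(torch.log(torch.tensor(float(init_curvature))))

    def forward(self, x):
        c = self.log_c.exp()
        sqrt_c = c.sqrt()
        norm = x.norm(dim=-1, keepdim=True).clamp_min(1e-7)
        # exp_0^c(x) = tanh(sqrt(c) * ||x||) * x / (sqrt(c) * ||x||)
        return torch.tanh(sqrt_c * norm) * x / (sqrt_c * norm)

# Example: project stand-in sequence-model outputs of shape (batch, dim).
proj = PoincareProjection(init_curvature=1.0)
euclidean_out = torch.randn(4, 128)
hyperbolic_emb = proj(euclidean_out)
# All outputs lie strictly inside the ball of radius 1/sqrt(c).
print(hyperbolic_emb.norm(dim=-1).max().item())

In a full model, a hyperbolic objective (for example, a distance-based loss on the ball) would then be applied to these embeddings rather than a Euclidean one.
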
Ashish Parmanand Pandey, Alan John Varghese, Sarang Patil, Mengjia Xu
arXiv preprint, 2024
Dynamic graph embedding models aim to learn representations of time-evolving networks. This paper compares transformer-based approaches with the recently proposed Mamba state-space architecture and introduces three new models: TransformerG2G augmented with graph convolutional networks, DG-Mamba, and GDG-Mamba. Experiments show that Mamba-based models achieve similar or better link-prediction accuracy than transformer models while scaling linearly with sequence length, making them more efficient for graphs with high temporal variability.

Academic Service

Reviewer:

News

Contact

Email: sp3463@njit.edu
Address: New Jersey Institute of Technology, Newark, NJ 07102
