Muhammad Sohail Danish

Building Multimodal Foundation Models for Earth Observation

PhD candidate at MBZUAI, advised by Prof. Salman Khan, working on geospatial foundation models, multisensor representation learning, and vision-language models for Earth observation. My research focuses on building large-scale models that understand optical, SAR, multispectral, and temporal satellite imagery.

Currently, I am a PhD Resident at Microsoft’s AI for Good Lab, developing multimodal multi-agent systems that combine spatial analytics with vision-language reasoning to query geospatial, vector, and satellite imagery data.

Currently Building

Multi-Agent Geospatial AI Systems

Timeline & News

A timeline of my research milestones, publications, and project updates across my academic and professional journey.

Aug 2025

Microsoft PhD Residency

Excited to start my PhD residency at Microsoft’s AI for Good Lab.

Jun 2025

TerraFM preprint released

Posted TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation.

Jun 2025

Paper accepted at ICCV 2025 (Highlight)

GEOBench-VLM accepted as an ICCV 2025 Highlight, recognizing our benchmark for evaluating VLMs across diverse remote-sensing modalities.

Apr 2025

Paper accepted to CVPR 2025

EarthDial accepted to CVPR 2025: a conversational vision-language model for multi-sensory Earth observation.

Aug 2024

Paper accepted at DICTA 2024 (Oral)

Perturbing Dominant Feature Modes for Single Domain-Generalized Object Detection

Feb 2024

Two papers accepted at CVPR 2024

GeoChat: the first grounded vision-language model designed for remote-sensing imagery.
DivAlign: Improving Single Domain-Generalized Object Detection.

Aug 2023

Started PhD at MBZUAI

Joined the Vision Lab in Abu Dhabi to build geospatial foundation models under Prof. Salman Khan.

Sep 2022

Joined Intelligent Visual Analytics Lab at MBZUAI

Joined the IVA Lab as a Graduate Research Assistant, working on single-domain generalization for object detection under Prof. Muhammad Haris Khan.

Aug 2022

Completed MS in Data Science (ITU)

Completed my MS in Data Science at ITU with a thesis on single-domain generalization for object detection.

Mar 2022

Paper accepted at CVPR 2022

Towards Low-cost and Efficient Malaria Detection

Publications & Releases

* represents equal contribution.

ArXiv 2025

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation

Muhammad Sohail Danish, Muhammad Akhtar Munir, Syed Roshaan Ali Shah, Muhammad Haris Khan, Rao Muhammad Anwer, Jorma Laaksonen, Fahad Shahbaz Khan, Salman Khan

Paper Code

ICCV 2025 (Highlight)

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

Muhammad Sohail Danish*, Muhammad Akhtar Munir*, Syed Roshaan Ali Shah, Kartik Kuckreja, Fahad Shahbaz Khan, Paolo Fraccaro, Alexandre Lacoste, Salman Khan

Paper Code Dataset

CVPR 2024

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

Muhammad Sohail Danish*, Kartik Kuckreja*, Muzammal Naseer, Abhijit Das, Salman Khan, Fahad Shahbaz Khan

Paper Code Dataset

CVPR 2024

Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment

Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir, M. Saquib Sarfraz, Mohsen Ali

Paper Code

DICTA 2024 (Oral)

Perturbing Dominant Feature Modes for Single Domain-Generalized Object Detection

Muhammad Sohail Danish, Javed Iqbal, Mohsen Ali, M. Saquib Sarfraz, Salman Khan, Muhammad Haris Khan

Paper

CVPR 2022

Towards Low-cost and Efficient Malaria Detection

Waqas Sultani, Muhammad Sohail Danish*, Wajahat Nawaz*, Syed Javed*, Asma Saadia, Mohsen Ali

Paper Code Dataset

Experience

Aug 2025 – Present · Kenya

Microsoft · PhD Resident Fellow

AI for Good Lab

Designing a multimodal multi-agent system that lets non-technical decision-makers interact with and query geospatial disaster-assessment data.

Aug 2023 – Present · Abu Dhabi

MBZUAI · PhD Researcher

Intelligent Visual Analytics Lab

Research on geospatial foundation models and large VLMs; proposed GeoChat, GEOBench-VLM, and TerraFM.

Sep 2022 – Aug 2023 · Abu Dhabi

Intelligent Visual Analytics Lab · Graduate Research Assistant

Mohamed bin Zayed University of Artificial Intelligence

Developed a single-domain generalization method for object detection; introduced alignment losses to improve out-of-domain performance.

Sep 2020 – Aug 2022 · Lahore

Intelligent Machine Lab · Graduate Student Researcher

Information Technology University

Explored domain adaptation for medical imaging and face recognition, including the low-cost malaria detection framework.

Sep 2016 – Aug 2020 · Remote

Freelance Web & Mobile Engineer · Fiverr

Independent

Delivered 60+ full-stack products leveraging React, React Native, Django, and PostgreSQL with consistent 5★ ratings.

Education

PhD · Computer Vision

MBZUAI · GPA 3.57 / 4.0

Courses: Advanced CV, Advanced 3D CV, LVLMs, Lifelong Learning, Visual Recognition. Focus on geospatial foundation models.

Advisor: Prof. Salman Khan · Co-Advisor: Prof. Muhammad Haris Khan

MS · Data Science

Information Technology University · GPA 3.47 / 4.0

Thesis on single-domain generalization for object detection. Funded by Graduate Student Fellowship.

Advisor: Prof. Mohsen Ali

BS · Computer Science

Qurtuba University · GPA 3.97 / 4.0

Final-year project: an autonomous drone.
Selected Courses: Object-Oriented Programming, Data Structures and Algorithms, Artificial Intelligence, Software Engineering, Mobile App Development

Talks, Tutorials & Reviewing

October 2025

Guest speaker, Saint Mary’s College of California

Shared my research on GeoChat, GEOBench-VLM, and LLM agents for geospatial data.

March 2025

Guest speaker, ADIA Lab Research Seminar

Presented GeoChat's design with demos and applied use cases.

2023 – Present

Reviewing

ICCV 2023, CVPR 2024, ICCV 2025, CVPR 2025

Technical Skills

Domains

Deep Learning, Computer Vision, Large Language Models, Multi-Modal Learning, Foundation Models, Remote Sensing, Distributed Training

Libraries

PyTorch, TensorFlow, Keras, Scikit-learn, AutoGen

Programming Languages

Python, JavaScript, PHP

Applications & Databases

React, React Native, Redux, Django, Flask, MySQL, PostgreSQL, Firebase
