Muhammad Sohail Danish

Building Multimodal Foundation Models for Earth Observation

PhD candidate at MBZUAI, advised by Prof. Salman Khan, working on geospatial foundation models, multisensor representation learning, and vision-language models for Earth observation. My research focuses on building large-scale models that understand optical, SAR, multispectral, and temporal satellite imagery.

Currently, I am a PhD Resident at Microsoft’s AI for Good Lab, developing multimodal multi-agent systems that combine spatial analytics with vision-language reasoning for querying geospatial, vector, and satellite imagery data.

Currently Building

Multi-Agent Geospatial AI Systems

Timeline & News

A timeline of my research milestones, publications, and project updates across my academic and professional journey.

  1. Microsoft PhD Residency

    Excited to start my PhD residency at Microsoft’s AI for Good Lab.

  2. TerraFM preprint released

    Posted TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation.

  3. Paper accepted at ICCV 2025 (Highlight)

    GEOBench-VLM accepted as an ICCV 2025 Highlight, recognizing our benchmark for evaluating VLMs across diverse remote-sensing modalities.

  4. Paper accepted to CVPR 2025

    EarthDial accepted to CVPR 2025: a conversational vision-language model for multi-sensory Earth observation.

  5. Paper accepted at DICTA 2024 (Oral)

    Perturbing Dominant Feature Modes for Single Domain-Generalized Object Detection

  6. Two papers accepted at CVPR 2024

    GeoChat: the first grounded vision-language model designed for remote-sensing imagery.
    DivAlign: Improving Single Domain-Generalized Object Detection.

  7. Started PhD at MBZUAI

    Joined the Vision Lab in Abu Dhabi to build geospatial foundation models under Prof. Salman Khan.

  8. Joined Intelligent Visual Analytics Lab at MBZUAI

    Joined IVA Lab as a Graduate Research Assistant, working on single-domain generalization for object detection under Prof. Muhammad Haris Khan.

  9. Completed MS in Data Science (ITU)

    Completed my MS in Data Science at ITU with a thesis on single-domain generalization for object detection.

  10. Paper accepted at CVPR 2022

    Towards Low-cost and Efficient Malaria Detection

Publications & Releases

* represents equal contribution.

ArXiv 2025

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation

Muhammad Sohail Danish, Muhammad Akhtar Munir, Syed Roshaan Ali Shah, Muhammad Haris Khan, Rao Muhammad Anwer, Jorma Laaksonen, Fahad Shahbaz Khan, Salman Khan

ICCV 2025 (Highlight)

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

Muhammad Sohail Danish*, Muhammad Akhtar Munir*, Syed Roshaan Ali Shah, Kartik Kuckreja, Fahad Shahbaz Khan, Paolo Fraccaro, Alexandre Lacoste, Salman Khan

CVPR 2024

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

Muhammad Sohail Danish*, Kartik Kuckreja*, Muzammal Naseer, Abhijit Das, Salman Khan, Fahad Shahbaz Khan

CVPR 2024

Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment

Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir, M. Saquib Sarfraz, Mohsen Ali

DICTA 2024 (Oral)

Perturbing Dominant Feature Modes for Single Domain-Generalized Object Detection

Muhammad Sohail Danish, Javed Iqbal, Mohsen Ali, M. Saquib Sarfraz, Salman Khan, Muhammad Haris Khan

CVPR 2022

Towards Low-cost and Efficient Malaria Detection

Waqas Sultani, Muhammad Sohail Danish*, Wajahat Nawaz*, Syed Javed*, Asma Saadia, Mohsen Ali

Experience

Aug 2025 – Present · Kenya

Microsoft · PhD Resident Fellow

AI for Good Lab

Designing a multimodal multi-agent system that enables non-technical decision-makers to interact with and query geospatial disaster assessment data.

Aug 2023 – Present · Abu Dhabi

MBZUAI · PhD Researcher

Intelligent Visual Analytics Lab

Research on geospatial foundation models and large VLMs; proposed GeoChat, GEOBench-VLM, and TerraFM.

Sep 2022 – Aug 2023 · Abu Dhabi

Intelligent Visual Analytics Lab · Graduate Research Assistant

Mohamed bin Zayed University of Artificial Intelligence

Developed a single-domain generalization method for object detection; introduced alignment losses to improve out-of-domain performance.

Sep 2020 – Aug 2022 · Lahore

Intelligent Machine Lab · Graduate Student Researcher

Information Technology University

Explored domain adaptation for medical imaging and face recognition, including the low-cost malaria detection framework.

Sep 2016 – Aug 2020 · Remote

Freelance Web & Mobile Engineer · Fiverr

Independent

Delivered 60+ full-stack products leveraging React, React Native, Django, and PostgreSQL with consistent 5★ ratings.

Education

PhD · Computer Vision

MBZUAI · GPA 3.57 / 4.0

Courses: Advanced CV, Advanced 3D CV, LVLMs, Lifelong Learning, Visual Recognition. Focus on geospatial foundation models.

Advisor: Prof. Salman Khan · Co-Advisor: Prof. Muhammad Haris Khan

MS · Data Science

Information Technology University · GPA 3.47 / 4.0

Thesis on single-domain generalization for object detection. Funded by a Graduate Student Fellowship.

Advisor: Prof. Mohsen Ali

BS · Computer Science

Qurtuba University · GPA 3.97 / 4.0

Final-year project: an autonomous drone.
Selected Courses: Object-Oriented Programming, Data Structures and Algorithms, Artificial Intelligence, Software Engineering, Mobile App Development

Talks, Tutorials & Reviewing

Technical Skills

Domains

Deep Learning, Computer Vision, Large Language Models, Multi-Modal Learning, Foundation Models, Remote Sensing, Distributed Training

Libraries

PyTorch, TensorFlow, Keras, Scikit-learn, AutoGen

Programming Languages

Python, JavaScript, PHP

Applications & Databases

React, React Native, Redux, Django, Flask, MySQL, PostgreSQL, Firebase