Name: remote-sensing-sft-data
Creator: Qingyun Li
License: https://choosealicense.com/licenses/cc-by-4.0/

Modalities: Image
Languages: English
Tags: aerial, geoscience, remote sensing


You need to agree to share your contact information to access this dataset

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

RSCoVLM: Co-Training Vision Language Models for Remote Sensing Multi-task Learning

Qingyun Li* Shuran Ma* Junwei Luo* Yi Yu* Yue Zhou Fengxiang Wang Xudong Lu Xiaoxing Wang Xin He Yushi Chen Xue Yang

If you find our work helpful, please consider giving us a ⭐!

ArXiv Paper: https://arxiv.org/abs/2511.21272
Published Paper: https://www.mdpi.com/2072-4292/18/2/222
GitHub Repo: https://github.com/VisionXLab/RSCoVLM
HuggingFace Page: https://huggingface.co/collections/Qingyun/rscovlm

This repo is a collection of supervised fine-tuning (SFT) data for the remote sensing domain. It serves as the data folder for RSCoVLM, a technical practice for fine-tuning large multimodal language models for remote sensing image understanding, ultra-high-resolution image reasoning, oriented object detection, and related tasks. This repo hosts the official data for the paper: RSCoVLM: Co-Training Vision Language Models for Remote Sensing Multi-task Learning.

Downloading Guide

You can download the files with your web browser from the file page.

We recommend downloading from the terminal using the hf CLI (pip install --upgrade huggingface_hub). Refer to the documentation for more usage options.

# Set the Hugging Face mirror for users in China (if required):
export HF_ENDPOINT=https://hf-mirror.com
# Download the whole folder (you can also set --local-dir to your own data path and create a soft link to it here):
hf download Qingyun/remote-sensing-sft-data --repo-type dataset --local-dir playground/data/
# If an error (such as a network error) interrupts the download, just run the same command again; the latest huggingface_hub will resume the download.
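
Since rerunning the same command resumes an interrupted download, the rerun can be automated. The following is a minimal sketch, not part of the hf CLI: `retry` is a hypothetical helper name, and the attempt count and delay are arbitrary illustrative values.

```shell
# Hypothetical helper (not part of the hf CLI): rerun a command until it
# succeeds or the attempt limit is reached.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    max_attempts=$1
    delay_seconds=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay_seconds}s" >&2
        sleep "$delay_seconds"
        attempt=$((attempt + 1))
    done
}

# Example (commented out so that sourcing this snippet downloads nothing):
# retry 5 30 hf download Qingyun/remote-sensing-sft-data --repo-type dataset --local-dir playground/data/
```

The helper only repeats the command; resuming partially downloaded files is left to huggingface_hub itself, as described above.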

If you have already downloaded some of the data, you can exclude it to save time. For example, you can exclude the DOTA (split_ss_dota) trainval images with the --exclude option. You can also download only certain files, either with the positional filenames argument or with the --include option.

# Exclude the matching files and download everything else (quote the pattern so the shell does not expand it).
hf download Qingyun/remote-sensing-sft-data --repo-type dataset --local-dir playground/data --exclude "**split_ss_dota_trainval**"
# Download only this file into the local folder.
hf download Qingyun/remote-sensing-sft-data split_ss_dota/trainval/split_ss_dota_trainval_annfiles.tar.gz --repo-type dataset --local-dir playground/data
# Download only the matching files and arrange them as they are in the repo.
hf download Qingyun/remote-sensing-sft-data --repo-type dataset --local-dir playground/data --include "**split_ss_dota_trainval**"
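
To preview which files a pattern would select before starting a long download, the pattern can be tried against known repo paths locally. This sketch assumes that --include and --exclude patterns are matched fnmatch-style against repo-relative paths, where `*` may span `/` and `**` behaves like `*`. Shell `case` globbing is used here as an approximation, and `path_matches` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a repo-relative path matches a glob pattern.
# In a shell `case` pattern, `*` also matches `/`, which approximates
# fnmatch-style matching on whole paths.
# Usage: path_matches PATH PATTERN
path_matches() {
    case "$1" in
        $2) return 0 ;;   # $2 is left unquoted on purpose so it acts as a pattern
        *) return 1 ;;
    esac
}

# Example (commented out; prints "selected" if the pattern matches the path):
# path_matches "split_ss_dota/trainval/split_ss_dota_trainval_annfiles.tar.gz" \
#     "**split_ss_dota_trainval**" && echo "selected"
```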

Then extract all the archives: merge the split xxx.tar.gz.partx files and decompress every xxx.tar.gz.

find . \( -name "*.tar.gz" -o -name "*.part0" \) -execdir bash -c 'f="$1"; if [[ "$f" =~ \.part0$ ]]; then base="${f%.part0}"; cat "$base".part* | tar -zxvf -; else tar -zxvf "$f"; fi' _ {} \;
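
The one-liner above can also be written as a small function, which may be easier to read and adapt. This is a sketch under stated assumptions: `extract_archives` is a hypothetical name, file paths contain no newlines, and the parts of a split archive sort correctly in shell glob order (for example, fewer than eleven parts or zero-padded suffixes). Like the one-liner, it unpacks each archive into the directory that contains it.

```shell
# Hypothetical helper: extract every archive found under a directory.
# Split archives (xxx.tar.gz.part0, .part1, ...) are concatenated in glob
# order and streamed into tar; plain xxx.tar.gz archives are extracted directly.
# Usage: extract_archives [ROOT_DIR]   (defaults to the current directory)
extract_archives() {
    root=${1:-.}
    find "$root" -type f \( -name '*.tar.gz' -o -name '*.tar.gz.part0' \) |
    while IFS= read -r archive; do
        dir=$(dirname "$archive")
        case "$archive" in
            *.part0)
                base=${archive%.part0}
                # Merge all parts and extract the resulting stream.
                cat "$base".part* | tar -xzf - -C "$dir"
                ;;
            *)
                tar -xzf "$archive" -C "$dir"
                ;;
        esac
    done
}

# Example (commented out): extract_archives playground/data
```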

Finally, if required, you can delete all the compressed files.

# List the files that will be deleted, to check them first (if required)
find . -type f -name "*.tar.gz*" -print
# Delete them
find . -type f -name "*.tar.gz*" -exec rm -f {} \;
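
Before deleting, it can be worth checking that the archives were not truncated during download, since a corrupt archive may have been only partially extracted. The sketch below is one possible check, not something the repo prescribes: `find_corrupt_archives` is a hypothetical name, it relies on `gzip -t`, and it only inspects plain .tar.gz files (split .partx files are not checked individually).

```shell
# Hypothetical helper: print the plain .tar.gz archives under a directory
# that fail a gzip integrity check. Prints nothing and returns 0 when all
# archives pass; returns 1 if at least one archive is corrupt.
# Usage: find_corrupt_archives [ROOT_DIR]   (defaults to the current directory)
find_corrupt_archives() {
    root=${1:-.}
    corrupt=$(find "$root" -type f -name '*.tar.gz' |
        while IFS= read -r archive; do
            if ! gzip -t "$archive" 2>/dev/null; then
                echo "$archive"
            fi
        done)
    if [ -n "$corrupt" ]; then
        echo "$corrupt"
        return 1
    fi
    return 0
}

# Example (commented out): only delete when every archive passes the check.
# find_corrupt_archives . && find . -type f -name "*.tar.gz*" -exec rm -f {} \;
```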

Statement and ToU

We release the data under a CC-BY-4.0 license, with the primary intent of supporting research activities. We do not impose any additional usage restrictions, but users must comply with the terms of use (ToUs) of the source dataset. This dataset is a processed version, intended solely for academic sharing by the owner, and does not involve any commercial use or other violations of the ToUs. Any use of this dataset should be regarded as use of the original dataset. If you have any concerns regarding potential copyright infringement in the release of this dataset, please contact me, and I will remove any data that may pose a risk.

Cite

RSCoVLM and LMMRotate papers:


@ARTICLE{li2026rscovlm,
  author={Li, Qingyun and Ma, Shuran and Luo, Junwei and Yu, Yi and Zhou, Yue and Wang, Fengxiang and Lu, Xudong and Wang, Xiaoxing and He, Xin and Chen, Yushi and Yang, Xue},
  title={Co-Training Vision-Language Models for Remote Sensing Multi-Task Learning},
  journal={Remote Sensing},
  volume={18},
  year={2026},
  number={2},
  article-number={222},
  url={https://www.mdpi.com/2072-4292/18/2/222},
  issn={2072-4292},
  doi={10.3390/rs18020222}
}

@INPROCEEDINGS{11242725,
  author={Li, Qingyun and He, Xin and Shu, Xinya and Yu, Yi and Chen, Dong and Chen, Yushi and Yang, Xue},
  booktitle={IGARSS 2025 - 2025 IEEE International Geoscience and Remote Sensing Symposium},
  title={A Simple Aerial Detection Baseline of Multimodal Language Models},
  year={2025},
  pages={6833-6837},
  doi={10.1109/IGARSS55030.2025.11242725}
}


Paper: Co-Training Vision Language Models for Remote Sensing Multi-task Learning (arXiv 2511.21272, published Nov 26, 2025)