AI & ML interests

Retrieval, Computer Vision, LLM

vidore's collections (12)

ViDoRe Benchmark V3
ViDoRe V3 is our latest benchmark, engineered to set a new industry gold standard for multi-modal, enterprise document retrieval evaluation.
Hf-native ColVision Models
Models that can be used with the native transformers 🤗 implementation instead of colpali-engine.
ViDoRe Benchmark (BEIR)
Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the BEIR format.
ColPali Paper Resources
Main resources for the paper: "ColPali: Efficient Document Retrieval with Vision Language Models"
ViDoRe Community benchmark contributions
This collection gathers work done by the community to improve visual retrieval together!
ViDoRe Benchmark
Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the QA format.
ViDoRe Chunk OCR (baseline)
The ViDoRe benchmark was processed with Unstructured to partition each page into text chunks. Detected figures and tables were captioned with Claude 3 Sonnet.
ViDoRe Page OCR (artifact)
ViDoRe benchmark with the full OCR text of each page. ⚠️ This dataset serves as an intermediate step → use "ViDoRe Chunk OCR (baseline)" for evaluation!
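
The two "ViDoRe Benchmark" collections listed above differ in dataset layout (QA format versus BEIR format). As a rough illustration of that difference, the sketch below converts QA-style rows, each pairing one query with one page, into the three tables that BEIR-style datasets conventionally use: a corpus, a set of queries, and relevance judgments (qrels). The field names `query` and `page`, the `q…`/`d…` identifiers, and the helper `qa_rows_to_beir` are hypothetical and chosen for illustration only; the actual column names in the ViDoRe datasets may differ.

```python
from collections import defaultdict


def qa_rows_to_beir(rows):
    """Convert QA-style rows into BEIR-style (corpus, queries, qrels).

    Each input row is assumed (hypothetically) to look like
    {"query": str, "page": str}, where "page" identifies a document page.
    """
    corpus = {}                 # doc_id -> document record
    queries = {}                # query_id -> query text
    qrels = defaultdict(dict)   # query_id -> {doc_id: relevance score}
    page_to_id = {}             # deduplicates pages shared by several queries

    for i, row in enumerate(rows):
        page = row["page"]
        if page not in page_to_id:
            doc_id = f"d{len(page_to_id)}"
            page_to_id[page] = doc_id
            corpus[doc_id] = {"page": page}

        query_id = f"q{i}"
        queries[query_id] = row["query"]
        # In the QA layout each query has exactly one relevant page.
        qrels[query_id][page_to_id[page]] = 1

    return corpus, queries, dict(qrels)


rows = [
    {"query": "What is the 2022 revenue?", "page": "report_p3.png"},
    {"query": "Who signed the report?", "page": "report_p3.png"},
    {"query": "What does figure 2 show?", "page": "report_p7.png"},
]
corpus, queries, qrels = qa_rows_to_beir(rows)
```

In the QA layout a page shared by several queries is repeated in each row, whereas the BEIR-style layout stores each page once in the corpus and links queries to pages through the qrels table.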