CVPR 2026

PriVi

Towards a General-Purpose Video Model for Primate Behavior in the Wild

Felix B. Mueller, Jan F. Meier, Timo Lueddecke, Richard Vogg, Roger L. Freixanet, Valentin Hassler, Tiffany Bosshard, Elif Karakoc, William J. O'Hearn, Sofia M. Pereira, Sandro Sehner, Kaja Wierucka, Judith Burkart, Claudia Fichtel, Julia Fischer, Alexander Gail, Catherine Hobaiter, Julia Ostner, Liran Samuni, Oliver Schülke, Neda Shahidi, Erin G. Wessling, Alexander S. Ecker

Call for Data Contributions: We want to turn this project into a broad data collection and curation effort to give computer vision researchers easy access to large-scale, high-quality, diverse primate videos. The goal is to make it easier to develop methods that work across varying settings and species. If you have primate videos to contribute (observational or part of experiments, no labels or curation needed, can be from test runs or failed experiments), please email felix.mueller@cs.uni-goettingen.de to become part of the effort!
PriVi teaser figure showing SOTA improvements across four primate behavior benchmarks

Pretraining on PriVi improves the state of the art across four primate behavior benchmarks. Frozen evaluation of a V-JEPA model pretrained on PriVi consistently outperforms prior work, including fully finetuned baselines. Additionally, pretraining on the target datasets further improves performance.

Abstract

Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. We address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data curation pipeline. We continue pretraining V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier. Across four benchmark datasets — ChimpACT, PanAf500, BaboonLand, and ChimpBehave — our approach consistently outperforms prior work, including fully finetuned baselines, and scales favorably with fewer labels. These results demonstrate for the first time that domain-level pretraining, where pretraining is conducted on similar data but not the target dataset itself, works for video models. Our primate-centric pretraining substantially improves data efficiency and generalization, making it a promising approach for low-label applications.

Dataset Samples

PriVi spans a wide range of primate species, environments, and recording conditions — from camera traps in the wild to controlled lab settings.

Eulemur in Madagascar forest
Eulemur · Wild
Baboon in Senegal
Baboon · Wild
Ring-tailed lemur
Ring-tailed Lemur · Wild-like
Assamese macaque
Assamese Macaque · Wild
YouTube primate footage
Primate · YouTube
YouTube primate footage
Primate · YouTube
YouTube primate footage
Primate · YouTube
YouTube primate footage
Primate · YouTube

Dataset Overview

424 h
Total Video
7+
Primate Species
11
Research Settings
720K
Video Clips

PriVi combines two complementary subsets: R&O (174 h) consists of research video from 11 behavioral studies spanning lemurs, baboons, chimpanzees, macaques, marmosets, and more. YT-Filtered (250 h) adds diverse web-sourced footage curated through our automated pipeline to maximize coverage of species and settings.

Species Distribution

Genus / Family | YT-Filt. | R&O | PriVi
Macaques | 63.1% | 14.1% | 43.0%
Chimpanzees | 7.8% | 35.7% | 19.3%
Baboons | 1.3% | 16.2% | 7.4%
True Lemurs | <1% | 22.7% | 9.8%
Marmosets | <1% | 5.7% | 2.1%
Squirrel monkeys | <1% | 5.4% | 2.0%
Orangutans | 4.1% | 0% | 2.4%
Others | 8.1% | 0% | 4.8%

Setting Distribution

Setting | YT-Filt. | R&O | PriVi
In the wild | 59.6% | 62.7% | 60.9%
Wild-like | 27.8% | 22.4% | 25.6%
Indoors | 4.1% | 14.6% | 8.4%
Not identifiable | 8.6% | 0% | 5.1%
PriVi data curation pipeline

Data curation pipeline. Our automated pipeline filters, deduplicates, and curates web-sourced primate videos at scale.
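The page does not spell out the filtering and deduplication steps, but a near-duplicate filter is typically built on perceptual hashes of sampled frames. The sketch below is purely illustrative (the function names, hash size, and distance threshold are our assumptions, not the paper's pipeline): it computes an "average hash" per frame and flags clips whose hashes differ in only a few bits.

```python
import numpy as np

def average_hash(frame, size=8):
    """Perceptual 'average hash' of one grayscale frame: block-average
    down to size x size, threshold at the mean, return a bit vector.
    Illustrative only; the actual PriVi pipeline is not specified here."""
    h, w = frame.shape
    small = frame[:h - h % size, :w - w % size] \
        .reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def near_duplicate(hash_a, hash_b, max_dist=5):
    """Treat two frames as near-duplicates if their hashes differ in
    at most max_dist bits (Hamming distance)."""
    return bool(np.count_nonzero(hash_a != hash_b) <= max_dist)

rng = np.random.default_rng(0)
frame = rng.random((128, 128))
noisy = frame + rng.normal(scale=0.01, size=frame.shape)  # re-encoded copy
other = rng.random((128, 128))                            # unrelated frame

h1, h2, h3 = map(average_hash, (frame, noisy, other))
print(near_duplicate(h1, h2), near_duplicate(h1, h3))
```

In a real pipeline this comparison would run over hashes of frames sampled from every candidate clip, with an index structure instead of pairwise checks once the collection grows.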

Method

We continue pretraining V-JEPA (ViT-L) on PriVi to learn primate-specific video representations. For evaluation, we develop a lightweight attentive classifier with only 220K trainable parameters on top of the frozen encoder. This enables efficient training for diverse primate behavior benchmarks without finetuning the backbone.
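An "attentive classifier" of this kind is usually a single learned query that cross-attends over the frozen encoder's tokens, followed by a linear head. The forward pass below is a minimal numpy sketch under that assumption (toy dimensions, random weights; the paper's exact probe architecture and its 220K-parameter breakdown may differ):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_probe(tokens, q, w_k, w_v, w_out, b_out):
    """Single-query cross-attention pooling over frozen encoder tokens,
    then a linear classification head. Only q, w_k, w_v, w_out, b_out
    would be trained; the backbone producing `tokens` stays frozen."""
    k = tokens @ w_k                                  # (N, d) keys
    v = tokens @ w_v                                  # (N, d) values
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))    # (N,) weights over tokens
    pooled = attn @ v                                 # (d,) attention-pooled feature
    return pooled @ w_out + b_out                     # (num_classes,) logits

rng = np.random.default_rng(0)
d, n_tokens, n_classes = 64, 16, 5          # toy sizes, not real ViT-L dims
tokens = rng.normal(size=(n_tokens, d))     # stand-in for frozen V-JEPA features
q = rng.normal(size=d)
w_k, w_v = rng.normal(size=(d, d)), rng.normal(size=(d, d))
w_out, b_out = rng.normal(size=(d, n_classes)), np.zeros(n_classes)
logits = attentive_probe(tokens, q, w_k, w_v, w_out, b_out)
print(logits.shape)
```

The appeal of this setup is cost: with the backbone frozen, only the small probe is optimized per benchmark, so one pretrained encoder serves all four downstream datasets.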

Model architecture: V-JEPA continual pretraining with attentive classifier

Model overview. We continue pretraining V-JEPA on PriVi and optionally the target dataset, then evaluate it with a frozen backbone and a lightweight attentive classifier.

Benchmark Results

PriVi consistently outperforms prior methods across four primate behavior benchmarks, including fully finetuned baselines with orders of magnitude more trainable parameters.

Method | ChimpACT (mAP / mAPw) | PanAf500 (Acc / B-Acc) | BaboonLand (Acc / B-Acc) | ChimpBehave (Acc / B-Acc)
X3D | 27.05 / 51.60 | 80.00 / 50.35 | 64.89 / 31.41 | 89.3 / 62.8
VideoMAEv2 | – | – | – | 92.3 / 74.8
UniformerV2-B | – | – | 63.45 / 28.67 | –
InternVideo-L | 25.7 / – | 78.57 / 54.01 | – | –
ChimpVLM | – | 84.91 / 61.94 | – | –
VideoPrism-g | 31.5 / – | – | – | –
Our Classifier:
V-JEPA | 36.33 / 55.50 | 82.96 / 56.69 | 74.91 / 26.99 | 94.99 / 68.41
PriVi | 39.25 / 58.16 | 86.74 / 62.75 | 75.43 / 33.99 | 95.58 / 71.30
PriVi + DaLP | 40.00 / 59.29 | 85.01 / 62.96 | 76.42 / 38.57 | 96.02 / 75.14

Bold = best, underline = second best. Results on test sets.

Data efficiency on PanAf500
Data efficiency on ChimpACT

PriVi scales favorably with fewer labeled examples. Error bars are 95% confidence intervals estimated over three different subsets. Results on validation sets.

Dataset Examples Gallery

Sample frames from each of the 11 R&O research subsets and the YT-Filtered web data.

R&O — Research Data

YT-Filtered — Web Data

Data Contributors

PriVi was made possible by contributions from researchers across multiple institutions. The table below lists the contributors of each dataset subset.

DatasetContributors
eulemurElif Karakoc (DPZ-BE), Claudia Fichtel (DPZ-BE), Peter Kappeler (UGoe)
baboon_wWilliam J. O'Hearn (UGoe/DPZ-CE), Julia Fischer (UGoe/DPZ-CE)
baboon_aJulia Fischer (UGoe/DPZ-CE)
lemurKaja Wierucka (DPZ-BE)
assameseSofia M. Pereira (UGoe/DPZ-SE), Julia Ostner (UGoe/DPZ-SE), Oliver Schülke (UGoe/DPZ-SE)
barbary_aJulia Fischer (UGoe/DPZ-CE)
barbary_tTiffany Bosshard (DPZ-CE), Julia Fischer (UGoe/DPZ-CE)
saimiriSandro Sehner (UZH), Judith Burkart (UZH)
marmosetSandro Sehner (UZH), Judith Burkart (UZH)
rhesusAlexander Gail (UGoe/DPZ-CN), Neda Shahidi (UGoe/DPZ-CN)

UGoe = University of Göttingen · DPZ-BE = DPZ, Behavioral Ecology & Sociobiology · DPZ-CE = DPZ, Cognitive Ethology · DPZ-SE = DPZ, Social Evolution in Primates · DPZ-CN = DPZ, Cognitive Neuroscience · UZH = University of Zurich · DPZ = German Primate Center — Leibniz Institute for Primate Research

Citation

@article{mueller2025privi,
  title   = {Towards a General-Purpose Video Model for Primate
             Behavior in the Wild},
  author  = {Mueller, Felix B. and Meier, Jan F. and Lueddecke, Timo
             and Vogg, Richard and Freixanet, Roger L. and Hassler,
             Valentin and Bosshard, Tiffany and Karakoc, Elif and
             O'Hearn, William J. and Pereira, Sofia M. and Sehner,
             Sandro and Wierucka, Kaja and Burkart, Judith and
             Fichtel, Claudia and Fischer, Julia and Gail, Alexander
             and Hobaiter, Catherine and Ostner, Julia and Samuni,
             Liran and Sch{\"u}lke, Oliver and Shahidi, Neda and
             Wessling, Erin G. and Ecker, Alexander S.},
  journal = {arXiv preprint arXiv:2511.09675},
  year    = {2025}
}