Towards a General-Purpose Video Model for Primate Behavior in the Wild
Pretraining on PriVi improves the state of the art across four primate behavior benchmarks. Frozen evaluation of a V-JEPA model pretrained on PriVi consistently outperforms prior work, including fully finetuned baselines. Additionally, continued pretraining on the target datasets improves performance further.
Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. We address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data curation pipeline. We continue pretraining V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier. Across four benchmark datasets — ChimpACT, PanAf500, BaboonLand, and ChimpBehave — our approach consistently outperforms prior work, including fully finetuned baselines, and scales favorably with fewer labels. These results demonstrate for the first time that domain-level pretraining, where pretraining is conducted on similar data but not the target dataset itself, works for video models. Our primate-centric pretraining substantially improves data efficiency and generalization, making it a promising approach for low-label applications.
PriVi spans a wide range of primate species, environments, and recording conditions — from camera traps in the wild to controlled lab settings.
PriVi combines two complementary subsets: R&O (174h) consists of research video from 11 behavioral studies spanning lemurs, baboons, chimpanzees, macaques, marmosets, and more. YT-Filtered (250h) adds diverse web-sourced footage curated through our automated pipeline to maximize coverage of species and settings.
| Genus / Family | YT-Filt. | R&O | PriVi |
|---|---|---|---|
| Macaques | 63.1% | 14.1% | 43.0% |
| Chimpanzees | 7.8% | 35.7% | 19.3% |
| Baboons | 1.3% | 16.2% | 7.4% |
| True Lemurs | <1% | 22.7% | 9.8% |
| Marmosets | <1% | 5.7% | 2.1% |
| Squirrel monkeys | <1% | 5.4% | 2.0% |
| Orangutans | 4.1% | 0% | 2.4% |
| Others | 8.1% | 0% | 4.8% |
| Setting | YT-Filt. | R&O | PriVi |
|---|---|---|---|
| In the wild | 59.6% | 62.7% | 60.9% |
| Wild-like | 27.8% | 22.4% | 25.6% |
| Indoors | 4.1% | 14.6% | 8.4% |
| Not identifiable | 8.6% | 0% | 5.1% |
Data curation pipeline. Our automated pipeline filters, deduplicates, and curates web-sourced primate videos at scale.
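The full pipeline (filtering models, thresholds, keyword lists) is described in the paper. As one illustrative component, near-duplicate video removal can be sketched with a perceptual average hash over sampled frames; the function names, 8×8 hash size, and 5-bit threshold below are our own assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

def average_hash(frame: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Perceptual average hash of a grayscale frame (H, W) -> hash_size**2 bits."""
    h, w = frame.shape
    # Block-mean downsample to hash_size x hash_size (simple stand-in for a resize).
    ys = np.linspace(0, h, hash_size + 1, dtype=int)
    xs = np.linspace(0, w, hash_size + 1, dtype=int)
    small = np.array([[frame[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
                       for j in range(hash_size)] for i in range(hash_size)])
    # Threshold each block at the global mean and flatten to a bit vector.
    return (small > small.mean()).flatten()

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

def deduplicate(frames, threshold: int = 5) -> list:
    """Keep indices of frames whose hash differs from all kept hashes
    by more than `threshold` bits; near-duplicates are dropped."""
    kept, hashes = [], []
    for idx, f in enumerate(frames):
        h = average_hash(f)
        if all(hamming(h, k) > threshold for k in hashes):
            kept.append(idx)
            hashes.append(h)
    return kept
```

In practice the same idea extends from single frames to clips by hashing a few sampled frames per video and comparing hash sets.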
We continue pretraining V-JEPA (ViT-L) on PriVi to learn primate-specific video representations. For evaluation, we train a lightweight attentive classifier with only 220K trainable parameters on top of the frozen encoder. This enables efficient adaptation to diverse primate behavior benchmarks without finetuning the backbone.
Model overview. We continue pretraining V-JEPA on PriVi and optionally the target dataset, then evaluate with a frozen backbone and a lightweight attentive classifier.
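To make the frozen-evaluation setup concrete, the NumPy sketch below implements attentive pooling over frozen encoder tokens followed by a linear head: a learned query scores each spatio-temporal token, a softmax-weighted sum pools them, and a linear layer produces class logits. This is a minimal single-query stand-in, not the paper's exact module (whose cross-attention design accounts for the ~220K trainable parameters).

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class AttentiveClassifier:
    """Minimal attentive probe over frozen encoder tokens (illustrative sketch)."""

    def __init__(self, dim: int, num_classes: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Learned query that scores tokens, plus a linear classification head.
        self.query = rng.normal(scale=dim ** -0.5, size=(dim,))
        self.w_head = rng.normal(scale=dim ** -0.5, size=(dim, num_classes))
        self.b_head = np.zeros(num_classes)

    def __call__(self, tokens: np.ndarray) -> np.ndarray:
        # tokens: (num_tokens, dim) output of the frozen backbone for one clip.
        attn = softmax(tokens @ self.query / np.sqrt(tokens.shape[-1]))
        pooled = attn @ tokens                      # (dim,) weighted token average
        return pooled @ self.w_head + self.b_head   # (num_classes,) logits
```

Only the query and head would be trained; the backbone stays frozen, which is what keeps per-benchmark adaptation cheap.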
PriVi consistently outperforms prior methods across four primate behavior benchmarks, including fully finetuned baselines with orders of magnitude more trainable parameters.
| Method | ChimpACT mAP | ChimpACT mAPw | PanAf500 Acc | PanAf500 B-Acc | BaboonLand Acc | BaboonLand B-Acc | ChimpBehave Acc | ChimpBehave B-Acc |
|---|---|---|---|---|---|---|---|---|
| X3D | 27.05 | 51.60 | 80.00 | 50.35 | 64.89 | 31.41 | 89.3 | 62.8 |
| VideoMAEv2 | — | — | — | — | — | — | 92.3 | <u>74.8</u> |
| UniformerV2-B | — | — | — | — | 63.45 | 28.67 | — | — |
| InternVideo-L | 25.7 | — | 78.57 | 54.01 | — | — | — | — |
| ChimpVLM | — | — | 84.91 | 61.94 | — | — | — | — |
| VideoPrism-g | 31.5 | — | — | — | — | — | — | — |
| Our Classifier | | | | | | | | |
| V-JEPA | 36.33 | 55.50 | 82.96 | 56.69 | 74.91 | 26.99 | 94.99 | 68.41 |
| PriVi | <u>39.25</u> | <u>58.16</u> | **86.74** | <u>62.75</u> | <u>75.43</u> | <u>33.99</u> | <u>95.58</u> | 71.30 |
| PriVi + DaLP | **40.00** | **59.29** | <u>85.01</u> | **62.96** | **76.42** | **38.57** | **96.02** | **75.14** |
Bold = best, underline = second best. Results on test sets.
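For reference, B-Acc in the table is balanced accuracy, commonly defined as the mean of per-class recalls; on imbalanced benchmarks it is more informative than plain accuracy. A minimal sketch:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recalls (the standard balanced-accuracy definition)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(recalls))
```

For example, predicting the majority class on a 3:1 split yields 75% accuracy but only 50% balanced accuracy, which is why the two columns can diverge sharply above.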
PriVi scales favorably with fewer labeled data. Error bars are 95% confidence intervals estimated over three different subsets. Results on validation sets.
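The exact interval estimator is not spelled out here; one common choice for a mean over three subsets is a Student-t interval with df = 2, sketched below (the table of critical values and the function name are our own assumptions).

```python
import numpy as np

# Two-sided 95% Student-t critical values for small df (df = n - 1 subsets).
T_CRIT = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}

def mean_ci95(scores):
    """Mean and 95% confidence-interval half-width over a few subset scores."""
    x = np.asarray(scores, dtype=float)
    sem = x.std(ddof=1) / np.sqrt(len(x))  # standard error of the mean
    return x.mean(), T_CRIT[len(x) - 1] * sem
```

With only three subsets the t critical value (4.303) is much larger than the normal 1.96, so such error bars are deliberately conservative.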
Sample frames from each of the 11 R&O research subsets and the YT-Filtered web data.
PriVi was made possible by contributions from researchers across multiple institutions. The table below lists the contributors of each dataset subset.
| Dataset | Contributors |
|---|---|
| eulemur | Elif Karakoc (DPZ-BE), Claudia Fichtel (DPZ-BE), Peter Kappeler (UGoe) |
| baboon_w | William J. O'Hearn (UGoe/DPZ-CE), Julia Fischer (UGoe/DPZ-CE) |
| baboon_a | Julia Fischer (UGoe/DPZ-CE) |
| lemur | Kaja Wierucka (DPZ-BE) |
| assamese | Sofia M. Pereira (UGoe/DPZ-SE), Julia Ostner (UGoe/DPZ-SE), Oliver Schülke (UGoe/DPZ-SE) |
| barbary_a | Julia Fischer (UGoe/DPZ-CE) |
| barbary_t | Tiffany Bosshard (DPZ-CE), Julia Fischer (UGoe/DPZ-CE) |
| saimiri | Sandro Sehner (UZH), Judith Burkart (UZH) |
| marmoset | Sandro Sehner (UZH), Judith Burkart (UZH) |
| rhesus | Alexander Gail (UGoe/DPZ-CN), Neda Shahidi (UGoe/DPZ-CN) |
UGoe = University of Göttingen · DPZ-BE = DPZ, Behavioral Ecology & Sociobiology · DPZ-CE = DPZ, Cognitive Ethology · DPZ-SE = DPZ, Social Evolution in Primates · DPZ-CN = DPZ, Cognitive Neuroscience · UZH = University of Zurich · DPZ = German Primate Center — Leibniz Institute for Primate Research
@article{mueller2025privi,
title = {Towards a General-Purpose Video Model for Primate
Behavior in the Wild},
author = {Mueller, Felix B. and Meier, Jan F. and Lueddecke, Timo
and Vogg, Richard and Freixanet, Roger L. and Hassler,
Valentin and Bosshard, Tiffany and Karakoc, Elif and
O'Hearn, William J. and Pereira, Sofia M. and Sehner,
Sandro and Wierucka, Kaja and Burkart, Judith and
Fichtel, Claudia and Fischer, Julia and Gail, Alexander
and Hobaiter, Catherine and Ostner, Julia and Samuni,
Liran and Sch{\"u}lke, Oliver and Shahidi, Neda and
Wessling, Erin G. and Ecker, Alexander S.},
journal = {arXiv preprint arXiv:2511.09675},
year = {2025}
}