Population coding of strategic variables during foraging in freely moving macaques

https://doi.org/10.1038/s41593-024-01575-w

Neda Shahidi, Melissa Franch, Arun Parajuli, Paul Schrater, Anthony Wright, Xaq Pitkow and Valentin Dragoi

3/5/2024

Objectives and context

The study investigates neural representations that support natural foraging decisions in freely moving primates, addressing limitations of prior trial-based and restrained tasks.
Researchers test whether the dorsolateral prefrontal cortex (dlPFC) encodes subjective reward predictors and whether such neural representations can forecast where and when the animal chooses to act.

Study design and methods

Subjects and setting: two macaques were observed in a trial-free foraging setup with two concurrent reward sources (boxes) separated by 120 cm, enabling natural locomotion and decisions.
Behavioral task: rewards became available at exponentially distributed times after the previous reward, with hidden schedules of 10, 15, 25, or 30 s applied in blocks. A button press delivered the reward, and availability persisted until the press. The design required continuous decisions about when to press and where to press.
Neural recordings: a chronic Utah array in dlPFC (area 46) captured population activity during free movement, with wireless transmission for continuous recording.
Data scale: monkeys G (11 sessions) and T (19 sessions) contributed 1,323 units (single-unit and multi-unit); 8,862 presses were analyzed across the 30 sessions.
Predictors and analyses: reward predictors included waiting time since last press and a reward ratio (current box vs. total scheduled rate). The study examined how these predictors correlated with next rewards and choices, using Pearson correlations and ROC/AUC metrics.
Decoding and latent structure: pre-press neuronal activity (1 s window) was decoded to estimate waiting time and reward ratio. Canonical correlation analysis (CCA) linked task variables (via basis functions) to neural population activity, identifying low-dimensional neural components that track reward predictors, choices, and waiting times.
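
As an illustration of the task and predictors described above, the following is a minimal sketch rather than the authors' code. It simulates one box whose reward becomes available after an exponential delay, then checks how well waiting time predicts reward using a Pearson correlation and a rank-based AUC. The press policy (exponentially distributed waits with a 10 s mean), the function names, and the reading of the reward ratio as the current box's scheduled rate divided by the summed scheduled rates are all assumptions made for this example.

```python
import random
import statistics


def simulate_box(schedule_s, mean_wait_s, n_presses, rng):
    """Simulate presses on one box; return waiting times and reward outcomes."""
    now = 0.0
    # Time at which the next reward becomes available (exponential delay).
    armed_at = rng.expovariate(1.0 / schedule_s)
    waits, rewarded = [], []
    for _ in range(n_presses):
        # Hypothetical press policy: exponentially distributed waiting times.
        wait = rng.expovariate(1.0 / mean_wait_s)
        now += wait
        hit = now >= armed_at  # availability persists until the press
        waits.append(wait)
        rewarded.append(1 if hit else 0)
        if hit:
            # The next reward is scheduled relative to the one just collected.
            armed_at = now + rng.expovariate(1.0 / schedule_s)
    return waits, rewarded


def reward_ratio(current_schedule_s, other_schedule_s):
    """Assumed definition: current box rate over the total scheduled rate."""
    current_rate = 1.0 / current_schedule_s
    other_rate = 1.0 / other_schedule_s
    return current_rate / (current_rate + other_rate)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def auc(scores, labels):
    """Rank-based ROC AUC: P(score of a rewarded press > unrewarded press)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
waits, rewarded = simulate_box(schedule_s=15.0, mean_wait_s=10.0,
                               n_presses=2000, rng=rng)
r = pearson(waits, rewarded)
area = auc(waits, rewarded)
print(f"Pearson r between waiting time and reward: {r:.2f}")
print(f"AUC of waiting time as a reward predictor: {area:.2f}")
print(f"Reward ratio for a 10 s box against a 30 s box: {reward_ratio(10, 30):.2f}")
```

In this toy setting, longer waits make a reward more likely, so both metrics should sit above their chance levels (0 for the correlation, 0.5 for the AUC). The real analysis used the animals' own press times rather than a fixed policy.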

Key findings and arguments

Behavioral sensitivity to reward dynamics: animals adjusted waiting times and switching behavior based on deviations from predicted reward outcomes, indicating subjective reward expectations that go beyond simple reward history.
Neural encoding of reward predictors: a substantial fraction of dlPFC neurons showed pre-press activity covarying with waiting time and reward ratio; decoding showed that waiting time and reward ratio could be predicted from neural activity above chance.
Population-level representations: CCA revealed latent neural components that align with task predictors (waiting time, reward ratio) and outcomes (reward, stay/switch). These components provided a compressed, interpretable neural scaffold for reward dynamics.
Predictive power of latent representations: projecting neural activity onto the reward-predictor components predicted the animal’s next choice and waiting time as well as, or better than, the full neural population or raw task variables. This held across sessions and animals, indicating informative latent structure in dlPFC related to subjective reward dynamics.
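
The latent-component idea in the findings above can be sketched with a small CCA on synthetic data. This is an illustrative sketch under stated assumptions, not the study's pipeline: the data are simulated, the task variables are used directly rather than through the basis functions mentioned in the methods (which this summary does not specify), and the `cca` helper, its `reg` ridge term, and all variable names are hypothetical. CCA is computed here by whitening both data sets and taking an SVD of the whitened cross-covariance.

```python
import numpy as np


def cca(X, Y, reg=1e-6):
    """CCA via whitening and SVD.

    X is (n, p), Y is (n, q). Returns projection weights A (p, k) and
    B (q, k) plus the k canonical correlations, with k = min(p, q).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Covariances, with a small ridge term (an assumption) for stability.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T and Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return A, B, s


rng = np.random.default_rng(0)
n_presses, n_units = 500, 30

# Synthetic task variables per press: waiting time (s) and reward ratio.
task = np.column_stack([
    rng.exponential(10.0, n_presses),
    rng.uniform(0.25, 0.75, n_presses),
])
task_z = (task - task.mean(axis=0)) / task.std(axis=0)

# Synthetic "pre-press activity": a linear mix of task variables plus noise.
mixing = rng.normal(size=(2, n_units))
neural = task_z @ mixing + rng.normal(scale=2.0, size=(n_presses, n_units))

A, B, corrs = cca(neural, task_z)
# Low-dimensional neural components aligned with the task variables.
latent = (neural - neural.mean(axis=0)) @ A
print("canonical correlations:", np.round(corrs, 3))
print("latent shape:", latent.shape)
```

The columns of `latent` play the role of the reward-predictor components: a downstream model would use them, instead of all 30 units, to predict the next choice or waiting time. In practice the correlations should be evaluated on held-out presses, since in-sample canonical correlations are optimistic.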

Ethical and policy considerations

The study uses an unrestrained, trial-free paradigm with wireless neural recording, which reduces the confinement that can distort movement and perception and so addresses some welfare concerns. Although the abstract does not detail it, such designs fall under animal welfare oversight (an IACUC or equivalent body) and raise the question of how to minimize distress while maximizing ecological validity.
There are no human participants or human data in this work; policy implications primarily concern animal research practices, data sharing, and methodological standards for naturalistic neuroscience.

Limitations

Generalizability to other reward dynamics remains uncertain; the stochastic, time-based schedule may not capture non-Markovian or press-rate–driven reward structures.
The study analyzes two subjects; although the dataset is rich, broader replication would strengthen cross-subject generalizability.
While movement confounds were mitigated, residual non-task-related variance could influence some neural signals.

Implications for policy and practice

The study demonstrates the utility of naturalistic, freely moving paradigms for studying neural computation, informing future research design and ethical considerations in neuroscience.
It also highlights the value of targeted dimensionality reduction (latent neural components) for interpreting complex brain–behavior relationships, with potential cross-disciplinary applications in AI and decision theory.