Llama-F-Actor
This Hugging Face repository contains F-Actor, a controllable full-duplex model based on Llama 3.2, introduced in "F-Actor: Controllable Conversational Behaviour in Full-Duplex Models".
About our work: Spoken conversational systems require more than accurate speech generation to have human-like conversations: to feel natural and engaging, they must produce conversational behaviour that adapts dynamically to the context. Current spoken conversational systems, however, rarely allow such customization, limiting their naturalness and usability. In this work, we present the first open, instruction-following full-duplex conversational speech model that can be trained efficiently under typical academic resource constraints. By keeping the audio encoder frozen and finetuning only the language model, our model requires just 2,000 hours of data, without relying on large-scale pretraining or multi-stage optimization. The model can follow explicit instructions to control speaker voice, conversation topic, conversational behaviour (e.g., backchanneling and interruptions), and dialogue initiation. We propose a single-stage training protocol and systematically analyze design choices. Both the model and the training code are released to enable reproducible research on controllable full-duplex speech systems.
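The idea of keeping the audio encoder frozen while finetuning only the language model can be illustrated with a minimal PyTorch sketch. This is not the F-Actor architecture or training code: `ToySpeechLM`, its layer sizes, and the helper `freeze_audio_encoder` are hypothetical stand-ins chosen only to show the general pattern.

```python
import torch
from torch import nn


class ToySpeechLM(nn.Module):
    """Hypothetical toy model: a small encoder feeding a small sequence model."""

    def __init__(self, n_feats=80, hidden=32, vocab=100):
        super().__init__()
        self.audio_encoder = nn.Linear(n_feats, hidden)  # stand-in for a pretrained audio encoder
        self.language_model = nn.GRU(hidden, hidden, batch_first=True)  # stand-in for the LLM
        self.head = nn.Linear(hidden, vocab)  # output projection, trained with the LM here

    def forward(self, feats):
        encoded = self.audio_encoder(feats)
        states, _ = self.language_model(encoded)
        return self.head(states)


def freeze_audio_encoder(model):
    """Disable gradients for the encoder and return the remaining trainable parameters."""
    for param in model.audio_encoder.parameters():
        param.requires_grad = False
    model.audio_encoder.eval()
    return [p for p in model.parameters() if p.requires_grad]


model = ToySpeechLM()
trainable = freeze_audio_encoder(model)
optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # the optimizer only sees unfrozen weights

# One illustrative training step on random data (assumed shapes, not real audio).
feats = torch.randn(2, 5, 80)
targets = torch.randint(0, 100, (2, 5))
logits = model(feats)
loss = nn.functional.cross_entropy(logits.reshape(-1, 100), targets.reshape(-1))
loss.backward()
optimizer.step()
```

After the backward pass, the encoder parameters carry no gradients, so only the language model side is updated. Please refer to the released codebase for the actual training setup.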
Please refer to the codebase for the usage of the model.
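As a starting point, the repository files can be fetched locally with the `huggingface_hub` library. The sketch below assumes that the repository ID is `maikezu/f-actor` (taken from the model tree line on this page) and that `huggingface_hub` is installed; the helper `download_model` and the default directory name are hypothetical. Loading and inference should follow the codebase.

```python
REPO_ID = "maikezu/f-actor"  # assumed repository ID, taken from the model tree on this page


def download_model(local_dir="f-actor"):
    """Download all files of the model repository and return the local path."""
    # Imported lazily so that this module can be imported without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example usage (requires network access):
# path = download_model()
```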
For more information, please have a look at the paper.
Citation
If you use this model, please cite:
@misc{züfle2026factorcontrollableconversationalbehaviour,
title={F-Actor: Controllable Conversational Behaviour in Full-Duplex Models},
author={Maike Züfle and Ondrej Klejch and Nicholas Sanders and Jan Niehues and Alexandra Birch and Tsz Kin Lam},
year={2026},
eprint={2601.11329},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.11329},
}
Model tree for maikezu/f-actor
Base model
meta-llama/Llama-3.2-1B-Instruct