Editor score 73
Media
moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Why use it
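Moshi handles spoken conversation in full duplex: it can listen and speak at the same time instead of waiting for turns. Its Mimi codec encodes and decodes audio as a stream, so speech is processed as it arrives rather than after the speaker finishes.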
Best for
media
Deployment options
compose · docker
Resource requirements
server
Alternative to
No mapping yet
Common setup stack
Reverse proxy · HTTPS certs · auth gateway · backups