Multi-modal emotion console
Capture voice, text, and facial movement in real time to infer emotions using a Plutchik-based taxonomy. Start a session to stream local audio/video features to the FastAPI backend and watch the fused prediction update live.
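A minimal sketch of what the backend side of that loop could look like, assuming a single POST endpoint and a per-window payload of voice features, facial metrics, and the transcript. The endpoint path, field names, and the stub fusion rule are illustrative assumptions, not the actual API.

```python
# Sketch only: endpoint path, model fields, and fusion rule are assumptions.
from typing import Dict

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

PLUTCHIK_PRIMARIES = [
    "joy", "trust", "fear", "surprise",
    "sadness", "disgust", "anger", "anticipation",
]

class InferenceWindow(BaseModel):
    # Per-window voice features computed on the client (hypothetical names).
    voice: Dict[str, float]   # e.g. {"rms": ..., "pitch_hz": ..., "tempo_bpm": ..., "jitter": ...}
    # Per-window facial expression metrics (hypothetical names).
    face: Dict[str, float]
    # Full running transcript, resent with every window to enrich the prediction.
    transcript: str

class Prediction(BaseModel):
    scores: Dict[str, float]  # probability per Plutchik primary emotion
    label: str                # argmax of the fused scores

@app.post("/infer", response_model=Prediction)
def infer(window: InferenceWindow) -> Prediction:
    """Fuse the modalities into one Plutchik-based distribution.

    The real backend presumably runs learned models per modality; this stub
    returns a uniform distribution so the wiring can be exercised end to end.
    """
    scores = {label: 1.0 / len(PLUTCHIK_PRIMARIES) for label in PLUTCHIK_PRIMARIES}
    label = max(scores, key=scores.get)
    return Prediction(scores=scores, label=label)
```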
Idle – start a capture session to analyze emotion signals.
Live camera feed
Enable the session to capture facial expression metrics.

Text transcript
The full transcript is sent with each inference window to enrich the prediction.
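As a rough illustration of that contract, the client can resend the whole transcript alongside the latest per-window features. The URL and payload fields below are assumptions that mirror the endpoint sketch above.

```python
# Sketch only: backend URL and payload fields are assumptions.
import requests

BACKEND_URL = "http://localhost:8000/infer"  # hypothetical FastAPI endpoint

def send_window(voice: dict, face: dict, transcript: str) -> dict:
    """Post one inference window; the full transcript rides along every time."""
    payload = {"voice": voice, "face": face, "transcript": transcript}
    response = requests.post(BACKEND_URL, json=payload, timeout=5)
    response.raise_for_status()
    return response.json()  # fused Plutchik-based prediction

# Example call: the transcript keeps growing, but each window carries all of it.
prediction = send_window(
    voice={"rms": 0.12, "pitch_hz": 180.0, "tempo_bpm": 110.0, "jitter": 0.015},
    face={"smile": 0.7, "brow_raise": 0.1},
    transcript="hi there, I was just saying that",
)
```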
Voice metrics
Start capturing to compute RMS energy, pitch, tempo, and jitter.

Aggregated predictions will appear here once the backend returns the first inference window.
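One plausible way to derive those per-window voice features, assuming librosa is available on the capture side. The pitch bounds, window handling, and the frame-level jitter approximation are assumptions, not the console's actual extraction code.

```python
# Sketch only: librosa-based feature extraction with assumed parameters.
import numpy as np
import librosa

def voice_metrics(y: np.ndarray, sr: int) -> dict:
    """Compute RMS energy, pitch, tempo, and jitter for one audio window."""
    # RMS energy averaged over the window.
    rms = float(librosa.feature.rms(y=y).mean())

    # Frame-wise fundamental frequency via YIN, restricted to a speech range.
    f0 = librosa.yin(y, fmin=65.0, fmax=400.0, sr=sr)
    voiced = f0[np.isfinite(f0) & (f0 > 0)]
    pitch_hz = float(voiced.mean()) if voiced.size else 0.0

    # Tempo (beats per minute) from the onset envelope.
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
    tempo_bpm = float(np.atleast_1d(tempo)[0])

    # Crude relative jitter: mean cycle-to-cycle period change over mean period.
    jitter = 0.0
    if voiced.size > 1:
        periods = 1.0 / voiced
        jitter = float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))

    return {"rms": rms, "pitch_hz": pitch_hz, "tempo_bpm": tempo_bpm, "jitter": jitter}
```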