Robot Control API
Complete API reference for controlling robot animations, expressions, and voice synthesis. All endpoints accept and return JSON.
Health check and queue monitoring endpoints
Server health check. Returns status and timestamp.
Returns the current number of commands waiting in the Redis queue.
Pops the next command from the queue (FIFO). Used by the robot terminal to consume commands. Warning: the call is destructive. The returned command is removed from the queue and cannot be read again.
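
The FIFO, destructive behavior described above can be sketched as a small consumer loop. This is only an illustration: `drain_queue`, `pop_next`, and the `type` values in the sample commands are hypothetical names, and an in-memory `deque` stands in for the Redis-backed queue. It also assumes that an empty queue is signalled by `None`, which the reference does not state.

```python
from collections import deque
from typing import Callable, Optional


def drain_queue(
    pop_next: Callable[[], Optional[dict]],
    handle: Callable[[dict], None],
    max_commands: int = 100,
) -> int:
    """Pop and handle commands in FIFO order until the queue is empty.

    `pop_next` is assumed to return None when nothing is waiting.
    Every successful pop removes the command for good, so a command
    must be handled (or stored) before the next pop.
    """
    handled = 0
    while handled < max_commands:
        command = pop_next()
        if command is None:
            break
        handle(command)
        handled += 1
    return handled


# In-memory stand-in for the queue; the "type" values are hypothetical.
queue = deque([
    {"type": "expression", "data": {"expressionId": "happy"}},
    {"type": "tts", "data": {"text": "Hello"}},
])
seen = []
count = drain_queue(lambda: queue.popleft() if queue else None, seen.append)
```

In a real terminal, `pop_next` would wrap the HTTP call to the pop endpoint. The `max_commands` cap keeps one polling pass bounded.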
Control robot expressions and Spine animations via the command queue
Send an expression command. The robot terminal will play the animation mapped to the given expression.
| Parameter | Type | Description |
|---|---|---|
| expressionId (required) | string | Expression identifier, one of: happy, sad, angry, surprised, thinking, embarrassed, neutral |
| duration (optional) | number | Duration in ms (default: 3000) |
| clientId (optional) | string | Client identifier for tracking |
Expression → Animation Mapping:
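Each animation name starts with the same two-character prefix, which is unreadable in the source encoding. Only the part after the prefix is listed.
| Expression | Animation (after the shared prefix) |
|---|---|
| happy | 微笑 (smile) |
| sad | one character, not recoverable |
| angry | two characters, not recoverable |
| surprised | 驚訝 (surprised) |
| thinking | 思考 (thinking) |
| embarrassed | 汗顏 (embarrassed) |
| neutral | 待机 (standby) |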
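
As a minimal sketch, the expression parameters can be assembled and validated on the client before sending. The field names and the default duration come from the parameter table. The helper name `build_expression_command` is hypothetical, and the endpoint path is left out because this reference does not list it.

```python
import json
from typing import Optional

# Identifiers listed in the parameter table.
EXPRESSION_IDS = (
    "happy", "sad", "angry", "surprised",
    "thinking", "embarrassed", "neutral",
)


def build_expression_command(
    expression_id: str,
    duration: int = 3000,
    client_id: Optional[str] = None,
) -> dict:
    """Build the JSON body for an expression command (hypothetical helper)."""
    if expression_id not in EXPRESSION_IDS:
        raise ValueError(f"unknown expressionId: {expression_id!r}")
    if duration <= 0:
        raise ValueError("duration must be a positive number of milliseconds")
    body = {"expressionId": expression_id, "duration": duration}
    if client_id is not None:
        body["clientId"] = client_id  # optional, used only for tracking
    return body


payload = json.dumps(build_expression_command("surprised", duration=1500))
```

Rejecting unknown identifiers locally avoids queuing a command the robot terminal cannot map to an animation.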
Send a Spine animation action command directly by animation name.
| Parameter | Type | Description |
|---|---|---|
| actionName (required) | string | Action/animation name |
| spineAnimation (required) | string | Spine animation name (usually the same as actionName) |
| loop (optional) | boolean | Whether to loop the animation (default: false) |
| clientId (optional) | string | Client identifier |
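
A request body for an action command could be built along these lines. This is a sketch under assumptions: `build_action_command` is a hypothetical helper, and falling back to `actionName` when no Spine animation is given reflects the "usually the same" note in the table rather than documented server behavior.

```python
from typing import Optional


def build_action_command(
    action_name: str,
    spine_animation: Optional[str] = None,
    loop: bool = False,
    client_id: Optional[str] = None,
) -> dict:
    """Build the JSON body for an action command (hypothetical helper)."""
    if not action_name:
        raise ValueError("actionName is required")
    body = {
        "actionName": action_name,
        # Both fields are required by the API, so the fallback is applied
        # on the client: reuse actionName when no animation name is given.
        "spineAnimation": spine_animation or action_name,
        "loop": loop,
    }
    if client_id is not None:
        body["clientId"] = client_id
    return body


# "wave" is a made-up animation name used only for illustration.
body = build_action_command("wave", loop=True)
```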
Text-to-Speech synthesis with lip-sync viseme data
Send a TTS command to the robot queue. The robot terminal will synthesize speech and play with lip-sync.
| Parameter | Type | Description |
|---|---|---|
| text (required) | string | Text to speak |
| voice (optional) | string | Voice language, one of: zh-CN, zh-TW, en-US, ja-JP |
| speed (optional) | number | Speed multiplier (default: 1.0) |
| mouthAnimation (optional) | string | Lip-sync method (default: "phoneme") |
| clientId (optional) | string | Client identifier |
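
The TTS parameters can be checked on the client in the same way. The sketch below uses the field names, voice codes, and defaults from the table. `build_tts_command` is a hypothetical helper, and `voice` is omitted when not given because the server-side default voice is not documented here.

```python
from typing import Optional

# Voice codes listed in the parameter table.
VOICES = ("zh-CN", "zh-TW", "en-US", "ja-JP")


def build_tts_command(
    text: str,
    voice: Optional[str] = None,
    speed: float = 1.0,
    mouth_animation: str = "phoneme",
    client_id: Optional[str] = None,
) -> dict:
    """Build the JSON body for a queued TTS command (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text is required and must not be blank")
    if voice is not None and voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    if speed <= 0:
        raise ValueError("speed must be a positive multiplier")
    body = {"text": text, "speed": speed, "mouthAnimation": mouth_animation}
    if voice is not None:
        body["voice"] = voice
    if client_id is not None:
        body["clientId"] = client_id
    return body


body = build_tts_command("Hello, world", voice="en-US", speed=1.2)
```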
Azure Speech REST API synthesis. Returns audio (base64 MP3) and estimated viseme data for lip-sync. Visemes are generated from text heuristics.
| Parameter | Type | Description |
|---|---|---|
| text (required) | string | Text to synthesize |
| voice (optional) | string | Language hint (e.g. "en-US" or "zh-CN") |
| speed (optional) | number | Speed multiplier (default: 1.0) |
Requires AZURE_SPEECH_KEY to be configured.
Azure Speech SDK synthesis with accurate real-time viseme events. Provides precise lip-sync timing from the SDK's viseme callback. Recommended for production use.
| Parameter | Type | Description |
|---|---|---|
| text (required) | string | Text to synthesize |
| voice (optional) | string | Language hint |
| speed (optional) | number | Speed multiplier (default: 1.0) |
| Response Field | Description |
|---|---|
| audio | Base64-encoded MP3 audio |
| visemes | Array of {start, end, visemeId, spineSlot} |
| duration | Total audio duration in seconds |
| voice | Voice name used |
| method | "SDK" |
| rawVisemeCount | Number of raw viseme events from the SDK |
| mergedVisemeCount | Number of visemes after merging (optimized) |
Requires AZURE_SPEECH_KEY to be configured.
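
A client might consume the SDK response roughly as follows: decode the audio, then look up which Spine slot should be visible at a given playback time. The helper names and the sample values are made up for illustration. The sketch also assumes that `start` and `end` are in seconds, like `duration`, which the reference does not state.

```python
import base64
from typing import Optional


def decode_audio(response: dict) -> bytes:
    """Return the MP3 bytes carried in the base64 `audio` field."""
    return base64.b64decode(response["audio"])


def slot_at(response: dict, t: float) -> Optional[str]:
    """Return the Spine slot active at playback time t, or None.

    Assumes start/end are in seconds and each interval is half-open,
    i.e. start <= t < end.
    """
    for viseme in response["visemes"]:
        if viseme["start"] <= t < viseme["end"]:
            return viseme["spineSlot"]
    return None


# Hand-written sample in the documented response shape (not real output).
sample = {
    "audio": base64.b64encode(b"fake-mp3-bytes").decode("ascii"),
    "visemes": [
        {"start": 0.0, "end": 0.12, "visemeId": 19, "spineSlot": "b,m,p"},
        {"start": 0.12, "end": 0.30, "visemeId": 2, "spineSlot": "a,c,C,w,!"},
    ],
    "duration": 0.30,
    "voice": "en-US",
    "method": "SDK",
    "rawVisemeCount": 3,
    "mergedVisemeCount": 2,
}

mp3_bytes = decode_audio(sample)
current_slot = slot_at(sample, 0.2)
```

A linear scan is enough for short utterances. For long audio, a binary search over the sorted start times would avoid rescanning the list on every frame.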
Send multiple commands in a single request
Send an array of commands to be pushed into the queue at once.
| Parameter | Type | Description |
|---|---|---|
| commands (required) | array | Array of command objects. Each must have a type and a data field. |
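
A batch body can be checked against the one documented rule, that every command carries `type` and `data`, before it is sent. `build_batch` is a hypothetical helper, and the `type` strings in the example are placeholders because the accepted values are not listed in this reference.

```python
def build_batch(commands: list) -> dict:
    """Wrap command objects in a batch body after checking required keys."""
    if not commands:
        raise ValueError("commands must contain at least one command")
    for index, command in enumerate(commands):
        missing = [key for key in ("type", "data") if key not in command]
        if missing:
            raise ValueError(
                f"command {index} is missing: {', '.join(missing)}"
            )
    return {"commands": list(commands)}


# The "type" values below are placeholders, not documented constants.
batch = build_batch([
    {"type": "expression", "data": {"expressionId": "happy", "duration": 2000}},
    {"type": "tts", "data": {"text": "Nice to meet you", "voice": "en-US"}},
])
```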
Azure Viseme ID to Spine slot mapping used for lip-sync
| Viseme ID | Phonemes | Spine Slot |
|---|---|---|
| 0 | Silence | 嘴正常 (mouth, neutral) |
| 1 | ae, ax, ah | A,e,H,i,I |
| 2 | aa | a,c,C,w,! |
| 3, 6, 7, 8, 20 | ao, uh, ow, uw, w | o.u |
| 4, 5, 9, 10, 21 | ey, eh, iy, ih, y | A,e,H,i,I |
| 11, 12 | r, l | r |
| 13, 14 | s, z, sh, ch, jh, zh | s,z |
| 15, 17 | th, dh, d, t, n | T,D,L |
| 16 | f, v | F,v |
| 18 | k, g, ng | G |
| 19 | p, b, m | b,m,p |
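
For client-side use, the table can be expanded into a lookup keyed by viseme ID. This is a sketch that mirrors the rows above. The names `SLOT_GROUPS`, `VISEME_TO_SLOT`, and `slot_for` are hypothetical, and the silence slot name is an assumption, decoded from mis-encoded text in the source, that should be checked against the Spine skeleton.

```python
# Assumption: decoded from mis-encoded source text; verify before use.
SILENCE_SLOT = "嘴正常"

# One entry per table row: (viseme IDs, Spine slot name).
SLOT_GROUPS = [
    ((0,), SILENCE_SLOT),
    ((1,), "A,e,H,i,I"),
    ((2,), "a,c,C,w,!"),
    ((3, 6, 7, 8, 20), "o.u"),
    ((4, 5, 9, 10, 21), "A,e,H,i,I"),
    ((11, 12), "r"),
    ((13, 14), "s,z"),
    ((15, 17), "T,D,L"),
    ((16,), "F,v"),
    ((18,), "G"),
    ((19,), "b,m,p"),
]

# Flatten the groups into a direct ID -> slot dictionary.
VISEME_TO_SLOT = {
    viseme_id: slot for ids, slot in SLOT_GROUPS for viseme_id in ids
}


def slot_for(viseme_id: int) -> str:
    """Return the Spine slot for a viseme ID, falling back to silence."""
    return VISEME_TO_SLOT.get(viseme_id, SILENCE_SLOT)
```

Falling back to the silence slot for unknown IDs is a design choice made here so that an unexpected viseme closes the mouth instead of raising an error during playback.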