LIKU Socket

WebSocket API Integration Guide

Overview

This document describes how to integrate with our real-time voice chat WebSocket API. The API enables bidirectional streaming between a client and an AI assistant: the client sends audio, and the server returns transcripts and assistant audio.

Authentication

Authentication is handled via a token-based system. The token is obtained through the Liku authentication API.

Authentication Endpoint

POST https://torooc-liku-api.med2lab.com/api/liku/authentication/

Request Body:

{
  "email": "[email protected]",
  "password": "user_password"
}

Response:

{
  "token_key": "5fbbffb246009bde6b732f42d57b8f3a62c545ad"
}

The token_key should be stored and used for subsequent API calls.

Connection Details

WebSocket Endpoint

wss://voice-liku-chat.med2lab.com/chat-stream  (secure WebSocket connection)

Sequence Diagram

The following sequence diagram illustrates the communication flow between the client and server:

Sequence Description

  1. Authentication

    • Client sends authentication request to the Liku API

    • Server validates credentials and returns a token_key

  2. Connection Initialization

    • Client requests connection data using the token

    • Server generates and returns encoded connection data

  3. WebSocket Connection

    • Client establishes WebSocket connection with the server

    • Client sends connection data to the server

    • Server confirms connection is established

  4. Conversation Flow

    • Client sends audio data to the server

    • Server processes and returns user transcript

    • Server sends assistant audio to client

    • Server sends assistant transcript to client

Connection Process

  1. Authentication

    First, authenticate to obtain a token:

    async function authenticate(email, password) {
      const response = await fetch('https://torooc-liku-api.med2lab.com/api/liku/authentication/', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Accept': 'application/json'
        },
        body: JSON.stringify({ 
          email: email,
          password: password 
        })
      });
      
      if (response.ok) {
        const data = await response.json();
        return data.token_key;
      } else {
        throw new Error('Authentication failed');
      }
    }
  2. Initialize Connection Data

    Make a GET request to obtain the connection initialization data:

    GET https://voice-liku-chat.med2lab.com/api/init-websocket

    Query Parameters:

    • token (required): Your API authentication token

    • topic_id (required): Topic ID for the conversation

    • face_id (optional): Face ID for the avatar

    • liku_id (optional): Liku ID for the assistant

    • language (optional): Language code (e.g., "ko" for Korean, "en" for English)

    Example:

    GET https://voice-liku-chat.med2lab.com/api/init-websocket?token=your_token&topic_id=55&face_id=1&liku_id=33&language=ko
  3. Establish WebSocket Connection

    Use the obtained connection data to establish the WebSocket connection:

    const websocket = new WebSocket('wss://voice-liku-chat.med2lab.com/chat-stream');
    websocket.binaryType = 'arraybuffer';  // Important for audio data handling
  4. Send Connection Data

    Once connected, send the connection data as the first message:

    websocket.onopen = function() {
      const message = JSON.stringify({
        connection_data: connectionData
      });
      websocket.send(message);
    };

Message Types

Client to Server

  1. Connection Data

    • Format: JSON string

    • Sent as the first message after connection is established

    • Contains authentication and configuration information

  2. Audio Data

    • Format: Binary audio data (16-bit PCM, 16kHz sample rate)

    • Send as binary WebSocket messages
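
The API only requires that audio arrive as binary frames of 16-bit PCM; how the client captures and encodes it is not specified. The following is a minimal sketch, assuming a browser with the Web Audio API, that converts Float32 microphone samples to 16-bit PCM and streams them in 4096-byte chunks (matching the recommended chunk size under Audio Requirements). The helper names sendAudioChunk and startMicrophoneStreaming are illustrative, not part of the API.

// Convert Float32 samples in [-1, 1] to 16-bit PCM and send as a binary frame.
function sendAudioChunk(websocket, float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));  // clamp
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;                // scale to int16 range
  }
  if (websocket.readyState === WebSocket.OPEN) {
    websocket.send(pcm.buffer);  // binary WebSocket message
  }
}

// Capture microphone audio at 16kHz mono and stream it to the server.
// createScriptProcessor is deprecated but simple; an AudioWorklet is the modern alternative.
async function startMicrophoneStreaming(websocket) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(2048, 1, 1);  // 2048 samples * 2 bytes = 4096-byte chunks

  processor.onaudioprocess = function(event) {
    sendAudioChunk(websocket, event.inputBuffer.getChannelData(0));
  };

  source.connect(processor);
  processor.connect(audioContext.destination);
}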

Server to Client

  1. Text Messages (JSON format)

    {
      "type": "user_transcript",
      "text": "Transcribed user speech",
      "message_id": "unique_id",
      "status": "completed"
    }

    {
      "type": "assistant_transcript",
      "text": "Assistant's speech transcript",
      "message_id": "unique_id",
      "response_id": "response_id"
    }

    {
      "type": "assistant_message",
      "text": "Assistant's text response",
      "message_id": "unique_id"
    }
  2. Audio Data

    • Format: Binary audio data (16-bit PCM)

    • Received as binary WebSocket messages

Audio Processing

The API handles audio processing with the following parameters:

  • Sample Rate: 16kHz (input), 24kHz (output)

  • Silence Duration: 700ms (for turn detection)

  • Audio Format: 16-bit PCM
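
For playback, each binary frame must be converted from 16-bit PCM back to Float32 and scheduled through the Web Audio API. Below is a minimal sketch of a playAudio() helper like the one referenced in the implementation example later in this guide; it assumes 24kHz mono output (as listed above), and the back-to-back scheduling is a simplification a production client would likely replace with more careful buffering.

// Play a binary chunk of 24kHz, mono, 16-bit PCM received from the server.
const playbackContext = new AudioContext({ sampleRate: 24000 });
let nextPlayTime = 0;

function playAudio(arrayBuffer) {
  const pcm = new Int16Array(arrayBuffer);
  const float32 = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    float32[i] = pcm[i] / 0x8000;  // scale back to [-1, 1]
  }

  const buffer = playbackContext.createBuffer(1, float32.length, 24000);
  buffer.copyToChannel(float32, 0);

  const source = playbackContext.createBufferSource();
  source.buffer = buffer;
  source.connect(playbackContext.destination);

  // Schedule chunks back to back so consecutive frames play without gaps.
  nextPlayTime = Math.max(nextPlayTime, playbackContext.currentTime);
  source.start(nextPlayTime);
  nextPlayTime += buffer.duration;
}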

Saving Transcripts

To save conversation transcripts to the server:

POST https://voice-liku-chat.med2lab.com/api/save-transcript

Headers:

Content-Type: application/json
Authorization: Token {your_token}

Request Body:

{
  "connection_data": "your_connection_data",
  "topic_id": "topic_id",
  "conversation_id": "conversation_id"
}

Complete Implementation Example

// Authentication
async function handleLogin(email, password) {
  try {
    const response = await fetch('https://torooc-liku-api.med2lab.com/api/liku/authentication/', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Accept': 'application/json'
      },
      body: JSON.stringify({ 
        email: email,
        password: password 
      })
    });
    
    if (response.ok) {
      const data = await response.json();
      if (data.token_key) {
        return data.token_key;
      } else {
        throw new Error('Invalid response format from server');
      }
    } else {
      throw new Error('Authentication failed');
    }
  } catch (error) {
    console.error('Login error:', error);
    throw error;
  }
}

// Get connection data
async function getConnectionData(token, topicId, faceId, likuId, language) {
  try {
    let url = `https://voice-liku-chat.med2lab.com/api/init-websocket?token=${encodeURIComponent(token)}&topic_id=${encodeURIComponent(topicId)}`;
    
    // Add optional parameters if they exist
    if (faceId) url += `&face_id=${encodeURIComponent(faceId)}`;
    if (likuId) url += `&liku_id=${encodeURIComponent(likuId)}`;
    if (language) url += `&language=${encodeURIComponent(language)}`;

    const response = await fetch(url);

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.detail || 'Failed to get init data');
    }

    // The endpoint returns the encoded connection data as a plain-text string
    return await response.text();
  } catch (error) {
    console.error('Error getting init websocket data:', error);
    throw error;
  }
}

// Initialize WebSocket
async function initWebSocket(connectionData) {
  const url = 'wss://voice-liku-chat.med2lab.com/chat-stream';

  const websocket = new WebSocket(url);
  websocket.binaryType = 'arraybuffer';

  websocket.onopen = function() {
    const message = JSON.stringify({
      connection_data: connectionData
    });
    websocket.send(message);
  };

  websocket.onmessage = function(event) {
    if (typeof event.data === 'string') {
      try {
        const jsonData = JSON.parse(event.data);
        if (jsonData.type === 'user_transcript') {
          console.log('User said:', jsonData.text);
        } else if (jsonData.type === 'assistant_transcript' || jsonData.type === 'assistant_message') {
          console.log('Assistant said:', jsonData.text);
        }
      } catch (e) {
        console.log('Received text message:', event.data);
      }
    } else {
      // Handle binary audio data
      playAudio(event.data);
    }
  };

  websocket.onclose = function(event) {
    console.log('WebSocket connection closed:', event.code, event.reason);
    if (event.code === 4000) {
      console.error('Connection closed due to invalid connection data');
    }
  };

  websocket.onerror = function(error) {
    console.error('WebSocket error:', error);
  };

  return websocket;
}

// Save transcript to server
async function saveTranscriptToServer(token, connectionData, topicId, conversationId) {
  try {
    const response = await fetch('https://voice-liku-chat.med2lab.com/api/save-transcript', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Token ${token}`
      },
      body: JSON.stringify({
        connection_data: connectionData,
        topic_id: topicId,
        conversation_id: conversationId
      })
    });

    if (!response.ok) {
      const error = await response.json();
      throw new Error(error.detail || 'Failed to save transcript');
    }

    return await response.json();
  } catch (error) {
    console.error('Error saving transcript:', error);
    throw error;
  }
}
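
The functions above can be tied together as follows. This is only a usage sketch: the credentials, topic_id, face_id, liku_id, and language values are the placeholder values used earlier in this guide, and error handling beyond logging is left to the integrator.

// Example usage: authenticate, fetch connection data, then open the stream.
async function startVoiceChat(email, password) {
  const token = await handleLogin(email, password);

  // Placeholder IDs taken from the init-websocket example above.
  const connectionData = await getConnectionData(token, 55, 1, 33, 'ko');

  const websocket = await initWebSocket(connectionData);
  return { token, connectionData, websocket };
}

startVoiceChat('user@example.com', 'user_password')
  .then(() => console.log('Voice chat session started'))
  .catch(error => console.error('Failed to start voice chat:', error));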

Error Handling

WebSocket close codes:

  • 1000: Normal closure

  • 4000: Invalid connection data

  • 1005: No status code present

Audio Requirements

  • Input Audio Format: 16-bit PCM

  • Sample Rate: 16kHz

  • Channels: Mono

  • Chunk Size: Recommended 4096 bytes

Best Practices

  1. Connection Management

    • Implement reconnection logic with exponential backoff (see the sketch after this list)

    • Handle connection timeouts

    • Clean up resources when connection closes

  2. Audio Processing

    • Implement proper audio preprocessing (noise reduction, gain control)

    • Handle silence detection

    • Buffer audio data appropriately

  3. Error Handling

    • Implement proper error handling for all WebSocket events

    • Log errors for debugging

    • Provide user feedback for connection issues

  4. Security

    • Always use secure WebSocket connections (WSS)

    • Never expose tokens in client-side code

    • Implement proper token refresh mechanisms
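
As an illustration of the reconnection point above, one possible wrapper with exponential backoff is sketched below. The retry limit and delay bounds are arbitrary choices, not API requirements.

// Reconnect with exponential backoff after unexpected closures.
function connectWithBackoff(connectionData, maxRetries = 5) {
  let attempt = 0;

  function connect() {
    const websocket = new WebSocket('wss://voice-liku-chat.med2lab.com/chat-stream');
    websocket.binaryType = 'arraybuffer';

    websocket.onopen = function() {
      attempt = 0;  // reset the backoff once a connection succeeds
      websocket.send(JSON.stringify({ connection_data: connectionData }));
    };

    websocket.onclose = function(event) {
      // 1000 is a normal closure; anything else triggers a retry.
      if (event.code !== 1000 && attempt < maxRetries) {
        const delay = Math.min(1000 * 2 ** attempt, 30000);  // 1s, 2s, 4s, ... capped at 30s
        attempt++;
        console.log(`Reconnecting in ${delay}ms (attempt ${attempt}/${maxRetries})`);
        setTimeout(connect, delay);
      }
    };
  }

  connect();
}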

Support

For additional support or questions, please contact our support team at [email protected].
