Whispeak API (0.2)


Introduction

The Whispeak API lets you enroll, authenticate, and identify a speaker by their voice.
• Enrollment: generate a signature from a recording of the speaker's voice.
• Authentication: compare a voice recording with a signature to check whether they belong to the same speaker.
• Identification: find the signatures closest to an audio file within a group of previously enrolled speakers, in order to determine who the recording belongs to.

Applications

To use this API, you need a dedicated application. An "Application" must be configured with a specific type (e.g. Authentication, Identification). The relevant endpoints refer to each application by its path name in the URI (e.g. /customers/whispeak/auth-app).
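As a minimal sketch of how such a URI might be assembled, the helper below joins the customer, application, and optional configuration path names into an endpoint URL. The name `buildEndpoint` is hypothetical and not part of the API; the base URL is the one shown in the endpoint listings of this reference.

```javascript
// Hypothetical helper (not part of the Whispeak API): builds an endpoint URL
// from the customer, application, and optional configuration path names.
const BASE_URL = 'https://api.whispeak.io:3000'; // as shown in the endpoint listings

function buildEndpoint(customer, application, operation, configuration) {
  // The configuration segment is optional; when it is omitted, the API is
  // documented to use the production configuration.
  const segments = ['customers', customer, application];
  if (configuration) segments.push(configuration);
  segments.push(operation);
  return BASE_URL + '/' + segments.map((s) => encodeURIComponent(s)).join('/');
}

console.log(buildEndpoint('whispeak', 'auth-app', 'auth'));
// → https://api.whispeak.io:3000/customers/whispeak/auth-app/auth
```

Encoding each segment separately keeps a path name containing reserved characters from changing the structure of the URI.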

Configurations

A "Configuration" is used to assign the operating parameters of an application.

Expressions

Enrollment, authentication, and identification can be customized when configuring an application. The configuration uses arithmetic and boolean expressions for this purpose. An "Expression" can refer to the engines, which offer predefined functions. Different expressions are used depending on the type of application.

Enrollment

The enrollment stage generates a signature from audio files of the speaker's voice. Enrolling is how you create your application's users (aka speakers): you pass an audio file to the enrollment endpoint, and the signature is generated according to the parameters set in the application configuration.

Authentication

The aim of "Authentication" is to compare a signature against an audio file to determine whether or not they belong to the same speaker.

Identification

The "Identification" process finds the signatures most similar to an audio file in order to determine which speaker it belongs to. Depending on the configuration used for the identification, the response is a list of n speakers ordered from closest to furthest. To perform an "Identification", you need to pass an audio file and a "group" of speaker signatures. This "group" can be generated with the corresponding endpoint.

Signatures

Signatures can be stored either API-side or client-side. The storage mode is fixed for each application, and it changes the request and response parameters of several endpoints. If signatures are stored client-side, enrollment returns them as an encrypted string that you must store yourself. Client-side signatures must be sent with each enrollment update and each authentication.

A signature can be enriched with several audio files to make it stronger.
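The effect of the storage mode on a request can be sketched as follows. The helper `buildAuthFields` is hypothetical; the field names `speaker` and `signature` come from the request body schemas in this reference, and the audio `file` field is left out to keep the sketch free of I/O.

```javascript
// Hypothetical helper: collects the text fields of an authentication or
// enrollment-update request. Field names follow the request body schemas.
function buildAuthFields(speaker, clientSideSignature) {
  const fields = { speaker };
  // Only applications that store signatures client-side send the signature back.
  if (clientSideSignature !== undefined && clientSideSignature !== null) {
    fields.signature = clientSideSignature;
  }
  return fields;
}

// API-side storage: the speaker identifier is enough.
console.log(buildAuthFields('123e4567-e89b-12d3-a456-426655440000'));
// Client-side storage: the stored encrypted string travels with the request.
console.log(buildAuthFields('123e4567-e89b-12d3-a456-426655440000', 'encrypted-signature-string'));
```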

Groups

To perform an identification you need to create a "group" of signatures.

Audio support

This API supports multi-channel audio files in the following formats: wav, mp3, ogg, flac, aiff.
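A client may want to reject obviously unsupported files before uploading them. The check below is an illustrative convenience only, with a hypothetical function name; it looks at the file extension, not the content, so the API's own 415 response remains the authority on what is accepted.

```javascript
// Formats listed in this reference. The extension check is a client-side
// convenience and is not something the API requires.
const SUPPORTED_FORMATS = ['wav', 'mp3', 'ogg', 'flac', 'aiff'];

function hasSupportedExtension(filename) {
  const dot = filename.lastIndexOf('.');
  if (dot === -1) return false; // no extension at all
  return SUPPORTED_FORMATS.includes(filename.slice(dot + 1).toLowerCase());
}

console.log(hasSupportedExtension('enroll.wav')); // → true
console.log(hasSupportedExtension('notes.txt'));  // → false
```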

Security

This API uses session tokens in authorization headers for most operations. These tokens are valid only for a short time to limit replay attacks. All signatures are encrypted and signed before being returned, which prevents users from falsifying them.
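Because session tokens are short-lived, a client may want to know whether a token it holds is still usable. The sketch below rests on an assumption this reference does not confirm: that the token is a JWT whose payload carries the standard `exp` claim. The function name is hypothetical, and the code only reads the payload; it does not verify the token's signature.

```javascript
// Assumption: the session token is a JWT whose payload carries the standard
// `exp` claim (seconds since the Unix epoch). This reference does not document
// the payload contents, so treat this as a sketch only.
function isTokenExpired(token, nowMs = Date.now()) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Not a JWT: expected three dot-separated parts');
  }
  // JWT segments use the URL-safe base64 alphabet; map it back before decoding.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  const payload = JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
  if (typeof payload.exp !== 'number') return false; // no expiry claim to check
  return nowMs >= payload.exp * 1000;
}
```

If the claim is absent or the assumption does not hold, the simplest fallback is to request a fresh session token immediately before each operation.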

Authentication

app-api-key

Application private API key

Security Scheme Type API Key
parameter name:

speaker-token

Speaker session token

Security Scheme Type HTTP
HTTP Authorization Scheme bearer
Bearer format "JWT"

user-token

User access token

Security Scheme Type HTTP
HTTP Authorization Scheme bearer
Bearer format "JWT"

Enrollment API

Functions to manage signatures

Create enrollment session token

Create a session token to perform an enrollment

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

header Parameters
Accept-Language
string
Enum: "fr" "en"

Language spoken in the audio file

Responses

200

Success

400

Incorrect or missing parameters

401

Missing authorization

403

Invalid credentials

GET /customers/{customer}/{application}/{configuration}/enroll
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/enroll

Request samples

import axios from 'axios';

// Application private API key
const api_key = 'Gk9-kPAnxs6-zyloITz2KMrLNG3tBpy27CeCfQIvQy1fHBTc5_gHqRdJbAsbCTiJXtAFdVQPiLakCLPurbk_RA';

// Request a session token for an enrollment
const res = await axios.get('https://api.whispeak.io:3000/customers/my-customer/my-application/enroll', {
	headers: {
		'Authorization': 'Bearer ' + api_key,
		'Accept-Language': 'fr'
	}
});

const token = res.data.token; // Session token to send with the enrollment request
const text = res.data.text;

Response samples

Content type
application/json
{
  "token": "string",
  "text": "string"
}

Create an enrollment

Create a signature associated with a configuration (overwrites any existing signature)

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

header Parameters
Accept-Language
string
Enum: "fr" "en"

Language spoken in the audio file

Request Body schema: multipart/form-data
speaker
string

Speaker associated with the new signature. If not provided, this operation creates a new speaker

file
required
string <binary>

Audio file

Responses

201

Signature created

202

Signature needs to be enriched to be valid

400

Incorrect or missing parameters

401

Missing authorization

403

Invalid credentials

415

Unsupported audio file

420

Audio file does not meet all the conditions

POST /customers/{customer}/{application}/{configuration}/enroll
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/enroll

Request samples

import fs from 'node:fs';
import axios from 'axios';
import FormData from 'form-data';

const token = '...'; // Generated with enrollment token creation endpoint
const file = 'enroll.wav';

// Attach the audio file as multipart/form-data
const form = new FormData();
form.append('file', fs.createReadStream(file));

const res = await axios.post('https://api.whispeak.io:3000/customers/my-customer/my-application/enroll', form, {
	headers: {
		'Authorization': 'Bearer ' + token,
		'Accept-Language': 'fr',
		...form.getHeaders()
	}
});

const speaker = res.data.speaker; // Identifier of the enrolled speaker
console.log(res.data.details);
console.log(res.data.asserts);

Response samples

Content type
application/json
{
  "speaker": "123e4567-e89b-12d3-a456-426655440000",
  "signature": "string",
  "details": [ ],
  "asserts": [ ]
}

Update an enrollment

Update the signature associated with the configuration

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

header Parameters
Accept-Language
string
Enum: "fr" "en"

Language spoken in the audio file

Request Body schema: multipart/form-data
speaker
required
string

Speaker associated with the signature

signature
string

Previous signature to be updated (only when signatures are stored client-side)

file
required
string <binary>

Audio file

Responses

200

Signature updated

202

Signature needs to be enriched to be valid

400

Incorrect or missing parameters

401

Missing authorization

403

Invalid credentials

404

Speaker not found

415

Unsupported audio file

420

Audio file does not meet all the conditions

PUT /customers/{customer}/{application}/{configuration}/enroll
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/enroll

Request samples

import fs from 'node:fs';
import path from 'node:path';
// Assumed: a promise-returning build of `request`, such as request-promise
import request from 'request-promise';

const endpoint = 'https://api.whispeak.io:3000/customers/my-customer/my-application';
const token = '...'; // Generated with enrollment token creation endpoint
const file = 'enroll.wav';
const speaker = '...'; // Speaker identifier previously created

// Send the new audio file as multipart/form-data to update the signature
const res = await request({
    method: 'PUT',
    uri: endpoint + '/enroll',
    json: true,
    headers: {
        'Authorization': 'Bearer ' + token
    },
    formData: {
        speaker,
        file: {
            value: fs.createReadStream(file),
            options: {
                filename: path.basename(file)
            }
        }
    }
});

Response samples

Content type
application/json
{
  "signature": "string",
  "details": [ ],
  "asserts": [ ]
}
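The 202 response on the create and update operations indicates that the signature needs more audio before it is valid. The loop this implies can be sketched as below. Everything about the driver is illustrative: `enrollUntilValid` is a hypothetical name, and `send(method, file, speaker)` is a caller-supplied function assumed to perform one enrollment request and resolve to `{ status, data }`.

```javascript
// Hypothetical driver: `send(method, file, speaker)` is supplied by the caller
// and is assumed to resolve to { status, data } for one enrollment request.
async function enrollUntilValid(send, files) {
  if (files.length === 0) throw new Error('At least one audio file is required');
  // The first file creates the signature (POST); the response carries the speaker id.
  let res = await send('POST', files[0]);
  const speaker = res.data.speaker;
  let used = 1;
  // 202 means the signature must be enriched; update it (PUT) with further files.
  while (res.status === 202 && used < files.length) {
    res = await send('PUT', files[used], speaker);
    used += 1;
  }
  return { speaker, valid: res.status === 200 || res.status === 201, filesUsed: used };
}
```

Injecting `send` keeps the loop independent of the HTTP client. With axios, non-2xx statuses reject the promise by default, so errors such as 415 or 420 would surface as exceptions rather than as a returned status.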

Delete an enrollment

Delete the signature associated with the configuration

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

query Parameters
speaker
required
string
Example: speaker=123e4567-e89b-12d3-a456-426655440000

Speaker associated with the signature

Responses

200

Signature deleted

401

Missing authorization

403

Invalid credentials

404

Speaker not found

DELETE /customers/{customer}/{application}/{configuration}/enroll
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/enroll

Request samples

import axios from 'axios';

// Application private API key
const api_key = 'Gk9-kPAnxs6-zyloITz2KMrLNG3tBpy27CeCfQIvQy1fHBTc5_gHqRdJbAsbCTiJXtAFdVQPiLakCLPurbk_RA';
const speaker = '...'; // Speaker identifier previously created

// The speaker is passed as a query parameter
const res = await axios.delete('https://api.whispeak.io:3000/customers/my-customer/my-application/enroll', {
	headers: {
		'Authorization': 'Bearer ' + api_key,
		'Accept-Language': 'fr'
	},
	params: {
		'speaker': speaker
	}
});

Response samples

Content type
application/json
{
  "status": "500",
  "message": "Internal serveur error"
}

Authentication API

Functions to check that a sample matches a signature

Create an authentication session token

Create a session token to perform an authentication

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

header Parameters
Accept-Language
string
Enum: "fr" "en"

Language spoken in the audio file

Responses

200

Success

400

Incorrect or missing parameters

401

Missing authorization

403

Invalid credentials

GET /customers/{customer}/{application}/{configuration}/auth
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/auth

Request samples

import axios from 'axios';

// Application private API key
const api_key = 'Gk9-kPAnxs6-zyloITz2KMrLNG3tBpy27CeCfQIvQy1fHBTc5_gHqRdJbAsbCTiJXtAFdVQPiLakCLPurbk_RA';

// Request a session token for an authentication
const res = await axios.get('https://api.whispeak.io:3000/customers/my-customer/my-application/auth', {
	headers: {
		'Authorization': 'Bearer ' + api_key,
		'Accept-Language': 'fr'
	}
});

const token = res.data.token; // Session token to send with the authentication request
const text = res.data.text;

Response samples

Content type
application/json
{
  "token": "string",
  "text": "string"
}

Perform an authentication

Authenticate a speaker with an audio file

Authorizations:
path Parameters
customer
required
Example: my-customer

Pathname of the customer

application
required
Example: my-application

Pathname of the application

configuration
Example: my-configuration

Pathname of the configuration (production configuration is used by default)

Request Body schema: multipart/form-data
speaker
required
string

Speaker associated with the signature that will be tested

file
required
string <binary>

Audio file that will be tested

signature
string

Signature to be tested (only when signatures are stored client-side)

Responses

200

Successful authentication

400

Incorrect or missing parameters

401

Missing authorization

403

Invalid credentials

404

Speaker not found

415

Unsupported audio file

419

Authentication failed

420

Audio file does not meet all the conditions

430

Signature does not meet all the conditions
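Several of these statuses are verdicts rather than transport failures, so a client will usually want to tell them apart. The mapping below is a sketch: the status codes and their meanings come from the response list above, while the outcome labels and the function name are illustrative.

```javascript
// Status codes and meanings come from the response list above; the outcome
// labels ('accepted', 'rejected', ...) are illustrative names.
const AUTH_OUTCOMES = new Map([
  [200, 'accepted'],      // Successful authentication
  [419, 'rejected'],      // Authentication failed
  [420, 'bad-audio'],     // Audio file does not meet all the conditions
  [430, 'bad-signature'], // Signature does not meet all the conditions
]);

function interpretAuthStatus(status) {
  // Any other status (400, 401, 403, 404, 415, ...) is treated as a plain error.
  return AUTH_OUTCOMES.get(status) || 'error';
}

console.log(interpretAuthStatus(200)); // → accepted
console.log(interpretAuthStatus(419)); // → rejected
```

With axios, a 419 response rejects the promise by default; passing `validateStatus: () => true` in the request config is one way to receive the status and interpret it instead of catching an exception.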

POST /customers/{customer}/{application}/{configuration}/auth
https://api.whispeak.io:3000/customers/{customer}/{application}/{configuration}/auth

Request samples