API Instructions for Text to Speech

20-03-2026

Convert text to speech programmatically. Send an HTTP request, get an audio file back — MP3, WAV, FLAC, OGG, or Opus.

What is SpeechGen API

SpeechGen API lets you convert text to speech programmatically — no browser, no manual work. Send your text and voice parameters via an HTTP request and receive a ready-to-use audio file.

What you can build:

Auto-narrate articles, news feeds, and notifications
Integrate TTS into your app, chatbot, or CRM
Batch-process large volumes of text
Generate podcasts, audiobooks, and e-learning content

Base URL for all requests:

https://speechgen.io/index.php?r=api

Voice list (JSON) · Voice catalog

Getting started

1. Top up your balance. API access requires a paid account.
2. Copy your token from the Dashboard.
3. Make your first request. Ready-to-use examples in cURL, PHP, Python, and JavaScript follow below.

Choosing the Right Method

The API offers three endpoints for speech synthesis. Choose the one that fits your use case:

Method	How it works	Best for
`/text`	Send a request — get the audio file immediately. Single step.	Short texts up to 2,000 characters. Instant results.
`/longtext`	Send a request — receive a task `id`. Poll `/result` until the file is ready. Two steps.	Long texts up to 1,000,000 characters. Books, articles, bulk content.
`/subs`	Same as `/longtext`, but also returns timestamps (subtitles). Two steps.	Text-to-time alignment for video, presentations, or karaoke.

Tip: instant results for long texts

Split your text at sentence boundaries into chunks of up to 2,000 characters, synthesize each chunk via /text, and concatenate the audio files on your end (e.g., with ffmpeg). This gives you immediate results without waiting for the /longtext queue.
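
The chunking step in this tip can be sketched roughly as follows. This is a minimal illustration, not part of the API: the function name `split_into_chunks` is hypothetical, and the sentence detection is a naive regular expression that will mis-split abbreviations such as "e.g.".

```python
import re

MAX_CHARS = 2000  # /text limit stated in the method table


def split_into_chunks(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Group whole sentences into chunks of at most `limit` characters."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-cut into pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` parameter of a separate /text request, in order.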

Authentication

Every request to the API must include two required parameters:

Parameter	Type	Description
`token`	string	Your secret API key. Found in the Dashboard. Never share it.
`email`	string	The email address associated with your account.

Accepted data formats

The API accepts data in three ways; use whichever is convenient:
1. POST with application/x-www-form-urlencoded (most common)
2. POST with application/json in the request body
3. GET with query string parameters
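
To make the three options concrete, here is a small sketch that builds the same request in each encoding using only the Python standard library. Nothing is sent over the network; the variable names are illustrative. One detail worth noticing is that the endpoint URL already contains a query string (`?r=api/text`), so GET parameters are appended with `&`.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://speechgen.io/index.php?r=api/text"

params = {
    "token": "YOUR_TOKEN",
    "email": "you@example.com",
    "voice": "Matthew plus",
    "text": "Hello!",
}

# 1. POST with a form-encoded body
form_headers = {"Content-Type": "application/x-www-form-urlencoded"}
form_body = urlencode(params)

# 2. POST with a JSON body
json_headers = {"Content-Type": "application/json"}
json_body = json.dumps(params)

# 3. GET: the endpoint URL already carries ?r=..., so the remaining
#    parameters are appended with "&" rather than "?".
get_url = f"{ENDPOINT}&{urlencode(params)}"
```

With GET, the token becomes part of the URL and may end up in proxy or server logs, so a POST variant is usually the safer default.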

/text — Instant Synthesis (1 Step)

POST GET

https://speechgen.io/index.php?r=api/text

The simplest way to convert text to speech. Send a single request and get the audio file immediately in the response. No polling, no second request needed.

Limit: 2,000 characters per request. For longer texts, use /longtext or split into chunks.

Required parameters

Parameter	Type	Description
`token`	string	Your API key
`email`	string	Account email
`voice`	string	Voice name, e.g. `Matthew plus`, `Aria`. Full list — api/voices
`text`	string	Text to convert (up to 2,000 characters)

Optional parameters

All parameters below are optional. Default values are used when omitted.

Voice & style

Parameter	Type	Default	Description
`speed`	float	1	Speech speed. Range 0.1–2.0. E.g. `0.8` = slower, `1.3` = faster.
`pitch`	int	0	Voice pitch. Range −20 to 20.
`style`	string	—	Emotional style: `newscast`, `cheerful`, `sad`, etc. Not available for all voices. Browse voices. (Legacy name `emotion` also accepted.)
`styledegree`	string	—	Style intensity, e.g. `1.5` for stronger emotion. Works only with `style`.
`role`	string	—	Voice role: `YoungAdultMale`, `OlderAdultFemale`, etc. Not available for all voices.

Pauses & volume

Parameter	Type	Default	Description
`pause_sentence`	int	—	Pause between sentences in milliseconds, e.g. `300` = 0.3 s.
`pause_paragraph`	int	—	Pause between paragraphs in milliseconds, e.g. `500` = 0.5 s.
`volume`	int	100	Output volume. `100` = normal, `150` = louder, `50` = quieter. Range 10–200.
`effect`	string	—	Audio effect, e.g. `car` (in-car sound). Available for some voices.

Audio format & quality

Parameter	Type	Default	Description
`format`	string	mp3	Output format: `mp3`, `wav`, `ogg`, `opus`, `flac`
`sample_rate`	int	—	Sample rate in Hz, e.g. `24000`, `44100`, `48000`. Higher = better quality.
`bitrate`	int	—	Bitrate in kbps for lossy formats (mp3, ogg, opus). Range 6–320, e.g. `128`, `192`, `320`.
`channels`	int	—	Channels: `1` = mono, `2` = stereo.

Background music

Add background music that plays under the voice.

Parameter	Type	Description
`music`	int	Background music ID from the catalog. Omit for voice only.
`musik_volume`	int	Music volume. `100` = normal, `50` = quiet, `150` = loud. Range 5–200.
`musik_loop`	int	Loop music: `1` = yes (music repeats until speech ends), `0` = no.
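
Several optional parameters have documented numeric ranges. A client-side check before sending can catch typos early; the sketch below copies the ranges from the tables above. The helper name `check_ranges` is hypothetical, and how the server itself reacts to out-of-range values is not specified here.

```python
# Ranges copied from the parameter tables above.
RANGES = {
    "speed": (0.1, 2.0),
    "pitch": (-20, 20),
    "volume": (10, 200),
    "bitrate": (6, 320),
    "musik_volume": (5, 200),
}


def check_ranges(params: dict) -> list[str]:
    """Return a list of human-readable problems; empty means all values fit."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name not in params:
            continue  # optional parameter omitted: the server default applies
        value = float(params[name])
        if not low <= value <= high:
            problems.append(f"{name}={params[name]} is outside {low}-{high}")
    return problems
```

For example, `check_ranges({"speed": 2.5})` reports one problem, while a request that omits all optional parameters passes unchanged.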

Example: /text request

curl -X POST "https://speechgen.io/index.php?r=api/text" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  -d "voice=Matthew plus" \
  -d "text=Hello! This is a test voiceover via the API." \
  -d "format=mp3" \
  -d "speed=1" \
  -d "sample_rate=24000" \
  -d "bitrate=192" \
  -d "channels=2"

<?php
$url = "https://speechgen.io/index.php?r=api/text";
$data = [
    'token'  => 'YOUR_TOKEN',
    'email'  => 'you@example.com',
    'voice'  => 'Matthew plus',
    'text'   => 'Hello! This is a test voiceover via the API.',
    'format' => 'mp3',
    'speed'  => 1,
    'sample_rate' => 24000,
    'bitrate'     => 192,
    'channels'    => 2,
];

$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL  => $url,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => http_build_query($data),
    CURLOPT_SSL_VERIFYHOST => 2,     // keep TLS certificate checks enabled
    CURLOPT_SSL_VERIFYPEER => true,
]);

$response = curl_exec($ch);
if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
curl_close($ch);

$result = json_decode($response, true);

if (is_array($result) && ($result['status'] ?? 0) == 1) {
    echo "File: " . $result['file'] . "\n";
    echo "Duration: " . $result['duration'] . " sec\n";
    echo "Cost: " . $result['cost'] . "\n";
    copy($result['file'], 'output.' . $result['format']);
} else {
    echo "Error: " . ($result['error'] ?? 'Unknown error');
}

import requests

url = "https://speechgen.io/index.php?r=api/text"
data = {
    "token": "YOUR_TOKEN",
    "email": "you@example.com",
    "voice": "Matthew plus",
    "text": "Hello! This is a test voiceover via the API.",
    "format": "mp3",
    "speed": 1,
    "sample_rate": 24000,
    "bitrate": 192,
    "channels": 2,
}

response = requests.post(url, data=data, timeout=60)
result = response.json()

if result.get("status") == 1:
    print("File:", result["file"])
    print("Duration:", result["duration"], "sec")
    print("Cost:", result["cost"])

    audio = requests.get(result["file"], timeout=60)
    audio.raise_for_status()  # avoid saving an HTTP error page as audio
    with open(f'output.{result["format"]}', "wb") as f:
        f.write(audio.content)
else:
    print("Error:", result.get("error", "Unknown error"))

const url = "https://speechgen.io/index.php?r=api/text";

const data = new URLSearchParams({
    token: "YOUR_TOKEN",
    email: "you@example.com",
    voice: "Matthew plus",
    text: "Hello! This is a test voiceover via the API.",
    format: "mp3",
    speed: "1",
    sample_rate: "24000",
    bitrate: "192",
    channels: "2",
});

const resp = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: data.toString(),
});

const result = await resp.json();

if (result.status === 1) {
    console.log("File:", result.file);
    console.log("Duration:", result.duration, "sec");
} else {
    console.error("Error:", result.error);
}

Response: /text (JSON)

On success (status = 1), the response contains a link to the audio file:

{
  "id": "2870459",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/20260114/p_2870459_342.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=2870459&cors=...",
  "parts": "1",
  "parts_done": "1",
  "duration": 3,
  "format": "mp3",
  "error": "",
  "balans": "11563.364",
  "cost": 0.042
}

Response fields

Field	Description
`id`	Unique synthesis ID
`status`	`1` = file ready, `-1` = error (see `error` field)
`file`	Direct link to the audio file — download or stream it
`file_cors`	CORS-enabled link — use when downloading from the browser (JavaScript)
`duration`	Audio duration in seconds
`format`	File format (mp3, wav, flac, …)
`balans`	Remaining account balance
`cost`	Amount charged for this synthesis
`error`	Error message (when `status = -1`)
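
In the sample response, some numeric values arrive as strings (`id`, `balans`) and others as numbers (`duration`, `cost`). The sketch below normalizes them into typed fields and raises on failure. The names `TextResult`, `SynthesisError`, and `parse_text_response` are hypothetical, and the field types are inferred from the one sample shown above.

```python
import json
from dataclasses import dataclass


class SynthesisError(Exception):
    """Raised when the API reports a status other than 1."""


@dataclass
class TextResult:
    id: int
    file: str
    file_cors: str
    duration: float
    format: str
    balance: float
    cost: float


def parse_text_response(raw: str) -> TextResult:
    data = json.loads(raw)
    if data.get("status") != 1:
        raise SynthesisError(data.get("error") or "Unknown error")
    return TextResult(
        id=int(data["id"]),             # sent as a string in the sample
        file=data["file"],
        file_cors=data.get("file_cors", ""),
        duration=float(data["duration"]),
        format=data["format"],
        balance=float(data["balans"]),  # "balans" is the API's spelling
        cost=float(data["cost"]),
    )
```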

Tip: batch processing via /text

Need to synthesize a long text but want instant results? Split the text at sentence boundaries into chunks of up to 2,000 characters, process each chunk sequentially via /text, and concatenate the audio files on your end (e.g., with ffmpeg). This is faster than waiting for the /longtext queue.
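
For the concatenation step, one common approach is ffmpeg's concat demuxer, which reads a text file listing the parts. The sketch below only writes that list and returns the command line; it assumes the chunk files were already downloaded, that they share the same format, sample rate, and channel count, and that the paths contain no single quotes. The function name and file names are hypothetical.

```python
from pathlib import Path


def write_concat_list(parts: list[str], list_path: str = "parts.txt") -> list[str]:
    """Write an ffmpeg concat list and return the matching command line."""
    lines = [f"file '{Path(p).as_posix()}'" for p in parts]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # -c copy joins the parts without re-encoding them.
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", "output.mp3"]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.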

/longtext — Long Text Synthesis (2 Steps)

POST GET

https://speechgen.io/index.php?r=api/longtext

Designed for large texts — up to 1,000,000 characters (books, articles, documents). Unlike /text, the result is not immediate — the task is queued and processed on the server.

How it works

Two-step process

Step 1. Send your text to /longtext. You get a task id and status = 0 (processing).
Step 2. Poll /result with that id every 2–5 seconds. When status becomes 1, the file is ready; grab the link from the file field.

Request parameters are identical to /text (voice, text, format, speed, pitch, style, sample_rate, bitrate, channels, music, etc.). The only difference: no 2,000-character limit.

Example: /longtext + /result

# Step 1: submit text, get task id
curl -X POST "https://speechgen.io/index.php?r=api/longtext" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  -d "voice=Matthew plus" \
  -d "text=Your long text goes here, up to 1 million characters..." \
  -d "format=mp3" \
  -d "sample_rate=24000" \
  -d "bitrate=128" \
  -d "channels=1"

# Response: {"id":"4153594","status":0,...}

# Step 2: poll for result (repeat every 2 sec until status != 0)
curl -X POST "https://speechgen.io/index.php?r=api/result" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  -d "id=4153594"

# When status=1, the "file" field contains the audio URL

<?php
$BASE  = "https://speechgen.io/index.php?r=api";
$TOKEN = "YOUR_TOKEN";
$EMAIL = "you@example.com";

function postForm($url, $data) {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_URL  => $url,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => http_build_query($data),
        CURLOPT_SSL_VERIFYHOST => 2,     // keep TLS certificate checks enabled
        CURLOPT_SSL_VERIFYPEER => true,
    ]);
    $raw = curl_exec($ch);
    if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
    curl_close($ch);
    return json_decode($raw, true);
}

// STEP 1: Submit text for synthesis
$create = postForm($BASE . "/longtext", [
    'token'  => $TOKEN,
    'email'  => $EMAIL,
    'voice'  => 'Matthew plus',
    'text'   => 'Your long text goes here...',
    'format' => 'mp3',
    'sample_rate' => 24000,
    'bitrate'     => 128,
    'channels'    => 1,
]);

$id = $create['id'] ?? 0;
if (!$id) die("Error: " . ($create['error'] ?? 'No id returned'));
echo "Task created, id=$id\n";

// STEP 2: Poll for result every 2 seconds
for ($i = 0; $i < 600; $i++) {
    $res = postForm($BASE . "/result", [
        'token' => $TOKEN, 'email' => $EMAIL, 'id' => $id
    ]);

    if (($res['status'] ?? 0) == 1) {
        echo "File ready: " . $res['file'] . "\n";
        echo "Duration: " . $res['duration'] . " sec\n";
        copy($res['file'], 'output.' . $res['format']);
        break;
    }

    if (($res['status'] ?? 0) == -1) {
        die("Error: " . ($res['error'] ?? 'Unknown'));
    }

    echo "Processing... ({$res['parts_done']}/{$res['parts']})\n";
    sleep(2);
}

import time
import requests

BASE  = "https://speechgen.io/index.php?r=api"
TOKEN = "YOUR_TOKEN"
EMAIL = "you@example.com"

# STEP 1: Submit text for synthesis
create = requests.post(f"{BASE}/longtext", data={
    "token": TOKEN,
    "email": EMAIL,
    "voice": "Matthew plus",
    "text": "Your long text goes here...",
    "format": "mp3",
    "sample_rate": 24000,
    "bitrate": 128,
    "channels": 1,
}, timeout=60).json()

pid = create.get("id")
if not pid:
    raise RuntimeError(f"Error: {create.get('error', 'No id returned')}")
print(f"Task created, id={pid}")

# STEP 2: Poll for result every 2 seconds
for _ in range(600):
    res = requests.post(f"{BASE}/result", data={
        "token": TOKEN, "email": EMAIL, "id": pid
    }, timeout=60).json()

    if res.get("status") == 1:
        print(f"File ready: {res['file']}")
        print(f"Duration: {res['duration']} sec")
        audio = requests.get(res["file"], timeout=60)
        audio.raise_for_status()  # avoid saving an HTTP error page as audio
        with open(f"output.{res['format']}", "wb") as f:
            f.write(audio.content)
        break

    if res.get("status") == -1:
        raise RuntimeError(f"Error: {res.get('error')}")

    print(f"Processing... ({res.get('parts_done')}/{res.get('parts')})")
    time.sleep(2)

const BASE = "https://speechgen.io/index.php?r=api";

async function postForm(path, obj) {
    const resp = await fetch(`${BASE}/${path}`, {
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        body: new URLSearchParams(obj).toString(),
    });
    return await resp.json();
}

// STEP 1: Submit text for synthesis
const create = await postForm("longtext", {
    token: "YOUR_TOKEN",
    email: "you@example.com",
    voice: "Matthew plus",
    text: "Your long text goes here...",
    format: "mp3",
    sample_rate: "24000",
    bitrate: "128",
    channels: "1",
});

if (!create.id) throw new Error(create.error || "No id returned");
const id = String(create.id);
console.log(`Task created, id=${id}`);

// STEP 2: Poll for result every 2 seconds
for (let i = 0; i < 600; i++) {
    const res = await postForm("result", {
        token: "YOUR_TOKEN", email: "you@example.com", id
    });

    if (res.status === 1) {
        console.log("File ready:", res.file);
        console.log("Duration:", res.duration, "sec");
        break;
    }

    if (res.status === -1) throw new Error(res.error || "Error");

    console.log(`Processing... (${res.parts_done}/${res.parts})`);
    await new Promise(r => setTimeout(r, 2000));
}

Response: Step 1

When the task is created, the server returns status = 0 (queued) and an id for tracking:

{
  "id": "4153594",
  "status": 0,
  "parts": "5",
  "parts_done": "0",
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 0.00
}

Response: Step 2 (file ready)

When status becomes 1, the file and duration fields appear:

{
  "id": "4153594",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/.../result.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=4153594&cors=...",
  "parts": "5",
  "parts_done": "5",
  "duration": 42,
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 1.26
}

Polling frequency

Poll every 2 seconds for short texts. For very long texts (books), increase the interval to 5 seconds to reduce server load.
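
The polling loop can be factored into a small reusable helper with a configurable interval. In this sketch, `fetch` is any zero-argument callable that performs one /result request and returns the decoded JSON; the helper name, the default of 600 attempts, and the injectable `sleep` are illustrative choices, not API requirements.

```python
import time
from typing import Callable


def poll_until_done(fetch: Callable[[], dict], interval: float = 2.0,
                    max_attempts: int = 600,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch` until it reports status 1, raising on -1 or timeout."""
    for _ in range(max_attempts):
        res = fetch()
        status = res.get("status")
        if status == 1:
            return res
        if status == -1:
            raise RuntimeError(res.get("error") or "Unknown error")
        sleep(interval)  # status 0: still processing
    raise TimeoutError(f"Task not ready after {max_attempts} attempts")
```

For a book-length text, the same helper could be called with `interval=5`, in line with the guidance above.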

/subs — Synthesis with Subtitles (2 Steps)

POST GET

https://speechgen.io/index.php?r=api/subs

Works exactly like /longtext (two-step process: submit → poll), but additionally returns timestamps — text-to-time alignment for each fragment. Useful for creating subtitles for video, karaoke, or synchronized presentations.

Additional parameters

In addition to all parameters from /text, the /subs endpoint accepts:

Parameter	Type	Description
`speed_type`	int	Speed control type: `1` or `2`
`speed_floor`	int	Additional speed/pause parameter

Usage is identical to /longtext — just change the URL to /subs and add the extra parameters. The /result response will include a cuts array with links to individual audio fragments and their timestamps.

/result — Get Result

POST GET

https://speechgen.io/index.php?r=api/result

Used only after /longtext or /subs. Not needed for /text — that returns the file immediately.

Pass the task id received in Step 1, and the server returns the current status. Poll in a loop until status becomes 1 (ready) or -1 (error).

Parameters

Parameter	Type	Required	Description
`token`	string	yes	API key
`email`	string	yes	Account email
`id`	int	yes	Task ID from `/longtext` or `/subs`

Response fields

Field	Description
`status`	`0` = still processing (wait), `1` = ready (download file), `-1` = error
`file`	Direct link to the audio file (appears when `status = 1`)
`file_cors`	CORS-enabled link for browser downloads
`cuts`	Array of audio fragments (when using `/subs` or the `obrezka` tag)
`parts` / `parts_done`	Total fragments / completed — useful for a progress bar
`duration`	Duration in seconds (when `status = 1`)
`balans`	Remaining balance
`cost`	Total cost (increases as fragments are processed)
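
As a small illustration of the progress-bar use of `parts` and `parts_done`: the sample responses send both as strings, so they need converting before any arithmetic. The helper name below is hypothetical.

```python
def progress_percent(res: dict) -> int:
    """Whole-number completion percentage from a /result response."""
    total = int(res.get("parts") or 0)      # sample responses send strings
    done = int(res.get("parts_done") or 0)
    if total <= 0:
        return 0
    return min(100, done * 100 // total)
```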

Helper Endpoints

/voices — List all voices

GET

https://speechgen.io/index.php?r=api/voices

Returns a JSON array of all available voices. Use it to build a voice picker in your app.

Filter by language:

https://speechgen.io/index.php?r=api/voices&langs=en,ru
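
A brief sketch of building the filtered URL from a list of language codes. The helper name is hypothetical; the structure of the returned voice objects is not described here, so the sketch stops at the URL.

```python
from typing import Sequence
from urllib.parse import urlencode

VOICES_URL = "https://speechgen.io/index.php?r=api/voices"


def voices_url(langs: Sequence[str] = ()) -> str:
    """Return the /voices URL, optionally filtered by language codes."""
    if not langs:
        return VOICES_URL
    # safe="," keeps the comma literal, matching the example URL above.
    query = urlencode({"langs": ",".join(langs)}, safe=",")
    return f"{VOICES_URL}&{query}"
```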

/balance — Check balance

POST GET

https://speechgen.io/index.php?r=api/balance

Parameters: token, email. Returns your current account balance.

/delete — Delete a project

POST GET

https://speechgen.io/index.php?r=api/delete

Parameters: token, email, id. Deletes the project and its audio file from the server.

Audio Formats & Quality

The API supports five audio formats:

Format	Type	When to use
`mp3`	lossy	Universal format. Works everywhere, good for most use cases.
`ogg`	lossy	Good quality at small file sizes. Popular on the web.
`opus`	lossy	Best quality at low bitrates. Ideal for streaming.
`wav`	lossless	No compression. Maximum quality, but large file size.
`flac`	lossless	Lossless compression. Same quality as WAV, smaller file.

sample_rate vs. bitrate

Parameter	What it is	Unit	Examples
`sample_rate`	Sample rate — how many times per second the sound is sampled. Higher = more detail.	Hz	`24000`, `44100`, `48000`
`bitrate`	Bitrate — how much data per second of audio. Only matters for lossy formats (mp3, ogg, opus).	kbps	`64`, `128`, `192`, `320`

Backward compatibility: bitrate / sample_rate

In earlier versions, the bitrate parameter actually controlled the sample rate (Hz), not the bitrate. This has been fixed: bitrate is now the real bitrate (kbps), and sample_rate was added for frequency control.

For compatibility: if sample_rate is omitted and bitrate contains a value that looks like Hz (e.g. 48000), the server automatically interprets it as sample_rate. Legacy code will continue to work.
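
The fallback can be pictured with the sketch below. The exact rule the server uses to decide that a value "looks like Hz" is not documented, so the threshold here (anything above 320, the top of the documented kbps range) is purely an assumption, as is the function name.

```python
MAX_KBPS = 320  # upper end of the documented bitrate range (assumed cutoff)


def normalize_audio_params(params: dict) -> dict:
    """Mimic the described fallback: a Hz-like `bitrate` becomes `sample_rate`."""
    out = dict(params)
    bitrate = out.get("bitrate")
    if "sample_rate" not in out and bitrate is not None and int(bitrate) > MAX_KBPS:
        out["sample_rate"] = int(out.pop("bitrate"))
    return out
```

New code is better off sending `sample_rate` and `bitrate` explicitly instead of relying on this behavior.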

FLAC / WAV note

Lossless formats don't have a fixed bitrate. If Windows shows 300–1200+ kbps for a FLAC file, that's normal: it's how lossless compression works.
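
A quick calculation shows where such figures come from. Uncompressed PCM audio has a bitrate of sample rate × channels × bit depth. The 16-bit depth used below is an assumption, since the bit depth of the generated files is not stated here.

```python
def pcm_kbps(sample_rate: int, channels: int, bit_depth: int = 16) -> float:
    """Bitrate of uncompressed PCM audio in kbps."""
    return sample_rate * channels * bit_depth / 1000


mono_24k = pcm_kbps(24000, 1)    # 384.0 kbps
stereo_44k = pcm_kbps(44100, 2)  # 1411.2 kbps
stereo_48k = pcm_kbps(48000, 2)  # 1536.0 kbps
```

A WAV file stores roughly this much data per second, and FLAC usually stores the same samples in somewhat less space, so values in the hundreds of kbps or more are expected.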

Errors & Status Codes

Every API response contains a status field:

Status	Meaning	Action
`1`	Ready	File is ready — download from the `file` field
`0`	Processing	Retry `/result` in 2 seconds
`-1`	Error	Check the `error` field for details

Common errors:

Empty or invalid token — verify your API key in the Dashboard
Text exceeds 2,000 characters for /text — use /longtext or split into chunks
Unknown voice name — check available voices via /voices
Insufficient balance — top up your account