Text-to-Speech API Guide

20-03-2026

Convert text to speech programmatically. Send an HTTP request, get an audio file back — MP3, WAV, FLAC, OGG, or Opus.

What is SpeechGen API

SpeechGen API lets you convert text to speech programmatically — no browser, no manual work. Send your text and voice parameters via an HTTP request and receive a ready-to-use audio file.

What you can build:

  • Auto-narrate articles, news feeds, and notifications
  • Integrate TTS into your app, chatbot, or CRM
  • Batch-process large volumes of text
  • Generate podcasts, audiobooks, and e-learning content

Base URL for all requests:

https://speechgen.io/index.php?r=api

Getting started

1. Top up your balance. API access requires a paid account.
2. Copy your token from the Dashboard.
3. Make your first request. Ready-to-use examples in cURL, PHP, Python, and JavaScript are below.

Choosing the Right Method

The API offers three endpoints for speech synthesis. Choose the one that fits your use case:

Method | How it works | Best for
/text | Send a request and get the audio file immediately. Single step. | Short texts up to 2,000 characters. Instant results.
/longtext | Send a request and receive a task id, then poll /result until the file is ready. Two steps. | Long texts up to 1,000,000 characters. Books, articles, bulk content.
/subs | Same as /longtext, but also returns timestamps (subtitles). Two steps. | Text-to-time alignment for video, presentations, or karaoke.

Tip: instant results for long texts

Split your text into sentences (up to 2,000 chars per request), synthesize each chunk via /text, and concatenate the audio files on your end (e.g., with ffmpeg). This gives you immediate results without waiting for the /longtext queue.
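The splitting step in this tip can be sketched in Python as follows. This is a minimal illustration, not part of the API: the function name split_into_chunks and the naive sentence rule (break after ".", "!" or "?" followed by whitespace) are assumptions, and real text may need a smarter splitter.

```python
import re

MAX_CHARS = 2000  # the /text limit stated above


def split_into_chunks(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Fallback: a single sentence longer than the limit is cut hard.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent to /text in order, and the resulting files joined in the same order.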

Authentication

Every request to the API must include two required parameters:

Parameter | Type | Description
token | string | Your secret API key. Found in the Dashboard. Never share it.
email | string | The email address associated with your account.

Accepted data formats

The API accepts data in three ways; use whichever is convenient:
1. POST with application/x-www-form-urlencoded (most common)
2. POST with application/json in the request body
3. GET with query string parameters
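To make the three options concrete, here is a hedged sketch that builds (but does not send) the same call in each encoding using only the Python standard library. The helper name build_requests is hypothetical. Note that the endpoint URL already contains a query string, so GET parameters are appended with "&".

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://speechgen.io/index.php?r=api/text"


def build_requests(params: dict) -> dict[str, Request]:
    """Build the same API call in the three accepted encodings (nothing is sent)."""
    form = Request(
        ENDPOINT,
        data=urlencode(params).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    as_json = Request(
        ENDPOINT,
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The URL already has "?r=api/text", so further parameters follow "&".
    query = Request(f"{ENDPOINT}&{urlencode(params)}", method="GET")
    return {"form": form, "json": as_json, "query": query}
```

Any of the three objects can be passed to urllib.request.urlopen. Query strings tend to end up in server logs and browser history, so one of the POST variants is usually the safer choice for a secret token.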

/text — Instant Synthesis (1 Step)

Methods: POST, GET
https://speechgen.io/index.php?r=api/text

The simplest way to convert text to speech. Send a single request and get the audio file immediately in the response. No polling, no second request needed.

Limit: 2,000 characters per request. For longer texts, use /longtext or split into chunks.

Required parameters

Parameter | Type | Description
token | string | Your API key
email | string | Account email
voice | string | Voice name, e.g. Matthew plus, Aria. Full list: /voices
text | string | Text to convert (up to 2,000 characters)

Optional parameters

All parameters below are optional. Default values are used when omitted.

Voice & style

Parameter | Type | Default | Description
speed | float | 1 | Speech speed. Range 0.1–2.0. E.g. 0.8 = slower, 1.3 = faster.
pitch | int | 0 | Voice pitch. Range −20 to 20.
style | string | | Emotional style: newscast, cheerful, sad, etc. Not available for all voices. (Legacy name emotion also accepted.)
styledegree | string | | Style intensity, e.g. 1.5 for stronger emotion. Works only with style.
role | string | | Voice role: YoungAdultMale, OlderAdultFemale, etc. Not available for all voices.

Pauses & volume

Parameter | Type | Default | Description
pause_sentence | int | | Pause between sentences in milliseconds, e.g. 300 = 0.3 s.
pause_paragraph | int | | Pause between paragraphs in milliseconds, e.g. 500 = 0.5 s.
volume | int | 100 | Output volume. 100 = normal, 150 = louder, 50 = quieter. Range 10–200.
effect | string | | Audio effect, e.g. car (in-car sound). Available for some voices.

Audio format & quality

Parameter | Type | Default | Description
format | string | mp3 | Output format: mp3, wav, ogg, opus, flac
sample_rate | int | | Sample rate in Hz, e.g. 24000, 44100, 48000. Higher = better quality.
bitrate | int | | Bitrate in kbps for lossy formats (mp3, ogg, opus). Range 6–320, e.g. 128, 192, 320.
channels | int | | Channels: 1 = mono, 2 = stereo.

Background music

Add background music that plays under the voice.

Parameter | Type | Description
music | int | Background music ID from the catalog. Omit for voice only.
musik_volume | int | Music volume. 100 = normal, 50 = quiet, 150 = loud. Range 5–200.
musik_loop | int | Loop music: 1 = yes (music repeats until speech ends), 0 = no.
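Since the optional parameters have documented ranges, a client can check them before spending a request. The sketch below is one possible pre-flight check; the helper check_options is hypothetical and only encodes the ranges listed in the tables above. It would also flag a legacy Hz-style bitrate value, which the server is said to still accept.

```python
# Ranges copied from the parameter tables above.
RANGES = {
    "speed": (0.1, 2.0),
    "pitch": (-20, 20),
    "volume": (10, 200),
    "bitrate": (6, 320),
    "musik_volume": (5, 200),
}
FORMATS = {"mp3", "wav", "ogg", "opus", "flac"}


def check_options(options: dict) -> list[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in options and not low <= options[name] <= high:
            problems.append(f"{name}={options[name]} is outside {low}..{high}")
    if "format" in options and options["format"] not in FORMATS:
        problems.append(f"format={options['format']!r} is not a supported format")
    if "styledegree" in options and "style" not in options:
        problems.append("styledegree has no effect without style")
    return problems
```

For example, check_options({"speed": 3}) reports that the speed is out of range, while an empty result means the options can be merged into the request as they are.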

Example: /text request

cURL

# --data-urlencode escapes spaces and punctuation in the value
curl -X POST "https://speechgen.io/index.php?r=api/text" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  --data-urlencode "voice=Matthew plus" \
  --data-urlencode "text=Hello! This is a test voiceover via the API." \
  -d "format=mp3" \
  -d "speed=1" \
  -d "sample_rate=24000" \
  -d "bitrate=192" \
  -d "channels=2"

PHP

<?php
$url = "https://speechgen.io/index.php?r=api/text";
$data = [
    'token'  => 'YOUR_TOKEN',
    'email'  => 'you@example.com',
    'voice'  => 'Matthew plus',
    'text'   => 'Hello! This is a test voiceover via the API.',
    'format' => 'mp3',
    'speed'  => 1,
    'sample_rate' => 24000,
    'bitrate'     => 192,
    'channels'    => 2,
];

$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL  => $url,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => http_build_query($data),
    // Keep TLS verification on: the request carries your secret token.
    CURLOPT_SSL_VERIFYHOST => 2,
    CURLOPT_SSL_VERIFYPEER => true,
]);

$response = curl_exec($ch);
if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
curl_close($ch);

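$result = json_decode($response, true);

// json_decode() returns null when the body is not valid JSON
if (is_array($result) && ($result['status'] ?? -1) == 1) {
    echo "File: " . $result['file'] . "\n";
    echo "Duration: " . $result['duration'] . " sec\n";
    echo "Cost: " . $result['cost'] . "\n";
    // Download the audio (copy() from a URL needs allow_url_fopen enabled)
    copy($result['file'], 'output.' . $result['format']);
} else {
    echo "Error: " . ($result['error'] ?? 'Unknown error');
}

Python

import requests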

url = "https://speechgen.io/index.php?r=api/text"
data = {
    "token": "YOUR_TOKEN",
    "email": "you@example.com",
    "voice": "Matthew plus",
    "text": "Hello! This is a test voiceover via the API.",
    "format": "mp3",
    "speed": 1,
    "sample_rate": 24000,
    "bitrate": 192,
    "channels": 2,
}

response = requests.post(url, data=data, timeout=60)
response.raise_for_status()  # fail early on HTTP-level errors
result = response.json()

if result.get("status") == 1:
    print("File:", result["file"])
    print("Duration:", result["duration"], "sec")
    print("Cost:", result["cost"])

    # Download the generated audio file
    audio = requests.get(result["file"], timeout=60)
    audio.raise_for_status()
    with open(f'output.{result["format"]}', "wb") as f:
        f.write(audio.content)
else:
    print("Error:", result.get("error", "Unknown error"))

JavaScript

const url = "https://speechgen.io/index.php?r=api/text";

const data = new URLSearchParams({
    token: "YOUR_TOKEN",
    email: "you@example.com",
    voice: "Matthew plus",
    text: "Hello! This is a test voiceover via the API.",
    format: "mp3",
    speed: "1",
    sample_rate: "24000",
    bitrate: "192",
    channels: "2",
});

const resp = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: data.toString(),
});

const result = await resp.json();

if (result.status === 1) {
    console.log("File:", result.file);
    console.log("Duration:", result.duration, "sec");
} else {
    console.error("Error:", result.error);
}

Response: /text (JSON)

On success (status = 1), the response contains a link to the audio file:

{
  "id": "2870459",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/20260114/p_2870459_342.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=2870459&cors=...",
  "parts": "1",
  "parts_done": "1",
  "duration": 3,
  "format": "mp3",
  "error": "",
  "balans": "11563.364",
  "cost": 0.042
}

Response fields

Field | Description
id | Unique synthesis ID
status | 1 = file ready, -1 = error (see error field)
file | Direct link to the audio file; download or stream it
file_cors | CORS-enabled link; use when downloading from the browser (JavaScript)
duration | Audio duration in seconds
format | File format (mp3, wav, flac, …)
balans | Remaining account balance
cost | Amount charged for this synthesis
error | Error message (when status = -1)
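The fields above can be turned into a small typed result on the client side. The following is a sketch only: the names Synthesis, SpeechGenError and parse_text_response are hypothetical helpers, not part of the API. It relies on the sample response, where balans arrives as a string and cost as a number.

```python
import json
from typing import NamedTuple


class SpeechGenError(Exception):
    """Raised when the response does not report status = 1 (hypothetical helper)."""


class Synthesis(NamedTuple):
    url: str
    seconds: float
    cost: float
    balance: float


def parse_text_response(body: str) -> Synthesis:
    """Parse a /text JSON response, raising SpeechGenError unless status is 1."""
    result = json.loads(body)
    if int(result.get("status", -1)) != 1:
        raise SpeechGenError(result.get("error") or "Unknown error")
    return Synthesis(
        url=result["file"],
        seconds=float(result["duration"]),
        # float() accepts both the numeric "cost" and the string "balans".
        cost=float(result["cost"]),
        balance=float(result["balans"]),
    )
```

Raising on any status other than 1 keeps the calling code short: it either gets a usable link or an exception carrying the server's error text.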

Tip: batch processing via /text

Need to synthesize a long text but want instant results? Split the text by sentences (up to 2,000 chars each), process each chunk sequentially via /text, and concatenate the audio files on your end (e.g., with ffmpeg). This is faster than waiting for /longtext queue processing.
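For the concatenation step, one common approach is ffmpeg's concat demuxer, which reads a list file and can join parts without re-encoding. The sketch below only writes that list file and returns the command; the helper name write_concat_list and the file names are assumptions. Stream copy works only when all parts share the same format and audio settings.

```python
from pathlib import Path


def write_concat_list(parts: list[str], list_path: str = "parts.txt") -> list[str]:
    """Write an ffmpeg concat list and return the command that would join the parts."""
    # One "file '<path>'" line per part, in playback order.
    lines = [f"file '{Path(p).as_posix()}'" for p in parts]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # "-c copy" joins the streams without re-encoding them.
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path,
            "-c", "copy", "output.mp3"]
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed. Relative paths in the list file are resolved relative to the list file itself.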

/longtext — Long Text Synthesis (2 Steps)

Methods: POST, GET
https://speechgen.io/index.php?r=api/longtext

Designed for large texts — up to 1,000,000 characters (books, articles, documents). Unlike /text, the result is not immediate — the task is queued and processed on the server.

How it works

Two-step process

Step 1. Send your text to /longtext. You get a task id and status = 0 (processing).
Step 2. Poll /result with that id every 2–5 seconds. When status becomes 1, the file is ready; grab the link from the file field.

Request parameters are identical to /text (voice, text, format, speed, pitch, style, sample_rate, bitrate, channels, music, etc.). The only difference in the request is the text limit: 1,000,000 characters instead of 2,000.

Example: /longtext + /result

cURL

# Step 1: submit text, get task id
# (--data-urlencode escapes spaces and punctuation in the value)
curl -X POST "https://speechgen.io/index.php?r=api/longtext" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  --data-urlencode "voice=Matthew plus" \
  --data-urlencode "text=Your long text goes here, up to 1 million characters..." \
  -d "format=mp3" \
  -d "sample_rate=24000" \
  -d "bitrate=128" \
  -d "channels=1"

# Response: {"id":"4153594","status":0,...}

# Step 2: poll for result (repeat every 2 sec until status != 0)
curl -X POST "https://speechgen.io/index.php?r=api/result" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  -d "id=4153594"

# When status=1, the "file" field contains the audio URL

PHP

<?php
$BASE  = "https://speechgen.io/index.php?r=api";
$TOKEN = "YOUR_TOKEN";
$EMAIL = "you@example.com";

function postForm($url, $data) {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_URL  => $url,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => http_build_query($data),
        // Keep TLS verification on: the request carries your secret token.
        CURLOPT_SSL_VERIFYHOST => 2,
        CURLOPT_SSL_VERIFYPEER => true,
    ]);
    $raw = curl_exec($ch);
    if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
    curl_close($ch);
    return json_decode($raw, true);
}

// STEP 1: Submit text for synthesis
$create = postForm($BASE . "/longtext", [
    'token'  => $TOKEN,
    'email'  => $EMAIL,
    'voice'  => 'Matthew plus',
    'text'   => 'Your long text goes here...',
    'format' => 'mp3',
    'sample_rate' => 24000,
    'bitrate'     => 128,
    'channels'    => 1,
]);

$id = $create['id'] ?? 0;
if (!$id) die("Error: " . ($create['error'] ?? 'No id returned'));
echo "Task created, id=$id\n";

// STEP 2: Poll for result every 2 seconds
for ($i = 0; $i < 600; $i++) {
    $res = postForm($BASE . "/result", [
        'token' => $TOKEN, 'email' => $EMAIL, 'id' => $id
    ]);

    if (($res['status'] ?? 0) == 1) {
        echo "File ready: " . $res['file'] . "\n";
        echo "Duration: " . $res['duration'] . " sec\n";
        copy($res['file'], 'output.' . $res['format']);
        break;
    }

    if (($res['status'] ?? 0) == -1) {
        die("Error: " . ($res['error'] ?? 'Unknown'));
    }

    echo "Processing... ({$res['parts_done']}/{$res['parts']})\n";
    sleep(2);
}

Python

import time
import requests

BASE  = "https://speechgen.io/index.php?r=api"
TOKEN = "YOUR_TOKEN"
EMAIL = "you@example.com"

# STEP 1: Submit text for synthesis
create = requests.post(f"{BASE}/longtext", data={
    "token": TOKEN,
    "email": EMAIL,
    "voice": "Matthew plus",
    "text": "Your long text goes here...",
    "format": "mp3",
    "sample_rate": 24000,
    "bitrate": 128,
    "channels": 1,
}, timeout=60).json()

pid = create.get("id")
if not pid:
    raise RuntimeError(f"Error: {create.get('error', 'No id returned')}")
print(f"Task created, id={pid}")

# STEP 2: Poll for result every 2 seconds
for _ in range(600):
    res = requests.post(f"{BASE}/result", data={
        "token": TOKEN, "email": EMAIL, "id": pid
    }, timeout=60).json()

    if res.get("status") == 1:
        print(f"File ready: {res['file']}")
        print(f"Duration: {res['duration']} sec")
        audio = requests.get(res["file"], timeout=60)
        audio.raise_for_status()
        with open(f"output.{res['format']}", "wb") as f:
            f.write(audio.content)
        break

    if res.get("status") == -1:
        raise RuntimeError(f"Error: {res.get('error')}")

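    print(f"Processing... ({res.get('parts_done')}/{res.get('parts')})")
    time.sleep(2)
else:
    # The loop ended without a break: nothing was ready after 600 polls.
    raise TimeoutError("Synthesis did not finish in time")

JavaScript

const BASE = "https://speechgen.io/index.php?r=api";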

async function postForm(path, obj) {
    const resp = await fetch(`${BASE}/${path}`, {
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        body: new URLSearchParams(obj).toString(),
    });
    return await resp.json();
}

// STEP 1: Submit text for synthesis
const create = await postForm("longtext", {
    token: "YOUR_TOKEN",
    email: "you@example.com",
    voice: "Matthew plus",
    text: "Your long text goes here...",
    format: "mp3",
    sample_rate: "24000",
    bitrate: "128",
    channels: "1",
});

if (!create.id) throw new Error(create.error || "No id returned");
const id = String(create.id);
console.log(`Task created, id=${id}`);

// STEP 2: Poll for result every 2 seconds
for (let i = 0; i < 600; i++) {
    const res = await postForm("result", {
        token: "YOUR_TOKEN", email: "you@example.com", id
    });

    if (res.status === 1) {
        console.log("File ready:", res.file);
        console.log("Duration:", res.duration, "sec");
        break;
    }

    if (res.status === -1) throw new Error(res.error || "Error");

    console.log(`Processing... (${res.parts_done}/${res.parts})`);
    await new Promise(r => setTimeout(r, 2000));
}

Response: Step 1

When the task is created, the server returns status = 0 (queued) and an id for tracking:

{
  "id": "4153594",
  "status": 0,
  "parts": "5",
  "parts_done": "0",
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 0.00
}

Response: Step 2 (file ready)

When status becomes 1, the file and duration fields appear:

{
  "id": "4153594",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/.../result.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=4153594&cors=...",
  "parts": "5",
  "parts_done": "5",
  "duration": 42,
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 1.26
}

Polling frequency

Poll every 2 seconds for short texts. For very long texts (books), increase the interval to 5 seconds to reduce server load.
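The polling advice can be wrapped in a reusable loop with a configurable interval and an overall time limit. This is a sketch under assumptions: wait_for_result is a hypothetical helper, and fetch_status stands for any function of yours that calls /result and returns the decoded JSON. The default timeout of 1200 seconds matches the 600 polls at 2 seconds used in the examples above.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 1200.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until status is 1; raise on status -1 or on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        status = int(result.get("status", 0))
        if status == 1:
            return result
        if status == -1:
            raise RuntimeError(result.get("error") or "Synthesis failed")
        # Stop before sleeping if the next poll would land past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("Result was not ready before the timeout")
        sleep(interval)
```

For a book-length text, calling it with interval=5.0 follows the recommendation above. The sleep argument is injectable so the loop can be tested without real waiting.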

/subs — Synthesis with Subtitles (2 Steps)

Methods: POST, GET
https://speechgen.io/index.php?r=api/subs

Works exactly like /longtext (two-step process: submit → poll), but additionally returns timestamps — text-to-time alignment for each fragment. Useful for creating subtitles for video, karaoke, or synchronized presentations.

Additional parameters

In addition to all parameters from /text, the /subs endpoint accepts:

Parameter | Type | Description
speed_type | int | Speed control type: 1 or 2
speed_floor | int | Additional speed/pause parameter

Usage is identical to /longtext — just change the URL to /subs and add the extra parameters. The /result response will include a cuts array with links to individual audio fragments and their timestamps.

/result — Get Result

Methods: POST, GET
https://speechgen.io/index.php?r=api/result

Used only after /longtext or /subs. Not needed for /text — that returns the file immediately.

Pass the task id received in Step 1, and the server returns the current status. Poll in a loop until status becomes 1 (ready) or -1 (error).

Parameters

Parameter | Type | Required | Description
token | string | yes | API key
email | string | yes | Account email
id | int | yes | Task ID from /longtext or /subs

Response fields

Field | Description
status | 0 = still processing (wait), 1 = ready (download file), -1 = error
file | Direct link to the audio file (appears when status = 1)
file_cors | CORS-enabled link for browser downloads
cuts | Array of audio fragments (when using /subs or the obrezka tag)
parts / parts_done | Total fragments / completed fragments; useful for a progress bar
duration | Duration in seconds (when status = 1)
balans | Remaining balance
cost | Total cost (increases as fragments are processed)
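As a small illustration of the progress-bar use mentioned for parts and parts_done, the sketch below converts the two fields into a percentage. The helper name progress_percent is hypothetical. In the sample responses both values arrive as strings, so they are converted first.

```python
def progress_percent(result: dict) -> int:
    """Turn parts / parts_done from a /result response into a 0-100 percentage."""
    # Missing or empty fields count as zero.
    total = int(result.get("parts") or 0)
    done = int(result.get("parts_done") or 0)
    if total <= 0:
        return 0
    return min(100, done * 100 // total)
```

With the Step 1 sample above (parts "5", parts_done "0") this gives 0, and it reaches 100 once all five parts are done.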

Helper Endpoints

/voices — List all voices

Method: GET
https://speechgen.io/index.php?r=api/voices

Returns a JSON array of all available voices. Use it to build a voice picker in your app.

Filter by language:

https://speechgen.io/index.php?r=api/voices&langs=en,ru
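Building the filtered URL programmatically might look like the sketch below. The helper name voices_url is an assumption. The shape of the returned JSON entries is not described here, so the sketch stops at the URL.

```python
from typing import Iterable
from urllib.parse import quote

VOICES_URL = "https://speechgen.io/index.php?r=api/voices"


def voices_url(langs: Iterable[str] = ()) -> str:
    """Return the /voices URL, optionally filtered by language codes."""
    codes = list(langs)
    if not codes:
        return VOICES_URL
    # Codes are joined with commas, as in the example URL above.
    return f"{VOICES_URL}&langs={quote(','.join(codes), safe=',')}"
```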

/balance — Check balance

Methods: POST, GET
https://speechgen.io/index.php?r=api/balance

Parameters: token, email. Returns your current account balance.

/delete — Delete a project

Methods: POST, GET
https://speechgen.io/index.php?r=api/delete

Parameters: token, email, id. Deletes the project and its audio file from the server.

Audio Formats & Quality

The API supports five audio formats:

Format | Type | When to use
mp3 | lossy | Universal format. Works everywhere, good for most use cases.
ogg | lossy | Good quality at small file sizes. Popular on the web.
opus | lossy | Best quality at low bitrates. Ideal for streaming.
wav | lossless | No compression. Maximum quality, but large file size.
flac | lossless | Lossless compression. Same quality as WAV, smaller file.

sample_rate vs. bitrate

Parameter | What it is | Unit | Examples
sample_rate | Sample rate: how many times per second the sound is sampled. Higher = more detail. | Hz | 24000, 44100, 48000
bitrate | Bitrate: how much data per second of audio. Only matters for lossy formats (mp3, ogg, opus). | kbps | 64, 128, 192, 320

Backward compatibility: bitrate / sample_rate

In earlier versions, the bitrate parameter actually controlled the sample rate (Hz), not bitrate. This has been fixed: bitrate is now real bitrate (kbps), and sample_rate was added for frequency control.

For compatibility: if sample_rate is omitted and bitrate contains a value that looks like Hz (e.g. 48000), the server automatically interprets it as sample_rate. Legacy code will continue to work.
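Legacy client code can also be migrated explicitly instead of relying on the server-side fallback. The sketch below mirrors the rule on the client. The server's actual cutoff for "looks like Hz" is not documented here, so the threshold is an assumption: anything above 320, the top of the documented bitrate range, is treated as a sample rate. The helper name normalize_audio_params is hypothetical.

```python
MAX_KBPS = 320  # upper end of the documented bitrate range (assumed cutoff)


def normalize_audio_params(params: dict) -> dict:
    """Move a Hz-looking legacy `bitrate` value into `sample_rate`."""
    fixed = dict(params)  # leave the caller's dict untouched
    bitrate = fixed.get("bitrate")
    if "sample_rate" not in fixed and bitrate is not None and int(bitrate) > MAX_KBPS:
        fixed["sample_rate"] = int(bitrate)
        del fixed["bitrate"]
    return fixed
```

A call such as normalize_audio_params({"bitrate": 48000}) yields a request that states its intent with sample_rate, which keeps working even if the fallback is removed one day.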
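
FLAC / WAV note

The bitrate parameter does not apply to lossless formats. Their effective bitrate follows from the sample rate, bit depth, and channel count (and, for FLAC, from how well the audio compresses). If Windows shows 300–1200+ kbps for a FLAC file, that's normal.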
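To see where such figures come from, the arithmetic can be worked through in a few lines. This is an illustrative sketch: the 16-bit sample depth is an assumption (the bit depth of the generated files is not stated here), and both helper names are hypothetical.

```python
def pcm_kbps(sample_rate: int, channels: int, bit_depth: int = 16) -> float:
    """Bitrate of uncompressed PCM audio in kbps: rate x channels x bits per sample."""
    return sample_rate * channels * bit_depth / 1000


def lossy_size_kb(bitrate_kbps: int, seconds: float) -> float:
    """Approximate size in kilobytes of a constant-bitrate lossy file."""
    return bitrate_kbps * seconds / 8  # 8 bits per byte
```

Under the 16-bit assumption, mono audio at 24000 Hz works out to 384 kbps uncompressed and stereo at 48000 Hz to 1536 kbps. A FLAC file normally lands somewhat below the uncompressed figure. By comparison, 42 seconds of 128 kbps MP3 is roughly 672 KB.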

Errors & Status Codes

Every API response contains a status field:

Status | Meaning | Action
1 | Ready | File is ready; download it from the file field
0 | Processing | Retry /result in 2 seconds
-1 | Error | Check the error field for details

Common errors:

  • Empty or invalid token — verify your API key in the Dashboard
  • Text exceeds 2,000 characters for /text — use /longtext or split into chunks
  • Unknown voice name — check available voices via /voices
  • Insufficient balance — top up your account
