API instruction for Text to Speech
20-03-2026
Convert text to speech programmatically. Send an HTTP request, get an audio file back — MP3, WAV, FLAC, OGG, or Opus.
What is SpeechGen API
SpeechGen API lets you convert text to speech programmatically — no browser, no manual work. Send your text and voice parameters via an HTTP request and receive a ready-to-use audio file.
What you can build:
- Auto-narrate articles, news feeds, and notifications
- Integrate TTS into your app, chatbot, or CRM
- Batch-process large volumes of text
- Generate podcasts, audiobooks, and e-learning content
Base URL for all requests: https://speechgen.io/index.php?r=api
2. Copy your token from the Dashboard.
3. Make your first request. Ready-to-use examples are below in cURL, PHP, Python, and JavaScript.
Choosing the Right Method
The API offers three endpoints for speech synthesis. Choose the one that fits your use case:
| Method | How it works | Best for |
|---|---|---|
| /text | Send a request and get the audio file immediately. Single step. | Short texts up to 2,000 characters. Instant results. |
| /longtext | Send a request and receive a task id. Poll /result until the file is ready. Two steps. | Long texts up to 1,000,000 characters. Books, articles, bulk content. |
| /subs | Same as /longtext, but also returns timestamps (subtitles). Two steps. | Text-to-time alignment for video, presentations, or karaoke. |
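Tip: If you need long audio quickly, split the text into chunks of up to 2,000 characters, send each chunk to /text, and concatenate the audio files on your end (e.g., with ffmpeg). This gives you immediate results without waiting for the /longtext queue.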
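The chunking approach can be sketched as follows. This is a minimal illustration, not part of the SpeechGen API: `split_into_chunks` is a hypothetical helper that prefers sentence boundaries and hard-cuts only when a single sentence exceeds the limit.

```python
import re

MAX_TEXT_CHARS = 2000  # /text limit stated in this document


def split_into_chunks(text: str, limit: int = MAX_TEXT_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after ".", "!" or "?" followed by whitespace; that whitespace is dropped.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-cut into pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip() if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to /text as its own request, and the resulting files joined in order.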
Authentication
Every request to the API must include two required parameters:
| Parameter | Type | Description |
|---|---|---|
| token | string | Your secret API key. Found in the Dashboard. Never share it. |
| email | string | The email address associated with your account. |
Parameters can be sent in any of three ways:
1. POST with application/x-www-form-urlencoded (most common)
2. POST with application/json in the request body
3. GET with query string parameters
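To make the three transport options concrete, here is a small sketch that only builds the requests and does not send them. The helper names are hypothetical. The endpoint URL is taken from the examples in this document.

```python
import json
from urllib.parse import urlencode

API_TEXT_URL = "https://speechgen.io/index.php?r=api/text"


def as_form_post(params: dict) -> dict:
    """Option 1: POST with a form-encoded body."""
    return {
        "method": "POST",
        "url": API_TEXT_URL,
        "headers": {"Content-Type": "application/x-www-form-urlencoded"},
        "body": urlencode(params),
    }


def as_json_post(params: dict) -> dict:
    """Option 2: POST with a JSON body."""
    return {
        "method": "POST",
        "url": API_TEXT_URL,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(params),
    }


def as_get(params: dict) -> dict:
    """Option 3: GET with everything in the query string."""
    # The endpoint URL already carries the r=... parameter, so append with "&".
    return {
        "method": "GET",
        "url": f"{API_TEXT_URL}&{urlencode(params)}",
        "headers": {},
        "body": None,
    }
```

With GET, the token becomes part of the URL, and URLs are commonly written to access logs, so the POST variants are usually the safer choice.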
/text — Instant Synthesis (1 Step)
The simplest way to convert text to speech. Send a single request and get the audio file immediately in the response. No polling, no second request needed.
Limit: 2,000 characters per request. For longer texts, use /longtext or split into chunks.
Required parameters
| Parameter | Type | Description |
|---|---|---|
| token | string | Your API key |
| email | string | Account email |
| voice | string | Voice name, e.g. Matthew plus, Aria. Full list: /voices |
| text | string | Text to convert (up to 2,000 characters) |
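A client can check the required parameters before sending anything. The sketch below assumes a hypothetical helper, `build_text_payload`, and enforces only what this section states: four required fields and the 2,000-character limit.

```python
MAX_TEXT_CHARS = 2000  # /text limit stated in this document


def build_text_payload(token: str, email: str, voice: str, text: str, **optional) -> dict:
    """Return a request payload for /text, or raise ValueError if it cannot be valid."""
    required = {"token": token, "email": email, "voice": voice, "text": text}
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError(f"Missing required parameters: {', '.join(missing)}")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"Text is {len(text)} characters; /text accepts at most {MAX_TEXT_CHARS}"
        )
    # Optional parameters (format, speed, ...) are passed through unchanged.
    return {**required, **optional}
```

For example, `build_text_payload("YOUR_TOKEN", "you@example.com", "Matthew plus", "Hello", format="mp3")` returns a dict that can be posted as form data.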
Optional parameters
All parameters below are optional. Default values are used when omitted.
Voice & style
| Parameter | Type | Default | Description |
|---|---|---|---|
| speed | float | 1 | Speech speed. Range 0.1–2.0. E.g. 0.8 = slower, 1.3 = faster. |
| pitch | int | 0 | Voice pitch. Range −20 to 20. |
| style | string | - | Emotional style: newscast, cheerful, sad, etc. Not available for all voices; check the voice list. (Legacy name emotion also accepted.) |
| styledegree | string | - | Style intensity, e.g. 1.5 for stronger emotion. Works only with style. |
| role | string | - | Voice role: YoungAdultMale, OlderAdultFemale, etc. Not available for all voices. |
Pauses & volume
| Parameter | Type | Default | Description |
|---|---|---|---|
| pause_sentence | int | - | Pause between sentences in milliseconds, e.g. 300 = 0.3 s. |
| pause_paragraph | int | - | Pause between paragraphs in milliseconds, e.g. 500 = 0.5 s. |
| volume | int | 100 | Output volume. 100 = normal, 150 = louder, 50 = quieter. Range 10–200. |
| effect | string | - | Audio effect, e.g. car (in-car sound). Available for some voices. |
Audio format & quality
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | string | mp3 | Output format: mp3, wav, ogg, opus, flac |
| sample_rate | int | - | Sample rate in Hz, e.g. 24000, 44100, 48000. Higher = better quality. |
| bitrate | int | - | Bitrate in kbps for lossy formats (mp3, ogg, opus). Range 6–320, e.g. 128, 192, 320. |
| channels | int | - | Channels: 1 = mono, 2 = stereo. |
Background music
Add background music that plays under the voice.
| Parameter | Type | Description |
|---|---|---|
| music | int | Background music ID from the catalog. Omit for voice only. |
| musik_volume | int | Music volume. 100 = normal, 50 = quiet, 150 = loud. Range 5–200. |
| musik_loop | int | Loop music: 1 = yes (music repeats until speech ends), 0 = no. |
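The documented ranges lend themselves to a client-side check before a request is sent. The following sketch uses only the ranges and allowed values listed in the tables above. `validate_optional` is a hypothetical helper, and the server's own validation may differ.

```python
# Numeric ranges as documented above (inclusive).
RANGES = {
    "speed": (0.1, 2.0),
    "pitch": (-20, 20),
    "volume": (10, 200),
    "bitrate": (6, 320),  # legacy Hz-style bitrate values are treated as out of range here
    "musik_volume": (5, 200),
}
# Parameters with a fixed set of allowed values.
CHOICES = {
    "format": {"mp3", "wav", "ogg", "opus", "flac"},
    "channels": {1, 2},
    "musik_loop": {0, 1},
}


def validate_optional(params: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= float(params[name]) <= high:
            problems.append(f"{name}={params[name]} is outside {low}..{high}")
    for name, allowed in CHOICES.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}={params[name]} is not one of {sorted(allowed)}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every invalid parameter at once.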
Example: /text request
```bash
# --data-urlencode encodes spaces and punctuation in the voice name and text.
curl -X POST "https://speechgen.io/index.php?r=api/text" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  --data-urlencode "voice=Matthew plus" \
  --data-urlencode "text=Hello! This is a test voiceover via the API." \
  -d "format=mp3" \
  -d "speed=1" \
  -d "sample_rate=24000" \
  -d "bitrate=192" \
  -d "channels=2"
```

```php
<?php
// Endpoint for instant synthesis (texts up to 2,000 characters).
$url = "https://speechgen.io/index.php?r=api/text";
$data = [
    'token'       => 'YOUR_TOKEN',
    'email'       => 'you@example.com',
    'voice'       => 'Matthew plus',
    'text'        => 'Hello! This is a test voiceover via the API.',
    'format'      => 'mp3',
    'speed'       => 1,
    'sample_rate' => 24000,
    'bitrate'     => 192,
    'channels'    => 2,
];

$ch = curl_init();
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL            => $url,
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query($data),
    // TLS certificate verification is left at cURL's secure defaults.
]);
$response = curl_exec($ch);
if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
curl_close($ch);

$result = json_decode($response, true);
if (($result['status'] ?? -1) == 1) {
    echo "File: " . $result['file'] . "\n";
    echo "Duration: " . $result['duration'] . " sec\n";
    echo "Cost: " . $result['cost'] . "\n";
    // Download the audio file (requires allow_url_fopen to be enabled).
    copy($result['file'], 'output.' . $result['format']);
} else {
    echo "Error: " . ($result['error'] ?? 'Unknown error');
}
```
```python
import requests

url = "https://speechgen.io/index.php?r=api/text"
data = {
    "token": "YOUR_TOKEN",
    "email": "you@example.com",
    "voice": "Matthew plus",
    "text": "Hello! This is a test voiceover via the API.",
    "format": "mp3",
    "speed": 1,
    "sample_rate": 24000,
    "bitrate": 192,
    "channels": 2,
}

response = requests.post(url, data=data, timeout=60)
response.raise_for_status()  # fail early on HTTP-level errors
result = response.json()

if result.get("status") == 1:
    print("File:", result["file"])
    print("Duration:", result["duration"], "sec")
    print("Cost:", result["cost"])
    # Download the generated audio file.
    audio = requests.get(result["file"], timeout=60)
    audio.raise_for_status()
    with open(f'output.{result["format"]}', "wb") as f:
        f.write(audio.content)
else:
    print("Error:", result.get("error", "Unknown error"))
```
```javascript
// Uses top-level await: run as an ES module.
const url = "https://speechgen.io/index.php?r=api/text";
const data = new URLSearchParams({
  token: "YOUR_TOKEN",
  email: "you@example.com",
  voice: "Matthew plus",
  text: "Hello! This is a test voiceover via the API.",
  format: "mp3",
  speed: "1",
  sample_rate: "24000",
  bitrate: "192",
  channels: "2",
});

const resp = await fetch(url, {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: data.toString(),
});
const result = await resp.json();

if (result.status === 1) {
  console.log("File:", result.file);
  console.log("Duration:", result.duration, "sec");
} else {
  console.error("Error:", result.error);
}
```
Response: /text (JSON)
On success (status = 1), the response contains a link to the audio file:
```json
{
  "id": "2870459",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/20260114/p_2870459_342.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=2870459&cors=...",
  "parts": "1",
  "parts_done": "1",
  "duration": 3,
  "format": "mp3",
  "error": "",
  "balans": "11563.364",
  "cost": 0.042
}
```
Response fields
| Field | Description |
|---|---|
| id | Unique synthesis ID |
| status | 1 = file ready, -1 = error (see error field) |
| file | Direct link to the audio file. Download or stream it. |
| file_cors | CORS-enabled link. Use it when downloading from the browser (JavaScript). |
| duration | Audio duration in seconds |
| format | File format (mp3, wav, flac, …) |
| balans | Remaining account balance |
| cost | Amount charged for this synthesis |
| error | Error message (when status = -1) |
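The sample response mixes strings and numbers (for example, `balans` arrives as a string), so it can help to normalize it once. The sketch below is one possible shape. `SynthesisResult`, `SpeechGenError`, and `parse_text_response` are hypothetical names, and the field types are inferred from the sample response shown above.

```python
from dataclasses import dataclass


class SpeechGenError(Exception):
    """Raised when the response reports anything other than status = 1."""


@dataclass
class SynthesisResult:
    id: str
    file: str
    file_cors: str
    duration: float
    format: str
    balance: float  # parsed from the "balans" field
    cost: float


def parse_text_response(payload: dict) -> SynthesisResult:
    """Convert a /text JSON response into a typed result, or raise on error."""
    if payload.get("status") != 1:
        raise SpeechGenError(payload.get("error") or "Unknown error")
    return SynthesisResult(
        id=str(payload["id"]),
        file=payload["file"],
        file_cors=payload.get("file_cors", ""),
        duration=float(payload["duration"]),
        format=payload["format"],
        balance=float(payload["balans"]),
        cost=float(payload["cost"]),
    )
```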
Tip: For texts longer than 2,000 characters, split the text into chunks, send each chunk to /text, and concatenate the audio files on your end (e.g., with ffmpeg). This is faster than waiting for /longtext queue processing.
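Joining the downloaded chunk files with ffmpeg could look like the sketch below, which uses ffmpeg's concat demuxer. The helper names are hypothetical, and the sketch assumes file names contain no single quotes and that all chunks were generated with the same format, sample rate, and channel count, because stream copy does not re-encode.

```python
def concat_list_text(files: list[str]) -> str:
    """Build the list file for ffmpeg's concat demuxer: one "file '<path>'" line per input."""
    return "".join(f"file '{name}'\n" for name in files)


def ffmpeg_concat_command(list_path: str, output: str) -> list[str]:
    """Return the ffmpeg argument list that joins the listed files without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer
        "-safe", "0",     # accept arbitrary paths in the list file
        "-i", list_path,
        "-c", "copy",     # copy streams as-is, no re-encoding
        output,
    ]
```

Write the output of `concat_list_text` to a file, then run the command with `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.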
/longtext — Long Text Synthesis (2 Steps)
Designed for large texts — up to 1,000,000 characters (books, articles, documents). Unlike /text, the result is not immediate — the task is queued and processed on the server.
How it works
Step 1. Send your text to /longtext. You get a task id and status = 0 (processing).
Step 2. Poll /result with that id every 2–5 seconds. When status becomes 1, the file is ready; grab the link from the file field.
Request parameters are identical to /text (voice, text, format, speed, pitch, style, sample_rate, bitrate, channels, music, etc.). The only difference: no 2,000-character limit.
Example: /longtext + /result
```bash
# Step 1: submit text, get task id
curl -X POST "https://speechgen.io/index.php?r=api/longtext" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  --data-urlencode "voice=Matthew plus" \
  --data-urlencode "text=Your long text goes here, up to 1 million characters..." \
  -d "format=mp3" \
  -d "sample_rate=24000" \
  -d "bitrate=128" \
  -d "channels=1"
# Response: {"id":"4153594","status":0,...}

# Step 2: poll for result (repeat every 2 sec until status != 0)
curl -X POST "https://speechgen.io/index.php?r=api/result" \
  -d "token=YOUR_TOKEN" \
  -d "email=you@example.com" \
  -d "id=4153594"
# When status=1, the "file" field contains the audio URL
```

```php
<?php
$BASE  = "https://speechgen.io/index.php?r=api";
$TOKEN = "YOUR_TOKEN";
$EMAIL = "you@example.com";

// POST form-encoded data and return the decoded JSON response.
function postForm($url, $data) {
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($data),
        // TLS certificate verification is left at cURL's secure defaults.
    ]);
    $raw = curl_exec($ch);
    if (curl_errno($ch)) die("cURL error: " . curl_error($ch));
    curl_close($ch);
    return json_decode($raw, true);
}

// STEP 1: Submit text for synthesis
$create = postForm($BASE . "/longtext", [
    'token'       => $TOKEN,
    'email'       => $EMAIL,
    'voice'       => 'Matthew plus',
    'text'        => 'Your long text goes here...',
    'format'      => 'mp3',
    'sample_rate' => 24000,
    'bitrate'     => 128,
    'channels'    => 1,
]);
$id = $create['id'] ?? 0;
if (!$id) die("Error: " . ($create['error'] ?? 'No id returned'));
echo "Task created, id=$id\n";

// STEP 2: Poll for result every 2 seconds (up to 600 attempts)
for ($i = 0; $i < 600; $i++) {
    $res = postForm($BASE . "/result", [
        'token' => $TOKEN, 'email' => $EMAIL, 'id' => $id
    ]);
    if (($res['status'] ?? 0) == 1) {
        echo "File ready: " . $res['file'] . "\n";
        echo "Duration: " . $res['duration'] . " sec\n";
        copy($res['file'], 'output.' . $res['format']);
        break;
    }
    if (($res['status'] ?? 0) == -1) {
        die("Error: " . ($res['error'] ?? 'Unknown'));
    }
    echo "Processing... ({$res['parts_done']}/{$res['parts']})\n";
    sleep(2);
}
```
```python
import time

import requests

BASE = "https://speechgen.io/index.php?r=api"
TOKEN = "YOUR_TOKEN"
EMAIL = "you@example.com"

# STEP 1: Submit text for synthesis
create = requests.post(f"{BASE}/longtext", data={
    "token": TOKEN,
    "email": EMAIL,
    "voice": "Matthew plus",
    "text": "Your long text goes here...",
    "format": "mp3",
    "sample_rate": 24000,
    "bitrate": 128,
    "channels": 1,
}, timeout=60).json()

pid = create.get("id")
if not pid:
    raise RuntimeError(f"Error: {create.get('error', 'No id returned')}")
print(f"Task created, id={pid}")

# STEP 2: Poll for result every 2 seconds (up to 600 attempts)
for _ in range(600):
    res = requests.post(f"{BASE}/result", data={
        "token": TOKEN, "email": EMAIL, "id": pid
    }, timeout=60).json()
    if res.get("status") == 1:
        print(f"File ready: {res['file']}")
        print(f"Duration: {res['duration']} sec")
        audio = requests.get(res["file"], timeout=60)
        audio.raise_for_status()
        with open(f"output.{res['format']}", "wb") as f:
            f.write(audio.content)
        break
    if res.get("status") == -1:
        raise RuntimeError(f"Error: {res.get('error')}")
    print(f"Processing... ({res.get('parts_done')}/{res.get('parts')})")
    time.sleep(2)
else:
    # The loop ended without a break: the task never reached status 1.
    raise TimeoutError("Task did not finish within the polling window")
```
```javascript
// Uses top-level await: run as an ES module.
const BASE = "https://speechgen.io/index.php?r=api";

// POST form-encoded data and return the parsed JSON response.
async function postForm(path, obj) {
  const resp = await fetch(`${BASE}/${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams(obj).toString(),
  });
  return await resp.json();
}

// STEP 1: Submit text for synthesis
const create = await postForm("longtext", {
  token: "YOUR_TOKEN",
  email: "you@example.com",
  voice: "Matthew plus",
  text: "Your long text goes here...",
  format: "mp3",
  sample_rate: "24000",
  bitrate: "128",
  channels: "1",
});
if (!create.id) throw new Error(create.error || "No id returned");
const id = String(create.id);
console.log(`Task created, id=${id}`);

// STEP 2: Poll for result every 2 seconds
for (let i = 0; i < 600; i++) {
  const res = await postForm("result", {
    token: "YOUR_TOKEN", email: "you@example.com", id
  });
  if (res.status === 1) {
    console.log("File ready:", res.file);
    console.log("Duration:", res.duration, "sec");
    break;
  }
  if (res.status === -1) throw new Error(res.error || "Error");
  console.log(`Processing... (${res.parts_done}/${res.parts})`);
  await new Promise(r => setTimeout(r, 2000));
}
```
Response: Step 1
When the task is created, the server returns status = 0 (queued) and an id for tracking:
```json
{
  "id": "4153594",
  "status": 0,
  "parts": "5",
  "parts_done": "0",
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 0.00
}
```
Response: Step 2 (file ready)
When status becomes 1, the file and duration fields appear:
```json
{
  "id": "4153594",
  "status": 1,
  "file": "https://speechgen.io/texttomp3/.../result.mp3",
  "file_cors": "https://speechgen.io/index.php?r=site/download&prj=4153594&cors=...",
  "parts": "5",
  "parts_done": "5",
  "duration": 42,
  "format": "mp3",
  "error": "",
  "balans": "3331.272",
  "cost": 1.26
}
```
/subs — Synthesis with Subtitles (2 Steps)
Works exactly like /longtext (two-step process: submit → poll), but additionally returns timestamps — text-to-time alignment for each fragment. Useful for creating subtitles for video, karaoke, or synchronized presentations.
Additional parameters
In addition to all parameters from /text, the /subs endpoint accepts:
| Parameter | Type | Description |
|---|---|---|
| speed_type | int | Speed control type: 1 or 2 |
| speed_floor | int | Additional speed/pause parameter |
Usage is identical to /longtext — just change the URL to /subs and add the extra parameters. The /result response will include a cuts array with links to individual audio fragments and their timestamps.
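Timestamps like these are typically turned into a subtitle file. This document does not show the structure of the `cuts` array, so the sketch below rests on an assumption: each entry is treated as a dict with hypothetical keys `start` and `end` (in seconds) and `text`. Inspect a real /result response and adjust the key names before using it.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def cuts_to_srt(cuts: list[dict]) -> str:
    """Render subtitle entries as SRT text.

    ASSUMPTION: "start", "end" and "text" are hypothetical key names;
    the real fields in the API's cuts array may differ.
    """
    blocks = []
    for index, cut in enumerate(cuts, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(cut['start'])} --> {srt_timestamp(cut['end'])}\n"
            f"{cut['text']}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)
```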
/result — Get Result
Used only after /longtext or /subs. Not needed for /text — that returns the file immediately.
Pass the task id received in Step 1, and the server returns the current status. Poll in a loop until status becomes 1 (ready) or -1 (error).
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| token | string | yes | API key |
| email | string | yes | Account email |
| id | int | yes | Task ID from /longtext or /subs |
Response fields
| Field | Description |
|---|---|
| status | 0 = still processing (wait), 1 = ready (download file), -1 = error |
| file | Direct link to the audio file (appears when status = 1) |
| file_cors | CORS-enabled link for browser downloads |
| cuts | Array of audio fragments (when using /subs or the obrezka tag) |
| parts / parts_done | Total fragments / completed fragments. Useful for a progress bar. |
| duration | Duration in seconds (when status = 1) |
| balans | Remaining balance |
| cost | Total cost (increases as fragments are processed) |
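The `parts` and `parts_done` fields can drive a simple progress display. In the sample responses they arrive as strings, so the hypothetical helpers below convert them first. This is a sketch, not an official client.

```python
def progress_percent(response: dict) -> int:
    """Return completion as a whole percentage, based on parts_done / parts."""
    parts = int(response.get("parts") or 0)
    done = int(response.get("parts_done") or 0)
    if parts <= 0:
        return 0  # nothing known yet
    return min(100, done * 100 // parts)


def progress_bar(response: dict, width: int = 20) -> str:
    """Render a text progress bar such as "[####......] 40%"."""
    pct = progress_percent(response)
    filled = pct * width // 100
    return f"[{'#' * filled}{'.' * (width - filled)}] {pct}%"
```

Calling `progress_bar(res)` inside a polling loop gives a one-line status update per request.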
Helper Endpoints
/voices — List all voices
Returns a JSON array of all available voices. Use it to build a voice picker in your app.
Filter by language:
https://speechgen.io/index.php?r=api/voices&langs=en,ru
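A URL of this form can be assembled with the standard library. `voices_url` is a hypothetical helper. It keeps `/` and `,` unescaped so the result matches the form shown above.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://speechgen.io/index.php"


def voices_url(langs: Optional[list[str]] = None) -> str:
    """Build the /voices URL, optionally filtered by language codes."""
    query = {"r": "api/voices"}
    if langs:
        query["langs"] = ",".join(langs)
    # safe="/," leaves the route separator and the comma-separated list readable.
    return f"{BASE_URL}?{urlencode(query, safe='/,')}"
```

For example, `voices_url(["en", "ru"])` produces the filtered URL, and `voices_url()` produces the unfiltered one.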
/balance — Check balance
Parameters: token, email. Returns your current account balance.
/delete — Delete a project
Parameters: token, email, id. Deletes the project and its audio file from the server.
Audio Formats & Quality
The API supports five audio formats:
| Format | Type | When to use |
|---|---|---|
| mp3 | lossy | Universal format. Works everywhere, good for most use cases. |
| ogg | lossy | Good quality at small file sizes. Popular on the web. |
| opus | lossy | Best quality at low bitrates. Ideal for streaming. |
| wav | lossless | No compression. Maximum quality, but large file size. |
| flac | lossless | Lossless compression. Same quality as WAV, smaller file. |
sample_rate vs. bitrate
| Parameter | What it is | Unit | Examples |
|---|---|---|---|
| sample_rate | Sample rate: how many times per second the sound is sampled. Higher = more detail. | Hz | 24000, 44100, 48000 |
| bitrate | Bitrate: how much data per second of audio. Only matters for lossy formats (mp3, ogg, opus). | kbps | 64, 128, 192, 320 |
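A rough size estimate shows how the two parameters differ. The sketch below is plain arithmetic rather than anything the API provides: lossy size follows from bitrate alone, while uncompressed size follows from sample rate and channel count. The 16-bit sample size for WAV is an assumption, and container headers and variable-bitrate encoding are ignored.

```python
def lossy_size_bytes(bitrate_kbps: float, duration_sec: float) -> int:
    """Approximate size of a constant-bitrate lossy file (mp3, ogg, opus)."""
    # 1 kbps = 1000 bits per second; 8 bits per byte.
    return int(bitrate_kbps * 1000 / 8 * duration_sec)


def pcm_size_bytes(sample_rate_hz: int, channels: int, duration_sec: float,
                   bytes_per_sample: int = 2) -> int:
    """Approximate size of uncompressed PCM audio.

    ASSUMPTION: 16-bit samples (2 bytes); the API's actual WAV bit depth
    is not stated in this document.
    """
    return int(sample_rate_hz * channels * bytes_per_sample * duration_sec)
```

One minute of 128 kbps MP3 comes to roughly 0.96 MB, while one minute of 44.1 kHz stereo 16-bit PCM comes to roughly 10.6 MB.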
Note: In earlier versions of the API, the bitrate parameter actually controlled the sample rate (Hz), not the bitrate. This has been fixed: bitrate is now the real bitrate (kbps), and sample_rate was added for frequency control.
For compatibility: if sample_rate is omitted and bitrate contains a value that looks like Hz (e.g. 48000), the server automatically interprets it as sample_rate. Legacy code will continue to work.
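Client code that still passes Hz values in `bitrate` can be migrated with a small normalization step. The sketch below mirrors the compatibility rule described above, but the threshold is an assumption: the server's actual heuristic is not documented here, so this example treats any value above 320 (the top of the documented bitrate range) as a sample rate.

```python
MAX_BITRATE_KBPS = 320  # upper bound of the documented bitrate range


def normalize_audio_params(params: dict) -> dict:
    """Move a legacy Hz-style bitrate value into sample_rate.

    ASSUMPTION: values above MAX_BITRATE_KBPS are taken to be Hz; the
    server's real detection rule may differ.
    """
    result = dict(params)  # do not mutate the caller's dict
    bitrate = result.get("bitrate")
    if (
        "sample_rate" not in result
        and bitrate is not None
        and int(bitrate) > MAX_BITRATE_KBPS
    ):
        result["sample_rate"] = int(result.pop("bitrate"))
    return result
```

Sending explicit `sample_rate` and `bitrate` values avoids relying on the server-side fallback.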
Errors & Status Codes
Every API response contains a status field:
| Status | Meaning | Action |
|---|---|---|
| 1 | Ready | File is ready. Download it from the file field. |
| 0 | Processing | Retry /result in 2 seconds |
| -1 | Error | Check the error field for details |
Common errors:
- Empty or invalid token: verify your API key in the Dashboard
- Text exceeds 2,000 characters for /text: use /longtext or split into chunks
- Unknown voice name: check available voices via /voices
- Insufficient balance: top up your account
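The status table maps naturally onto a small dispatch function. The sketch below uses hypothetical names (`Action`, `next_action`, `error_message`) and treats any unexpected or missing status as a failure, which is a conservative assumption rather than documented behavior.

```python
from enum import Enum


class Action(Enum):
    DOWNLOAD = "download"  # status 1: file is ready
    WAIT = "wait"          # status 0: poll /result again
    FAIL = "fail"          # status -1 or anything unexpected


def next_action(response: dict) -> Action:
    """Decide what to do with an API response based on its status field."""
    try:
        status = int(response.get("status"))
    except (TypeError, ValueError):
        return Action.FAIL
    if status == 1:
        return Action.DOWNLOAD
    if status == 0:
        return Action.WAIT
    return Action.FAIL


def error_message(response: dict) -> str:
    """Return the error text from a response, with a fallback when it is empty."""
    return response.get("error") or "Unknown error"
```

A polling loop can sleep and retry on `Action.WAIT`, download on `Action.DOWNLOAD`, and report `error_message(response)` on `Action.FAIL`.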