---
name: edge-tts
description: Text-to-speech synthesis using Microsoft Edge's free TTS service. Use when users request converting text to audio, generating speech from text, creating audio files, or mention TTS, text-to-speech, voice synthesis, or want to convert txt files to mp3/wav. Supports Chinese and other languages with multiple voices, adjustable speed and volume.
---
# Edge TTS - Text-to-Speech Synthesis
Convert text to natural-sounding speech using Microsoft Edge's free TTS service.
## Quick Start
Generate audio from text:
```bash
python scripts/edge_tts_synthesizer.py -t "你好，世界！" -o output.mp3
```
Generate audio from a text file:
```bash
python scripts/edge_tts_synthesizer.py -f input.txt -o output.mp3
```
## Common Usage Patterns
### Basic text synthesis
For simple text-to-speech:
```bash
python scripts/edge_tts_synthesizer.py \
-t "Your text here" \
-o output.mp3
```
### Using different voices
Choose from the available voices (the default is `zh-CN-XiaoxiaoNeural`, a female voice):
```bash
# Male voice (Yunxi)
python scripts/edge_tts_synthesizer.py \
-t "这是男声示例" \
-o output.mp3 \
-v zh-CN-YunxiNeural
# Female voice (Xiaoyi)
python scripts/edge_tts_synthesizer.py \
-t "这是女声示例" \
-o output.mp3 \
-v zh-CN-XiaoyiNeural
```
**Common Chinese voices:**
- `zh-CN-XiaoxiaoNeural` - 晓晓 (female, default)
- `zh-CN-YunxiNeural` - 云希 (male)
- `zh-CN-XiaoyiNeural` - 晓伊 (female)
- `zh-CN-YunjianNeural` - 云健 (male)
- `zh-CN-YunyangNeural` - 云扬 (male)
To list all available voices:
```bash
python scripts/edge_tts_synthesizer.py --list-voices
```
### Adjusting speed and volume
Control speech rate and volume:
```bash
# Faster speech (+50%)
python scripts/edge_tts_synthesizer.py \
-t "快速朗读" \
-o output.mp3 \
--rate +50%
# Slower speech (-20%)
python scripts/edge_tts_synthesizer.py \
-t "慢速朗读" \
-o output.mp3 \
--rate -20%
# Louder volume (+30%)
python scripts/edge_tts_synthesizer.py \
-t "大声朗读" \
-o output.mp3 \
--volume +30%
# Combined adjustments
python scripts/edge_tts_synthesizer.py \
-t "又快又大声" \
-o output.mp3 \
--rate +50% \
--volume +20%
```
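`--rate` and `--volume` take signed percentage strings (`+50%`, `-20%`). When building these commands programmatically, a small helper keeps the values well-formed; this is a hypothetical convenience function, not part of the script:

```python
def signed_percent(value: int) -> str:
    """Format an integer as the signed percentage string the CLI expects.

    Zero becomes "+0%" so the flag always carries an explicit sign.
    """
    return f"{value:+d}%"

# signed_percent(50)  -> "+50%"
# signed_percent(-20) -> "-20%"
```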
### Processing text files
Read from a file instead of inline text:
```bash
python scripts/edge_tts_synthesizer.py \
-f input.txt \
-o output.mp3 \
-v zh-CN-XiaoxiaoNeural
```
### Batch processing
When processing multiple texts or creating multiple voice versions:
```bash
# Create multiple voice versions
for voice in zh-CN-XiaoxiaoNeural zh-CN-YunxiNeural zh-CN-XiaoyiNeural; do
python scripts/edge_tts_synthesizer.py \
-t "同样的文本，不同的声音" \
-o "output_${voice}.mp3" \
-v "$voice"
done
# Process multiple text files
for file in *.txt; do
python scripts/edge_tts_synthesizer.py \
-f "$file" \
-o "${file%.txt}.mp3"
done
```
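The shell loops above run one synthesis at a time. Because each call is network-bound, Python users can run a batch concurrently; the sketch below is a generic helper (names are illustrative) that accepts any `(text, path)` coroutine as the synthesizer, such as a wrapper around the `edge-tts` library:

```python
import asyncio

async def batch_synthesize(jobs, synth, limit: int = 4) -> None:
    """Run synthesis jobs concurrently, at most `limit` at a time.

    jobs:  iterable of (text, output_path) pairs
    synth: coroutine taking (text, output_path), e.g. an edge-tts wrapper
    """
    sem = asyncio.Semaphore(limit)

    async def run_one(text, path):
        # The semaphore caps how many requests are in flight at once.
        async with sem:
            await synth(text, path)

    await asyncio.gather(*(run_one(t, p) for t, p in jobs))
```

Capping concurrency matters here: a free public service is more likely to throttle or drop a flood of simultaneous requests than a short queue of them.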
## Output Format
The script supports multiple audio formats:
- `.mp3` (compressed; smaller files, a good default)
- `.wav` (uncompressed; higher quality, larger files)
- Other formats supported by Edge TTS
Specify the format via the output filename extension.
## Error Handling
If synthesis fails:
1. Check internet connection (Edge TTS requires online access)
2. Verify the voice name is correct (use `--list-voices`)
3. Ensure output directory exists and is writable
4. Check that the text encoding is UTF-8 for non-ASCII characters
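Checks 1, 3, and 4 can be automated as a pre-flight step using only the standard library (check 2, voice validation, needs the service itself). A sketch; the function name and messages are illustrative, and the Bing Speech hostname is an assumption about the endpoint `edge-tts` contacts:

```python
import os
import socket

def preflight(text_path: str, output_path: str) -> list[str]:
    """Return a list of problems that would make synthesis fail."""
    problems = []
    # 1. Edge TTS needs network access; try resolving the service host.
    try:
        socket.getaddrinfo("speech.platform.bing.com", 443)
    except OSError:
        problems.append("no network access")
    # 3. The output directory must exist and be writable.
    out_dir = os.path.dirname(os.path.abspath(output_path))
    if not os.path.isdir(out_dir):
        problems.append(f"output directory missing: {out_dir}")
    elif not os.access(out_dir, os.W_OK):
        problems.append(f"output directory not writable: {out_dir}")
    # 4. The input file must exist and be valid UTF-8.
    try:
        with open(text_path, encoding="utf-8") as fh:
            fh.read()
    except FileNotFoundError:
        problems.append(f"input file missing: {text_path}")
    except UnicodeDecodeError:
        problems.append("input file is not valid UTF-8")
    return problems
```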
## Dependencies
This script requires the `edge-tts` Python package. Install with:
```bash
pip install edge-tts
```
The script imports the package at startup, so synthesis fails with an `ImportError` until it is installed.
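For programmatic use, `edge-tts` can also be called directly from Python rather than through the wrapper script. A minimal sketch based on the package's `Communicate` API (the import is deferred so the snippet parses without the package installed):

```python
import asyncio

async def synthesize(text: str, path: str,
                     voice: str = "zh-CN-XiaoxiaoNeural") -> None:
    """Synthesize `text` to an audio file. Requires network access."""
    import edge_tts  # deferred: only needed when actually synthesizing
    communicate = edge_tts.Communicate(text, voice, rate="+0%", volume="+0%")
    await communicate.save(path)

# To run: asyncio.run(synthesize("你好，世界！", "output.mp3"))
```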