# step-audio-editx

**Repository Path**: lavine/step-audio-editx

## Basic Information

- **Project Name**: step-audio-editx
- **Description**: Step-Audio-EditX is a powerful 3B-parameter LLM-based audio model that excels at expressive and iterative audio editing, covering emotion, speaking style, and paralinguistic information, and also provides strong zero-shot text-to-speech.
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: https://www.oschina.net/p/step-audio-editx
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2025-11-25
- **Last Updated**: 2025-11-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Step-Audio-EditX

## 🔥🔥🔥 News!!

* Nov 19, 2025: ⚙️ We release a **new version** of our model, which **supports polyphonic pronunciation control** and improves the performance of emotion, speaking style, and paralinguistic editing.
* Nov 12, 2025: 📦 We release the **optimized inference code** and **model weights** of **Step-Audio-EditX** ([HuggingFace](https://huggingface.co/stepfun-ai/Step-Audio-EditX); [ModelScope](https://modelscope.cn/models/stepfun-ai/Step-Audio-EditX)) and **Step-Audio-Tokenizer** ([HuggingFace](https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer); [ModelScope](https://modelscope.cn/models/stepfun-ai/Step-Audio-Tokenizer)).
* Nov 07, 2025: ✨ [Demo Page](https://stepaudiollm.github.io/step-audio-editx/); 🎮 [HF Space Playground](https://huggingface.co/spaces/stepfun-ai/Step-Audio-EditX)
* Nov 06, 2025: 👋 We release the technical report of [Step-Audio-EditX](https://arxiv.org/abs/2511.03601).

## Introduction

We are open-sourcing Step-Audio-EditX, a powerful **3B-parameter** LLM-based **Reinforcement Learning** audio model specialized in expressive and iterative audio editing. It excels at editing emotion, speaking style, and paralinguistics, and also features robust zero-shot text-to-speech (TTS) capabilities.

## 📑 Open-source Plan

- [x] Inference Code
- [x] Online demo (Gradio)
- [ ] Step-Audio-Edit-Benchmark
- [x] Model Checkpoints
  - [x] Step-Audio-Tokenizer
  - [x] Step-Audio-EditX
  - [ ] Step-Audio-EditX-Int4
- [ ] Training Code
  - [ ] SFT training
  - [ ] PPO training
- [ ] ⏳ Feature Support Plan
  - [ ] Editing
    - [x] Polyphone pronunciation control
    - [ ] More paralinguistic tags ([Cough, Crying, Stress, etc.])
    - [ ] Filler word removal
  - [ ] Other Languages
    - [ ] Japanese, Korean, Arabic, French, Russian, Spanish, etc.

## Features

- **Zero-Shot TTS**
  - Excellent zero-shot TTS cloning for Mandarin, English, Sichuanese, and Cantonese.
  - To use a dialect, just add a **[Sichuanese]** or **[Cantonese]** tag before your text.
  - 🔥 Polyphone pronunciation control: simply replace the polyphonic characters with pinyin (see the usage sketch after the tag table below).
    - **[我也想过过过儿过过的生活]** -> **[我也想guo4guo4guo1儿guo4guo4的生活]**
- **Emotion and Speaking Style Editing**
  - Remarkably effective iterative control over emotions and styles, supporting **dozens** of editing options.
    - Emotion Editing: [ *Angry*, *Happy*, *Sad*, *Excited*, *Fearful*, *Surprised*, *Disgusted*, etc. ]
    - Speaking Style Editing: [ *Act_coy*, *Older*, *Child*, *Whisper*, *Serious*, *Generous*, *Exaggerated*, etc. ]
  - Editing with more emotions and more speaking styles is on the way. **Get Ready!** 🚀
- **Paralinguistic Editing**
  - Precise control over 10 types of paralinguistic features for more natural, human-like, and expressive synthetic audio.
  - Supported Tags:
    - [ *Breathing*, *Laughter*, *Surprise-oh*, *Confirmation-en*, *Uhm*, *Surprise-ah*, *Surprise-wa*, *Sigh*, *Question-ei*, *Dissatisfaction-hnn* ]
- **Available Tags**
| Category | Tag | Description | Tag | Description |
|----------|-----|-------------|-----|-------------|
| emotion | happy | Expressing happiness | angry | Expressing anger |
| | sad | Expressing sadness | fear | Expressing fear |
| | surprised | Expressing surprise | confusion | Expressing confusion |
| | empathy | Expressing empathy and understanding | embarrass | Expressing embarrassment |
| | excited | Expressing excitement and enthusiasm | depressed | Expressing a depressed or discouraged mood |
| | admiration | Expressing admiration or respect | coldness | Expressing coldness and indifference |
| | disgusted | Expressing disgust or aversion | humour | Expressing humor or playfulness |
| speaking style | serious | Speaking in a serious or solemn manner | arrogant | Speaking in an arrogant manner |
| | child | Speaking in a childlike manner | older | Speaking in an elderly-sounding manner |
| | girl | Speaking in a light, youthful feminine manner | pure | Speaking in a pure, innocent manner |
| | sister | Speaking in a mature, confident feminine manner | sweet | Speaking in a sweet, lovely manner |
| | exaggerated | Speaking in an exaggerated, dramatic manner | ethereal | Speaking in a soft, airy, dreamy manner |
| | whisper | Speaking in a whispering, very soft manner | generous | Speaking in a hearty, outgoing, and straight-talking manner |
| | recite | Speaking in a clear, well-paced, poetry-reading manner | act_coy | Speaking in a sweet, playful, and endearing manner |
| | warm | Speaking in a warm, friendly manner | shy | Speaking in a shy, timid manner |
| | comfort | Speaking in a comforting, reassuring manner | authority | Speaking in an authoritative, commanding manner |
| | chat | Speaking in a casual, conversational manner | radio | Speaking in a radio-broadcast manner |
| | soulful | Speaking in a heartfelt, deeply emotional manner | gentle | Speaking in a gentle, soft manner |
| | story | Speaking in a narrative, audiobook-style manner | vivid | Speaking in a lively, expressive manner |
| | program | Speaking in a show-host/presenter manner | news | Speaking in a news broadcasting manner |
| | advertising | Speaking in a polished, high-end commercial voiceover manner | roar | Speaking in a loud, deep, roaring manner |
| | murmur | Speaking in a quiet, low manner | shout | Speaking in a loud, sharp, shouting manner |
| | deeply | Speaking in a deep and low-pitched tone | loudly | Speaking in a loud and high-pitched tone |
| paralinguistic | Breathing | Breathing sound | Laughter | Laughter or laughing sound |
| | Uhm | Hesitation sound: "Uhm" | Sigh | Sighing sound |
| | Surprise-oh | Expressing surprise: "Oh" | Surprise-ah | Expressing surprise: "Ah" |
| | Surprise-wa | Expressing surprise: "Wa" | Confirmation-en | Confirming: "En" |
| | Question-ei | Questioning: "Ei" | Dissatisfaction-hnn | Dissatisfied sound: "Hnn" |
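
As a concrete illustration of the dialect tag and pinyin-based polyphone control described above, here is a minimal zero-shot cloning sketch using the documented `tts_infer.py` flags. It reuses the `examples/fear_zh_female_prompt.wav` reference clip and transcript from the usage section below; the generated text (combining a dialect tag with pinyin substitution in one request) is an illustrative assumption, not an output shipped with the repository.

```bash
# Illustrative sketch: zero-shot cloning with a dialect tag and pinyin-based polyphone control.
# The generated text is a made-up example; the prompt audio/text are from the usage section.
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "我总觉得,有人在跟着我,我能听到奇怪的脚步声。" \
    --prompt-audio "examples/fear_zh_female_prompt.wav" \
    --generated-text "[Sichuanese]我也想guo4guo4guo1儿guo4guo4的生活。" \
    --edit-type "clone" \
    --output-dir ./output
```

The dialect tag simply prefixes the text to be generated, and each polyphonic character is replaced by its pinyin with a tone number, exactly as in the feature example above.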
## Feature Requests & Wishlist 💡

We welcome all ideas for new features! If you'd like to see a feature added to the project, please start a discussion in our [Discussions](https://github.com/stepfun-ai/Step-Audio-EditX/discussions) section. We'll collect community feedback there and incorporate popular suggestions into our future development plans. Thank you for your contribution!

## Demos
| Task | Text | Source | Edited |
|------|------|--------|--------|
| Emotion-Fear | 我总觉得,有人在跟着我,我能听到奇怪的脚步声。 | [fear_zh_female_prompt.webm](https://github.com/user-attachments/assets/a088c059-032c-423f-81d6-3816ba347ff5) | [fear_zh_female_output.webm](https://github.com/user-attachments/assets/917494ac-5913-4949-8022-46cf55ca05dd) |
| Style-Whisper | 比如在工作间隙,做一些简单的伸展运动,放松一下身体,这样,会让你更有精力。 | [whisper_prompt.webm](https://github.com/user-attachments/assets/ed9e22f1-1bac-417b-913a-5f1db31f35c9) | [whisper_output.webm](https://github.com/user-attachments/assets/e0501050-40db-4d45-b380-8bcc309f0b5f) |
| Style-Act_coy | 我今天想喝奶茶,可是不知道喝什么口味,你帮我选一下嘛,你选的都好喝~ | [act_coy_prompt.webm](https://github.com/user-attachments/assets/74d60625-5b3c-4f45-becb-0d3fe7cc4b3f) | [act_coy_output.webm](https://github.com/user-attachments/assets/b2f74577-56c2-4997-afd6-6bf47d15ea51) |
| Paralinguistics | 你这次又忘记带钥匙了 [Dissatisfaction-hnn],真是拿你没办法。 | [paralingustic_prompt.webm](https://github.com/user-attachments/assets/21e831a3-8110-4c64-a157-60e0cf6735f0) | [paralingustic_output.webm](https://github.com/user-attachments/assets/a82f5a40-c6a3-409b-bbe6-271180b20d7b) |
| Denoising | Such legislation was clarified and extended from time to time thereafter. No, the man was not drunk, he wondered how we got tied up with this stranger. Suddenly, my reflexes had gone. It's healthier to cook without sugar. | [denoising_prompt.webm](https://github.com/user-attachments/assets/70464bf4-ebde-44a3-b2a6-8c292333319b) | [denoising_output.webm](https://github.com/user-attachments/assets/7cd0ae8d-1bf0-40fc-9bcd-f419bd4b2d21) |
| Speed-Faster | 上次你说鞋子有点磨脚,我给你买了一双软软的鞋垫。 | [speed_faster_prompt.webm](https://github.com/user-attachments/assets/db46609e-1b98-48d8-99c8-e166cfdfc6e3) | [speed_faster_output.webm](https://github.com/user-attachments/assets/0fbc14ca-dd4a-4362-aadc-afe0629f4c9f) |
For more examples, see the [demo page](https://stepaudiollm.github.io/step-audio-editx/).

## Model Download

| Models | 🤗 Hugging Face | ModelScope |
|-------|-------|-------|
| Step-Audio-EditX | [stepfun-ai/Step-Audio-EditX](https://huggingface.co/stepfun-ai/Step-Audio-EditX) | [stepfun-ai/Step-Audio-EditX](https://modelscope.cn/models/stepfun-ai/Step-Audio-EditX) |
| Step-Audio-Tokenizer | [stepfun-ai/Step-Audio-Tokenizer](https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer) | [stepfun-ai/Step-Audio-Tokenizer](https://modelscope.cn/models/stepfun-ai/Step-Audio-Tokenizer) |

## Model Usage

### 📜 Requirements

The following table shows the requirements for running the Step-Audio-EditX model (batch size = 1):

| Model | Parameters | Setting (sample frequency) | Optimal GPU Memory |
|------------|------------|--------------------------------|----------------|
| Step-Audio-EditX | 3B | 41.6 Hz | 12 GB |

* An NVIDIA GPU with CUDA support is required.
* The model was tested on a single L40S GPU.
* 12 GB is a lower bound; 16 GB of GPU memory is safer.
* Tested operating system: Linux

### 🔧 Dependencies and Installation

- Python >= 3.10.0 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
- [PyTorch >= 2.4.1-cu121](https://pytorch.org/)
- [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads)

```bash
git clone https://github.com/stepfun-ai/Step-Audio-EditX.git
conda create -n stepaudioedit python=3.10
conda activate stepaudioedit
cd Step-Audio-EditX
pip install -r requirements.txt

git lfs install
git clone https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer
git clone https://huggingface.co/stepfun-ai/Step-Audio-EditX
```

After downloading the models, `where_you_download_dir` should have the following structure:

```
where_you_download_dir
├── Step-Audio-Tokenizer
├── Step-Audio-EditX
```

#### Run with Docker

You can set up the environment required for running Step-Audio-EditX using the provided Dockerfile.

```bash
# build docker
docker build . -t step-audio-editx

# run docker
docker run --rm --gpus all \
    -v /your/code/path:/app \
    -v /your/model/path:/model \
    -p 7860:7860 \
    step-audio-editx
```

#### Local Inference Demo

> [!TIP]
> For optimal performance, keep audio under 30 seconds per inference.

```bash
# zero-shot cloning
# The path of the generated audio file is output/fear_zh_female_prompt_cloned.wav
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "我总觉得,有人在跟着我,我能听到奇怪的脚步声。" \
    --prompt-audio "examples/fear_zh_female_prompt.wav" \
    --generated-text "可惜没有如果,已经发生的事情终究是发生了。" \
    --edit-type "clone" \
    --output-dir ./output

python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "His political stance was conservative, and he was particularly close to margaret thatcher." \
    --prompt-audio "examples/zero_shot_en_prompt.wav" \
    --generated-text "Underneath the courtyard is a large underground exhibition room which connects the two buildings. " \
    --edit-type "clone" \
    --output-dir ./output

# edit
# One wav file is written per edit iteration, for example:
# output/fear_zh_female_prompt_edited_iter1.wav, output/fear_zh_female_prompt_edited_iter2.wav, ...

# emotion; fear
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "我总觉得,有人在跟着我,我能听到奇怪的脚步声。" \
    --prompt-audio "examples/fear_zh_female_prompt.wav" \
    --edit-type "emotion" \
    --edit-info "fear" \
    --n-edit-iter 2 \
    --output-dir ./output

# emotion; happy
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "You know, I just finished that big project and feel so relieved. Everything seems easier and more colorful, what a wonderful feeling!" \
    --prompt-audio "examples/en_happy_prompt.wav" \
    --edit-type "emotion" \
    --edit-info "happy" \
    --n-edit-iter 2 \
    --output-dir ./output

# style; whisper
# For the whisper style, set the number of edit iterations greater than 1 to get better results.
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "比如在工作间隙,做一些简单的伸展运动,放松一下身体,这样,会让你更有精力." \
    --prompt-audio "examples/whisper_prompt.wav" \
    --edit-type "style" \
    --edit-info "whisper" \
    --n-edit-iter 2 \
    --output-dir ./output

# paralinguistic
# Supported tags: Breathing, Laughter, Surprise-oh, Confirmation-en, Uhm, Surprise-ah, Surprise-wa, Sigh, Question-ei, Dissatisfaction-hnn
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "我觉得这个计划大概是可行的,不过还需要再仔细考虑一下。" \
    --prompt-audio "examples/paralingustic_prompt.wav" \
    --generated-text "我觉得这个计划大概是可行的,[Uhm]不过还需要再仔细考虑一下。" \
    --edit-type "paralinguistic" \
    --output-dir ./output

# denoise
# Prompt text is not needed.
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-audio "examples/denoise_prompt.wav" \
    --edit-type "denoise" \
    --output-dir ./output

# vad
# Prompt text is not needed.
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-audio "examples/vad_prompt.wav" \
    --edit-type "vad" \
    --output-dir ./output

# speed
# Supported edit-info values: faster, slower, more faster, more slower
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "上次你说鞋子有点磨脚,我给你买了一双软软的鞋垫。" \
    --prompt-audio "examples/speed_prompt.wav" \
    --edit-type "speed" \
    --edit-info "faster" \
    --output-dir ./output
```

#### Launch Web Demo

Start a local server for online inference. This assumes you have one GPU with at least 12 GB of memory available and have already downloaded all the models.

```bash
# Step-Audio-EditX demo
python app.py --model-path where_you_download_dir --model-source local

# Memory-efficient options with runtime quantization
# For systems with limited GPU memory, you can use quantization to reduce memory usage:

# INT8 quantization
python app.py --model-path where_you_download_dir --model-source local --quantization int8

# INT4 quantization
python app.py --model-path where_you_download_dir --model-source local --quantization int4

# Using pre-quantized AWQ models
python app.py --model-path path/to/quantized/model --model-source local --quantization awq-4bit

# Example with custom settings:
python app.py --model-path where_you_download_dir --model-source local --torch-dtype float16 --enable-auto-transcribe
```

### 🔄 Model Quantization (Optional)

If you have limited GPU memory, you can create quantized versions of the model to reduce memory requirements:

```bash
# Create an AWQ 4-bit quantized model
python quantization/awq_quantize.py --model_path path/to/Step-Audio-EditX

# Advanced quantization options
python quantization/awq_quantize.py
```

For detailed quantization options and parameters, see [quantization/README.md](quantization/README.md).

## Technical Details

Step-Audio-EditX comprises three primary components:

- A dual-codebook audio tokenizer, which converts reference or input audio into discrete tokens.
- An audio LLM that generates dual-codebook token sequences.
- An audio decoder, which converts the dual-codebook token sequences predicted by the audio LLM back into audio waveforms using a flow-matching approach.

Audio-Edit enables iterative control over emotion and speaking style across all voices, leveraging large-margin data during SFT and PPO training.

## Evaluation

### Comparison between Step-Audio-EditX and Closed-Source Models

- Step-Audio-EditX demonstrates superior performance over Minimax and Doubao in both zero-shot cloning and emotion control.
- Emotion editing with Step-Audio-EditX significantly improves the emotion-controlled audio outputs of all three models after just one iteration; with further iterations, overall performance continues to improve (the sketch below illustrates driving such iterative editing from the command line).
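
As a rough sketch of this iterative editing workflow, the command below applies several emotion-editing iterations to a clip synthesized by an external TTS system, using only the documented `tts_infer.py` flags. The input path `closed_source_tts_output.wav` and its transcript are hypothetical placeholders, not files from this repository, and the exact protocol used for the reported numbers is described in the technical report.

```bash
# Hypothetical sketch: iteratively edit the emotion of audio synthesized elsewhere.
# closed_source_tts_output.wav and its transcript are placeholders.
python3 tts_infer.py \
    --model-path where_you_download_dir \
    --prompt-text "The transcript of the external TTS clip goes here." \
    --prompt-audio "closed_source_tts_output.wav" \
    --edit-type "emotion" \
    --edit-info "happy" \
    --n-edit-iter 3 \
    --output-dir ./output
# Each iteration is written out separately (*_edited_iter1.wav, *_edited_iter2.wav, *_edited_iter3.wav),
# so every iteration can be listened to and evaluated on its own.
```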
### Generalization on Closed-Source Models

- For emotion and speaking style editing, the built-in voices of leading closed-source systems possess considerable in-context capabilities, allowing them to partially convey the emotions in the text. After a single editing round with Step-Audio-EditX, the emotion and style accuracy of all voice models improves significantly, with further gains over the next two iterations, robustly demonstrating our model's strong generalization.
- For paralinguistic editing, after editing with Step-Audio-EditX, paralinguistic reproduction is comparable to what the built-in voices of closed-source models achieve when synthesizing native paralinguistic content directly. (**sub** denotes replacing paralinguistic tags with native words.)
**Table: Generalization of Emotion, Speaking Style, and Paralinguistic Editing on Closed-Source Models.**

| Language | Model | Emotion ↑ Iter0 | Emotion ↑ Iter1 | Emotion ↑ Iter2 | Emotion ↑ Iter3 | Speaking Style ↑ Iter0 | Speaking Style ↑ Iter1 | Speaking Style ↑ Iter2 | Speaking Style ↑ Iter3 | Paralinguistic ↑ Iter0 | Paralinguistic ↑ sub | Paralinguistic ↑ Iter1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chinese | MiniMax-2.6-hd | 71.6 | 78.6 | 81.2 | 83.4 | 36.7 | 58.8 | 63.1 | 67.3 | 1.73 | 2.80 | 2.90 |
| Chinese | Doubao-Seed-TTS-2.0 | 67.4 | 77.8 | 80.6 | 82.8 | 38.2 | 60.2 | 65.0 | 64.9 | 1.67 | 2.81 | 2.90 |
| Chinese | GPT-4o-mini-TTS | 62.6 | 76.0 | 77.0 | 81.8 | 45.9 | 64.0 | 65.7 | 69.7 | 1.71 | 2.88 | 2.93 |
| Chinese | ElevenLabs-v2 | 60.4 | 74.6 | 77.4 | 79.2 | 43.8 | 63.3 | 69.7 | 70.8 | 1.70 | 2.71 | 2.92 |
| English | MiniMax-2.6-hd | 55.0 | 64.0 | 64.2 | 66.4 | 51.9 | 60.3 | 62.3 | 64.3 | 1.72 | 2.87 | 2.88 |
| English | Doubao-Seed-TTS-2.0 | 53.8 | 65.8 | 65.8 | 66.2 | 47.0 | 62.0 | 62.7 | 62.3 | 1.72 | 2.75 | 2.92 |
| English | GPT-4o-mini-TTS | 56.8 | 61.4 | 64.8 | 65.2 | 52.3 | 62.3 | 62.4 | 63.4 | 1.90 | 2.90 | 2.88 |
| English | ElevenLabs-v2 | 51.0 | 61.2 | 64.0 | 65.2 | 51.0 | 62.1 | 62.6 | 64.0 | 1.93 | 2.87 | 2.88 |
| Average | MiniMax-2.6-hd | 63.3 | 71.3 | 72.7 | 74.9 | 44.2 | 59.6 | 62.7 | 65.8 | 1.73 | 2.84 | 2.89 |
| Average | Doubao-Seed-TTS-2.0 | 60.6 | 71.8 | 73.2 | 74.5 | 42.6 | 61.1 | 63.9 | 63.6 | 1.70 | 2.78 | 2.91 |
| Average | GPT-4o-mini-TTS | 59.7 | 68.7 | 70.9 | 73.5 | 49.1 | 63.2 | 64.1 | 66.6 | 1.81 | 2.89 | 2.90 |
| Average | ElevenLabs-v2 | 55.7 | 67.9 | 70.7 | 72.2 | 47.4 | 62.7 | 66.1 | 67.4 | 1.82 | 2.79 | 2.90 |
## Acknowledgements

Part of the code and data for this project comes from:

* [CosyVoice](https://github.com/FunAudioLLM/CosyVoice)
* [transformers](https://github.com/huggingface/transformers)
* [FunASR](https://github.com/modelscope/FunASR)
* [NVSpeech](https://huggingface.co/datasets/amphion/Emilia-NV)

Thank you to all the open-source projects for their contributions to this project!

## License Agreement

+ The code in this open-source repository is licensed under the [Apache 2.0](LICENSE) License.

## Citation

```
@misc{yan2025stepaudioeditxtechnicalreport,
      title={Step-Audio-EditX Technical Report},
      author={Chao Yan and Boyong Wu and Peng Yang and Pengfei Tan and Guoqiang Hu and Yuxin Zhang and Xiangyu Zhang and Fei Tian and Xuerui Yang and Xiangyu Zhang and Daxin Jiang and Gang Yu},
      year={2025},
      eprint={2511.03601},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2511.03601},
}
```

## ⚠️ Usage Disclaimer

- Do not use this model for any unauthorized activities, including but not limited to:
  - Voice cloning without permission
  - Identity impersonation
  - Fraud
  - Deepfakes or any other illegal purposes
- Ensure compliance with local laws and regulations, and adhere to ethical guidelines when using this model.
- The model developers are not responsible for any misuse or abuse of this technology.

We advocate for responsible generative AI research and urge the community to uphold safety and ethical standards in AI development and application. If you have any concerns regarding the use of this model, please feel free to contact us.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=stepfun-ai/Step-Audio-EditX&type=Date)](https://star-history.com/#stepfun-ai/Step-Audio-EditX&Date)