[Feature request] #4062

kunge98 · 2024-11-27T08:44:30Z

My code is shown in the figure. I am using a Chinese speech synthesis model, and my test text is "你好" ("hello"). When I open the output audio, it is 4 seconds long, but the two characters "你好" obviously do not need that much time, and the synthesized speech is followed by sounds similar to howling. What do I need to do? Do I need to modify the config.json file?

kunge98 · 2024-11-27T08:52:13Z

Then I tried modifying the max_decoder_steps value in the config.json file, and the output audio did get shorter. But now I want to generate speech for texts in bulk, and the texts vary in length. How can I do that?
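One possible way to handle inputs of different lengths is to derive a separate decoder step limit for each text instead of using a single fixed value. The sketch below is only an illustration under assumptions: the helper names `estimate_max_decoder_steps` and `config_for_text` are hypothetical, `max_decoder_steps` is assumed to be a top-level key in config.json, and the `steps_per_char`, `min_steps`, and `max_steps` values are guesses that would need tuning by listening to the output. How the adjusted config is then passed to the synthesizer is not shown.

```python
import copy


def estimate_max_decoder_steps(text, steps_per_char=30, min_steps=50, max_steps=2000):
    """Guess a decoder step budget from the number of non-space characters.

    All three numeric defaults are assumptions for illustration, not values
    taken from the library; tune them against real output audio.
    """
    n_chars = sum(1 for ch in text if not ch.isspace())
    # Clamp so very short texts still get some room and long texts stay bounded.
    return max(min_steps, min(max_steps, n_chars * steps_per_char))


def config_for_text(base_config, text, **kwargs):
    """Return a copy of the config with a per-text max_decoder_steps.

    Assumes max_decoder_steps is a top-level key, as in the config.json
    edited by hand above. The original dict is left untouched.
    """
    cfg = copy.deepcopy(base_config)
    cfg["max_decoder_steps"] = estimate_max_decoder_steps(text, **kwargs)
    return cfg


# Stand-in for the parsed config.json; a real file has many more keys.
base_config = {"max_decoder_steps": 500}

for text in ["你好", "今天天气很好,我们一起去公园散步吧。"]:
    cfg = config_for_text(base_config, text)
    print(text, cfg["max_decoder_steps"])
```

This only caps how long the decoder may run. If the model keeps producing noise because it never predicts a stop, the cap hides the symptom rather than fixing the cause, so a limit that is too low may also cut off real speech.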

stale · 2024-12-28T11:57:50Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

kunge98 added the "feature request" label (feature requests for making TTS better) on Nov 27, 2024

stale bot added the "wontfix" label (This will not be worked on but feel free to help) on Dec 28, 2024
