Azure TTS low voice quality in AP

Problem: When using Azure cloud voices for TTS in AP, the neural voice sounds more mechanical than normal Azure samples. This happens both with and without SSML tags.

Is there a way to get a less mechanical, more natural-sounding voice through TTS?
Target audio quality should be the same as the sample available here: https://azure.microsoft.com/en-us/services/cognitive-services/text-to-speech/#features
With the following settings:

In AP, I tried to produce the audio with the following settings, but the result was not even close to the required quality:

Endpoint: https://westeurope.api.cognitive.microsoft.com/sts/v1.0/issueToken
Voice: en-US-JennyNeural

SSML:


<mstts:express-as style="customerservice">You can replace this text with any text you wish. You can either write in this text box or paste your own text here.
</mstts:express-as>
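
For reference, the fragment above normally sits inside `<speak>` and `<voice>` wrappers when it is sent to Azure. The Python sketch below builds such a document so it can be tested outside AP. The function name `build_ssml` is hypothetical, the namespace URIs follow the usual Azure SSML examples, and whether AP adds these wrappers itself is an assumption not confirmed in this thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural",
               style="customerservice", lang="en-US"):
    """Wrap plain text in speak/voice/express-as elements (hypothetical helper)."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<mstts:express-as style={quoteattr(style)}>'
        f'{escape(text)}'  # escape &, < and > so the XML stays well-formed
        '</mstts:express-as></voice></speak>'
    )


if __name__ == "__main__":
    ssml = build_ssml("You can replace this text with any text you wish.")
    ET.fromstring(ssml)  # raises ParseError if the document is malformed
    print(ssml)
```

Note that the `style` attribute must use straight quotes; curly quotes pasted from a word processor or forum would make the XML invalid.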

I would love to attach some audio samples to this message but I am currently unable to do so as a new user.

ActivePresenter version: 8.5.4

OS: Win 10

Notes:
A fix would save me and my colleagues a lot of time: we are required to use the "less mechanical voice" in our videos, and we currently have to generate it in an external app and insert the clips into our videos manually, one by one.

Hi,

Thank you for letting us know about the problem. We will release an updated version soon.

Regards,

Hello Nam,

Thanks a lot for the quick response and your exceptional work, guys!

Just to let other forum users know: AP staff contacted me by e-mail and sent a work-in-progress version of ActivePresenter in which the requested functionality is fixed and working, just a day after I submitted the forum post :wink:

Hello Vit,

Thank you for your kind words.

Please note that the official ActivePresenter 8.5.8, which includes the fix, has been released. Please check it out.

Regards,

Hi,

There is still a problem.

I’ve just downloaded 8.5.8 and tried to use TTS with Azure, and the output sounds like bad static. I then tried the older version, 8.5.7, and it worked perfectly.

I’m using W10 and the trial versions of AP.

Hi,
Your issue is slightly different from the one I described in this topic.
However, I may have run into something similar when previewing the audio with the Speak button (not sure if that is your case). Check whether the problem is also present in the final exported video (it should not be). Then check or change your audio output device and its drivers, because they were the cause in my case.
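
One way to tell a bad audio file from a playback or driver problem is to inspect the generated WAV file directly. The sketch below uses Python's standard `wave` module; the helper name `describe_wav` is hypothetical, and it assumes the generated audio has been saved as an uncompressed PCM WAV file.

```python
import wave


def describe_wav(path):
    """Return basic format information for a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            # Duration in seconds; guard against a zero sample rate.
            "duration_s": frames / rate if rate else 0.0,
        }
```

If the file reports a plausible sample rate and duration and plays cleanly in another player, the static is more likely coming from the preview path or the audio driver than from the synthesized audio itself.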

Hi mackavi,

Could you please send us the project file that has the problem so we can check?

Regards,

Thanks for both replies.

I hadn’t got past the ‘Speak’ preview button on the properties dialogue. I tried the ‘Generate’ button this morning, and the wave file added to the timeline played normally.

I was going to send a recording of the problematic audio, but it seems my W10 machine is one of those rare devices that does not include a sound mix driver for recording what you hear. In working around this, I’ve solved the problem.

My system has always used the Nvidia HD sound drivers from my graphics card without any issue. This morning, I replaced them with Realtek HD drivers, and previewing sound in 8.5.8 now works.

If the new drivers don’t cause any unforeseen issues, then I’ll stick with them.