Lower values produce more stable, consistent output; higher values produce more creative, varied output.
Higher values make the voice sound more similar to the original speaker.
Adds emotion and expressiveness to the generated voice.
Choose the voice for the text-to-speech conversion.
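
The four options above could be grouped into one settings object, as in the minimal sketch below. Everything named here is hypothetical and not taken from any particular library: the class `VoiceSettings`, the field names `variability`, `similarity`, and `expressiveness`, the default values, and the 0.0 to 1.0 range are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the four options described above."""

    voice: str                    # which voice to use for the conversion
    variability: float = 0.5      # lower = more stable, higher = more creative
    similarity: float = 0.75      # higher = closer to the original speaker
    expressiveness: float = 0.0   # emotion and expressiveness added to the voice

    def __post_init__(self) -> None:
        # The 0.0-1.0 range is an assumption; the descriptions do not state one.
        for name in ("variability", "similarity", "expressiveness"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
        if not self.voice:
            raise ValueError("voice must be a non-empty name or identifier")


# Example: a fairly stable voice that stays close to the original speaker.
settings = VoiceSettings(voice="narrator", variability=0.3, similarity=0.8)
```

Validating the values when the object is created means an out-of-range setting is reported before any speech is generated, rather than surfacing later as an unexpected-sounding result.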