- Added support for `emo_alpha` scaling of emotion vectors and emotion text inputs.
- This major new feature allows much more natural speech generation by lowering the influence of the emotion vector and emotion text control modes.
- It is particularly useful for the "emotion text description" control mode, where a strength of 0.6 or lower gives much more natural speech. Previously, that mode could not produce natural speech, because QwenEmotion scores the text's emotions from 0.0 to 1.0 and those scores were used directly as the emotion vector, so the text mode always ran at very high strengths. The user can now lower the emotion strength to get very natural results.
- Refactored `IndexTTS2.infer()` variable initialization logic to avoid repetition and ensure cleaner code paths.
- Refactored to a unified device listing function.
- Now checks every supported hardware acceleration device type and lists the devices for all of them, to give a deeper system analysis.
- Added Intel XPU support.
- Improved AMD ROCm support.
- Improved Apple MPS support.
- Several users have unfortunately disregarded the `uv` instructions and ended up with broken `conda` / `pip` installations. We require `uv` for a reason: it's the *only* way to guarantee an exact, well-tested installation environment.
- The warning is now clearly highlighted, with a deeper explanation about why `uv` is required, so that everyone can enjoy IndexTTS without hassle!
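
To illustrate the `emo_alpha` scaling mentioned in the changes above, here is a minimal sketch of the idea: each emotion score is multiplied by a strength factor before it is used. The helper `scale_emotion_vector`, the four-element example vector, and the assumed `0.0` to `1.0` range for `emo_alpha` are hypothetical and only serve the illustration; they are not taken from the IndexTTS code.

```python
def scale_emotion_vector(scores, emo_alpha=1.0):
    """Scale every emotion score by emo_alpha.

    Hypothetical helper for illustration only; not the IndexTTS implementation.
    """
    # Assumption: emo_alpha is a strength factor between 0.0 and 1.0.
    if not 0.0 <= emo_alpha <= 1.0:
        raise ValueError("emo_alpha is assumed to lie in [0.0, 1.0]")
    # Round to keep the printed values readable.
    return [round(score * emo_alpha, 4) for score in scores]


# Example scores in the 0.0-1.0 range, as an emotion text analysis might assign.
raw_scores = [0.9, 0.0, 0.1, 0.0]

# A strength of 0.6 damps every emotion, so the dominant one no longer saturates.
scaled = scale_emotion_vector(raw_scores, emo_alpha=0.6)
print(scaled)
```

With `emo_alpha=1.0` the scores pass through unchanged, which corresponds to the earlier behavior where the text-derived scores were used directly.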