index-tts2-ForDgxSpark

Author	SHA1	Message	Date
Arcitec	e185fa1ce7	fix(webui): Make the Advanced Settings visible by default again - The Advanced Settings contains some very advanced features which users shouldn't tweak, but it also contains important insight into segmentation generations, and the "max tokens per generation segment" feature which users must tweak if they have low VRAM. - Therefore it's very important that users notice the "Advanced Settings" section so that they can read the VRAM help text and reduce the segment length if they have VRAM issues. So let's make the advanced category visible by default again until a better solution is determined.	2025-09-17 19:56:07 +02:00
Arcitec	c266910cc6	refactor(webui): Remove repeated code in Examples loader	2025-09-17 19:56:07 +02:00
Arcitec	8aa8064a53	feat: Add reusable Emotion Vector normalization helper - The WebUI was secretly squashing all emotion vectors and re-scaling them. It's a good idea for user friendliness, but it makes it harder to learn what values will work in Python when using the WebUI for testing. - Instead, let's move the normalization code into IndexTTS2 as a helper function which is used by Gradio and can be used from other people's code too. - The emotion bias (which reduces the influence of certain emotions) has also been converted into an optional feature, which can be turned off if such biasing isn't wanted. And all biasing values have been re-scaled to use 1.0 as the reference, to avoid scaling relative to 0.8 (which previously meant that it applied double scaling).	2025-09-17 19:56:07 +02:00
Arcitec	1520d0689b	fix(webui): New default emo_alpha recommendation instead of scaling - Silently scaling the value internally is confusing for users. They may be tuning their settings via the Web UI before putting the same values into their Python code, and would then get a different result since the Web UI "lies" about the slider values. - Instead, let's remove the silent scaling, and just change the default weight to a better recommendation.	2025-09-17 19:56:07 +02:00
Arcitec	ef097101b7	fix(webui): Add support for Gradio 5.45.0 and higher - We were using ".select" to detect when tabs are changed, but Gradio has modified behavior in 5.45.0 to only trigger from user clicks. They now require that we use ".change" to detect tab changes from code. This fix makes the Examples work when loading on new Gradio versions.	2025-09-17 19:56:07 +02:00
index-tts	cb5c98011f	Merge pull request #378 from index-tts/tts2dev update Contributors	2025-09-17 11:39:05 +08:00
shujingchen	d50340aa5b	update Contributors	2025-09-17 11:37:20 +08:00
index-tts	12ee39996f	Merge pull request #375 from index-tts/tts2dev update Contributors	2025-09-16 20:22:52 +08:00
shujingchen	a37d808923	update Contributors	2025-09-16 20:20:50 +08:00
index-tts	02c1e5a234	Merge pull request #374 from index-tts/tts2dev Update contributors	2025-09-16 19:45:47 +08:00
shujingchen	901a5a4111	update Contributors	2025-09-16 19:43:32 +08:00
shujingchen	1361244010	update Contributors	2025-09-16 19:38:33 +08:00
shujingchen	c2482142d6	Merge remote-tracking branch 'origin/main' into tts2dev	2025-09-16 19:28:59 +08:00
shujingchen	3e416dc598	update Contributors	2025-09-16 19:28:09 +08:00
index-tts	70aa801b25	Merge pull request #372 from index-tts/tts2dev update readme	2025-09-16 15:55:13 +08:00
shujingchen	58f8a9d2b1	Merge remote-tracking branch 'origin/main' into tts2dev	2025-09-16 15:53:38 +08:00
shujingchen	e3595faec1	add Contributors in Bilibili	2025-09-16 15:51:46 +08:00
shujingchen	ef86774658	update Official Statement	2025-09-16 14:21:02 +08:00
shujingchen	de949be82a	update Official Statement	2025-09-16 14:18:49 +08:00
index-tts	45d8d13f0b	Merge pull request #368 from index-tts/tts2dev Include usage notes for Pinyin	2025-09-16 13:22:22 +08:00
shujingchen	961dcc23f4	add pinyin.vocab	2025-09-16 13:18:55 +08:00
shujingchen	be4af061f1	update	2025-09-16 13:13:21 +08:00
shujingchen	10c1fcd3ad	add tips: pinyin usage	2025-09-16 13:10:40 +08:00
shujingchen	7b4f0880d9	update modelscope demo page link	2025-09-16 11:31:15 +08:00
shujingchen	aad61c2afc	Merge remote-tracking branch 'origin/main' into tts2dev	2025-09-16 11:25:54 +08:00
nanaoto	a058502865	Add Docker publish workflow configuration	2025-09-15 17:47:08 +08:00
nanaoto	ee23371296	Merge pull request #338 from yrom/fix/preload-bigvgan-cuda Correct the import path of BigVGAN's custom cuda kernel	2025-09-15 16:27:40 +08:00
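nanaoto	009428b62d	Merge pull request #347 from index-tts/cut_audio feat: Trim overly long input audio to 15s to reduce out-of-memory errors in RAM and VRAM	2025-09-12 16:48:14 +08:00
nanaoto	0828dcb098	feat: Trim overly long input audio to 15s to reduce out-of-memory errors in RAM and VRAM	2025-09-12 16:45:37 +08:00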
shujingchen	6118d0ecf9	update modelscope demo page link	2025-09-12 16:20:37 +08:00
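nanaoto	48a71aff6d	Merge pull request #345 from index-tts/webui_update feat: Normalize parameters to the recommended range to improve user experience	2025-09-12 14:23:24 +08:00
nanaoto	af2b06e061	feat: Normalize parameters to the recommended range to improve user experience	2025-09-12 14:20:04 +08:00
LGZwr	2cfc76ad9c	fix: Fix the error raised when the sample audio is too long by trimming the audio.	2025-09-12 14:08:46 +08:00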
Arcitec	d777b8a029	docs: Add FP16 usage advice for faster inference	2025-09-12 14:06:30 +08:00
Yrom	e409c4a19b	fix(infer_v2): Correct the import path of BigVGAN's custom cuda kernel	2025-09-11 16:55:18 +08:00
nanaoto	8336824c71	Merge pull request #325 from Arcitec/indextts2-arc IndexTTS2 New Features & Maintenance Patches	2025-09-11 12:55:38 +08:00
Arcitec	85ba55a1d3	docs: Document the DeepSpeed performance effects	2025-09-11 06:37:03 +02:00
Arcitec	f041d8eb64	fix(webui): Fix unintentional empty spacing between control groups	2025-09-11 06:08:08 +02:00
Arcitec	3b5b6bca85	docs: Document the new `emo_alpha` feature for text-to-emotion mode	2025-09-11 05:42:39 +02:00
Arcitec	d899770313	feat(webui): Implement emotion weighting for vectors and text modes - This is a major new feature, which now allows for much more natural speech generation by lowering the influence of the emotion vector/text control modes. - It is particularly useful for the "emotion text description" control mode, where a strength of 0.6 or lower is useful to get much more natural speech.	2025-09-11 04:25:26 +02:00
Arcitec	9668064377	feat: Implement `emo_alpha` scaling of emotion vectors and emotion text - Added support for `emo_alpha` scaling of emotion vectors and emotion text inputs. - This is a major new feature, which now allows for much more natural speech generation by lowering the influence of the emotion vector/text control modes. - It is particularly useful for the "emotion text description" control mode, where a strength of 0.6 or lower is useful to get much more natural speech. Before this feature, it was not possible to make natural speech with that mode, because QwenEmotion assigns emotion scores to the text from 0.0-1.0, and that score was used directly as an emotion vector. This meant that the text mode always used very high strengths. Now, the user can adjust the strength of the emotions to get very natural results. - Refactored `IndexTTS2.infer()` variable initialization logic to avoid repetition and ensure cleaner code paths.	2025-09-11 04:24:47 +02:00
Arcitec	555e146fb4	feat(webui): Implement speech synthesis progress bar	2025-09-11 04:17:02 +02:00
Arcitec	55095de317	chore: Lock Gradio version due to bug in 5.45.0 Their new 5.45.0 release today breaks the ability to load examples. We have to lock the last working version of Gradio.	2025-09-11 04:16:46 +02:00
Arcitec	39a035d106	feat: Extend GPU Check utility to support more GPUs - Refactored to a unified device listing function. - Now checks every supported hardware acceleration device type and lists the devices for all of them, to give a deeper system analysis. - Added Intel XPU support. - Improved AMD ROCm support. - Improved Apple MPS support.	2025-09-11 04:16:27 +02:00
Arcitec	6113567e94	fix(cli): More robust device priority checks	2025-09-11 04:16:27 +02:00
Arcitec	c3d7ab4adc	docs: Add usage note regarding random sampling	2025-09-11 04:15:58 +02:00
Arcitec	30848efd45	docs: Add Alibaba's high-bandwidth PyPI mirror for China	2025-09-11 04:15:58 +02:00
Arcitec	752df30549	chore: Move docs to new directory	2025-09-11 04:15:58 +02:00
Arcitec	f0badb13af	feat(webui)!: Easier DeepSpeed launch argument	2025-09-11 04:15:58 +02:00
nanaoto	97d06383da	Merge pull request #327 from index-tts/doc_zh Chinese documentation	2025-09-11 00:36:33 +08:00
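
Note on commit 9668064377: the message says that text-derived emotion scores in the range 0.0-1.0 were previously used directly as the emotion vector, and that `emo_alpha` now lowers their influence. A minimal sketch of that idea follows, assuming the scaling is a plain element-wise multiplication. The function name `scale_emotion_scores` and its signature are hypothetical and are not taken from the repository; the actual `IndexTTS2.infer()` logic may differ.

```python
from typing import List


def scale_emotion_scores(scores: List[float], emo_alpha: float = 0.6) -> List[float]:
    """Hypothetical helper (not the repository's API).

    Scales each emotion score by emo_alpha so that a value below 1.0
    weakens the emotion influence, as the commit message describes.
    Assumption: the scaling is a simple element-wise multiplication.
    """
    if not 0.0 <= emo_alpha <= 1.0:
        raise ValueError("emo_alpha is expected to be between 0.0 and 1.0")
    return [score * emo_alpha for score in scores]


# Example: scores used directly (alpha 1.0) versus halved (alpha 0.5).
full_strength = scale_emotion_scores([1.0, 0.5, 0.0], emo_alpha=1.0)
half_strength = scale_emotion_scores([1.0, 0.5, 0.0], emo_alpha=0.5)
```

With `emo_alpha=1.0` the scores pass through unchanged, which corresponds to the behavior the commit message describes as too strong for the text description mode.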

194 Commits