diff --git a/audios/speech_editing/0.wav b/audios/speech_editing/0.wav
new file mode 100644
index 0000000..08f91c4
Binary files /dev/null and b/audios/speech_editing/0.wav differ
diff --git a/audios/speech_editing/0_0.wav b/audios/speech_editing/0_0.wav
new file mode 100644
index 0000000..108825c
Binary files /dev/null and b/audios/speech_editing/0_0.wav differ
diff --git a/audios/speech_editing/0_0_voicecraft.wav b/audios/speech_editing/0_0_voicecraft.wav
new file mode 100644
index 0000000..bd7495f
Binary files /dev/null and b/audios/speech_editing/0_0_voicecraft.wav differ
diff --git a/audios/speech_editing/0_ditto_all.wav b/audios/speech_editing/0_ditto_all.wav
new file mode 100644
index 0000000..53a23f1
Binary files /dev/null and b/audios/speech_editing/0_ditto_all.wav differ
diff --git a/audios/speech_editing/1.wav b/audios/speech_editing/1.wav
new file mode 100644
index 0000000..11f99a9
Binary files /dev/null and b/audios/speech_editing/1.wav differ
diff --git a/audios/speech_editing/1_0.wav b/audios/speech_editing/1_0.wav
new file mode 100644
index 0000000..ab10088
Binary files /dev/null and b/audios/speech_editing/1_0.wav differ
diff --git a/audios/speech_editing/1_0_voicecraft.wav b/audios/speech_editing/1_0_voicecraft.wav
new file mode 100644
index 0000000..54999c7
Binary files /dev/null and b/audios/speech_editing/1_0_voicecraft.wav differ
diff --git a/audios/speech_editing/1_ditto_all.wav b/audios/speech_editing/1_ditto_all.wav
new file mode 100644
index 0000000..834bb64
Binary files /dev/null and b/audios/speech_editing/1_ditto_all.wav differ
diff --git a/audios/speech_editing/2.wav b/audios/speech_editing/2.wav
new file mode 100644
index 0000000..07d53ec
Binary files /dev/null and b/audios/speech_editing/2.wav differ
diff --git a/audios/speech_editing/2_0.wav b/audios/speech_editing/2_0.wav
new file mode 100644
index 0000000..f4323ef
Binary files /dev/null and b/audios/speech_editing/2_0.wav differ
diff --git a/audios/speech_editing/2_0_voicecraft.wav b/audios/speech_editing/2_0_voicecraft.wav
new file mode 100644
index 0000000..e1066a9
Binary files /dev/null and b/audios/speech_editing/2_0_voicecraft.wav differ
diff --git a/audios/speech_editing/2_ditto_all.wav b/audios/speech_editing/2_ditto_all.wav
new file mode 100644
index 0000000..f75974a
Binary files /dev/null and b/audios/speech_editing/2_ditto_all.wav differ
diff --git a/audios/speech_editing/3.wav b/audios/speech_editing/3.wav
new file mode 100644
index 0000000..72046cb
Binary files /dev/null and b/audios/speech_editing/3.wav differ
diff --git a/audios/speech_editing/3_0.wav b/audios/speech_editing/3_0.wav
new file mode 100644
index 0000000..b171721
Binary files /dev/null and b/audios/speech_editing/3_0.wav differ
diff --git a/audios/speech_editing/3_0_voicecraft.wav b/audios/speech_editing/3_0_voicecraft.wav
new file mode 100644
index 0000000..8ddde7e
Binary files /dev/null and b/audios/speech_editing/3_0_voicecraft.wav differ
diff --git a/audios/speech_editing/3_ditto_all.wav b/audios/speech_editing/3_ditto_all.wav
new file mode 100644
index 0000000..93010b9
Binary files /dev/null and b/audios/speech_editing/3_ditto_all.wav differ
diff --git a/audios/speech_editing/4.wav b/audios/speech_editing/4.wav
new file mode 100644
index 0000000..1a36038
Binary files /dev/null and b/audios/speech_editing/4.wav differ
diff --git a/audios/speech_editing/4_0.wav b/audios/speech_editing/4_0.wav
new file mode 100644
index 0000000..7e5e119
Binary files /dev/null and b/audios/speech_editing/4_0.wav differ
diff --git a/audios/speech_editing/4_0_voicecraft.wav b/audios/speech_editing/4_0_voicecraft.wav
new file mode 100644
index 0000000..3f34640
Binary files /dev/null and b/audios/speech_editing/4_0_voicecraft.wav differ
diff --git a/audios/speech_editing/4_ditto_all.wav b/audios/speech_editing/4_ditto_all.wav
new file mode 100644
index 0000000..5e62207
Binary files /dev/null and b/audios/speech_editing/4_ditto_all.wav differ
diff --git a/audios/speech_editing/5.wav b/audios/speech_editing/5.wav
new file mode 100644
index 0000000..234525f
Binary files /dev/null and b/audios/speech_editing/5.wav differ
diff --git a/audios/speech_editing/5_0.wav b/audios/speech_editing/5_0.wav
new file mode 100644
index 0000000..43f700c
Binary files /dev/null and b/audios/speech_editing/5_0.wav differ
diff --git a/audios/speech_editing/5_0_voicecraft.wav b/audios/speech_editing/5_0_voicecraft.wav
new file mode 100644
index 0000000..d6ecd41
Binary files /dev/null and b/audios/speech_editing/5_0_voicecraft.wav differ
diff --git a/audios/speech_editing/5_ditto_all.wav b/audios/speech_editing/5_ditto_all.wav
new file mode 100644
index 0000000..a11ae39
Binary files /dev/null and b/audios/speech_editing/5_ditto_all.wav differ
diff --git a/index.html b/index.html
index 5596971..2c1b7af 100644
--- a/index.html
+++ b/index.html
@@ -183,6 +183,17 @@

Speech Rate Controllability +

DiTTo-TTS can generate robust speech, as demonstrated by its low WER. Additionally, our model is capable of consistently producing a 'whispering' effect (please listen to the first sample). - Baseline samples are taken from the Mega-TTS demo [1] and CLaM-TTS demo [2]. +
+

+ Speech Editing +

+

+ DiTTo-TTS can correct mispronounced words by editing the content of generated speech, + without requiring the speaker to re-record the audio. The examples are taken from the + Voicebox demo [1] and the + VoiceCraft demo [2]. +

+
+
+
+ Original Text | Original Audio | Edited Text | Voicebox (16 kHz) | VoiceCraft (16 kHz) | DiTTo-TTS (22.05 kHz)
+ in zero weather in mid-winter when the earth is frozen to a great depth below the surface when in driving over the unpaved country roads they give forth a hard metallic road | [audio] | in zero weather in mid-winter when jack frost has cast his icy spell upon the land when in driving over the unpaved country roads they give forth a hard metallic road | [audio] | [audio] | [audio]
+ and especially as i am not very much up in latin myself he said the suit was on an insurance policy that he was defending on the ground of misinterpretations | [audio] | and especially as i am not very much up in latin myself he said the suit was on a classified treasure map that he was defending on the ground of misinterpretations | [audio] | [audio] | [audio]
+ yet these petty operations incessantly continued in time surmount the greatest difficulties mountains are elevated and oceans bounded by the slender force of human beings | [audio] | yet these petty operations incessantly continued in time surmount the greatest difficulties vast challenges emerge and unexplored frontiers beckon by the slender force of human beings | [audio] | [audio] | [audio]
+ will find himself completely at a loss on occasions of common and constant recurrence speculative ability is one thing and practical ability is another | [audio] | will find himself completely at a loss on rare and unpredictable circumstances speculative ability is one thing and practical ability is another | [audio] | [audio] | [audio]
+ and the carlsruhe professor had to devise an ingenious apparatus which enabled him to bring the preparation at the required temperature on to the very plate of the microscope | [audio] | and the inventive professor had to devise an ingenious apparatus which enabled him to bring the preparation at the required temperature on to the very plate of the microscope | [audio] | [audio] | [audio]
+ this was george steers the son of a british naval captain and ship modeler who had become an american naval officer and was the first man to take charge of the washington navy yard | [audio] | this was george steers the son of a british naval captain and ship modeler who had become an american naval officer and was entrusted with the prestigious role of overseeing the operations at the renowned naval headquarters | [audio] | [audio] | [audio]
+
+
+
+ [1] https://voicebox.metademolab.com/edit.html
+ [2] https://jasonppy.github.io/VoiceCraft_web/ +