Synthesizing some phrases triggers onnx error ("GatherElements op: Out of range value in index tensor") #520

Open
knochenhans opened this issue Jun 8, 2024 · 6 comments

Comments

@knochenhans

Hi, I’m currently trying to track down an issue when using the current Piper version with Python that came up after a recent system update. This runs in a venv with Python 3.11.9 (can’t test this in my main Python version 3.12.3 because of issue #509 for now). The following minimal example, which tries to synthesize the text "This is a test. This is a Test.", reproducibly produces the following rather strange error, which seems to be related to onnx:

2024-06-09 01:24:01.191154583 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running GatherElements node. Name:'/dp/flows.7/GatherElements_3' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/gather_elements.cc:154 void onnxruntime::core_impl(const Tensor*, const Tensor*, Tensor*, int64_t, concurrency::ThreadPool*) [with Tin = long int; int64_t = long int] GatherElements op: Out of range value in index tensor

Here is the minimal example (stripped down version of a much larger project):

import io
import wave

from piper import PiperVoice

synthesize_args = {
    "speaker_id": None,
    "length_scale": None,
    "noise_scale": None,
    "noise_w": None,
    "sentence_silence": 0.5,
}

model = PiperVoice.load(
    "/usr/share/piper-voices/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx",
    "/usr/share/piper-voices/en/en_US/hfc_male/medium/en_US-hfc_male-medium.onnx.json",
)

wave_io = io.BytesIO()
with wave.open(wave_io, "wb") as wav_file:
    model.synthesize("This is a test. This is a Test.", wav_file, **synthesize_args) # <- Produces the error
    # model.synthesize("This is a test. Test.", wav_file, **synthesize_args) <- This works for some reason

As you can see, shortening the text makes this work again for some reason. Before the system update this kind of error never came up. This uses onnxruntime-1.18.0, piper_phonemize-1.1.0, and piper_tts-1.2.0.

This works fine in the binary version of Piper, by the way.

@jarvisSM24

Just encountered the same issue. Downgrading onnxruntime to 1.17.1 resolved it for me, so this seems to be a bug in the 1.18 version.
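
For anyone who wants to check their environment before synthesizing, here is a minimal sketch of a version guard. It assumes that only the onnxruntime 1.18 series is affected, which is all this thread reports and is not confirmed for other releases; the helper names (parse_major_minor, is_reported_bad, check_onnxruntime) are made up for illustration and are not part of Piper or onnxruntime.

```python
from importlib import metadata


def parse_major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '1.18.0'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def is_reported_bad(version: str) -> bool:
    """True for the 1.18 series, the only one reported as failing in this thread.

    Assumption: other releases may or may not be affected; this is not verified.
    """
    return parse_major_minor(version) == (1, 18)


def check_onnxruntime() -> None:
    """Print a hint based on the installed onnxruntime version, if any."""
    try:
        installed = metadata.version("onnxruntime")
    except metadata.PackageNotFoundError:
        print("onnxruntime is not installed in this environment")
        return
    if is_reported_bad(installed):
        print(
            f"onnxruntime {installed} matches the version reported as failing; "
            "a possible workaround is: pip install onnxruntime==1.17.1"
        )
    else:
        print(f"onnxruntime {installed} is not the version reported as failing")


if __name__ == "__main__":
    check_onnxruntime()
```

Running the file inside the same venv as Piper prints which case applies; the actual downgrade is still done with pip.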

@knochenhans
Author

knochenhans commented Jun 9, 2024

Thanks for the hint, I can confirm downgrading onnxruntime solves the problem for now!

I guess this is related to microsoft/onnxruntime#20877. I initially ran into that thread but got discouraged from trying to downgrade as this didn’t seem to help the last participant, while the actual solution (changing "some of the tensors from int64 to int when calculating the metric on the prediction") was completely over my head 🙃

@KRISHpatel-01

Yeah, thanks, it also worked for me. But I really don't get why it didn't work in the 1.18 version.

@jnhck

jnhck commented Jun 28, 2024

Thanks for figuring this out. We had the same problem!

@tejas-hosamani

I am facing this issue as well

@hanneseilers

It still appears in the latest Piper version on Ubuntu Linux 22.04.5 LTS. The error seems to depend on the length of the input text. With some speech models processing gets further than with others, and the length of the text processed before the failure sometimes varies as well.
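
Since the failure seems to depend on the input text and the model, a small probing helper may make it easier to report which phrases break. The following is only a sketch: find_failing_phrases and fake_synthesize are hypothetical names, the stand-in synthesizer fails on an arbitrary length threshold purely to exercise the helper, and it does not model the real failure condition. The helper catches a generic Exception because the exact exception type raised by onnxruntime is not assumed here.

```python
from typing import Callable, Iterable


def find_failing_phrases(
    synthesize_fn: Callable[[str], None], phrases: Iterable[str]
) -> list[tuple[str, str]]:
    """Call synthesize_fn on each phrase and collect (phrase, error message) for failures."""
    failures = []
    for phrase in phrases:
        try:
            synthesize_fn(phrase)
        except Exception as exc:  # exact onnxruntime exception type is not assumed
            failures.append((phrase, str(exc)))
    return failures


def fake_synthesize(text: str) -> None:
    """Stand-in for a real synthesizer; fails on long input only for demonstration."""
    if len(text) > 25:
        raise RuntimeError("GatherElements op: Out of range value in index tensor")


if __name__ == "__main__":
    candidates = ["This is a test. Test.", "This is a test. This is a Test."]
    for phrase, message in find_failing_phrases(fake_synthesize, candidates):
        print(f"FAILED: {phrase!r}: {message}")
```

With Piper, synthesize_fn could be a small wrapper that opens a fresh in-memory WAV file and calls model.synthesize on it, as in the minimal example at the top of this issue.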

6 participants