Rasa X Decoding error with German umlauts #4151
Comments
Thanks for raising this issue, @gausie will get back to you about it soon.
@kristiankolthoff Which encoding is your file in? Can you please save it as |
I saved it as |
@erohmensing Can you please check whether you can reproduce that while I'm gone? Thanks!
Hm, I added ö to my domain.yml, nlu.md and stories.md and it loaded up with no problem. I can see the umlauts in all 3 of these places on the UI too. Of course I am also running the latest version, so you might want to try updating. Can you run this in the console for me?
I think the user in the post you mentioned is probably right with |
@erohmensing This error still reproduces for me on the latest version, using Windows with a German locale. The stdin and stdout streams report UTF-8, but they are not the root cause here. The underlying issue is that Python by default writes files using the system code page unless an encoding is passed when opening the file, and Rasa does not specify UTF-8. Additionally, when loading the domain.yml file, Rasa first reformats and saves it before actually loading and parsing it. The save step loses the UTF-8 encoding, so the subsequent load fails because the file is no longer valid UTF-8.
Workaround: (Python 3.7+ only) set the environment variable
Solution for Rasa / Rasa X: When saving the domain file (and other files as well), specify UTF-8 explicitly as the encoding. Python 3.7+ only: enable UTF-8 mode in code.
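The failure mode described in this comment can be reproduced on any platform with a small sketch. This is not Rasa code: the domain snippet and the function names are hypothetical, and cp1252 is passed explicitly only to simulate what a Windows machine with a German locale would likely pick as its default encoding.

```python
import os
import tempfile

# Hypothetical domain snippet containing a German umlaut (not taken from the issue).
DOMAIN_TEXT = "templates:\n  utter_greet:\n  - text: Schönen Tag\n"


def write_then_read(write_encoding, read_encoding="utf-8"):
    """Write DOMAIN_TEXT with one encoding and read it back with another."""
    fd, path = tempfile.mkstemp(suffix=".yml")
    os.close(fd)
    try:
        # Simulates the "reformat and save" step; without an explicit
        # encoding Python would use the locale's preferred encoding here.
        with open(path, "w", encoding=write_encoding) as f:
            f.write(DOMAIN_TEXT)
        # Simulates the later load step, which expects UTF-8.
        with open(path, "r", encoding=read_encoding) as f:
            return f.read()
    finally:
        os.remove(path)


def reproduces_error():
    """Return True if a cp1252 write followed by a UTF-8 read fails to decode."""
    try:
        write_then_read("cp1252")
    except UnicodeDecodeError:
        return True
    return False
```

Writing with cp1252 stores "ö" as a single byte that is not valid UTF-8, so the read raises UnicodeDecodeError, while writing with "utf-8" round-trips cleanly.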
Thanks for the very descriptive info @taotsetung. I've tracked down the part where the domain gets written in Rasa X and you're right, the encoding isn't specified:
I assume that |
@erohmensing Yes, adding the encoding to the open call fixed the error. Rasa X is not open source, so we can't make a PR?
Yes, unfortunately. But we actually already merged the PR to fix this issue :) I will close it when the fix is released.
The fix is part of Rasa X 0.20.3.
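For readers who hit the same class of error in their own scripts, the fix discussed in this thread (passing the encoding to the open call) might look like the sketch below. The helper names are hypothetical and this is not the actual Rasa X patch, which is not shown in the issue.

```python
import locale
import os
import tempfile


def default_file_encoding():
    """Return the encoding Python normally uses for open() when none is given."""
    return locale.getpreferredencoding(False)


def dump_text_utf8(path, text):
    """Write text as UTF-8 regardless of the system code page."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_text_utf8(path):
    """Read text as UTF-8 regardless of the system code page."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def round_trip(text):
    """Write and re-read text through a temporary file using the helpers above."""
    fd, path = tempfile.mkstemp(suffix=".yml")
    os.close(fd)
    try:
        dump_text_utf8(path, text)
        return load_text_utf8(path)
    finally:
        os.remove(path)
```

Calling default_file_encoding() on the affected machine is a quick way to check whether the system default differs from UTF-8. With the explicit encoding on both the write and the read, text containing umlauts survives the round trip on any locale.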
Rasa version: 1.1.7
Rasa X version (if used & relevant): 0.19.5
Python version: 3.7.0
Operating system (windows, osx, ...): windows
Issue:
I am getting a decoding error when I try to start Rasa X with German umlauts in the domain.yml. If I remove the special characters, I can start Rasa X without problems. The same issue has already been reported in the rasa-x-demo repository: RasaHQ/rasa-x-demo#16
After testing, this error also occurs when running rasa train.
Error (including full traceback):
Command or request that led to error:
Content of configuration file (config.yml) (if relevant):
Content of domain file (domain.yml) (if relevant):