Translate category scheme to English #3
Comments
Okay, I will do it. A quick check shows there are over 200k unique labels under "Top/World/..." that will be non-English. But it seems Google Translate is limited to just 1,000 words/day?
Not sure if we have good alternatives. And it seems that Google's pricing is reasonable: we can run through it one time.
Actually, the Google Translate API has the following limit (it's not 1,000 words/day):
By splitting each category path into its levels and grouping the levels by language, we can get a smaller list of unique words for each language. That comes to about 1.5M characters, so the free quota will probably be enough to translate it all.
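A minimal sketch of that splitting step, to make the idea concrete. It assumes labels look like "Top/World/<Language>/<level>/<level>/...", with the language name as the third path segment; that layout, the function names, and the sample labels are illustrative assumptions, not taken from the actual dataset.

```python
from collections import defaultdict


def unique_segments_by_language(labels):
    """Collect the unique path segments found under each language.

    Assumption: a non-English label has the form
    "Top/World/<Language>/<level>/<level>/...".
    """
    segments = defaultdict(set)
    for label in labels:
        parts = label.split("/")
        # Skip labels that are not under Top/World/<Language>/...
        if len(parts) < 4 or parts[:2] != ["Top", "World"]:
            continue
        language = parts[2]
        # Every level below the language becomes a candidate for translation.
        segments[language].update(p for p in parts[3:] if p)
    return segments


def total_characters(segments):
    """Count the characters that would be sent to a translation service."""
    return sum(len(s) for words in segments.values() for s in words)


# Hypothetical sample labels, for illustration only.
labels = [
    "Top/World/Deutsch/Wissenschaft/Physik",
    "Top/World/Deutsch/Wissenschaft/Chemie",
    "Top/World/Deutsch/Kultur",
    "Top/Arts/Music",
]
segments = unique_segments_by_language(labels)
```

Because shared levels such as "Wissenschaft" are stored once per language, the character count from `total_characters` can be compared against whatever quota applies before any translation request is made.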
Lots of category labels are in a language other than English.
For non-English labels, one pattern appears to be that the language is in the path: Deutsch, Japanese, etc.
Perhaps use Google Translate to translate them? One package we could use:
https://pypi.python.org/pypi/translate
The final output will have an additional column: cat_labels_english
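One way the extra column could be produced is sketched below. The translation call itself is left out: the code takes a per-language lookup table of already translated segments, which is a hypothetical structure, as are the source column name `cat_labels`, the function names, and the sample rows. Only the target column name `cat_labels_english` comes from this issue.

```python
def translate_label(label, lookup):
    """Rebuild a label with each level replaced by its English translation.

    Assumption: non-English labels look like "Top/World/<Language>/...".
    Segments missing from the lookup are left unchanged.
    """
    parts = label.split("/")
    if len(parts) < 4 or parts[:2] != ["Top", "World"]:
        return label  # treated as already English
    table = lookup.get(parts[2], {})
    translated = [table.get(p, p) for p in parts[3:]]
    return "/".join(parts[:3] + translated)


def add_english_column(rows, lookup, source="cat_labels",
                       target="cat_labels_english"):
    """Return copies of the rows with the translated label added."""
    return [{**row, target: translate_label(row[source], lookup)}
            for row in rows]


# Hypothetical lookup and rows, for illustration only.
lookup = {"Deutsch": {"Wissenschaft": "Science", "Physik": "Physics"}}
rows = [
    {"cat_labels": "Top/World/Deutsch/Wissenschaft/Physik"},
    {"cat_labels": "Top/Arts/Music"},
]
result = add_english_column(rows, lookup)
```

Keeping the original column and writing the translation next to it means an untranslated or badly translated segment can still be traced back to its source label.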