Word cloud for Tamil and English language #688
Comments
Does the if you want to give it a frequency distribution instead of the raw text, you need to use the
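The method name is missing from the comment above. As a minimal sketch, assuming it refers to `WordCloud.generate_from_frequencies`, which takes a `{word: count}` mapping, passing counts directly might look like this. The `allwords` sample and the helper names `top_frequencies` and `build_cloud` are hypothetical, and `collections.Counter` stands in for `FreqDist`.

```python
from collections import Counter


def top_frequencies(words, n=100):
    """Return the n most common words as a {word: count} dict."""
    return dict(Counter(words).most_common(n))


def build_cloud(frequencies, font_path):
    """Render a cloud whose word sizes follow the given counts."""
    # Imported lazily because wordcloud is a third-party package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=1600, height=800,
                      background_color="white", font_path=font_path)
    # Sizes come from the counts in the mapping, not from re-tokenized text.
    return cloud.generate_from_frequencies(frequencies)


# Hypothetical token list standing in for `allwords` from the question.
allwords = ["sir", "sir", "sir", "thanks", "thanks", "hello"]
frequencies = top_frequencies(allwords)
print(frequencies)

# With wordcloud and the font file available (not executed here):
# cloud = build_cloud(frequencies, "Nirmala.ttf")
# plt.imshow(cloud)
```

Because the counts are handed over as numbers, the most frequent word gets the largest font without the text being parsed again.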
I'm sorry, I can't read Tamil, so I don't know what's wrong with the processing. The most likely cause is that the font does not support some characters.
I am using the Nirmala.ttf font. When plotting in matplotlib with Nirmala.ttf, all the words are printed correctly with the help of mplcairo, cairo, and raqm. I only get this error when using word cloud.
matplotlib uses PIL/Pillow under the hood. Can you try reproducing with PIL?
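A reproduction with Pillow alone could be sketched roughly as follows. The helper names `raqm_available` and `render_word`, the output file name, and the image size are assumptions for illustration. The sketch first checks whether the local Pillow build reports raqm support, which Pillow uses for complex text layout, then draws a single word with the same font.

```python
def raqm_available():
    """Report whether this Pillow build was compiled with raqm support."""
    from PIL import features  # third-party; imported lazily

    return bool(features.check("raqm"))


def render_word(word, font_path, out_path="pil_test.png", size=64):
    """Draw one word with the given font and save it for visual inspection."""
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, size)
    image = Image.new("RGB", (800, 200), "white")
    ImageDraw.Draw(image).text((20, 40), word, font=font, fill="black")
    image.save(out_path)
    return out_path


# Example calls (not executed here; they need Pillow and the font file):
# print(raqm_available())
# render_word("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd", "Nirmala.ttf")
```

If the saved image already shows broken Tamil text, the problem lies in font rendering and not in the word cloud layout.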
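from nltk import FreqDist  # assumed source of FreqDist; the original omits imports
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# allwords: the list of word tokens extracted from the input text (defined elsewhere)
mostcommon = FreqDist(allwords).most_common(100)
# generate() receives the string form of the (word, count) list
wordcloud = WordCloud(width=1600, height=800, background_color='white', font_path='Nirmala.ttf').generate(str(mostcommon))
fig = plt.figure(figsize=(30, 10), facecolor='white')
plt.imshow(wordcloud)  # , interpolation="bilinear")
plt.axis('off')
plt.title('Top 100 Most Common Words', fontsize=100)
# plt.tight_layout(pad=0)
plt.show()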
The above code generates the word cloud.
English words are printed properly, but not at the correct size (high-frequency words should have a bigger font size).
E.g. "sir" should have the biggest font size, since it is repeated the most times (see output.csv).
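One likely reason for the wrong sizes is that `str(mostcommon)` turns the `(word, count)` pairs into plain text in which every word appears exactly once, so a tokenizer that recounts words from that text sees equal frequencies. The sketch below illustrates this with made-up counts. The regular expression is only a rough stand-in for a word tokenizer, not the one the wordcloud package uses.

```python
import re
from collections import Counter

# Hypothetical counts standing in for FreqDist(allwords).most_common(100).
mostcommon = [("sir", 120), ("thanks", 45), ("hello", 12)]

# This is the kind of string the call in the question passes to generate().
text = str(mostcommon)

# Rough stand-in for a tokenizer: runs of letters only, digits dropped.
tokens = re.findall(r"[^\W\d_]+", text)
recounted = Counter(tokens)

print(text)
print(recounted)  # every word is counted once; the original counts are lost
```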
Tamil words are not printed; only random letters are printed.
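One possible contributor to the stray letters, independent of the font, is tokenization. In Python's `re` module, `\w` matches letters and digits but not combining marks, and Tamil vowel signs and the pulli are combining marks. A `\w`-based word pattern therefore cuts Tamil words into fragments. The sketch below shows the effect on one word. The widened pattern that adds the Tamil Unicode block is an illustrative assumption, not a setting taken from the thread.

```python
import re
import unicodedata

# The word "Tamil" in Tamil script, written with escapes to keep it exact.
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# \w+ stops at each combining mark, so the word is split into pieces.
pieces = re.findall(r"\w+", word)
print(pieces)

# Show the Unicode category of each code point: letters are "Lo",
# vowel signs and the pulli fall into the mark categories ("Mc" / "Mn").
for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# Adding the Tamil block (U+0B80 to U+0BFF) to the character class
# keeps the word in one piece.
whole = re.findall(r"[\w\u0b80-\u0bff]+", word)
print(whole)
```

A pattern like the widened one could in principle be passed through the `regexp` argument of `WordCloud`; supplying word counts directly avoids text tokenization altogether.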
How can I solve this?
The words and their frequencies are in output.csv.
output.csv