※Namuwiki articles

-Crawled the web using BeautifulSoup, covering 2 main domains and 100 related articles for each main domain

-Used Namuwiki articles, which are written in Korean, as the source

-Made word clouds from the collected data

-Obtained similar words using Word2Vec
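
For readers new to the last step: Word2Vec-style "similar words" are usually ranked by cosine similarity between word vectors. Below is a minimal sketch of that ranking in plain Python. It does not train a model, and `toy_vectors`, `cosine_similarity`, and `most_similar` are hypothetical names with made-up values for illustration only, not code from the notebook.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Rank every other word by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical 3-dimensional vectors; a trained model would supply
# real ones with many more dimensions.
toy_vectors = {
    "movie": [0.9, 0.1, 0.0],
    "film": [0.8, 0.2, 0.1],
    "actor": [0.5, 0.5, 0.0],
    "river": [0.0, 0.1, 0.9],
}

result = most_similar("movie", toy_vectors, topn=2)
print(result)
```

With a trained model, the vectors would come from the model itself instead of a hand-written dictionary.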

Note: I wrote this notebook in Jupyter Notebook, not Colab. You will also need an image file of the shape you want for the word cloud and a font file in your local environment.

※IMDb website crawling

-Crawled a dynamic website using Selenium

-Collected reviews and ratings of John Wick 4 from IMDb

※MovieLens 100K

-Used this dataset to recommend movies based on user profiles
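
This README does not spell out how the user profile is built, so the following is only one possible reading: a content-based sketch that assumes the profile is a rating-weighted average of the genre flags of the movies a user has rated. Unseen movies are then ranked by cosine similarity to that profile. The movie IDs, genre columns, ratings, and function names are all hypothetical toy values, not MovieLens data or code from the notebook.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_user_profile(ratings, movie_genres):
    """Rating-weighted average of the genre vectors of rated movies."""
    n_genres = len(next(iter(movie_genres.values())))
    profile = [0.0] * n_genres
    total = 0
    for movie_id, rating in ratings.items():
        for i, flag in enumerate(movie_genres[movie_id]):
            profile[i] += rating * flag
        total += rating
    return [x / total for x in profile] if total else profile


def recommend(ratings, movie_genres, topn=2):
    """Rank movies the user has not rated by similarity to the profile."""
    profile = build_user_profile(ratings, movie_genres)
    scored = [
        (movie_id, cosine(profile, genres))
        for movie_id, genres in movie_genres.items()
        if movie_id not in ratings
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [movie_id for movie_id, _ in scored[:topn]]


# Hypothetical toy data; columns are [action, comedy, romance].
movie_genres = {
    "A": [1, 0, 0],
    "B": [1, 1, 0],
    "C": [0, 0, 1],
    "D": [1, 0, 0],
    "E": [0, 1, 1],
}
ratings = {"A": 5, "B": 4, "C": 1}  # one user's ratings on a 1-5 scale

profile = build_user_profile(ratings, movie_genres)
recommended = recommend(ratings, movie_genres)
print(profile, recommended)
```

The same structure should carry over to real data by replacing the toy dictionaries with the dataset's ratings and genre columns.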