This Python script scrapes the contents of a Medium article from its URL and saves the extracted text to a .txt file. It extracts every paragraph from the article's HTML page and stores the content in a directory called scraped_articles. It uses the Beautiful Soup library, which extracts text from the webpage with the help of an HTML parser.
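The flow described above can be sketched roughly as follows. This is a minimal illustration, not the script itself: the function names `extract_paragraphs` and `scrape_article`, the 10-second timeout, and the rule for deriving the output file name from the last URL segment are all assumptions made for the example.

```python
from pathlib import Path

import requests
from bs4 import BeautifulSoup


def extract_paragraphs(html: str) -> str:
    """Return the text of every <p> element, one paragraph per line."""
    # "html.parser" is the parser bundled with the Python standard library.
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = (p.get_text().strip() for p in soup.find_all("p"))
    # Skip paragraphs that contain only whitespace.
    return "\n".join(text for text in paragraphs if text)


def scrape_article(url: str, out_dir: str = "scraped_articles") -> Path:
    """Download the page at `url` and save its paragraph text as a .txt file."""
    response = requests.get(url, timeout=10)  # timeout value is an assumption
    response.raise_for_status()  # fail loudly on 4xx/5xx responses

    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)

    # Assumed naming rule: use the last URL segment as the file name.
    name = url.rstrip("/").split("/")[-1] or "article"
    out_path = folder / f"{name}.txt"
    out_path.write_text(extract_paragraphs(response.text), encoding="utf-8")
    return out_path
```

Calling `scrape_article(url)` with an article URL would return the path of the saved file. Keeping the parsing step in its own function means it can be checked against a small HTML string without any network access.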
- Python 3.9
- Libraries:
  - requests
  - BeautifulSoup (from the bs4 package)
To install the necessary libraries, run:
pip install requests
pip install beautifulsoup4
When prompted for input, enter the following Medium article URL:
https://medium.com/@subashgandyer/papa-what-is-a-neural-network-c5e5cc427c7