Example code for scraping a podcast on Squarespace and uploading the data to TOM-site

orbitalpodcast/TOM-scrape

TOM-scrape

This is example code for scraping a podcast on Squarespace and uploading the data to TOM-site.

Key concepts included in this example code:

  • HTTP GET requests using Requests library
  • Parsing HTML with Beautiful Soup
  • Reading and writing data to a file using JSON
  • Reading and writing binary data to disk
  • Creating HTTP POST and PATCH requests
  • Formatting data in those requests to interface with Rails' ActiveRecord
  • Interpreting errors from failed POST requests
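Two of the concepts above, keeping scraped metadata in a JSON file and moving binary data to and from disk, need only the standard library. The following is a minimal sketch rather than the code in tom-scrape.py; the function names and the shape of the episode dictionaries are assumptions made for illustration.

```python
import json
from pathlib import Path


def save_episodes(episodes, path):
    """Write a list of episode dicts to disk as JSON."""
    Path(path).write_text(json.dumps(episodes, indent=2), encoding="utf-8")


def load_episodes(path):
    """Read the cached episode list back, or return [] if no cache exists yet."""
    cache = Path(path)
    if not cache.exists():
        return []
    return json.loads(cache.read_text(encoding="utf-8"))


def save_binary(data, path):
    """Write raw bytes (for example a downloaded image or MP3) to disk."""
    with open(path, "wb") as handle:  # "wb" keeps the bytes untouched
        handle.write(data)


def load_binary(path):
    """Read a file back as bytes, ready to be attached to an upload."""
    with open(path, "rb") as handle:
        return handle.read()
```

Caching the scraped metadata this way means the upload step can be re-run without scraping the site again.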

The code

tom-scrape.py is fairly general. Different Squarespace themes may break its selectors, and of course your content will be completely different from mine, but it should serve as a fully fleshed-out example. Please feel free to use this code, but please do change the user agent.
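As a rough illustration of the scraping step, here is a minimal sketch using Requests and Beautiful Soup. It is not the code in tom-scrape.py: the user agent string, the function names, and in particular the `entry-title` selector are hypothetical and will need adjusting to your own theme's markup.

```python
import requests
from bs4 import BeautifulSoup

# Assumption: replace this with a string that identifies you and your project.
USER_AGENT = "my-podcast-scraper/0.1 (contact: you@example.com)"


def fetch_html(url):
    """GET a page with a custom User-Agent and return its HTML text."""
    response = requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response.text


def parse_episode(html):
    """Pull a title and image URLs out of an episode page.

    The selectors are hypothetical: inspect your own theme's markup
    and adjust them.
    """
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find("h1", class_="entry-title")
    return {
        "title": title_tag.get_text(strip=True) if title_tag else None,
        # src=True skips <img> tags that carry no src attribute
        "images": [img["src"] for img in soup.find_all("img", src=True)],
    }
```

Keeping the fetch and the parse in separate functions lets you test the parsing against a saved HTML file without hitting the site. Some themes may keep the image URL in a different attribute, in which case the `src` lookup needs changing.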

tom-upload.py is specific to the API of my website, but it's a great example of how to upload binary files to a Rails application! The key here is to not set Content-Type in the HTTP headers yourself, so that the HTTP client can generate the multipart header, boundary included (HUGE thanks to Dakota Lillie for this insight!).
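To make the Content-Type point concrete, here is a minimal sketch of a multipart upload with Requests. It is not the code in tom-upload.py. The host name and the function name are hypothetical, and it assumes the trailing path segment of `/upload_audio/1` is what the server sees as `params[:number]` (the route definition is not shown in this README). The form field is named `file` to match `params[:file]` in the `upload_audio` action.

```python
import io

import requests


def build_audio_upload(base_url, number, audio_file, filename="episode.mp3"):
    """Prepare (but do not send) a multipart PATCH carrying an audio file.

    No Content-Type header is passed in: when `files=` is used, Requests
    generates "multipart/form-data; boundary=..." itself. Setting the
    header by hand would drop the boundary the server needs for parsing.
    """
    request = requests.Request(
        "PATCH",
        f"{base_url}/upload_audio/{number}",  # assumed route shape
        files={"file": (filename, audio_file, "audio/mpeg")},
    )
    return request.prepare()


# Hypothetical host; BytesIO stands in for open("episode.mp3", "rb").
prepared = build_audio_upload(
    "https://example.com", 1, io.BytesIO(b"fake mp3 bytes")
)
print(prepared.headers["Content-Type"])
```

The prepared request can then be sent with `requests.Session().send(prepared)`. Preparing it first is only a convenience for inspecting the generated header; passing the same `files=` argument straight to `requests.patch` behaves the same way.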

Please see my other repo for the full server code that this script interacts with, but the basics are as follows:

  # POST /episodes
  # POST /episodes.json
  def create
    @episode = Episode.new(episode_params.except(:images))
    respond_to do |format|
      if @episode.save
        format.json { render :show, status: :created, location: @episode }
      else
        format.json { render json: {errors: @episode.errors}, status: 422, encoding: 'application/json' }
      end
    end
  end

  # PATCH/PUT /episodes/1
  # PATCH/PUT /episodes/1.json
  def update
    respond_to do |format|
      if @episode.update( episode_params.except(:images))
        update_attachments
        format.json { render :show, status: :accepted, location: @episode }
      else
        format.json { render json: {errors: @episode.errors}, status: 422, encoding: 'application/json' }
      end
    end
  end  

  # PATCH/PUT /upload_audio/1
  def upload_audio
    unless @episode = Episode.find_by(number: params[:number])
      render inline: '{"errors":{"number":"nonexistent episode number."}}', status: 422, encoding: 'application/json' and return
    end
    file = params[:file]
    @episode.audio.attach(file)
  end

  # DELETE /episodes/1
  # DELETE /episodes/1.json
  def destroy
    @episode.destroy
    respond_to do |format|
      format.html { redirect_to episodes_url, notice: 'Episode was successfully destroyed.' }
      format.json { head :no_content }
    end
  end
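On the client side, the actions above suggest two small helpers: one to shape the request fields, one to read the 422 responses. This is a minimal sketch, not the code in tom-upload.py. It assumes `episode_params` expects attributes nested under an `episode` key (the usual Rails `params.require(:episode)` convention, which is not shown here), and both function names are hypothetical.

```python
def to_rails_form(model, attributes):
    """Flatten {"title": "x"} into {"episode[title]": "x"}.

    Rails parses bracketed form field names back into a nested hash,
    which is the shape strong parameters usually expect.
    """
    return {f"{model}[{key}]": value for key, value in attributes.items()}


def describe_errors(status_code, body):
    """Turn a 422 body such as {"errors": {"title": [...]}} into readable lines."""
    if status_code != 422:
        return []
    lines = []
    for field, messages in body.get("errors", {}).items():
        # Model validation errors serialise as lists of messages; the
        # hand-written JSON in upload_audio uses a single string instead.
        if isinstance(messages, str):
            messages = [messages]
        lines.extend(f"{field}: {message}" for message in messages)
    return lines
```

With Requests, the output of `to_rails_form` would be passed as `data=`, and `describe_errors(response.status_code, response.json())` would be called on the result.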
