
Distant Reader Tutorials

Natalie Meyers edited this page Jun 27, 2020 · 17 revisions


Here are some Distant Reader Tutorials:


This is a recipe for study carrel creation. Like any recipe, it is really an outline of what needs to be done.

First, create an SSH public key and send it to Eric C., who will create an account on our system.

Second, log into our HTTP host making sure to map TCP ports along the way. People who use PuTTY as their ssh client ought to be able to open their terminal ("cmd") and issue the following command:

putty -L localhost:8080:localhost:8080 149.165.170.150

People who use a Macintosh (or Linux or the absolute latest & greatest version of Windows) ought to be able to open their terminal ("Terminal") and issue the following command:

ssh -L localhost:8080:localhost:8080 149.165.170.150
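Before moving on, it can help to confirm the tunnel is actually listening. Here is a minimal sketch, assuming the port mapping above is in place; the function name tunnel_is_open is hypothetical, and the check only says that something accepts connections on the port, not that it is the search interface:

```python
import socket


def tunnel_is_open(host: str = "localhost", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises an OSError subclass when the port is closed
        # or the connection attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If tunnel_is_open() returns False, the ssh (or putty) session has probably ended or the -L option was left off.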

Third, use a Web browser to search our Solr index at http://localhost:8080. Use the words AND, OR, or NOT to create Boolean queries. There are a number of fields available for searching. They include: title, authors, year, entity, type, keywords, and journal. One can also truncate a word with an asterisk ("*") to match every word beginning with the given letters. Example queries include:

* foo
* foo bar
* "foo bar"
* baz*

* foo AND bar
* foo OR bar
* (foo OR bar) NOT baz

* title:foo
* year:bar
* keywords:baz

Here's a helpful hint: search for everything with simply the asterisk symbol; submit the query "*", sans the quote marks. The result will be the totality of records in the system. The searcher can then use the resulting hyperlinked facets to narrow the results.
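The query patterns above can also be composed programmatically, which is handy when keeping notes on the searches behind a carrel. This is only a sketch: the helper names (field, all_of, any_of, without) are hypothetical, and the strings they produce are meant to be pasted into the search box, not sent to any particular URL:

```python
def field(name: str, value: str) -> str:
    """Build a fielded term such as title:foo, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{name}:{value}"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


def any_of(*terms: str) -> str:
    """Join terms with OR, wrapped in parentheses so the group can be combined."""
    return "(" + " OR ".join(terms) + ")"


def without(query: str, term: str) -> str:
    """Exclude a term from a query with NOT."""
    return f"{query} NOT {term}"
```

For example, without(any_of("foo", "bar"), "baz") yields (foo OR bar) NOT baz, and field("title", "foo") yields title:foo.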

Fourth, once a satisfying set of results has been created, use the "queue the creation of a study carrel" link to initialize a study carrel. In the resulting HTML form supply a name for the carrel and make absolutely sure the name is a single "word", where a word contains upper or lower-case letters, numbers, and/or characters such as "-" or "_", and zero spaces. Example carrel names ("words") include:

  • virus
  • virus-2020
  • medical_care
  • MedicalCare
  • kaggle-question-01

Make your life easy. Use lowercase letters, numbers, and the dash character ("-").

If a carrel already exists with the name one enters, then the existing carrel will be overwritten. This is by design and considered a good thing. If people want to keep their carrels separate from other people's, then consider prefixing carrel names with initials. An example might be elm-medical-care. People ought to feel free to create as many carrels as they desire, even if the content is test-like; if carrels are intended as tests, then consider prefixing carrel names with the word "test". An example might be test-medical-care.
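The naming rules above are easy to check before submitting the form. Here is a minimal sketch; the function names are hypothetical, and the pattern allows only letters, numbers, "-" and "_", which is a conservative reading of "characters such as":

```python
import re

# One "word": letters, numbers, dashes, and underscores; no spaces.
CARREL_NAME = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_carrel_name(name: str) -> bool:
    """Return True if the whole name is a single carrel "word"."""
    return CARREL_NAME.fullmatch(name) is not None


def suggest_carrel_name(text: str, prefix: str = "") -> str:
    """Lowercase the text and turn every run of other characters into one dash."""
    combined = f"{prefix} {text}" if prefix else text
    return re.sub(r"[^a-z0-9]+", "-", combined.lower()).strip("-")
```

For example, suggest_carrel_name("Medical Care", prefix="test") gives test-medical-care, following the "make your life easy" advice of lowercase letters, numbers, and dashes.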

Fifth, wait. If nothing goes wrong, then the carrel will be initialized in 60 seconds. After another 120 seconds (or so), the carrel will begin to be processed. Continually reload the URL returned by Step # 4, and one ought to see changes. If no changes are seen after 5 minutes, then call Eric. Otherwise, wait. The carrel is building, and depending on the number of things in the carrel, the building process will require between 15 minutes and many hours. Again, one can monitor progress by continually reloading the URL returned by Step # 4. To get more detail, consider "drilling down" the multitude of directories in the study carrel. If you REALLY want to see what is going on, then open your study carrel's "standard-error.txt" file, and you will get a step-by-step rendition of the building process.

Sixth, your study carrel is finished building when you load the URL from Step # 4 and you see an HTML page. Congratulations, you're done. If you get a blank page, then the carrel is almost done. Wait some more. If an HTML page never returns, then: 1) load the log file (standard-error.txt), and 2) call Eric. What's really cool about this result is two-fold. First, you can share the URL with your friends, colleagues, etc. Second, you will be able to download the whole carrel and open it on your computer where it will be 100% functional. As a bonus, you could save the carrel on a different Web server, and it will be 100% functional there as well.
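The reload-and-wait routine in the fifth and sixth steps can be automated. What follows is a sketch under stated assumptions, not a tested monitor: the function names are hypothetical, and the default test for "finished" (a non-blank response containing an html tag) is a guess, since a directory listing shown while the carrel is building may also be HTML. Supply your own is_finished test once you know what your finished page looks like:

```python
import time
import urllib.request
from typing import Callable


def fetch(url: str) -> str:
    """Return the body of the page at url as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_like_html(body: str) -> bool:
    """Assumed test for a finished carrel: a non-blank page with an html tag."""
    return "<html" in body.lower()


def wait_for_carrel(
    url: str,
    is_finished: Callable[[str], bool] = looks_like_html,
    fetch_page: Callable[[str], str] = fetch,
    pause: Callable[[float], None] = time.sleep,
    interval: float = 60.0,
    max_checks: int = 120,
) -> bool:
    """Reload url until is_finished says so; give up after max_checks tries."""
    for _ in range(max_checks):
        try:
            body = fetch_page(url)
        except OSError:
            # Network and HTTP errors from urllib are OSError subclasses;
            # treat them like a blank page and keep waiting.
            body = ""
        if is_finished(body):
            return True
        pause(interval)
    return False
```

If wait_for_carrel returns False, the advice above still applies: load standard-error.txt and call Eric.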

For extra credit (think, "icing on the cake"), one can add context (a title, a scope note, provenance, a date, and authorship) to a study carrel. This is an advanced technique and it is still in development. Here's how:

  1. Ssh to 149.165.170.150.

  2. Duplicate any of the files in /export/cord/etc/contexts making sure the duplicate file has the same name as the study carrel in question.

  3. Edit the duplicated file and give values to each of the named fields (LONGNAME, SCOPENOTE, CREATOR, EMAIL, CREATIONDATE). Keep the values VERY simple, and make sure each name-value pair is delimited by a tab character.

  4. Test your edits by first navigating to the cord directory (cd /export/cord/) and then running ./bin/add-context.pl <carrel> where <carrel> is the name of the study carrel. The result will be a stream of HTML. Peruse the HTML. Is it what you desire? If not, then go back to the previous step. If so, then continue.

  5. Implement your edits by redirecting the resulting HTML to a file, like this: ./bin/add-context.pl <carrel> > /export/reader/carrels/<carrel>/home.html where <carrel> is the name of the study carrel.

By this point one ought to be able to open up the study carrel's URL, and the context will be seen at the top of the resulting page.

  1. If you like the new context section and want it to replace your carrel's default page, then go to /export/reader/carrels/<carrel>/, where <carrel> is the name of the study carrel, and cp your home.html over your index.htm file.
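The editing step of the context recipe above (giving values to the named fields) can be sketched in code. The field names come from the recipe; the exact layout, one name-value pair per line with the name and value separated by a tab, is an assumption, so compare the output against an existing file in /export/cord/etc/contexts before relying on it. The function name render_context is hypothetical:

```python
from typing import Mapping

# Field names as listed in the recipe.
FIELDS = ("LONGNAME", "SCOPENOTE", "CREATOR", "EMAIL", "CREATIONDATE")


def render_context(values: Mapping[str, str]) -> str:
    """Return context text with one tab-delimited name-value pair per line.

    Assumed layout: NAME<TAB>value, in the order given by FIELDS.
    """
    missing = [name for name in FIELDS if name not in values]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    lines = []
    for name in FIELDS:
        value = values[name]
        # Keep values VERY simple: a tab or newline would break the pairs.
        if "\t" in value or "\n" in value:
            raise ValueError(f"{name} must not contain tabs or newlines")
        lines.append(f"{name}\t{value}")
    return "\n".join(lines) + "\n"
```

Duplicating an existing context file by hand remains the documented route; this only guards against stray tabs and forgotten fields.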

"Fun with distant reading!" …

-- Eric Lease Morgan [email protected] June 20, 2020 (First Day of Summer)


  1. cd /export/reader/carrels/<Your Carrel Name>
  2. sbatch make-carrel.slurm
  3. wait until the carrel is re-built
  4. re-run add-context.pl making sure to redirect the output to index.html

An advanced technique is to:

  1. salloc
  2. wait for a node to warm up
  3. ssh compute-0 (or whatever the name of the node is)
  4. cd /export/reader/carrels/<Your Carrel Name>
  5. /export/reader/bin/carrel2about.py > index.htm
  6. wait
  7. when you get your prompt back, exit
  8. re-run add-context.pl making sure to redirect the output to index.html

The first technique is easy and straightforward. The second technique is faster but more finicky.

P.S. If you get permission errors, then use sudo in conjunction with add-context.pl, if you have that right. Otherwise, reach out on Slack or via email.
