Repository of tools for manipulating information held in an ESDOC repository to make tables for scientific publications.
Contributions are welcome: please open a pull request if you have more examples to add.
(Python 2.7 only, Python 3 version due late summer 2019.)
The initial release will be a tool for extracting Experiment and MIP documents from an ESDOC repository and making a PDF table suitable for inclusion in a journal paper (i.e. the table is typeset into PDF and the paper will need to include the PDF as a graphic). In most cases the PDF output of the Python code will need to be cropped using a tool such as pdfcropmargins (several online services do the same thing).
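The cropping step can be scripted rather than done by hand. The following is only a sketch: it assumes the `pdf-crop-margins` command is installed and on your path, and that its `-p` (percentage of margin to retain) and `-o` (output file) options behave as described in that tool's own documentation. The helper names `build_crop_command` and `crop` are made up for this illustration and are not part of this repository.

```python
import subprocess


def build_crop_command(src, dest, percent_retain=0):
    """Return the pdf-crop-margins command line as a list of arguments.

    percent_retain=0 asks for the tightest crop (assumed meaning of -p).
    """
    return ["pdf-crop-margins", "-p", str(percent_retain), src, "-o", dest]


def crop(src, dest, percent_retain=0):
    """Run the cropper; raises CalledProcessError if the tool reports failure."""
    subprocess.check_call(build_crop_command(src, dest, percent_retain))
```

Calling `crop('abrupt-4xCO2.pdf', 'abrupt-4xCO2-cropped.pdf')` after rendering would then write a cropped copy next to the original.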
Dependencies:
- pyesdoc - which is pip installable.
- jinja2 - also pip installable.
- weasyprint - you'll need the python2 version, which is pip installable:
pip install weasyprint==0.42.3
- Unfortunately there is also a dependency on system GTK libraries, which may or may not be a problem for some. How I got it to work on macOS is documented in the code.
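Because a missing system library tends to show up only when weasyprint is imported, it can help to check the dependencies before running anything else. This is a minimal sketch, not part of the repository: `missing_modules` is a hypothetical helper, and it assumes the three packages are imported under the names `pyesdoc`, `jinja2` and `weasyprint`.

```python
import importlib


def missing_modules(names):
    """Return (name, reason) pairs for every module that fails to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception as err:
            # ImportError if the package is absent; other errors (e.g. OSError)
            # can occur when a system library it needs cannot be loaded.
            missing.append((name, str(err)))
    return missing


if __name__ == "__main__":
    for name, reason in missing_modules(["pyesdoc", "jinja2", "weasyprint"]):
        print("%s is not usable: %s" % (name, reason))
```

If nothing is printed, all three imports succeeded.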
All the code in this version is in one script. The test cases, which can be run by simply typing python experiment.py, show several ways to use it and cover:
- For Experiment or MIP documents, outputting a one or two column table, and
- Creating a table which describes CMIP6.
An example [1] of the first is:
An example of using these classes in a real-life application (producing tables for a paper) is the following:
import os
from experiment import Repo, Experiment, Mip, CMIP6

# ES-DOC API endpoint
os.environ['ESDOC_API'] = 'https://api.es-doc.org'

experiments = ['abrupt-4xCO2', 'land-NoFire', 'g6sulfur', 'g7sst1-cirrus']
widths = [True, False, True, False]  # wide= setting for each experiment, in order
mips = ['CMIP', 'DECK', 'GeoMIP', 'CMIP6']

# Fetch the experiment and MIP documents by name
r = Repo()
docs = [r.getbyname(e) for e in experiments]
mdocs = [r.getbyname(m, 'mip') for m in mips]

# One PDF table per experiment
for e, d, w in zip(experiments, docs, widths):
    E = Experiment(d)
    E.render('%s.pdf' % e, wide=w)

# One PDF table per MIP
for m, d in zip(mips, mdocs):
    M = Mip(d)
    M.render('%s.pdf' % m, wide=True)

# Table describing CMIP6, plus the references it collected
C = CMIP6()
C.render('cmip6.pdf')
print(C.reference_list)
print(C.nocite)
Users who want to change the layout of the tables, or develop new tables, may need to understand pyesdoc, which does the heavy lifting via Python class instances for manipulating ESDOC documents.
- If you only want to change the layout, you can work with the jinja2 templates alone; you won't need to understand pyesdoc.
- It would be relatively straightforward to create new templates for LaTeX table output. I didn't do that in this version since the results were pretty ugly for this purpose (at least without using some of the fancier table packages). I expect that for future model comparison tables someone (maybe me) will add LaTeX table output.
- The documentation for pyesdoc itself is rather sparse right now, but you can't go too far wrong by just playing at the console and using introspection.
- You can also inspect the attributes and structure by looking at the canonical schema description. For example, this code is the definition of an Experiment, and it shows the attributes which an Experiment instance must have.
- A key concept you will need to deal with is that the documents are all linked together. As you navigate around them pythonically you will get the title and uid and types of other documents in links (e.g. the Experiment references Requirements), and sometimes you will need to pull the full description of those via the UID. Code to do this is included. Note how the additional requirements are found for the example above.
- If you get some new things working please contribute them back here!
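To make the linked-document idea concrete, here is a toy sketch of following links by UID. It deliberately does not use pyesdoc: the `Link` class, the `resolve_links` helper, the `fetch` callable and the in-memory `store` are all hypothetical stand-ins, and the real lookup code lives in the script in this repository. The only point is the pattern: a link carries a title, a UID and a type, and the full document is pulled (once) when you need it.

```python
class Link(object):
    """Lightweight reference to another document (hypothetical stand-in)."""

    def __init__(self, uid, title, doc_type):
        self.uid = uid
        self.title = title
        self.doc_type = doc_type


def resolve_links(links, fetch, cache=None):
    """Return the full document for each link, fetching each UID at most once."""
    cache = {} if cache is None else cache
    docs = []
    for link in links:
        if link.uid not in cache:
            cache[link.uid] = fetch(link.uid)
        docs.append(cache[link.uid])
    return docs


# Toy repository: a dict keyed by UID stands in for the remote service.
store = {
    "r1": {"title": "Forcing requirement", "type": "requirement"},
    "r2": {"title": "Ensemble requirement", "type": "requirement"},
}
calls = []


def fetch(uid):
    """Record the lookup and return the stored document."""
    calls.append(uid)
    return store[uid]


links = [
    Link("r1", "Forcing requirement", "requirement"),
    Link("r2", "Ensemble requirement", "requirement"),
    Link("r1", "Forcing requirement", "requirement"),  # repeated reference
]
docs = resolve_links(links, fetch)
```

Caching by UID matters because the same requirement is often referenced from several places, and each uncached lookup would otherwise be a separate request.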
This would not have been possible without some initial help from Mark Greenslade, the author of pyesdoc.
[1]: Note that this is a poor-quality jpg rendering of the actual pdf in the repository. The workflow for this was to crop the original pdf output using pdf-crop-margins, then convert to jpg using an online converter (my usual go-to for this sort of work, ImageMagick, didn't like the grayscale).