TeX 2 JATS XML for Seismica

Introduction

There are two ways to run the conversion scripts. The Docker-based setup requires only Docker and manages all other dependencies automatically. Alternatively, you can install the dependencies yourself and run the shell script, which calls the Python files that perform the conversion.

Follow these steps in order:

Base Requirements for installation
Pre-conversion Requirements
Workflow Docker Option or Dependencies Option
Post-conversion checks

Base Requirements

Clone the GitHub repository to your local machine. cd into the desired directory and run:
```
git clone https://github.com/WeAreSeismica/tex2jats.git
```

Pre-conversion requirements

Before running the TeX2JATS converter, you need to:

Have produced final .tex galleys with updated metadata (proof accepted by authors)
Check that the date format is correct (Month dd, YYYY)
Check that the obsolete command seistable is not used in the .tex file
Convert every figure to PNG format if not already done. The following bash command converts every PDF file whose name starts with fig to PNG (requires ImageMagick; adjust -density if needed):
mogrify -verbose -quality 00 -density 250 -format png ./fig*.pdf
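
The date and seistable checks above can also be scripted. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes you pass in the date string yourself, since how the date is marked up in the galley is not described here. Note that strptime also accepts a one-digit day and expects English month names under the default locale.

```python
import re
from datetime import datetime


def date_is_valid(text):
    """Return True if text parses as 'Month dd, YYYY', e.g. 'March 05, 2024'."""
    try:
        # %B is the full month name, %d the day of month, %Y the 4-digit year.
        datetime.strptime(text.strip(), "%B %d, %Y")
    except ValueError:
        return False
    return True


def uses_seistable(tex_source):
    """Return True if the word 'seistable' appears outside TeX comments."""
    for line in tex_source.splitlines():
        # Keep only the part of the line before an unescaped % (a TeX comment).
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        if "seistable" in code:
            return True
    return False
```

Searching for the bare word catches both a command and an environment form, at the cost of also flagging the word if it appears in running text.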

Workflow Option 1 - Using Docker

Install Docker if you don't have it already.
Start up Docker. Usually you will have an application called "Docker" on your computer with a rudimentary graphical user interface (GUI). On macOS, you can also start it from the command-line interface (CLI):
```
open -a Docker
```
Move the directory containing the LaTeX files to be converted into the git repo tex2jats.
Open the file env.set, and edit the parameters in the file as described there. Spaces in names are accepted.
Open a shell and navigate to wherever the git repo is cloned, which now includes the LaTeX files directory.
```
cd path/to/git/repo/tex2jats
```
You can always run pwd to check whether you're in the right place.
Run docker-compose by entering the command below in your favorite shell.
```
docker-compose up -d
```
This should run the conversion and create all the necessary files. Conversion may take up to a minute. You can check its status with:
```
docker-compose logs
```
Verify that there are XML files in the LaTeX files directory.
Shut down docker-compose by entering:
```
docker-compose down
```

Workflow Option 2 - Installing Dependencies

Dependencies

These dependencies are shared with the Seismica docx/odt parsing module:

python 3.n (preferably 3.8+)
numpy
beautifulsoup (+ lxml parser on macOS)
pandoc
biblib

Other dependencies:

python3 datetime (part of the Python standard library, so normally already available)
GNU sed, v4.8
perl, v5 or later (earlier versions not tested)

Option 2 Instructions

Copy tex2xml.sh, apa.csl, and the Python scripts (including cleanjats.py) to your current working directory (CWD, where the TeX galley is):
cd /cwd/
cp /path/to/tex2jats/tex2xml.sh ./
cp /path/to/tex2jats/*.py ./
cp /path/to/tex2jats/apa.csl ./
In your CWD, use tex2xml.sh to convert the TeX to JATS XML:
./tex2xml.sh proof biblio, or ./tex2xml.sh proof biblio math to enable math mode
where:
- proof, the name of the TeX galley without the extension (which should be .tex)
- biblio, the name of the corrected list of references, without the extension (which should be .bib)
- math, to activate the math mode (to use if math does not print correctly, especially if amsmath TeX package is in use)
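
If you call the script from Python, the argument rules above (no extensions, optional math keyword) can be captured in a small helper. This is only a sketch: build_tex2xml_command is a hypothetical name, not something the repository provides, and it assumes file names without extra dots.

```python
from pathlib import Path


def build_tex2xml_command(galley, bibliography, math=False):
    """Return the argument list for ./tex2xml.sh, with file extensions stripped."""
    galley_path = Path(galley)
    bib_path = Path(bibliography)
    # Accept names given with or without the expected extension.
    if galley_path.suffix not in ("", ".tex"):
        raise ValueError(f"expected a .tex galley, got {galley_path.name}")
    if bib_path.suffix not in ("", ".bib"):
        raise ValueError(f"expected a .bib file, got {bib_path.name}")
    command = ["./tex2xml.sh", galley_path.stem, bib_path.stem]
    if math:
        # Third positional argument switches on math mode.
        command.append("math")
    return command
```

The returned list can be passed to subprocess.run(..., check=True) from the CWD that holds the galley and the copied scripts.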

Post conversion checks

Either of the workflows above will output the following files in the LaTeX files directory:
- proof.xml
- proof_metadata.jats
- proof_credits.jats
- proof_galley.xml
- proof_tab1.tex if you have one table (see the item on tables under Initial checks), and one similar file per table
- proof_tab1.xml

You will then work with the XML galley only (proof_galley.xml). The other files are kept only for corrections if needed.
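
A quick way to confirm that the conversion produced everything is to compare the directory against the list above. The sketch below assumes the naming pattern shown (stem "proof", tables numbered from 1); the function names are hypothetical.

```python
from pathlib import Path


def expected_outputs(stem, n_tables=0):
    """List the file names the conversion should produce for a galley stem."""
    names = [
        f"{stem}.xml",
        f"{stem}_metadata.jats",
        f"{stem}_credits.jats",
        f"{stem}_galley.xml",
    ]
    # Each table yields one .tex and one .xml file, numbered from 1.
    for index in range(1, n_tables + 1):
        names += [f"{stem}_tab{index}.tex", f"{stem}_tab{index}.xml"]
    return names


def missing_outputs(directory, stem, n_tables=0):
    """Return the expected file names that are absent from directory."""
    folder = Path(directory)
    return [
        name
        for name in expected_outputs(stem, n_tables)
        if not (folder / name).is_file()
    ]
```

An empty list from missing_outputs(".", "proof", n_tables=2) means all expected files are present.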

Initial checks (with a text editor)

Open the galley in a web browser; it should not show any errors. The browser reports the line of each error, which helps with debugging.
Correct metadata if needed: authors' names, affiliations, DOI, title, and other metadata. When an author has multiple affiliations, stray repeated periods often appear.
Check that cross-references to figures (xref) look correct. They should be printed with a format similar to:
<xref ref-type="fig" alt="1" rid="fig1">1</xref>
Check the acknowledgements and references
If there are tables in your TeX galley, tex2xml.sh exports two files for each table: tabxx.tex and tabxx.xml, with xx ranging from 1 to the total number of tables in the article.

-> If the table is simple and has been converted properly by pandoc + cleanjats.py:

Check and correct the caption of the table in the XML file for references or formulas that have not been converted.

-> If the table is too complex and has not been properly converted:

Correct any unwanted symbols in tabxx.tex
Translate tabxx.tex to HTML with https://tableconvert.com/latex-to-html
Copy the HTML code into the XML table file tabxx.xml, where indicated
Replace the wrong table code in the XML galley proof_galley.xml with the updated tabxx.xml. In the XML galley, tables are under a table-wrap environment.
Check and correct the caption of the table in the XML file for references or formulas that have not been converted.
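
The well-formedness and figure cross-reference checks can be partly automated. The following sketch uses the standard-library XML parser and assumes the xref format shown above (ref-type="fig" with an rid starting with "fig"); the function name is hypothetical, and a galley that relies on externally defined entities may need a different parser.

```python
import xml.etree.ElementTree as ET


def figure_xref_problems(galley_xml):
    """Return a list of problems found in the galley's figure cross-references."""
    try:
        root = ET.fromstring(galley_xml)
    except ET.ParseError as err:
        # Same kind of error a web browser reports, including its position.
        return [f"not well-formed XML: {err}"]

    problems = []
    for xref in root.iter("xref"):
        if xref.get("ref-type") != "fig":
            continue  # only figure references are checked here
        rid = xref.get("rid") or ""
        label = (xref.text or "").strip()
        if not rid.startswith("fig"):
            problems.append(f"unexpected rid {rid!r}")
        elif not label:
            problems.append(f"empty label for rid {rid!r}")
    return problems
```

Reading the galley with Path("proof_galley.xml").read_bytes() and passing the bytes lets the parser honor the encoding declared in the file.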

Final checks (with the OJS preview tool)

Upload the XML galley to the OJS website.
- You can rename the XML galley
- Images need to be uploaded separately (no need to fill in the caption etc.)
- Use the preview tool to open the XML Lens Viewer
Check that title, authors, and all metadata are printed correctly (in the main text page but also the info tab)
Check that cross-references to figures, tables, and references are correct
Tables are printed correctly
Acknowledgements are printed correctly
References (in the References tab) are printed correctly.

Known issues:

References and math formulas in table captions are often not converted properly
In metadata, credits, or acknowledgements, symbols are often not converted properly or are still escaped when they should not be: look for \&, \%, and repeated dots and/or commas (.., .,).
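
Searching for these leftovers by hand is tedious, so a small scan can help. This is a sketch under the assumption that only the sequences listed above matter; the names are hypothetical, and legitimate text such as "et al.," will also be flagged, so treat the output as a list to review rather than a list of errors.

```python
import re

# Sequences named in the known issues: escaped ampersands and percent signs,
# and doubled punctuation such as ".." or ".,".
SUSPECT = re.compile(r"\\&|\\%|\.[.,]")


def find_suspect_symbols(text):
    """Return (line_number, matched_text) pairs for each suspicious sequence."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((number, match.group()))
    return hits
```

Running it on the contents of proof_metadata.jats, proof_credits.jats, and the galley gives line numbers to jump to in a text editor.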

TO DO

Correctly parse math expressions and references within table captions
