clinaresl/pgnparser

Introduction

This tool parses PGN files, shows a summary of their contents, sorts and filters games by any criteria, and produces histograms on any piece of data. In addition, it generates LaTeX files that can be processed with pdflatex or xelatex to produce pdf files with the contents of the games in any PGN file.

While it can be used with any PGN file, it is specifically designed to parse and process the PGN files generated by Lichess and games from the FICS database.

pgnparser has been conceived as a command-line tool. However, it also provides a simple API for programmers who want to import this module. Special care has been put into the documentation style so that go doc provides reliable and accurate information on how to use this module as third-party software.

External dependencies

pgnparser uses the following third-party libraries:

  • expr is used to parse expressions written in Go. Indeed, this library made a huge difference in the development of pgnparser.

  • table is used to generate fancy tables on the console output.

These modules are automatically installed when following the directives given under Install.

Install

To install this program for development in Go:

   $ go get -v github.com/clinaresl/pgnparser

To install an executable in your system using the Go toolchain:

   $ go install github.com/clinaresl/pgnparser

Usage

pgnparser has a mandatory argument, --file, that must be used to provide a path to a PGN file. If no other directives are given, pgnparser prints out general information.

It also honours other optional arguments:

  • list: generates a table with information of every game parsed.

  • play: generates a table on the standard output where every game is played. It must be given an argument, nbplies. The table shows a sequence of moves along with the resulting board every nbplies plies.

    Even if this argument is not given, all games found in the input pgn file are played to verify correctness. If a game cannot be properly parsed, an error is produced and execution halts.

  • filter: generates a new pgn file with those games in the input pgn file satisfying the input criteria. Filtering criteria are described below.

  • sort: generates a new pgn file where games are sorted according to the given criteria. Sorting criteria are described below.

  • histogram: generates a table with a summary of information about the given variables. Histogram variables are described below.

Because some of these options can generate new files (namely, --filter and --sort), it is possible to provide the directive --output with the name of the pgn file to generate. If none is given, the file output.pgn is produced, overwriting its previous contents in case the file already exists.

All in all, pgnparser has been designed with flexibility of use in mind. Sorting/filtering criteria and histogram variables result from this approach. Also, both the tables produced with list and the LaTeX files can use different templates that affect the look and feel of the output. To modify the design of the table shown with list, provide table with the path to the template to use. To generate a LaTeX file, provide latex with the path to the LaTeX template to use.

These options can be used simultaneously. If so:

  1. First, games found in the input pgn file are listed when using list, i.e., list takes precedence over all the other arguments. If a value was given to table then the given template is used to generate the table.

  2. If play is given along with a strictly positive value, all games found in the pgn file are played and the output is shown in tabular form on the output console.

  3. If filter is given, games satisfying the given criteria are selected for the next operations, including the generation of an output pgn file.

  4. If sort is given, the current collection of games is sorted according to the given criteria: either all games found in the input pgn file, or those filtered when used in conjunction with filter.

  5. The current collection of games (whether sorted and/or filtered or not) can then be examined to produce a summary of information with histogram.

  6. Finally, in case latex is used, the current collection of games is used to generate a LaTeX file showing its contents.

Listing games

Using list to provide information about the games found in a pgn file:

    $ pgnparser --file ... --list

produces the following output:

 ▶ Name    : examples/lichess_clinares_2024-05-15.pgn
 ▶ Size    : 3341933 bytes                           
 ▶ Mod Time: 2024-05-15 18:51:20.957491023 +0200 CEST
 ════════════════════════════════════════════════════
 [1.091136ms]

 3845 games found
 [2.303645721s]

┍━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━┯━━━━━━━━━━━━━┯━━━━━━━┯━━━━━━━━┑
│    Date    │ White                WhiteElo │ Black                BlackElo │ ECO │ TimeControl │ Moves │ Result │
╞════════════╪═══════════════════════════════╪═══════════════════════════════╪═════╪═════════════╪═══════╪════════╡
│ 2024.05.15 │ Don_jon10                1908 │ clinares                 1901 │ C40 │    180+0    │  55   │  1-0   │
│ 2024.05.15 │ clinares                 1907 │ Don_jon10                1903 │ C02 │    180+0    │  60   │  0-1   │
│ 2024.05.15 │ clinares                 1911 │ Simonenko                2022 │ C02 │    180+0    │  104  │  0-1   │
│ 2024.05.15 │ clinares                 1905 │ Ikrom_Isayev             1893 │ C25 │    180+0    │  31   │  1-0   │
│ 2024.05.15 │ clinares                 1900 │ behzadss                 1915 │ B23 │    180+0    │  67   │  1-0   │
│ 2024.05.15 │ behzadss                 1909 │ clinares                 1905 │ C40 │    180+0    │  91   │  1-0   │
│ 2024.05.14 │ Qsac                     1903 │ clinares                 1911 │ C63 │    180+0    │  53   │  1-0   │
│ 2024.05.14 │ clinares                 1914 │ Revyakin_Andrey          2064 │ A00 │    180+0    │  48   │  0-1   │
│ 2024.05.14 │ Revyakin_Andrey          2060 │ clinares                 1918 │ C40 │    180+0    │  43   │  1-0   │
│ 2024.05.14 │ clinares                 1921 │ Revyakin_Andrey          2056 │ B23 │    180+0    │  70   │  0-1   │
├────────────┼───────────────────────────────┼───────────────────────────────┼─────┼─────────────┼───────┼────────┤
│ 2024.05.14 │ Khezman                  1890 │ clinares                 1916 │ C46 │    180+0    │  128  │  0-1   │
│ 2024.05.14 │ clinares                 1911 │ Khezman                  1895 │ A00 │    180+0    │  113  │  1-0   │
      ...        ...                     ...      ...                    ...   ...       ...        ...     ...
│ 2024.01.01 │ clinares                 1906 │ tat8866                  1955 │ C25 │    180+0    │  89   │  1-0   │
│ 2024.01.01 │ clinares                 1901 │ Ks12345                  1845 │ C25 │    180+0    │  35   │  1-0   │
│ 2024.01.01 │ BonbonTisoy              1941 │ clinares                 1906 │ C23 │    180+0    │  96   │  1-0   │
┕━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━┷━━━━━━━━━━━━━┷━━━━━━━┷━━━━━━━━┙
 # Games found: 3845

 Games verified!
 [286.922097ms]

To change the appearance of the table, use table with the path to the template to use:

    $ pgnparser --file ... --table templates/table/bare.tpl

(note that when using table there is no need to also provide list, though it can be done)

This produces a much more concise output:

 ▶ Name    : examples/lichess_clinares_2024-05-15.pgn
 ▶ Size    : 3341933 bytes                           
 ▶ Mod Time: 2024-05-15 18:51:20.957491023 +0200 CEST
 ════════════════════════════════════════════════════
 [1.100194ms]

 3845 games found
 [2.227093864s]

 ┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━┑
 │ White                WhiteElo │ Black                BlackElo │ Result │
 ╞═══════════════════════════════╪═══════════════════════════════╪════════╡
 │ Don_jon10                1908 │ clinares                 1901 │  1-0   │
 │ clinares                 1907 │ Don_jon10                1903 │  0-1   │
 │ clinares                 1911 │ Simonenko                2022 │  0-1   │
      ...        ...                     ...      ...                    ...   ...       ...        ...     ...
 │ clinares                 1906 │ tat8866                  1955 │  1-0   │
 │ clinares                 1901 │ Ks12345                  1845 │  1-0   │
 │ BonbonTisoy              1941 │ clinares                 1906 │  1-0   │
 ┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━┙
 # Games found: 3845

 Games verified!
 [308.763738ms]

Note: All tables use UTF-8 characters which might not be rendered properly in this view, i.e., the view on your console might be more beautiful than the one rendered here.

Playing games

Games can be automatically played on the console. When play is given a strictly positive argument, a table is generated with the result of playing every game found in the input file. Each game starts with the information given in the tags of the pgn file, and every subsequent row shows a sequence of moves and the resulting board. The number of plies shown in each row is the argument given to play.

For example, to play all games found in a pgn file every 30 plies:

    $ pgnparser --file ... --play 30

produces an outcome like the following:

 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    Result      : 0-1                            
    WhiteElo    : 1842                           
    PlyCount    : 38                             
    TimeControl : 180+0                          
    Opening     : Hungarian Opening              
    Event       : Rated game                     
    Site        : http://lichess.org/6LAIFOp6    
    White       : yerken                         
    Date        : 2016.05.06                     
    Variant     : Standard                       
    ECO         : A00                            
    Black       : clinares                       
    BlackElo    : 1989                           
    Termination : Normal                         
 ────────────────────────────────────────────────
  1. g3 e5                                       
  2. Bg2 d5                                      
  3. d3 c6                        ╔════════╗     
  4. Nc3 f5                       ║♜▒♝♝♚▒ ♜║     
  5. e4 fxe4                      ║♙ ▒ ▒ ♟♟║     
  6. dxe4 d4                      ║ ▒♞▒ ♞ ▒║     
  7. Nce2 Nf6                     ║▒ ▒ ♟ ▒ ║     
  8. c3 c5                        ║ ♟ ▒♙▒ ▒║     
  9. b4 b6                        ║▒ ♙ ▒♘♙ ║     
  10. a4 Be7                      ║ ▒ ▒♟♙♗♙║     
  11. a5 d3                       ║♖ ♗ ♔ ▒♖║     
  12. axb6 dxe2                   ╚════════╝     
  13. Qxd8+ Bxd8                                 
  14. bxa7 Nc6                                   
  15. Nf3 cxb4                                   
                                                 
                                  ╔════════╗     
                                  ║♜▒ ▒♚▒ ♜║     
                                  ║♙ ▒ ▒ ♟♟║     
  16. cxb4 Bb6                    ║♝▒ ▒ ♞ ▒║     
  17. b5 Nd4                      ║▒♙▒ ♟ ▒ ║     
  18. Nxd4 Bxd4                   ║ ▒ ♝♙▒ ▒║     
  19. Ra6 Bxa6                    ║▒ ▒ ▒ ♙ ║     
                                  ║ ▒ ▒♟♙♗♙║     
                                  ║▒ ♗ ♔ ▒♖║     
                                  ╚════════╝     

for every game found in the input pgn file.

Note: All tables use UTF-8 characters which might not be rendered properly in this view, i.e., the view on your console might be more beautiful than the one rendered here.

Filtering criteria

PGN files always start with a header and a set of tags that can be used for filtering games. A filtering criterion is simply an evaluable expression in Go (whose syntax, for the purpose here, is much the same as in almost any programming language) that uses variables, either those appearing in the header of the PGN file or those implemented by pgnparser (described below).

For example,

    $ pgnparser --file ... --filter 'ECO=="C25"'

generates a pgn file named output.pgn with those games in the input pgn file with ECO=C25. Note that the argument to filter is given between single quotes because the evaluation of an expression in Go expects strings to be given between double quotes. Variables can be combined in any way, e.g.:

    $ pgnparser --file ... --filter 'ECO=="C25" && ((White=="clinares" && Result=="0-1") || (Black=="clinares" && Result=="1-0"))'

returns all games with opening C25 that were lost either with black or white by one specific player.

pgnparser provides an additional variable, Moves, which is reminiscent of PlyCount used in FICS. Because this variable is not provided by lichess, it is computed on the fly. It is a numerical variable and thus:

    $ pgnparser --file ... --filter 'Moves<40 && ((White=="clinares" && Result=="0-1") || (Black=="clinares" && Result=="1-0"))'

filters all games lost by one specific player with either color in fewer than 40 moves (or plies).

Note that the argument --list takes precedence over filter, so no information about the result of filtering games is shown on the console. To see the result use:

    $ pgnparser --file ... --list

providing the name of the output file given to the previous invocation of pgnparser.
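
To give an idea of how filtering expressions such as the ones above can be evaluated, the following sketch uses only the Go standard library (go/parser and go/ast) instead of the expr library that pgnparser relies on. The Game type and the functions eval, apply and Match are assumptions made for the example, and only a small subset of operators is supported.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// Game is a hypothetical stand-in for a parsed game: tag name -> value.
// Numeric variables such as Moves are stored as int, the rest as string.
type Game map[string]any

// eval evaluates a small subset of Go expressions against the tags of g.
// Both operands are always evaluated (there is no short-circuiting).
func eval(e ast.Expr, g Game) (any, error) {
	switch n := e.(type) {
	case *ast.ParenExpr:
		return eval(n.X, g)
	case *ast.Ident:
		v, ok := g[n.Name]
		if !ok {
			return nil, fmt.Errorf("unknown variable %q", n.Name)
		}
		return v, nil
	case *ast.BasicLit:
		switch n.Kind {
		case token.STRING:
			return strconv.Unquote(n.Value)
		case token.INT:
			return strconv.Atoi(n.Value)
		}
	case *ast.BinaryExpr:
		l, err := eval(n.X, g)
		if err != nil {
			return nil, err
		}
		r, err := eval(n.Y, g)
		if err != nil {
			return nil, err
		}
		return apply(n.Op, l, r)
	}
	return nil, fmt.Errorf("unsupported expression %T", e)
}

// apply computes "l op r" for operands of the same type.
func apply(op token.Token, l, r any) (any, error) {
	switch a := l.(type) {
	case bool:
		if b, ok := r.(bool); ok && op == token.LAND {
			return a && b, nil
		} else if ok && op == token.LOR {
			return a || b, nil
		}
	case string:
		if b, ok := r.(string); ok && op == token.EQL {
			return a == b, nil
		} else if ok && op == token.NEQ {
			return a != b, nil
		}
	case int:
		b, ok := r.(int)
		if !ok {
			break
		}
		switch op {
		case token.EQL:
			return a == b, nil
		case token.NEQ:
			return a != b, nil
		case token.LSS:
			return a < b, nil
		case token.GTR:
			return a > b, nil
		}
	}
	return nil, fmt.Errorf("cannot apply %s to %T and %T", op, l, r)
}

// Match reports whether game g satisfies the given filtering criteria.
func Match(criteria string, g Game) (bool, error) {
	e, err := parser.ParseExpr(criteria)
	if err != nil {
		return false, err
	}
	v, err := eval(e, g)
	if err != nil {
		return false, err
	}
	b, ok := v.(bool)
	if !ok {
		return false, fmt.Errorf("criteria does not evaluate to a boolean")
	}
	return b, nil
}

func main() {
	g := Game{"ECO": "C25", "White": "clinares", "Black": "tat8866", "Result": "1-0", "Moves": 89}
	fmt.Println(Match(`ECO=="C25" && Moves<40`, g)) // false <nil>
}
```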

Sorting criteria

Sorting criteria consist of a semicolon-separated string of variables (either those appearing in the tags of the pgn games or those provided by pgnparser), each preceded by either < (for ascending order) or > (for descending order). For example, to generate a pgn file showing first the most recently played games and then breaking ties in ascending order of the number of moves:

    $ pgnparser --file ... --sort ">Date;<Moves"

Note that the argument --list takes precedence over sort, so no information about the result of sorting games is shown on the console. To see the result use:

    $ pgnparser --file ... --list

providing the name of the output file given to the previous invocation of pgnparser.
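
The sorting criteria described above can be interpreted along the following lines. This is a sketch under stated assumptions: the types Game and Key and the functions ParseSort and SortGames are hypothetical names introduced for the example, and values are compared either as integers or as strings (dates in the form 2024.05.15 sort correctly as strings).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Game is a hypothetical stand-in for a parsed game: tag name -> value.
type Game map[string]any

// Key is one sorting criterion: a variable name and a direction.
type Key struct {
	Name string
	Desc bool
}

// ParseSort splits a specification such as ">Date;<Moves" into keys.
func ParseSort(spec string) ([]Key, error) {
	var keys []Key
	for _, f := range strings.Split(spec, ";") {
		f = strings.TrimSpace(f)
		if len(f) < 2 || (f[0] != '<' && f[0] != '>') {
			return nil, fmt.Errorf("invalid sorting criterion %q", f)
		}
		keys = append(keys, Key{Name: f[1:], Desc: f[0] == '>'})
	}
	return keys, nil
}

// compare returns -1, 0 or 1; values of different types compare as equal.
func compare(a, b any) int {
	switch x := a.(type) {
	case int:
		if y, ok := b.(int); ok {
			switch {
			case x < y:
				return -1
			case x > y:
				return 1
			}
		}
	case string:
		if y, ok := b.(string); ok {
			return strings.Compare(x, y)
		}
	}
	return 0
}

// SortGames sorts games in place; later keys only break ties of earlier ones.
func SortGames(games []Game, keys []Key) {
	sort.SliceStable(games, func(i, j int) bool {
		for _, k := range keys {
			c := compare(games[i][k.Name], games[j][k.Name])
			if k.Desc {
				c = -c
			}
			if c != 0 {
				return c < 0
			}
		}
		return false
	})
}

func main() {
	games := []Game{
		{"Date": "2024.05.14", "Moves": 70},
		{"Date": "2024.05.15", "Moves": 104},
		{"Date": "2024.05.15", "Moves": 31},
	}
	keys, err := ParseSort(">Date;<Moves")
	if err != nil {
		panic(err)
	}
	SortGames(games, keys)
	for _, g := range games {
		fmt.Println(g["Date"], g["Moves"])
	}
}
```

A stable sort is used so that games that tie on every key keep the order in which they appear in the input.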

Histogram variables

histogram can be used to produce a summary (in tabular form) of the games in the input pgn file. All sorts of variables can be used (either those appearing in the tags of the games, or those provided by pgnparser). Histogram variables must be given in a semicolon-separated string which uses either variables or boolean expressions using variables:

  • For every variable given the histogram produces a count of the number of occurrences for every value found for the given variable.

  • For every boolean expression given the histogram provides information on the frequency of every feasible outcome, either true or false. If any of these outcomes never takes place it is skipped in the output.

The histogram is sorted in ascending order of the values of the variables. In case of using boolean expressions false is shown before true.

It is possible to provide an arbitrary number of histogram variables. If so, observations are nested. For example, to produce a histogram on the number of games won and lost with every opening found in the input pgn file:

    $ pgnparser --file ... --histogram 'ECO;((White=="clinares") && (Result=="1-0")) || ((Black=="clinares") && (Result=="0-1"))'

produces a table with three columns: first, the ECO is shown; next, for every value of ECO two rows are shown and tagged as either false or true. The third column shows the frequency of the combined occurrence for every combination of the ECO variable and the outcome of the boolean expression.
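
The nested counting behind such a table can be sketched as follows, assuming a fixed pair of variables (the ECO code and the outcome of the boolean expression). The types Game and Bin and the functions Histogram and Sorted are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Game holds the few tags needed for this example.
type Game struct {
	ECO, White, Black, Result string
}

// Bin identifies one row of the nested histogram.
type Bin struct {
	ECO string
	Win bool
}

// won evaluates the boolean expression used in the command above.
func won(g Game, player string) bool {
	return (g.White == player && g.Result == "1-0") ||
		(g.Black == player && g.Result == "0-1")
}

// Histogram counts the games falling in every (ECO, Win) combination.
// Combinations that never occur are simply absent from the result.
func Histogram(games []Game, player string) map[Bin]int {
	counts := make(map[Bin]int)
	for _, g := range games {
		counts[Bin{g.ECO, won(g, player)}]++
	}
	return counts
}

// Sorted returns the bins in ascending order of ECO, false before true.
func Sorted(counts map[Bin]int) []Bin {
	bins := make([]Bin, 0, len(counts))
	for b := range counts {
		bins = append(bins, b)
	}
	sort.Slice(bins, func(i, j int) bool {
		if bins[i].ECO != bins[j].ECO {
			return bins[i].ECO < bins[j].ECO
		}
		return !bins[i].Win && bins[j].Win
	})
	return bins
}

func main() {
	games := []Game{
		{ECO: "C25", White: "clinares", Black: "tat8866", Result: "1-0"},
		{ECO: "C40", White: "Don_jon10", Black: "clinares", Result: "1-0"},
		{ECO: "C25", White: "clinares", Black: "Ks12345", Result: "1-0"},
	}
	counts := Histogram(games, "clinares")
	for _, b := range Sorted(counts) {
		fmt.Println(b.ECO, b.Win, counts[b])
	}
}
```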

The output histogram shows a header with the name of each variable used. When using boolean expressions the header is the entire boolean expression given, which might not be very informative. For this reason, any variable or boolean expression can be preceded by a header and a colon, and that header is then used in the output table:

    $ pgnparser --file ... --histogram 'ECO Opening: ECO;Win: ((White=="clinares") && (Result=="1-0")) || ((Black=="clinares") && (Result=="0-1"))'

produces a much more concise table with more meaningful names for the headers:

 ▶ Name    : examples/lichess_clinares_2024-05-15.pgn
 ▶ Size    : 3341933 bytes                           
 ▶ Mod Time: 2024-05-15 18:51:20.957491023 +0200 CEST
 ════════════════════════════════════════════════════
 [279.511µs]

 3845 games found
 [2.17623463s]

 Games verified!
 [273.469723ms]


 ECO Opening │  Win  │ # Obs. 
 ━━━━━━━━━━━━┿━━━━━━━┿━━━━━━━━
     A00     │ false │  125   
             ├───────┼────────
             │ true  │   83   
 ────────────┼───────┼────────
     A01     │ false │   12   
             ├───────┼────────
             │ true  │   19   
 ────────────┼───────┼────────
     A02     │ false │   25   
             ├───────┼────────
     ...        ...     ...

Note: All tables use UTF-8 characters which might not be rendered properly in this view, i.e., the view on your console might be more beautiful than the one rendered here.

Note that the argument --list takes precedence over histogram, so no information about the result of a histogram is shown on the console. To see the result use:

    $ pgnparser --file ... --list

providing the name of the output file given to the previous invocation of pgnparser.
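
Splitting a histogram specification into headers and expressions, as described in this section, could be done roughly as shown next. The Column type and the ParseHistogram function are hypothetical, and so is the rule used to recognize a header (the text before the first colon must contain neither quotes nor an equals sign); the sketch also assumes that no string literal contains a semicolon.

```go
package main

import (
	"fmt"
	"strings"
)

// Column is one histogram variable: the header to show and the expression
// (a variable or a boolean expression) to evaluate.
type Column struct {
	Header, Expr string
}

// ParseHistogram splits a semicolon-separated specification into columns.
// When no header is given, the expression itself is used as the header.
func ParseHistogram(spec string) []Column {
	var cols []Column
	for _, f := range strings.Split(spec, ";") {
		f = strings.TrimSpace(f)
		if f == "" {
			continue
		}
		header, expr, found := strings.Cut(f, ":")
		// Assumed heuristic: a colon that follows quotes or an equals sign
		// belongs to the expression, not to a header.
		if !found || strings.ContainsAny(header, `"'=`) {
			cols = append(cols, Column{Header: f, Expr: f})
			continue
		}
		cols = append(cols, Column{
			Header: strings.TrimSpace(header),
			Expr:   strings.TrimSpace(expr),
		})
	}
	return cols
}

func main() {
	for _, c := range ParseHistogram(`ECO Opening: ECO;Win: Result=="1-0"`) {
		fmt.Printf("%q -> %q\n", c.Header, c.Expr)
	}
}
```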

Generating LaTeX files

If the argument latex is given along with a path to a LaTeX template, then a LaTeX file is generated which is named after the value given to output (output.pgn by default) with suffix .tex. The latest release of pgnparser contains various templates. For example, to produce a detailed view of the contents of a pgn file use:

    $ pgnparser --file ... --latex templates/report/lichess/tabular.tpl

(note there are different templates for lichess and fics as they use different variables)

which generates a document with two parts. The first part is an index to every game (which might span several pages) with a link shown in the first column. The second part shows the result of playing every game in tabular form, showing the board every 8 plies. The transcription of the game contains all comments found in the input pgn file, and it also recognizes other special comments such as the emt (elapsed move time) used in FICS, which is shown separately. After processing the resulting LaTeX file twice, the document first shows an index to all games in the pgn file:

[Figure: index to all games in the pgn file]

Clicking on any of the links shown in red in the left column takes the reader to the selected game, e.g., clicking on #2 immediately shows a different page displaying the following:

[Figure: page showing the selected game]

This page also contains various links:

  • The world symbol shown above the main header takes you to the web page in lichess showing this game, with all tools enabled for analyzing the game and playing alternative lines.

  • The links shown in blue in the footer take you to the homepages of each software project.

Of course, these links are specific to the template being used, and different layouts can be produced with different templates. Importantly, some templates might require external files, as in the case of the template used in the example above (lichess/tabular.tpl), which requires an image of the lichess icon. Any external files required by a template are given under the directory latex, and can be freely replaced by others if needed.

Note that variables used in the templates might contain UTF-8 characters as they are read from the input pgn file. Fortunately, xelatex provides automatic conversion from UTF-8 characters to LaTeX symbols.

License

MIT License

Copyright (c) 2015, 2024, Carlos Linares López

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Author

Carlos Linares López
Computer Science Department https://www.inf.uc3m.es/en
Universidad Carlos III de Madrid https://www.uc3m.es/home
