Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 127 changed files with 36,532 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
# This file has been generated automatically with HTR-United <3 Github Actions form | ||
name: HTR United Workflow | ||
'on': | ||
- push | ||
- pull_request | ||
permissions: | ||
contents: write | ||
jobs: | ||
HTRUC: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install htruc | ||
- name: Run HTRUC | ||
run: | | ||
htruc test htr-united.yml | ||
HTR_United_Metadata_Generator: | ||
runs-on: ubuntu-latest | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install htr-united-metadata-generator htruc anybadge | ||
- name: Run Report | ||
run: | | ||
humGenerator --chars -n NFD --parse alto --group ./data/**/*.xml --github-envs --to-json updated_metrics.json | ||
cat envs.txt >> $GITHUB_ENV | ||
- name: Get HTR United Badge Template | ||
if: github.ref == 'refs/heads/main' | ||
uses: andymckay/get-gist-action@master | ||
with: | ||
gistURL: https://gist.github.com/PonteIneptique/7813bb99f234b334fbf9c6c429ec2406 | ||
- name: Automatically update the Catalog & the Badges | ||
if: github.ref == 'refs/heads/main' | ||
run: |- | ||
htruc update-volumes htr-united.yml updated_metrics.json --inplace | ||
# Generate badges | ||
mkdir -p badges | ||
anybadge --value=${{ env.HTRUNITED_CHARS }} --file=badges/characters.svg --label=Characters --color=#007ec6 --overwrite --template=${{ steps.get.outputs.file }} | ||
anybadge --value=${{ env.HTRUNITED_REGNS }} --file=badges/regions.svg --label=Regions --color=#007ec6 --overwrite --template=${{ steps.get.outputs.file }} | ||
anybadge --value=${{ env.HTRUNITED_LINES }} --file=badges/lines.svg --label=Lines --color=#007ec6 --overwrite --template=${{ steps.get.outputs.file }} | ||
anybadge --value=${{ env.HTRUNITED_FILES }} --file=badges/files.svg --label="XML Files" --color=#007ec6 --overwrite --template=${{ steps.get.outputs.file }} | ||
git config user.name github-actions | ||
git config user.email [email protected] | ||
git add htr-united.yml ./badges/ | ||
git commit -m "[Automatic] Update the Catalog & the Badges" || echo "Nothing to commit" | ||
git push || echo "Nothing to push" | ||
ChocoMufin: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install chocomufin | ||
- name: Run ChocoMufin | ||
run: | | ||
chocomufin generate table.csv ./data/**/*.xml | ||
cat table.csv | ||
HTRVX: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install htrvx | ||
- name: Run HTRVX | ||
run: | | ||
htrvx --verbose --group --format alto --check-empty --segmonto --xsd --raise-empty ./data/**/*.xml |
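As displayed above, the badge commands read `${{ steps.get.outputs.file }}`, but no step in the listing declares `id: get`. Whether that is intended cannot be determined from this diff alone. The following is a minimal sketch of a text-level check for such references. It is not part of HTR-United's tooling: the function name `undeclared_step_refs` and the sample snippet are hypothetical, and it uses regular expressions rather than a real YAML parser.

```python
import re
from typing import Set

# Matches expression references such as ${{ steps.get.outputs.file }}
STEP_REF = re.compile(r"steps\.([A-Za-z_][\w-]*)\.outputs")
# Matches a step-level `id:` key, with or without a leading "- "
STEP_ID = re.compile(r"^\s*(?:-\s+)?id:\s*([A-Za-z_][\w-]*)\s*$", re.MULTILINE)


def undeclared_step_refs(workflow_text: str) -> Set[str]:
    """Return step ids that are referenced in expressions but never declared."""
    referenced = set(STEP_REF.findall(workflow_text))
    declared = set(STEP_ID.findall(workflow_text))
    return referenced - declared


# Hypothetical two-step excerpt modelled on the workflow above
sample = """\
- name: Get HTR United Badge Template
  uses: andymckay/get-gist-action@master
- name: Generate badge
  run: anybadge --template=${{ steps.get.outputs.file }}
"""

print(undeclared_step_refs(sample))
```

If the check reports an id, adding the matching `id:` key to the step that produces the output is the usual remedy.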
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.DS_Store | ||
*/.DS_Store | ||
*/*/.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
cff-version: 1.2.0 | ||
title: Données OCR et segmentation des textes littéraires qui ont des liens avec l'oeuvre de M. Malingre | ||
type: dataset | ||
message: >- | ||
If you use this dataset, please cite it using the | ||
metadata from this file. | ||
authors: | ||
- family-names: Solfrini | ||
given-names: Sonia | ||
affiliation: University of Geneva | ||
orcid: 'https://orcid.org/0009-0009-7367-048X' | ||
repository-code: 'https://github.com/SETAFDH/HTR-Varia-Malingre' | ||
url: 'https://github.com/SETAFDH/HTR-Varia-Malingre' | ||
abstract: >- | ||
OCR data for Malingre research sub-project (SETAF project), 16th-century French prints in Gothic and Roman characters. | ||
keywords: | ||
- HTR | ||
- OCR | ||
- french | ||
- modern | ||
- prints | ||
license: CC-BY-4.0 | ||
version: 1.0 | ||
date-released: 2023-12-05 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
Dépôt GitHub,Identifiant,Segmentation,Transcription,Pages,Imprimeur,Titre bref,Auteur,Date,Lieu,Lien fac-similé numérique,Lieu de conservation | ||
SETAFDH/HTR-Varia-Malingre,e-rara 120199,gold,gold (romain),87,s.n.,Chanson spirituelle sur la saincte cene,[anonyme],1545,s.l.,https://doi.org/10.3931/e-rara-120199,Genève BGE Bd 1915 | ||
SETAFDH/HTR-Varia-Malingre,e-rara 12703,en cours,en cours (romain),50/141,Jean Girard,Les Pseaumes de David,Clément Marot,1550,Genève,https://doi.org/10.3931/e-rara-12703,Genève MHR O6e(550) | ||
SETAFDH/HTR-Varia-Malingre,onb 109B2E78,en cours,en cours (romain),65/271,[Jean Girard],Chrestienne Resiovyssance,Eustorg de Beaulieu,1546,[Genève],https://onb.digital/result/109B2E78,Vienne NB 80.M.74 | ||
SETAFDH/HTR-Varia-Malingre,Gallica bpt6k87118420,en cours,en cours (gothique sauf 4 pages au début en romain),58/180,Gryphius,Les Oeuvres de Clement Marot,Clément Marot,1538,Lyon,https://gallica.bnf.fr/ark:/12148/bpt6k87118420,Paris BNF RES-YE-1461 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
Dépôt GitHub,Identifiant,Segmentation,Transcription,Pages,Imprimeur,Titre bref,Auteur,Date,Lieu,Lien fac-similé numérique,Lieu de conservation | ||
SETAFDH/HTR-SETAF-Pierre-de-Vingle,CRRPV11,gold,gold (gothique),96,Pierre de Vingle,Moralite de la maladie de Chrestiente,[Matthieu Malingre],1533,[Neuchâtel],https://doi.org/10.3931/e-rara-563,Zurich ZB Res 1331 | ||
SETAFDH/HTR-SETAF-Pierre-de-Vingle,CRRPV12,gold,gold (gothique),48,Pierre de Vingle,Sensuyvent plusieurs belles et bonnes chansons,[Matthieu Malingre],1533,[Neuchâtel],https://doi.org/10.3931/e-rara-6934,Genève BGE Bd 1475 Rés | ||
SETAFDH/HTR-SETAF-Pierre-de-Vingle,CRRPV13,gold,gold (gothique),48,Pierre de Vingle,Noelz nouveaulx,[Matthieu Malingre],[1533?],[Neuchâtel],https://doi.org/10.3931/e-rara-577,Zurich ZB Res 1332 | ||
SETAFDH/HTR-SETAF-Pierre-de-Vingle,CRRPV16,gold,gold (gothique),16,Pierre de Vingle,Chansons nouvelles,[anonyme],[1534?],[Neuchâtel],https://doi.org/10.3931/e-rara-576,Zurich ZB Res 1327 | ||
SETAFDH/HTR-SETAF-Jean-Michel,CRRJM29,gold,gold (gothique),78,Jean Michel,Verite cachee,[anonyme],1544,[Genève],https://doi.org/10.3931/e-rara-12691,Genève MHR D Mal 1 | ||
SETAFDH/HTR-SETAF-Jean-Michel,CRRJM34,gold,gold (gothique),96,Jean Michel,Moralite de la maladie de chrestiente,[Matthieu Malingre],[1538/1544],[Genève],https://onb.digital/result/103BE0A8,Vienne NB 48.V.87 | ||
SETAFDH/HTR-Varia-Malingre-romain,e-rara 120199,gold,gold (romain),87,s.n.,Chanson spirituelle sur la saincte cene,[anonyme],1545,s.l.,https://doi.org/10.3931/e-rara-120199,Genève BGE Bd 1915 | ||
SETAFDH/HTR-Varia-Malingre-romain,e-rara 12703,en cours,en cours (romain),50/141,Jean Girard,Les Pseaumes de David,Clément Marot,1550,Genève,https://doi.org/10.3931/e-rara-12703,Genève MHR O6e(550) | ||
SETAFDH/HTR-Varia-Malingre-romain,onb 109B2E78,en cours,en cours (romain),65/271,[Jean Girard],Chrestienne Resiovyssance,Eustorg de Beaulieu,1546,[Genève],https://onb.digital/result/109B2E78,Vienne NB 80.M.74 | ||
SETAFDH/HTR-Varia-Malingre-gothique,Gallica bpt6k87118420,en cours,en cours (gothique sauf 4 pages au début en romain),58/180,Gryphius,Les Oeuvres de Clement Marot,Clément Marot,1538,Lyon,https://gallica.bnf.fr/ark:/12148/bpt6k87118420,Paris BNF RES-YE-1461 |
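The tables above record progress in the `Pages` column either as a plain count (`87`) or as done/total (`50/141`), and some cells carry stray spaces around the commas. A minimal sketch of reading such a table with Python's standard `csv` module follows. The helper names `load_rows` and `page_progress` are hypothetical, and the sample reuses two rows from the table above.

```python
import csv
import io
from typing import Dict, List, Optional


def load_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse the table and strip stray whitespace around every header and cell."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [
        {key.strip(): (value or "").strip() for key, value in row.items() if key is not None}
        for row in reader
    ]


def page_progress(pages: str) -> Optional[float]:
    """Return done/total for values like '50/141'; None for a plain page count."""
    if "/" not in pages:
        return None
    done, total = pages.split("/", 1)
    return int(done) / int(total)


sample_csv = """\
Dépôt GitHub,Identifiant,Segmentation,Transcription,Pages,Imprimeur,Titre bref,Auteur,Date,Lieu,Lien fac-similé numérique,Lieu de conservation
SETAFDH/HTR-Varia-Malingre-romain,e-rara 12703,en cours,en cours (romain),50/141,Jean Girard,Les Pseaumes de David,Clément Marot,1550,Genève,https://doi.org/10.3931/e-rara-12703,Genève MHR O6e(550)
SETAFDH/HTR-Varia-Malingre-gothique,Gallica bpt6k87118420, en cours, en cours (gothique sauf 4 pages au début en romain), 58/180,Gryphius,Les Oeuvres de Clement Marot,Clément Marot,1538,Lyon,https://gallica.bnf.fr/ark:/12148/bpt6k87118420,Paris BNF RES-YE-1461
"""

rows = load_rows(sample_csv)
for row in rows:
    # Print the identifier, the segmentation status, and the share of pages done
    print(row["Identifiant"], row["Segmentation"], page_progress(row["Pages"]))
```

Stripping on read keeps the comparison `row["Segmentation"] == "en cours"` reliable even when the source file has inconsistent spacing.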
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# HTR-Varia-Malingre | ||
|
||
![characters badge](badges/characters.svg) ![regions badge](badges/regions.svg) ![lines badge](badges/lines.svg) ![files badge](badges/files.svg) | ||
|
||
This repository contains OCR data for literary texts connected to the work of the Reformed poet M. Malingre (c. 1500-1572). The list of texts, with further details, is in the repository's [CSV table](https://github.com/SETAFDH/HTR-Varia-Malingre/blob/main/HTR-Varia-Malingre_Table.csv) (for example, one column indicates whether the transcription comes from a Gothic or a Roman typeface). | ||
|
||
The works attributed to M. Malingre and printed by Pierre de Vingle and Jean Michel are in the repositories [HTR-SETAF-Pierre-de-Vingle](https://github.com/SETAFDH/HTR-SETAF-Pierre-de-Vingle) and [HTR-SETAF-Jean-Michel](https://github.com/SETAFDH/HTR-SETAF-Jean-Michel). | ||
|
||
|
||
## Funding | ||
|
||
This project is funded by the Swiss National Science Foundation (SNSF) as part of the SETAF project. | ||
|
||
- SETAF project website: https://www.unige.ch/setaf | ||
- SETAF project GitHub: https://github.com/SETAFDH | ||
|
||
|
||
## License | ||
|
||
The transcriptions are licensed [CC-BY](https://creativecommons.org/licenses/by/4.0), and the images are subject to the terms of use of the respective digital libraries: [e-rara](https://www.e-rara.ch/wiki/termsOfUse?lang=en), [ONB](https://www.onb.ac.at/en/use), [Gallica](https://gallica.bnf.fr/edit/und/conditions-dutilisation-des-contenus-de-gallica), [BSB](https://oai.bsb-muenchen.de/doc/en/imprint). | ||
|
||
|
||
## Data | ||
|
||
The data are located at `./data/*/*.xml` and are in ALTO format. They follow the [SegmOnto](https://segmonto.github.io) segmentation guidelines and are catalogued on [HTR-United](https://htr-united.github.io). The files are corrected manually: the quality of the page segmentation and of the OCR transcription is recorded in the repository's CSV table ("gold" or "en cours", i.e. in progress). | ||
|
||
The OCR transcription is checked against a guide written by the project team: Solfrini et al., _Guide de transcription pour les imprimés français du XVIe siècle en caractères gothiques_, Version A, 2023, https://hal.science/hal-04281804. | ||
|
||
|
||
## Infrastructure | ||
|
||
The OCR data are produced with [FoNDUE](https://www.unige.ch/lettres/humanites-numeriques/recherche/projets-de-la-chaire/fondue), the Geneva instance of [eScriptorium](https://gitlab.com/scripta/escriptorium). | ||
|
||
Computations are run at the University of Geneva using its [HPC service](https://www.unige.ch/eresearch/fr/services/hpc/). | ||
|
||
|
||
## Citing the repository | ||
|
||
- Version `1.0`: Sonia Solfrini, _Données OCR et segmentation des textes littéraires qui ont des liens avec l'œuvre de M. Malingre_, version `1.0`, Geneva, University of Geneva, 2023, https://github.com/SETAFDH/HTR-Varia-Malingre. | ||
|
||
```bibtex | ||
@misc{solfrini_ocr_varia_malingre_2023, | ||
author={Solfrini, Sonia}, | ||
title={Données OCR et segmentation des textes littéraires qui ont des liens avec l'œuvre de M. Malingre}, | ||
version={1.0}, | ||
address={Genève}, | ||
publisher={université de Genève}, | ||
year={2023}, | ||
url={https://github.com/SETAFDH/HTR-Varia-Malingre}, | ||
} | ||
``` |
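The README describes ALTO files under `./data/`, and the workflow publishes badges for characters, regions, lines and files. The sketch below shows one way to approximate such counts with the Python standard library. It is not the implementation of `humGenerator`, which among other things applies Unicode normalization. It assumes that regions are `TextBlock` elements, lines are `TextLine` elements, and text is stored in the `CONTENT` attribute of `String` elements. The function names and the default glob pattern are hypothetical.

```python
import glob
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix so any ALTO schema version matches."""
    return tag.rsplit("}", 1)[-1]


def count_alto(root: ET.Element) -> Counter:
    """Count regions, lines and transcribed characters in one ALTO document."""
    counts = Counter()
    for element in root.iter():
        name = local_name(element.tag)
        if name in ("TextBlock", "TextLine"):
            counts[name] += 1
        elif name == "String":
            counts["chars"] += len(element.get("CONTENT", ""))
    return counts


def count_repository(pattern: str = "./data/*/*.xml") -> Counter:
    """Sum the counts over every file matched by the (assumed) glob pattern."""
    total = Counter()
    for path in glob.glob(pattern):
        total += count_alto(ET.parse(path).getroot())
        total["files"] += 1
    return total


# Hypothetical one-line ALTO document used only for illustration
sample = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace>
    <TextBlock><TextLine><String CONTENT="Chanson spirituelle"/></TextLine></TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

print(count_alto(ET.fromstring(sample)))
```

Matching on local element names rather than a fixed namespace keeps the sketch usable across ALTO schema versions, at the cost of not validating the documents.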
Binary file added
BIN
+3.53 MB
...lement_Marot/Les_Oeuvres_de_Clément_Marot_Gallica_bpt6k87118420.pdf_page_10.png