Releases
Code
Heurist API
This Python package provides an API wrapper for Heurist as well as a command-line interface (CLI) that Extracts, Transforms, and Loads (ETL) data from a Heurist database server into a local DuckDB database file.
The Python package is published on the Python Package Index (PyPI). Documentation is available here.
https://github.com/LostMa-ERC/heurist-api
simMAtree
This Python package performs Simulation-Based Inference (SBI) on abundance distribution data. One application, developed for the LostMa project, models the transmission and survival of textual witnesses through time, enabling researchers to infer model parameters from observed data.
https://github.com/LostMa-ERC/simMAtree
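To give a feel for what inferring model parameters from observed data can look like, here is a minimal sketch of one simple simulation-based technique, rejection sampling in the style of Approximate Bayesian Computation. It is not taken from simMAtree and does not use its API: the toy model (each witness survives independently with the same probability), the function names, and all numbers are assumptions made purely for illustration.

```python
import random


def simulate_survivors(n_copies: int, survival_prob: float, rng: random.Random) -> int:
    """Toy simulator (assumed model): each witness survives independently."""
    return sum(rng.random() < survival_prob for _ in range(n_copies))


def abc_rejection(
    observed: int, n_copies: int, n_draws: int, tolerance: int, seed: int = 0
) -> list[float]:
    """Keep the prior draws whose simulated count lies within `tolerance` of the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        candidate = rng.random()  # draw from a uniform prior on [0, 1)
        simulated = simulate_survivors(n_copies, candidate, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(candidate)
    return accepted


# Hypothetical data: 30 surviving witnesses out of 100 copies once in circulation.
posterior = abc_rejection(observed=30, n_copies=100, n_draws=5000, tolerance=2)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean ~ {posterior_mean:.2f}")
```

The accepted draws approximate the posterior distribution of the survival probability. Realistic models of textual transmission are far richer than this toy, and the package may well rely on different inference methods.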
Scrapers for Cultural Heritage sites
Jonas
Scrape metadata about manuscripts and works on the website Jonas and its Répertoire des textes et livres français et occitans (850-1550) database, which is managed by the Institut de Recherche et d'Histoire des Textes (IRHT).
Provide the scraper with the URL of a manuscript (jonas.irht.cnrs.fr/manuscrit/) or work (jonas.irht.cnrs.fr/oeuvre/) and receive relational tables describing that manuscript or work, depending on the URL, along with the witnesses related to it.
https://github.com/LostMa-ERC/JonasScraper
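The URL-based dispatch described above can be sketched as follows. This is an illustration only, not the JonasScraper implementation: the function name `classify_jonas_url` and the numeric identifiers in the usage lines are hypothetical, and the only facts taken from the description are the host and the two path prefixes.

```python
from urllib.parse import urlparse

JONAS_HOST = "jonas.irht.cnrs.fr"


def classify_jonas_url(url: str) -> str:
    """Return 'manuscript' or 'work' according to the first path segment of a Jonas URL."""
    # Accept URLs written without a scheme, as they appear in the description.
    parsed = urlparse(url if "://" in url else f"https://{url}")
    if parsed.hostname != JONAS_HOST:
        raise ValueError(f"Not a Jonas URL: {url}")
    first_segment = parsed.path.strip("/").split("/")[0]
    if first_segment == "manuscrit":
        return "manuscript"
    if first_segment == "oeuvre":
        return "work"
    raise ValueError(f"Unrecognised Jonas record type: {url}")


# The identifiers below are placeholders, not real Jonas records.
print(classify_jonas_url("https://jonas.irht.cnrs.fr/manuscrit/12345"))
print(classify_jonas_url("jonas.irht.cnrs.fr/oeuvre/678"))
```

Deciding the record type from the path before fetching anything lets a scraper pick the right parser up front and reject unsupported URLs early.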
Catalogue collectif de France, Archives et Manuscrits
Scrape bibliographic metadata from notices in the Catalogue collectif de France (CCfr) and/or the Bibliothèque nationale de France's Archives et Manuscrits catalogue.
Both scrapers are installed with the same Python package and require the URL of the notice to be scraped.
https://github.com/LostMa-ERC/french-catalogue-scraper
Archives et Manuscrits (search)
A CLI that runs the advanced search feature of the Bibliothèque nationale de France's Archives et Manuscrits website.
Using the department (e.g. Arsenal) and the shelfmark (cote), find the notice for the document in the Archives et Manuscrits catalogue. This tool is particularly useful when combined with the Archives et Manuscrits scraper, which takes the discovered notice URL as its input.
https://github.com/LostMa-ERC/search-archives-manuscrits
Handschriftencensus
Scrape metadata about manuscripts and works from the Handschriftencensus website, which is managed by the Philipps-Universität Marburg and the Akademie der Wissenschaften und der Literatur Mainz.
Collect works and their witnesses, and create records for all the linked codicological units.
https://github.com/LostMa-ERC/hsc-scraper
Datasets
ML Models