Heurist ETL
This Python package Extracts, Transforms, and Loads (ETL) data from a Heurist database server into a local DuckDB database file.
- Installation & configuration
- Basic command-line usage
- Integrate API client in Python code
- Load Heurist data into RStudio
$ pip install \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple \
heurist
Commands
heurist download -f [file]
- Load all the records of a certain record group type into a DuckDB database file. Optionally, export the transformed tables to CSV files, one per record type.
heurist record -t [record-type]
- Call Heurist's API and export all records of a targeted record type to a JSON file.
heurist schema -t [output-type]
- Transform a Heurist database schema into descriptive CSV tables, one per record type, or into a descriptive JSON array.
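The commands above can also be driven from a Python script. The following is a minimal sketch, not part of the package: the helper names `build_command` and `run_command` and the file name `heurist.duckdb` are hypothetical, and only the flags shown in the command list are used. Any connection or credential options the CLI needs are covered by the configuration step and are left out here.

```python
from __future__ import annotations

import shutil
import subprocess

# Subcommand -> the single flag shown for it in the command list above.
COMMAND_FLAGS = {"download": "-f", "record": "-t", "schema": "-t"}


def build_command(subcommand: str, value: str) -> list[str]:
    """Build the argument list for one of the three documented subcommands.

    Hypothetical helper: it only assembles `heurist <subcommand> <flag> <value>`.
    """
    if subcommand not in COMMAND_FLAGS:
        raise ValueError(f"unknown heurist subcommand: {subcommand!r}")
    return ["heurist", subcommand, COMMAND_FLAGS[subcommand], value]


def run_command(subcommand: str, value: str) -> int | None:
    """Run the command and return its exit code.

    Returns None when the `heurist` executable is not found on PATH,
    so a missing installation does not raise an exception.
    """
    argv = build_command(subcommand, value)
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=False).returncode


if __name__ == "__main__":
    # "heurist.duckdb" is an assumed output file name, used for illustration only.
    print(build_command("download", "heurist.duckdb"))
    # prints ['heurist', 'download', '-f', 'heurist.duckdb']
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a file path contains spaces.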
Note: Currently, the heurist package has only been developed for Heurist databases hosted on Huma-Num's Heurist server. This includes nearly 2000 database instances, which is a good place to start! If you want to help develop the API client to work with other servers, consider contributing.
ERC-funded project
This Python package is distributed under the Creative Commons Attribution-ShareAlike 4.0 license.
It was developed as part of a project funded by the European Research Council. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.