Data Integrity Testing
Since most of the metadata submitted to OpenAPC has been manually created at some point in its life cycle, it will inevitably contain errors. Furthermore, even data imported from external sources like CrossRef cannot be relied on to be correct or up-to-date in all cases. We address this problem by employing a software test suite which checks the whole dataset for potential errors on a regular basis.
The test script is written in Python and based on the pytest testing framework. When executed, the script imports both the OpenAPC core data file and the offsetting file and sends every entry through a set of test functions. Once all tests have finished, a report lists any errors that were encountered.
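The per-entry checking described above can be illustrated with a minimal sketch. It is not the actual OpenAPC test script: the column names, the sample rows, the two rules (positive cost, unique DOI) and all function names are assumptions chosen only for illustration, and the data is read from an inline string rather than from the real data files.

```python
import csv
import io

# Hypothetical sample data; column names and values are assumptions,
# not a copy of the real OpenAPC core data file.
SAMPLE_CSV = """institution,period,euro,doi,is_hybrid
Example University,2019,1500.00,10.1234/abc,FALSE
Example University,2019,-20.00,10.1234/def,FALSE
Another Institute,2020,980.50,10.1234/abc,TRUE
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per entry."""
    return list(csv.DictReader(io.StringIO(text)))


def check_row(row):
    """Return a list of error messages for one entry (empty if it passes)."""
    errors = []
    try:
        if float(row["euro"]) <= 0:
            errors.append("euro must be positive")
    except ValueError:
        errors.append("euro is not a number")
    if row["is_hybrid"] not in ("TRUE", "FALSE"):
        errors.append("is_hybrid must be TRUE or FALSE")
    return errors


def find_duplicate_dois(rows):
    """Return a sorted list of DOIs that occur in more than one entry."""
    seen, duplicates = set(), set()
    for row in rows:
        doi = row["doi"]
        if doi in seen:
            duplicates.add(doi)
        seen.add(doi)
    return sorted(duplicates)


def collect_errors(rows):
    """Run every check on every entry and build a flat error report."""
    report = []
    # Line 1 of the file is the header, so the first entry is on line 2.
    for line_no, row in enumerate(rows, start=2):
        for message in check_row(row):
            report.append(f"line {line_no}: {message}")
    for doi in find_duplicate_dois(rows):
        report.append(f"duplicate DOI: {doi}")
    return report


if __name__ == "__main__":
    for line in collect_errors(load_rows(SAMPLE_CSV)):
        print(line)
```

Under pytest, a function whose name starts with `test_` and which asserts that `collect_errors(...)` returns an empty list would be collected automatically and would fail with the report as soon as a single entry violates a rule.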
The test suite has two modes of operation. First, it can be called directly from the command line to verify data integrity in the local git repository. This should always be done before pushing any changes to the APC data files back to GitHub. Second, it is called automatically whenever a push or pull request occurs in the OpenAPC repository, by hooking into a continuous integration service (Travis, in our case). In this mode the test suite is executed on a remote server and the results are reported to the OpenAPC team via mail and Slack integration. A small widget on the OpenAPC README page also shows the latest test status.
Kindly supported by the Electronic Publishing Working Group of the German Initiative for Network Information (DINI), the German Research Foundation and the Federal Ministry of Education and Research.
Content is licensed under CC BY 4.0.