tripsu (/tɹˈɪpsˈuː/, triple pseudonymizer) is a tool to protect sensitive values in RDF triples through pseudonymization. The goal is to offer a fast, secure and memory-efficient pseudonymization solution for any RDF graph.
Note: the code is still in development, and only the N-Triples format is supported as input.
The tool works in two steps:
- Indexing to create a reference to all rdf:type instances in the graph
- Pseudonymization to encrypt or hash sensitive parts of any RDF triple in the graph via a human-readable configuration file and the previously generated index
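The two steps above can be illustrated with a minimal Python sketch. It is not tripsu's implementation: the whitespace-based parser only handles simple one-line triples with IRI subjects, the `sensitive_types` set is a hypothetical stand-in for the rules configuration file, and HMAC-SHA256 is an assumed choice of keyed hash. All function names are made up for illustration.

```python
import hashlib
import hmac

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def parse_triple(line):
    """Split a simple N-Triples line into (subject, predicate, object).

    Simplification: assumes one triple per line, terminated by '.'.
    """
    body = line.strip()
    if body.endswith("."):
        body = body[:-1].rstrip()
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def build_index(lines):
    """Pass 1: map each subject to the set of its rdf:type objects."""
    index = {}
    for line in lines:
        if not line.strip():
            continue
        s, p, o = parse_triple(line)
        if p == RDF_TYPE:
            index.setdefault(s, set()).add(o)
    return index


def pseudonymize_iri(iri, key):
    """Replace the last path segment of an IRI with a keyed hash."""
    inner = iri[1:-1]  # drop the angle brackets
    prefix, _, local = inner.rpartition("/")
    digest = hmac.new(key, local.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<{prefix}/{digest}>"


def pseudonymize(lines, index, sensitive_types, key):
    """Pass 2: rewrite subjects and objects whose indexed type is sensitive."""

    def is_sensitive(term):
        return bool(index.get(term, set()) & sensitive_types)

    out = []
    for line in lines:
        if not line.strip():
            continue
        s, p, o = parse_triple(line)
        new_s = pseudonymize_iri(s, key) if is_sensitive(s) else s
        new_o = pseudonymize_iri(o, key) if is_sensitive(o) else o
        out.append(f"{new_s} {p} {new_o} .")
    return out


# Hypothetical example data: only nodes typed as Person are treated as sensitive.
triples = [
    "<http://example.org/alice> " + RDF_TYPE + " <http://example.org/Person> .",
    '<http://example.org/alice> <http://example.org/name> "Alice" .',
    "<http://example.org/zurich> " + RDF_TYPE + " <http://example.org/City> .",
]
key = b"demo-key"  # illustrative key only
index = build_index(triples)
result = pseudonymize(triples, index, {"<http://example.org/Person>"}, key)
for line in result:
    print(line)
```

The index is needed because a triple such as the `name` statement carries no type information itself; the first pass records that its subject is a `Person`, so the second pass can decide to rewrite it.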
Run the container image directly with docker or podman:
docker run -it ghcr.io/sdsc-ordes/tripsu:0.0.1 --help
The package can be compiled from source using cargo:
git clone https://github.com/sdsc-ordes/tripsu
cd tripsu
cargo build --release
./target/release/tripsu --help
Tip
Check the development section for other setups (Nix etc.).
The general command-line interface outlines the two main steps of the tool, indexing and pseudonymization:
tripsu --help
which outputs
A tool to pseudonymize URIs and values in RDF graphs.
Usage: tripsu <COMMAND>
Commands:
index 1. Pass: Create a node-to-type index from input triples
pseudo 2. Pass: Pseudonymize input triples
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
Indexing only requires an RDF file as input:
tripsu index input.nt > index.nt
Pseudonymization requires an RDF file, index and rules configuration as input:
tripsu pseudo --index index.nt --rules rules.yaml input.nt > output.nt
By default, pseudonymization uses a random key. To make the process
deterministic, you may provide a file containing a fixed key with --secret.
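The effect of a fixed key versus a random key can be sketched in Python. This is an illustration under assumptions, not tripsu's code: HMAC-SHA256 stands in for whatever keyed function the tool uses, the key bytes are made up, and the actual format of the secret file is not covered here.

```python
import hashlib
import hmac
import secrets


def pseudonym(value, key):
    """Keyed hash of a value (HMAC-SHA256), hex-encoded."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


fixed_key = b"contents-of-a-secret-file"  # hypothetical fixed key
random_key_a = secrets.token_bytes(32)  # fresh random key, as in one run
random_key_b = secrets.token_bytes(32)  # fresh random key, as in another run

# Same key and same input: the pseudonym is reproducible across runs.
same_with_fixed = pseudonym("alice", fixed_key) == pseudonym("alice", fixed_key)

# Independent random keys: the pseudonyms differ with overwhelming probability.
same_with_random = pseudonym("alice", random_key_a) == pseudonym("alice", random_key_b)

print(same_with_fixed, same_with_random)
```

A fixed key is useful when several files, or repeated runs over the same file, must map the same value to the same pseudonym so that the outputs remain linkable.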
In both subcommands, the input defaults to stdin and the output to stdout, so tripsu can be piped both upstream and downstream (see next section).
Tip
Each subcommand supports the --help option to show all options. For more information about use-cases and configuration, see the tutorial.
Please read the Contribution Guidelines first.
For technical documentation on setup and development, see the Development Guide.
Copyright © 2023-2024 Swiss Data Science Center (SDSC), www.datascience.ch. All rights reserved. The SDSC is jointly established and legally represented by the École Polytechnique Fédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich (ETH Zürich). This copyright encompasses all materials, software, documentation, and other content created and developed by the SDSC.