tripsu

tripsu (/tɹˈɪpsˈuː/, triple pseudonymizer) is a tool to protect sensitive values in RDF triples through pseudonymization. The goal is to offer a fast, secure and memory-efficient pseudonymization solution for any RDF graph.

Note: The code is still in development, and only the N-Triples format is supported as input.

The tool works in two steps:

  1. Indexing to create a reference to all rdf:type instances in the graph
  2. Pseudonymization to encrypt or hash sensitive parts of any RDF triple in the graph via a human-readable configuration file and the previously generated index
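
The two steps above can be illustrated with a small conceptual sketch. It is not tripsu's actual implementation: the parsing is deliberately naive, the helper names (`build_index`, `pseudonymize`, `pseudonymize_iri`) and the example IRIs are hypothetical, and HMAC-SHA256 is assumed here only as a stand-in for whatever hashing or encryption the tool applies.

```python
import hashlib
import hmac

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def parse_triple(line):
    """Naively split one N-Triples line into (subject, predicate, object)."""
    subject, predicate, rest = line.strip().split(" ", 2)
    obj = rest.strip()
    if obj.endswith("."):
        obj = obj[:-1].rstrip()  # drop the terminating " ."
    return subject, predicate, obj


def build_index(lines):
    """Pass 1: map every typed node to the set of its rdf:type values."""
    index = {}
    for line in lines:
        subject, predicate, obj = parse_triple(line)
        if predicate == RDF_TYPE:
            index.setdefault(subject, set()).add(obj)
    return index


def pseudonymize_iri(iri, key):
    """Replace the local name of an IRI with a keyed hash (assumed scheme)."""
    inner = iri[1:-1]  # strip the surrounding angle brackets
    cut = max(inner.rfind("/"), inner.rfind("#")) + 1
    prefix, local = inner[:cut], inner[cut:]
    digest = hmac.new(key, local.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"<{prefix}{digest[:16]}>"


def pseudonymize(lines, index, sensitive_types, key):
    """Pass 2: rewrite subjects and objects whose type is marked sensitive."""
    output = []
    for line in lines:
        subject, predicate, obj = parse_triple(line)
        if index.get(subject, set()) & sensitive_types:
            subject = pseudonymize_iri(subject, key)
        if index.get(obj, set()) & sensitive_types:
            obj = pseudonymize_iri(obj, key)
        output.append(f"{subject} {predicate} {obj} .")
    return output


# Hypothetical input graph and configuration, for illustration only.
SAMPLE = [
    f"<http://example.org/alice> {RDF_TYPE} <http://example.org/Person> .",
    f"<http://example.org/bob> {RDF_TYPE} <http://example.org/Person> .",
    f"<http://example.org/acme> {RDF_TYPE} <http://example.org/Organization> .",
    "<http://example.org/alice> <http://example.org/knows> <http://example.org/bob> .",
    "<http://example.org/alice> <http://example.org/worksFor> <http://example.org/acme> .",
]
SENSITIVE = {"<http://example.org/Person>"}
KEY = b"example-key"

index = build_index(SAMPLE)
result = pseudonymize(SAMPLE, index, SENSITIVE, KEY)
for triple in result:
    print(triple)
```

In this sketch, nodes typed as `Person` are rewritten wherever they appear, as subject or object, while the `Organization` node is left untouched. Keeping the index as a separate first pass means the second pass can process triples one at a time.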

Installation

Container

Run the container image directly with docker or podman:

docker run -it ghcr.io/sdsc-ordes/tripsu:0.0.1 --help

Source Build

The package can be compiled from source using cargo:

git clone https://github.com/sdsc-ordes/tripsu
cd tripsu
cargo build --release

./target/release/tripsu --help

Tip

Check the development section for other setups (Nix etc.).

Usage

The general command-line interface outlines the two main steps of the tool, indexing and pseudonymization:

tripsu --help

which outputs

A tool to pseudonymize URIs and values in RDF graphs.

Usage: tripsu <COMMAND>

Commands:
  index   1. Pass: Create a node-to-type index from input triples
  pseudo  2. Pass: Pseudonymize input triples
  help    Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

Indexing only requires an RDF file as input:

tripsu index input.nt > index.nt

Pseudonymization requires an RDF file, an index and a rules configuration as input:

tripsu pseudo --index index.nt --rules rules.yaml input.nt > output.nt

By default, pseudonymization uses a random key. To make the process deterministic, you may provide a file containing a fixed key with --secret.
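
Why a fixed key makes the output reproducible can be shown with a keyed hash. This is a minimal sketch that assumes HMAC-SHA256; it does not claim to reproduce tripsu's internal scheme, and the `pseudonym` function and key values are hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonym(value: str, key: bytes) -> str:
    """Return a keyed hash of value (HMAC-SHA256, assumed for illustration)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# A fixed key, e.g. read from a secret file, yields the same pseudonym every run.
fixed_key = b"contents-of-a-hypothetical-secret-file"
first_run = pseudonym("alice", fixed_key)
second_run = pseudonym("alice", fixed_key)
print(first_run == second_run)

# Fresh random keys yield unrelated pseudonyms for the same input.
random_key_a = secrets.token_bytes(32)
random_key_b = secrets.token_bytes(32)
print(pseudonym("alice", random_key_a) == pseudonym("alice", random_key_b))
```

A deterministic key is useful when several files must be pseudonymized consistently so that the same node receives the same pseudonym in each of them; the trade-off is that the key file itself then has to be kept secret.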

In both subcommands, the input defaults to stdin and the output to stdout, so tripsu can be placed in a pipeline with other commands both upstream and downstream (see next section).

Tip

Each subcommand supports the --help option to show all options. For more information about use-cases and configuration, see the tutorial.

Development

Please read the Contribution Guidelines first.

For technical documentation on setup and development, see the Development Guide.

Copyright

Copyright © 2023-2024 Swiss Data Science Center (SDSC), www.datascience.ch. All rights reserved. The SDSC is jointly established and legally represented by the École Polytechnique Fédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich (ETH Zürich). This copyright encompasses all materials, software, documentation, and other content created and developed by the SDSC.