tst-race/race-semanticsteg

Resilient Anonymous Communication for Everyone (RACE) Semantic Steganography Guide


Terminology

  • Invertible Neural Network - A unique flavor of generative neural network that can map from random seed to image as well as from image back to random seed
  • Encoding - A combination of model weights and error correction parameters that can reliably encode and decode a message without error
  • Obfuscation - Hiding a message such that an adversary cannot distinguish it from other traffic on a channel

Introduction

This project implements a technique to pass obfuscated communications without detection by a passive adversary monitoring the network.

Invertible generative neural networks are used to generate realistic media (such as images) from message data. This media is then passed across the network (e.g., posted on social media) and decoded back to the original message by the receiver using an identical copy of the neural network.
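The round trip described above can be illustrated with a toy invertible transform. This is a minimal sketch, not the project's model: the single affine coupling step, the fixed `_scale_shift` function, and the sign-based bit mapping are all assumptions chosen to show that a receiver holding an identical copy of the transform can recover the message.

```python
import math

def _scale_shift(half):
    # Stand-in for a learned sub-network (assumption). It never has to be
    # inverted, because both directions evaluate it on the same untouched half.
    scale = [math.tanh(v) for v in half]
    shift = [0.5 * v for v in half]
    return scale, shift

def forward(latent):
    """Generation direction: latent vector -> 'media' vector."""
    if len(latent) % 2:
        raise ValueError("this toy coupling layer needs an even-length vector")
    mid = len(latent) // 2
    first, second = latent[:mid], latent[mid:]
    scale, shift = _scale_shift(first)
    mixed = [v * math.exp(s) + t for v, s, t in zip(second, scale, shift)]
    return first + mixed

def inverse(media):
    """Recovery direction: 'media' vector -> latent vector."""
    mid = len(media) // 2
    first, mixed = media[:mid], media[mid:]
    scale, shift = _scale_shift(first)
    second = [(v - t) * math.exp(-s) for v, s, t in zip(mixed, scale, shift)]
    return first + second

def encode(bits):
    # Message bits choose the sign of each latent value (assumed mapping).
    return forward([1.0 if b else -1.0 for b in bits])

def decode(media):
    return [1 if v > 0 else 0 for v in inverse(media)]

message = [1, 0, 1, 1, 0, 0, 1, 0]
media = encode(message)
assert decode(media) == message
```

A real model stacks many such invertible layers and produces image pixels rather than a short list of floats, but the sender and receiver roles are the same.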

[Figure: RACE Network]

This approach is sensitive to any transformation applied to the image in transit, such as lossy compression or resizing, so we use error correction codes and embedding techniques that make our solution robust to different types of noise.

Below is an example of how our software fits into a network in the context of the RACE program:
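To show why added redundancy lets a message survive channel noise, here is a minimal sketch that uses a repetition code with majority voting. The repetition code is a stand-in picked for brevity; this guide does not state which error correction code the project uses, and the function names are hypothetical.

```python
def ecc_encode(bits, repeat=5):
    # Repeat every bit `repeat` times.
    return [b for b in bits for _ in range(repeat)]

def ecc_decode(coded, repeat=5):
    # Majority vote over each block of `repeat` received bits.
    decoded = []
    for start in range(0, len(coded), repeat):
        block = coded[start:start + repeat]
        decoded.append(1 if sum(block) * 2 > len(block) else 0)
    return decoded

def flip(coded, positions):
    # Simulate channel noise by flipping the bits at the given positions.
    noisy = list(coded)
    for position in positions:
        noisy[position] ^= 1
    return noisy

message = [1, 0, 1, 1]
coded = ecc_encode(message)
noisy = flip(coded, [0, 3, 7, 12, 13])  # at most two flips per block of five
assert ecc_decode(noisy) == message
```

With five repetitions, each block tolerates two flipped bits; a third flip in the same block corrupts that bit. Tolerating a noisier channel means raising the redundancy, which lowers the usable payload per image.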

  • Network Manager encrypts the message.
  • This Comms plugin embeds the message into an image, then sends it through a cover service.
  • The Network Manager routing scheme decides which servers will relay the message.
  • The servers in the chain also run the Comms plugin software. Each server decodes the message to determine its destination, then embeds the message in a new image and forwards it to the next link in the chain.
  • Core software receives the Comms plugin generated image(s), which are decoded by the Comms plugin software.
  • The resulting encrypted message is sent to Network Manager for decryption.

[Figure: RACE Network]

Design Goals

  • Generate media that is difficult to distinguish from other common internet traffic
  • Support a variety of real world cover services such as email and various social media providers
  • Perform well without access to a GPU or other specialized deep learning hardware

Security Considerations

The adversary model assumes a passive adversary monitoring the channel with an automated mechanism to identify out-of-distribution traffic.

This is a research codebase and should not be used in any critical application, especially where users' safety would be impacted by an adversary breaking the obfuscation.

It is likely that an advanced adversary with sufficient compute could partially compromise the system, i.e., detect a non-trivial percentage of the messages with an acceptable false positive rate.

It is also possible that stronger side channel attacks exist.



Scope

This developer guide covers the Semantic Steganography development model, building artifacts, running, and troubleshooting. It is structured this way to first provide context, associate attributes/features with source files, then guide a developer through the build process.

Audience

Comfort with modern Python development is sufficient for maintaining or extending the codebase. Modifying encoding configs to increase resilience to channel noise may require a reasonable background in computer science, applied math, or similar. Training new models is likely to be very difficult without extensive experience in applied machine learning.

Environment

Designed for Linux and for Android phones. Requires at least 4 GB of RAM for the neural networks. Encoding and decoding speed scales roughly linearly with CPU count. There is some support for GPU acceleration, which can substantially decrease latency.

License

Apache 2.0, see LICENSE.

Additional Reading



Implementation Overview

The project follows the CMake guidelines of having source and build directories and adds include, lib, test, and loader directories.

source directory

All project source code and tests are stored here.

config directory

Stores the configuration files for the RACE Comms Python Plugin.

scripts directory

Includes a number of Python, Bash, and miscellaneous scripts that do things like:

  • Generate the link-profiles
  • Send messages through race-in-the-box
  • Run a complete RIB test

encodings.cfg

This file contains model, ECC, bits per dimension, and other important encoding information that is used by the SSEncodingConfig class. The link-profiles.json file will name an encoding for every link, which must exist in this file.

See config/encodings.cfg for examples. See embedding_readme.md for more information.
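The requirement that every link names an encoding present in encodings.cfg can be checked before a deployment starts. The sketch below is illustrative only: it assumes encodings.cfg is INI-style with one section per encoding and that each link profile carries an "encoding" key. Both layouts, and the sample names and values, are assumptions, so consult the real files before adapting it.

```python
import configparser
import json

# Hypothetical file contents; the real layouts and keys may differ.
ENCODINGS_CFG = """
[low_noise]
bits_per_dim = 2

[high_noise]
bits_per_dim = 1
"""

LINK_PROFILES_JSON = """
{
  "link-1": {"encoding": "low_noise"},
  "link-2": {"encoding": "missing_encoding"}
}
"""

def missing_encodings(cfg_text, profiles_text):
    """Return {link: encoding} for links whose encoding is not defined."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    known = set(parser.sections())
    profiles = json.loads(profiles_text)
    return {
        link: profile["encoding"]
        for link, profile in profiles.items()
        if profile["encoding"] not in known
    }

print(missing_encodings(ENCODINGS_CFG, LINK_PROFILES_JSON))
# prints {'link-2': 'missing_encoding'}
```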

How To Build

See the instructions in prepare_plugin.sh for the appropriate command-line arguments.

Dependencies

Python Plugin:

  • No dependencies are required beyond the Python version and packages installed in the RACE Docker Images.

Java Plugin:

  • Dependencies are listed in build.gradle
  • The TensorFlow Lite jar is extracted from the provided aar. The provided aar was custom built, but updated versions of TensorFlow Lite should be sufficient.
  • All dependencies are combined into one jar that is then converted into a dex file

Manifest

See plugin_channels/ for channel_properties.

Known Limitations

The Android version of the encoding process is not as robust as the Linux version. As a result, Android nodes cannot encode or decode messages larger than ~3KB, despite the reported MTU. In particular, NetworkManager plugins must not send messages larger than that, or they will experience decode errors on Android receivers.
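One way a sender could respect this limit is to split payloads before handing them to the channel. The sketch below is a hedged illustration: the 3000-byte constant is a conservative reading of "~3KB", and `split_payload` is a hypothetical helper, not part of the RACE SDK or this plugin.

```python
ANDROID_SAFE_LIMIT = 3000  # assumption: conservative reading of "~3KB"

def split_payload(payload: bytes, limit: int = ANDROID_SAFE_LIMIT) -> list:
    """Split payload into consecutive chunks of at most `limit` bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]

payload = b"x" * 7000
chunks = split_payload(payload)
print([len(chunk) for chunk in chunks])  # prints [3000, 3000, 1000]

# Joining the chunks in order restores the original payload.
assert b"".join(chunks) == payload
```

Ordering and reassembly on the receiving side are left to the caller in this sketch.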


How To Run

Include in a RACE deployment by adding the following arguments to a rib deployment create command:

--comms-channel=ssEmail --comms-kit=<kit source for semanticsteg>

Common Issues

It usually takes 30 to 60 seconds for the channel to activate and for links to become functional after startup.
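Test scripts that send messages immediately after startup may fail during this window. A minimal polling helper is sketched below; `is_ready` stands for whatever readiness check your test harness provides (hypothetical, not an API of this plugin), and the 90-second default timeout is an assumption that leaves headroom above the typical activation time.

```python
import time

def wait_until(is_ready, timeout_s=90.0, poll_s=2.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)

# Trivial usage: a check that is already satisfied returns immediately.
print(wait_until(lambda: True))  # prints True
```

The `clock` and `sleep` parameters exist so the helper can be exercised in tests without real waiting.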



How Do I?

Configuring a Deployment

When running a deployment using RiB, there are options that can be passed after creation to customize how Comms runs. To start from clean configs, add `--no-config-gen` to the deployment create command. Then, before running an up/start, you can create new configs by running `rib deployment local config gen`. Passing JSON with the flag `--comms-custom-args` allows you to customize the Semantic Steg plugin. Example: `rib deployment local config generate --comms-custom-args '{"ssEmail": "--disable-obfuscation"}'`. This example disables the obfuscation model for the given channel. Other options are listed below:

  • --disable-obfuscation: Makes the plugin not follow a user model
  • --encoding: Selects which encoding (see encodings.cfg) to use
  • --user-type: Selects which user type to use (changes user model speed, see channel specific user model)
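Because the custom arguments are JSON nested inside a shell command, quoting mistakes are easy to make. The sketch below builds the command string with `json.dumps` and `shlex.quote` from the Python standard library. `build_config_command` is a hypothetical helper, and the command text is copied from the example above rather than verified against the RiB CLI.

```python
import json
import shlex

def build_config_command(channel_args):
    """Return a shell-safe config generation command for the given
    {channel: plugin arguments} mapping."""
    payload = json.dumps(channel_args)
    return ("rib deployment local config generate --comms-custom-args "
            + shlex.quote(payload))

cmd = build_config_command({"ssEmail": "--disable-obfuscation"})
print(cmd)
# prints rib deployment local config generate --comms-custom-args '{"ssEmail": "--disable-obfuscation"}'
```

Single-quoting the JSON keeps the inner double quotes intact when the shell parses the command.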

Adding New Transports

This technique is agnostic to the method by which the image is transported from the sender to the receiver. One could send the individual images via alternative social media platforms, messaging systems, or even carrier pigeon without violating the core assumptions.

Adding a new method to post and receive images is straightforward.

Additionally, one needs to modify the management of link-profiles to support the credentials and addresses used by the new transport.

Finally, if the new transport applies significantly more image transformations (e.g., JPEG compression), then encodings.cfg will need to be adjusted to increase the redundancy in the error correction. This can be challenging.

Training New Models

Only a limited selection of pre-trained models is included in this release. Using these models provides weaker security guarantees than training your own custom model.

We do not provide code for model training, but compatible models can be produced with this unrelated codebase - https://github.com/openai/glow. Note that there is no affiliation, endorsement, or other relationship between the authors or sponsors of the RACE codebase and the authors or sponsors of the GLOW codebase.

Note that a new model may require non-trivial changes to encodings.cfg to build in the correct amount of error correction.
