Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.

pukomuko/java-html-sanitizer

OWASP Java HTML Sanitizer

A fast, easy-to-configure HTML sanitizer written in Java that lets you include HTML authored by third parties in your web application while protecting against XSS.

The only dependencies are Guava and JSR 305. The other jars are needed only by the test suite. JSR 305 is a compile-only dependency, needed only for its annotations.

This code was written with security best practices in mind, has an extensive test suite, and has undergone adversarial security review.


Getting Started explains how to set up the library with or without Maven.

You can use prepackaged policies:

PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);
String safeHTML = policy.sanitize(untrustedHTML);

or you can configure your own policy, as the tests show:

PolicyFactory policy = new HtmlPolicyBuilder()
    .allowElements("a")
    .allowUrlProtocols("https")
    .allowAttributes("href").onElements("a")
    .requireRelNofollowOnLinks()
    .toFactory();
String safeHTML = policy.sanitize(untrustedHTML);

or you can write custom policies to do things like changing h1s to divs with a certain class:

PolicyFactory policy = new HtmlPolicyBuilder()
    .allowElements("p")
    .allowElements(
        new ElementPolicy() {
          @Override
          public String apply(String elementName, List<String> attrs) {
            // attrs is a flat list of alternating attribute names and values.
            attrs.add("class");
            attrs.add("header-" + elementName);
            // The returned name replaces the original element name.
            return "div";
          }
        }, "h1", "h2", "h3", "h4", "h5", "h6")
    .toFactory();
String safeHTML = policy.sanitize(untrustedHTML);
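
To see what the custom policy above does to its arguments, the body of `apply` can be exercised on its own, without the library on the classpath. This is only a sketch: `HeaderRewriteSketch` and `rewrite` are hypothetical names that are not part of the sanitizer, and the code assumes, as the snippet above implies, that `attrs` is a flat list of alternating attribute names and values.

```java
import java.util.ArrayList;
import java.util.List;

public class HeaderRewriteSketch {
  // Mirrors the body of the ElementPolicy shown above as a plain static method.
  // Assumption: attrs is a flat list of alternating attribute names and values.
  public static String rewrite(String elementName, List<String> attrs) {
    attrs.add("class");                  // attribute name
    attrs.add("header-" + elementName);  // attribute value
    return "div";                        // name of the element to emit instead
  }

  public static void main(String[] args) {
    List<String> attrs = new ArrayList<>();
    String newName = rewrite("h1", attrs);
    System.out.println(newName + " " + attrs);  // prints "div [class, header-h1]"
  }
}
```

Read this way, an `h1` in the input would be emitted as a `div` carrying `class="header-h1"`.
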

Please note that the elements "a", "font", "img", "input" and "span" must be explicitly whitelisted with the `allowWithoutAttributes()` method if you want them to pass through the filter when they carry no attributes.
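
A minimal sketch of that difference, assuming the sanitizer jar is on the classpath and that its classes live in the `org.owasp.html` package. The class and method names here (`AllowWithoutAttributesSketch`, `sanitizeDefault`, `sanitizeKeepingBare`) are hypothetical and exist only for illustration.

```java
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

public class AllowWithoutAttributesSketch {
  // Allows "span" but does not opt in to attribute-less spans.
  public static String sanitizeDefault(String html) {
    PolicyFactory policy = new HtmlPolicyBuilder()
        .allowElements("span")
        .toFactory();
    return policy.sanitize(html);
  }

  // Same policy, but explicitly lets a span with no attributes through.
  public static String sanitizeKeepingBare(String html) {
    PolicyFactory policy = new HtmlPolicyBuilder()
        .allowElements("span")
        .allowWithoutAttributes("span")
        .toFactory();
    return policy.sanitize(html);
  }

  public static void main(String[] args) {
    String untrustedHTML = "<span>hi</span>";
    // Per the note above, the bare span tag should be dropped here,
    // leaving only its text content.
    System.out.println(sanitizeDefault(untrustedHTML));
    // Here the span element itself should survive.
    System.out.println(sanitizeKeepingBare(untrustedHTML));
  }
}
```

The same opt-in applies to the other listed elements: pass their names to `allowWithoutAttributes()` alongside the usual `allowElements()` call.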

Subscribe to the mailing list to be notified of known vulnerabilities. If you wish to report a vulnerability, please see AttackReviewGroundRules.


Thanks to everyone who has helped with criticism and code.
