Skip to content

Commit

Permalink
CacheFilter: parses the allowed_vary_headers from the cache config. (#…
Browse files Browse the repository at this point in the history
…12928)

Commit Message: CacheFilter: parses the allowed_vary_headers from the cache config.

Additional Description:
Parses the allowlist from the cache config proto; this allows users to define a set of rules to control which headers can be varied in the cache.

Risk Level: Low
Testing: Unit testing
Docs Changes: Updated cache proto's comments regarding allowed_vary_headers
Release Notes: N/A
Fixes #10131

Signed-off-by: Caio <caiomelo@google.com>
  • Loading branch information
cbdm authored Sep 11, 2020
1 parent fdfaba5 commit 6a994d5
Show file tree
Hide file tree
Showing 20 changed files with 398 additions and 272 deletions.
7 changes: 2 additions & 5 deletions api/envoy/config/filter/http/cache/v2alpha/cache.proto
Original file line number Diff line number Diff line change
Expand Up @@ -48,17 +48,14 @@ message CacheConfig {
// Config specific to the cache storage implementation.
google.protobuf.Any typed_config = 1 [(validate.rules).any = {required: true}];

// [#not-implemented-hide:]
// <TODO(toddmgreer) implement *vary* headers>
//
// List of allowed *Vary* headers.
// List of matching rules that defines allowed *Vary* headers.
//
// The *vary* response header holds a list of header names that affect the
// contents of a response, as described by
// https://httpwg.org/specs/rfc7234.html#caching.negotiated.responses.
//
// During insertion, *allowed_vary_headers* acts as a allowlist: if a
// response's *vary* header mentions any header names that aren't in
// response's *vary* header mentions any header names that aren't matched by any rules in
// *allowed_vary_headers*, that response will not be cached.
//
// During lookup, *allowed_vary_headers* controls what request headers will be
Expand Down
7 changes: 2 additions & 5 deletions api/envoy/extensions/filters/http/cache/v3alpha/cache.proto
Original file line number Diff line number Diff line change
Expand Up @@ -52,17 +52,14 @@ message CacheConfig {
// Config specific to the cache storage implementation.
google.protobuf.Any typed_config = 1 [(validate.rules).any = {required: true}];

// [#not-implemented-hide:]
// <TODO(toddmgreer) implement *vary* headers>
//
// List of allowed *Vary* headers.
// List of matching rules that defines allowed *Vary* headers.
//
// The *vary* response header holds a list of header names that affect the
// contents of a response, as described by
// https://httpwg.org/specs/rfc7234.html#caching.negotiated.responses.
//
// During insertion, *allowed_vary_headers* acts as a allowlist: if a
// response's *vary* header mentions any header names that aren't in
// response's *vary* header mentions any header names that aren't matched by any rules in
// *allowed_vary_headers*, that response will not be cached.
//
// During lookup, *allowed_vary_headers* controls what request headers will be
Expand Down
7 changes: 2 additions & 5 deletions api/envoy/extensions/filters/http/cache/v4alpha/cache.proto

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion source/docs/filters/http/cache/cache_filter.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ which request methods and response codes are cacheable, subject to the
cache-related headers they also define: *cache-control*, *range*, *if-match*,
*if-none-match*, *if-modified-since*, *if-unmodified-since*, *if-range*, *authorization*,
*date*, *age*, *expires*, and *vary*. Responses with a *vary* header will only be cached
if the named headers are listed in *allowed_vary_headers*.
if the named headers are accepted by one of the matching rules in *allowed_vary_headers*.

## Status
* ready for developers to write cache storage plugins; please contribute them
Expand Down
4 changes: 4 additions & 0 deletions source/extensions/filters/http/cache/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -79,8 +79,12 @@ envoy_cc_library(
":inline_headers_handles",
"//include/envoy/common:time_interface",
"//include/envoy/http:header_map_interface",
"//source/common/common:matchers_lib",
"//source/common/http:header_map_lib",
"//source/common/http:header_utility_lib",
"//source/common/http:headers_lib",
"//source/common/protobuf",
"@envoy_api//envoy/extensions/filters/http/cache/v3alpha:pkg_cc_proto",
],
)

Expand Down
12 changes: 6 additions & 6 deletions source/extensions/filters/http/cache/cache_filter.cc
Original file line number Diff line number Diff line change
Expand Up @@ -30,11 +30,11 @@ struct CacheResponseCodeDetailValues {

using CacheResponseCodeDetails = ConstSingleton<CacheResponseCodeDetailValues>;

CacheFilter::CacheFilter(const envoy::extensions::filters::http::cache::v3alpha::CacheConfig&,
const std::string&, Stats::Scope&, TimeSource& time_source,
HttpCache& http_cache)
CacheFilter::CacheFilter(
const envoy::extensions::filters::http::cache::v3alpha::CacheConfig& config, const std::string&,
Stats::Scope&, TimeSource& time_source, HttpCache& http_cache)
: time_source_(time_source), cache_(http_cache),
allowed_vary_headers_(VaryHeader::parseAllowlist()) {}
vary_allow_list_(config.allowed_vary_headers()) {}

void CacheFilter::onDestroy() {
filter_state_ = FilterState::Destroyed;
Expand Down Expand Up @@ -63,7 +63,7 @@ Http::FilterHeadersStatus CacheFilter::decodeHeaders(Http::RequestHeaderMap& hea
}
ASSERT(decoder_callbacks_);

LookupRequest lookup_request(headers, time_source_.systemTime(), allowed_vary_headers_);
LookupRequest lookup_request(headers, time_source_.systemTime(), vary_allow_list_);
request_allows_inserts_ = !lookup_request.requestCacheControl().no_store_;
lookup_ = cache_.makeLookupContext(std::move(lookup_request));

Expand Down Expand Up @@ -97,7 +97,7 @@ Http::FilterHeadersStatus CacheFilter::encodeHeaders(Http::ResponseHeaderMap& he
// Either a cache miss or a cache entry that is no longer valid.
// Check if the new response can be cached.
if (request_allows_inserts_ &&
CacheabilityUtils::isCacheableResponse(headers, allowed_vary_headers_)) {
CacheabilityUtils::isCacheableResponse(headers, vary_allow_list_)) {
ENVOY_STREAM_LOG(debug, "CacheFilter::encodeHeaders inserting headers", *encoder_callbacks_);
insert_ = cache_.makeInsertContext(std::move(lookup_));
// Add metadata associated with the cached response. Right now this is only response_time;
Expand Down
9 changes: 5 additions & 4 deletions source/extensions/filters/http/cache/cache_filter.h
Original file line number Diff line number Diff line change
Expand Up @@ -86,10 +86,11 @@ class CacheFilter : public Http::PassThroughFilter,
// onHeaders for Range Responses, otherwise initialized by encodeCachedResponse.
std::vector<AdjustedByteRange> remaining_ranges_;

// TODO(#12901): The allowlist could be constructed only once directly from the config, instead of
// doing it per-request.
// Stores the headers that can be used to vary responses.
absl::flat_hash_set<std::string> allowed_vary_headers_;
// TODO(#12901): The allow list could be constructed only once directly from the config, instead
// of doing it per-request. A good example of such config is found in the gzip filter:
// source/extensions/filters/http/gzip/gzip_filter.h.
// Stores the allow list rules that decide if a header can be varied upon.
VaryHeader vary_allow_list_;

// True if the response has trailers.
// TODO(toddmgreer): cache trailers.
Expand Down
103 changes: 63 additions & 40 deletions source/extensions/filters/http/cache/cache_headers_utils.cc
Original file line number Diff line number Diff line change
Expand Up @@ -199,26 +199,70 @@ absl::optional<uint64_t> CacheHeadersUtils::readAndRemoveLeadingDigits(absl::str
return absl::nullopt;
}

absl::flat_hash_set<std::string> VaryHeader::parseAllowlist() {
// TODO(cbdm): Populate the hash_set from
// envoy::extensions::filters::http::cache::v3alpha::CacheConfig::allowed_vary_headers.
// Need to make sure that the headers we add here are valid values (i.e., not malformed). That
// way, we won't have to check this again in isAllowed.
return {"x-temporary-standin-header-name"};
void CacheHeadersUtils::getAllMatchingHeaderNames(
const Http::HeaderMap& headers, const std::vector<Matchers::StringMatcherPtr>& ruleset,
absl::flat_hash_set<absl::string_view>& out) {
headers.iterate([&ruleset, &out](const Http::HeaderEntry& header) -> Http::HeaderMap::Iterate {
absl::string_view header_name = header.key().getStringView();
for (const auto& rule : ruleset) {
if (rule->match(header_name)) {
out.emplace(header_name);
break;
}
}
return Http::HeaderMap::Iterate::Continue;
});
}

std::vector<std::string>
CacheHeadersUtils::parseCommaDelimitedList(const Http::HeaderEntry* entry) {
if (!entry) {
return {};
}

std::vector<std::string> header_values = absl::StrSplit(entry->value().getStringView(), ',');
for (std::string& value : header_values) {
// TODO(cbdm): Might be able to improve the performance here by using StringUtil::trim to
// remove whitespace.
absl::StripAsciiWhitespace(&value);
}

return header_values;
}

bool VaryHeader::isAllowed(const absl::flat_hash_set<std::string>& allowed_headers,
const Http::ResponseHeaderMap& headers) {
if (!hasVary(headers)) {
VaryHeader::VaryHeader(
const Protobuf::RepeatedPtrField<envoy::type::matcher::v3::StringMatcher>& allow_list) {

for (const auto& rule : allow_list) {
allow_list_.emplace_back(std::make_unique<Matchers::StringMatcherImpl>(rule));
}
}

bool VaryHeader::isAllowed(const Http::ResponseHeaderMap& headers) const {
if (!VaryHeader::hasVary(headers)) {
return true;
}

std::vector<std::string> varied_headers =
parseHeaderValue(headers.get(Http::Headers::get().Vary));
CacheHeadersUtils::parseCommaDelimitedList(headers.get(Http::Headers::get().Vary));

// If the vary value was malformed, it will not be contained in allowed_headers.
for (const std::string& header : varied_headers) {
if (!allowed_headers.contains(header)) {
bool valid = false;

// "Vary: *" should never be cached per:
// https://tools.ietf.org/html/rfc7231#section-7.1.4
if (header == "*") {
return false;
}

for (const auto& rule : allow_list_) {
if (rule->match(header)) {
valid = true;
break;
}
}

if (!valid) {
return false;
}
}
Expand Down Expand Up @@ -252,7 +296,7 @@ std::string VaryHeader::createVaryKey(const Http::HeaderEntry* vary_header,

std::string vary_key = "vary-key\n";

for (const std::string& header : parseHeaderValue(vary_header)) {
for (const std::string& header : CacheHeadersUtils::parseCommaDelimitedList(vary_header)) {
// TODO(cbdm): Can add some bucketing logic here based on header. For example, we could
// normalize the values for accept-language by making all of {en-CA, en-GB, en-US} into
// "en". This way we would not need to store multiple versions of the same payload, and any
Expand All @@ -269,40 +313,19 @@ std::string VaryHeader::createVaryKey(const Http::HeaderEntry* vary_header,
return vary_key;
}

std::vector<std::string> VaryHeader::parseHeaderValue(const Http::HeaderEntry* vary_header) {
if (!vary_header) {
return {};
}

ASSERT(vary_header->key() == "vary");

// Vary header value should follow rules set per:
// https://tools.ietf.org/html/rfc7231#section-7.1.4

std::vector<std::string> header_values =
absl::StrSplit(vary_header->value().getStringView(), ',');
for (std::string& value : header_values) {
// TODO(cbdm): Might be able to improve the performance here: (1) could use StringUtil::trim to
// remove whitespace; (2) lowering the case might not be necessary depending on the
// functionality of isAllowed (e.g., if a hash-set, could hash ignoring case).
absl::StripAsciiWhitespace(&value);
absl::AsciiStrToLower(&value);
}

return header_values;
}

Http::RequestHeaderMapPtr
VaryHeader::possibleVariedHeaders(const absl::flat_hash_set<std::string>& allowed_headers,
const Http::RequestHeaderMap& request_headers) {
VaryHeader::possibleVariedHeaders(const Http::RequestHeaderMap& request_headers) const {
Http::RequestHeaderMapPtr possible_headers =
Http::createHeaderMap<Http::RequestHeaderMapImpl>({});

for (const std::string& header : allowed_headers) {
absl::flat_hash_set<absl::string_view> header_names;
CacheHeadersUtils::getAllMatchingHeaderNames(request_headers, allow_list_, header_names);

for (const absl::string_view& header : header_names) {
std::vector<absl::string_view> values;
Http::HeaderUtility::getAllOfHeader(request_headers, header, values);
for (const absl::string_view& value : values) {
possible_headers->addCopy(Http::LowerCaseString(header), value);
possible_headers->addCopy(Http::LowerCaseString(std::string{header}), value);
}
}

Expand Down
36 changes: 25 additions & 11 deletions source/extensions/filters/http/cache/cache_headers_utils.h
Original file line number Diff line number Diff line change
@@ -1,8 +1,14 @@
#pragma once

#include "envoy/common/time.h"
#include "envoy/extensions/filters/http/cache/v3alpha/cache.pb.h"
#include "envoy/http/header_map.h"

#include "common/common/matchers.h"
#include "common/http/header_map_impl.h"
#include "common/http/header_utility.h"
#include "common/http/headers.h"
#include "common/protobuf/protobuf.h"

#include "absl/strings/str_join.h"
#include "absl/strings/string_view.h"
Expand Down Expand Up @@ -103,32 +109,40 @@ class CacheHeadersUtils {
* absl::nullopt without advancing "*str".
*/
static absl::optional<uint64_t> readAndRemoveLeadingDigits(absl::string_view& str);

// Add to out all header names from the given map that match any of the given rules.
static void getAllMatchingHeaderNames(const Http::HeaderMap& headers,
const std::vector<Matchers::StringMatcherPtr>& ruleset,
absl::flat_hash_set<absl::string_view>& out);

// Parses the values of a comma-delimited list as defined per
// https://tools.ietf.org/html/rfc7230#section-7.
static std::vector<std::string> parseCommaDelimitedList(const Http::HeaderEntry* entry);
};

class VaryHeader {
public:
// Checks if the headers contain an allowed value in the Vary header.
static bool isAllowed(const absl::flat_hash_set<std::string>& allowed_headers,
const Http::ResponseHeaderMap& headers);

// Checks if the headers contain a non-empty value in the Vary header.
static bool hasVary(const Http::ResponseHeaderMap& headers);

// Creates a single string combining the values of the varied headers from entry_headers.
static std::string createVaryKey(const Http::HeaderEntry* vary_header,
const Http::RequestHeaderMap& entry_headers);

// Parses the header names that are in the Vary header value.
static std::vector<std::string> parseHeaderValue(const Http::HeaderEntry* vary_header);
// Parses the allow list from the Cache Config into the object's private allow_list_.
VaryHeader(const Protobuf::RepeatedPtrField<envoy::type::matcher::v3::StringMatcher>& allow_list);

// Checks if the headers contain an allowed value in the Vary header.
bool isAllowed(const Http::ResponseHeaderMap& headers) const;

// Returns a header map containing the subset of the original headers that can be varied from the
// request.
static Http::RequestHeaderMapPtr
possibleVariedHeaders(const absl::flat_hash_set<std::string>& allowed_headers,
const Http::RequestHeaderMap& request_headers);
Http::RequestHeaderMapPtr
possibleVariedHeaders(const Http::RequestHeaderMap& request_headers) const;

// Parses the allowlist of header values that can be used to create varied responses.
static absl::flat_hash_set<std::string> parseAllowlist();
private:
// Stores the matching rules that define whether a header is allowed to be varied.
std::vector<Matchers::StringMatcherPtr> allow_list_;
};

} // namespace Cache
Expand Down
Loading

0 comments on commit 6a994d5

Please sign in to comment.