Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KVMaxValueSize should not be enforced on entire txn endpoint data #6767

Closed
VasuCigareddy opened this issue Nov 11, 2019 · 5 comments · Fixed by #7388
Closed

KVMaxValueSize should not be enforced on entire txn endpoint data #6767

VasuCigareddy opened this issue Nov 11, 2019 · 5 comments · Fixed by #7388
Assignees
Labels
type/enhancement Proposed improvement or new feature

Comments

@VasuCigareddy
Copy link

Overview of the Issue

txn endpoint is used for service,node and check transactions along with KV. But in the code the entire 'content-length' header is checked against KVMaxValueSize which is probably incorrect?

sizeStr := req.Header.Get("Content-Length")
if sizeStr != "" {
if size, err := strconv.Atoi(sizeStr); err != nil {
fmt.Fprintf(resp, "Failed to parse Content-Length: %v", err)
return nil, 0, false
} else if size > int(s.agent.config.KVMaxValueSize) {
fmt.Fprintf(resp, "Request body too large, max size: %v bytes", s.agent.config.KVMaxValueSize)
return nil, 0, false
}
}

And if KVMaxValueSize is for single/individual KV size, why check for combined KV size in txn endpoint?

// Enforce an overall size limit to help prevent abuse.
if netKVSize > s.agent.config.KVMaxValueSize {
resp.WriteHeader(http.StatusRequestEntityTooLarge)
fmt.Fprintf(resp, "Cumulative size of key data is too large (%d > %d bytes)",
netKVSize, s.agent.config.KVMaxValueSize)
return nil, 0, false
}

Reproduction Steps

Steps to reproduce this issue, eg:

  1. Follow the example here and create a txn payload size greater than 512KB (Not KV, but all ops)
    https://www.consul.io/api/txn.html

  2. You can also reproduce the issue by sending http payload to txn endpoint with 60KV and each KV with 9KB data.

  3. Use Consul as Vault storage backend and store big secret(KV) with size 200KB data - In this case (at least in our environment), the vault was adding roughly 300KB data to txn endpoint so the whole data to txn endpoint is crossing 512KB which because of content-length header breaks/errors out.

Consul info for both Client and Server

Client info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = 8bb5e03c
	version = 1.6.1
consul:
	acl = disabled
	known_servers = 5
	server = false
runtime:
	arch = amd64
	cpu_count = 40
	goroutines = 47
	max_procs = 40
	os = linux
	version = go1.12.4
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 94
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 846
	members = 15
	query_queue = 0
	query_time = 1

Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = 20ff9710
	version = 1.6.0+ent
consul:
	acl = disabled
	bootstrap = false
	known_datacenters = 1
	leader = false
	leader_addr = x.x.x.x:8300
	server = true
license:
	customer = permanent
	expiration_time = 2049-11-01 12:06:58.131890821 -0500 EST
	features = Automated Backups, Automated Upgrades, Enhanced Read Scalability, Network Segments, Redundancy Zone, Advanced Network Federation
	id = permanent
	install_id = *
	issue_time = 2019-11-09 12:06:58.131890821 -0500 EST
	package = premium
	product = consul
	start_time = 2019-11-09 12:01:58.131890821 -0500 EST
raft:
	applied_index = 27065544
	commit_index = 27065544
	fsm_pending = 0
	last_contact = 2.360726ms
	last_log_index = 27065545
	last_log_term = 4
	last_snapshot_index = 27048728
	last_snapshot_term = 4
	latest_configuration = [{Suffrage:Voter ID:9650a087-846b-df86-cb08-af651caad8dd Address:x.x.x.x:8300} {Suffrage:Voter ID:7e8c84e5-81aa-5107-c2e8-8765e7815b16 Address:x.x.x.x:8300} {Suffrage:Voter ID:37fd461f-0281-72a0-bb7a-1b2c0d0bc9a7 Address:x.x.x.x:8300} {Suffrage:Voter ID:c82691a6-7f5e-598c-f8ba-7b21cd9b8948 Address:x.x.x.x:8300} {Suffrage:Voter ID:127ecd3e-9919-c242-d053-fc28e83bf167 Address:x.x.x.x:8300}]
	latest_configuration_index = 1
	num_peers = 4
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Follower
	term = 4
runtime:
	arch = amd64
	cpu_count = 40
	goroutines = 112
	max_procs = 40
	os = linux
	version = go1.12.8
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 94
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 846
	members = 15
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 10
	members = 5
	query_queue = 0
	query_time = 1

Operating system and Environment details

OS: CentOS Linux release 7.6.1810 (Core)

@mkeeler
Copy link
Member

mkeeler commented Nov 11, 2019

@VasuCigareddy I agree that something seems up with using the content length here for a number of reasons.

As for the combined kv size check after the initial content length check, that is because the first check is more of a preliminary check. The content coming in will be JSON encoded whereas when this data makes it into Raft it will be msgpack encoded and thus much smaller. We do the preliminary check at all as an attempt to prevent decoding requests which we know will get thrown out.

With that being said, that initial checks value probably does need updating to account for the datas reduction in size after decoding.

@antonsoroko
Copy link

Hi @mkeeler
Thanks for the reply!
We see 3 checks in the code. 2 of them (mentioned above) look strange for us so we thought that only

case in.KV != nil:
size := len(in.KV.Value)
if uint64(size) > s.agent.config.KVMaxValueSize {
resp.WriteHeader(http.StatusRequestEntityTooLarge)
fmt.Fprintf(resp, "Value for key %q is too large (%d > %d bytes)", in.KV.Key, size, s.agent.config.KVMaxValueSize)
return nil, 0, false
}
should be active.
About "combined kv size check" - why to check the size of all KVs if the limit is applied only for 1 KV? Or we misunderstand the meaning of KVMaxValueSize ?
// KVMaxValueSize controls the max allowed value size. If not set defaults
// to raft's suggested max value size.
//
// hcl: limits { kv_max_value_size = uint64 }
KVMaxValueSize uint64

@VasuCigareddy
Copy link
Author

@mkeeler, I would expect the content-length check to be size > int(s.agent.config.KVMaxValueSize) * maxTxnOps Because a txn endpoint can have 64(or maxTxnOps) number of transactions and each transaction (if its KV) cloud be of size KVMaxValueSize.

KV endpoint does the exact same check for single KV PUT, so I would expect txn endpoint to be max_supported_write_ops * max_supported_kv_put

// Check the content-length
if req.ContentLength > int64(s.agent.config.KVMaxValueSize) {
resp.WriteHeader(http.StatusRequestEntityTooLarge)
fmt.Fprintf(resp, "Value exceeds %d byte limit", s.agent.config.KVMaxValueSize)
return nil, nil
}

And same for if netKVSize > s.agent.config.KVMaxValueSize, I'd expect it to be if netKVSize > s.agent.config.KVMaxValueSize * writes because writes are incremented for every KV write in txn data:

verb := in.KV.Verb
if isWrite(verb) {
writes++
}

@schristoff schristoff added the type/enhancement Proposed improvement or new feature label Nov 12, 2019
@hanshasselberg
Copy link
Member

Looked at this again and I agree that KVMaxValueSize is not a great value to use. I think we should introduce a separate option like MaxTxnSize which limits the amount of data you can send to the endpoint. That should have a higher value than KVMaxValueSize. That would be a pragmatic approach to solve this issue.

Maybe this is useful: https://golang.org/pkg/net/http/#MaxBytesReader.

Any thoughts?

@findkim findkim self-assigned this Mar 2, 2020
@findkim findkim linked a pull request Mar 5, 2020 that will close this issue
@findkim
Copy link
Contributor

findkim commented Mar 5, 2020

Thanks @VasuCigareddy for the very detailed description! After digging into the txn endpoint more, I agree with @i0rek to introduce a new option and the use of KVMaxValueSize is a bit confusing.

#7388 adds configurable TxnMaxReqLen and includes changes so KVMaxValueSize is no longer enforced on the whole request. However, the default values for both will still be set to raft's suggested size to account for performance since all of the operations for a transaction is passed along to raft in one request. Separating the two options aims to clarify the confusion on the limits imposed on the txn endpoint as well as offer options to resolve issues like the ones you listed under Reproduction Steps.

Please reach out if you have any followup concerns on the approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement Proposed improvement or new feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants