

[SPARK-42344][K8S] Change the default size of the CONFIG_MAP_MAXSIZE #39884

Closed
wants to merge 2 commits into from

Conversation

nineinfra
Contributor

The default size of the CONFIG_MAP_MAXSIZE should not be greater than 1048576

What changes were proposed in this pull request?

This PR changes the default value of CONFIG_MAP_MAXSIZE from 1572864 (1.5 MiB) to 1048576 (1 MiB).

Why are the changes needed?

When Spark submits a job to Kubernetes with a ConfigMap, spark-submit calls the Kubernetes POST API "api/v1/namespaces/default/configmaps". The size of the ConfigMap is validated by this API: it must not be greater than 1048576 bytes.
The explanation cited in an earlier comment, from https://etcd.io/docs/v3.4/dev-guide/limit/, is:
"etcd is designed to handle small key value pairs typical for metadata. Larger requests will work, but may increase the latency of other requests. By default, the maximum size of any request is 1.5 MiB. This limit is configurable through --max-request-bytes flag for etcd server."
That explanation is from the perspective of etcd, not Kubernetes.
So the default value for the ConfigMap size in Spark should not be greater than 1048576 bytes.
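The 1 MiB limit can be illustrated with a small client-side check. This is a hedged sketch: `ConfigMapSizeCheck`, `totalBytes`, and `fitsInConfigMap` are illustrative names, not part of Spark or the Kubernetes client; the byte count is a rough stand-in for the size the API server validates.

```scala
// Hedged sketch (illustrative helper, not Spark code): a ConfigMap's data
// must stay within the 1 MiB (1048576-byte) limit enforced by the
// Kubernetes API server, so a client can fail fast before the POST call.
object ConfigMapSizeCheck {
  val MaxConfigMapBytes: Int = 1048576 // 1 MiB

  // Rough stand-in for the size the API server validates: the UTF-8
  // byte length of all keys and values in the ConfigMap data.
  def totalBytes(data: Map[String, String]): Int =
    data.iterator.map { case (k, v) =>
      k.getBytes("UTF-8").length + v.getBytes("UTF-8").length
    }.sum

  def fitsInConfigMap(data: Map[String, String]): Boolean =
    totalBytes(data) <= MaxConfigMapBytes
}
```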

Does this PR introduce any user-facing change?

Yes.
Generally, a ConfigMap's size will not exceed 1572864, or even 1048576, bytes, so the problem fixed here may not be noticed by users.

How was this patch tested?

Local test.

The default size of the CONFIG_MAP_MAXSIZE should not be greater than 1048576
@nineinfra
Contributor Author

Could you review this PR, please? @dongjoon-hyun

Member

@dongjoon-hyun dongjoon-hyun left a comment


Thank you for making a PR, @ninebigbig . Here is my comment.

  • Please don't change the version, because it is the `Since` version.
  • Please don't change the doc. Instead, add the following to protect it systematically.
.checkValue(_ <= 1048576, "Must be at most 1048576 bytes")
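The suggested guard can be shown in a standalone form. This is a hedged sketch: the real entry is built with Spark's internal `ConfigBuilder`, while `ConfEntry` and `ConfigMapEntry` here are simplified, hypothetical stand-ins that carry the same `checkValue`-style check.

```scala
// Hedged sketch: a simplified stand-in for the suggested guard. The real
// entry uses Spark's internal ConfigBuilder; a plain case class carries
// the same validation so the behavior can be demonstrated standalone.
final case class ConfEntry(key: String, default: Long, check: Long => Unit)

object ConfigMapEntry {
  // Mirrors `.checkValue(_ <= 1048576, "Must be at most 1048576 bytes")`:
  // require throws IllegalArgumentException when the value is too large.
  val maxSize: ConfEntry = ConfEntry(
    key = "spark.kubernetes.configMap.maxSize",
    default = 1048576L,
    check = v => require(v <= 1048576L, "Must be at most 1048576 bytes")
  )
}
```

With this guard, the old default of 1572864 would be rejected at configuration time instead of failing later at the Kubernetes API.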

@nineinfra
Contributor Author

Thank you for making a PR, @ninebigbig . Here is my comment.

  • Please don't change the version, because it is the `Since` version.
  • Please don't change the doc. Instead, add the following to protect it systematically.
.checkValue(_ <= 1048576, "Must be at most 1048576 bytes")

Thanks for the review, @dongjoon-hyun.
I have made a new commit.

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM.
This is consistent with https://kubernetes.io/docs/concepts/configuration/configmap/#motivation

A ConfigMap is not designed to hold large chunks of data. The data stored in a ConfigMap cannot exceed 1 MiB.

dongjoon-hyun pushed a commit that referenced this pull request Feb 5, 2023
The default size of the CONFIG_MAP_MAXSIZE should not be greater than 1048576


Closes #39884 from ninebigbig/master.

Authored-by: Yan Wei <ninebigbig@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 9ac4640)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

dongjoon-hyun commented Feb 5, 2023

Merged to master/3.4/3.3.

Thank you, @ninebigbig . I added you to the Apache Spark contributor group and assigned SPARK-42344 to you.
Welcome to the Apache Spark community, @ninebigbig .

@dongjoon-hyun
Member

Also, thank you for the review, @LuciferYang.

dongjoon-hyun pushed a commit that referenced this pull request Feb 5, 2023
The default size of the CONFIG_MAP_MAXSIZE should not be greater than 1048576


Closes #39884 from ninebigbig/master.

Authored-by: Yan Wei <ninebigbig@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 9ac4640)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit d07a0e9)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@nineinfra
Contributor Author

Thank you, @dongjoon-hyun .
And thanks to @LuciferYang.

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
The default size of the CONFIG_MAP_MAXSIZE should not be greater than 1048576


Closes apache#39884 from ninebigbig/master.

Authored-by: Yan Wei <ninebigbig@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 9ac4640)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jan 27, 2025
…ault value in docs

### What changes were proposed in this pull request?

This PR aims to fix the `spark.kubernetes.configMap.maxSize` default value in the docs.

### Why are the changes needed?

Since Apache Spark 3.3.2, we changed this value from `1572864` to `1048576` in the validation logic.
- #39884

### Does this PR introduce _any_ user-facing change?

No, this is a doc only update to be consistent with the code.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49683 from dongjoon-hyun/SPARK-50998.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jan 27, 2025
…ault value in docs


Closes #49683 from dongjoon-hyun/SPARK-50998.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 2b70117)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jan 27, 2025
…ault value in docs


Closes #49683 from dongjoon-hyun/SPARK-50998.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 2b70117)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>