Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Remove duplicated executor CPU time and runtime metric from SQLTaskAggMetricsProfileResult #1504

Merged
merged 3 commits into from
Jan 22, 2025

Conversation

cindyyuanjiang
Copy link
Collaborator

@cindyyuanjiang cindyyuanjiang commented Jan 19, 2025

Fixes #1467.

SQLTaskAggMetricsProfileResult has 2 duplicate fields:

  • executorRunTime and executorRunTimeSum
  • executorCPUTime and executorCPUTimeSum

Changes

  • Removed duplicated metric fields from SQLTaskAggMetricsProfileResult
  • Updated existing unit tests

Affected Output

  1. sql_level_aggregated_task_metrics.csv

Removed columns: executorCPUTime, executorRunTime

Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
@cindyyuanjiang cindyyuanjiang self-assigned this Jan 19, 2025
@cindyyuanjiang cindyyuanjiang added bug Something isn't working core_tools Scope the core module (scala) API change A changeA change affecting the output (add/remove/rename files, add/remove/rename columns) labels Jan 19, 2025
Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cindyyuanjiang

SQLTaskAggMetricsProfileResult had 2 duplicate fields:

  • executorRunTime and executorRunTimeSum
  • executorCPUTime and executorCPUTimeSum

Can you please add more details to the PR description to list the CSV files affected by the change and the names of the columns that are going to be removed in each file.
This will make it easy for us to enumerate the changes associated with the affect_output tag.

Signed-off-by: cindyyuanjiang <cindyj@nvidia.com>
Copy link
Collaborator

@parthosa parthosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cindyyuanjiang. LGTME

Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cindyyuanjiang !
LGTME

@cindyyuanjiang cindyyuanjiang changed the title [BUG] Remove duplicated executor CPU time metric from SQLTaskAggMetricsProfileResult [BUG] Remove duplicated executor CPU time and runtime metric from SQLTaskAggMetricsProfileResult Jan 22, 2025
@cindyyuanjiang cindyyuanjiang merged commit a1ed715 into NVIDIA:dev Jan 22, 2025
13 checks passed
@cindyyuanjiang cindyyuanjiang deleted the spark-rapids-tools-1467 branch January 22, 2025 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change A changeA change affecting the output (add/remove/rename files, add/remove/rename columns) bug Something isn't working core_tools Scope the core module (scala)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] SQL/Task Aggregated Metrics contain duplicated executor CPU time results
3 participants