AN-140 Metadata index for cost capping #7567

Merged
merged 20 commits into from
Dec 11, 2024

Conversation

aednichols
Collaborator

@aednichols aednichols commented Oct 2, 2024

Description

The cost-capping query below improves from 1m30s to 250ms on the 20M-row test workflow.

select SQL_NO_CACHE
    `WORKFLOW_EXECUTION_UUID`,
    `CALL_FQN`,
    `JOB_SCATTER_INDEX`,
    `JOB_RETRY_ATTEMPT`,
    `METADATA_KEY`,
    `METADATA_VALUE`,
    `METADATA_VALUE_TYPE`,
    `METADATA_TIMESTAMP`,
    `METADATA_JOURNAL_ID`
from
    `METADATA_ENTRY`
where
    (
        (`WORKFLOW_EXECUTION_UUID` = '69e8259c-856a-4d87-9cdf-bd709f8e5ce3')
        and (
            (
                (
                    (`METADATA_KEY` like 'vmStartTime%')
                    or (`METADATA_KEY` like 'vmEndTime%')
                )
                or (`METADATA_KEY` like 'vmCostPerHour%')
            )
            or (`METADATA_KEY` like 'subWorkflowId%')
        )
    )
    and (
        (not false)
        or (
            (
                (`CALL_FQN` is null)
                and (`JOB_SCATTER_INDEX` is null)
            )
            and (`JOB_RETRY_ATTEMPT` is null)
        )
    )
order by
    `METADATA_TIMESTAMP`;

This is how Liquibase checks for index existence on MySQL:

SELECT 
  TABLE_CATALOG AS TABLE_CAT, 
  TABLE_SCHEMA AS TABLE_SCHEM, 
  TABLE_NAME, 
  NON_UNIQUE, 
  NULL AS INDEX_QUALIFIER, 
  INDEX_NAME, 
  3 AS TYPE, 
  SEQ_IN_INDEX AS ORDINAL_POSITION, 
  COLUMN_NAME, 
  COLLATION AS ASC_OR_DESC, 
  CARDINALITY, 
  0 AS PAGES, 
  NULL AS FILTER_CONDITION 
FROM 
  INFORMATION_SCHEMA.STATISTICS 
WHERE 
  TABLE_SCHEMA = 'cromwell_test' 
  AND INDEX_NAME = 'IX_METADATA_ENTRY_WEU_MK' 
ORDER BY 
  NON_UNIQUE, 
  INDEX_NAME, 
  SEQ_IN_INDEX

And this is the statement that creates the index:

CREATE INDEX `IX_METADATA_ENTRY_WEU_MK` ON `cromwell_test`.`METADATA_ENTRY`(
  `WORKFLOW_EXECUTION_UUID`, `METADATA_KEY`
)
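
For anyone who wants to see why a composite index on (`WORKFLOW_EXECUTION_UUID`, `METADATA_KEY`) fits this query shape, here is a minimal sketch. It is not part of this PR: it uses an in-memory SQLite database as a stand-in for MySQL, a trimmed-down hypothetical version of the `METADATA_ENTRY` table, and an explicit key range in place of `like 'vmStartTime%'` so the result does not depend on SQLite's LIKE settings. The MySQL optimizer may behave differently, so treat this only as an illustration of equality on the first index column plus a prefix range on the second.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; planner details differ.
conn = sqlite3.connect(":memory:")

# Assumption: a reduced, hypothetical subset of the METADATA_ENTRY columns.
conn.execute(
    """
    CREATE TABLE METADATA_ENTRY (
        METADATA_JOURNAL_ID INTEGER PRIMARY KEY,
        WORKFLOW_EXECUTION_UUID TEXT NOT NULL,
        METADATA_KEY TEXT NOT NULL,
        METADATA_VALUE TEXT
    )
    """
)

# Same column order as the index created in this PR.
conn.execute(
    "CREATE INDEX IX_METADATA_ENTRY_WEU_MK "
    "ON METADATA_ENTRY (WORKFLOW_EXECUTION_UUID, METADATA_KEY)"
)

# The prefix match on 'vmStartTime' is written as a half-open range:
# every key starting with 'vmStartTime' sorts at or after 'vmStartTime'
# and before 'vmStartTimf'.
plan_rows = conn.execute(
    """
    EXPLAIN QUERY PLAN
    SELECT METADATA_KEY, METADATA_VALUE
    FROM METADATA_ENTRY
    WHERE WORKFLOW_EXECUTION_UUID = ?
      AND METADATA_KEY >= 'vmStartTime'
      AND METADATA_KEY < 'vmStartTimf'
    """,
    ("69e8259c-856a-4d87-9cdf-bd709f8e5ce3",),
).fetchall()

# The last column of each plan row is the human-readable description.
plan_details = [row[-1] for row in plan_rows]
for detail in plan_details:
    print(detail)
```

The printed plan should name `IX_METADATA_ENTRY_WEU_MK` rather than report a full table scan. On MySQL, the equivalent check would be running `EXPLAIN` on the cost-capping query above and looking at which key is chosen.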

A reincarnation of #4736

Release Notes Confirmation

CHANGELOG.md

  • I updated CHANGELOG.md in this PR
  • I assert that this change shouldn't be included in CHANGELOG.md because it doesn't impact community users

Terra Release Notes

  • I added a suggested release notes entry in this Jira ticket
  • I assert that this change doesn't need Jira release notes because it doesn't impact Terra users

@aednichols aednichols requested a review from a team as a code owner October 2, 2024 20:03
Contributor

@salonishah11 salonishah11 left a comment

🎉

@aednichols aednichols requested a review from a team as a code owner October 16, 2024 19:13
@aednichols aednichols changed the title from "WX-1878 Metadata index for cost capping" to "AN-140 Metadata index for cost capping" Oct 16, 2024
@aednichols aednichols merged commit 8a2d781 into develop Dec 11, 2024
43 checks passed
@aednichols aednichols deleted the aen_wx_1878 branch December 11, 2024 20:39
3 participants