Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution] Adds diff algorithm and unit tests for data_source field #188874

Merged
merged 10 commits into from
Aug 1, 2024

Conversation

dplumlee
Copy link
Contributor

@dplumlee dplumlee commented Jul 22, 2024

Summary

Related ticket: #187659

Adds the diff algorithm and unit test coverage for the data_source field we use in the prebuilt rules customization workflow. This field is a custom grouped field that combines the data_view_id field and index_pattern field that are used interchangeably of one another on the rule type for a rule's data source.

Checklist

Delete any items that are not applicable to this PR.

For maintainers

@dplumlee dplumlee added release_note:skip Skip the PR/issue when compiling release notes Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Rule Management Security Detection Rule Management Team Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area v8.16.0 labels Jul 22, 2024
@dplumlee dplumlee self-assigned this Jul 22, 2024
@dplumlee dplumlee requested a review from a team as a code owner July 22, 2024 20:09
@dplumlee dplumlee requested a review from jpdjere July 22, 2024 20:09
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detection-rule-management (Team:Detection Rule Management)

@elastic-vault-github-plugin-prod elastic-vault-github-plugin-prod bot requested a review from a team as a code owner July 26, 2024 15:21
@dplumlee dplumlee force-pushed the data-source-field-diff branch from e33a179 to eb20c50 Compare July 26, 2024 18:21
@dplumlee dplumlee removed the request for review from a team July 26, 2024 20:45
Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good in general - I left a couple of comments on possible refactorings and some smaller issues.

But I'm posting here the biggest question that came up during review here: it's about how we handle the ABC scenario. I think it will be intersting to get Product input here from @approksiu and also FYI @banderror.

I'll post the test cases here and my questions for each. I just want to start the discussion on these, it might be the case that the best way forward right now is to go with the logic as it is implemented in this PR, which I think is the most straight-forward, and then consider any changes for a second iteration.

1. All versions are index patterns

    it('if all versions are index patterns', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'two', 'three'],
        },
        current_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        target_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'two', 'five'],
        },
      };

      const expectedMergedVersion: RuleDataSource = {
        type: DataSourceType.index_patterns,
        index_patterns: ['one', 'four', 'five'],
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: expectedMergedVersion,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Merged,
          conflict: ThreeWayDiffConflict.SOLVABLE,
        })
      );
    });

✅ This looks good; we can automerge the three versions for the user and we should.

2. All versions are data views

    it('if all versions are data views', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: { type: DataSourceType.data_view, data_view_id: '123' },
        current_version: { type: DataSourceType.data_view, data_view_id: '456' },
        target_version: { type: DataSourceType.data_view, data_view_id: '789' },
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: mockVersions.current_version,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Current,
          conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        })
      );
    });

✅ This looks good also; we cannot merge anything here so a non-solvable conflict is the right output.

3. Base version has a different type

    it('if base version is a different data type', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: { type: DataSourceType.data_view, data_view_id: '123' },
        current_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        target_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'two', 'five'],
        },
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: mockVersions.current_version,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Current,
          conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        })
      );
    });

❓ This is where my first doubt arises. If the current_version and the target_version are index patterns, could we offer the user an automerged version of those two? It seems that the fact that the base_version was a data_view doesn't have much importance here, and we might be able to offer a better UX than just returning a non-solvable conflict. I'm wondering if it makes sense from the UX point of view.

4. Current version has a different type

    it('if currrent version is a different data type', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: { type: DataSourceType.data_view, data_view_id: '123' },
        current_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        target_version: { type: DataSourceType.data_view, data_view_id: '789' },
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: mockVersions.current_version,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Current,
          conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        })
      );
    });

❓ Another doubt: wouldn't this make more sense if we treat it in a similar way to the scenario ABA and just return the current version. Wondering, if a user updated a rule with a data view to an index pattern, and then the data view gets updated, it sound most likely that the users doesn't care about that change and would prefer to keep its customization.

5. Target version has a different type


    it('if target version is a different data type', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'two', 'three'],
        },
        current_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        target_version: { type: DataSourceType.data_view, data_view_id: '789' },
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: mockVersions.current_version,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Current,
          conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        })
      );
    });

✅ This looks good also; we cannot merge anything here so a non-solvable conflict looks like the right output.

@dplumlee
Copy link
Contributor Author

@jpdjere after taking some time to think about it, I'd probably agree with the thinking on points 3 and 4. Point 3 is definitely a good idea, I think we should add that for sure. Point 4 I had to think about a bit more just because I know when we change the data_source field, it can had larger ramifications in other parts of the rule like query. It's possible that a change to a data_view_id in the target version of the rule could mean a user would want to add another item to the index field array and vice versa.

Looking through most of the rules, though, I'm not sure I see this happening too much and even if we did return a result similar to the ABA, I guess the user will always have the option to change it and see the incoming target version for context. I think we should still return the ABC outcome even if the merged version we return is similar to an ABA merge, would you agree?

@dplumlee
Copy link
Contributor Author

@jpdjere Ok, I made some changes in 687d07f in regards to the tests comment. I added the merge logic for when current and target versions are both index patterns and we could return a merged array instead of a non solvable conflict.

I didn't make any changes to the 4th point in that comment - while working on that case, I felt that since we already return the current version in the case of a non-solvable conflict, it's probably not the worst thing to highlight that change and the user would have a similar workflow to the one with an ABA style result.

Take a look and let me know what you think, I would also like to hear product's thoughts on it as you mentioned earlier.

cc @banderror @approksiu

@jpdjere
Copy link
Contributor

jpdjere commented Jul 31, 2024

@dplumlee Thanks for addressing my comments. I think that what you did here is correct. My suggestion for scenario 3 makes sense, but I also wouldn't change the behaviour for scenario 4. I also talked about briefly about these with Georgii and he felt the same way.

Only thing is that I think that the ABC code can be simplified as follows:

    case ThreeWayDiffOutcome.CustomizedValueCanUpdate: {
      // Handles two scenarios:
      // 1. All three BaseVersion, CurrentVersion and TargetVersion are Index Patterns
      // 2. BaseVersion is a DataView and both CurrentVersion and TargetVersion are Index Patterns
      if (
        isIndexPatternDataSourceType(dedupedCurrentVersion) &&
        isIndexPatternDataSourceType(dedupedTargetVersion)
      ) {
        const baseVersionToMerge =
          dedupedBaseVersion && isIndexPatternDataSourceType(dedupedBaseVersion)
            ? dedupedBaseVersion.index_patterns
            : [];

        return {
          conflict: ThreeWayDiffConflict.SOLVABLE,
          mergeOutcome: ThreeWayMergeOutcome.Merged,
          mergedVersion: {
            type: DataSourceType.index_patterns,
            index_patterns: mergeDedupedArrays(
              baseVersionToMerge,
              dedupedCurrentVersion.index_patterns,
              dedupedTargetVersion.index_patterns
            ),
          },
        };
      }

      // Handles all other scenarios/combinations of index patterns and data views
      return {
        conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        mergeOutcome: ThreeWayMergeOutcome.Current,
        mergedVersion: dedupedCurrentVersion,
      };
    }

We can use mergeDedupedArrays both when all three versions are index patterns, but also when the baseVersion is a dataView if we just pass an empty array. It ends up being the equivalent of the union of current and target. And it makes the code more concise. With the comments added, I think it's enough to understand what's going on.


Also, let's split the test case if base version is a different data type into two, to clearly see what happens in both combinations:

    it('if base version is a data view and others are index patterns ', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: { type: DataSourceType.data_view, data_view_id: '123' },
        current_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        target_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'two', 'five'],
        },
      };

      const expectedMergedVersion: RuleDataSource = {
        type: DataSourceType.index_patterns,
        index_patterns: ['one', 'three', 'four', 'two', 'five'],
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: expectedMergedVersion,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Merged,
          conflict: ThreeWayDiffConflict.SOLVABLE,
        })
      );
    });

    it('if base version is a index patterns and other are data views', () => {
      const mockVersions: ThreeVersionsOf<RuleDataSource> = {
        base_version: {
          type: DataSourceType.index_patterns,
          index_patterns: ['one', 'three', 'four'],
        },
        current_version: { type: DataSourceType.data_view, data_view_id: '123' },
        target_version: { type: DataSourceType.data_view, data_view_id: '456' },
      };

      const expectedMergedVersion: RuleDataSource = {
        type: DataSourceType.data_view,
        data_view_id: '123',
      };

      const result = dataSourceDiffAlgorithm(mockVersions);

      expect(result).toEqual(
        expect.objectContaining({
          merged_version: expectedMergedVersion,
          diff_outcome: ThreeWayDiffOutcome.CustomizedValueCanUpdate,
          merge_outcome: ThreeWayMergeOutcome.Current,
          conflict: ThreeWayDiffConflict.NON_SOLVABLE,
        })
      );
    });

@dplumlee
Copy link
Contributor Author

dplumlee commented Jul 31, 2024

@jpdjere good comments, made these changes in 908d523.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @dplumlee

@approksiu
Copy link

Thanks all, reading through this - recommendations make sense.

Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adressing all comments @dplumlee. LGTM ✅

Thanks also @approksiu for confirming the decisions taken here.

@dplumlee dplumlee merged commit 687df51 into elastic:main Aug 1, 2024
39 checks passed
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Aug 1, 2024
@dplumlee dplumlee deleted the data-source-field-diff branch August 1, 2024 14:15
dplumlee added a commit that referenced this pull request Aug 9, 2024
…9669)

## Summary

Related ticket: #187659

Adds test plan for diff algorithm for arrays of scalar values
implemented here: #188874



### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
dplumlee added a commit that referenced this pull request Aug 9, 2024
…gorithm (#189744)

## Summary

Completes #187659


Switches `data_source` fields to use the implemented diff algorithm
assigned to them in #188874


Adds integration tests in accordance to
#189669 for the `upgrade/_review`
API endpoint for the `data_source` field diff algorithm.

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Prebuilt Detection Rules Security Solution Prebuilt Detection Rules area release_note:skip Skip the PR/issue when compiling release notes Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants