Two proprietary-license rules generate false positive detections #3504

Closed
DennisClark opened this issue Aug 31, 2023 · 2 comments
@DennisClark
A recent scan of an FFmpeg project returned a composite license expression that included AND proprietary-license among its licenses. That was incorrect: no object in the codebase is under any proprietary license. The culprits are two rules that pick up clues from configuration file documentation, and I think these rules should simply be deleted.

  • proprietary-license_489.RULE

  • proprietary-license_490.RULE

      {
        "score": 100.0,
        "matcher": "2-aho",
        "end_line": 4182,
        "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_489.RULE",
        "start_line": 4182,
        "matched_text": "    license=\"nonfree and unredistributable\"",
        "match_coverage": 100.0,
        "matched_length": 4,
        "rule_relevance": 100,
        "rule_identifier": "proprietary-license_489.RULE",
        "license_expression": "proprietary-license"
      },
    
      {
        "score": 100.0,
        "matcher": "2-aho",
        "end_line": 101,
        "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_490.RULE",
        "start_line": 101,
        "matched_text": "  --enable-nonfree         allow use of nonfree code, the resulting libs",
        "match_coverage": 100.0,
        "matched_length": 2,
        "rule_relevance": 100,
        "rule_identifier": "proprietary-license_490.RULE",
        "license_expression": "proprietary-license"
      }
    

The rules are matching the configure instructions that the FFmpeg authors provide for anyone who wants to build FFmpeg with proprietary code included (which results in a non-redistributable build, since that is not compatible with GPL, but that's another story). SCTK reports these matches as if some software object in the codebase were under a "generic" proprietary-license, when there is actually no such object in the codebase.
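For anyone who wants to trace which rules contributed a given expression in their own scan, here is a minimal sketch. It assumes the scan was saved as JSON with a top-level `files` list whose entries carry `license_detections` and `matches`, as in the per-file output quoted later in this thread. The helper name `matches_for_expression`, the `configure` path, and the sample data are hypothetical and only mirror the two matches above.

```python
from collections import defaultdict


def matches_for_expression(scan, expression):
    """Group matches for one license expression by the rule that produced them.

    `scan` is assumed to be a dict loaded from a ScanCode JSON result with a
    top-level "files" list (an assumption about the overall layout).
    """
    by_rule = defaultdict(list)
    for resource in scan.get("files", []):
        for detection in resource.get("license_detections", []):
            for match in detection.get("matches", []):
                if match.get("license_expression") == expression:
                    by_rule[match["rule_identifier"]].append(
                        (
                            resource.get("path"),
                            match.get("start_line"),
                            match.get("matched_text"),
                        )
                    )
    return dict(by_rule)


# Hypothetical sample data; the path is made up for illustration.
sample_scan = {
    "files": [
        {
            "path": "configure",
            "license_detections": [
                {
                    "license_expression": "proprietary-license",
                    "matches": [
                        {
                            "license_expression": "proprietary-license",
                            "rule_identifier": "proprietary-license_489.RULE",
                            "start_line": 4182,
                            "matched_text": '    license="nonfree and unredistributable"',
                        },
                        {
                            "license_expression": "proprietary-license",
                            "rule_identifier": "proprietary-license_490.RULE",
                            "start_line": 101,
                            "matched_text": "  --enable-nonfree         allow use of nonfree code, the resulting libs",
                        },
                    ],
                }
            ],
        }
    ]
}

culprits = matches_for_expression(sample_scan, "proprietary-license")
for rule_id, hits in sorted(culprits.items()):
    for path, line, text in hits:
        print(f"{rule_id}: {path}:{line}: {text.strip()}")
```

Grouping by `rule_identifier` makes it easy to see when a whole expression comes from one or two short rules matching documentation text rather than an actual license notice.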

@AyanSinhaMahapatra
AyanSinhaMahapatra commented Sep 1, 2023

@DennisClark @pombredanne

Our general approach to these kinds of detections (which should really be clues), and to similar generic-rule cases, is to mark the rule with is_license_clue set to True. The match is then no longer reported in detected_license_expression or license_detections, but in the separate license_clues section.
References:

The question is whether this is a separate case from the clues case, one where there is no benefit in keeping these rules even as clues. In that case, we can go ahead and delete them.

The difference will be:

Earlier:

{
      "path": "clues-prop.txt",
      "type": "file",
      "detected_license_expression": "proprietary-license",
      "detected_license_expression_spdx": "LicenseRef-scancode-proprietary-license",
      "license_detections": [
        {
          "license_expression": "proprietary-license",
          "matches": [
            {
              "score": 100.0,
              "start_line": 1,
              "end_line": 1,
              "matched_length": 2,
              "match_coverage": 100.0,
              "matcher": "2-aho",
              "license_expression": "proprietary-license",
              "rule_identifier": "proprietary-license_490.RULE",
              "rule_relevance": 100,
              "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_490.RULE",
              "matched_text": "enable-nonfree"
            }
          ],
          "detection_log": [],
          "identifier": "proprietary_license-f985f915-da45-4c79-05c4-f5a853de472a"
        }
      ],
      "license_clues": [],
      "percentage_of_license_text": 20.0,
      "scan_errors": []
    }

After marking as clue:

 {
      "path": "clues-prop.txt",
      "type": "file",
      "detected_license_expression": null,
      "detected_license_expression_spdx": null,
      "license_detections": [],
      "license_clues": [
        {
          "score": 100.0,
          "start_line": 1,
          "end_line": 1,
          "matched_length": 2,
          "match_coverage": 100.0,
          "matcher": "1-hash",
          "license_expression": "proprietary-license",
          "rule_identifier": "proprietary-license_490.RULE",
          "rule_relevance": 100,
          "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/proprietary-license_490.RULE",
          "matched_text": "enable-nonfree"
        }
      ],
      "percentage_of_license_text": 100.0,
      "scan_errors": []
    }
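The difference between the two outputs can be modelled with a short sketch. This is a simplified illustration and not ScanCode's actual implementation: the function `report_file` and its `clue_rule_ids` parameter are hypothetical, and real detections are grouped and logged in ways this ignores. It only shows the routing idea: matches from clue rules go to `license_clues` and do not contribute to `detected_license_expression`.

```python
def report_file(matches, clue_rule_ids):
    """Split matches into detections and clues (simplified, hypothetical model)."""
    detections = [m for m in matches if m["rule_identifier"] not in clue_rule_ids]
    clues = [m for m in matches if m["rule_identifier"] in clue_rule_ids]
    # Only non-clue matches feed the file-level expression.
    expressions = sorted({m["license_expression"] for m in detections})
    return {
        "detected_license_expression": " AND ".join(expressions) or None,
        "license_detections": detections,
        "license_clues": clues,
    }


match = {
    "rule_identifier": "proprietary-license_490.RULE",
    "license_expression": "proprietary-license",
    "matched_text": "enable-nonfree",
}

# Rule treated as a regular rule: the match becomes a detection.
before = report_file([match], clue_rule_ids=set())

# Rule treated as a clue: the match is kept, but only under license_clues.
after = report_file([match], clue_rule_ids={"proprietary-license_490.RULE"})

print(before["detected_license_expression"])
print(after["detected_license_expression"], len(after["license_clues"]))
```

With this model, marking a rule as a clue keeps the match visible for review while removing it from the reported expression. Deleting the rule would drop the match entirely.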

AyanSinhaMahapatra added a commit that referenced this issue Sep 1, 2023
Reference: #3504
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra
This was fixed, closing!
