Merge pull request #164 from github/content-hash-versioning
Add content hashing as a versioning strategy
jonabc authored May 9, 2019
2 parents 6a107a1 + f1dfd8a commit 908f0de
Showing 12 changed files with 271 additions and 19 deletions.
17 changes: 17 additions & 0 deletions docs/sources/go.md
@@ -23,3 +23,20 @@ go:
The setting supports absolute, relative and expandable (e.g. "~") paths. Relative paths are considered relative to the repository root.

Non-empty `GOPATH` configuration settings will override the `GOPATH` environment variable while enumerating `go` dependencies. The `GOPATH` environment variable is restored once dependencies have been enumerated.

#### Versioning

The go source supports multiple versioning strategies to determine if cached dependency metadata is stale. A version strategy is chosen based on the availability of go module information along with the current app configuration.

1. Go Module version - This strategy uses the version of the go module.
   - :exclamation: This strategy will always be used if go module information is available because the version comes from an externally provided identifier. Locating the version of the source package used via this identifier will be easier than other strategies.
2. Git commit SHA - This strategy uses the latest Git commit SHA available for the package's import path directory as the version. This is the default strategy used if a go module version isn't available and the setting is not configured.
   - :warning: The latest Git commit won't capture any changes that are committed alongside a cached file update. Make sure to update cached files after all other changes are committed.

   ```yaml
   version_strategy: git # or leave this key unset
   ```
3. Contents hash - This strategy uses a hash of the files in the package's import path directory as the version.
   ```yaml
   version_strategy: contents
   ```
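
The precedence described in this list can be sketched in a few lines of Ruby. This is only an illustration and not code from the commit: `choose_go_version_strategy` and its keyword arguments are hypothetical names, and the fallback to `contents` when Git is unavailable is taken from `ContentVersioning#version_strategy` elsewhere in this diff.

```ruby
# Hypothetical helper, for illustration only: returns a symbol naming the
# strategy the go source would end up using.
def choose_go_version_strategy(module_version:, configured:, git_available:)
  # A go module version always wins when one is available.
  return :go_module if module_version

  case configured
  when "contents" then :contents
  when "git"      then :git
  else
    # Unset (or unrecognized) setting: prefer git, fall back to contents.
    git_available ? :git : :contents
  end
end

# Module version present, so the configured value is ignored.
choose_go_version_strategy(module_version: "v1.2.0", configured: "contents", git_available: true)

# Nothing configured and no git on the machine, so contents hashing is used.
choose_go_version_strategy(module_version: nil, configured: nil, git_available: false)
```

The sketch treats an unrecognized setting the same as an unset one, which matches the `else` branch of `version_strategy` in this commit.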
15 changes: 15 additions & 0 deletions docs/sources/manifests.md
@@ -145,3 +145,18 @@ manifest:
licenses:
package: path/to/LICENSE
```

### License content versioning

The manifest source supports multiple versioning strategies to determine if cached dependency metadata is stale. A version strategy is chosen based on the current app configuration.

1. Git commit SHA - This strategy uses the latest Git commit SHA available for the package's import path directory as the version. This is the default strategy used if not otherwise configured.
   - :warning: The latest Git commit won't capture any changes that are committed alongside a cached file update. Make sure to update cached files after all other changes are committed.

   ```yaml
   version_strategy: git # or leave this key unset
   ```
2. Contents hash - This strategy uses a hash of the files in the package's import path directory as the version.
   ```yaml
   version_strategy: contents
   ```
17 changes: 16 additions & 1 deletion lib/licensed/sources/go.rb
@@ -1,10 +1,13 @@
# frozen_string_literal: true
require "json"
require "pathname"
require "licensed/sources/helpers/content_versioning"

module Licensed
module Sources
class Go < Source
include Licensed::Sources::ContentVersioning

def enabled?
Licensed::Shell.tool_available?("go") && go_source?
end
@@ -102,7 +105,19 @@ def package_version(package)
         # find most recent git SHA for a package, or nil if SHA is
         # not available
         Dir.chdir package_directory do
-          Licensed::Git.version(".")
+          contents_version *contents_version_arguments
         end
       end
+
+      # Determines the arguments to pass to contents_version based on which
+      # version strategy is selected
+      #
+      # Returns an array of arguments to pass to contents version
+      def contents_version_arguments
+        if version_strategy == Licensed::Sources::ContentVersioning::GIT
+          ["."]
+        else
+          Dir["*"]
+        end
+      end

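One consequence of passing `Dir["*"]` is worth spelling out. `Dir["*"]` lists only the top-level entries of the package directory, and `contents_hash` keeps only the regular files among them, so files in nested directories would not appear to contribute to the hash. The sketch below uses only the Ruby standard library; the directory layout and the `top_level_files_demo` name are made up for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Builds a throwaway "package" directory with one nested folder and returns
# what Dir["*"] yields, plus the subset that survives a File.file? filter.
def top_level_files_demo
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "a.go"), "package a\n")
    FileUtils.mkdir_p(File.join(dir, "internal"))
    File.write(File.join(dir, "internal", "b.go"), "package internal\n")

    Dir.chdir(dir) do
      entries = Dir["*"].sort                             # files and directories
      files = entries.select { |path| File.file?(path) }  # same filter as contents_hash
      [entries, files]
    end
  end
end

entries, files = top_level_files_demo
puts "entries: #{entries.inspect}"
puts "files:   #{files.inspect}"
```

Here `internal/b.go` never reaches the hashing step, because `internal` is a directory and is filtered out.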
72 changes: 72 additions & 0 deletions lib/licensed/sources/helpers/content_versioning.rb
@@ -0,0 +1,72 @@
# frozen_string_literal: true

require "ruby-xxhash"

module Licensed
  module Sources
    module ContentVersioning
      GIT = "git".freeze
      CONTENTS = "contents".freeze

      # Find the version for a list of paths using the version strategy
      # specified for the source from the configuration
      #
      # paths - list of paths to find version
      #
      # Returns a version identifier for the given files
      def contents_version(*paths)
        case version_strategy
        when CONTENTS
          contents_hash(paths)
        when GIT
          git_version(paths)
        end
      end

      # Returns the version strategy configured for the source
      def version_strategy
        # default to git for backwards compatible behavior
        @version_strategy ||= begin
          case config.fetch("version_strategy", nil)
          when CONTENTS
            CONTENTS
          when GIT
            GIT
          else
            Licensed::Git.available? ? GIT : CONTENTS
          end
        end
      end

      # Find the version for a list of paths using Git commit information
      #
      # paths - list of paths to find version
      #
      # Returns the most recent git SHA from the given paths
      def git_version(paths)
        return if paths.nil?

        paths.map { |path| Licensed::Git.version(path) }
             .reject { |sha| sha.to_s.empty? }
             .max_by { |sha| Licensed::Git.commit_date(sha) }
      end

      # Find the version for a list of paths using their file contents
      #
      # paths - list of paths to find version
      #
      # Returns a hash of the path contents as an identifier for the group
      def contents_hash(paths)
        return if paths.nil?

        paths = paths.compact.select { |path| File.file?(path) }
        return if paths.empty?

        paths.sort
             .reduce(Digest::XXHash64.new, :file)
             .digest
             .to_s(16) # convert to hex
      end
    end
  end
end
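
The `contents_hash` method above sorts the paths before feeding them to a single digest, which is what makes the result independent of argument order. The following is a minimal sketch of that idea, not the commit's implementation: it swaps in `Digest::SHA256` from the standard library for `Digest::XXHash64` so that it runs without extra gems, and `sketch_contents_hash` is a hypothetical name.

```ruby
require "digest"
require "tmpdir"

# Illustrative stand-in for contents_hash. SHA256 replaces XXHash64 here,
# so the resulting values differ from what licensed would record.
def sketch_contents_hash(paths)
  return if paths.nil?

  files = paths.compact.select { |path| File.file?(path) }
  return if files.empty?

  # Sorting first makes the result independent of argument order.
  files.sort
       .reduce(Digest::SHA256.new) { |digest, path| digest.file(path) }
       .hexdigest
end

Dir.mktmpdir do |dir|
  a = File.join(dir, "a.txt")
  b = File.join(dir, "b.txt")
  File.write(a, "alpha")
  File.write(b, "beta")

  same = sketch_contents_hash([a, b]) == sketch_contents_hash([b, a])
  puts "order independent: #{same}"
end
```

Only file contents are fed to the digest in this sketch, so renaming a file without changing the sorted order would leave the result unchanged.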
14 changes: 4 additions & 10 deletions lib/licensed/sources/manifest.rb
@@ -1,9 +1,12 @@
# frozen_string_literal: true
require "pathname/common_prefix"
require "licensed/sources/helpers/content_versioning"

module Licensed
module Sources
class Manifest < Source
include Licensed::Sources::ContentVersioning

def enabled?
File.exist?(manifest_path) || generate_manifest?
end
@@ -12,7 +15,7 @@ def enumerate_dependencies
         packages.map do |package_name, sources|
           Licensed::Sources::Manifest::Dependency.new(
             name: package_name,
-            version: package_version(sources),
+            version: contents_version(*sources),
             path: configured_license_path(package_name) || sources_license_path(sources),
             sources: sources,
             metadata: {
@@ -23,15 +26,6 @@ def enumerate_dependencies
         end
       end

-      # Returns the latest git SHA available from `sources`
-      def package_version(sources)
-        return if sources.nil? || sources.empty?
-
-        sources.map { |s| Licensed::Git.version(s) }
-               .compact
-               .max_by { |sha| Licensed::Git.commit_date(sha) }
-      end
-
       # Returns the license path for a package specified in the configuration.
       def configured_license_path(package_name)
         license_path = @config.dig("manifest", "licenses", package_name)
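
The removed `package_version` filtered lookups with `.compact`, while the shared `git_version` helper uses `.reject { |sha| sha.to_s.empty? }`, which drops empty strings as well as `nil`. The sketch below illustrates the selection logic with stubbed data. `PATH_TO_SHA`, `SHA_TO_DATE` and `sketch_git_version` are hypothetical stand-ins for `Licensed::Git.version` and `Licensed::Git.commit_date`, the SHAs and dates are invented, and whether the real lookup ever returns an empty string is an assumption.

```ruby
# Hypothetical lookup tables; not real repository data.
PATH_TO_SHA = {
  "src/a.c" => "aaa111",
  "src/b.c" => "bbb222",
  "src/untracked.c" => "" # assumed: no commit found for this path
}.freeze

SHA_TO_DATE = {
  "aaa111" => Time.utc(2019, 5, 1),
  "bbb222" => Time.utc(2019, 5, 8)
}.freeze

# Returns the SHA with the most recent commit date among the given paths,
# or nil when no path has a usable SHA.
def sketch_git_version(paths)
  return if paths.nil?

  paths.map { |path| PATH_TO_SHA[path] }
       .reject { |sha| sha.to_s.empty? }      # drops nil and "" alike
       .max_by { |sha| SHA_TO_DATE.fetch(sha) }
end

puts sketch_git_version(["src/a.c", "src/b.c", "src/untracked.c"]).inspect
puts sketch_git_version(["src/untracked.c"]).inspect
```

With `.compact` alone, the empty string in this sketch would reach `max_by` and the date lookup would fail for it, which is presumably why the helper uses the stricter filter.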
1 change: 1 addition & 0 deletions licensed.gemspec
@@ -28,6 +28,7 @@ Gem::Specification.new do |spec|
spec.add_dependency "pathname-common_prefix", "~> 0.0.1"
spec.add_dependency "tomlrb", "~> 1.2"
spec.add_dependency "bundler", ">= 1.10"
spec.add_dependency "ruby-xxHash", "~> 0.4"

spec.add_development_dependency "rake", "~> 10.0"
spec.add_development_dependency "minitest", "~> 5.8"
3 changes: 2 additions & 1 deletion test/fixtures/manifest/manifest.json
@@ -7,5 +7,6 @@
   "test/fixtures/manifest/multiple_license_headers/source.c": "bsd3_multi_header_license",
   "test/fixtures/manifest/multiple_license_headers/source_2.c": "bsd3_multi_header_license",
   "test/fixtures/manifest/with_license_file/source.c": "mit_license_file",
-  "test/fixtures/manifest/with_notices/source.c": "notices"
+  "test/fixtures/manifest/with_notices/source.c": "notices",
+  "test/fixtures/manifest/version/test.c": "version_test"
 }
1 change: 1 addition & 0 deletions test/fixtures/manifest/manifest.yml
@@ -6,3 +6,4 @@ test/fixtures/manifest/multiple_license_headers/source.c: bsd3_multi_header_lice
test/fixtures/manifest/multiple_license_headers/source_2.c: bsd3_multi_header_license
test/fixtures/manifest/with_license_file/source.c: mit_license_file
test/fixtures/manifest/with_notices/source.c: notices
test/fixtures/manifest/version/test.c: version_test
6 changes: 6 additions & 0 deletions test/fixtures/manifest/version/test.c
@@ -0,0 +1,6 @@
#include <stdio.h>

int main()
{
printf("I'm a test!");
}
13 changes: 6 additions & 7 deletions test/sources/go_test.rb
@@ -170,19 +170,18 @@
       end

       describe "without go module information" do
-        it "is nil when git is unavailable" do
+        it "is the latest git SHA of the package directory when configured" do
           Dir.chdir fixtures do
-            Licensed::Git.stub(:available?, false) do
-              dep = source.dependencies.detect { |d| d.name == "github.com/gorilla/context" }
-              assert_nil dep.version
-            end
+            dep = source.dependencies.detect { |d| d.name == "github.com/gorilla/context" }
+            assert_equal source.git_version([dep.path]), dep.version
           end
         end

-        it "is the latest git SHA of the package directory" do
+        it "is the hash of all contents in the package directory when configured" do
+          config["version_strategy"] = Licensed::Sources::ContentVersioning::CONTENTS
           Dir.chdir fixtures do
             dep = source.dependencies.detect { |d| d.name == "github.com/gorilla/context" }
-            assert_match(/[a-f0-9]{40}/, dep.version)
+            assert_equal source.contents_hash(Dir["#{dep.path}/*"]), dep.version
           end
         end
       end
114 changes: 114 additions & 0 deletions test/sources/helpers/content_versioning_test.rb
@@ -0,0 +1,114 @@
# frozen_string_literal: true
require "test_helper"

describe Licensed::Sources::ContentVersioning do
let(:fixtures) { File.expand_path("../../../fixtures/command", __FILE__) }
let(:config) { Licensed::Configuration.new }
let(:helper) do
obj = mock.extend Licensed::Sources::ContentVersioning
obj.stubs(:config).returns(config)
obj
end


describe "#contents_version" do
it "handles a content hashing strategy" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::CONTENTS
helper.expects(:contents_hash).with(["path1", "path2"]).returns("version")
helper.expects(:git_version).never
assert_equal "version", helper.contents_version("path1", "path2")
end

it "handles a git commit SHA strategy" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::GIT
helper.expects(:contents_hash).never
helper.expects(:git_version).with(["path1", "path2"]).returns("version")
assert_equal "version", helper.contents_version("path1", "path2")
end
end

describe "#version_strategy" do
it "specifies content hashing if configured" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::CONTENTS
assert_equal Licensed::Sources::ContentVersioning::CONTENTS, helper.version_strategy
end

it "specifies git version if configured" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::GIT
assert_equal Licensed::Sources::ContentVersioning::GIT, helper.version_strategy
end

it "defaults to git version if not configured and git is available" do
Licensed::Git.stubs(:available?).returns(true)
assert_equal Licensed::Sources::ContentVersioning::GIT, helper.version_strategy
end

it "defaults to content hashing if not configured and git is not available" do
Licensed::Git.stubs(:available?).returns(false)
assert_equal Licensed::Sources::ContentVersioning::CONTENTS, helper.version_strategy
end
end

describe "#git_version" do
it "gets a hash for the latest commit for the set of paths" do
Dir.chdir fixtures do
# the hash for "." in a folder should identify the latest commit
# regardless of what other files from that folder are included
assert_equal Licensed::Git.version("."), helper.git_version(Dir["*"].concat(["."]))
end
end

it "handles files not tracked by git" do
Dir.chdir File.expand_path("../../../bin", fixtures) do
assert_nil helper.git_version(Dir["*"])
end
end

it "handles empty arrays" do
assert_nil helper.git_version([])
end

it "handles nil input" do
assert_nil helper.git_version(nil)
end
end

describe "#contents_hash" do
it "gets a hash representing the contents of relative paths" do
Dir.chdir fixtures do
refute_nil helper.contents_hash(Dir["*"])
end
end

it "gets a hash representing the contents of absolute paths" do
refute_nil helper.contents_hash(Dir["#{fixtures}/*"])
end

it "is agnostic to the order of paths provided" do
Dir.chdir fixtures do
assert_equal helper.contents_hash(["bower.yml", "bundler.yml", "cabal.yml"]),
helper.contents_hash(["cabal.yml", "bundler.yml", "bower.yml"])
end
end

it "handles empty arrays" do
assert_nil helper.contents_hash([])
end

it "handles nil input" do
assert_nil helper.contents_hash(nil)
end

it "handles nil paths" do
assert_nil helper.contents_hash([nil])
end

it "handles non-existent paths" do
assert_nil helper.contents_hash(["#{fixtures}-bad"])
end

it "handles non-file paths" do
assert_nil helper.contents_hash([fixtures])
end
end
end
17 changes: 17 additions & 0 deletions test/sources/manifest_test.rb
@@ -83,6 +83,23 @@
assert dep
refute_empty dep.record.notices
end

it "uses the git commit SHA as the version if configured" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::GIT
dep = source.dependencies.detect { |d| d.name == "version_test" }
assert_equal source.git_version(source.packages["version_test"]), dep.version
end

it "uses the git commit SHA as the version if not configured" do
dep = source.dependencies.detect { |d| d.name == "version_test" }
assert_equal source.git_version(source.packages["version_test"]), dep.version
end

it "uses the file contents hash as the version if configured" do
config["version_strategy"] = Licensed::Sources::ContentVersioning::CONTENTS
dep = source.dependencies.detect { |d| d.name == "version_test" }
assert_equal source.contents_hash(source.packages["version_test"]), dep.version
end
end

describe "manifest" do