Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix longest common prefix #148

Merged
merged 1 commit into from
Apr 8, 2024

Conversation

phongulus
Copy link
Contributor

@phongulus phongulus commented Mar 25, 2024

Description of the task

As pointed out here, there's something wrong with the logic of longest_common_prefix. See the following output from utop:

utop # longest_common_prefix ["/test/paths/not_here"; "/test/path/here"; "/test/paths/here"];;
- : string = "/test/paths/not_here"

This function is currently used to collate file changes in a commit (e.g. x files changed in example/dir). It's been working so far since GitHub API seems to return file paths in lexicographic order (that's just from observation, I can't find this confirmed anywhere).

We should either:

  • have longest_common_prefix return the actual longest prefix without assuming anything about the order in which files appear (currently implemented in this PR).
  • assume that GitHub API always returns file paths in lexicographic order and remove the unneeded List.sort (commented out under the above).

How to test

make test

Copy link
Collaborator

@yasunariw yasunariw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this still returns

utop # longest_common_prefix ["/test/paths/not_here"; "/test/path/here"; "/test/paths/here"];;
- : string = "/test/path"

but I expect it to return /test

lib/common.ml Outdated
| x :: _ -> List.sort (Fun.flip String.compare) xs |> List.hd |> Stre.common_prefix x |> String.sub x 0
| x :: xs -> List.fold_left (fun a s -> Stre.common_prefix a s |> String.sub a 0) x xs

(* let longest_common_prefix xs =
Copy link
Collaborator

@yasunariw yasunariw Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove the commented out code?
also maybe the function name is too general for its intended usecase/expected behavior

Copy link
Contributor Author

@phongulus phongulus Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should remove this from lib/common.ml entirely and just write the logic here in lib/slack_message.ml where it's called?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do that. This whole part can be replaced I think:

let prefix_path =
      List.map (fun f -> f.filename) files
      |> Common.longest_common_prefix
      |> String.split_on_char '/'
      |> drop_last
      |> String.concat "/"

so as not to rely on assumptions about what happens to the function output like:

we always drop the last path element

@phongulus
Copy link
Contributor Author

this still returns

utop # longest_common_prefix ["/test/paths/not_here"; "/test/path/here"; "/test/paths/here"];;
- : string = "/test/path"

but I expect it to return /test

Right, longest_common_prefix is only called in condense_file_changes in line 150 /lib/slack_message.ml. It think it's assumed there that we only have the common string prefix and not the common path prefix, since we always drop the last path element:

let prefix_path =
      List.map (fun f -> f.filename) files
      |> Common.longest_common_prefix
      |> String.split_on_char '/'
      |> drop_last
      |> String.concat "/"

@phongulus phongulus requested a review from yasunariw March 28, 2024 06:51
Copy link
Collaborator

@yasunariw yasunariw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change looks good, thanks!
however if adding a test, I think a unit test would be more appropriate

@phongulus
Copy link
Contributor Author

the change looks good, thanks! however if adding a test, I think a unit test would be more appropriate

@yasunariw I added some unit tests.

Copy link
Collaborator

@yasunariw yasunariw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, please remove the other test case compare.files_modified_common_directory in favor of the unit test, rebase and cleanup commits

test/longest_prefix_test.ml Outdated Show resolved Hide resolved
@phongulus phongulus force-pushed the phong/longest-common-prefix-fix branch from 2122ac2 to d164e24 Compare April 8, 2024 05:36
Copy link
Collaborator

@yasunariw yasunariw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@yasunariw yasunariw merged commit 0dc6c1d into ahrefs:master Apr 8, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants