[feature] Add "Best Matches" Binary Search Mode #6770

Closed
1 task done
solvingj opened this issue Mar 31, 2020 · 4 comments · Fixed by #14694


@solvingj
Contributor

This feature request was already somewhat outlined by @fulara in this other ticket, as a possible concept:

#6364 (comment)

This is a very old problem. From a user experience perspective, the common scenario is that we know a bunch of binaries have been built for some dependency, and we're pretty sure this includes the one we want. But when we run Conan, we get an error that there's no binary built with the specific package ID hash requested. The existing search has improved over the years to print out the information about the requested package ID hash so that users can deduce the mismatch.

The "problem" is that this is often onerous, tedious, and time-consuming. It relies on the user to perform a visual comparison, setting-by-setting and option-by-option, between the error message and EACH search result in the list of search results to HOPEFULLY see the problem. Alternatively, users can manually construct a query for conan search -q which contains all the settings they're trying to use. This reduces the comparison to just looking at all the different options and values. Furthermore, if the difference in package_id is related to the list of dependencies, it's very hard to see, especially for newcomers.
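To illustrate the manual workaround, here is a minimal Python sketch that joins a set of settings into the `name=value AND name=value` form accepted by `conan search -q` in Conan 1.x. The helper name `build_search_query` and the sample settings are hypothetical, and values containing spaces are not handled here.

```python
def build_search_query(settings):
    """Join settings into an 'a=b AND c=d' expression for `conan search -q`.

    Hypothetical helper: it only covers plain name=value pairs.
    """
    return " AND ".join(
        f"{name}={value}" for name, value in sorted(settings.items())
    )


# Example settings, made up for illustration.
requested_settings = {"os": "Linux", "arch": "x86_64", "build_type": "Release"}
query = build_search_query(requested_settings)

# <reference> is a placeholder for the actual package reference.
print(f'conan search <reference> -q "{query}"')
```

Even with such a query, the options and the dependency list still have to be compared by eye, which is the gap this request is about.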

Matching binary package IDs is central to Conan's functionality, and this situation comes up frequently enough that it feels like a big hole in the UX overall.

The suggestion in the aforementioned ticket is a sound strategy for providing a new search feature that improves the UX around this. From a search perspective, each binary is effectively describable by its matrix of options, settings, and dependencies (pre-hash). The suggested conan search feature is effectively a matrix comparison between the requested but unhashed values and the unhashed values of the binaries in the available repositories. Conan would then need to use some sort of scoring/distance/best match algorithm and present to the user a list of results in order of "best score/best match" first. For each result, the output should display the "DIFF" with the requested binary so users can easily see what's different.
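A rough Python sketch of the "DIFF" idea, under the assumption that both the requested configuration and each available binary can be represented as plain dictionaries with `settings`, `options`, and `requires` sections. That data model and the function name `diff_binary` are hypothetical, not Conan's internal representation.

```python
def diff_binary(requested, available):
    """Return {section: {key: (requested_value, available_value)}} for mismatches.

    Assumes each configuration is a dict of dicts (hypothetical model).
    A key missing on one side shows up with None on that side.
    """
    diff = {}
    for section in ("settings", "options", "requires"):
        wanted = requested.get(section, {})
        found = available.get(section, {})
        mismatches = {
            key: (wanted.get(key), found.get(key))
            for key in sorted(set(wanted) | set(found))
            if wanted.get(key) != found.get(key)
        }
        if mismatches:
            diff[section] = mismatches
    return diff


# Made-up example data: the only difference is the `shared` option.
requested = {
    "settings": {"os": "Linux", "arch": "x86_64", "build_type": "Release"},
    "options": {"shared": "False"},
    "requires": {"zlib": "1.2.11"},
}
available = {
    "settings": {"os": "Linux", "arch": "x86_64", "build_type": "Release"},
    "options": {"shared": "True"},
    "requires": {"zlib": "1.2.11"},
}

result = diff_binary(requested, available)
for section, mismatches in result.items():
    for key, (want, have) in mismatches.items():
        print(f"{section}.{key}: requested {want!r}, available {have!r}")
```

This compares requirement versions by exact equality, which sidesteps the harder question of how package ID modes treat compatible versions.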

Logically, this is reasonable. However, given how the Conan dependency graph is calculated, how dynamic options and settings are reconciled, and how everything is then factored into the package_id, the implementation of this might be very difficult. In any case, I think it's fair to say that "best match" search is reasonable, and would be very useful here. I will try to do some research into established "matrix comparison" methods with best match and diff functionality.

@fulara
Contributor

fulara commented Mar 31, 2020

Hello @solvingj, that sums it up nicely, yes. As I said further in my comment, #6364 (comment), I was thinking at the time that I would raise a request similar to this one, but I had a change of heart. Let me elaborate.

Short story long:
For this feature to provide the necessary info, the output of conan install would have to be fed directly into conan search, or into whatever command this would be if not conan search.

Because to do this matching you need to know exactly what sort of package you are searching for, as the specified package layout will be affected by the upstream dependencies for it.
So, upon failing to find a prebuilt binary, conan install prints these:
cant find a .. settings: "...", "options: "...", requires: "...", depends: "..."
Only by taking all four of these and applying them to the output of conan search would you be able to do any matching.

I think this is more or less possible with the addition of #6700 in the next release of Conan, as previously the output of install was not complete.

What I am planning to do without any Conan modification, and this is probably what would have to be implemented for this request, is to:

Extract the set of classifiers that were expected for the particular library, so these:
.. settings: "...", "options: "...", requires: "...", depends: "..."
Then invoke conan search reference, which will give you all the prebuilt binaries of the particular dependency whose prebuilt binary is missing.
Mind you that there is a bug(?) in conan search where it prints full_requires in place of requires (https://github.com/conan-io/conan/blame/8f5e997660bd265789f48bd26c54797f5f74067f/conans/model/info.py#L469), but it doesn't really matter for this case, as we need the full requires.

Then, having the classifier values from conan install, you can apply them against every result of conan search reference and, by process of elimination, find the closest match. Or just highlight the differences: the closest match is not needed, the offending lines are.
The biggest problem here would actually be that one has to reverse engineer how Conan matches the versioning schema (the requires part against full_requires) to decide whether a prebuilt binary matches.

Since I think this is doable starting with Conan 1.24.0, I decided not to raise a feature request at this time and to try to implement this ourselves. Possibly after we have done this I will come back to the community.

If I am correct, then I don't think this would be that difficult.

@memsharded
Member

The way I envision a possible approach:

  • Getting the output of the missing binary's settings, options, and requirements, feeding them into a search, and trying to provide some automatic classification or distance criteria can become something that cannot satisfy different cases, and we would be requested to add more and more intelligence and different cases into the logic or "distance" computation.
  • On the other hand, we humans are very good at debugging, if we are given some good tools. So I think it is better to focus on the tooling rather than to try to give a full solution.
  • One of the complexities comes from the existence of many different binaries or existing configurations. If you only have 1 binary, built for 1 platform, then the problem becomes trivial. So the problem that we are facing here is the inspection and comparison of those many different binaries.
  • The html output of the conan search command was an effort in this direction, to visualize the binaries we have and help us debug what is happening.
  • I think we can improve and extend this html output. So you have for example checkboxes for settings and options, and the columns and rows are highlighted or filtered as those checkboxes are selected. For requirements, that could be a zoom matrix over existing cells, in which, again checkboxes for versions of dependencies appear (this seems more manageable than a 3D matrix).
  • I think this tool can be very effective and helpful to help users identify the reasons, while being very robust for all different use cases of settings and options and requires, and no smart logic needs to be iterated. Furthermore, it doesn't require the command line to be modified or extended with different parameters to learn, and usage is quite self-explanatory.

What do you think?

@solvingj
Contributor Author

solvingj commented Apr 2, 2020

RE: comments from @fulara ..
There's a really good point here that I have thought about many times but forgot about when creating this ticket. I've often wanted conan search to take all the same parameters as conan create and conan install. I want to pass a profile and all settings, etc. However, this was problematic, and it seems likely that the reason was that conan search was overloaded and doing too many things. Now that we talk about breaking up the search of recipes and binaries into different flows, this seems much more possible. In summary, I hope we can consider a new command just like you said:

conan search reference <path or package_ref> <options/settings/profiles/etc>

I agree this is a desirable building block. I would probably name it conan search binary, but let's go with reference for this discussion.

Also, "Best Match" isn't actually a good description of what I wanted from the feature. Rather, the crucial part of what I wanted was for the list of binaries to be sorted by the "number of diffs". "Best matches" (binaries with only 1 setting or option difference) would naturally be at the top (if they existed). From my usage, it's almost always the case that I'm building with one option different from what was prebuilt. I should add that there might actually be binaries with NO differences in settings/options, but only in the direct dependency list or versions. I would consider these better matches, and list them above the binaries with 1 difference.
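The ordering described above could be sketched in Python as a sort key that counts settings/options differences first and dependency differences second. The dictionary layout, the package IDs, and the function names `count_diffs` and `rank_binaries` are all assumptions made for illustration; this is not how Conan stores or ranks binaries.

```python
def count_diffs(requested, available, section):
    """Count keys in one section whose values differ (missing keys count too)."""
    wanted = requested.get(section, {})
    found = available.get(section, {})
    return sum(
        1 for key in set(wanted) | set(found)
        if wanted.get(key) != found.get(key)
    )


def rank_binaries(requested, binaries):
    """Order package IDs by fewest settings/options diffs, then requires diffs."""
    def sort_key(item):
        package_id, config = item
        config_diffs = (
            count_diffs(requested, config, "settings")
            + count_diffs(requested, config, "options")
        )
        requires_diffs = count_diffs(requested, config, "requires")
        # package_id is only a tie-breaker to keep the order deterministic.
        return (config_diffs, requires_diffs, package_id)

    return [package_id for package_id, _ in sorted(binaries.items(), key=sort_key)]


# Made-up request and candidate binaries with placeholder package IDs.
requested = {
    "settings": {"os": "Linux", "build_type": "Release"},
    "options": {"shared": "False"},
    "requires": {"zlib": "1.2.11"},
}
binaries = {
    "aaa": {  # one option differs
        "settings": {"os": "Linux", "build_type": "Release"},
        "options": {"shared": "True"},
        "requires": {"zlib": "1.2.11"},
    },
    "bbb": {  # same settings/options, different dependency version
        "settings": {"os": "Linux", "build_type": "Release"},
        "options": {"shared": "False"},
        "requires": {"zlib": "1.2.8"},
    },
    "ccc": {  # one setting and one option differ
        "settings": {"os": "Linux", "build_type": "Debug"},
        "options": {"shared": "True"},
        "requires": {"zlib": "1.2.11"},
    },
}

ranking = rank_binaries(requested, binaries)
print(ranking)
```

Because the key is a tuple, a binary that differs only in its dependencies always sorts above one with a settings or options difference, without needing any weights.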

Finally, I agree with the additional convenience feature of automatically running this search and printing the results upon a failed conan install/conan create. I would suggest a global opt-in config to enable this "auto-search-post-fail" behavior.

RE: comments from @memsharded

I think maybe it seems overwhelming and difficult to imagine because we've been talking about an abstract concept of "distance", which leaves a lot of details open to interpretation and implementation. There is lots of room for debate about preferences of weight among the various settings and options, etc. In reality, I don't think we want something that abstract. @fulara and I seem to have come to the same/similar/compatible conclusions about what a practical UX and reasonable implementation would look like. If there's no barrier to the conan search reference command he suggested, which identifies and highlights diffs, then I think the idea to sort by "least number of differences" is pretty obvious.

So I guess the question is: have the roadmap/discussions for the future of search command(s) included anything like the suggested conan search reference so far?

@memsharded
Member

Implemented in #14694


4 participants