cargo publish should warn on invalid categories/keywords #4300

Closed
mqudsi opened this issue Jul 19, 2017 · 7 comments
Labels
A-diagnostics Area: Error and warning messages generated by Cargo itself. A-interacts-with-crates.io Area: interaction with registries C-enhancement Category: enhancement Command-publish

Comments

@mqudsi

mqudsi commented Jul 19, 2017

Currently the user experience for publishing crates that have invalid or non-conformant keywords is less than ideal.

  • The crate is compiled before it is uploaded, and the errors only come back from the server afterwards, so every attempt requires a full recompile.
  • The error for keywords containing spaces comes from the server and is not very clear: error: api errors: invalid upload request: invalid value: string "static compress", expected a valid keyword specifier at line 1 column 8372
  • The error for too many keywords is unnecessarily technical: error: api errors: invalid upload request: invalid length 9, expected at most 5 keywords per crate at line 1 column 8440

cargo publish should probably do some basic linting on the keywords field of Cargo.toml and emit more user-friendly errors than it does today. The errors returned by the server are useful for someone hacking away at cargo, not for a developer using cargo: they refer to the payload generated by the cargo publish command and sent to the server, so those line and column numbers have no meaning to the end user.
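
To make the request concrete, here is a minimal sketch of the kind of local lint described above. It only covers the two failures quoted in this issue (more than 5 keywords, and keywords containing spaces). The function name lint_keywords and the message wording are hypothetical and not part of Cargo; the limit of 5 is taken from the server error quoted above.

```rust
/// Limit taken from the server error quoted in this issue
/// ("expected at most 5 keywords per crate").
const MAX_KEYWORDS: usize = 5;

/// Hypothetical helper (not a Cargo API): returns human-readable problems
/// found in `keywords`, or an empty vector if none were found.
fn lint_keywords(keywords: &[&str]) -> Vec<String> {
    let mut problems = Vec::new();

    // Check the count first so the message names the manifest field,
    // not a byte offset in the upload payload.
    if keywords.len() > MAX_KEYWORDS {
        problems.push(format!(
            "too many keywords: found {}, but at most {} are allowed",
            keywords.len(),
            MAX_KEYWORDS
        ));
    }

    // Flag any keyword that contains whitespace.
    for kw in keywords {
        if kw.chars().any(char::is_whitespace) {
            problems.push(format!(
                "invalid keyword {:?}: keywords may not contain spaces",
                kw
            ));
        }
    }

    problems
}

fn main() {
    // Example input: six keywords, one of them containing a space.
    let keywords = ["static compress", "zip", "gzip", "lz4", "zstd", "brotli"];
    for problem in lint_keywords(&keywords) {
        eprintln!("warning: {}", problem);
    }
}
```

Running a check like this before compilation would report both problems at once, without waiting for a build and a round trip to the server.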

@behnam
Contributor

behnam commented Aug 5, 2017

I think even cargo check and cargo package should do so, because otherwise the problem only surfaces after all the config has been committed to the repository and publishing is in progress.

@carols10cents carols10cents added A-diagnostics Area: Error and warning messages generated by Cargo itself. A-interacts-with-crates.io Area: interaction with registries C-enhancement Category: enhancement Command-publish labels Aug 28, 2017
@mattgathu
Contributor

Hi @carols10cents, I would be happy to help fix this issue. Is there anyone who can mentor me on what needs to be done?

@damon-myers

damon-myers commented May 12, 2019

@behnam @mattgathu @carols10cents 👋 Hello from CodeTriage!

This is still a problem in version 1.33.0 (f099fe94b 2019-02-12).

We could fix this issue by duplicating the validation logic from crates.io:src/models/keyword.rs in cargo check/package/publish. Does that seem like a valid solution?

@behnam
Contributor

behnam commented May 13, 2019

Thanks, @mattgathu and @damon-myers for following up.

From what I understood and remember, the problem is that the range of some values, such as package keywords, is defined by the package registry, and may in fact differ from one to another.

That, and the introduction/existence of private registries and their possible reliance on other keyword value rules, makes me wonder if there's anything we can do here that would be scalable.

One option that comes to my mind is to special-case crates.io as a core package registry, and maintain a copy of some of its policies, like the format of keyword values.

Another option could be to expect the registry to either have an API for this (which cargo package and such can use per call), or have an API to give us a rule, like a list of values or a RegEx pattern, to be cached and used for verifying the related package attributes.

I haven't been active with Cargo for a while, so these suggestions may be a bit off, or some of it may already exist. I leave it for the leads to clarify and make actionable suggestions.
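
As a rough illustration of the "registry gives us a rule" option, the sketch below models a per-registry keyword policy as plain data that could be fetched once, cached, and applied locally. Everything here is hypothetical: KeywordPolicy, KeywordProblem, and crates_io_like do not exist in Cargo or in any registry API. Only the limit of 5 keywords comes from this thread; the maximum length of 20 and the allowed punctuation are assumed placeholder values.

```rust
/// Hypothetical per-registry keyword policy; no such type exists in Cargo.
#[derive(Debug, Clone)]
struct KeywordPolicy {
    max_keywords: usize,
    max_len: usize,
    /// Punctuation allowed in addition to ASCII letters and digits.
    extra_chars: Vec<char>,
}

/// Hypothetical problem kinds a local check could report.
#[derive(Debug, PartialEq)]
enum KeywordProblem {
    TooMany(usize),
    TooLong(String, usize),
    Invalid(String),
}

impl KeywordPolicy {
    /// Placeholder values for a crates.io-like registry.
    /// Only `max_keywords: 5` is confirmed by this thread; `max_len` and
    /// `extra_chars` are assumptions made for the sake of the example.
    fn crates_io_like() -> Self {
        KeywordPolicy {
            max_keywords: 5,
            max_len: 20,
            extra_chars: vec!['_', '-'],
        }
    }

    /// Checks a keyword list against this policy and returns every problem found.
    fn check(&self, keywords: &[&str]) -> Vec<KeywordProblem> {
        let mut problems = Vec::new();
        if keywords.len() > self.max_keywords {
            problems.push(KeywordProblem::TooMany(keywords.len()));
        }
        for kw in keywords {
            // A keyword is accepted if it is non-empty and uses only
            // ASCII letters, digits, and the policy's extra characters.
            let chars_ok = !kw.is_empty()
                && kw
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || self.extra_chars.contains(&c));
            if !chars_ok {
                problems.push(KeywordProblem::Invalid(kw.to_string()));
            }
            let len = kw.chars().count();
            if len > self.max_len {
                problems.push(KeywordProblem::TooLong(kw.to_string(), len));
            }
        }
        problems
    }
}

fn main() {
    let policy = KeywordPolicy::crates_io_like();
    for problem in policy.check(&["Session Description Protocol", "sdp"]) {
        println!("{:?}", problem);
    }
}
```

A private registry with different rules would simply supply different field values, so the checking code itself would not need to special-case crates.io.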

@ehuss
Contributor

ehuss commented May 20, 2019

I think we intend to share some of the validation somehow.

Out of curiosity, I scanned crates.io for any crates with keywords that wouldn't pass validation today. I found some; presumably crates.io validation has changed over time.

keyword errors
"krust/krust-0.0.1/Cargo.toml" too many keywords: 6
"apply_pub/apply_pub-0.0.2/Cargo.toml" too many keywords: 6
"google-geo/google-geo-0.1.0/Cargo.toml" invalid keyword: "location infomation"
"sntp_client/sntp_client-1.2.0/Cargo.toml" invalid keyword: "command line"
"shm/shm-0.1.0/Cargo.toml" invalid keyword: "shared memory"
"owned-fd/owned-fd-0.1.0/Cargo.toml" invalid keyword: "file descriptor"
"scell/scell-1.0.0/Cargo.toml" invalid keyword: "smart cell"
"scell/scell-1.0.0/Cargo.toml" too many keywords: 11
"nl80211rs/nl80211rs-0.1.0/Cargo.toml" invalid keyword: "nl80211.h"
"dumbmath/dumbmath-0.2.2/Cargo.toml" too many keywords: 7
"routeros_rust/routeros_rust-0.0.21/Cargo.toml" invalid keyword: "Router Os"
"routeros_rust/routeros_rust-0.0.21/Cargo.toml" invalid keyword: "Router Os API"
"strtod/strtod-0.0.1/Cargo.toml" invalid keyword: "floating point"
"pairing-heap/pairing-heap-0.1.0/Cargo.toml" invalid keyword: "priority queue"
"comcart/comcart-0.1.0/Cargo.toml" invalid keyword: "common cartridge"
"packagemerge/packagemerge-0.1.0/Cargo.toml" too many keywords: 8
"alpaca/alpaca-0.1.0/Cargo.toml" invalid keyword: "Variant Calling"
"ithos/ithos-0.0.0/Cargo.toml" invalid keyword: "access control"
"cryptosphere/cryptosphere-0.0.0/Cargo.toml" too many keywords: 6
"snzip/snzip-0.1.0/Cargo.toml" too many keywords: 9
"meta_diff/meta_diff-0.0.1/Cargo.toml" invalid keyword: "machine learning"
"rustyham/rustyham-0.0.1/Cargo.toml" invalid keyword: "hamming code"
"carto/carto-0.1.0/Cargo.toml" invalid keyword: "text editor"
"jwk/jwk-0.1.0/Cargo.toml" invalid keyword: "RFC 7517"
"fixed_circular_buffer/fixed_circular_buffer-0.2.2/Cargo.toml" too many keywords: 7
"nickel_macros/nickel_macros-0.1.0/Cargo.toml" invalid keyword: "web server"
"message-format/message-format-0.0.1/Cargo.toml" too many keywords: 8
"nailgun/nailgun-0.1.0/Cargo.toml" too many keywords: 8
"tagua-parser/tagua-parser-0.1.0/Cargo.toml" invalid keyword: "virtual machine"
"tagua-parser/tagua-parser-0.1.0/Cargo.toml" too many keywords: 6
"kalman/kalman-0.0.0/Cargo.toml" invalid keyword: "Kálmán filter"
"sdp/sdp-0.1.0/Cargo.toml" invalid keyword: "Session Description Protocol"
"sdp/sdp-0.1.0/Cargo.toml" keyword too long (28): "Session Description Protocol"
"way-cooler-ipc/way-cooler-ipc-0.0.0/Cargo.toml" too many keywords: 6
"ghlabel/ghlabel-0.1.0/Cargo.toml" invalid keyword: "github issues"
"ghlabel/ghlabel-0.1.0/Cargo.toml" invalid keyword: "pull requests"
"netcdf/netcdf-0.1.0/Cargo.toml" too many keywords: 7
"humanity/humanity-0.1.0/Cargo.toml" invalid keyword: "humans.txt"
"switchboard/switchboard-0.1.0/Cargo.toml" invalid keyword: "state machine"
"smbclient-sys/smbclient-sys-0.1.0/Cargo.toml" too many keywords: 6
"has/has-0.1.0/Cargo.toml" invalid keyword: "has a"
"boehm_gc/boehm_gc-0.0.1/Cargo.toml" invalid keyword: "garbage collector"
"orc/orc-0.0.1/Cargo.toml" invalid keyword: "garbage collector"
"orc/orc-0.0.1/Cargo.toml" invalid keyword: "reference counting"
"ikura/ikura-0.0.1/Cargo.toml" invalid keyword: ""
"tg-labstatus/tg-labstatus-0.1.0/Cargo.toml" invalid keyword: "openlab augsburg"
"dual_quaternion/dual_quaternion-0.1.0/Cargo.toml" invalid keyword: "dual quaternion"
"i18n/i18n-0.0.1/Cargo.toml" too many keywords: 11
"bytereader/bytereader-0.1.0/Cargo.toml" too many categories: 7
"network-constants/network-constants-0.0.1/Cargo.toml" invalid keyword: "tcp/ip"
"network-constants/network-constants-0.0.1/Cargo.toml" too many keywords: 8
"rust-gm-paillier/rust-gm-paillier-0.1.0/Cargo.toml" too many keywords: 6
"findup/findup-0.1.0/Cargo.toml" too many keywords: 6
"swagger_to_md/swagger_to_md-1.0.0/Cargo.toml" too many keywords: 6
"currency/currency-0.4.0/Cargo.toml" too many keywords: 12
"hashindexed/hashindexed-0.1.1/Cargo.toml" too many keywords: 7
"checked_int_cast/checked_int_cast-1.0.0/Cargo.toml" too many keywords: 6
"uwp/uwp-0.0.0/Cargo.toml" invalid keyword: "Universal Windows Platform"
"uwp/uwp-0.0.0/Cargo.toml" keyword too long (26): "Universal Windows Platform"
"libmultilog/libmultilog-0.1.0/Cargo.toml" too many keywords: 7
"silverknife-pangocairo-sys/silverknife-pangocairo-sys-0.1.0/Cargo.toml" too many keywords: 8
"ipecho/ipecho-0.0.1/Cargo.toml" invalid keyword: "public ip"

I personally wouldn't be opposed to always imposing these restrictions regardless of the registry. Also, as noted in #4377, these would probably be warnings.

@MingweiSamuel

Keyword format doesn't seem to be documented anywhere? I had to read the source code linked above to figure out what makes a keyword valid.

@epage epage changed the title cargo publish should validate keywords cargo publish should validate categories Oct 14, 2023
@epage epage changed the title cargo publish should validate categories cargo publish should validate categories/keywords Oct 14, 2023
@epage epage changed the title cargo publish should validate categories/keywords cargo publish should warn on invalid categories/keywords Oct 14, 2023
@epage
Contributor

epage commented Oct 17, 2023

Closing in favor of #4377


8 participants