Skip to content
This repository has been archived by the owner on Apr 18, 2018. It is now read-only.

Policy on Edge Case Breaking Changes #37

Closed
cjihrig opened this issue Apr 9, 2015 · 38 comments
Closed

Policy on Edge Case Breaking Changes #37

cjihrig opened this issue Apr 9, 2015 · 38 comments

Comments

@cjihrig
Copy link
Contributor

cjihrig commented Apr 9, 2015

It would be nice to have an official stance on breaking changes as they apply to unspecified/undocumented/internal/deprecated features. This has come up a few times in io.js, and has been handled differently on a case by case basis.

@jasnell
Copy link
Member

jasnell commented Apr 9, 2015

For reference, @cjihrig do you have an example case to point to? And possibly a preference for how you'd like to see it handled?

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 9, 2015

nodejs/node#1215 was caused due to reliance on undocumented usage of path.dirname().

More recently, nodejs/node#1356 was opened because of a breaking change with NODE_PATH. This may or may not have been officially deprecated, but it illustrates the point.

@jasnell jasnell mentioned this issue Apr 9, 2015
@jasnell
Copy link
Member

jasnell commented Apr 9, 2015

Note that the document currently does circle around on this. Specifically:

Node.js implements a number of default values and assumed behaviors. 
Some of these may be intended for module and application developers to 
take advantage of while others may not. As a general rule, if a given default 
value or assumed behavior is documented within the end-user API 
documentation, it should be considered part of the public API. When modifying 
such defaults, Collaborators should use their best judgement to determine 
whether any given change is "technically backwards incompatible but in
 practice should not be".

This obviously stops short of defining a strict policy, however.

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 9, 2015

This also ties in with the "how to handle reverts" discussion from the call today.

@jasnell
Copy link
Member

jasnell commented Apr 9, 2015

+1, I've opened an issue specifically with regards to the reversion policy. Input on that would be greatly appreciated :-)

@chrisdickinson
Copy link

What about:

The set of types, methods, and result types form the implicit public API. A subset of these are documented, and form the explicit public API. Code may rely on the explicit public API and trust it to never change. Code relying on the implicit public API is subject to change. To change the implicit public API, first a deprecation for that combination of types and result-types should be added in the current major line. Then, as soon as the next major release, the API may start returning a different outcome.

@jasnell
Copy link
Member

jasnell commented Apr 9, 2015

@chrisdickinson ... PR? 👍 :-)

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 9, 2015

That's fine for cases like using an underscored property or method. I don't think it will work in other cases, where the user might rely on some undocumented behavior that core does not use or test for.

@jasnell
Copy link
Member

jasnell commented Apr 9, 2015

@cjihrig ... do you have some suggested language or proposed approach you'd like to see?

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 9, 2015

Maybe something like this, in conjunction with what Chris wrote:

The behavior of the public API outside of its documented use should not be relied on, and is subject to change without warning.

@chrisdickinson
Copy link

@cjihrig I kind of disagree with that. Exhaustively documenting type signatures is kind of out of scope at present. I lean towards letting people know that if they're using types that the existing docs don't mention that there'll be a chance they have to change their code, and that we'll follow a stated process when we do so.

@chrisdickinson
Copy link

I think I should clarify the "set of" bit some:

A public method is defined as a set of input types and a set of outcomes given those types. Outcomes may be return values of certain types, or exceptions. The public API is comprised of the set of all API methods combined with all types and outcomes. The public API is implicitly defined. A subset of this public API is explicitly defined by the documentation. The explicit API should be relied upon when possible – its type signature will not change across releases. The implicit API, by contrast, may change across releases. Changing the implicit API by changing the outcome of a given set of input types for a method requires a deprecation process. The change may happen as soon as one major release after deprecation is added.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

I believe "outcomes" could also consist of assumed behaviors... such as the order in which event callbacks are triggered or the assumed side effects of a particular method. If these are documented, those behaviors should be considered part of the explicit API.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

Also note: the dev policy does not currently outline the deprecation process for any given item. It should at least lay the groundwork for such.

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 10, 2015

I disagree. I haven't thoroughly checked the docs, but I know that at least a number of the modules already document the expected types. We also consistently run into problems like nodejs/node-v0.x-archive#8618 because any data type can be passed in. I'm not even suggesting that we add typeof checks to every function in core - just that we should be able to without a major version bump.

cc: @misterdjules and @tjfontaine - I recall both of you being in favor of throwing TypeErrors in the mentioned bug.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

Perhaps something like:

The Node.js Core Library API, Application Binary Interface and Binary Abstraction layer each expose methods, events, behaviors and default values that together comprise the implicit Node.js "Public API". This "Public API" is used directly by application and module developers and changes will have direct impact on the Community.

Methods exposed by the API can have either strongly or weakly types input parameters and return values. Methods may also throw any number of exceptions or trigger a certain sequence of events or side effects. When such types, exceptions, events or side-effects are documented, they become part of the Node.js Explicit API.

Likewise, when the ordering of events triggered by Node.js, the default values of optional input parameters, or the type and structure of return values and event payloads are documented, they become part of the Explicit API.

Undocumented input parameter types, return values, exceptions, events and side effects are considered to be part of the Implicit API.

The ecosystem of application and module developers must be able to rely on the stability of the Explicit API as much as possible. This means that backwards incompatible changes to the Explicit API must follow a clear deprecation process in which the existing documented behavior is clearly marked for deprecation and can change as soon as the next semver-major version increase. The Implicit API, on the other hand, may change at any point from one release to another given appropriate review.

That said, decisions to change undocumented Implicit APIs should not be made lightly as there is no reliable means of determining in advance the impact such a change could have on existing applications and modules. Collaborators should use their best judgement to determine whether any given change is "technically backwards incompatible but in practice should not be", then approach landing the change accordingly.

@mikeal
Copy link

mikeal commented Apr 10, 2015

I've noticed that in both node.js and io.js we've been moving towards policies that try to strictly define decisions that the people involved make as judgement calls. The reasoning is that we've made some mistakes and so we should re-scope our decision making to avoid being put back in that position.

Alternatively I think it would be better if we left decisions like this the way they are, deferring to the better judgement of the people involved and trust in their abilities to make these decisions better if we give them better information to make these decisions.

Recently @chrisdickinson has been working on tools that parse though popular modules, or potentially every module, in npm and tell us how many of them use a certain API. If we can build on this we can continue to feed a lot more information in to the TSC to help them make these tough calls.

Right now we're making a lot of assumptions about how many people use a particular API in a particular way and when we're wrong and we make changes assuming their effect will be small bad things happen. But if we continue down this road of restricting what kinds of judgement calls we can make we might strangle the projects ability to make good decisions once we have better information.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

I generally don't see this as much a restriction on the kind of changes or
decisions that can be made as it is "simply" a process of vetting them.
Pre-release changes are easy. Post-release changes are not. Even if we had
perfect information about about extensively something is used, it's still
worthwhile to err on the side of caution.
On Apr 9, 2015 5:56 PM, "Mikeal Rogers" notifications@github.com wrote:

I've noticed that in both node.js and io.js we've been moving towards
policies that try to strictly define decisions that the people involved
make as judgement calls. The reasoning is that we've made some mistakes and
so we should re-scope our decision making to avoid being put back in that
position.

Alternatively I think it would be better if we left decisions like this
the way they are, deferring to the better judgement of the people involved
and trust in their abilities to make these decisions better if we give
them better information to make these decisions.

Recently @chrisdickinson https://github.com/chrisdickinson has been
working on tools that parse though popular modules, or potentially every
module, in npm and tell us how many of them use a certain API. If we can
build on this we can continue to feed a lot more information in to the TSC
to help them make these tough calls.

Right now we're making a lot of assumptions about how many people use a
particular API in a particular way and when we're wrong and we make changes
assuming their effect will be small bad things happen. But if we
continue down this road of restricting what kinds of judgement calls we can
make we might strangle the projects ability to make good decisions once we
have better information.


Reply to this email directly or view it on GitHub
#37 (comment).

@mikeal
Copy link

mikeal commented Apr 10, 2015

Pre-release changes are easy. Post-release changes are not.

Good point.

@chrisdickinson
Copy link

The explicit/implicit API split gives us a good basis to make decisions from and build tooling around. I'd lock it down a bit further than your proposal, @jasnell: the only future for type changes in explicit APIs is removal by wholesale deprecation. Implicit API changes that introduce new exceptions or change event order should follow a deprecation plan (1+ major versions worth of deprecation.) Other (behavioral) implicit API changes may change from release to release, though we should rely heavily on tooling to make sure the changes are as innocuous as we think they are.

Re: the tooling. My goals are:

  1. Automatic warning when a PR contains an exceptional change, to forewarn situations like the path.resolve() breakage.
  2. Ability to track the effects of proposed deprecation – both before we deprecate an API as well as while the deprecation is ongoing, to track the potential breakage before removal.
    1. This is the reason for the kind-of-weasel-worded "implicit APIs may change as soon as one major version after deprecation clause" – my hope is that we'll be able to track the progress of deprecation using this tool.
  3. Ability to compare/contrast how the npm ecosystem is using our APIs versus what surface area we are currently testing, with regards to types.

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 10, 2015

This is a bit off topic, but I question how effective tools are going to be at statically analyzing these types of things. There are a number of edge cases just to determine if a function is called - eval(), vm methods, new Function(), reassigning a function to a variable. The problem of determining the arguments of the function is even harder - apply() when you don't know the length of the array, the types of individual arguments (is it an argument being passed in from the outside world, etc.). JavaScript's typing makes this problem even harder. I hope I'm wrong, but it seems like tools will be only moderately helpful.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

I'm less optimistic about the tools. Just prior to Node Summit I asked a colleague to run a static analysis of the most depended on npm modules to see what the breakdown was in the use of core APIs. While the numbers proved interesting, they were only of partial value because just knowing which APIs are used isn't telling us much about how they're being used or how extensively things might break if we happened to change any one of them. Yes tooling can help but it would be very dangerous to rely on it too much at this point (especially if it's not even fully developed yet ;-) ) For the sake this initial dev-policy document we need to rely on a policy and good sense driven approach and iterate later if/when the tools improve.

@chrisdickinson ... I'm not sure I completely follow what you mean by "the only future for type changes in explicit APIs is removal by wholesale deprecation."

One thing we should also factor in here is the fact that there is a difference between a "Release" and an "LTS Release". Every Major.Minor would be a release, yes, but it will be the LTS branches that we'll need to be the most sensitive to. If we know a behavior/type/event/default needs to change from one LTS to another, it needs to be captured within the LTS Roadmap well in advance of cutting that new LTS release.

@arb
Copy link

arb commented Apr 10, 2015

Along the same lines, all JS projects are rife with _foo style properties that we as a community know to mean "stay out". But users don't always. They end up using undocumented and unsupported APIs. Is the suggestion that renaming or moving these private attributes would also require a major bump?

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

@arb fortunately no. Things like _foo would be considered either part of the Implicit API or (better yet) internal implementation. If developers Do Silly Things(tm) and use the _foo properties, then they ought to expect to have things get broken.

@Fishrock123
Copy link
Contributor

Note: There are some _apis that are so widely used, removing them would cause lots of code to break.

I'm looking at _headers here..

@arb
Copy link

arb commented Apr 10, 2015

@jasnell so why not apply that some rational to undocumented functions? Why treat _foo one way, but undocumented APIs differently?

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 10, 2015

@Fishrock123 Node 2.0 with a fresh policy on that stuff would be a great place to start ;-)

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

@Fishrock123 .. yup, I know. I've made extensive use of _headers myself ;-) A large part of the intent here for this document is to capture what should happen as opposed to what's actually happening. Once we agree on the should part, we can work on guiding developers towards better practices in general.

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

I'll be submitting a PR in just a minute with some proposed language on this.

jasnell added a commit that referenced this issue Apr 10, 2015
re: #37

Expanded language around
@chrisdickinson
Copy link

The problem with the _foo properties is that developers will go where the functionality and information is, regardless of whether it's "implicit." Even Django, which is in a language community with a much longer tradition of using _-prefixed properties + the "we're all adults here" policy, had to eventually standardize their Model _meta property as a supported, public API. The point here is that if we don't want the public to use something, we must use Symbols or WeakMaps to hide that data. All _-prefixed APIs are fair game for users – they're part of the implicit API, and we can't change that in a point release without potentially breaking someone's code.

Re: tooling, I've been impressed by the quality of results thus far. We don't need to catch 100% of usage to still get actionable data. The static analyzer is functionally complete (though, it could always get better), and gathers usage context (exception targets, argument types, etc.) The missing part is wiring up the results of the tooling to search, and getting access to a complete npm clone. The goal is not to use static analysis by itself to make our decisions, it's to use it to pare down the number of modules we potentially need to test against a given PR so that running ecosystem tests against individual PRs becomes viable.

@Fishrock123
Copy link
Contributor

and getting access to a complete npm clone.

Maybe we should see if npm would be able to provide something for that purpose?

@jasnell
Copy link
Member

jasnell commented Apr 10, 2015

Seeing stuff like https://github.com/chrisdickinson/estoc just makes me happy.

@garthk
Copy link

garthk commented Apr 12, 2015

Would users participate in opt-in data gathering re implicit API usage if, in exchange, they got advance notice that particular code tickled Node APIs that might change? It could be as simple as dumping a stack trace and issue URL once per run to whichever file NODE_API_WARNINGS points to.

I'd want to examine the deprecation API before zooming out to derive a more general warning and gathering system module authors could use.

@garthk
Copy link

garthk commented Apr 12, 2015

@jasnell: nice. I was thinking more

  • module authors warning users they're using their API in a way they might want to reconsider; plus

  • tools helping users warn module authors they're thinking of changing the API in a way they might want to reconsider

    estoc hints at another use case:

  • module authors warning users about problems in some other module's API, blah blah reconsider

… which might depend either on static analysis, runtime hooks, or both. I'm not sure I'd go there first, though. Finding out whether anyone might depend on releasing Zalgo strikes me as an easy way to get a quick benefit.

@geek
Copy link
Member

geek commented Apr 13, 2015

@chrisdickinson if we can hide the unsupported parts then thats the better way to go. That being said, undocumented and underscore prefixed parts should be allowed to change freely without requiring major version changes. Are there other node/io projects that do major version changes for any _ property change?

@cjihrig
Copy link
Contributor Author

cjihrig commented Apr 13, 2015

I don't want to keep poking at this issue, but it's also worth noting that Node/io.js don't automagically update versions like npm packages are known to do. Before upgrading the platform, people should be testing their code. If you make it the policy that starting at 2.0, undocumented APIs can change, then that problem should go away

@Fishrock123
Copy link
Contributor

That being said, undocumented and underscore prefixed parts should be allowed to change freely without requiring major version changes.

Should be able to yes. In practice with the current state of things, not so much.

@jasnell
Copy link
Member

jasnell commented Apr 13, 2015

In practice users will always find new and creative ways of doing things the developer's feel they really ought not be doing. Things like internal modules will help, and any way we can increase the isolation between internal and external things will be good. However, there's only going to be so much we can do and we'll have to accept that at least some breakage will happen.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants