Skip to content
This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

Move to TOML for manifest and lock #119

Closed
sdboyer opened this issue Jan 23, 2017 · 119 comments · Fixed by #342
Closed

Move to TOML for manifest and lock #119

sdboyer opened this issue Jan 23, 2017 · 119 comments · Fixed by #342

Comments

@sdboyer
Copy link
Member

sdboyer commented Jan 23, 2017

DECISION IS MADE - we'll be using TOML.


At the time dep went public, the committee was generally in agreement that we would prefer to use TOML or YAML over JSON for the manifest and lock files. There are a few reasons for this, but the primary one is that those formats, unlike JSON, allow users to include arbitrary comments. Because the manifest is the locus of dependency decisionmaking for teams, such comments can add a lot of value:

# we pinned this to a rev because of <link to bug>. remember to unpin when that's fixed!

However, in looking through the existing TOML and YAML implementations in Go, we found that none of them actually support preserving arbitrary comments when regenerating the files. dep is constantly rewriting the manifest and lock files; without support for comment preservation, each rewrite would blow away any user comments. So, without this feature, TOML and YAML lose much of their advantage over JSON.

There are other arguments for using TOML or YAML, and it is possible that the we may still elect to use them even without support for comment preservation. However, we are putting a call out - if anyone feels particularly itchy at the thought of using JSON for these files, then please, contribute comment preservation support to an existing implementation, and we'll make the switch! The deadline for this change is the relevant milestone, after which the format and structure of the manifest and lock must be stable.

Alternatively, if we've missed an implementation of TOML or YAML that does support comment preservation, or there's some other crafty way around this problem, please tell us here!


UPDATE: we've decided to generally not have the tool update the manifest, which relieves us of the need for a comment-preserving implementation.

@kardianos
Copy link

In what instances would users be expected to manually modify any of the files, manifest or lock?

@freeformz
Copy link

FWIW: I don't think it should generally be necessary with the exception of adding comments or the possibility that there are some seldom used features that are (at least to start) implemented through directly modifying the manfiest file. That's my 2c anyway.

@jessfraz
Copy link
Contributor

agreed, don't think there should be much reason other than comments to directly edit the manifest, but if there is we should think about if that is the user experience we desire

@adg
Copy link
Contributor

adg commented Jan 23, 2017 via email

@sdboyer
Copy link
Member Author

sdboyer commented Jan 24, 2017

In what instances would users be expected to manually modify any of the files, manifest or lock?

echo what @freeformz and @jessfraz said. it's also possible that, when we start looking at direct integration into the go toolchain, we have to trim the command set down.

@kardianos
Copy link

when we start looking at direct integration into the go toolchain, we have to trim the command set down.

@sdboyer Is there a design document for a go tool integration? Are there docs from internal discussions? Are you just hypothesizing?

@jessfraz
Copy link
Contributor

somewhere around here: #25 (comment)

@sdboyer
Copy link
Member Author

sdboyer commented Jan 24, 2017

Is there a design document for a go tool integration?

Nope, there are no more design docs hiding anywhere.

Are there docs from internal discussions? Are you just hypothesizing?

I'm basing this on some email chains with @rsc. He set out some desired constraints for how he wants the overall go toolchain to work, and the first thought I had was, "it would be easier to meet these if we weren't trying to provide a command for ever possible manifest change." I shared that thought in an email about two weeks ago, and there's been no further discussion since. That's where we're at.

@kevinburke
Copy link
Collaborator

kevinburke commented Jan 24, 2017

Funny, I wrote about this recently - this is out of left field, but the only Go parsers I found that preserved comments were for XML, HCL and Go source files themselves, and they are about equally difficult to work with. https://kev.inburke.com/kevin/more-comment-preserving-configuration-parsers/

Here's a parser that finds a version number in a Go source file and bumps it: https://github.com/shyp/bump_version. I've also written code to manipulate HCL files and it's a little finicky but the resulting file is recognizable. https://github.com/hashicorp/hcl

@nathany
Copy link
Contributor

nathany commented Jan 24, 2017

There is also JSON5, which does allow comments, though I don't think it's anywhere near a standard.

Though I prefer the readability of TOML > YAML > JSON, another consideration is the licensing and CLA if dep is later incorporated into the go toolchain. As far as I understand, the TOML or YAML libraries would need to be contributed under the Google CLA or rewritten (with comment preservation as well). /cc @willnorris

@kevinburke
Copy link
Collaborator

I'll just put this out there: I'm available for hire as a contractor, and would love to work on a CLA-compliant, Go TOML parser that can preserve comments and allow AST manipulation, if anyone's interested. https://burke.services

@andrew
Copy link

andrew commented Jan 24, 2017

+1 for a non-executable format for manifest+lockfile, this will allow third party services (like https://libraries.io) to analyse the dependencies without needing to execute arbitrary code download from the internet.

Also YAML is basically a superset of JSON, so you could mix and match, but just because you can doesn't mean you should,

Aside from that, allowing comments to explain why you switched to a particular fork or pinned to a weird tag would be ✨

@kr
Copy link

kr commented Jan 24, 2017

There is a format that supports arbitrary comments, with regeneration preserving the comment positions, is supported by the Go standard library, and has syntax familiar to all Go programmers: Go source code.

Imagine a manifest file (with no .go filename extension) in a subset of Go syntax:

dependencies = {
        "github.com/Masterminds/semver": {
            "branch": "2.x",
        },
        "github.com/Masterminds/vcs": {
            "version": "^1.8.0",
        },
        "github.com/pkg/errors": {
            "version": ">=0.8.0, <1.0.0",
        },
        "github.com/sdboyer/gps": {
            "branch": "master",
        },
}

Something along these lines wouldn't be too hard to put through ParseExpr and get an AST out, which can then be modified and pretty-printed.

Is that too bizarre?

@jessfraz
Copy link
Contributor

jessfraz commented Jan 24, 2017 via email

@pquerna
Copy link

pquerna commented Jan 24, 2017

FWIW, I had the same issues with preserving comments in config files for a project, and ended up using HCL with some wrapper code for my use case.

@freeformz
Copy link

@kr Something similar was discussed early on, here was my attempt at the time (which was a little more aggressive than yours)....

package main

// +build deps

import (
    "deps"
)

var (
    // RootPackagName of the Project
    RootPackageName = "github.com/freeformz/tdeps"

    // IgnoreTags that should be ignored when analyzing deps
    IgnoreTags = []string{"appengine", "test"}

    // Dependencies stated for the package tree starting at RootPackage
    Dependencies = []deps.Dependency{
        {
            Name:    "github.com/Sirupsen/logrus",
            Version: ">= v0.9.0",
        },
        {
            Name:     "github.com/heroku/private",
            Version:  "1.0.0",
            Location: "git-hub-enterprise.herokai.com/privatething",
        },
    }

    // Lock of resolved dependencies which may be in vendor/
    Lock = []deps.Dependency{
        {
            Name:    "github.com/Sirupsen/logrus",
            Version: "v0.9.0",
        },
        {
            Name:    "golang.org/x/net/context",
            Version: "someVer",
        },
        {
            Name:    "golang.org/x/oauth2",
            Version: "someVer",
        },
        {
            Name:    "golang.org/x/oauth2/internal",
            Version: "someVer",
        },
    }
)

I don't remember why we didn't go down this route, possibly because my example wasn't a limited subset, but trying to do way more.

I do think @kr suggestion would be interesting to explore.

@kardianos
Copy link

@kr @freeformz I'm skeptical of that route, but if it were to be pursued, make those top level vars part of a type so the fields can be looked up / autocomplete type things can assist.

@mitsuhiko
Copy link

mitsuhiko commented Jan 24, 2017

I hate toml but I must admit that the choice of Toml was good in Rust land. It diffs very well which makes resolving conflicts much nicer than a json file would allow.

@typeless
Copy link

The composability with other existing tools does make JSON desirable, which also makes it easier to be 'scriptable' (with the help of jq for example).

Not sure if that really matters in the end though.

@vasi
Copy link

vasi commented Jan 25, 2017

An awful but effective workaround I've used in the past: Just have a key that is specced to be used for comments, and nothing else. Eg:

dependencies:
  github.com/Masterminds/semver:
    comment: "2.x is incompatible for reasons XYZ"
    branch: 1.x

This comment will clearly be preserved even if the structure is programmatically modified.

Caveats:

  • Can get awkward if keys are reordered. But usually it's easier to preserve key order than to preserve comments.
  • Only one comment is possible per dictionary. Extensions (comment2, commentXYZ…) can be allowed, but that's ugly.
  • Comments can't be added to arrays, only dictionaries. Not a big problem in practice. Arrays are typically small, or contain dictionaries—either of which allows comments to be close to where they should be.

@4ad
Copy link
Member

4ad commented Jan 25, 2017

@kr I like your suggestion (although I have no interest in Go dependency management) because it would encourage and make it easy for other projects to use the same configuration language.

@kegsay
Copy link

kegsay commented Jan 25, 2017

I would love to see TOML used instead of JSON or YAML.

Hand-rolling JSON files is incredibly fiddly: I find I always miss a , or a :. The lack of being able to justify your versions (as mentioned by the OP) are infuriating when working with package managers like NPM which use package.json. It does parse easily though.

YAML is a lovely format to write in, but extremely complicated to parse. I think devising a completely compliant YAML implementation is very tricky, and even if dep manages to use one, what about 3rd parties who want to parse these files?

TOML gets the balance right IMHO: easy to parse, easy to write and easy to read. The most popular TOML parser for Golang AFAIK is https://github.com/BurntSushi/toml - so perhaps @BurntSushi or @cespare may have some thoughts on adding comment preservation support? I'd be interested in helping out.

@BurntSushi
Copy link

@kegsay I'd expect comment preserving transformations to be a pretty significant change. I haven't done well with keeping the TOML implementation up to date with the latest spec either. I would personally support a replacement fork (i.e., we retire or move BurntSushi/toml) if you found it useful to start with the existing code.

@jrick
Copy link

jrick commented Jan 25, 2017

It may be too late in the design phase to change something like this, but I am completely ok with dropping the requirement of being able to modify the manifest through the tool instead of just hand editing it. Especially if the manifest is in an easily readable and writable format such as TOML. If the manifest is never rewritten by the tool, then there's no need to make it preserve comments.

@kegsay
Copy link

kegsay commented Jan 25, 2017

@BurntSushi thanks for the quick reply! I feel that your TOML parser is the most "battle-worn" out of the available parsers in that it superficially seems to have the most usage, and therefore has been tested in a large amount of real-world scenarios. For that reason, I'd be reluctant to do a complete rewrite unless the change was sufficiently large that you'd end up doing a rewrite in order to implement comment preservation. Given you wrote the parser, do you believe that we'd need to start from scratch?

@nathany
Copy link
Contributor

nathany commented Mar 17, 2017

@willfaught I agree that YAML has a nice aesthetic. The use of colons for key: value mappings is even reminiscent of composite literals.

Personally, any of HCL, TOML, YAML would be fine with me.
+ comments
+ one line per dependency for diffs and merging (JSON is adverse to extra trailing commas)
+ readable/ergonomics

YAML is the most mature, but there are a few strikes against it too -- in my opinion.

Significant Whitespace

"structure is determined by indentation" http://yaml.org/spec/1.2/spec.html#space/indentation/

I don't personally have a problem with significant whitespace, but I observe that Go intentionally doesn't rely on it.

"Some observers objected to Go's C-like block structure with braces, preferring the use of spaces for indentation, in the style of Python or Haskell. However, we have had extensive experience tracking down build and test failures caused by cross-language builds where a Python snippet embedded in another language, for instance through a SWIG invocation, is subtly and invisibly broken by a change in the indentation of the surrounding code. Our position is therefore that, although spaces for indentation is nice for small programs, it doesn't scale well, and the bigger and more heterogeneous the code base, the more trouble it can cause. It is better to forgo convenience for safety and dependability, so Go has brace-bounded blocks." https://talks.golang.org/2012/splash.article

Even if it turns out to be less of an issue for a dependency tool config file, a consistent stance does have some value IMO.

Complexity

Compare http://yaml.org/spec/1.2/spec.html to https://golang.org/ref/spec. YAML has way more than necessary for such a simple need. I do remember the remote code execution vulnerabilities the Ruby community experienced partially due to YAML. That may not be a problem for the Go parsers, but it was enough for the Rust (Cargo) folks to avoid the format.

@willfaught
Copy link

willfaught commented Mar 18, 2017 via email

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

🔨 🔨 🔨 (why is there no gavel emoji)

We're going to go with TOML. 😄

As I think discussion of their various characteristics here has demonstrated, there's no clear winner between YAML, TOML, or HCL in the general case. I say "various characteristics" to emphasize that I'm not talking about personal views (my view: "all formats are terrible and I will hate them at least sometimes - life sucks, buy a helmet").

The one thing that does stick out with TOML is, being not tree-structured, it's possible for us to append constraints to the manifest without rewriting it. That may turn out to be a very important factor in applying sane defaults that help guard us (that is, the entire public Go ecosystem) against nasty exponential growth in solver running time.

This would be a great moment for anyone with time to jump in with a PR that converts us to use TOML for the manifest and lock. For now at least, let's rely on BurntSushi/toml.

🎉 🎉 🎊 🎊


EDIT: due to licensing concerns, we're gonna at least start from https://github.com/pelletier/go-toml.

@carolynvs
Copy link
Collaborator

If it's okay, I have time this week to jump over to TOML.

@willfaught
Copy link

willfaught commented Mar 20, 2017 via email

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

@willfaught decision's made - let's move on, please.

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

@carolynvs absolutely!

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

Actually, let's maybe start from https://github.com/pelletier/go-toml to avoid dealing with the WTFPL right now. Or, another one. Honestly, idc, I'm just trying to make suggestions to break logjams before they happen 😄

@carolynvs
Copy link
Collaborator

carolynvs commented Mar 20, 2017

I don't have strong feels on the TOML library, happy to use whichever you prefer. I've never run into the WTFPL though. 😃 I'll give go-toml a try first.

@ghost
Copy link

ghost commented Mar 20, 2017

What about HCL? I know this has been mentioned before. Given that it has JSON interop, you could also allow certain JSON producing scripts as an alternative. For example, dep would recognize that the man file has a .go, .rb or .py extension. When those extensions are recognized, dep runs the file and collects the output as JSON etc.

This could be amazing for people who want to do something different with their project or simply want some logic determining what deps they are using.

dep "github.com/gin-gonic/gin" {
    version = "1.1.0"
}

job "build-client" { ...

or

require 'go-dep'

def build do
    puts dep.conf ...

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

@rucuriousyet decision's made - let's move on, please.

@ghost
Copy link

ghost commented Mar 20, 2017

Please close the issue then... theres so much content here I didn't notice...

@sdboyer sdboyer changed the title Manifest and Lock format: TOML/YAML/JSON Move to TOML for manifest and lock Mar 20, 2017
@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

Fair enough - I've updated the issue title and OP accordingly.

I'm keeping the issue open, though, because we still need to actually do it 😄

@bep
Copy link

bep commented Mar 20, 2017

For now at least, let's rely on BurntSushi/toml.

TOML is a great choice!

But I want to chime in that we switched from BurntSushi/toml to go-toml in some of the @spf13 projects at one point, after a big license debate. go-toml have had some issues, but the maintainer is pretty responsive, and the bugs seem to have been ironed out.

@sdboyer
Copy link
Member Author

sdboyer commented Mar 20, 2017

@bep it's good to get some direct feedback on that, thank you!

@tro3
Copy link
Contributor

tro3 commented Mar 20, 2017

@carolynvs - I'm happy to help here, if you want to take the lead on the conversion. Just let me know.

@carolynvs
Copy link
Collaborator

@tro3 I'm working on marshaling to TOML right now, using go-toml. Once I have something worth looking at, I'll start a WIP PR.

@tro3
Copy link
Contributor

tro3 commented Mar 20, 2017 via email

@domgreen
Copy link
Contributor

Also happy to help on sub tasks 👍

@rsc
Copy link

rsc commented Mar 28, 2017

Please just keep the files with the names and formats they already have. They are perfectly fine for the dep experiment. They will probably change in the final go command integration based on that experience, and that's fine. But don't introduce unnecessary churn by changing them before that.

@peterbourgon
Copy link
Contributor

@rsc The opportunity for comment has come and gone. We will move forward with TOML as the means to establish intermediate stability for users.

@krisnova
Copy link
Contributor

krisnova commented Mar 28, 2017

We will move forward with TOML as the means to establish intermediate stability for users.

I think @peterbourgon's words here bring up a good point. Decision is made, let's get to coding.

I also agree with @rsc in the sense that iteration is always possible (if necessary). But we need to start somewhere, and it looks like we are starting with TOML.

@aajtodd
Copy link

aajtodd commented Mar 29, 2017

TOML is a good choice as is hand editing. I find this same workflow in Rust to be rather easy and intuitive. Hand edit manifest, tool generates the lock file. People can build tooling on top of it to automatically write a manifest if they really find that to be an issue.

@mitsuhiko
Copy link

May I suggest locking this issue to contributors? I do not want to unsubscribe from the issue because I'm interested in when it is being closed but I am less than thrilled by the continued inbox spam I get from it. And I do understand the irony of me just contributing to the problem with this comment.

@mibk
Copy link

mibk commented Mar 29, 2017

@mitsuhiko May I suggest https://tellmewhenitcloses.com/session/new (although I haven't tested it myself...)

@pierrre
Copy link

pierrre commented Mar 29, 2017

May I suggest locking this issue to contributors? I do not want to unsubscribe from the issue because I'm interested in when it is being closed but I am less than thrilled by the continued inbox spam I get from it. And I do understand the irony of me just contributing to the problem with this comment.

+1 👍 https://github.com/golang/go/wiki/NoMeToo

@golang golang locked and limited conversation to collaborators Mar 29, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.