unixfs symlinks #1104

Closed
whyrusleeping opened this issue Apr 21, 2015 · 17 comments

Comments

@whyrusleeping
Member

We need to implement symlinks in unixfs. This is non-trivial, so I am opening this issue for some discussion on how we should do it.

We want to store symlinks as a path, just like a normal filesystem, but this presents us with a bit of awkwardness. What if I cat a symlink with ipfs cat <hash of link object>? What should it do?

@travisperson
Member

Since IPFS is read-only, we should just be able to resolve the link, right? It would almost be like giving an object the same hash, which is hard, and would break a lot of the hash checking. You wouldn't really want to return what the link points to.

@whyrusleeping
Member Author

The other hard part is that we need to be able to support 'broken links' for a variety of reasons.

@whyrusleeping
Member Author

One simple solution would be to disallow ipfs cat <symlink hash> for now, and then make the fuse fs understand them. ipfs get <symlink hash> would just create a symlink with the path from the object.

@chriscool @cryptix thoughts?

@travisperson
Member

One issue with symlinks in ipfs in general is that, if they are simply paths, relative links could cause issues when working in /ipfs.

Imagine a symlink pointing to ../etc, ../home, ../<whatever>. Resolving the link from the root of the DAG would probably be fine and work correctly, as there is a proper folder structure above it that it's supposed to follow. However, if the symlink is accessed via its own hash, such as /ipfs/<symlink-hash>, it could jump you out of /ipfs, which could introduce some interesting security issues.
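To make the escape concrete, here is a minimal sketch of a purely lexical check. The function name `escapes_root` and the example path `QmRoot` are made up for illustration, and the check assumes no intermediate symlinks need resolving; it is not how go-ipfs does or should do this.

```python
import posixpath


def escapes_root(link_dir: str, target: str, root: str = "/ipfs") -> bool:
    """Return True if following `target` from `link_dir` would leave `root`.

    Hypothetical helper: purely lexical, it does not consult any filesystem.
    """
    # posixpath.join discards link_dir when target is absolute.
    resolved = posixpath.normpath(posixpath.join(link_dir, target))
    root = posixpath.normpath(root)
    return not (resolved == root or resolved.startswith(root + "/"))


# Reached through a directory tree, the link has parents to climb into.
print(escapes_root("/ipfs/QmRoot/usr/share", "../etc"))  # False

# Reached directly by its own hash, the link lives in /ipfs itself,
# so "../etc" climbs out to /etc.
print(escapes_root("/ipfs", "../etc"))  # True
```

The same stored string is harmless in one context and an escape in the other, which is the core of the concern.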

@wking
Contributor

wking commented Apr 21, 2015

On Mon, Apr 20, 2015 at 06:01:37PM -0700, Jeromy Johnson wrote:

We need to implement symlinks in unixfs…

On the one hand, this helps you do things like round-trip a generic
tarball through IPFS. On the other hand (as you point out) it's not
really something that makes a lot of sense in IPFS-land. I think
we'll want IPNS-links and IPNS-descendant-links that we can embed in
an IPFS tree (see 1), but I'm not convinced that traditional
symlinks are worth the trouble. In fact, I can't even think of a
valid use-case for the IPFS-descendant links from #1093

One useful case for links (both symbolic and hard) is that the value
referenced via a linked name will auto-update when the value of
another linked name is updated. Is there a more native way to do that
in IPFS? It's hard when we don't have an inode equivalent or a list
of references pointing back to a given object, and it's pretty much
impossible to track all the references back up the tree without
leaving the content-addressable space.

@jbenet
Member

jbenet commented Apr 21, 2015

@travisperson yeah, the symlinks can be problematic attack vectors if they can pull out of the ipfs root. (there may be ways around this, but perhaps will be clunky.)


@wking makes great points.

One useful case for links (both symbolic and hard) is that the value
referenced via a linked name will auto-update when the value of
another linked name is updated. Is there a more native way to do that
in IPFS?

could you spell this out in a graph? we may see a straightforward solution.


@whyrusleeping maybe you can describe the issue you encountered with git and/or containers very precisely and either we can find how to do it without.


i'll add some complicated support.

git allows symlinks strictly for the "outside of git" world. meaning that something like:

git checkout <branch>
ipfs get -o=<local-path> <ipfs-path>
ipfs mount <ipfs-path> <local-path>

# aside: these syntaxes may not be ideal. experimenting.

Either of these could set up symlinks for me, and this may be exactly what i want. it is what we want in git.

@gwillen

gwillen commented Apr 21, 2015

In a normal filesystem, a symlink is just stored as a string containing a path, with a flag set indicating that it's a link. So if IPFS supports something that behaves just like a normal symlink, there should be no problem supporting broken links -- if the stored path points to a place that doesn't exist, the link is broken.
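A small sketch of that model on a POSIX system, using only the Python standard library. The helper name `describe_link` is hypothetical; the point is that the link is just a stored string plus a type flag, and "broken" only means the string does not currently resolve.

```python
import os
import stat
import tempfile


def describe_link(path: str) -> dict:
    """Report the link flag, the stored target string, and whether it resolves."""
    st = os.lstat(path)  # lstat inspects the link itself, not its target
    return {
        "is_symlink": stat.S_ISLNK(st.st_mode),
        "target": os.readlink(path),
        "broken": not os.path.exists(path),  # exists() follows the link
    }


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "dangling")
    # Creating a link to a nonexistent path succeeds: only the string is stored.
    os.symlink("does/not/exist", link)
    info = describe_link(link)

print(info)
```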

Whether 'ipfs cat' should support traversing symlinks, I'm not sure; I think they really only make sense in the context of a filesystem, so perhaps 'ipfs cat' should just issue an error or print the path. (Having it print the path seems like it might confuse tools though.)

@wking
Contributor

wking commented Apr 21, 2015

On Mon, Apr 20, 2015 at 07:12:49PM -0700, Juan Batiz-Benet wrote:

@travisperson yeah, the symlinks can be problematic attack vectors
if they can pull out of the ipfs root. (there may be ways around
this, but perhaps will be clunky.)

I've changed my mind, and decided that it's best to just store the
symlink path (relative or absolute) with a “this is a symlink” flag
like a regular filesystem. Then you have this attack vector, but you
have the same issue (for example) if you're unpacking a tarball. I
think the main idea would be to look-before-you-leap into
dereferencing untrusted content (e.g. in the gateway servers) and
refuse to dereference content that you don't want to expose. But
that's consumer code. IPFS doesn't need to care.

One useful case for links (both symbolic and hard) is that the
value referenced via a linked name will auto-update when the value
of another linked name is updated. Is there a more native way
to do that in IPFS?

could you spell this out in a graph? we may see a straightforward
solution.

I had been tripping myself up by thinking about IPFS objects as
stand-alone entities. Clearly they are that, but symlinks don't make
much sense outside of the filesystem they're linking in. So you
could have an IPFS symlink that pointed to crazy places when viewed
out of context, just like you can have:

$ ln -s .bashrc x
$ mv x /tmp/
$ ls -l /tmp/x
lrwxrwxrwx 1 wking wking 10 Apr 20 19:53 x -> .bashrc
$ cat /tmp/x
cat: /tmp/x: No such file or directory

The solution is just that folks authoring and consuming filesystems
need to find their own ways to make that work.
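The shell session above can be reproduced in a throwaway directory. This is only a sketch for a POSIX system; the directory names `home` and `elsewhere` are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    home = os.path.join(tmp, "home")
    elsewhere = os.path.join(tmp, "elsewhere")
    os.mkdir(home)
    os.mkdir(elsewhere)

    with open(os.path.join(home, ".bashrc"), "w") as f:
        f.write("# shell settings\n")

    # Relative link: only the string ".bashrc" is stored.
    os.symlink(".bashrc", os.path.join(home, "x"))
    resolves_before = os.path.exists(os.path.join(home, "x"))

    # rename() moves the link itself, not the file it points at.
    moved = os.path.join(elsewhere, "x")
    os.rename(os.path.join(home, "x"), moved)
    target_after = os.readlink(moved)
    resolves_after = os.path.exists(moved)

print(resolves_before, target_after, resolves_after)  # True .bashrc False
```

The stored target never changes; only the directory it is interpreted against does.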

@wking
Contributor

wking commented Apr 21, 2015

On Mon, Apr 20, 2015 at 07:13:35PM -0700, gwillen wrote:

Whether 'ipfs cat' should support traversing symlinks, I'm not sure;
I think they really only make sense in the context of a filesystem…

It could be an absolute link, and the referenced path might exist on
your system. In that case, I'd follow ‘cat’ and print the contents of
the referenced file.

… so perhaps 'ipfs cat' should just issue an error or print the
path. (Having it print the path seems like it might confuse tools
though.)

Yeah. You could follow ‘cp’ and add a ‘--no-dereference’ flag, but I
don't think it's worth the hassle. Folks who wonder if it's a symlink
can just use ‘ipfs object get …’ to see.

@whyrusleeping
Member Author

@wking so you think that if the symlink isn't broken (in whatever context) we should just follow it?

@whyrusleeping
Member Author

@jbenet another question, if i implement symlinks in unixfs, path.Resolve won't understand paths containing symlinks, as path.Resolver doesn't understand unixfs. What do you think we should do there?

@jbenet
Member

jbenet commented Apr 21, 2015

@jbenet another question, if i implement symlinks in unixfs, path.Resolve won't understand paths containing symlinks, as path.Resolver doesn't understand unixfs. What do you think we should do there?

Good question. I'm not sure. we can either:

  1. implement in both contexts-- i.e. symlinks don't work in ipfs dag (object) get/data/links but do work in ipfs (files) cat/ls
  2. find a way to express a symlink in a general way that the merkledag can understand how to follow it (if it is a symlink that can be resolved in the current dag). (similar to the language we keep talking about for composing child objects to represent the output.)

I'd go with 1. for now but don't close doors on 2.
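As a rough illustration of option 1, here is a toy resolver that follows symlinks at the file layer only, over an in-memory tree. `Node`, `resolve`, and `MAX_HOPS` are invented for this sketch and do not mirror go-ipfs's path.Resolver or the unixfs types; rejecting absolute targets and paths that climb above the root are assumptions made here, not decisions from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                   # "dir", "file", or "symlink"
    data: bytes = b""                           # file contents, or symlink target
    links: dict = field(default_factory=dict)   # name -> Node, used by dirs


MAX_HOPS = 8  # arbitrary bound so symlink cycles terminate


def resolve(root: Node, path: str) -> Node:
    """Walk `path` from `root`, splicing in symlink targets as they appear."""
    stack = [root]  # nodes from the root down to the current position
    parts = [p for p in path.split("/") if p]
    hops = 0
    while parts:
        name = parts.pop(0)
        if name == ".":
            continue
        current = stack[-1]
        if current.kind != "dir":
            raise NotADirectoryError(name)
        if name == "..":
            if len(stack) == 1:
                raise ValueError("path escapes the root")
            stack.pop()
            continue
        if name not in current.links:
            raise FileNotFoundError(name)  # this is what a broken link hits
        child = current.links[name]
        if child.kind == "symlink":
            hops += 1
            if hops > MAX_HOPS:
                raise RecursionError("too many levels of symlinks")
            target = child.data.decode()
            if target.startswith("/"):
                raise ValueError("absolute targets point outside this tree")
            # Relative targets are interpreted from the link's own directory.
            parts = [p for p in target.split("/") if p] + parts
            continue
        stack.append(child)
    return stack[-1]


tree = Node("dir", links={
    "etc": Node("dir", links={"motd": Node("file", b"hello\n")}),
    "home": Node("dir", links={
        "motd": Node("symlink", b"../etc/motd"),
        "broken": Node("symlink", b"../missing"),
        "escape": Node("symlink", b"../../etc"),
    }),
})

print(resolve(tree, "home/motd").data)  # b'hello\n'
```

In this sketch a lower layer that only looks up names in `links` would still return the symlink node untouched, which is the split option 1 describes.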

@wking
Contributor

wking commented Apr 21, 2015

On Mon, Apr 20, 2015 at 08:27:26PM -0700, Jeromy Johnson wrote:

@wking so you think that if the symlink isn't broken (in whatever
context) we should just follow it?

Yes, unless an operation decides not to follow symlinks. E.g. if
you're adding a directory with symlinks, it makes sense to follow ‘cp’
and have options like --dereference (copy resolved contents,
traversing symlinked directories), and --no-dereference (copy symlinks
as symlinks).
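A sketch of what those two modes could mean when adding a directory, on a POSIX system. The `snapshot` function and its `dereference` parameter are hypothetical stand-ins for an add operation, not an existing ipfs flag. Note that with dereferencing on, this naive walk would loop forever on a symlink cycle and silently skips broken links.

```python
import os
import tempfile


def snapshot(root: str, dereference: bool) -> dict:
    """Map relative paths to ("file", bytes) or ("symlink", target)."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root, followlinks=dereference):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full) and not dereference:
                # --no-dereference: record the link as a link.
                entries[rel] = ("symlink", os.readlink(full))
            elif os.path.isfile(full):
                # Regular file, or (with --dereference) a link to one.
                with open(full, "rb") as f:
                    entries[rel] = ("file", f.read())
    return entries


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data.txt"), "wb") as f:
        f.write(b"payload")
    os.symlink("data.txt", os.path.join(tmp, "alias"))

    as_links = snapshot(tmp, dereference=False)
    as_copies = snapshot(tmp, dereference=True)

print(as_links["alias"])   # ('symlink', 'data.txt')
print(as_copies["alias"])  # ('file', b'payload')
```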

@gwillen

gwillen commented Apr 21, 2015

@wking If the goal is to be able to store a unix symlink into IPFS, and then retrieve it on another unix machine and have it behave the same, then it better just be a file with a path, in which case it will be (1) and not (2).

I was going to propose some other type of object to fulfill (2), like some kind of ipfs-link, except that I can't think when such a thing would be useful -- since objects are content-addressed and immutable, any place you could use a link to an object, you can just use the object itself instead. (Unless you're linking from an ipfs merkledag into ipns -- that could be a place worth thinking about how a link could work.)

@wking
Contributor

wking commented Apr 21, 2015

On Mon, Apr 20, 2015 at 11:05:20PM -0700, Juan Batiz-Benet wrote:

@jbenet another question, if i implement symlinks in unixfs,
path.Resolve won't understand paths containing symlinks, as
path.Resolver doesn't understand unixfs. What do you think we
should do there?

Good question. I'm not sure. we can either:

  1. implement in both contexts-- i.e. symlinks don't work in ipfs dag (object) get/data/links but do work in ipfs (files) cat/ls

I'm not clear on the unixfs/path.Resolve distinction, but I think
‘ipfs object get/data/links’ should just return the symlink object.
Something like:

{
"Links": [],
"Data": "\u0008\u0002/path/to/target"
}

‘ipfs cat/ls’ should dereference the symlink, but ‘ls’ might learn
something like ‘--directory’ to avoid dereferencing symlinks.
Following the usual POSIX implementation, symlink-ness would be part
of the mode. stat(2) has:

S_IFLNK 0120000 symbolic link

so a symlink implementation might just be “add mode information to the
Metadata structure”.
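For reference, the mode-based check looks like this with Python's standard `stat` module. `SYMLINK_MODE` and `is_symlink_mode` are names made up for the sketch; how a mode field would actually be laid out in the Metadata structure is left open here.

```python
import stat

# File-type bits for a symlink, plus the conventional rwxrwxrwx permissions.
SYMLINK_MODE = stat.S_IFLNK | 0o777


def is_symlink_mode(mode: int) -> bool:
    """True if the file-type bits of `mode` mark a symbolic link."""
    return stat.S_ISLNK(mode)


print(oct(stat.S_IFLNK))                            # 0o120000
print(stat.filemode(SYMLINK_MODE))                  # lrwxrwxrwx
print(is_symlink_mode(stat.S_IFREG | 0o644))        # False
```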

  2. find a way to express a symlink in a general way that the
    merkledag can understand how to follow it (if it is a symlink
    that can be resolved in the current dag). (similar to the
    language we keep talking about for composing child objects to
    represent the output.)

I think IPNS(-child) references are useful (see 1), and similar to
but not the same as symlinks.

@wking
Contributor

wking commented Apr 21, 2015

On Tue, Apr 21, 2015 at 09:40:37AM -0700, gwillen wrote:

@wking If the goal is to be able to store a unix symlink into IPFS,
and then retrieve it on another unix machine and have it behave the
same, then it better just be a file with a path, in which case it
will be (1) and not (2).

Right, and I want (1) for that reason. I also want (2) for
establishing chains of trust.

I was going to propose some other type of object to fulfill (2),
like some kind of ipfs-link, except that I can't think when such a
thing would be useful -- since objects are content-addressed and
immutable, any place you could use a link to an object, you can just
use the object itself instead. (Unless you're linking from an ipfs
merkledag into ipns -- that could be a place worth thinking about
how a link could work.)

Exactly. Linking to “child of IPFS ” isn't very useful,
but linking to “child of IPNS ” is. It says “I trust
<owner-of-'s-private-key> to put something reasonable under
, so use whatever they've put there with my blessing”. Which is
just like a POSIX symlink, but rooted in the IPNS namespace so the
object will look a bit different. Related discussion in #1093.

wking added a commit to wking/oci-gentoo-minimal that referenced this issue Aug 13, 2015
IPFS doesn't track them yet [1].

[1]: ipfs/kubo#1104
@whyrusleeping
Member Author

Yeaaaaah!
