[http_proxy_over_p2p] #5526

Merged
merged 40 commits into from
Nov 30, 2018

@cboddy cboddy commented Sep 26, 2018

This implements an http-proxy over p2p-streams, for context see #5341.

This script is a useful test of the functionality. In case it causes portability issues, I've not included it as a sharness test (it uses python to serve HTTP content), although I'm happy to add it since python should be about as available as bash.

(inline since GH doesn't support *.sh as an attachment)

#!/bin/bash

#
# clean up all the things started in this script
#
function teardown() {
    jobs -p | xargs kill -9 ;
}
trap teardown INT EXIT

#
# serve the thing over HTTP
#
SERVE_PATH=$(mktemp -d)
echo "YOU ARE THE CHAMPION MY FRIEND" > $SERVE_PATH/index.txt
cd $SERVE_PATH
# serve this on port 8000
python -m SimpleHTTPServer 8000 &


cd -

IPFS=cmd/ipfs/ipfs

PATH1=$(mktemp -d)
PATH2=$(mktemp -d)

RECEIVER_LOG=$PATH1/log.log
SENDER_LOG=$PATH2/log.log

export IPFS_PATH=$PATH1

#
# start RECEIVER IPFS daemon
#
$IPFS init >> $RECEIVER_LOG 2>&1
$IPFS config --json Experimental.Libp2pStreamMounting true >> $RECEIVER_LOG 2>&1
$IPFS config --json Addresses.API "\"/ip4/127.0.0.1/tcp/6001\"" >> $RECEIVER_LOG 2>&1
$IPFS config --json Addresses.Gateway "\"/ip4/127.0.0.1/tcp/8081\"" >> $RECEIVER_LOG 2>&1
$IPFS config --json Addresses.Swarm "[\"/ip4/0.0.0.0/tcp/7001\", \"/ip6/::/tcp/7001\"]" >> $RECEIVER_LOG 2>&1
$IPFS daemon >> $RECEIVER_LOG 2>&1 &
# wait for daemon to start.. maybe?
# ipfs id returns empty string if we don't wait here..
sleep 5

#
# start a p2p listener on RECEIVER to the HTTP server with our content
#
$IPFS p2p listen /x/test /ip4/127.0.0.1/tcp/8000 >> $RECEIVER_LOG 2>&1
FIRST_ID=$($IPFS id -f "<id>")

export IPFS_PATH=$PATH2
$IPFS init >> $SENDER_LOG 2>&1
$IPFS config --json Experimental.Libp2pStreamMounting true >> $SENDER_LOG 2>&1
$IPFS daemon >> $SENDER_LOG 2>&1 &
# wait for daemon to start.. maybe?
sleep 5



# send an HTTP request to SENDER via proxy to RECEIVER that will proxy to web-server

echo "******************"
echo proxy response
echo "******************"
curl http://localhost:5001/proxy/http/$FIRST_ID/test/index.txt



echo "******************"
echo link http://localhost:5001/proxy/http/$FIRST_ID/test/index.txt
echo "******************"
echo "RECEIVER IPFS LOG " $RECEIVER_LOG
echo "******************"
cat $RECEIVER_LOG

echo "******************"
echo "SENDER IPFS LOG " $SENDER_LOG
echo "******************"
cat $SENDER_LOG

@cboddy cboddy requested a review from Kubuxu as a code owner September 26, 2018 18:46
@cboddy
Member Author

cboddy commented Sep 26, 2018

@cboddy cboddy force-pushed the feat/http_proxy_over_p2p branch 2 times, most recently from 2cb7199 to f984368 Compare September 26, 2018 18:55
@magik6k
Member

magik6k commented Sep 26, 2018

Thanks! You should be able to add a sharness test which hijacks the request using netcat, as is done in https://github.com/ipfs/go-ipfs/blob/master/test/sharness/t0235-cli-request.sh#L14.

@magik6k magik6k self-requested a review September 26, 2018 18:57

@magik6k magik6k left a comment

Looks quite good, it would be nice to get some better tests though.

p2p/proxy.go Outdated
p2p/proxy.go Outdated
p2p/proxy_test.go Outdated
p2p/proxy.go Outdated
@cboddy
Member Author

cboddy commented Sep 26, 2018

@magik6k thanks for the swift feedback; I have started and will munge that script into a sharness test as suggested.

@cboddy cboddy force-pushed the feat/http_proxy_over_p2p branch from 3dbdeb4 to 2c91cb6 Compare September 29, 2018 16:35
@Stebalien
Member

Stebalien commented Oct 1, 2018

So, I believe this should actually be using the ReverseProxy helper. That will copy and modify all the right headers and handle all the edge-cases. However, that may just make it more complicated so take this as a suggestion. (You won't be able to use the NewSingleHostReverseProxy helper, you'll have to implement a custom RoundTripper.)

I believe we also need to read while writing. That is, the current version expects to write the entire request before it reads any response. We should start the request and then read the response as some worker finishes writing the request.

Note: We can probably fix these later if they turn out to be difficult to get right.
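The suggestion above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `streamRoundTripper`, `fakePeer`, `proxyOnce` and the `peer.invalid` host are made-up names, and `net.Pipe` stands in for a p2p stream. The round tripper writes the request in a goroutine, so the response can be read while the request is still being sent, and `httputil.ReverseProxy` takes care of the header handling.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
)

// streamRoundTripper sends one HTTP request over an already-open stream.
// Hypothetical type for illustration; not the implementation in this PR.
type streamRoundTripper struct {
	stream io.ReadWriteCloser
}

func (rt *streamRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	// Write the request in the background so the response can be read
	// before the whole request body has been sent.
	go func() {
		if err := req.Write(rt.stream); err != nil {
			rt.stream.Close()
		}
	}()
	return http.ReadResponse(bufio.NewReader(rt.stream), req)
}

// fakePeer plays the remote side: read one request, answer with a fixed body.
func fakePeer(conn net.Conn) {
	defer conn.Close()
	req, err := http.ReadRequest(bufio.NewReader(conn))
	if err != nil {
		return
	}
	body := "hello from " + req.URL.Path
	fmt.Fprintf(conn, "HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s", len(body), body)
}

// proxyOnce proxies a single GET for path through an in-memory pipe.
func proxyOnce(path string) (int, string) {
	client, server := net.Pipe()
	go fakePeer(server)
	defer client.Close()

	proxy := &httputil.ReverseProxy{
		Director: func(r *http.Request) {
			r.URL.Scheme = "http"
			r.URL.Host = "peer.invalid" // placeholder; the stream decides the real target
		},
		Transport: &streamRoundTripper{stream: client},
	}
	rec := httptest.NewRecorder()
	proxy.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := proxyOnce("/index.txt")
	fmt.Println(code, body)
}
```

A real implementation would also need to decide who closes the stream and how a failed write is reported; those points come up later in the review.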

s := bufio.NewReader(stream)
proxyResponse, err := http.ReadResponse(s, proxyReq)

defer func() { proxyResponse.Body.Close() }()

Do we need this? I believe proxyResponse.Write will close the body when done.

Also, I'm not sure if this is valid if err != nil. Will it not panic?
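A minimal sketch of the concern, using a hypothetical `readBody` helper: `http.ReadResponse` returns a nil response on error, so a `defer` that touches `Body` must be registered only after the error check. Separately, the Go documentation for `Response.Write` states that the body is closed after it is sent, which is why the explicit close may be redundant in the snippet under review.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// readBody parses an HTTP response from raw and returns its body.
// Hypothetical helper for illustration.
func readBody(raw string) (string, error) {
	resp, err := http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
	if err != nil {
		// resp is nil here, so a deferred resp.Body.Close() registered
		// before this check would panic when it runs.
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	body, err := readBody("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
	fmt.Println(body, err)

	_, err = readBody("not http")
	fmt.Println(err != nil)
}
```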

core/corehttp/proxy.go Outdated
@ghost ghost assigned ianopolous Oct 1, 2018
@ghost ghost added the status/in-progress In progress label Oct 1, 2018
@ianopolous ianopolous force-pushed the feat/http_proxy_over_p2p branch 2 times, most recently from 60092b8 to 5d20453 Compare October 2, 2018 00:14
@lanzafame
Contributor

lanzafame commented Oct 2, 2018

@cboddy @Stebalien this might be a little late in the PR cycle to suggest but, if I understand correctly, @hsanjuan has built go-libp2p-http for this use case. It should simplify the implementation.

@hsanjuan
Contributor

hsanjuan commented Oct 2, 2018

Hello, yes, https://github.com/hsanjuan/go-libp2p-http will probably simplify this (along with httputil.ReverseProxy). go-libp2p-http provides the libp2p transport (RoundTripper implementation) that you are looking for to plug into the ReverseProxy. If I'm correct, you just need to initialize it with libp2p://peerid as the target URL and manually set the transport. At least it would not add another http RoundTripper to the libp2p world.

The RoundTrip in go-libp2p-http is simplified if you compare it to all the default transport's implementation (https://golang.org/src/net/http/transport.go?s=3628:10127#L385) but it's well tested and I think it should be solid enough (used by cluster for a while). It supports reading responses while sending the requests (for streaming requests).

The only problem would be to attach a custom transport tag to every Roundtripper. We can easily enable this upstream (currently a single tag is used for all), if you are willing to go down this path.

@cboddy
Member Author

cboddy commented Oct 2, 2018

@Stebalien @lanzafame thanks for the feedback.

Will look into using ReverseProxy as suggested (most of the go stdlib still a bit new to me) and try to push something today/ in the next few days.

@cboddy cboddy force-pushed the feat/http_proxy_over_p2p branch from 9f6e044 to 4400212 Compare October 2, 2018 18:41

@Stebalien Stebalien left a comment

Slightly more complicated but I think this'll have better header handling.

}

// open connect to peer
stream, err := ipfsNode.P2P.PeerHost.NewStream(ipfsNode.Context(), parsedRequest.target, protocol.ID("/x/"+parsedRequest.name))

I believe this should be using the request context.

fixed
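To illustrate why the request context matters here, the sketch below uses a hypothetical `openStream` function in place of the `NewStream` call. With `r.Context()`, the dial is abandoned as soon as the client disconnects or the request is cancelled; a node-wide context would let it keep running. The 502 status is an assumption made for the example.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// openStream stands in for a stream-opening call that honors ctx
// (hypothetical; the PR's call is PeerHost.NewStream).
func openStream(ctx context.Context) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	default:
		return nil
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	// r.Context() is cancelled when the client goes away, so the dial
	// is abandoned together with the request.
	if err := openStream(r.Context()); err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	fmt.Fprint(w, "stream opened")
}

// statusFor runs the handler with the given request context.
func statusFor(ctx context.Context) int {
	rec := httptest.NewRecorder()
	handler(rec, httptest.NewRequest("GET", "/", nil).WithContext(ctx))
	return rec.Code
}

// liveStatus and cancelledStatus exercise both outcomes.
func liveStatus() int { return statusFor(context.Background()) }

func cancelledStatus() int {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return statusFor(ctx)
}

func main() {
	fmt.Println(liveStatus(), cancelledStatus())
}
```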

sendRequest := func() {
err := req.Write(*rt.stream)
if err != nil {
(*(rt.stream)).Close()

This should probably be rt.stream.Reset().
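A small sketch of the distinction, assuming a stream type that offers both a graceful `Close` and an abortive `Reset`, as the comment implies. The `stream` interface and `fakeStream` below are hypothetical stand-ins, not the libp2p types: on a failed write the stream is reset so the peer sees an error rather than a clean end of stream.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// stream is a hypothetical subset of a p2p stream interface.
type stream interface {
	Write(p []byte) (int, error)
	Close() error // graceful end of stream
	Reset() error // abort, signalling an error to the peer
}

// fakeStream records which methods were called.
type fakeStream struct {
	failWrites bool
	calls      []string
}

func (s *fakeStream) Write(p []byte) (int, error) {
	s.calls = append(s.calls, "write")
	if s.failWrites {
		return 0, errors.New("write failed")
	}
	return len(p), nil
}
func (s *fakeStream) Close() error { s.calls = append(s.calls, "close"); return nil }
func (s *fakeStream) Reset() error { s.calls = append(s.calls, "reset"); return nil }

// send writes payload and picks Reset or Close depending on the outcome.
func send(s stream, payload []byte) error {
	if _, err := s.Write(payload); err != nil {
		s.Reset()
		return err
	}
	return s.Close()
}

// trace returns the sequence of calls made on the stream.
func trace(failWrites bool) string {
	s := &fakeStream{failWrites: failWrites}
	send(s, []byte("GET / HTTP/1.1\r\n\r\n"))
	return strings.Join(s.calls, ",")
}

func main() {
	fmt.Println(trace(false))
	fmt.Println(trace(true))
}
```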

}

type roundTripper struct {
stream *inet.Stream

No real need for the pointer. This is already an interface (class-like thing) so we can just say inet.Stream.
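As a short illustration with made-up types (`Stream` and `memStream` are not the libp2p ones): an interface value already refers to the underlying object, so storing it directly avoids the `(*(rt.stream)).Close()` style of dereferencing seen earlier.

```go
package main

import "fmt"

// Stream is a stand-in interface for illustration.
type Stream interface {
	Name() string
}

type memStream struct{ name string }

func (m *memStream) Name() string { return m.name }

// roundTripper holds the interface value itself, not a pointer to it.
type roundTripper struct {
	stream Stream
}

// describe takes the struct by value; the copy still refers to the
// same underlying *memStream through the interface.
func describe(rt roundTripper) string {
	return rt.stream.Name()
}

func main() {
	rt := roundTripper{stream: &memStream{name: "/x/test"}}
	fmt.Println(describe(rt))
}
```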

@hsanjuan
Contributor

hsanjuan commented Oct 2, 2018

fyi, protocol tags per Roundtripper coming about: libp2p/go-libp2p-http#12

@Stebalien
Member

One more thing we'll need to do before merging this: put it behind an experimental feature-gate. Take a look at how the ipfs p2p command does this (using config.Experimental.Libp2pStreamMounting).

I don't really see this needing a long stabilization period, but we should probably leave it experimental at least until we stabilize the ipfs p2p feature...
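One way such a gate could look, as a sketch under assumptions: `experimentalConfig` mirrors only the flag name mentioned above, and the `gated` wrapper, the error text and the 403 status are choices made for this example rather than what go-ipfs does.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// experimentalConfig is a hypothetical mirror of the config flag.
type experimentalConfig struct {
	Libp2pStreamMounting bool
}

// gated rejects requests unless the experimental flag is enabled.
func gated(cfg experimentalConfig, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !cfg.Libp2pStreamMounting {
			http.Error(w, "p2p stream mounting is not enabled", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusWith returns the status code seen with the flag on or off.
func statusWith(enabled bool) int {
	proxy := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "proxied")
	})
	rec := httptest.NewRecorder()
	h := gated(experimentalConfig{Libp2pStreamMounting: enabled}, proxy)
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/proxy/http/peer/test/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusWith(true), statusWith(false))
}
```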

@ianopolous
Member

I've tested out all our api calls for Peergos through this, which includes gets, posts, and multiparts. And everything works!

[image: peergos-over-p2p-stream]


@hsanjuan hsanjuan left a comment

This is a re-implementation of the logic of go-libp2p-http but it doesn't close the streams. And if it did, it would probably miss handling connection closing correctly (like this: https://github.com/hsanjuan/go-libp2p-gostream/blob/master/conn.go#L37).

This is all handled in go-libp2p-http because we have already hit a number of edge cases. I really don't like that we are cloning the logic here.

return
}
//send proxy request and response to client
newReverseHTTPProxy(parsedRequest, &stream).ServeHTTP(w, request)

Something is fishy here. For every request a new proxy is created but I think the stream is never closed ?

Good catch. That's fixed now.
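The leak described in this thread can be illustrated with a counter. This is a toy model, not the actual fix: `countedStream` is a made-up type, and the handler opens one stream per request and releases it with `defer`, so the number of open streams returns to zero after each request.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// open counts streams that have been opened but not yet closed.
var open int32

// countedStream is a hypothetical stream that tracks its own lifetime.
type countedStream struct{}

func newStream() *countedStream {
	atomic.AddInt32(&open, 1)
	return &countedStream{}
}

func (s *countedStream) Close() error {
	atomic.AddInt32(&open, -1)
	return nil
}

func handler(w http.ResponseWriter, r *http.Request) {
	s := newStream()
	// Without this defer, every request would leave one stream open.
	defer s.Close()
	fmt.Fprint(w, "proxied")
}

// leakedAfter serves n requests and reports how many streams remain open.
func leakedAfter(n int) int32 {
	for i := 0; i < n; i++ {
		handler(httptest.NewRecorder(), httptest.NewRequest("GET", "/", nil))
	}
	return atomic.LoadInt32(&open)
}

func main() {
	fmt.Println(leakedAfter(3))
}
```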

@ianopolous
Member

@hsanjuan We can't use the RoundTripper in go-libp2p-http because it assumes a different incoming request format, and we don't want to do an extra socket+http round trip. We used your technique to close the stream though.

@ianopolous ianopolous force-pushed the feat/http_proxy_over_p2p branch 2 times, most recently from 199d25e to 2fd1c26 Compare October 3, 2018 20:24
@ianopolous
Member

@Stebalien Does that mean we need a PR to go-ipfs-config to add a new config option and then gx publish that before this can be merged, or can we use the existing Libp2pStreamMounting option?

@hsanjuan
Contributor

hsanjuan commented Oct 3, 2018

@hsanjuan We can't use the RoundTripper in go-libp2p-http because it assumes a different incoming request format, and we don't want to do an extra socket+http round trip. We used your technique to close the stream though.

That's not true. You just need to rewrite the path, which you are already doing anyway.
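For context, the path rewriting being discussed could look roughly like this. The sketch assumes the URL layout used by the test script at the top of this thread (`/proxy/http/<peer>/<name>/<path>`); `proxyRequest`, `parseProxyPath` and the `httpPath` field are hypothetical names, while `target` and `name` follow the fields visible in the reviewed code. The merged implementation may use a different layout.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// proxyRequest holds the pieces of a proxy URL (hypothetical type).
type proxyRequest struct {
	target   string // peer ID, kept as a plain string here
	name     string // protocol name, dialed as "/x/"+name
	httpPath string // path forwarded to the remote HTTP server
}

// parseProxyPath splits "/proxy/http/<peer>/<name>/<path>" into its parts.
func parseProxyPath(p string) (proxyRequest, error) {
	// Splits into ["", "proxy", "http", peer, name, rest].
	parts := strings.SplitN(p, "/", 6)
	if len(parts) < 6 || parts[0] != "" || parts[1] != "proxy" ||
		parts[2] != "http" || parts[3] == "" || parts[4] == "" {
		return proxyRequest{}, errors.New("expected /proxy/http/<peer>/<name>/<path>")
	}
	return proxyRequest{
		target:   parts[3],
		name:     parts[4],
		httpPath: "/" + parts[5],
	}, nil
}

func main() {
	pr, err := parseProxyPath("/proxy/http/QmPeer/test/index.txt")
	fmt.Println(pr.target, pr.name, pr.httpPath, err)
}
```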

Stebalien and others added 15 commits November 28, 2018 23:17
We don't need to do this for every test and our tests are slow enough.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
License: MIT
Signed-off-by: Ian Preston <ianopolous@protonmail.com>
@Stebalien Stebalien force-pushed the feat/http_proxy_over_p2p branch from 181aa64 to 6c35fbc Compare November 29, 2018 07:17
Instead of repeatedly starting the netcat server, start it once and wait for it
to fully start. Then, feed responses in using a fifo.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@Stebalien Stebalien force-pushed the feat/http_proxy_over_p2p branch from 6c35fbc to 9a443ad Compare November 30, 2018 02:01
@Stebalien
Member

OK, tests pass and @hsanjuan has given a ✔️ 🚂 (this train may be a bit slow...).

@Stebalien Stebalien merged commit 2d94a3f into ipfs:master Nov 30, 2018
@ghost ghost removed the status/in-progress In progress label Nov 30, 2018
@ianopolous
Member

Hooray! The last critical feature we need for Peergos!

@Stebalien is the plan to enable this on ipfs.io?

@cboddy
Member Author

cboddy commented Dec 2, 2018

@Stebalien @magik6k @hsanjuan thanks for your help with this one!

@Stebalien
Member

@Stebalien is the plan to enable this on ipfs.io?

Probably not for a while, if ever. We'll have to seriously consider the security and performance implications and, really, the public gateway's more of a crutch (we'd prefer if users would just load js-libp2p from their DAPP).

@ianopolous
Member

Probably not for a while, if ever. We'll have to seriously consider the security and performance implications and, really, the public gateway's more of a crutch (we'd prefer if users would just load js-libp2p from their DAPP).

:-( That's a shame, because we could do some really cool demos if that were enabled. Loading js-libp2p will never be an option for us for security reasons.

@Stebalien
Member

Stebalien commented Dec 3, 2018

Are you forbidding javascript entirely? Can you not spin up a js-ipfs node in a sandboxed worker?

I do agree this would allow for some pretty cool demos but I'd like to sit on it at least until it's stabilized.

@ianopolous
Member

@Stebalien In general, we veto anything that uses npm which is anathema to basic security practices. We're also trying to keep the JS down to a small easily auditable amount (as much as possible) for the same reasons. As well as being much harder to audit for attacks, having a complex P2P daemon running in the same process as that which has the secret keys is also asking for trouble. Browsers explicitly don't defend against spectre attacks within the same page to my knowledge (they'd have to run every worker in an independent OS process, not just different OS threads). We also have reproducible builds for the web-ui already.

For now we can just preface all ipfs-only demos with, install ipfs and enable these two flags, then browse to this local url.

@Stebalien
Member

In general, we veto anything that uses npm which is anathema to basic security practices.

Well, I can't really blame you there...

Browsers explicitly don't defend against spectre attacks within the same page to my knowledge

Sandboxing has more to do with origins/contexts than pages. I'm not familiar with browser spectre mitigations but, a page in a sandboxed iframe/worker should have equivalent security to a page running in a separate tab.

@ianopolous
Member

Sandboxing has more to do with origins/contexts than pages. I'm not familiar with browser spectre mitigations but, a page in a sandboxed iframe/worker should have equivalent security to a page running in a separate tab.

If the p2p daemon was running from a separate domain in an iframe, I believe this might be safe in some browsers (depending on their site isolation properties), but it's much harder to prove it safe, and it requires you to have two separate origins. We don't have any 3rd party hosted content in our web-ui - it's all self hosted (partially to enable easy offline and localhost access).

How about we continue this on irc to stop spamming all the other contributors to this pr? :-)

8 participants