Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored (miner rejected) #6297

Closed
ghost opened this issue May 19, 2021 · 9 comments · Fixed by filecoin-project/go-fil-markets#550
Labels
area/markets Area: Markets kind/bug Kind: Bug

Comments

@ghost
Copy link

ghost commented May 19, 2021

Describe the bug
A retrieval can't get past 816.1 MiB, despite repeated attempts. Provider is responsive. Client can complete other retrievals. Reproducible.

I ran this command lotus client retrieve --miner f019002 bafykbzacea6lb336ja5htts4c2cetbl5qeswllzfl72ynvcya65jpluhqxgpg /dev/null. The retrieval was working correctly for a while, but then froze at 816.1 MiB. It stayed in this state of ~15 minutes. Full output:

$ lotus client retrieve --miner f019002 bafykbzacea6lb336ja5htts4c2cetbl5qeswllzfl72ynvcya65jpluhqxgpg /dev/null
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
> Recv: 626 B, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 52.62 KiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 1.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 2.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 3.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 4.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 5.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 6.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 7.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 8.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 9.051 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 10.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 11.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 12.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 13.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 14.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 15.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 16.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 17.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 18.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 19.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 20.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 20.05 MiB, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)
> Recv: 21.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 22.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 23.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 24.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 25.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 26.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 27.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 28.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 29.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 30.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 31.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 32.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 33.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 34.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 35.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 36.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 37.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 38.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusAccepted)
> Recv: 38.05 MiB, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 39.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 40.05 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
[...snip...]
> Recv: 809.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 810.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 811.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 812.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 813.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 814.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 815.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventPaymentRequested (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventPaymentChannelReady (DealStatusPaymentChannelAllocatingLane)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventLaneAllocated (DealStatusOngoing)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
> Recv: 816.1 MiB, Paid 0 FIL, ClientEventSendFunds (DealStatusSendFunds)
> Recv: 816.1 MiB, Paid 0.000000000002097152 FIL, ClientEventPaymentSent (DealStatusOngoing)
> Recv: 816.1 MiB, Paid 0.000000000002097152 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
> Recv: 816.1 MiB, Paid 0.000000000002097152 FIL, ClientEventSendFunds (DealStatusSendFunds)
> Recv: 816.1 MiB, Paid 0.000000000006291456 FIL, ClientEventPaymentSent (DealStatusOngoing)
> Recv: 816.1 MiB, Paid 0.000000000006291456 FIL, ClientEventUnknownResponseReceived (DealStatusFailing)
> Recv: 816.1 MiB, Paid 0.000000000006291456 FIL, ClientEventCancelComplete (DealStatusErrored)
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored

Version (run lotus version):
Occurs in 1.8.0 and a separate build off of PR #6149.

To Reproduce
Run the command above from dealbot-mainnet. (I can provide host, ip and credentials out of band.)

Expected behavior
I expect the transfer to complete successfully.

Logs
I have not been able to correlate this with any additional information in the logs.

EDIT: Added <details><pre>...</pre></details> around output to improve readability.

@ghost ghost added hint/needs-triaging kind/bug Kind: Bug labels May 19, 2021
@ghost ghost changed the title [BUG] [BUG] ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored May 19, 2021
@dkkapur dkkapur added this to the 🤝 Deal Success milestone May 19, 2021
@aarshkshah1992
Copy link
Contributor

@mgoelzer Would you be able to share the client & miner logs for this deal ?

@ghost
Copy link
Author

ghost commented May 20, 2021

@aarshkshah1992 I just repeated the test exactly as described above and captured the daemon log only during the test period: test-repeat.log. This was done with Lotus @ PR #6149.

Unf I cannot get the miner logs. This is just a random miner on the network, not one that I'm running.

@ghost
Copy link
Author

ghost commented May 24, 2021

Here's another example of this bug with a different miner.

Command
lotus client retrieve --miner f024148 bafykbzaceccrmibxlfjwgsmchbrh3idqiirx3j7fpcsrxp7ncygptu3bnpncs /dev/null

Console Output

$ lotus client retrieve --miner f024148 bafykbzaceccrmibxlfjwgsmchbrh3idqiirx3j7fpcsrxp7ncygptu3bnpncs  /dev/null
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
> Recv: 0 B, Paid 0 FIL, ClientEventDealProposed (DealStatusWaitForAcceptance)
> Recv: 1018 B, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 53 KiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventBlocksReceived (DealStatusWaitForAcceptance)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventDealAccepted (DealStatusAccepted)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventPaymentChannelAddingFunds (DealStatusPaymentChannelAddingInitialFunds)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventPaymentChannelReady (DealStatusPaymentChannelAllocatingLane)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventLaneAllocated (DealStatusOngoing)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
> Recv: 1.052 MiB, Paid 0 FIL, ClientEventSendFunds (DealStatusSendFunds)
> Recv: 1.052 MiB, Paid 0.000000000002097152 FIL, ClientEventPaymentSent (DealStatusOngoing)
> Recv: 1.052 MiB, Paid 0.000000000002097152 FIL, ClientEventPaymentRequested (DealStatusFundsNeeded)
> Recv: 1.052 MiB, Paid 0.000000000002097152 FIL, ClientEventUnknownResponseReceived (DealStatusFailing)
> Recv: 1.052 MiB, Paid 0.000000000002097152 FIL, ClientEventCancelComplete (DealStatusErrored)
> Recv: 0 B, Paid 0 FIL, ClientEventOpen (DealStatusNew)
ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored

Logs
Client logs:
lotus-daemon-log-bafykbzaceccrmibxlfjwgsmchbrh3idqiirx3j7fpcsrxp7ncygptu3bnpncs.log
Miner logs: Not available (I don't run this miner)

Lotus Version
Built from tip of master on Friday of last week (commit 0f83515a4 )

$ lotus version
Daemon:  1.11.0-dev+mainnet+git.0f83515a4.dirty+api1.3.0
Local: lotus version 1.11.0-dev+mainnet+git.0f83515a4.dirty

@raulk
Copy link
Member

raulk commented May 24, 2021

Relevant log lines. Looks like the channel was rejected by the miner:

2021-05-25T04:09:37.557+0800	�[34mINFO�[0m	markets	loggers/loggers.go:25	retrieval client event	{"name": "ClientEventPaymentRequested", "deal ID": "1621630583961700696", "state": "DealStatusFundsNeeded", "message": ""}
2021-05-25T04:09:37.563+0800	�[34mINFO�[0m	markets	loggers/loggers.go:25	retrieval client event	{"name": "ClientEventSendFunds", "deal ID": "1621630583961700696", "state": "DealStatusSendFunds", "message": ""}
2021-05-25T04:09:37.581+0800	�[34mINFO�[0m	markets	loggers/loggers.go:25	retrieval client event	{"name": "ClientEventPaymentSent", "deal ID": "1621630583961700696", "state": "DealStatusOngoing", "message": ""}
2021-05-25T04:09:37.595+0800	�[34mINFO�[0m	markets	loggers/loggers.go:25	retrieval client event	{"name": "ClientEventPaymentRequested", "deal ID": "1621630583961700696", "state": "DealStatusFundsNeeded", "message": ""}
2021-05-25T04:09:37.628+0800	�[34mINFO�[0m	dt-impl	impl/events.go:180	channel 12D3KooWSuUQ1Mx97EbMxRNAvPCYab9XstBxT6iSJkqL97zUnVh5-12D3KooWJypkaXDBZGyTVxMsXqnH6aL4j9tCoVhQ2MydxpE4FMPg-1621630583961168148: received rejected response, erroring out channel
2021-05-25T04:09:37.629+0800	�[34mINFO�[0m	markets	loggers/loggers.go:25	retrieval client event	{"name": "ClientEventUnknownResponseReceived", "deal ID": "1621630583961700696", "state": "DealStatusFailing", "message": "Unexpected deal response status: DealStatusErrored"}
2021-05-25T04:09:37.636+0800	�[34mINFO�[0m	dt-impl	impl/impl.go:284	close channel 12D3KooWSuUQ1Mx97EbMxRNAvPCYab9XstBxT6iSJkqL97zUnVh5-12D3KooWJypkaXDBZGyTVxMsXqnH6aL4j9tCoVhQ2MydxpE4FMPg-1621630583961168148
2021-05-25T04:09:37.638+0800	�[33mWARN�[0m	dt-impl	impl/impl.go:294	unable to close channel 12D3KooWSuUQ1Mx97EbMxRNAvPCYab9XstBxT6iSJkqL97zUnVh5-12D3KooWJypkaXDBZGyTVxMsXqnH6aL4j9tCoVhQ2MydxpE4FMPg-1621630583961168148: channel not found
2021-05-25T04:09:37.638+0800	�[34mINFO�[0m	dt-impl	impl/impl.go:298	12D3KooWSuUQ1Mx97EbMxRNAvPCYab9XstBxT6iSJkqL97zUnVh5: sending cancel channel to 12D3KooWJypkaXDBZGyTVxMsXqnH6aL4j9tCoVhQ2MydxpE4FMPg for channel 12D3KooWSuUQ1Mx97EbMxRNAvPCYab9XstBxT6iSJkqL97zUnVh5-12D3KooWJypkaXDBZGyTVxMsXqnH6aL4j9tCoVhQ2MydxpE4FMPg-1621630583961168148

On retrieval, this usually indicates an error on the miner when accepting the retrieval channel. The error probably comes from here: https://github.com/filecoin-project/go-data-transfer/blob/77b948c4eb91aac1638cdd2759d8aff39216ffe2/impl/events.go#L355

Unfortunately without the miner logs, we don't have any insight into what happened, since the error itself is not serialised across the wire.

@raulk
Copy link
Member

raulk commented May 24, 2021

TL;DR: client created the payment channel (because retrieval price was non-zero), waited for it to be confirmed on chain. Miner then requested payment, client sent payment voucher. When the data channel was opened, the miner rejected it and the client failed.

@hannahhoward
Copy link
Contributor

One note: to my knowledge all of @mgoelzer 's miners are running markets 1.3.0, which includes filecoin-project/go-fil-markets#540 -- a big retrieval change. I assume that most nodes on the internet are running 1.9.0 which includes markets 1.2.5

Possible hypothesis: we are looking at a networking protocol problem between 1.3.0 and 1.2.5?

I'm not sure but we should likely do some compatibility testing and I think this is a signal we need to hold on tagging Lotus for markets 1.3.0, especially if Lotus 1.10 is not a mandatory upgrade

@raulk raulk changed the title [BUG] ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored [BUG] ERROR: retrieval failed: Retrieve: Retrieval Error: Unexpected deal response status: DealStatusErrored (miner rejected) May 25, 2021
@raulk
Copy link
Member

raulk commented May 25, 2021

Compatibility testing between 1.9.0 (go-fil-markets 1.2.5) and master (go-fil-markets 1.3.0) was green on testground using the e2e deals test plan. However, this test plan uses 8MiB sectors and performs a 5MiB deal. We might be seeing something else here.

We continue investigating.

@aarshkshah1992
Copy link
Contributor

Note: Need to confirm that this is indeed a Markets v1.3.0/data-transfer v1.5.0 regression by seeing repeated success running a Lotus v1.9.0 client against a Lotus v1.9.0 miner.

@dkkapur dkkapur added the area/markets Area: Markets label May 27, 2021
@aarshkshah1992
Copy link
Contributor

aarshkshah1992 commented May 29, 2021

This is a compatibility issue between Markets 1.3.0 clients and <1.3.0 miners:

received voucher response &{Status:DealStatusErrored ID:1622205391038281095 PaymentOwed:+0 Message:addVoucher: supplied token amount too low; minD=2205594, D=2097152; laneAmt=0; v.Amt=2097152}

The line in the payment channel code that generates that error when voucher amount is below a certain expectation is at:

return delta, xerrors.Errorf("addVoucher: supplied token amount too low; minD=%s, D=%s; laneAmt=%s; v.Amt=%s", minDelta, delta, redeemed, sv.Amount)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/markets Area: Markets kind/bug Kind: Bug
Projects
None yet
5 participants