Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track sealing processes across lotus-miner restarts #3618

Merged
merged 78 commits into from
Oct 28, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
78 commits
Select commit Hold shift + click to select a range
159ce13
Async worker API
magik6k Sep 6, 2020
5d73943
storage: Fix import cycle
magik6k Sep 6, 2020
06e3852
storage: Integrate async workers in sealing manager
magik6k Sep 7, 2020
9e6f974
storage: Fix build
magik6k Sep 7, 2020
231a9e4
Fix sealing sched tests
magik6k Sep 7, 2020
554570f
docsgen
magik6k Sep 7, 2020
5f08fe7
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 10, 2020
1ebca8f
more working code
magik6k Sep 14, 2020
381a6cd
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 14, 2020
e9d25e5
More fixes
magik6k Sep 14, 2020
03cf6cc
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 15, 2020
b1361aa
sectorstorage: wip manager work tracker
magik6k Sep 16, 2020
5e09581
sectorstorage: get new work tracker to run
magik6k Sep 16, 2020
d9d644b
sectorstorage: handle restarting manager, test that
magik6k Sep 16, 2020
17680ff
gofmt
magik6k Sep 16, 2020
aa5bd7b
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 21, 2020
03c3d8b
workers: Return unfinished tasks on restart
magik6k Sep 21, 2020
b8865fb
workers: Mark on-restart-failed returned tasks as returned
magik6k Sep 21, 2020
706f4f2
worker: Don't die with the connection
magik6k Sep 22, 2020
bb5cc06
Fix workid param hash
magik6k Sep 22, 2020
04ad179
localworker: Fix contexts
magik6k Sep 22, 2020
6185e15
sectorstorage: calltracker: work around cbor-gen bytearray len limit
magik6k Sep 22, 2020
ce6b924
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 23, 2020
86c222a
sectorstorage: fix work tracking
magik6k Sep 23, 2020
3003789
worker: Use a real datastore for keeping track of calls
magik6k Sep 23, 2020
c17f0d7
sectorstorage: Fix panic in returnResult
magik6k Sep 23, 2020
d817dce
Show lost calls in sealing jobs cli
magik6k Sep 23, 2020
f576525
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 23, 2020
04ee53e
sectorstorage: Show task type of ret-wait jobs
magik6k Sep 24, 2020
a8fcb86
miner allinfo: Don't fail if sector status fails
magik6k Sep 24, 2020
cf71f03
Merge remote-tracking branch 'origin/dev' into feat/async-restartable…
magik6k Sep 26, 2020
a9d1ca4
Change order in miner sectors list
magik6k Sep 28, 2020
aa7090d
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 28, 2020
9e7d682
sectorstorage: Cleanup callToWork mapping after work is done
magik6k Sep 28, 2020
86cf3c8
worker: Reconnect correctly
magik6k Sep 28, 2020
4ba7af6
worker: Mark return methods as retry-safe
magik6k Sep 28, 2020
810c767
worker: Redeclare storage on reconnect
magik6k Sep 28, 2020
bf554d0
worker: Redeclare storage early on reconnect
magik6k Sep 28, 2020
9bd2537
stores: Fix error printing in http handler
magik6k Sep 28, 2020
1e6a69f
localworker: Don't mark calls as returned when returning fails
magik6k Sep 28, 2020
0f2dcf2
fsm: Reuse tickets in PC1 on retry
magik6k Sep 29, 2020
46a5bea
shed: Datastore utils
magik6k Sep 30, 2020
6855284
sectorstorage: Cancel non-running work in case of abort in sched
magik6k Sep 30, 2020
6ddea62
shed: gofmt
magik6k Sep 30, 2020
54fdd6b
sectorstorage: Variable scopes are hard
magik6k Sep 30, 2020
a783bf9
storagefsm: Handle PC2 with missing replica
magik6k Sep 30, 2020
c228598
sectorstorage: Variable scopes are really hard
magik6k Sep 30, 2020
4f97d96
Fix storage-fsm tests
magik6k Sep 30, 2020
2d16af6
sectorstorage: Fix TestRedoPC1
magik6k Sep 30, 2020
e3ee4e4
Fix lint errors
magik6k Sep 30, 2020
2cfe22d
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Sep 30, 2020
79d2ddf
Review
magik6k Sep 30, 2020
5932f28
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 1, 2020
921d78f
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 4, 2020
0de3051
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 8, 2020
71b3b90
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 13, 2020
b74a322
fsm: process expired-ticket sectors
magik6k Oct 13, 2020
68be28c
Add Session API
magik6k Oct 17, 2020
7ac5dc5
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 17, 2020
8d06cca
sched: Handle workers using sessions instead of connections
magik6k Oct 18, 2020
f933e1d
miner cli: Update to uuid worker IDs
magik6k Oct 18, 2020
cf4dfa3
worker: Use http rpc for miner API
magik6k Oct 18, 2020
879aa95
worker: Use miner session for connectivity check
magik6k Oct 18, 2020
dbb421c
localworker: Use better context for calling returnFunc
magik6k Oct 18, 2020
8c86ea6
localworker: Try very hard to get ruselts to manager
magik6k Oct 18, 2020
1a10f95
worker: Better miner connectivity check on startup
magik6k Oct 18, 2020
268d292
docsgen
magik6k Oct 18, 2020
660236b
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 23, 2020
4d87473
Fix lint
magik6k Oct 23, 2020
e1da874
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 27, 2020
413643a
Merge remote-tracking branch 'origin/master' into feat/async-restarta…
magik6k Oct 27, 2020
84b567c
sched: move worker funcs to a separate file
magik6k Oct 28, 2020
8731fe9
sched: split worker handling into more funcs
magik6k Oct 28, 2020
96c5ff7
sched: use more letters for variables
magik6k Oct 28, 2020
4cf00b8
worker_local: address review
magik6k Oct 28, 2020
ed2f81d
sched: Fix tests
magik6k Oct 28, 2020
4100f6e
fix TestWDPostDoPost
magik6k Oct 28, 2020
da7ecc1
Fix flaky sealing manager tests
magik6k Oct 28, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions api/api_common.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ import (
"context"
"fmt"

"github.com/google/uuid"

"github.com/filecoin-project/go-jsonrpc/auth"
metrics "github.com/libp2p/go-libp2p-core/metrics"
"github.com/libp2p/go-libp2p-core/network"
Expand Down Expand Up @@ -58,6 +60,9 @@ type Common interface {
// trigger graceful shutdown
Shutdown(context.Context) error

// Session returns a random UUID of api provider session
Session(context.Context) (uuid.UUID, error)

Closing(context.Context) (<-chan struct{}, error)
}

Expand Down
2 changes: 2 additions & 0 deletions api/api_full.go
Original file line number Diff line number Diff line change
Expand Up @@ -374,6 +374,8 @@ type FullNode interface {
StateMinerInitialPledgeCollateral(context.Context, address.Address, miner.SectorPreCommitInfo, types.TipSetKey) (types.BigInt, error)
// StateMinerAvailableBalance returns the portion of a miner's balance that can be withdrawn or spent
StateMinerAvailableBalance(context.Context, address.Address, types.TipSetKey) (types.BigInt, error)
// StateMinerSectorAllocated checks if a sector is allocated
StateMinerSectorAllocated(context.Context, address.Address, abi.SectorNumber, types.TipSetKey) (bool, error)
// StateSectorPreCommitInfo returns the PreCommit info for the specified miner's sector
StateSectorPreCommitInfo(context.Context, address.Address, abi.SectorNumber, types.TipSetKey) (miner.SectorPreCommitOnChainInfo, error)
// StateSectorGetInfo returns the on-chain info for the specified miner's sector. Returns null in case the sector info isn't found
Expand Down
6 changes: 4 additions & 2 deletions api/api_storage.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"time"

datatransfer "github.com/filecoin-project/go-data-transfer"
"github.com/google/uuid"
"github.com/ipfs/go-cid"
"github.com/libp2p/go-libp2p-core/peer"

Expand Down Expand Up @@ -64,8 +65,9 @@ type StorageMiner interface {

// WorkerConnect tells the node to connect to workers RPC
WorkerConnect(context.Context, string) error
WorkerStats(context.Context) (map[uint64]storiface.WorkerStats, error)
WorkerJobs(context.Context) (map[uint64][]storiface.WorkerJob, error)
WorkerStats(context.Context) (map[uuid.UUID]storiface.WorkerStats, error)
WorkerJobs(context.Context) (map[uuid.UUID][]storiface.WorkerJob, error)
storiface.WorkerReturn

// SealingSchedDiag dumps internal sealing scheduler state
SealingSchedDiag(context.Context) (interface{}, error)
Expand Down
18 changes: 5 additions & 13 deletions api/api_worker.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,13 @@ package api

import (
"context"
"io"

"github.com/ipfs/go-cid"
"github.com/google/uuid"

"github.com/filecoin-project/go-state-types/abi"
"github.com/filecoin-project/lotus/extern/sector-storage/sealtasks"
"github.com/filecoin-project/lotus/extern/sector-storage/stores"
"github.com/filecoin-project/lotus/extern/sector-storage/storiface"
"github.com/filecoin-project/specs-storage/storage"

"github.com/filecoin-project/lotus/build"
)
Expand All @@ -23,18 +21,12 @@ type WorkerAPI interface {
Paths(context.Context) ([]stores.StoragePath, error)
Info(context.Context) (storiface.WorkerInfo, error)

AddPiece(ctx context.Context, sector abi.SectorID, pieceSizes []abi.UnpaddedPieceSize, newPieceSize abi.UnpaddedPieceSize, pieceData storage.Data) (abi.PieceInfo, error)
storiface.WorkerCalls

storage.Sealer

MoveStorage(ctx context.Context, sector abi.SectorID, types stores.SectorFileType) error

UnsealPiece(context.Context, abi.SectorID, storiface.UnpaddedByteIndex, abi.UnpaddedPieceSize, abi.SealRandomness, cid.Cid) error
ReadPiece(context.Context, io.Writer, abi.SectorID, storiface.UnpaddedByteIndex, abi.UnpaddedPieceSize) (bool, error)
// Storage / Other
Remove(ctx context.Context, sector abi.SectorID) error

StorageAddLocal(ctx context.Context, path string) error

Fetch(context.Context, abi.SectorID, stores.SectorFileType, stores.PathType, stores.AcquireMode) error

Closing(context.Context) (<-chan struct{}, error)
Session(context.Context) (uuid.UUID, error)
}
193 changes: 129 additions & 64 deletions api/apistruct/struct.go

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions build/version.go
Original file line number Diff line number Diff line change
Expand Up @@ -84,8 +84,8 @@ func VersionForType(nodeType NodeType) (Version, error) {
// semver versions of the rpc api exposed
var (
FullAPIVersion = newVer(0, 17, 0)
MinerAPIVersion = newVer(0, 16, 0)
WorkerAPIVersion = newVer(0, 15, 0)
MinerAPIVersion = newVer(0, 17, 0)
WorkerAPIVersion = newVer(0, 16, 0)
)

//nolint:varcheck,deadcode
Expand Down
1 change: 1 addition & 0 deletions chain/messagepool/selection.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import (
"github.com/filecoin-project/go-address"
"github.com/filecoin-project/go-state-types/abi"
tbig "github.com/filecoin-project/go-state-types/big"

"github.com/filecoin-project/lotus/build"
"github.com/filecoin-project/lotus/chain/messagepool/gasguess"
"github.com/filecoin-project/lotus/chain/types"
Expand Down
36 changes: 34 additions & 2 deletions cli/cmd.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"context"
"fmt"
"net/http"
"net/url"
"os"
"os/signal"
"strings"
Expand Down Expand Up @@ -206,7 +207,22 @@ func GetFullNodeAPI(ctx *cli.Context) (api.FullNode, jsonrpc.ClientCloser, error
return client.NewFullNodeRPC(ctx.Context, addr, headers)
}

func GetStorageMinerAPI(ctx *cli.Context, opts ...jsonrpc.Option) (api.StorageMiner, jsonrpc.ClientCloser, error) {
type GetStorageMinerOptions struct {
PreferHttp bool
}

type GetStorageMinerOption func(*GetStorageMinerOptions)

func StorageMinerUseHttp(opts *GetStorageMinerOptions) {
opts.PreferHttp = true
}

func GetStorageMinerAPI(ctx *cli.Context, opts ...GetStorageMinerOption) (api.StorageMiner, jsonrpc.ClientCloser, error) {
var options GetStorageMinerOptions
for _, opt := range opts {
opt(&options)
}

if tn, ok := ctx.App.Metadata["testnode-storage"]; ok {
return tn.(api.StorageMiner), func() {}, nil
}
Expand All @@ -216,7 +232,23 @@ func GetStorageMinerAPI(ctx *cli.Context, opts ...jsonrpc.Option) (api.StorageMi
return nil, nil, err
}

return client.NewStorageMinerRPC(ctx.Context, addr, headers, opts...)
if options.PreferHttp {
u, err := url.Parse(addr)
if err != nil {
return nil, nil, xerrors.Errorf("parsing miner api URL: %w", err)
}

switch u.Scheme {
case "ws":
u.Scheme = "http"
case "wss":
u.Scheme = "https"
}

addr = u.String()
}

return client.NewStorageMinerRPC(ctx.Context, addr, headers)
}

func GetWorkerAPI(ctx *cli.Context) (api.WorkerAPI, jsonrpc.ClientCloser, error) {
Expand Down
4 changes: 2 additions & 2 deletions cmd/lotus-bench/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ import (
lcli "github.com/filecoin-project/lotus/cli"
"github.com/filecoin-project/lotus/extern/sector-storage/ffiwrapper"
"github.com/filecoin-project/lotus/extern/sector-storage/ffiwrapper/basicfs"
"github.com/filecoin-project/lotus/extern/sector-storage/stores"
"github.com/filecoin-project/lotus/extern/sector-storage/storiface"
"github.com/filecoin-project/specs-storage/storage"

lapi "github.com/filecoin-project/lotus/api"
Expand Down Expand Up @@ -614,7 +614,7 @@ func runSeals(sb *ffiwrapper.Sealer, sbfs *basicfs.Provider, numSectors int, par
if !skipunseal {
log.Infof("[%d] Unsealing sector", i)
{
p, done, err := sbfs.AcquireSector(context.TODO(), abi.SectorID{Miner: mid, Number: 1}, stores.FTUnsealed, stores.FTNone, stores.PathSealing)
p, done, err := sbfs.AcquireSector(context.TODO(), abi.SectorID{Miner: mid, Number: 1}, storiface.FTUnsealed, storiface.FTNone, storiface.PathSealing)
if err != nil {
return xerrors.Errorf("acquire unsealed sector for removing: %w", err)
}
Expand Down
Loading