[BUG] Lotus tries to acquire ~500GB for sectors in FinalizeSector state #6182

Closed
nonsense opened this issue May 4, 2021 · 4 comments
Labels
kind/bug Kind: Bug

Comments

@nonsense
Member

nonsense commented May 4, 2021

Describe the bug
For some reason I ended up with 4 sectors in FinalizeSector state on my miner:

639  Proving         YES      YES     2275725 (in 1 year 24 weeks)    CC
640  FinalizeSector  YES      NO      2275725 (in 1 year 24 weeks)    CC
  RecoveryTimeout: 766665 (in 1 week 6 days)
641  Proving         YES      YES     2275725 (in 1 year 24 weeks)    CC
642  FinalizeSector  YES      NO      2278605 (in 1 year 24 weeks)    CC
  RecoveryTimeout: 766665 (in 1 week 6 days)
643  FinalizeSector  YES      NO      2278605 (in 1 year 24 weeks)    CC
  RecoveryTimeout: 766665 (in 1 week 6 days)
644  FinalizeSector  YES      NO      2278605 (in 1 year 24 weeks)    CC
645  WaitDeals       NO       NO      n/a                             1

The sectors never moved to the Proving state. After rebooting the miner, I noticed that Lotus tries to acquire ~520GB of disk space, which is not available, in order to process the sectors:

2021-05-04T13:26:27.384+0300    ERROR   advmgr  sector-storage/sched.go:404     trySched(1) req.sel.Ok error: finding best alloc storage:
    github.com/filecoin-project/lotus/extern/sector-storage.(*allocSelector).Ok
        /root/lotus/extern/sector-storage/selector_alloc.go:55
  - no good path found:
    github.com/filecoin-project/lotus/extern/sector-storage/stores.(*Index).StorageBestAlloc
        /root/lotus/extern/sector-storage/stores/index.go:421
2021-05-04T13:26:27.384+0300    DEBUG   stores  stores/index.go:403     not allocating on d3f534f4-e820-419d-8405-3debfaa4fa25, out of space (available: 512476852224, need: 518832049356)

Even though I had 1.2TB free on the sealing location, none of the sectors could progress.
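The DEBUG line above reports the two numbers behind the rejection. As a rough illustration only (this is not the Lotus code, and `fits` and `shortfall` are hypothetical helpers), the comparison can be sketched with the values copied from that log line:

```go
package main

import "fmt"

// fits reports whether a path with `available` free bytes can take a
// request that needs `need` bytes. Hypothetical helper, not a Lotus API.
func fits(available, need uint64) bool {
	return available >= need
}

// shortfall returns how many bytes are missing, or 0 if the request fits.
func shortfall(available, need uint64) uint64 {
	if fits(available, need) {
		return 0
	}
	return need - available
}

func main() {
	// Values copied from the "out of space" log line.
	const available uint64 = 512476852224
	const need uint64 = 518832049356

	fmt.Println("fits:", fits(available, need))
	fmt.Println("short by (bytes):", shortfall(available, need))
}
```

With these numbers the path is short by 6355197132 bytes, roughly 6.4GB, which is why the storage location was skipped.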

The workaround was to hack Lotus and override the stat.Available variable in order to move forward with finalizing the sectors:

diff --git a/extern/sector-storage/stores/local.go b/extern/sector-storage/stores/local.go
index 5a10b21b9..0166dde8c 100644
--- a/extern/sector-storage/stores/local.go
+++ b/extern/sector-storage/stores/local.go
@@ -151,6 +151,8 @@ func (p *path) stat(ls LocalStorage) (fsutil.FsStat, error) {
                }
        }

+       stat.Available = 5124768522240
+
        return stat, err
 }

Version (run lotus version):
c074031fa163c03578d110762eced57b783435ed

Expected behavior
Lotus should not need to acquire ~500GB per sector when the sectors are sealed and just need to be moved from the sealing directory to a given HDD for storage.

@nonsense
Member Author

nonsense commented May 7, 2021

Ultimately the problem here seems to be that my HDDs are rather full - at 93%, and there is only ~500GB left on each of the storage locations.

I am not sure why Lotus needs ~520GB for FinalizeSector, given that a sealed sector is 32GB and we just need to move it to a given storage location...

@moremorefun

When calculating the required storage space, the miner calls the same function as in the sealing phase, which leads to this situation.
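One way to check this is to reproduce the logged `need` value with simple arithmetic. The sketch below assumes a 32GiB sector, an overhead denominator of 10, and sealing-phase overhead factors of 10 for the sealed file and 141 for the cache. These constants are not quoted anywhere in this thread and are assumptions, used here because they reproduce the logged number exactly. `sealNeed` is a hypothetical helper.

```go
package main

import "fmt"

const (
	sectorSize  uint64 = 32 << 30 // 32GiB in bytes
	overheadDen uint64 = 10       // assumed denominator for overhead factors
)

// Assumed sealing-phase overhead factors, in units of sectorSize/overheadDen.
var sealOverhead = map[string]uint64{
	"sealed": 10,  // 1.0x the sector size
	"cache":  141, // 14.1x the sector size
}

// sealNeed sums the per-file estimates, applying integer division to each
// file type separately before adding it to the total.
func sealNeed(files ...string) uint64 {
	var need uint64
	for _, f := range files {
		need += sealOverhead[f] * sectorSize / overheadDen
	}
	return need
}

func main() {
	// Sealed file plus cache for one 32GiB sector.
	fmt.Println(sealNeed("sealed", "cache")) // prints 518832049356
}
```

The result matches the `need: 518832049356` value in the log, which is consistent with the sealing-phase estimate being applied to a sector that only needs to be moved to storage.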

@nonsense
Member Author

nonsense commented May 7, 2021

@moremorefun that's right, I am just wondering if this is necessary, or just done for simplicity and needs fixing. I think it is the former.

@moremorefun

moremorefun commented May 8, 2021

> @moremorefun that's right, I am just wondering if this is necessary, or just done for simplicity and needs fixing. I think it is the former.

I tried adding a separate function that calculates the space required on storage paths, and preliminary testing did not reveal any problems.

extern/sector-storage/storiface/filetype.go

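// StoreSpaceUse estimates the space needed to hold the files in t for a
// sector of size ssize once they are finalized, using FsOverheadFinalized
// rather than the sealing-phase overheads.
func (t SectorFileType) StoreSpaceUse(ssize abi.SectorSize) (uint64, error) {
	var need uint64
	for _, pathType := range PathTypes {
		// Skip file types that are not part of this request.
		if !t.Has(pathType) {
			continue
		}

		oh, ok := FsOverheadFinalized[pathType]
		if !ok {
			return 0, xerrors.Errorf("no finalized overhead info for %s", pathType)
		}

		// Overheads are expressed as multiples of ssize/FSOverheadDen.
		need += uint64(oh) * uint64(ssize) / FSOverheadDen
	}

	return need, nil
}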

extern/sector-storage/stores/index.go

-	spaceReq, err := allocate.SealSpaceUse(ssize)
+	var err error
+	spaceReq := uint64(math.MaxUint64)
+	switch pathType {
+	case storiface.PathSealing:
+		spaceReq, err = allocate.SealSpaceUse(ssize)
+	case storiface.PathStorage:
+		spaceReq, err = allocate.StoreSpaceUse(ssize)
+	}
 	if err != nil {
 		return nil, xerrors.Errorf("estimating required space: %w", err)
 	}
