Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testnet state migration preview.22 health check after migration error #3288

Closed
j1010001 opened this issue Apr 26, 2024 · 7 comments
Closed
Assignees
Labels
Bug Something isn't working

Comments

@j1010001
Copy link
Member

j1010001 commented Apr 26, 2024

flow-go build: v0.34.0-crescendo-preview.15-atree-inlining
migration doc: https://www.notion.so/flowfoundation/TN-state-migration-test-Atree-inlining-C1-0-Apr-24-preview-15-71afa024d73042da9c2bc020cb734850?pvs=4

error:
migration-inlined-payloads-apr24-preview15-check2-output.txt

@turbolent turbolent self-assigned this Apr 26, 2024
@turbolent turbolent added the Bug Something isn't working label Apr 26, 2024
@fxamacker
Copy link
Member

@j1010001 status of this is "in-progress" (since yesterday afternoon).

@fxamacker
Copy link
Member

@turbolent I looked into this today and it looks like the problem is in Cadence 1.0 migration's MigrateNestedValue function.

func (m *StorageMigration) MigrateNestedValue(

It seems when dictionary element is migrated, MigrateNestedValue() returns nil as migrated value. So parent container is unchanged even though nested element is modified.

if newKey == nil && newValue == nil {
continue
}

With testnet data:

  • dictionary (0xa47a2d3a3b7e9133.14553) is inlined and removed from storage as result of migration
  • MigrateNestedValue() returns nil (as migrated value) for this dictionary even though it is migrated
  • parent dictionary (0xa47a2d3a3b7e9133.16828) thinks child element isn't migrated (nil) so it still retain the old data (slab ID storable 0xa47a2d3a3b7e9133.14553) even though the child element should be reset.
  • given this, storage health check returns slab not found error for 0xa47a2d3a3b7e9133.14553.

I am going to look into this more tomorrow. Please let me know if you have some questions.

@turbolent
Copy link
Member

Thank you for the investigation @fxamacker!

I was able to reproduce this in flow-go, and started working on reproducer in the Cadence repo here: feature/atree-register-inlining-v1.0...bastian/debug-lost-slab-during-migration. So far I'm not able to reproduce it

@fxamacker
Copy link
Member

fxamacker commented May 3, 2024

I was able to reproduce this in flow-go, and started working on reproducer in the Cadence repo here: feature/atree-register-inlining-v1.0...bastian/debug-lost-slab-during-migration. So far I'm not able to reproduce it

@turbolent Thanks for working on this!

I created a reproducer just now (~9pm) in Cadence repo (without flow-go migration program).

I'll let you know in the morning after I double-check and clean it up before we chat. Not being familiar with Cadence 1.0 migration code tripped me up a few times while creating the reproducer.

fxamacker added a commit that referenced this issue May 3, 2024
The problem was initially detected in testnet migration
of atree inlined data with Cadence 1.0 migration.

The bug  happened with 5 nested levels of data while
this reproducer is simplified to use 3 nested levels.

More info at:
#3288
fxamacker added a commit that referenced this issue May 3, 2024
The problem was initially detected in testnet migration
of atree inlined data with Cadence 1.0 migration.

The bug  happened with 5 nested levels of data while
this reproducer is simplified to use 3 nested levels.

More info at:
#3288
@turbolent
Copy link
Member

Awesome work @fxamacker! 👏 I'm going to look into it

@fxamacker
Copy link
Member

Awesome work @fxamacker! 👏 I'm going to look into it

@turbolent Sounds great! I opened a PR with a test and a reproducer for you to use.

The testnet data for this bug happened at 5 nested levels, so I simplified the reproducer to 3 nested levels.

Hope this reproducer saves time while you resolve this issue. 🙏

@j1010001 The reproducer doesn't include a fix.

fxamacker added a commit that referenced this issue May 6, 2024
This commit fixes Cadence 1.0 migration when
using atree inlined data. See issue:

  #3288

Previously, MigrateNestedValue() migrates dictionary by using
readonly iterator and migrating values in place.

This commit migrates keys first using readonly iterator and
then migrates values using mutable iterator.
turbolent pushed a commit that referenced this issue May 7, 2024
This commit fixes Cadence 1.0 migration when
using atree inlined data. See issue:

  #3288

Previously, MigrateNestedValue() migrates dictionary by using
readonly iterator and migrating values in place.

This commit migrates keys first using readonly iterator and
then migrates values using mutable iterator.
@turbolent
Copy link
Member

Fixed by #3318 and #3314

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants