Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement list_root_keys to the AdminKeyValueStore #3142

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

MathieuDutSik
Copy link
Contributor

@MathieuDutSik MathieuDutSik commented Jan 16, 2025

Motivation

We need to have the feature of accessing to the list of root keys in order to have mutation function and accessing to the list of chain ids of a storage.

Fixes #3085

Proposal

The implementation causes some problems:

  • For ScyllaDb the implementation is very easy since the root_key is used as a partition key.
  • For DynamoDb we can implement the root_key as a partition key, but DynamoDb forbids iterating and determining all the partition keys. So, we need to keep track of the keys.
  • For RocksDb / StorageService / IndexedDb we need to keep track of the list of root_keys.
  • For RocksDb, this led to a simplification since the edge case of having a key of the form [255, ..., 255] disappears.
  • The result of the get_root_keys will not be the same on different storage. If storage has been created with fn create but no key is inserted then in ScyllaDb / DynamoDb it will not show the root_key but in RocksDb it will appear.

The end result is that the fn connect becomes slower for RocksDb, StorageService, IndexedDb but this is I think fine as the issue of performance shows up in other operations.

Test Plan

One test has been added for this feature.

Release Plan

No impact on the TestNet / DevNet. It can follow the normal release plan.

Links

None.

@MathieuDutSik MathieuDutSik marked this pull request as ready for review January 20, 2025 12:37
@@ -92,11 +92,19 @@ async fn get_config_internal(
}
}

/// The DynamoDb forbids the iteration over the partition keys.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The DynamoDb forbids the iteration over the partition keys.
/// DynamoDB forbids the iteration over the partition keys.
///

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks.

/// The DynamoDb forbids the iteration over the partition keys.
/// Therefore we use a special partition_key named [1] for storing
/// the root keys. For normal root_keys, we simply put a 0 in front
/// therefore no intersection is possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this comment be on PARTITION_KEY_ROOT_KEY?
Currently it's associated with PARTITION_ATTRIBUTE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, corrected.

let mut vec = vec![0];
vec.extend(root_key);
vec
let mut start_key = vec![0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant to be the same as EMPTY_ROOT_KEY, is it?

Suggested change
let mut start_key = vec![0];
let mut start_key = EMPTY_ROOT_KEY.to_vec();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

linera-views/src/backends/dynamo_db.rs Show resolved Hide resolved
@@ -308,7 +314,7 @@ impl TransactionBuilder {
value: Vec<u8>,
store: &DynamoDbStoreInternal,
) -> Result<(), DynamoDbStoreInternalError> {
let transact = store.build_put_transact(&self.root_key, key, value)?;
let transact = store.build_put_transact(&self.start_key, key, value)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(transact is an uncommon abbreviation for transaction. I'd either use the full word or txn maybe?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I used transaction in the DynamoDb code.

@@ -964,6 +964,14 @@ pub enum DatabaseToolCommand {
#[arg(long = "storage")]
storage_config: String,
},

/// List the rot keys of the database
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// List the rot keys of the database
/// List the root keys of the database

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

match self {
StoreConfig::Memory(_, _) => Err(ViewError::StoreError {
backend: "memory".to_string(),
error: "list_all is not supported for the memory storage".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
error: "list_all is not supported for the memory storage".to_string(),
error: "list_all_root_keys is not supported for the memory storage".to_string(),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

}

message ReplyInsertRootKey {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We could use google.protobuf.Empty instead of all those empty types. Not sure if that's preferable, but we do it in rpc.proto.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that is great.

root_key_admin_test::<DynamoDbStore>().await;
}

#[cfg(with_scylladb)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be more readable to use test_case for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you.


/// Obtains the list of existing namespaces.
async fn list_all(config: &Self::Config) -> Result<Vec<String>, Self::Error>;

/// Lists the root keys of the namespace.
/// It is possible that some root_keys have no keys.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean "some namespaces have no root keys"?

Copy link
Contributor Author

@MathieuDutSik MathieuDutSik Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I really mean that.
What happens is that for each key, there is at least one root key.
But for some root key, there is no corresponding key.

  • For ScyllaDb, we will have that the insertion is done via (root_ley, key) which lead to the fact that for each key we have a matching root_key.
  • For others the root key has to be stored as a key. That creation is done just at fn connect or fn clone_with_root_key. So, the creation itself creates root keys. But the key is not yet written.

What this designs allow is to access to all the keys and so to to transfer files from one storage to another.

Copy link
Contributor

@afck afck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I need some more explanations (or better: code comments, or pointers where to find them) before I can review this. I'm confused how the keys are structured and represented in the different databases. E.g. I thought the first byte was there to distinguish between actual key-value pairs, and markers for the existence of namespaces; in what way does that relate to root keys at all, and why is it changing now?

let set_root_keys = root_keys.iter().cloned().collect::<HashSet<_>>();
for read_root_key in &read_root_keys {
assert!(set_root_keys.contains(read_root_key));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only asserts that read_root_keys are a subset of set_root_keys. Shouldn't we assert that they are equal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.
For a database like ScyllaDb, the root keys would be inserted only if there is a corresponding key. This is why there is a size_select that could be 0.

let mut root_keys = HashSet::new();
while let Some(row) = rows.next().await {
let (root_key,) = row?;
let root_key = root_key[1..].to_vec();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we drop the first byte? Is this KeyTag::Key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ScyllaDb, a key has to hava non-zero length. Therefore, we have a function get_big_root_key that adds a 0 at the first entry.

}
let mut full_key1 = self.start_key.to_vec();
full_key1.extend(&key_prefix);
let full_key2 = get_upper_bound_option(&full_key1).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Why can't this be None? Is this because the edge case is gone, because the first byte is never 255?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.
Added, documentation.

@MathieuDutSik MathieuDutSik changed the title Implement get_root_keys to the AdminKeyValueStore Implement list_root_keys to the AdminKeyValueStore Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce the get_root_keys to the AdminKeyValueStore
2 participants