feat: change data store backend to use liblmdb directly (#11357)
This PR adds a new backend implementation for data stores that's based
on a thin layer on top of lmdb.c. This is the same layer used by
`NativeWorldState`.

This gives us tighter control over how data is serialized (no more
bigint issues, #9690 #9793) and how it's accessed, and lets us use a
consistent version of lmdb across our stack.

This brings with it a change of interface, since reads and writes are
now asynchronous.

## Architecture

The architecture is similar to `NativeWorldState`: a module
([liblmdb](https://github.com/AztecProtocol/aztec-packages/blob/feat/lmdb-wrapper/barretenberg/cpp/src/barretenberg/lmdblib/lmdb_store.hpp))
wraps lmdb.c and provides idiomatic C++ access to databases,
transactions and cursors. This module is thread safe.

This module is then exposed through node-module-api to Nodejs. The
communication interface between the C++ code and Nodejs is based on
passing msgpack encoded messages around. The addon interface is really
simple, only exposing a single class with a single asynchronous method
`call: (message: Buffer) => Promise<Buffer>`.
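
To illustrate the single-method interface described above, here is a minimal sketch of a client that funnels every operation through one `call`. The names `NativeModule`, `AddonClient` and `FakeNative`, and the message shape, are assumptions made for this example. JSON in a `Uint8Array` stands in for msgpack in a `Buffer` so the sketch has no dependencies.

```typescript
// Sketch only: all names and the message shape here are hypothetical.
interface NativeModule {
  call(message: Uint8Array): Promise<Uint8Array>;
}

type RequestMsg = { msgId: number; type: string; payload: unknown };
type ResponseMsg = { msgId: number; payload: unknown };

// JSON stands in for msgpack in this sketch.
function encode(value: unknown): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(value));
}

function decode<T>(bytes: Uint8Array): T {
  return JSON.parse(new TextDecoder().decode(bytes)) as T;
}

class AddonClient {
  private nextId = 0;
  private readonly native: NativeModule;

  constructor(native: NativeModule) {
    this.native = native;
  }

  // Every operation is one encoded request and one encoded response.
  async send<T>(type: string, payload: unknown): Promise<T> {
    const request: RequestMsg = { msgId: this.nextId++, type, payload };
    const response = decode<ResponseMsg>(await this.native.call(encode(request)));
    if (response.msgId !== request.msgId) {
      throw new Error(`unexpected response id ${response.msgId}`);
    }
    return response.payload as T;
  }
}

// In-memory stand-in for the C++ module, for demonstration only.
class FakeNative implements NativeModule {
  private readonly data = new Map<string, string>();

  async call(message: Uint8Array): Promise<Uint8Array> {
    const request = decode<RequestMsg>(message);
    const args = request.payload as { key: string; value?: string };
    let result: unknown = null;
    if (request.type === 'set') {
      this.data.set(args.key, args.value ?? '');
      result = true;
    } else if (request.type === 'get') {
      result = this.data.get(args.key) ?? null;
    }
    const response: ResponseMsg = { msgId: request.msgId, payload: result };
    return encode(response);
  }
}
```

A caller would write `await client.send('set', { key: 'a', value: '1' })` and never touch the encoding directly. Keeping the surface to one method means new operations only need a new message type, not a new native binding.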

The C++ module does not have its own thread pool; it piggybacks on the
Nodejs thread pool, which means we have to be careful not to exhaust
it.

On the Nodejs side we create a new `AsyncStore` backend that implements
the same interface (only async).

## Transactions

LMDB supports multiple concurrent readers, but only one writer at a time.

The `WriteTransaction` class in Nodejs accumulates writes locally and
sends them to the database as one big, atomic batch. Any reads that
happen while a write transaction is open (and in the same async context)
take the uncommitted data into account.

While `WriteTransaction` is accumulating writes, reads from outside
that async context are still honoured, but they only see committed data
(providing isolation from dirty writes). The `WriteTransaction` object
is only available in the async context (using `AsyncLocalStorage`) that
started that operation.

The Nodejs store queues up write transactions so that only one is active
at a time.
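
The behaviour described in this section can be sketched as follows. This is a simplified illustration, not the actual implementation: `SketchStore` and `WriteTx` are hypothetical names, and a `Map` stands in for the committed LMDB data.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical write transaction: accumulates writes locally.
class WriteTx {
  readonly pending = new Map<string, string>();

  set(key: string, value: string): void {
    this.pending.set(key, value);
  }
}

// Hypothetical store; the Map stands in for committed data in LMDB.
class SketchStore {
  private readonly committed = new Map<string, string>();
  private readonly currentTx = new AsyncLocalStorage<WriteTx>();
  private queue: Promise<unknown> = Promise.resolve();

  // Reads inside the transaction's async context see its uncommitted
  // writes; reads from anywhere else see committed data only.
  async get(key: string): Promise<string | undefined> {
    const tx = this.currentTx.getStore();
    if (tx !== undefined && tx.pending.has(key)) {
      return tx.pending.get(key);
    }
    return this.committed.get(key);
  }

  // Transactions are chained on a queue so only one is active at a time.
  transaction<T>(fn: (tx: WriteTx) => Promise<T>): Promise<T> {
    const run = async (): Promise<T> => {
      const tx = new WriteTx();
      const result = await this.currentTx.run(tx, () => fn(tx));
      // Apply the whole batch only if the callback succeeded.
      tx.pending.forEach((value, key) => this.committed.set(key, value));
      return result;
    };
    const next = this.queue.then(run);
    // Keep the queue alive even if this transaction fails.
    this.queue = next.catch(() => undefined);
    return next;
  }
}
```

In this sketch a transaction whose callback throws is dropped without touching committed data, and the next queued transaction still runs.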

## Cursors

Cursors on the Nodejs side implement the `AsyncIterable` protocol so
they can be used in `for await...of` loops and can be passed to our
helpers in aztec/foundation (e.g. `toArray`, `take`, etc).

Cursors use a long-lived read transaction. A lot of the queries used in
our stores actually depend on cursors (e.g. `getLatestSynchedL2Block`
starts a cursor at the end of the database and reads one block).

We have a limited number of readers available in C++. If this number is
reached, the next read will block until a reader becomes available.
The Nodejs store uses a semaphore that only allows up to
`maxReaders - 1` cursors to be open at any one time. We always leave one
reader available to perform simple gets (otherwise we'd risk blocking
the entire thread pool).
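
A minimal sketch of the reader limit, assuming a simple promise-based semaphore. The `Semaphore` class, the `withCursor` helper and the value of `maxReaders` are illustrative assumptions, not taken from the PR.

```typescript
// Hypothetical counting semaphore built on promises.
class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.available = permits;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    // No permit free: wait until release() hands one over.
    await new Promise<void>(resolve => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next !== undefined) {
      next(); // pass the permit straight to the next waiter
    } else {
      this.available++;
    }
  }
}

const maxReaders = 4; // assumed value for the example
// One reader is always held back for simple gets.
const cursorPermits = new Semaphore(maxReaders - 1);

// Runs `fn` while holding a cursor permit and always returns the permit.
async function withCursor<T>(fn: () => Promise<T>): Promise<T> {
  await cursorPermits.acquire();
  try {
    return await fn();
  } finally {
    cursorPermits.release();
  }
}
```

Because the waiting happens in JavaScript, an excess cursor parks on a promise instead of occupying a thread-pool thread that is blocked inside the C++ module.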

We've added two 'optimizations' to our cursor implementation: (1) when
starting a cursor, the first page of results is sent back immediately,
and (2) if we know we only want a small number of results (e.g. the
last block in `getLatestSynchedL2Block`), the cursor is closed in the
same operation (this way we avoid keeping a reader open that would only
be closed in the next async execution).
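
The two optimizations can be sketched with an async generator. Everything here is hypothetical: the `CursorBackend` interface, the page shape and the in-memory `ArrayBackend` are assumptions for illustration, and the sketch iterates forward only, whereas the real stores can also start at the end of the database.

```typescript
type Entry = [string, string];
type Page = { entries: Entry[]; done: boolean };

// Hypothetical backend interface, not the real addon API.
interface CursorBackend {
  // Opens a cursor and returns the first page in the same round trip.
  // cursorId is null when the backend has already closed the cursor.
  start(pageSize: number, limit?: number): Promise<{ cursorId: number | null; firstPage: Page }>;
  advance(cursorId: number, pageSize: number): Promise<Page>;
  close(cursorId: number): Promise<void>;
}

async function* iterate(
  backend: CursorBackend,
  pageSize: number,
  limit?: number,
): AsyncGenerator<Entry, void, undefined> {
  const { cursorId, firstPage } = await backend.start(pageSize, limit);
  let page = firstPage;
  let yielded = 0;
  try {
    while (true) {
      for (const entry of page.entries) {
        if (limit !== undefined && yielded >= limit) return;
        yield entry;
        yielded++;
      }
      const limitReached = limit !== undefined && yielded >= limit;
      if (page.done || cursorId === null || limitReached) return;
      page = await backend.advance(cursorId, pageSize);
    }
  } finally {
    // Runs on normal completion and when the consumer stops early.
    if (cursorId !== null) await backend.close(cursorId);
  }
}

// In-memory stand-in that counts round trips, for demonstration only.
class ArrayBackend implements CursorBackend {
  roundTrips = 0;
  private readonly data: Entry[];
  private readonly positions = new Map<number, number>();
  private nextId = 1;

  constructor(data: Entry[]) {
    this.data = data;
  }

  private read(pos: number, count: number): Page {
    return { entries: this.data.slice(pos, pos + count), done: pos + count >= this.data.length };
  }

  async start(pageSize: number, limit?: number): Promise<{ cursorId: number | null; firstPage: Page }> {
    this.roundTrips++;
    const count = limit !== undefined ? Math.min(limit, pageSize) : pageSize;
    const firstPage = this.read(0, count);
    // Small reads never leave a cursor open.
    if (firstPage.done || (limit !== undefined && limit <= pageSize)) {
      return { cursorId: null, firstPage };
    }
    const cursorId = this.nextId++;
    this.positions.set(cursorId, count);
    return { cursorId, firstPage };
  }

  async advance(cursorId: number, pageSize: number): Promise<Page> {
    this.roundTrips++;
    const pos = this.positions.get(cursorId) ?? 0;
    this.positions.set(cursorId, pos + pageSize);
    return this.read(pos, pageSize);
  }

  async close(cursorId: number): Promise<void> {
    this.roundTrips++;
    this.positions.delete(cursorId);
  }
}
```

With this shape, reading a single entry (`iterate(backend, pageSize, 1)`) costs one round trip, because the open, the read and the close all happen inside `start`.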

## Performance

In tests the performance is similar to the old backend. There is a
penalty to reads (reads are async now) but writes are on par.

## Changes to existing stores

The only modification necessary has been to have async reads and await
the write operations in transactions.
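
As a rough illustration of what this change looks like in a store, here is a hedged sketch. `AsyncKV`, `MemoryKV` and `recordBlock` are hypothetical names, not the real store interfaces.

```typescript
// Hypothetical async key-value interface; the real store types may differ.
interface AsyncKV {
  get(key: string): Promise<number | undefined>;
  set(key: string, value: number): Promise<void>;
}

// In-memory stand-in, for demonstration only.
class MemoryKV implements AsyncKV {
  private readonly data = new Map<string, number>();

  async get(key: string): Promise<number | undefined> {
    return this.data.get(key);
  }

  async set(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}

// With a synchronous backend the read would not have needed `await`.
async function recordBlock(store: AsyncKV, blockNumber: number): Promise<number> {
  const latest = (await store.get('latestBlock')) ?? 0; // reads are awaited
  const next = Math.max(latest, blockNumber);
  await store.set('latestBlock', next); // writes are awaited too
  return next;
}
```

Callers of such a function become async in turn, which is why the change ripples through a store's public methods.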

## Ported data stores

- the archiver (blocks, logs, contracts, txs)
- the tx mempool 
- the proving job store

## TODO

- [x] port attestation pool, peer store
- [ ] add metrics
- [ ] fix merge conflicts 😢

---------

Co-authored-by: PhilWindle <philip.windle@gmail.com>
alexghr and PhilWindle authored Jan 31, 2025
1 parent ccaf6db commit 7e3a38e
Showing 161 changed files with 6,607 additions and 1,801 deletions.
2 changes: 1 addition & 1 deletion Dockerfile.aztec
@@ -3,7 +3,7 @@ ENV BB_WORKING_DIRECTORY=/usr/src/bb
ENV BB_BINARY_PATH=/usr/src/barretenberg/cpp/build/bin/bb
ENV ACVM_WORKING_DIRECTORY=/usr/src/acvm
ENV ACVM_BINARY_PATH=/usr/src/noir/noir-repo/target/release/acvm
RUN mkdir -p $BB_WORKING_DIRECTORY $ACVM_WORKING_DIRECTORY /usr/src/yarn-project/world-state/build
RUN mkdir -p $BB_WORKING_DIRECTORY $ACVM_WORKING_DIRECTORY /usr/src/yarn-project/native/build

COPY /usr/src /usr/src

2 changes: 1 addition & 1 deletion Dockerfile.end-to-end
@@ -9,7 +9,7 @@ ENV BB_BINARY_PATH=/usr/src/barretenberg/cpp/build/bin/bb
ENV ACVM_WORKING_DIRECTORY=/usr/src/acvm
ENV ACVM_BINARY_PATH=/usr/src/noir/noir-repo/target/release/acvm
ENV PROVER_AGENT_CONCURRENCY=8
RUN mkdir -p $BB_WORKING_DIRECTORY $ACVM_WORKING_DIRECTORY /usr/src/yarn-project/world-state/build
RUN mkdir -p $BB_WORKING_DIRECTORY $ACVM_WORKING_DIRECTORY /usr/src/yarn-project/native/build

COPY /usr/src /usr/src
COPY /anvil /opt/foundry/bin/anvil
8 changes: 4 additions & 4 deletions barretenberg/cpp/Earthfile
@@ -59,10 +59,10 @@ test-cache-read:
--command="exit 1"
SAVE ARTIFACT build/bin

preset-release-world-state:
preset-release-nodejs-module:
FROM +source
DO +CACHE_BUILD_BIN --prefix=preset-release-world-state \
--command="cmake --preset clang16-pic -Bbuild && cmake --build build --target world_state_napi && mv ./build/lib/world_state_napi.node ./build/bin"
DO +CACHE_BUILD_BIN --prefix=preset-release-nodejs-module \
--command="cmake --preset clang16-pic -Bbuild && cmake --build build --target nodejs_module && mv ./build/lib/nodejs_module.node ./build/bin"
SAVE ARTIFACT build/bin

preset-release-assert:
@@ -317,4 +317,4 @@ build:
BUILD +preset-wasm
BUILD +preset-wasm-threads
BUILD +preset-release
BUILD +preset-release-world-state
BUILD +preset-release-nodejs-module
10 changes: 5 additions & 5 deletions barretenberg/cpp/bootstrap.sh
@@ -17,12 +17,12 @@ function build_native {
cache_upload barretenberg-release-$hash.tar.gz build/bin
fi

(cd src/barretenberg/world_state_napi && yarn --frozen-lockfile --prefer-offline)
if ! cache_download barretenberg-release-world-state-$hash.tar.gz; then
(cd src/barretenberg/nodejs_module && yarn --frozen-lockfile --prefer-offline)
if ! cache_download barretenberg-release-nodejs-module-$hash.tar.gz; then
rm -f build-pic/CMakeCache.txt
cmake --preset $pic_preset -DCMAKE_BUILD_TYPE=RelWithAssert
cmake --build --preset $pic_preset --target world_state_napi
cache_upload barretenberg-release-world-state-$hash.tar.gz build-pic/lib/world_state_napi.node
cmake --build --preset $pic_preset --target nodejs_module
cache_upload barretenberg-release-nodejs-module-$hash.tar.gz build-pic/lib/nodejs_module.node
fi
}

@@ -118,4 +118,4 @@ case "$cmd" in
*)
echo "Unknown command: $cmd"
exit 1
esac
esac
3 changes: 2 additions & 1 deletion barretenberg/cpp/cmake/lmdb.cmake
@@ -3,6 +3,7 @@ include(ExternalProject)
set(LMDB_PREFIX "${CMAKE_BINARY_DIR}/_deps/lmdb")
set(LMDB_INCLUDE "${LMDB_PREFIX}/src/lmdb_repo/libraries/liblmdb")
set(LMDB_LIB "${LMDB_INCLUDE}/liblmdb.a")
set(LMDB_HEADER "${LMDB_INCLUDE}/lmdb.h")
set(LMDB_OBJECT "${LMDB_INCLUDE}/*.o")

ExternalProject_Add(
@@ -15,7 +16,7 @@ ExternalProject_Add(
BUILD_COMMAND make -C libraries/liblmdb -e XCFLAGS=-fPIC liblmdb.a
INSTALL_COMMAND ""
UPDATE_COMMAND "" # No update step
BUILD_BYPRODUCTS ${LMDB_LIB} ${LMDB_INCLUDE}
BUILD_BYPRODUCTS ${LMDB_LIB} ${LMDB_HEADER}
)

add_library(lmdb STATIC IMPORTED GLOBAL)
4 changes: 2 additions & 2 deletions barretenberg/cpp/dockerfiles/Dockerfile.x86_64-linux-clang
@@ -28,7 +28,7 @@ RUN cmake --build --preset clang16 --target ultra_honk_rounds_bench --target bb

RUN npm install --global yarn
RUN cmake --preset clang16-pic
RUN cmake --build --preset clang16-pic --target world_state_napi
RUN cmake --build --preset clang16-pic --target nodejs_module

FROM ubuntu:lunar
WORKDIR /usr/src/barretenberg/cpp
@@ -40,4 +40,4 @@ COPY --from=builder /usr/src/barretenberg/cpp/build/bin/grumpkin_srs_gen /usr/sr
# Copy libs for consuming projects.
COPY --from=builder /usr/src/barretenberg/cpp/build/lib/libbarretenberg.a /usr/src/barretenberg/cpp/build/lib/libbarretenberg.a
COPY --from=builder /usr/src/barretenberg/cpp/build/lib/libenv.a /usr/src/barretenberg/cpp/build/lib/libenv.a
COPY --from=builder /usr/src/barretenberg/cpp/build-pic/lib/world_state_napi.node /usr/src/barretenberg/cpp/build-pic/lib/world_state_napi.node
COPY --from=builder /usr/src/barretenberg/cpp/build-pic/lib/nodejs_module.node /usr/src/barretenberg/cpp/build-pic/lib/nodejs_module.node
13 changes: 13 additions & 0 deletions barretenberg/cpp/scripts/lmdblib_tests.sh
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

set -e

# run commands relative to parent directory
cd $(dirname $0)/..

DEFAULT_TESTS=LMDBStoreTest.*:LMDBEnvironmentTest.*
TEST=${1:-$DEFAULT_TESTS}
PRESET=${PRESET:-clang16}

cmake --build --preset $PRESET --target lmdblib_tests
./build/bin/lmdblib_tests --gtest_filter=$TEST
6 changes: 4 additions & 2 deletions barretenberg/cpp/src/CMakeLists.txt
@@ -58,7 +58,7 @@ if (ENABLE_PIC AND CMAKE_CXX_COMPILER_ID MATCHES "Clang")
message("Building with Position Independent Code")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
add_subdirectory(barretenberg/world_state_napi)
add_subdirectory(barretenberg/nodejs_module)
endif()

add_subdirectory(barretenberg/bb)
@@ -78,6 +78,7 @@ add_subdirectory(barretenberg/examples)
add_subdirectory(barretenberg/flavor)
add_subdirectory(barretenberg/goblin)
add_subdirectory(barretenberg/grumpkin_srs_gen)
add_subdirectory(barretenberg/lmdblib)
add_subdirectory(barretenberg/numeric)
add_subdirectory(barretenberg/plonk)
add_subdirectory(barretenberg/plonk_honk_shared)
@@ -176,8 +177,9 @@ if(NOT DISABLE_AZTEC_VM)
endif()

if(NOT WASM)
# enable merkle trees
# enable merkle trees and lmdb
list(APPEND BARRETENBERG_TARGET_OBJECTS $<TARGET_OBJECTS:crypto_merkle_tree_objects>)
list(APPEND BARRETENBERG_TARGET_OBJECTS $<TARGET_OBJECTS:lmdblib_objects>)
endif()

add_library(
@@ -3,7 +3,6 @@
#include "barretenberg/crypto/merkle_tree/hash.hpp"
#include "barretenberg/crypto/merkle_tree/indexed_tree/content_addressed_indexed_tree.hpp"
#include "barretenberg/crypto/merkle_tree/indexed_tree/indexed_leaf.hpp"
#include "barretenberg/crypto/merkle_tree/lmdb_store/callbacks.hpp"
#include "barretenberg/crypto/merkle_tree/lmdb_store/lmdb_tree_store.hpp"
#include "barretenberg/crypto/merkle_tree/node_store/cached_content_addressed_tree_store.hpp"
#include "barretenberg/crypto/merkle_tree/response.hpp"
@@ -1,16 +1,11 @@
# merkle tree is agnostic to hash function
barretenberg_module(
crypto_merkle_tree
lmdb
lmdblib
)

if (NOT FUZZING)
# but the tests use pedersen and poseidon
target_link_libraries(crypto_merkle_tree_tests PRIVATE stdlib_pedersen_hash stdlib_poseidon2)
add_dependencies(crypto_merkle_tree_tests lmdb_repo)
add_dependencies(crypto_merkle_tree_test_objects lmdb_repo)
endif()

add_dependencies(crypto_merkle_tree lmdb_repo)
add_dependencies(crypto_merkle_tree_objects lmdb_repo)

@@ -7,14 +7,14 @@
#include "barretenberg/common/thread_pool.hpp"
#include "barretenberg/crypto/merkle_tree/hash.hpp"
#include "barretenberg/crypto/merkle_tree/hash_path.hpp"
#include "barretenberg/crypto/merkle_tree/lmdb_store/lmdb_environment.hpp"
#include "barretenberg/crypto/merkle_tree/lmdb_store/lmdb_tree_store.hpp"
#include "barretenberg/crypto/merkle_tree/node_store/array_store.hpp"
#include "barretenberg/crypto/merkle_tree/node_store/cached_content_addressed_tree_store.hpp"
#include "barretenberg/crypto/merkle_tree/response.hpp"
#include "barretenberg/crypto/merkle_tree/signal.hpp"
#include "barretenberg/crypto/merkle_tree/types.hpp"
#include "barretenberg/ecc/curves/bn254/fr.hpp"
#include "barretenberg/lmdblib/lmdb_environment.hpp"
#include "barretenberg/relations/relation_parameters.hpp"
#include <algorithm>
#include <array>
@@ -29,6 +29,7 @@

using namespace bb;
using namespace bb::crypto::merkle_tree;
using namespace bb::lmdblib;

using Store = ContentAddressedCachedTreeStore<bb::fr>;
using TreeType = ContentAddressedAppendOnlyTree<Store, Poseidon2HashPolicy>;
@@ -29,7 +29,7 @@ static std::vector<fr> VALUES = create_values();
inline std::string random_string()
{
std::stringstream ss;
ss << random_engine.get_random_uint256();
ss << random_engine.get_random_uint32();
return ss.str();
}


This file was deleted.

This file was deleted.


1 comment on commit 7e3a38e

@AztecBot
⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'C++ Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.05.

| Benchmark suite | Current: 7e3a38e | Previous: d120cbe | Ratio |
|---|---|---|---|
| `wasmClientIVCBench/Full/6` | `82651.25711600001` ms/iter | `75002.674347` ms/iter | `1.10` |
| `commit(t)` | `3757146058` ns/iter | `3232912435` ns/iter | `1.16` |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @ludamad @codygunton
