track_and_verify_wals_in_manifest causing crashes/coredumps #13235

Open
zaidoon1 opened this issue Dec 19, 2024 · 0 comments

zaidoon1 commented Dec 19, 2024

rocksdb version: 9.9.3

setup:

Service A runs rocksdb in read/write mode and has track_and_verify_wals_in_manifest = true.

Service B is a cronjob that, when started, sends a request to Service A to create a checkpoint and then opens that checkpoint in read-only mode.

Issue:

Service B crashes/coredumps while opening the checkpoint DB in read-only mode.

back trace:

(gdb) bt
#0  std::_Hashtable<unsigned int, std::pair<unsigned int const, unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::size (this=0x0) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/hashtable.h:649
#1  std::_Hashtable<unsigned int, std::pair<unsigned int const, unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find (this=0x0, __k=<optimized out>)
    at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/hashtable.h:1645
#2  std::unordered_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, unsigned int> > >::find (this=0x0, __x=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/unordered_map.h:869
#3  rocksdb::DBImpl::RecoveryContext::UpdateVersionEdits (this=0x0, cfd=0x7fd6956d8c00, edit=...) at rocksdb/db/db_impl/db_impl.h:1437
#4  0x000055f793daf4e8 in rocksdb::DBImpl::Recover (this=0x7fd6956b7000, column_families=..., read_only=true, error_if_wal_file_exists=false, 
    error_if_data_exists_in_wals=false, is_retry=<optimized out>, recovered_seq=0x0, recovery_ctx=0x0, can_retry=0x0) at rocksdb/db/db_impl/db_impl_open.cc:774
#5  0x000055f793ba246c in rocksdb::DBImplReadOnly::OpenForReadOnlyWithoutCheck (db_options=..., dbname="/some/path/checkpoint", 
    column_families=std::vector of length 4, capacity 4 = {...}, handles=<optimized out>, dbptr=<optimized out>, error_if_wal_file_exists=<optimized out>)
    at rocksdb/db/db_impl/db_impl_readonly.cc:338
#6  rocksdb::DB::OpenForReadOnly (db_options=..., dbname="/some/path/checkpoint", column_families=std::vector of length 4, capacity 4 = {...}, 
    handles=<optimized out>, dbptr=<optimized out>, error_if_wal_file_exists=<optimized out>) at rocksdb/db/db_impl/db_impl_readonly.cc:322
#7  rocksdb_open_for_read_only_column_families (db_options=0x7fd695663000, name=<optimized out>, num_column_families=<optimized out>, 
    column_family_names=<optimized out>, column_family_options=0x7fd6956005c0, column_family_handles=0x7fd6956005a0, error_if_wal_file_exists=0 '\000', 
    errptr=0x7fd695dfd4e0) at rocksdb/db/c.cc:1038
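
Reading the trace: frame #4 shows `recovery_ctx=0x0` being passed to `DBImpl::Recover` with `read_only=true`, and frame #3 shows `RecoveryContext::UpdateVersionEdits` running with `this=0x0`, so the crash looks like a null `RecoveryContext` being dereferenced on the read-only open path. Below is a minimal, self-contained sketch of that failure shape. Everything in it (`MockRecoveryContext`, `MockStatus`, `RecordWalEdit`) is a hypothetical stand-in written for illustration, not RocksDB code, and the null guard is only one possible way to avoid the dereference, not necessarily how it should be fixed upstream.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for rocksdb::DBImpl::RecoveryContext (NOT the real class).
// It keeps a column-family-id -> index map of the same type seen in frames #0-#2.
struct MockRecoveryContext {
  std::unordered_map<uint32_t, uint32_t> cf_id_to_index;
  std::vector<std::vector<std::string>> edits_per_cf;

  void UpdateVersionEdits(uint32_t cf_id, const std::string& edit) {
    // If `this` were null, this find() is where the process would fault,
    // matching unordered_map::find(this=0x0) in the backtrace.
    auto it = cf_id_to_index.find(cf_id);
    if (it == cf_id_to_index.end()) {
      it = cf_id_to_index
               .emplace(cf_id, static_cast<uint32_t>(edits_per_cf.size()))
               .first;
      edits_per_cf.emplace_back();
    }
    assert(it->second < edits_per_cf.size());
    edits_per_cf[it->second].push_back(edit);
  }
};

// Hypothetical status type for this sketch only.
enum class MockStatus { kOk, kSkippedNoContext };

// Models a caller that may receive a null context, as the read-only open
// path does in frame #4 (recovery_ctx=0x0). The guard is an assumption.
MockStatus RecordWalEdit(MockRecoveryContext* recovery_ctx, uint32_t cf_id,
                         const std::string& edit) {
  if (recovery_ctx == nullptr) {
    // Without this check the call below would dereference a null pointer.
    return MockStatus::kSkippedNoContext;
  }
  recovery_ctx->UpdateVersionEdits(cf_id, edit);
  return MockStatus::kOk;
}
```

With a valid context the edit is recorded; with `nullptr`, as in the read-only open here, the sketch returns `kSkippedNoContext` instead of crashing.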

Some notes:

  1. Service A writes keys to one column family with sync=true, while writes to the other column families don't use the sync option. I'm not sure whether that matters.