Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve namespace lookup using .debug_names parent chain #110062

Merged
merged 2 commits into from
Oct 30, 2024

Conversation

jeffreytan81
Copy link
Contributor

Summary

This PR is a continuation of #108907 by using .debug_names parent chain faster lookup for namespaces.

Implementation

Similar to #108907. This PR adds a new API: GetNamespacesWithParents in DWARFIndex base class. The API performs the same function as GetNamespaces() with additional filtering using parents CompilerDeclContext. A default implementation is given in DWARFIndex class which parses debug info and performs the matching. In the DebugNameDWARFIndex override, parents CompilerDeclContext is cross checked with parent chain in .debug_names for much faster filtering before fallback to base implementation for final filtering.

Performance Results

For the same benchmark used in #108907, this PR improves: 48s => 28s

@llvmbot
Copy link
Member

llvmbot commented Sep 26, 2024

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

Summary

This PR is a continuation of #108907 by using .debug_names parent chain faster lookup for namespaces.

Implementation

Similar to #108907. This PR adds a new API: GetNamespacesWithParents in DWARFIndex base class. The API performs the same function as GetNamespaces() with additional filtering using parents CompilerDeclContext. A default implementation is given in DWARFIndex class which parses debug info and performs the matching. In the DebugNameDWARFIndex override, parents CompilerDeclContext is cross checked with parent chain in .debug_names for much faster filtering before fallback to base implementation for final filtering.

Performance Results

For the same benchmark used in #108907, this PR improves: 48s => 28s


Full diff: https://github.com/llvm/llvm-project/pull/110062.diff

5 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp (+41)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h (+23)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp (+164-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h (+36)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+4-3)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
index eafddbad03f57b..c6c5b8e8af1450 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.cpp
@@ -126,3 +126,44 @@ bool DWARFIndex::GetFullyQualifiedTypeImpl(
     return callback(die);
   return true;
 }
+
+void DWARFIndex::GetTypesWithQuery(
+    TypeQuery &query, llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetTypes(query.GetTypeBasename(), [&](DWARFDIE die) {
+    return ProcessTypeDieMatchQuery(query, die, callback);
+  });
+}
+
+bool DWARFIndex::ProcessTypeDieMatchQuery(
+    TypeQuery &query, DWARFDIE die,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  // Nothing to match from query
+  if (query.GetContextRef().size() <= 1)
+    return callback(die);
+
+  std::vector<lldb_private::CompilerContext> die_context;
+  if (query.GetModuleSearch())
+    die_context = die.GetDeclContext();
+  else
+    die_context = die.GetTypeLookupContext();
+
+  if (!query.ContextMatches(die_context))
+    return true;
+  return callback(die);
+}
+
+void DWARFIndex::GetNamespacesWithParents(
+    ConstString name, const CompilerDeclContext &parent_decl_ctx,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  GetNamespaces(name, [&](DWARFDIE die) {
+    return ProcessNamespaceDieMatchParents(parent_decl_ctx, die, callback);
+  });
+}
+
+bool DWARFIndex::ProcessNamespaceDieMatchParents(
+    const CompilerDeclContext &parent_decl_ctx, DWARFDIE die,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  if (!SymbolFileDWARF::DIEInDeclContext(parent_decl_ctx, die))
+    return true;
+  return callback(die);
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
index cb3ae8a06d7885..91f4a8d4713543 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFIndex.h
@@ -64,6 +64,21 @@ class DWARFIndex {
   virtual void
   GetNamespaces(ConstString name,
                 llvm::function_ref<bool(DWARFDIE die)> callback) = 0;
+  /// Get type DIEs meeting requires of \a query.
+  /// in its decl parent chain as subset.  A base implementation is provided,
+  /// Specializations should override this if they are able to provide a faster
+  /// implementation.
+  virtual void
+  GetTypesWithQuery(TypeQuery &query,
+                    llvm::function_ref<bool(DWARFDIE die)> callback);
+  /// Get namespace DIEs whose base name match \param name with \param
+  /// parent_decl_ctx in its decl parent chain.  A base implementation
+  /// is provided. Specializations should override this if they are able to
+  /// provide a faster implementation.
+  virtual void
+  GetNamespacesWithParents(ConstString name,
+                           const CompilerDeclContext &parent_decl_ctx,
+                           llvm::function_ref<bool(DWARFDIE die)> callback);
   virtual void
   GetFunctions(const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
                const CompilerDeclContext &parent_decl_ctx,
@@ -115,6 +130,14 @@ class DWARFIndex {
   bool
   GetFullyQualifiedTypeImpl(const DWARFDeclContext &context, DWARFDIE die,
                             llvm::function_ref<bool(DWARFDIE die)> callback);
+
+  /// Check if the type \a die can meet the requirements of \a query.
+  bool
+  ProcessTypeDieMatchQuery(TypeQuery &query, DWARFDIE die,
+                           llvm::function_ref<bool(DWARFDIE die)> callback);
+  bool ProcessNamespaceDieMatchParents(
+      const CompilerDeclContext &parent_decl_ctx, DWARFDIE die,
+      llvm::function_ref<bool(DWARFDIE die)> callback);
 };
 } // namespace dwarf
 } // namespace lldb_private::plugin
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 32d8a92305aafa..d22b5dbe7dee42 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -301,7 +301,8 @@ using Entry = llvm::DWARFDebugNames::Entry;
 /// If any parent does not have an `IDX_parent`, or the Entry data is corrupted,
 /// nullopt is returned.
 std::optional<llvm::SmallVector<Entry, 4>>
-getParentChain(Entry entry, uint32_t max_parents) {
+getParentChain(Entry entry,
+               uint32_t max_parents = std::numeric_limits<uint32_t>::max()) {
   llvm::SmallVector<Entry, 4> parent_entries;
 
   do {
@@ -374,6 +375,21 @@ void DebugNamesDWARFIndex::GetFullyQualifiedType(
   m_fallback.GetFullyQualifiedType(context, callback);
 }
 
+bool DebugNamesDWARFIndex::SameAsEntryContext(
+    const CompilerContext &query_context,
+    const DebugNames::Entry &entry) const {
+  // TODO: check dwarf tag matches.
+  // Peek at the AT_name of `entry` and test equality to `name`.
+  auto maybe_dieoffset = entry.getDIEUnitOffset();
+  if (!maybe_dieoffset)
+    return false;
+  DWARFUnit *unit = GetNonSkeletonUnit(entry);
+  if (!unit)
+    return false;
+  return query_context.name ==
+         unit->PeekDIEName(unit->GetOffset() + *maybe_dieoffset);
+}
+
 bool DebugNamesDWARFIndex::SameParentChain(
     llvm::ArrayRef<llvm::StringRef> parent_names,
     llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
@@ -402,6 +418,45 @@ bool DebugNamesDWARFIndex::SameParentChain(
   return true;
 }
 
+bool DebugNamesDWARFIndex::SameParentChain(
+    llvm::ArrayRef<CompilerContext> parent_contexts,
+    llvm::ArrayRef<DebugNames::Entry> parent_entries) const {
+  if (parent_entries.size() != parent_contexts.size())
+    return false;
+
+  // If the AT_name of any parent fails to match the expected name, we don't
+  // have a match.
+  for (auto [parent_context, parent_entry] :
+       llvm::zip_equal(parent_contexts, parent_entries))
+    if (!SameAsEntryContext(parent_context, parent_entry))
+      return false;
+  return true;
+}
+
+bool DebugNamesDWARFIndex::WithinParentChain(
+    llvm::ArrayRef<CompilerContext> query_contexts,
+    llvm::ArrayRef<DebugNames::Entry> parent_chain) const {
+  if (query_contexts.size() == parent_chain.size())
+    return SameParentChain(query_contexts, parent_chain);
+
+  // If parent chain does not have enough entries, we can't possibly have a
+  // match.
+  while (!query_contexts.empty() &&
+         query_contexts.size() <= parent_chain.size()) {
+    if (SameAsEntryContext(query_contexts.front(), parent_chain.front())) {
+      query_contexts = query_contexts.drop_front();
+      parent_chain = parent_chain.drop_front();
+    } else {
+      // Name does not match, try next parent_chain entry if the current entry
+      // is namespace because the current one can be an inline namespace.
+      if (parent_chain.front().tag() != DW_TAG_namespace)
+        return false;
+      parent_chain = parent_chain.drop_front();
+    }
+  }
+  return query_contexts.empty();
+}
+
 void DebugNamesDWARFIndex::GetTypes(
     ConstString name, llvm::function_ref<bool(DWARFDIE die)> callback) {
   for (const DebugNames::Entry &entry :
@@ -444,6 +499,114 @@ void DebugNamesDWARFIndex::GetNamespaces(
   m_fallback.GetNamespaces(name, callback);
 }
 
+llvm::SmallVector<CompilerContext>
+DebugNamesDWARFIndex::GetTypeQueryParentContexts(TypeQuery &query) {
+  std::vector<lldb_private::CompilerContext> &query_decl_context =
+      query.GetContextRef();
+  llvm::SmallVector<CompilerContext> parent_contexts;
+  if (!query_decl_context.empty()) {
+    // Skip the last entry, it is the type we are looking for.
+    // Reverse the query decl context to match parent chain.
+    llvm::ArrayRef<CompilerContext> parent_contexts_ref(
+        query_decl_context.data(), query_decl_context.size() - 1);
+    for (const CompilerContext &ctx : llvm::reverse(parent_contexts_ref)) {
+      // Skip any context without name because .debug_names might not encode
+      // them. (e.g. annonymous namespace)
+      if ((ctx.kind & CompilerContextKind::AnyType) !=
+              CompilerContextKind::Invalid &&
+          !ctx.name.IsEmpty())
+        parent_contexts.push_back(ctx);
+    }
+  }
+  return parent_contexts;
+}
+
+void DebugNamesDWARFIndex::GetTypesWithQuery(
+    TypeQuery &query, llvm::function_ref<bool(DWARFDIE die)> callback) {
+  ConstString name = query.GetTypeBasename();
+  std::vector<lldb_private::CompilerContext> query_context =
+      query.GetContextRef();
+  if (query_context.size() <= 1)
+    return GetTypes(name, callback);
+
+  llvm::SmallVector<CompilerContext> parent_contexts =
+      GetTypeQueryParentContexts(query);
+  // For each entry, grab its parent chain and check if we have a match.
+  for (const DebugNames::Entry &entry : m_debug_names_up->equal_range(name)) {
+    if (!isType(entry.tag()))
+      continue;
+
+    // If we get a NULL foreign_tu back, the entry doesn't match the type unit
+    // in the .dwp file, or we were not able to load the .dwo file or the DWO ID
+    // didn't match.
+    std::optional<DWARFTypeUnit *> foreign_tu = GetForeignTypeUnit(entry);
+    if (foreign_tu && foreign_tu.value() == nullptr)
+      continue;
+
+    std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+        getParentChain(entry);
+    if (!parent_chain) {
+      // Fallback: use the base class implementation.
+      if (!ProcessEntry(entry, [&](DWARFDIE die) {
+            return ProcessTypeDieMatchQuery(query, die, callback);
+          }))
+        return;
+      continue;
+    }
+
+    if (WithinParentChain(parent_contexts, *parent_chain) &&
+        !ProcessEntry(entry, [&](DWARFDIE die) {
+          // After .debug_names filtering still sending to base class for
+          // further filtering before calling the callback.
+          return ProcessTypeDieMatchQuery(query, die, callback);
+        }))
+      return;
+  }
+  m_fallback.GetTypesWithQuery(query, callback);
+}
+
+void DebugNamesDWARFIndex::GetNamespacesWithParents(
+    ConstString name, const CompilerDeclContext &parent_decl_ctx,
+    llvm::function_ref<bool(DWARFDIE die)> callback) {
+  std::vector<lldb_private::CompilerContext> parent_contexts =
+      parent_decl_ctx.GetCompilerContext();
+  if (parent_contexts.empty())
+    return GetNamespaces(name, callback);
+
+  llvm::SmallVector<CompilerContext> parent_named_contexts;
+  std::copy_if(parent_contexts.rbegin(), parent_contexts.rend(),
+               std::back_inserter(parent_named_contexts),
+               [](const CompilerContext &ctx) { return !ctx.name.IsEmpty(); });
+  for (const DebugNames::Entry &entry :
+       m_debug_names_up->equal_range(name.GetStringRef())) {
+    lldb_private::dwarf::Tag entry_tag = entry.tag();
+    if (entry_tag == DW_TAG_namespace ||
+        entry_tag == DW_TAG_imported_declaration) {
+      std::optional<llvm::SmallVector<Entry, 4>> parent_chain =
+          getParentChain(entry);
+      if (!parent_chain) {
+        // Fallback: use the base class implementation.
+        if (!ProcessEntry(entry, [&](DWARFDIE die) {
+              return ProcessNamespaceDieMatchParents(parent_decl_ctx, die,
+                                                     callback);
+            }))
+          return;
+        continue;
+      }
+
+      if (WithinParentChain(parent_named_contexts, *parent_chain) &&
+          !ProcessEntry(entry, [&](DWARFDIE die) {
+            // After .debug_names filtering still sending to base class for
+            // further filtering before calling the callback.
+            return ProcessNamespaceDieMatchParents(parent_decl_ctx, die,
+                                                   callback);
+          }))
+        return;
+    }
+  }
+  m_fallback.GetNamespacesWithParents(name, parent_decl_ctx, callback);
+}
+
 void DebugNamesDWARFIndex::GetFunctions(
     const Module::LookupInfo &lookup_info, SymbolFileDWARF &dwarf,
     const CompilerDeclContext &parent_decl_ctx,
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
index cb15c1d4f994b3..d2f3186fd9118a 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.h
@@ -52,6 +52,12 @@ class DebugNamesDWARFIndex : public DWARFIndex {
                 llvm::function_ref<bool(DWARFDIE die)> callback) override;
   void GetNamespaces(ConstString name,
                      llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void
+  GetTypesWithQuery(TypeQuery &query,
+                    llvm::function_ref<bool(DWARFDIE die)> callback) override;
+  void GetNamespacesWithParents(
+      ConstString name, const CompilerDeclContext &parent_decl_ctx,
+      llvm::function_ref<bool(DWARFDIE die)> callback) override;
   void GetFunctions(const Module::LookupInfo &lookup_info,
                     SymbolFileDWARF &dwarf,
                     const CompilerDeclContext &parent_decl_ctx,
@@ -118,6 +124,36 @@ class DebugNamesDWARFIndex : public DWARFIndex {
   bool SameParentChain(llvm::ArrayRef<llvm::StringRef> parent_names,
                        llvm::ArrayRef<DebugNames::Entry> parent_entries) const;
 
+  bool SameParentChain(llvm::ArrayRef<CompilerContext> parent_names,
+                       llvm::ArrayRef<DebugNames::Entry> parent_entries) const;
+
+  /// Returns true if \a parent_contexts entries are within \a parent_chain.
+  /// This is diffferent from SameParentChain() which checks for exact match.
+  /// This function is required because \a parent_chain can contain inline
+  /// namespace entries which may not be specified in \a parent_contexts by
+  /// client.
+  ///
+  /// \param[in] parent_contexts
+  ///   The list of parent contexts to check for.
+  ///
+  /// \param[in] parent_chain
+  ///   The fully qualified parent chain entries from .debug_names index table
+  ///   to check against.
+  ///
+  /// \returns
+  ///   True if all \a parent_contexts entries are can be sequentially found
+  ///   inside
+  ///   \a parent_chain, otherwise False.
+  bool WithinParentChain(llvm::ArrayRef<CompilerContext> parent_contexts,
+                         llvm::ArrayRef<DebugNames::Entry> parent_chain) const;
+
+  /// Returns true if .debug_names pool entry \p entry matches \p query_context.
+  bool SameAsEntryContext(const CompilerContext &query_context,
+                          const DebugNames::Entry &entry) const;
+
+  llvm::SmallVector<CompilerContext>
+  GetTypeQueryParentContexts(TypeQuery &query);
+
   static void MaybeLogLookupError(llvm::Error error,
                                   const DebugNames::NameIndex &ni,
                                   llvm::StringRef name);
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index f721ca00fd3559..f1f2b8663c43a1 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2748,8 +2748,9 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
 
+  TypeQuery query_full(query);
   bool have_index_match = false;
-  m_index->GetTypes(type_basename, [&](DWARFDIE die) {
+  m_index->GetTypesWithQuery(query_full, [&](DWARFDIE die) {
     // Check the language, but only if we have a language filter.
     if (query.HasLanguage()) {
       if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
@@ -2813,7 +2814,7 @@ void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
       auto type_basename_simple = query_simple.GetTypeBasename();
       // Copy our match's context and update the basename we are looking for
       // so we can use this only to compare the context correctly.
-      m_index->GetTypes(type_basename_simple, [&](DWARFDIE die) {
+      m_index->GetTypesWithQuery(query_simple, [&](DWARFDIE die) {
         // Check the language, but only if we have a language filter.
         if (query.HasLanguage()) {
           if (!query.LanguageMatches(GetLanguageFamily(*die.GetCU())))
@@ -2898,7 +2899,7 @@ SymbolFileDWARF::FindNamespace(ConstString name,
   if (!DeclContextMatchesThisSymbolFile(parent_decl_ctx))
     return namespace_decl_ctx;
 
-  m_index->GetNamespaces(name, [&](DWARFDIE die) {
+  m_index->GetNamespacesWithParents(name, parent_decl_ctx, [&](DWARFDIE die) {
     if (!DIEInDeclContext(parent_decl_ctx, die, only_root_namespaces))
       return true; // The containing decl contexts don't match
 

@jeffreytan81
Copy link
Contributor Author

This is a stack PR from #108907, so please review the new namespaces lookup changes and ignore the changes in #108907. Let me know if there is a way to only show the delta changes (not merged view).

@labath
Copy link
Collaborator

labath commented Sep 27, 2024

This is a stack PR from #108907, so please review the new namespaces lookup changes and ignore the changes in #108907. Let me know if there is a way to only show the delta changes (not merged view).

That should be possible if you arrange the PR to only contain two commits, where the first one is the dependent PR, and the second one is the new changes.

I know that means you'll need to rebase (instead of merging) the PR when incorporating changes (which isn't exactly encouraged on github), but I think that's the lesser of two evils. When both of the PRs are touching the same set of files, it's very hard to separate the two in your head.

@jeffreytan81 jeffreytan81 force-pushed the improve_namespace_lookup_review branch from 9a020d8 to 353308a Compare October 9, 2024 23:43
@jeffreytan81 jeffreytan81 requested a review from Jlalond October 9, 2024 23:43
@jeffreytan81 jeffreytan81 force-pushed the improve_namespace_lookup_review branch from 353308a to c5bbc5f Compare October 9, 2024 23:46
@jeffreytan81 jeffreytan81 changed the title Improve namespace lookup review Improve namespace lookup using .debug_names parent chain Oct 9, 2024
Comment on lines 573 to 574
if (parent_contexts.empty())
return GetNamespaces(name, callback);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that right? AIUI, GetNamespaces will return all namespaces with the given name, regardless of their nesting level, but parent_contexts.empty() basically means "I'm looking for a top-level namespace".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, do we expect if "parent_decl_ctx" is empty that we return every namespace, or just top level ones? Depends on how this API is being used.

Comment on lines +576 to +579
llvm::SmallVector<CompilerContext> parent_named_contexts;
std::copy_if(parent_contexts.rbegin(), parent_contexts.rend(),
std::back_inserter(parent_named_contexts),
[](const CompilerContext &ctx) { return !ctx.name.IsEmpty(); });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to be really fancy, you could try something like this:

Suggested change
llvm::SmallVector<CompilerContext> parent_named_contexts;
std::copy_if(parent_contexts.rbegin(), parent_contexts.rend(),
std::back_inserter(parent_named_contexts),
[](const CompilerContext &ctx) { return !ctx.name.IsEmpty(); });
llvm::SmallVector<CompilerContext> parent_named_contexts(llvm::make_filter_range(llvm::reverse(parent_contexts),
[](const CompilerContext &ctx) { return !ctx.name.IsEmpty(); }));

:P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing, good to know. Did not find any lldb code using this API, so will keep existing code.

Comment on lines 597 to 598
if (WithinParentChain(parent_named_contexts, *parent_chain) &&
!ProcessEntry(entry, [&](DWARFDIE die) {
Copy link
Collaborator

@labath labath Oct 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be better to unwrap this into two ifs. The way it is now, it took me a while to figure out that the return only happens when it should (i.e., only when the callback says it is done, and not because WithinParentChain did not find a match).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past, we've floated around the idea of swapping the boolean return from the callbacks with some enumeration saying: KeepSearching, StopSearching or something along those lines. This should also improve readability of code invoking the callbacks (and the callbacks themselves).

But I also agree with unwrapping the ifs

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a private enum:

enum class IterationAction {
  Continue = 0,
  Stop,
};

ConstString name, const CompilerDeclContext &parent_decl_ctx,
llvm::function_ref<bool(DWARFDIE die)> callback) {
GetNamespaces(name, [&](DWARFDIE die) {
return ProcessNamespaceDieMatchParents(parent_decl_ctx, die, callback);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is die here a DW_AT_namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@@ -71,6 +71,14 @@ class DWARFIndex {
virtual void
GetTypesWithQuery(TypeQuery &query,
llvm::function_ref<bool(DWARFDIE die)> callback);
/// Get namespace DIEs whose base name match \param name with \param
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is out of date now. Update to mention query

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no query in GetNamespacesWithParents API. Are you reading GetTypesWithQuery() API whose comment has mentioned query above.

Comment on lines 573 to 574
if (parent_contexts.empty())
return GetNamespaces(name, callback);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, do we expect if "parent_decl_ctx" is empty that we return every namespace, or just top level ones? Depends on how this API is being used.

Comment on lines 597 to 598
if (WithinParentChain(parent_named_contexts, *parent_chain) &&
!ProcessEntry(entry, [&](DWARFDIE die) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a private enum:

enum class IterationAction {
  Continue = 0,
  Stop,
};

@jeffreytan81 jeffreytan81 merged commit cb04d33 into llvm:main Oct 30, 2024
7 checks passed
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
## Summary
This PR is a continuation of
llvm#108907 by using `.debug_names`
parent chain faster lookup for namespaces.


## Implementation
Similar to llvm#108907. This PR
adds a new API: `GetNamespacesWithParents` in `DWARFIndex` base class.
The API performs the same function as `GetNamespaces()` with additional
filtering using parents `CompilerDeclContext`. A default implementation
is given in `DWARFIndex` class which parses debug info and performs the
matching. In the `DebugNameDWARFIndex` override, parents
`CompilerDeclContext` is cross checked with parent chain in
`.debug_names` for much faster filtering before fallback to base
implementation for final filtering.

## Performance Results
For the same benchmark used in
llvm#108907, this PR improves: 48s
=> 28s

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
dmpots pushed a commit to dmpots/llvm-project that referenced this pull request Mar 1, 2025
Summary:
This diff fixes T199918951 by changing lldb type and namespace lookup to use .debug_names parent chain.
It has incorporated latest feedback from Apple/Google. We need to land it internally because:
* Unblock Ads partners from dogfooding (currently, they keep on hitting this long delay which disrupts the dogfooding experience)
* Dogfooding to flush out any new issues

Note: I will keep update the internal patch the same way as the single thread stepping deadlock timeout PR so that we won't get any merge conflict.

It includes 4 commits squashed into one diff:
* This reverts the internal patch from @WanYi to "Revert PeekDIEName related patch"
    * Note: `PeekDIEName()` has been removed my the new changes so we won't suffer the original crash anymore.
* Improve type lookup review: llvm#108907
* Improve namespace lookup review: llvm#110062
* Use a better algorithm for SameAsEntryContext: Pending

=================== Original PR Summary ===================
A customer has reported that expanding the "this" pointer for a non-trivial class method context takes more than **70–90 seconds**. This issue occurs on a Linux target with the `.debug_names` index table enabled and using split DWARF (DWO) files to avoid debug info overflow.

Profiling shows that the majority of the bottleneck occurs in the `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` functions, which search for types and namespaces in the `.debug_names` index table by base name, then parse/load the underlying DWO files—an operation that can be very slow.

The whole profile trace is pretty big, but here is a part of the hot path:
```
--61.01%-- clang::ASTNodeImporter::VisitRecordDecl(clang::RecordDecl*)
  --61.00%-- clang::ASTNodeImporter::ImportDeclParts(clang::NamedDecl*, clang::DeclContext*&, clang::DeclContext*&, clang::DeclarationName&, clang::NamedDecl*&, clang::SourceLocation&)
    clang::ASTImporter::ImportContext(clang::DeclContext*)
    clang::ASTImporter::Import(clang::Decl*)
    lldb_private::ClangASTImporter::ASTImporterDelegate::Imported(clang::Decl*, clang::Decl*)
    lldb_private::ClangASTImporter::BuildNamespaceMap(clang::NamespaceDecl const*)
    lldb_private::ClangASTSource::CompleteNamespaceMap(
        std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>,
        std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&,
        lldb_private::ConstString,
        std::shared_ptr<std::vector<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>,
        std::allocator<std::pair<std::shared_ptr<lldb_private::Module>, lldb_private::CompilerDeclContext>>> >&) const
    )
    lldb_private::plugin::dwarf::SymbolFileDWARF::FindNamespace(lldb_private::ConstString, lldb_private::CompilerDeclContext const&, bool)
    lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetNamespaces(lldb_private::ConstString, llvm::function_ref<bool (lldb_private::plugin::dwarf::DWARFDIE)>)

      --60.29%-- lldb_private::plugin::dwarf::DebugNamesDWARFIndex::GetDIE(llvm::DWARFDebugNames::Entry const&) const
        --46.11%-- lldb_private::plugin::dwarf::DWARFUnit::GetDIE(unsigned long)
          --46.11%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsIfNeeded()
            --45.77%-- lldb_private::plugin::dwarf::DWARFUnit::ExtractDIEsRWLocked()
              --23.22%-- lldb_private::plugin::dwarf::DWARFDebugInfoEntry::Extract(lldb_private::DWARFDataExtractor const&,
                           lldb_private::plugin::dwarf::DWARFUnit const&, unsigned long*)
                --10.04%-- lldb_private::plugin::dwarf::DWARFFormValue::SkipValue(llvm::dwarf::Form,
                             lldb_private::DWARFDataExtractor const&, unsigned long*,
                             lldb_private::plugin::dwarf::DWARFUnit const*)
```

This PR improves `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()` by utilizing the newly added parent chain `DW_IDX_parent` in `.debug_names`. The proposal was originally discussed in [this RFC](https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151).

This PR also introduces a new algorithm in `DebugNamesDWARFIndex::SameAsEntryATName()` that significantly reduces the need to parse DWO files.

To leverage the parent chain for `SymbolFileDWARF::FindTypes()` and `SymbolFileDWARF::FindNamespace()`, this PR adds two new APIs: `GetTypesWithParents` and `GetNamespacesWithParents` in the `DWARFIndex` base class. These APIs perform the same function as `GetTypes`/`GetNamespaces`, with additional filtering if the `parent_names` parameter is not empty. Since this only introduces filtering, the callback mechanisms at all call sites remain unchanged. A default implementation in the `DWARFIndex` class parses the type context from each base name matching DIE, then filter by parent_names. In the `DebugNameDWARFIndex` override, the parent_names is cross checked with parent chain in .debug_names for for much faster filtering.

Unlike the `GetFullyQualifiedType` API, which fully consumes the `DW_IDX_parent` parent chain for exact matching, these new APIs perform partial subset matching for type/namespace queries. This is necessary to support queries involving anonymous or inline namespaces. For instance, a user might request `NS1::NS2::NS3::Foo`, while the index table's parent chain might contain `NS1::inline_NS2::NS3::Foo`, which would fail exact matching.

Another optimization is implemented in `DebugNamesDWARFIndex::SameAsEntryATName()`. The old implementation used `GetNonSkeletonUnit()`, which triggered parsing/loading of underlying DWO files—a very costly operation we aim to avoid. The new implementation minimizes the need to touch DWO files by employing two strategies:
1. Build a `<pool_entry_range => NameEntry>` mapping table (built lazily as needed), allowing us to check the name of the parent entry.
2. Search for the name the entry is trying to match from the current `NameIndex`, then check if the parent entry can be found from the name entry.

In my testing and profiling, strategy (1) did not improve wall-time, as the cost of building the mapping table was high. However, strategy (2) proved to be very fast.

=========================================================

Test Plan:
* `ninja check-lldb` are passing.
* It improves T199918951 from around 110s+ => 1.8s.

Reviewers: toyang, wanyi, jsamouh, jaihari, jalalonde, #lldb_team

Reviewed By: wanyi

Subscribers: wanyi, #lldb_team

Differential Revision: https://phabricator.intern.facebook.com/D63440813

Tasks: T199918951
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants