I had reported an apparently glaring performance issue with the LSP here: #99 (comment)
It turns out this was due to an inefficiency while searching for types in module children: using `range` on a map yields a copy of each value, and it seems that this, in combination with `.Modules()`, caused entire modules to be cloned somehow (or something of the sort). I switched to `ModuleIds()`. In addition, I avoided cloning module children by just caching their names. With that, the issue seems to be fixed: CPU usage has gone down 5-6x when triggering completions and is now "normal", even in a project importing a large library such as `raylib.c3l`.

This could be made even more efficient by keeping a global map of type names to module pointers in the symbol table, but this should do for now.
Please consider giving it a quick spin locally first to confirm it's all right :)
## How I got here
To debug this, I added a CPU profiler to the LSP with the following diff, thanks to some sources found in a Google search:[^1][^2]
And then, while the LSP was running, I executed in my terminal:

```shell
curl -o analyze1.prof "localhost:6060/debug/pprof/profile?seconds=20"
```

and then proceeded to perform the operations that caused severe CPU usage and lag (triggering completions).
The saved file would then contain the information necessary to generate a flame graph of expensive function calls in the SpeedScope online app[^3], which I found in a blog post[^4]:
After some research, I found a StackOverflow post[^5] indicating that repeated calls to `runtime.duffcopy` mean a value is being copied many times, pointing to the usage of `range` in for loops. I tried fiddling with the outermost for loop in the lowest function in the stack, but it turned out that wasn't enough, leading me to create another flame graph, which pointed to the line iterating over `Modules()` as the culprit. That eventually led me to replace it with `ModuleIds()` followed by a simple `Get()`.

I also created `ChildrenNames()` as an additional optimization to avoid cloning module children unnecessarily. While Indexable types are usually pointers (and I forced `BaseIndexable` to be a pointer for its indexable implementation), some types, such as `Type` itself, don't appear to be.

After all this, I got the desired result of a fast and lean LSP. 🚀
[^1]: "Profiling Go programs with pprof", by Julia Evans: https://jvns.ca/blog/2017/09/24/profiling-go-with-pprof/
[^2]: Official documentation for `net/http/pprof`, used to spin up a web server for generating CPU profiles while a program is running: https://golang.org/pkg/net/http/pprof/
[^3]: SpeedScope visualizer: https://www.speedscope.app/
[^4]: "FlameGraphs for Code Optimization with Golang and SpeedScope", by Sathish Vj: https://sathishvj.medium.com/flamegraphs-for-code-optimization-with-golang-and-speedscope-80c20725fdd2
[^5]: "runtime.duffcopy is called a lot" - Stack Overflow: https://stackoverflow.com/questions/45786687/runtime-duffcopy-is-called-a-lot