Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greatly optimize type search #100

Merged
merged 1 commit into from
Jan 24, 2025

Conversation

PgBiel
Copy link
Contributor

@PgBiel PgBiel commented Jan 24, 2025

I had reported an apparently glaring performance issue with the LSP here: #99 (comment)

It turns out this was due to an inefficiency while searching for types in module children: using range on maps returns a copy of its values, and it seems that this in combination with .Modules() caused entire modules to be cloned somehow (or something of the sort). I switched to ModuleIds(). In addition, I avoided cloning module children by just caching the names of module children. Thus, the issue seems to be fixed: CPU usage has gone down by 5-6x when triggering completions, and is now "normal", even in a project importing a large library such as raylib.c3l.

This could be made even more efficient by keeping a global map of type names to module pointers in the symbol table, but this should do for now.

Please consider giving it a quick spin locally first to confirm it's all right :)

How I got here

To debug this, I added a CPU profiler to the LSP with the following diff, thanks to some sources found in a Google search:12

diff --git a/server/cmd/lsp/main.go b/server/cmd/lsp/main.go
index 87a1168774..2a926e99d6 100644
--- a/server/cmd/lsp/main.go
+++ b/server/cmd/lsp/main.go
@@ -1,10 +1,14 @@
 package main
 
+import _ "net/http/pprof" // CPU PROFILER
+
 import (
        "fmt"
        "log"
        "time"
 
+       "net/http" // CPU PROFILER
+
        "github.com/getsentry/sentry-go"
        "github.com/pherrymason/c3-lsp/internal/lsp/server"
 )
@@ -14,6 +18,12 @@
 const appName = "C3-LSP"
 
 func main() {
+       // ########### EXPOSE CPU PROFILER ###########
+       go func() {
+               log.Println(http.ListenAndServe("localhost:6060", nil))
+       }()
+        // ###########################################
+
        options, showHelp, showVersion := cmdLineArguments()
        commitHash := buildInfo()
        if showHelp {

And then, while the LSP was running, I executed, in my terminal,

curl -o analyze1.prof localhost:6060/debug/pprof/profile?seconds=20

and then proceeded to perform the operations which caused severe CPU usage and lag (triggering completions).

The saved file would then contain the information necessary to generate a flame graph of expensive function calls in the SpeedScope online app3, which I found in a blog post4:

image

After some research, I found a StackOverflow post5 which indicated that a repeated call to runtime.duffcopy indicates that a value is being copied many times, pointing to the usage of range in for loops. I proceeded to try to fiddle with the outermost for loop in the lowest function in the stack, but turns out that wasn't enough, leading me to create another flamegraph which pointed to the line iterating over Modules() as the culprit, which eventually led me to replacing it with ModuleIds() with a simple Get() afterwards.

I also created ChildrenNames() as an additional optimization to avoid cloning module children unnecessarily. While Indexable types are usually pointers (and I forced BaseIndexable to be a pointer for its indexable implementation), some types, such as Type itself, don't appear to be.

After all this, I got the desired result of a fast and lean LSP. 🚀

Footnotes

  1. "Profiling Go programs with pprof", by Julia Evans: https://jvns.ca/blog/2017/09/24/profiling-go-with-pprof/

  2. Official documentation for net/http/pprof, used to spin a webserver for generation of CPU profiles while a program is running: https://golang.org/pkg/net/http/pprof/

  3. SpeedScope visualizer: https://www.speedscope.app/

  4. "FlameGraphs for Code Optimization with Golang and SpeedScope", by Sathish Vj: https://sathishvj.medium.com/flamegraphs-for-code-optimization-with-golang-and-speedscope-80c20725fdd2

  5. "runtime.duffcopy is called a lot" - Stack Overflow: https://stackoverflow.com/questions/45786687/runtime-duffcopy-is-called-a-lot

@pherrymason
Copy link
Owner

👏👏👏 amazing finding!
Thank you very much for this analysis, great job at finding this, search code is not easy.
I will review this on the next version branch. Even though the new search code is much more simple, I will check if same mistakes are being done.

@pherrymason pherrymason merged commit 17127e9 into pherrymason:main Jan 24, 2025
2 checks passed
@PgBiel PgBiel deleted the optimize-type-search branch January 24, 2025 17:49
@pherrymason pherrymason mentioned this pull request Jan 28, 2025
32 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants