Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not working with NFS mounted projects #64

Closed
somebody32 opened this issue Jun 1, 2018 · 26 comments
Closed

Not working with NFS mounted projects #64

somebody32 opened this issue Jun 1, 2018 · 26 comments

Comments

@somebody32
Copy link

I'm using solargraph with VS code and found out that when I open a project from nfs-share go-to-definition/suggest is not working at all (or maybe working but is too slow to verify that), without any errors or warnings in devtools. The same project moved out from the share works perfectly. Any way to help you to identify the problem?

@castwide
Copy link
Owner

castwide commented Jun 2, 2018

Confirmed. The problem appears to be specific to network paths.

  • I opened a workspace on the local drive (e.g., C:\path\to\project) and confirmed that go-to-definition worked for one of the project's methods.
  • I opened a copy of the project on a network path (e.g., \\server\path\to\project). Trying go-to-definition on the same method did not work.
  • I opened the same project using a mapped network drive (e.g., S:\path\to\project). This time go-to-definition worked.

I suspect this is a glitch in how the server converts file URIs to paths.

Can you confirm whether it works for you with a mapped network drive instead of a network path?

@somebody32
Copy link
Author

@castwide hmm, I'm having issues exactly with mapped(mounted) version on osx, ie with:
mount -t nfs -o noatime,noowners,soft,vers=3,resvport server-path ./local-path

@castwide
Copy link
Owner

castwide commented Jul 1, 2018

The Workspace config was unnecessarily converting paths using File.realdirpath while reading workspace directories. Removing the conversion fixed the problem with mounted/network paths in my tests.

The update is on the master branch and will be included in gem version 0.23.0, which I expect to release tomorrow.

@somebody32
Copy link
Author

somebody32 commented Jul 2, 2018

Amazing, thanks a lot! Really eager to try new version! 🎉

@castwide
Copy link
Owner

castwide commented Jul 2, 2018

Published gem v0.23.0.

@somebody32
Copy link
Author

Just tried it: it works spontaneously: sometimes go to definition from vs code almost instantaneous but then for the same constants it could take minutes. Any way I could help you debug that?

@castwide
Copy link
Owner

castwide commented Jul 2, 2018

Hmm. I just tried another test and found an additional issue with Windows network paths that may or may not be related. I'll have to jump back on a MacBook later and give it another try.

I suppose there's also a possibility that it's related to network latency, but I would expect those problems to manifest during workspace initialization, not active usage.

If you see any error messages in the developer console or any other output that might help identify a root cause, please let me know.

@somebody32
Copy link
Author

I've spotted an interesting pattern: nfs activity never stops if I have vscode-solargraph extension enabled and looks like there is no relation between the number of calls and if I'm asking to go to definition or not. Is there any way to find out if indexing finished?

Btw, not seeing that behavior when just doing solargraph socket in the terminal

@castwide
Copy link
Owner

castwide commented Jul 3, 2018

In VS Code, there are two messages in the status bar at the bottom left: Starting the Solargraph language server... and Solargraph is ready. The ready message should indicate that the workspace has been mapped. After that, I would expect reduced NFS activity, since the maps are in memory and Solargraph should only need to access the files when it gets notified of a change on disk.

The socket server doesn't start reading the workspace until the client sends the initialize method.

How large is your project? Particularly number of Ruby files and required gems.

@somebody32
Copy link
Author

I'm seeing both of these messages but nfs activity is not going down after Solargraph is ready.

it is about 30+kloc and 300+ gems

@castwide
Copy link
Owner

castwide commented Jul 3, 2018

Are the gems installed on an NFS mount as well? Solargraph reads YARD documentation for gems, which might account for the latency. That could also account for some queries being instantaneous, as YARD results get cached in memory.

@somebody32
Copy link
Author

yep, the whole project is on nfs mount. I've left vscode untouched for almost 2 hours, the same level of nfs activity, where 95+% are getattr calls

Instantaneous (less than 3-4 sec) queries are really rare, and only for constants which were looked up already (ie hover on User -> go to definition -> wait for a couple of minutes -> see user.rb, go back and go to definition again). Sometimes the second go-to is fast but sometimes again a couple of minutes

@castwide
Copy link
Owner

castwide commented Jul 3, 2018

Thanks for the detail. I should have time to troubleshoot on a MacBook in the next day or two.

@somebody32
Copy link
Author

somebody32 commented Aug 17, 2018

Today I've tried the last release again, the problem is still there, but I've managed to run dtruss on the server pid to see what it is doing, and found out strange things:

it is full of calls like these:

getdirentries64(0xD, 0x7FF8D7886E00, 0x1000)		 = 216 0
lstat64("/Users/som32/code/remote_code/node_modules/parse-entities/readme.md\0", 0x70000DBAC1E8, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/node_modules/parse-entities/package.json\0", 0x70000DBAC1E8, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/node_modules/parse-entities/index.js\0", 0x70000DBAC1E8, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/node_modules/parse-entities/LICENSE\0", 0x70000DBAC1E8, 0x0)		 = 0 0
getdirentries64(0xD, 0x7FF8D7886E00, 0x1000)		 = 0 0
close_nocancel(0xD)		 = 0 0
lstat64("/Users/som32/code/remote_code/node_modules/postcss-discard-unused\0", 0x70000DBAC3B8, 0x0)		 = 0 0
open_nocancel("/Users/som32/code/remote_code/node_modules/postcss-discard-unused\0", 0x1100004, 0x41A2D70)		 = 13 0
fstatfs64(0xD, 0x70000DBAB8F8, 0x0)		 = 0 0
fgetattrlist(0xD, 0x70000DBAC320, 0x70000DBAC300)		 = 0 0
fgetattrlist(0xD, 0x70000DBAC300, 0x70000DBAC320)		 = -1 Err#22
getdirentries64(0xD, 0x7FF8D7886E00, 0x1000)		 = 296 0

Thousands of lines like that. The important part that it goes through node_modules and tmp folders (which are normally huge with a lot of subfolders)

Changing .solargraph.yml to that didn't help:

include:
  - "app/**/*.rb"
  - "lib/**/*.rb"
  - "lib_deps/**/*.rb"
  - "engines/**/*.rb"
exclude:
  - spec/**/*
  - test/**/*
  - node_modules/**/*
  - tmp/**/*
reporters: []
domains: []
required: []
plugins: []

still goes through all these directories for some reason

@somebody32
Copy link
Author

Sorry, was to fast. It did help, now hovers are fast and go to definition is less than 30sec (lstating only ruby files)

@castwide
Copy link
Owner

castwide commented Aug 17, 2018

Does the network traffic go down after you receive the Solargraph is ready message? The server should not require access to the gems' YARD docs after the initial load unless you change the requirements. It also shouldn't need to poll the filesystem; VS Code should notify it of changes to watched files.

I have another suspicion about this problem's root cause, but I'll have to run some tests to see if I can reproduce it. The server should completely ignore files in folders that are globally excluded, e.g., node_modules/**/*. If it's not, it may be mangling the paths to NFS mounts, e.g., it thinks that /Users/user/mount/path and //server/path are different directories.

@castwide
Copy link
Owner

@somebody32 Sorry, was too fast myself :) I just noticed what you mentioned about lstat, and I wasn't able to reproduce the path issue I suspected.

@somebody32
Copy link
Author

Did both sides logs and stats. Here is the whole flow.

Server starts

  • NFS activity is high
  • dtruss shows that there are a lot of lstat64 for files which are excluded in solargraph.yml, like tmp and node_modules
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/9e311615987bc3\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/2e694616cc522a\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/6178c362641a53\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/97326e81911b25\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/479729439bed5d\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/53434079b7daf1\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/c1e0b2b1715234\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/0a6ec103cd2d6d\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/d83cf74facb9af\0", 0x700000348068, 0x0)		 = 0 0
lstat64("/Users/som32/code/remote_code/tmp/cache/bootsnap-compile-cache/8e/73786d8802e651\0", 0x700000348068, 0x0)

Server is ready

  • no NFS activity
  • dtruss shows nothing interesting

Hover on a constant/go to definition

  • Spikes in NFS
  • dtruss shows that solargraph does lstats and getdirentries on included files, ie everything in app, lib, etc

But because my project is not so small, these last lstats take some time on NFS share and I suspect that is why go to definition takes 5-20 sec

@castwide
Copy link
Owner

Thanks, that's useful to know. The lstats and getdirentries operations on hover/go to definition are most likely unnecessary. I'll try to track down where it's happening.

@somebody32
Copy link
Author

Awesome, thank you!

@castwide
Copy link
Owner

Gem version 0.27.0 should require far less filesystem interaction. The language server depends on the client to be the authority for file states, so you should see much less NFS activity after the server is initialized.

@somebody32
Copy link
Author

Just tried it, and, unfortunately, I can't confirm that: I still see lstats for all directories in the project (even excluded on solargraph.yml) on server start and then again (but only on included dirs) when using go-to-definition.

The good thing is that I've tuned my nfs-setup so it is way faster now and all these interactions usually are less than 5 seconds

@somebody32
Copy link
Author

Not sure if I can help anyhow if you're still interested in finding the reason of these file accesses?

@castwide
Copy link
Owner

Your activity logs have been way helpful. Unfortunately, it sounds like there's still a use case I haven't reproduced yet.

I'm going to keep this problem on my radar internally, but you can close this issue unless you want to receive further updates. Thanks again for your help.

@castwide
Copy link
Owner

Extension v0.19.1 and gem v0.31.0 should further reduce filesystem access during normal usage.

  • The extension was watching more files than necessary, which resulted in the server updating the workspace index for file changes that it should have ignored.
  • The server no longer updates the entire workspace when reported file changes are limited to files that are already indexed.

@somebody32
Copy link
Author

I can confirm it is way faster now; everything is almost instantaneous. Fantastic, thanks a lot for your work! Will close this issue happily

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants