Memoize to reduce web requests to external sites like GitHub. #1101
Conversation
…b. Note that Paket uses parallel downloads with async.
Did you see where it repeatedly calls the same URL?
No, I didn't try to track it with Fiddler. But if I simply add a counter to this memoize function, it is hit 239 times with this paket.dependencies:
239 requests are not that many, but if those (or even 20% of them) happen at exactly the same time against the same website, the firewall may not like it. I assume this .NET concurrent dictionary will not only remove the duplicate calls but also reduce the parallel request count by slowing the process down a few milliseconds (as it syncs between threads), which may also help GitHub not block the user.
For me that sounds like you found another perf issue. Will investigate.
```fsharp
let cache = ConcurrentDictionary<(string * obj) option, Lazy<obj>>()
let memoizeConcurrent (caller: string) (f: 'a -> 'b) =
    fun (x: 'a) ->
        (cache.GetOrAdd(Some (caller, x |> box), lazy ((f x) |> box)).Force() |> unbox) : 'b
```
Why `lazy ((f x) |> box)` plus `.Force()`, and not just `(f x) |> box`? And why is the key `(string * obj) option` and not just `(string * obj)`?
Are you sure it's really caching?
Lazy is there because ConcurrentDictionary locks threads during the update action; with Lazy, threads are not blocked while f is executed. A plain tuple for the key should work fine as well; I think the only requirement is that the key is IEquatable. I just took the simplest and best memoize from fssnip.net as the base implementation and improved it. You can test this code with Tomas Petricek's Fibonacci snippet, just replace his non-thread-safe memoize with memoizeConcurrent: http://www.fssnip.net/8P
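To make the Lazy discussion concrete, here is a minimal, self-contained sketch of the memoizer exercised with the Fibonacci pattern from the fssnip snippet mentioned above (the `fib` binding is illustrative usage, not code from this PR):

```fsharp
open System.Collections.Concurrent

// Lazy makes GetOrAdd safe under contention: even if two threads race to
// insert the same key, only one Lazy value wins and f runs at most once.
let cache = ConcurrentDictionary<(string * obj) option, Lazy<obj>>()
let memoizeConcurrent (caller: string) (f: 'a -> 'b) =
    fun (x: 'a) ->
        (cache.GetOrAdd(Some (caller, box x), lazy (box (f x))).Force() |> unbox) : 'b

// Exponential-time Fibonacci becomes linear, since each n is computed only once.
let rec fib : int -> int =
    memoizeConcurrent "fib" (fun n ->
        if n < 2 then n else fib (n - 1) + fib (n - 2))
```

With memoization, `fib 30` performs only 31 underlying computations instead of millions of recursive calls.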
I was able to identify the root cause. I changed your strategy to cache only the hashes (and possibly the remote dependencies files). Please review ad109f3
A bit more verbose code.
Yeah, and it doesn't do full memoization. I don't really like that because it's harder to explain later; it's "just" a cache lookup. Anyway, it saves you 19 GitHub calls per update.
…es in order to reduce stress on API limit - references #1101
I was able to reduce the calls by checking whether we had already downloaded the corresponding dependencies file in the right version. Please review f402345
(Nice side effect: it's also much faster on the second update.)
I had some problems with GitHub:
System.Net.WebException: The remote server returned an error: (403) Forbidden.
"API rate limit exceeded"
So after this it shouldn't make duplicate network calls when resolving versions within one Paket command (like paket outdated or paket update).
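The shape of that fix can be sketched as a command-scoped lookup before each download. All names here are hypothetical illustrations, not Paket's actual internals: the idea is a cache keyed by URL and version, so the resolver never fetches the same remote file twice in one run.

```fsharp
open System.Collections.Concurrent

// Hypothetical sketch of a command-scoped download cache. The Lazy wrapper
// guarantees the download delegate runs at most once per (url, version),
// even when parallel resolver workers request the same file concurrently.
let downloaded = ConcurrentDictionary<string * string, Lazy<string>>()

let downloadOnce (download: string -> string) (url: string) (version: string) =
    downloaded.GetOrAdd((url, version), lazy (download url)).Force()
```

Within one `paket update`, repeated resolution of the same GitHub dependency then costs exactly one network call instead of one per reference.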