Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CKAN sometimes saves a mod YZ as CKAN/downloads/ABCDE123-SomeOtherMod-1.2.3.zip #1927

Closed
weissel opened this issue Oct 23, 2016 · 6 comments · Fixed by #2243
Closed

CKAN sometimes saves a mod YZ as CKAN/downloads/ABCDE123-SomeOtherMod-1.2.3.zip #1927

weissel opened this issue Oct 23, 2016 · 6 comments · Fixed by #2243
Assignees
Labels
Core (ckan.dll) Issues affecting the core part of CKAN

Comments

@weissel
Copy link

weissel commented Oct 23, 2016

Thanks for wanting to bring an issue to our attention!

Before you open a ticket, please do a quick search of the existing issues to see if it's already been brought up.

CKAN Version:
Affected are at least:
Alexey (v1.20.1 / d41fe54)
Yuri (v1.20.0 / 2f448b9)
StarTrammier (v1.18.1 / 0d2c4f0)
(did not check any older ones)

Operating System:
Windows 10 Pro, Version 1607 (i.e. the Anniversary Edition), Build 14393.321
KSP for Windows, x64, Version 1.2.0.1586

The issue you are experiencing:
CKAN has problems with downloading mods correctly. In this case CKAN seems to mix up downloaded/cached filenames and contents.

How to recreate this issue:

  • Create a completely new KSP directory by unpacking ksp_win64-1-2.zip.
  • Start ckan.exe
  • point ckan.exe at the new install, update repository data as needed
  • find the "SETI-Rebalance" mod and tick it for install
  • klick "Apply"
  • Agree to the default settings for recommended mods.
  • Select all suggested mods.
  • CKAN downloads
  • CKAN stops during install (this time) with
    "Bad metadata detected for module SAVE 1.4.1-2315: No files found in GameData/Nereid to install!"

Looking at the KSP_win64\CKAN\downloads\A2511907-SAVE-1.4.1-2315.zip, it contains what appears to be the "SETIprobePart" mod:
A2511907-SAVE-1.4.1-2315.zip

However:

  • Close CKAN.exe
  • delete the KSP_win64\CKAN\downloads\A2511907-SAVE-1.4.1-2315.zip file
  • Start CKAN.exe again, select S.A.V.E, switch the rider on the right top from "Metadata" to "Contents" and click on "Download"
  • the contents shown seem right for S.A.V.E:
    A2511907-SAVE-1.4.1-2315.zip

It seems CKAN gets somewhat confused when a larger number of mods are downloaded.

CKAN error codes (if applicable):
none

If I can be of any help --- say so. I do understand debugging, though I have very little .NET experience.

@ayan4m1 ayan4m1 added the Support Issues that are support requests label Oct 23, 2016
@weissel weissel changed the title CKAN ga CKAN sometimes saves a mod YZ as CKAN/downloads/ABCDE123-SomeOtherMod-1.2.3.zip Oct 23, 2016
@HebaruSan
Copy link
Member

HebaruSan commented Dec 2, 2017

This problem might not be in the saving code, but the cache search code. Cache lookup does this:

  1. Calculate 40-hex-character SHA-1 of the download URL
  2. Take the first eight characters of that SHA-1 sum
  3. Search the cache for files that start with those eight characters

string hash = CreateURLHash(url);
// Use our existing list of files, or retrieve and
// store the list of files in our cache. Note that
// we copy cachedFiles into our own variable as it
// *may* get cleared by OnCacheChanged while we're
// using it.
string[] files = cachedFiles;
if (files == null)
{
log.Debug("Rebuilding cache index");
cachedFiles = files = Directory.GetFiles(cachePath);
}
// Now that we have a list of files one way or another,
// check them to see if we can find the one we're looking
// for.
foreach (string file in files)
{
string filename = Path.GetFileName(file);
if (filename.StartsWith(hash))
{
return file;
}
}

So if you had two downloads for which the first eight characters of the SHA-1 of the URL were the same (not even the whole hash!), then only the first one would ever be downloaded, and that file would be used to attempt to install both mods.

There are 4,294,967,295 (4 billion/milliard) hash values available with that many characters (edited to correct: it uses hexadecimal, not Base64, so it's far less than 281 trillion). Every mod download ever released has to fit into one of those 4 billion buckets, by itself, for the cache to work properly.

Using hashes to index data normally requires handling collisions (different data that hash to the same value), and stripping off 32 of the 40 characters of the hash makes such collisions far more likely. When I saw this code I marveled that cache collisions had not been a problem up to now, but here it is. I will see if I can find these downloads to confirm the issue...

@HebaruSan HebaruSan added Bug Something is not working as intended Core (ckan.dll) Issues affecting the core part of CKAN and removed Support Issues that are support requests labels Dec 2, 2017
@HebaruSan
Copy link
Member

... I had some trouble confirming the exact mods and versions that were involved in the original report, so I ended up writing a script to search the registry for collisions. None were found. It's possible that some URLs were changed since this issue was reported (e.g. in the KerbalStuff migration), but if not, then the cause was not cache collisions.

The total number of downloads checked was 10953. Maybe someone with more math/stats background could tell us how many we can index before we have a high likelihood of a collision.

@techman83
Copy link
Member

We'll need to pop over to NetKAN-bot and implement any changes there as well. Also the archive.org uploads use the same hash -> https://archive.org/download/KWRocketryRedux-3.0.4/F5A511A7-KWRocketryRedux-3.0.4.zip

Nothing that a bit of code can't work around.

@politas
Copy link
Member

politas commented Dec 4, 2017

Would it make sense to combine the URL hash with the ZIP contents hash as a second level comparison?

@HebaruSan
Copy link
Member

Maybe. It might be bad for performance, though, as some of the downloads are hundreds of MB, all of which has to be read to generate the hash, and who knows when those calculations would need to happen.

I don't advocate a piecemeal approach to this. My investigation suggested that it's a potential future problem rather than a current one at the moment, so there is time to make a robust design.

@politas
Copy link
Member

politas commented Dec 7, 2017

On the other hand, if we generate a file contents hash and compare with our metadata, we validate the downloaded file.

@HebaruSan HebaruSan added the Network Issues affecting internet connections of CKAN label Dec 16, 2017
@HebaruSan HebaruSan self-assigned this Jan 6, 2018
@politas politas removed Bug Something is not working as intended Network Issues affecting internet connections of CKAN labels Jan 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Core (ckan.dll) Issues affecting the core part of CKAN
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants