Disable tx timeouts, add tx debug logging, static DLL pattern, fix docs #3512
I tried purposefully causing a timeout by setting `timeoutMs` to a lower value (and bypassing the `MaximumTimeout < timeout` check), but couldn't get it to work.
On the other hand, I also didn't get a timeout after 10 minutes, so I guess it does work.
In case you tried it on Linux, I'd expect different behavior because of differences in filesystem locking. If you tried it on Windows, then I dunno. 🤷
Did test on Linux. I expected the transaction thingy to still try to roll back the transaction, just without failing like on Windows.
@willflatt Try a build that actually has this fix (Hubble was released 239 days before this change was merged):
Problems
I've been investigating several transaction-related issues lately.
- `Registry.DllPattern` is compiled for every call to `RegisterDll`, which is not great because this regex will be evaluated on every DLL in GameData
- Its documentation says it works great for things like `GameData/Foo/Foo-1.2.dll`, but that's wrong: it's for `GameData/Foo/Foo.1.2.dll`, and the former would detect the identifier as `Foo-1`
- `EnlistTransaction` is called in `RegisterDll` even if one of its validation checks decides we don't need to make any changes

Causes
`TransactionManager` has a severe design flaw: its `MaximumTimeout` property is 10 minutes, limits all transactions, and can't be changed by application code through normal means. When you start a transaction, a background thread is started that sleeps for 10 minutes and then tries to abort the transaction.

If you have a 1-hour operation that you want to have atomic rollback capabilities and you instantiate `TransactionScope` with a timeout of 1 hour plus padding, you're out of luck, all you get is 10 minutes. 🤦 And you won't find out about this until your code fails in production. 🤦 🤦 And you have no way to fix this problem once it's discovered 🤦 🤦 🤦, other than editing a `machine.config` XML file in your .NET Framework folder 🤦 🤦 🤦 🤦, which would be completely unrealistic to attempt across CKAN's entire user base.

When the background thread tries to abort the transaction, the ChinhDo file manager tries to restore the disk to how it was before the transaction started, deleting new files and restoring deleted ones. However, this cannot succeed on Windows because the new files will still be locked by the foreground thread while the original transaction is in progress! So the (misguided, inappropriate) attempt to abort fails, and the whole application crashes. Needless to say this is not what we would ever want, especially when the original operation is actually working fine, chugging along towards its completion.
Changes
- `log.Debug` calls are added so we can track how and when we enlist and de-enlist with transactions (this is what I added for the debug component uploaded in "Registry already enlisted with tx error during install" #3502)
- `Registry.DllPattern` is now `static`, so it will not be compiled for every DLL in GameData
- The DLL path parsing is moved into `GameInstance.DllPathToIdentifier` and shared with Netkan's `PluginsValidator` to reduce code duplication
- `Registry.RegisterDll` now only calls `EnlistTransaction` if all of its validation checks pass and it is going to make a change
- Transaction timeouts are disabled in `TransactionManager`. This will ensure that cloning an instance and installing mods will not arbitrarily fail after 10 minutes.

Even if a transaction did hang in a deadlock or infinite loop, attempting to abort a ChinhDo-based tx from another thread will always fail on Windows. The transaction timeout simply can't solve such a problem. We would need to identify the deadlock or infinite loop and fix it through normal bug reporting means.
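The registry-related changes can be sketched roughly as follows. This is a Python illustration, not CKAN's C# code: the pattern, the `SketchRegistry` class, and its validation checks are all hypothetical, chosen only to reproduce the behavior described here (identifier taken up to the first dot of the file name, pattern compiled once, enlistment only when a change will be made).

```python
import re
from typing import Optional

# Compiled once at import time: the Python analogue of a static regex field.
# Hypothetical pattern, assuming the identifier is the file name up to its first dot.
DLL_PATTERN = re.compile(
    r"^GameData/(?:.*/)?(?P<identifier>[^./]+)[^/]*\.dll$",
    re.IGNORECASE,
)


def dll_path_to_identifier(relative_path: str) -> Optional[str]:
    """Return the identifier for a DLL path, or None if it is not a GameData DLL."""
    match = DLL_PATTERN.match(relative_path)
    return match.group("identifier") if match else None


class SketchRegistry:
    """Hypothetical stand-in for a registry; not CKAN's actual class."""

    def __init__(self) -> None:
        self.installed_dlls = {}  # identifier -> relative path
        self.enlist_count = 0

    def enlist_transaction(self) -> None:
        # Placeholder: a real implementation would join the ambient transaction here.
        self.enlist_count += 1

    def register_dll(self, relative_path: str) -> bool:
        """Register a DLL, enlisting only when the registry will actually change."""
        identifier = dll_path_to_identifier(relative_path)
        if identifier is None:
            return False  # not a DLL we track: no change, no enlistment
        if self.installed_dlls.get(identifier) == relative_path:
            return False  # already registered: no change, no enlistment
        self.enlist_transaction()
        self.installed_dlls[identifier] = relative_path
        return True


registry = SketchRegistry()
registry.register_dll("GameData/Foo/Foo.1.2.dll")  # changes the registry, enlists
registry.register_dll("GameData/Foo/Foo.1.2.dll")  # no change, no enlistment
registry.register_dll("GameData/Foo/readme.txt")   # not a DLL, no enlistment
print(registry.enlist_count)                                # 1
print(dll_path_to_identifier("GameData/Foo/Foo-1.2.dll"))   # Foo-1
```

The last line shows why the hyphenated file name is the wrong example for the pattern: everything before the first dot, including `-1`, ends up in the identifier.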
Fixes #3227.
Fixes #3294.
Fixes #3455.
Fixes #3483.