[sp_BlitzFirst] Division by zero bug if pass 1 and 2 are too close to each other #3370

Closed
Montro1981 opened this issue Oct 6, 2023 · 2 comments

Montro1981 commented Oct 6, 2023

Version of the script
Version: 8.16
Date: 20230820

What is the current behavior?
kellyleia on Slack reported:

https://sqlcommunity.slack.com/archives/C1QU0QYS0/p1696512788446079

The wait stats are collected in two passes to see how many waits are generated between those passes.
On quiet servers it is possible that both passes are within a second of each other.
The waits are divided by the number of seconds passed between pass 1 and 2 using:

DATEDIFF(ss, waits1.SampleTime, waits2.SampleTime)

However, DATEDIFF counts the number of second boundaries crossed, so it returns zero when both sample times fall within the same second (it doesn't account for the number of milliseconds passed):

SELECT DATEDIFF(ss, N'2000-01-01 12:00:00.000', N'2000-01-01 12:00:00.999')

The result is 0 even though 999 milliseconds have passed.
DATEDIFF is a bit stupid sometimes; Brent once linked a post about this on LinkedIn:
https://debthedba.wordpress.com/2023/08/21/fun-with-datediff-how-long-did-that-process-run-for/
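To make the boundary-counting behavior concrete, here is a minimal sketch. The variable names are made up for illustration, and the values are typed as DATETIME2(3) so that the .999 fraction is kept exactly:

```sql
-- Illustrative timestamps (hypothetical names, not from sp_BlitzFirst).
DECLARE @SameSecondStart DATETIME2(3) = '2000-01-01 12:00:00.000';
DECLARE @SameSecondEnd   DATETIME2(3) = '2000-01-01 12:00:00.999';
DECLARE @NextSecond      DATETIME2(3) = '2000-01-01 12:00:01.000';

-- 999 ms apart, but no second boundary is crossed.
SELECT DATEDIFF(ss, @SameSecondStart, @SameSecondEnd) AS same_second;     -- 0

-- Only 1 ms apart, but the second boundary is crossed.
SELECT DATEDIFF(ss, @SameSecondEnd, @NextSecond) AS across_boundary;      -- 1

-- Asking for milliseconds shows the real elapsed time.
SELECT DATEDIFF(ms, @SameSecondStart, @SameSecondEnd) AS elapsed_ms;      -- 999
```

So the zero is not a rounding effect: DATEDIFF reports how many times the chosen datepart ticked over between the two values.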

If the current behavior is a bug, please provide the steps to reproduce.
Copy the code for the wait stats collection (pass 1 and 2) and run these close to each other, then run check 20. This can give a divide by zero error.
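The failure can also be sketched in isolation, without running the full procedure. This is not the actual check 20 code: the sample times and the wait delta below are hypothetical stand-ins, and the error assumes the default SET options (ANSI_WARNINGS and ARITHABORT on):

```sql
-- Hypothetical stand-ins for the two sample times and the wait delta.
DECLARE @Pass1 DATETIME2(3) = '2000-01-01 12:00:00.000';
DECLARE @Pass2 DATETIME2(3) = '2000-01-01 12:00:00.999';
DECLARE @WaitMsDelta BIGINT = 1500;

BEGIN TRY
    -- Both passes fall in the same second, so the divisor is 0.
    SELECT @WaitMsDelta / DATEDIFF(ss, @Pass1, @Pass2) AS waits_per_second;
END TRY
BEGIN CATCH
    -- Error 8134: "Divide by zero error encountered."
    SELECT ERROR_NUMBER() AS error_number, ERROR_MESSAGE() AS error_message;
END CATCH;
```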

What is the expected behavior?
There are a couple possible solutions:

  1. Before the second pass, artificially add 1 second with WAITFOR DELAY '00:00:01', which forces at least 1 second to pass before the second pass.
  2. In check 20, check whether the DATEDIFF is zero; if that is the case, replace it with 1 second. The wait stats are kind of irrelevant at this point anyways.
  3. In check 20, do the DATEDIFF in milliseconds and round it up to whole seconds with CEILING. This will affect all runs, as it artificially adds up to 1 second to the DATEDIFF.
  4. In check 20, do the DATEDIFF in milliseconds and divide by that. Milliseconds pass a lot quicker, but the waits will be artificially raised on all runs, e.g. 1 / 0.5 = 2 waits. The longer the time between pass 1 and 2, the smaller the difference.

Option 2 seems the best solution because it only impacts this scenario and doesn't affect the current behavior otherwise.

richbenner also suggested option 2 as a solution.

https://sqlcommunity.slack.com/archives/C1QU0QYS0/p1696614648687679
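As a rough illustration of option 2, the zero can be replaced with 1 right where the division happens. This is only a sketch under assumed names, not the code from the pull request; the variables are hypothetical stand-ins for the two sample times and the wait delta:

```sql
-- Hypothetical stand-ins for the two sample times and the wait delta.
DECLARE @Pass1 DATETIME2(3) = '2000-01-01 12:00:00.000';
DECLARE @Pass2 DATETIME2(3) = '2000-01-01 12:00:00.999';
DECLARE @WaitMsDelta BIGINT = 1500;

-- Treat a zero-second gap as 1 second; every other value is left untouched.
SELECT @WaitMsDelta
       / CASE WHEN DATEDIFF(ss, @Pass1, @Pass2) = 0
              THEN 1
              ELSE DATEDIFF(ss, @Pass1, @Pass2)
         END AS waits_per_second;   -- 1500
```

An equivalent, shorter spelling would be ISNULL(NULLIF(DATEDIFF(ss, @Pass1, @Pass2), 0), 1). Either way, runs where the passes are at least one second boundary apart produce the same numbers as before.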

Which versions of SQL Server and which OS are affected by this issue? Did this work in previous versions of our procedures?
Code hasn't changed in several years, hardware just got faster.

Montro1981 added a commit to Montro1981/SQL-Server-First-Responder-Kit that referenced this issue Oct 9, 2023
Montro1981 changed the title from "[sp_BlitzFirst] Division by zero bug if pass 1 and 2 are to close to each other" to "[sp_BlitzFirst] Division by zero bug if pass 1 and 2 are too close to each other" Oct 9, 2023
@Montro1981

PR #3371

BrentOzar added a commit that referenced this issue Oct 10, 2023
BrentOzar added this to the 2023-10 Release milestone Oct 10, 2023
@BrentOzar

Thanks for the pull request! I agree with the solution. Looks good, merging into the dev branch, will be in the next release with credit to you in the release notes.
