Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

evalengine: Support built-in MySQL function CEIL() #11027

Merged
merged 8 commits into from
Sep 13, 2022

Conversation

Weijun-H
Copy link
Contributor

Signed-off-by: Weijun-H huangweijun1001@gmail.com

Description

Implement support CEIL() to understand how the evaluation engine works

Related Issue(s)

Checklist

  • "Backport me!" label has been added if this change should be backported
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@vitess-bot
Copy link
Contributor

vitess-bot bot commented Aug 17, 2022

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a new flag is being introduced, review whether it is really needed. The flag names should be clear and intuitive (as far as possible), and the flag's help should be descriptive. Additionally, flag names should use dashes (-) as word separators rather than underscores (_).
  • If a workflow is added or modified, each items in Jobs should be named in order to mark it as required. If the workflow should be required, the GitHub Admin should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should either include a link to an issue that describes the bug OR an actual description of the bug and how to reproduce, along with a description of the fix.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.

@Weijun-H Weijun-H marked this pull request as ready for review August 17, 2022 08:45
@vmg vmg self-assigned this Aug 17, 2022
Copy link
Collaborator

@vmg vmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Alex! Off to a good start! I left you some feedback on the things that need fixing.

if sqltypes.IsIntegral(argtype) {
inarg.makeFloat()
result.setInt64(int64(math.Ceil(inarg.float64())))
} else if sqltypes.Decimal == argtype {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small nit, but we don't have any reverse comparisons in the Vitess codebase. As you may be aware of, reverse comparisons have their origin in the C family of languages because if (argtype = sqltypes.Decimal) is valid syntax and does something very different than if (argtype == sqltypes.Decimal), so reversing the comparison may prevent a bad bug from being deployed.

However, Go does not allow assignments in if expressions, so we always write if argtype == sqltypes.Decimal { -- there's no point on doing otherwise except making the code harder to read!


if sqltypes.IsIntegral(argtype) {
inarg.makeFloat()
result.setInt64(int64(math.Ceil(inarg.float64())))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this could possibly do anything! The result of converting an integral to a float64 will always be a round number, so calling math.Ceil on it will be a no-op. The whole roundtrip from integral -> float -> integral seems to be superfluous.

Have you found any corner cases that that roundtrip handles properly? From testing in the MySQL CLI, CEIL on an integer looks like a noop -- it always returns the same integer.

Comment on lines 664 to 673
if num.Cmp(decimal.NewFromInt(math.MaxInt32)) == 1 || num.Cmp(decimal.NewFromInt(math.MinInt32)) == -1 {
if num.Sign() == 1 {
result.setDecimal(num.Add(decimal.NewFromInt(1)), 0)
} else {
result.setDecimal(num.Add(decimal.NewFromInt(-1)), 0)
}
} else {
floatpart, _ := num.Float64()
result.setInt64(int64(math.Ceil(floatpart)))
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a little bit what are you trying to do here? It seems like you're using different rounding modes for decimals that are larger than INT32_MAX or smaller than INT32_MIN -- otherwise you're just rounding the decimal as if it were a float...

I don't think this is exactly MySQL's behavior. Are you attempting an optimization here? I.e. doing the float rounding if it would always be safe to do so? I'm afraid this won't behave like you're expecting it to... Converting a decimal to a float64 is super expensive! The comparisons against math.MaxInt32 and math.MaxInt32 are also expensive, particularly because they allocate two temporary decimals, and the num.Float64 would tell you anyway if the conversion to float64 was safe to perform.

I think this is a case where the simplest code would would also be the safest: you can add (or subtract) one from the decimal and truncate the decimals, and that should give you a good ceil... but here's an even better tip: if you check the implementation of our decimal package, you'll see that we've adapted an existing arbitrary-precision decimal implementation to behave like MySQL's. Maybe the original implementation has a Ceil function you could port over? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused with the mysql behaviors here. For example, when I test the value like 1.5

/home/hwj/cncf/vitess/go/vt/vtgate/evalengine/integration/comparison_test.go:710: 
different results: DECIMAL(2); mysql response: INT64(2) (local collation: binary; mysql collation: binary)
      query: SELECT CEIL(1.5) (SIMPLIFY=true)

Then I checked the type of the 1.5, it is not the FLOAT, but Demical. Do I misunderstand the process here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the MySQL docs:

For exact-value numeric arguments, the return value has an exact-value numeric type. For string or floating-point arguments, the return value has a floating-point type.

It seems like rounding a DECIMAL, which is an exact value numeric type, will return INT64 if the result is small enough to fit, otherwise a DECIMAL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It it tricky to detect when to return INT, because:

"9223372036854775810.4", // bigger than MAXINT64, return DECIMAL
"7223372036854775010.1", // bigger than MAXINT32, return DECIMAL
"2147483648.1", // bigger than MAXINT32, return INT

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All you have to check is whether the result of the rounding fits in int64 or not -- there are no intermediate types. If it fits, it is returned as INT64, if not as DECIMAL.

Copy link
Contributor Author

@Weijun-H Weijun-H Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the type of args is DECIMAL, but the result is INT64. How could we return the true sqltypes.Type in the type function. My current solution:

func (builtinCeil) call(env *ExpressionEnv, args []EvalResult, result *EvalResult) {
	inarg := &args[0]
	argtype := inarg.typeof()
	if inarg.isNull() {
		result.setNull()
		return
	}

	if sqltypes.IsIntegral(argtype) {
		result.setInt64(inarg.int64())
	} else if sqltypes.Decimal == argtype {
		num := inarg.decimal()
		num = num.Ceil()
		intnum, isfit := num.Int64()
		if isfit {
			inarg.makeSignedIntegral()
			result.setInt64(intnum)	
		} else {
			result.setDecimal(num, 0)
		}
	} else {
		inarg.makeFloat()
		result.setFloat(math.Ceil(inarg.float64()))
	}
}

func (builtinCeil) typeof(env *ExpressionEnv, args []Expr) (sqltypes.Type, flag) {
	if len(args) != 1 {
		throwArgError("CEIL")
	}
	t, f := args[0].typeof(env)
	if sqltypes.IsIntegral(t) {
		return sqltypes.Int64, f
	} else if sqltypes.Decimal == t {
		return sqltypes.Decimal, f
	} else {
		return sqltypes.Float64, f
	}
}

How could we catch the new return type, like the test1.5?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, that's a good question. MySQL's behavior here is really bad... Our typeof helpers cannot (should not, really) return variable types for the same input, so we're gonna have to hack it and do something different to MySQL... Let's assume for now that typeof returns Decimal for Decimal. That'll give some bad typeof results from the tests, and we'll have to ignore them, but I think that's the best we can do for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inarg.makeSignedIntegral() line is superfluous by the way. You can remove it.

"7223372036854775010.1",
"2147483640.1",
"9223372036854775010.4",
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start with the tests. Can you think of more corner cases that would be worth adding here? Larger negative floats? Larger negative decimals?

Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@Weijun-H Weijun-H requested review from vmg and removed request for systay, harshit-gangal and frouioui August 19, 2022 15:35
Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@vmg
Copy link
Collaborator

vmg commented Aug 22, 2022

I don't think your Ceil implementation is quite correct. Here's a big hint: https://github.com/shopspring/decimal/blob/f55dd564545cec84cf84f7a53fb3025cdbec1c4f/decimal.go#L1288-L1305

Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@Weijun-H
Copy link
Contributor Author

Weijun-H commented Sep 2, 2022

I don't think your Ceil implementation is quite correct. Here's a big hint: https://github.com/shopspring/decimal/blob/f55dd564545cec84cf84f7a53fb3025cdbec1c4f/decimal.go#L1288-L1305

Hi @vmg, I have modified Ceil function, could you have a look?

Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@vmg
Copy link
Collaborator

vmg commented Sep 12, 2022

We'll need to ignore the bad typeof results from the tests. 👍

Signed-off-by: Weijun-H <huangweijun1001@gmail.com>
@vmg vmg added Type: Feature Component: Evalengine changes to the evaluation engine labels Sep 13, 2022
@vmg
Copy link
Collaborator

vmg commented Sep 13, 2022

Great job on your first PR! It wasn't an easy one!

@vmg vmg merged commit babd1d1 into vitessio:main Sep 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Evalengine changes to the evaluation engine Type: Feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants