feat: implement earliest_by_offset() UDAF #5273

Merged: 4 commits, May 14, 2020

Conversation

@spena spena commented May 5, 2020

Description

Fixes #5268

Implements the earliest_by_offset() UDAF, which returns the earliest value for a column, where "earliest" is defined by offset order.
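
To illustrate the semantics described above, here is a minimal sketch, not the code in this PR. It assumes the intermediate aggregate pairs each value with a monotonically increasing sequence number (mirroring the SEQ and VAL fields in the diff). The names `EarliestByOffsetSketch`, `SeqVal`, `tag`, `aggregate`, `merge`, and `map` are hypothetical, and null handling is deliberately left out.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of "earliest by offset": the first value seen wins.
final class EarliestByOffsetSketch {

  // Hypothetical stand-in for the SEQ/VAL struct used as the intermediate aggregate.
  static final class SeqVal<T> {
    final long seq;
    final T val;

    SeqVal(final long seq, final T val) {
      this.seq = seq;
      this.val = val;
    }
  }

  // Assumption: arrival order within the process stands in for offset order.
  private static final AtomicLong SEQUENCE = new AtomicLong();

  // Pair a value with the next sequence number.
  static <T> SeqVal<T> tag(final T val) {
    return new SeqVal<>(SEQUENCE.getAndIncrement(), val);
  }

  // Once an aggregate exists it is never replaced by a later value.
  static <T> SeqVal<T> aggregate(final T current, final SeqVal<T> aggregate) {
    return aggregate != null ? aggregate : tag(current);
  }

  // When combining two partial aggregates, keep the lower sequence number.
  static <T> SeqVal<T> merge(final SeqVal<T> a, final SeqVal<T> b) {
    if (a == null) {
      return b;
    }
    if (b == null) {
      return a;
    }
    return a.seq <= b.seq ? a : b;
  }

  // Extract the final result from the intermediate aggregate.
  static <T> T map(final SeqVal<T> aggregate) {
    return aggregate == null ? null : aggregate.val;
  }

  public static void main(final String[] args) {
    SeqVal<String> agg = null;
    for (final String v : List.of("a", "b", "c")) {
      agg = aggregate(v, agg);
    }
    System.out.println(map(agg)); // prints "a"
  }
}
```

A latest-by-offset aggregate would differ only in always replacing the aggregate and keeping the higher sequence number on merge.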

Note for the reviewer:

Testing done

Added a new unit test and a QTT test.

Reviewer checklist

  • Ensure docs are updated if necessary (e.g., if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@spena spena added this to the 0.10.0 milestone May 5, 2020
@spena spena requested review from purplefox and a team May 5, 2020 18:10
@spena spena requested a review from JimGalasyn as a code owner May 5, 2020 18:10

@JimGalasyn JimGalasyn left a comment

LGTM, with one request.

@spena spena force-pushed the earliest_by_offset branch 2 times, most recently from 07cecf7 to cf2d846 Compare May 11, 2020 14:54

@purplefox purplefox left a comment

LGTM. I think some constants and methods could be reused (moved to a utility or base class) to avoid duplication.

static final String SEQ_FIELD = "SEQ";
static final String VAL_FIELD = "VAL";

public static final Schema STRUCT_INTEGER = SchemaBuilder.struct().optional()

All these constants look the same as the ones in LatestByOffset. It would be nice to move them to a shared file instead of duplicating them.

return earliest(STRUCT_STRING);
}

static <T> Struct createStruct(final Schema schema, final T val) {

This could go in a common base class (or a utils class) instead of being duplicated.

return sequence.getAndIncrement();
}

private static int compareStructs(final Struct struct1, final Struct struct2) {

This could go in a common class too.
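
As a rough illustration of the refactoring suggested in these review comments, the shared constants and helpers might be gathered into one package-private utility class along these lines. This is a sketch under assumptions: the class name `ByOffsetUtils` is hypothetical, a plain `Map` stands in for the Kafka Connect `Struct` so the example has no external dependencies, and sequence overflow is not handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shared helper for the earliest/latest by-offset UDAFs.
final class ByOffsetUtils {

  static final String SEQ_FIELD = "SEQ";
  static final String VAL_FIELD = "VAL";

  private static final AtomicLong sequence = new AtomicLong();

  private ByOffsetUtils() {
    // utility class, not meant to be instantiated
  }

  // Assumption: a Map stands in for the Connect Struct used in the PR.
  static <T> Map<String, Object> createStruct(final T val) {
    final Map<String, Object> struct = new HashMap<>();
    struct.put(SEQ_FIELD, generateSequence());
    struct.put(VAL_FIELD, val);
    return struct;
  }

  // Hand out the next sequence number.
  static long generateSequence() {
    return sequence.getAndIncrement();
  }

  // Order two structs by sequence number only (no overflow handling here).
  static int compareStructs(
      final Map<String, Object> struct1,
      final Map<String, Object> struct2) {
    return Long.compare((Long) struct1.get(SEQ_FIELD), (Long) struct2.get(SEQ_FIELD));
  }

  public static void main(final String[] args) {
    final Map<String, Object> first = createStruct("a");
    final Map<String, Object> second = createStruct("b");
    // An "earliest" UDAF would keep the smaller struct, a "latest" one the larger.
    final Map<String, Object> earliest = compareStructs(first, second) <= 0 ? first : second;
    System.out.println(earliest.get(VAL_FIELD)); // prints "a"
  }
}
```

With both UDAFs in the same package as this class, nothing in it needs to be public.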

@spena spena force-pushed the earliest_by_offset branch from cf2d846 to 4a7a99b Compare May 12, 2020 17:09
@spena spena force-pushed the earliest_by_offset branch from 4a7a99b to 1517c06 Compare May 13, 2020 22:14
@spena spena merged commit bc17046 into confluentinc:master May 14, 2020
@spena spena deleted the earliest_by_offset branch May 14, 2020 02:14
spena added a commit to spena/ksql that referenced this pull request May 14, 2020
spena added a commit that referenced this pull request May 14, 2020
* specific language governing permissions and limitations under the License.
*/

package io.confluent.ksql.function.udaf;

IMHO we should create a directory and put both UDAFs and this file in the same package. That way it can be package-private.

Successfully merging this pull request may close these issues.

Support for earliest_by_offset
3 participants