
[FLINK-16384][table sql/client] Support SHOW CREATE TABLE statement #13011

Closed
wants to merge 7 commits

Conversation

@fsk119 (Member) commented Jul 29, 2020

What is the purpose of the change

Support the 'SHOW CREATE TABLE' statement.

Brief change log

  • Modify the Flink and Hive parsers to parse SHOW CREATE TABLE statements.
  • Add an Operation for SHOW CREATE TABLE and convert the parsed SqlNode to the corresponding Operation.
  • Modify TableEnvironment to handle the new Operation and return results according to the input Operation type (a usage sketch follows below).
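
For illustration, a minimal, hedged sketch of how the new statement can be exercised end to end (the table name and schema are made up for the example):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ShowCreateTableExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().build());
        // Register a table via DDL first; SHOW CREATE TABLE targets DDL-created tables.
        tEnv.executeSql(
                "CREATE TABLE orders ("
                        + "  `user` BIGINT NOT NULL,"
                        + "  product VARCHAR(32)"
                        + ") WITH ('connector' = 'datagen')");
        // Prints the reconstructed CREATE TABLE statement for the table.
        tEnv.executeSql("SHOW CREATE TABLE orders").print();
    }
}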

Verifying this change

This change added tests and can be verified as follows:

  • Added a test that the parser can parse SHOW CREATE TABLE statements with the Flink parser, the Hive parser, and the SQL command parser.
  • Added a test that SHOW CREATE TABLE statements return the correct result with all kinds of TableEnvironment (StreamTableEnv and BatchTableEnv) and the SQL client.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot (Collaborator) commented Jul 29, 2020

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 353393c (Sat Aug 28 11:17:08 UTC 2021)

✅ no warnings

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot (Collaborator) commented Jul 29, 2020

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@fsk119 (Member, Author) commented Jul 30, 2020

@wuchong cc

@LadyForest (Contributor) commented:

Hi @fsk119, if you're not free, I'd like to work on this PR and resolve the conflicts.

@fsk119 (Member, Author) commented Mar 24, 2021

@LadyForest, thanks for your help. Just go ahead!

@@ -193,6 +195,8 @@ private SqlToOperationConverter(
return Optional.of(converter.convertShowCurrentDatabase((SqlShowCurrentDatabase) validated));
} else if (validated instanceof SqlShowTables) {
return Optional.of(converter.convertShowTables((SqlShowTables) validated));
} else if (validated instanceof SqlShowCreateTable) {
Contributor:

I think we don't need to support this feature on the old planner, right?

Member (Author):

Yes. I agree with you.

Contributor:

Okay. Then I'll revert this change.

// | CREATE TABLE `my_table` ( |
// | ... |
// | ) WITH ( |
// | ... |
Member:

Shall it also print sensitive options, such as a JDBC URL containing a password, when on an external MySQL table?

@LadyForest (Contributor) left a comment:

Hi @fsk119, sorry for the late reply.

I updated this PR with the following changes:

  1. rebase master and resolve the conflicts
  2. revert the changes made on the old planner
  3. apply spotless formatting (from tab to four spaces)
  4. fix DDL missing TEMPORARY keyword for the temporary table
  5. display table's full object path as catalog.db.table
  6. change the implementation for view, from table schema to the expanded query for the view
  7. add view test in CatalogTableITCase, and adapt sql client test to the new test framework
  8. adapt docs

I also made some changes to the original impl of TableEnvironmentImpl#buildShowCreateTableResult since TableSchema is deprecated.

Please take a look when you're free.

String comment = table.getComment();
Map<String, String> options = table.getOptions();

sb.append(String.format("`%s` (\n", sqlIdentifier.getObjectName()));
Contributor:

String.format("`%s` (\n", sqlIdentifier.getObjectName())

This will only display the table name without catalog and database info. Is it by design?
If not, then I guess it should be sqlIdentifier.asSerializableString()

|)
|""".stripMargin
tableEnv.executeSql(createDDL)
val row = tableEnv.executeSql("SHOW CREATE TABLE `TBL1`").collect().next();
Contributor:

The trailing ; can be removed.

Contributor:

I think we'd better add a test case for CREATE VIEW.

}
sb.append("\n) ");
// append comment
if (comment != null) {
Contributor:

comment won't be null for DefaultCatalogTable#getComment (see its impl), so I think it's better to use StringUtils.isNotEmpty(comment); see the sketch below.
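
A minimal sketch of that suggestion, reusing the sb and comment variables from the surrounding diff (StringUtils here is org.apache.commons.lang3.StringUtils):

// DefaultCatalogTable#getComment returns an empty string rather than null,
// so guard on non-empty instead of non-null.
if (StringUtils.isNotEmpty(comment)) {
    sb.append(String.format("COMMENT '%s'\n", comment));
}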

.collect(Collectors.toList())))
.append("\n)\n");

Object[][] rows = new Object[][]{new Object[]{sb.toString()}};
Contributor:

I'm afraid that the row might be extremely large if the whole DDL string resides in one row, since all SHOW statements are now printed in tableau form. See the impl of PrintUtils#columnWidthsByContent: it calculates the column width from the content size.

@@ -1095,6 +1109,75 @@ private TableResult buildShowResult(String columnName, String[] objects) {
Arrays.stream(objects).map((c) -> new String[]{c}).toArray(String[][]::new));
}

private TableResult buildShowCreateTableResult(CatalogBaseTable table, ObjectIdentifier sqlIdentifier) {
StringBuilder sb = new StringBuilder("CREATE TABLE ");
Contributor:

I think we need to add a temporary table check here; see the sketch below.
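
A minimal sketch of the suggested check, assuming an isTemporary flag is available from the catalog lookup (names are illustrative):

StringBuilder sb = new StringBuilder("CREATE ");
if (isTemporary) {
    // Emit the TEMPORARY keyword so the generated DDL round-trips for temporary tables.
    sb.append("TEMPORARY ");
}
sb.append("TABLE ");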

@LadyForest (Contributor):

@flinkbot run azure

@fsk119 (Member, Author) left a comment:

Thanks for your contribution. I left some comments.

/**
* Parse a "Show Create Table" query command.
*/
SqlShowCreateTable SqlShowCreateTable() :
Member (Author):

There are two spaces between SqlShowCreateTable and SqlShowCreateTable().

Comment on lines 77 to 93
show create table orders;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| create table |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE TABLE `default_catalog`.`default_database`.`orders` (
`user` BIGINT NOT NULL,
`product` VARCHAR(32),
`amount` INT,
`ts` TIMESTAMP(3),
`ptime` AS PROCTIME(),
WATERMARK FOR `ts` AS `ts` - INTERVAL '1' SECOND,
CONSTRAINT `PK_3599338` PRIMARY KEY (`user`) NOT ENFORCED
) WITH (
'connector' = 'datagen'
)
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Member (Author):

I think we don't need to print the chart. You can take a look at how the SQL client prints EXPLAIN results.

It's very inconvenient to copy the DDL if we have the chart.

Comment on lines 1323 to 1334
private String[] buildShowCreateTableRow(
ResolvedCatalogBaseTable<?> table,
ObjectIdentifier sqlIdentifier,
boolean isTemporary) {
Member (Author):

The Hive dialect has its own grammar to create tables. Do we need to build the table with the Hive dialect if the table is permanent?

return new String[] {sb.toString()};
}

private String getColumnString(Column column) {
Member (Author):

Please add "TODO: Print the comment when FLINK-18958 is fixed"

@@ -985,6 +985,80 @@ class CatalogTableITCase(isStreamingMode: Boolean) extends AbstractTestBase {
assertEquals(false, tableSchema2.getPrimaryKey.isPresent)
}

@Test
Member (Author):

If the table is created by the Table API, the SHOW CREATE TABLE result is strange, e.g.

Table t =
                tEnv.fromDataStream(
                        env.fromCollection(
                                Arrays.asList(
                                        new Tuple2<>(1, "1"),
                                        new Tuple2<>(1, "1"),
                                        new Tuple2<>(2, "3"),
                                        new Tuple2<>(2, "5"),
                                        new Tuple2<>(3, "5"),
                                        new Tuple2<>(3, "8"))),
                        $("id1"),
                        $("id2"),
                        $("proctime").proctime());

        tEnv.createTemporaryView("T", t);
CREATE TEMPORARY VIEW `default_catalog`.`default_database`.`T` AS
DataStream: (id: [default_catalog.default_database.T], fields: [id1, id2, proctime])

Contributor:

That's a good point. I went through TableEnvironmentImpl#createTemporaryView and found that the view is transformed to QueryOperationCatalogView, and in its constructor, QueryOperation#asSummaryString is passed as both originalQuery and expandedQuery, which causes this problem. Meanwhile, not all tables created by the Table API encounter this problem; take org.apache.flink.table.planner.utils.TestTableSourceSinks#createPersonCsvTemporaryTable for example

def createPersonCsvTemporaryTable(tEnv: TableEnvironment, tableName: String): Unit = {
    tEnv.connect(new FileSystem().path(getPersonCsvPath))
      .withFormat(
        new OldCsv()
          .fieldDelimiter("#")
          .lineDelimiter("$")
          .ignoreFirstLine()
          .commentPrefix("%"))
      .withSchema(
        new Schema()
          .field("first", DataTypes.STRING)
          .field("id", DataTypes.INT)
          .field("score", DataTypes.DOUBLE)
          .field("last", DataTypes.STRING))
      .createTemporaryTable(tableName)
  }

will generate

CREATE TEMPORARY TABLE `default_catalog`.`default_database`.`MyTable` (
  `first` STRING,
  `id` INT,
  `score` DOUBLE,
  `last` STRING
) WITH (
  'connector.path' = '/var/folders/xd/9dp1y4vd3h56kjkvdk426l500000gn/T/csv-test3761923201962233845tmp',
  'connector.property-version' = '1',
  'format.type' = 'csv',
  'format.line-delimiter' = '$',
  'format.ignore-first-line' = 'true',
  'format.property-version' = '1',
  'connector.type' = 'filesystem',
  'format.field-delimiter' = '#',
  'format.comment-prefix' = '%'
)

The reason is that the table created via ConnectTableDescriptor is a CatalogTableImpl, which is the same type as the one created via SQL (please correct me if I'm wrong).

Shall we revisit the decision of throwing an exception for tables created via the Table API?

@LadyForest (Contributor) commented Apr 15, 2021:

Hi @fsk119, as discussed with Jark, I think we do not need to support the SHOW CREATE TABLE statement for views in this PR.

@smm321 commented May 15, 2022:

Hi @LadyForest @wuchong, I have implemented SHOW CREATE TABLE for the Hive catalog, but one issue remains to be solved:
I can't preserve the original option order when I generate the "WITH" clause. Is there a way to solve it?

@LadyForest (Contributor) commented Apr 14, 2021

Hi Shengkai, as discussed offline, we're on the same page that

  • SHOW CREATE TABLE statement only supports the table created via DDL
  • Hive parser does not support this statement for now
  • This is the only SHOW statement that is not printed in tableau form
  • In future work, SHOW CREATE TABLE will mask the sensitive options

I'll update the PR according to this agreement, and thanks for your review :)

@fsk119 (Member, Author) left a comment:

LGTM in general. I left some minor suggestions.

Flink SQL> CREATE VIEW my_view AS ...;
[INFO] View has been created.

Flink SQL> SHOW VIEWS;
my_view

Member (Author):

Revert this.

Comment on lines 1059 to 1060
"Could not execute SHOW CREATE TABLE. " +
"View with identifier `default_catalog`.`default_database`.`tmp` is not supported.")
Member (Author):

What about

SHOW CREATE TABLE doesn't support to show the statement to create view with identifier `default_catalog`.`default_database`.`tmp`.

Contributor:

I'm fine with this. nit: "support showing" instead of "support to show".

fsk119 and others added 5 commits April 16, 2021 14:41
This commit tries to
1. resolve the conflicts
2. revert the changes made on old planner
3. apply spotless formatting
4. fix DDL missing `TEMPORARY` keyword for temporary table
5. display table's full object path as catalog.db.table
6. support displaying the expanded query for view
7. add view test in CatalogTableITCase, and adapt sql client test to the new test framework
8. adapt docs
1. Revert changes on Hive parser and remove unnecessary whitespace in parserImpls.ftl
2. Change the output format from tableau form to raw content
3. Add TODO to track missing column comment
4. Only support showing catalog table
5. Update docs accordingly
1. Change exception message
2. Revert newline
3. Rebase master
@wuchong (Member) left a comment:

Thanks for the great contribution @fsk119 and @LadyForest , the pull request looks good to me in general, I left some minor comments.

Optional<CatalogManager.TableLookupResult> result =
catalogManager.getTable(showCreateTableOperation.getSqlIdentifier());
if (result.isPresent()) {
return buildShowCreateTableResult(result.get().getTable(), ((ShowCreateTableOperation) operation).getSqlIdentifier());
Member:

Suggested change
return buildShowCreateTableResult(result.get().getTable(), ((ShowCreateTableOperation) operation).getSqlIdentifier());
return buildShowCreateTableResult(result.get().getTable(), showCreateTableOperation.getSqlIdentifier());

@@ -1072,6 +1086,75 @@ private TableResult buildShowResult(String columnName, String[] objects) {
Arrays.stream(objects).map((c) -> new String[]{c}).toArray(String[][]::new));
}

private TableResult buildShowCreateTableResult(CatalogBaseTable table, ObjectIdentifier sqlIdentifier) {
Member:

I think this must be a CatalogTable, because this is SHOW CREATE TABLE. We should throw an exception before this if it is a view; see the sketch below.
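
A hedged sketch of such a guard, reusing the variables from the diff above (CatalogView and TableException are existing Flink classes; the message wording follows what was agreed earlier in this thread):

CatalogBaseTable baseTable = result.get().getTable();
if (baseTable instanceof CatalogView) {
    // Fail fast for views: SHOW CREATE TABLE only reconstructs DDL for catalog tables.
    throw new TableException(
            String.format(
                    "SHOW CREATE TABLE doesn't support showing the statement to create view with identifier %s.",
                    showCreateTableOperation.getSqlIdentifier().asSerializableString()));
}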

Member:

Personally, I don't like to add more and more methods in TableEnvironmentImpl; this makes the class fat and hard to maintain. I think such methods can be utilities in a separate class, e.g. ShowStatementUtils.

Contributor:

Personally, I don't like to add more and more methods in TableEnvironmentImpl; this makes the class fat and hard to maintain. I think such methods can be utilities in a separate class, e.g. ShowStatementUtils.

+1 and I think we can open another PR to improve this.

throw new IllegalStateException("Unknown key type: " + getType());
}

final String typeString = getTypeString();
return String.format("CONSTRAINT %s %s (%s)", getName(), typeString, String.join(", ", columns));
Member:

I think we should also add NOT ENFORCED to the summary string. Currently, it is not correct.
Besides, we should add NOT ENFORCED according to the underlying enforced flag, even though it is always false for now; a sketch follows below.
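
A sketch of deriving the keyword from the flag instead of hard-coding it (assuming an isEnforced() accessor is available on the constraint):

// Append ENFORCED / NOT ENFORCED based on the underlying flag,
// even though the flag is always false today.
final String typeString = getTypeString();
return String.format(
        "CONSTRAINT %s %s (%s)%s",
        getName(),
        typeString,
        String.join(", ", columns),
        isEnforced() ? " ENFORCED" : " NOT ENFORCED");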

*/
public final String asCanonicalString() {
final String typeString = getTypeString();
return String.format("CONSTRAINT %s %s (%s) NOT ENFORCED",
Member:

We should add NOT ENFORCED according to the underlying enforced flag, even though it is always false for now.

Contributor:

We should add NOT ENFORCED according to the underlying enforced flag, even though it is always false for now.

Yes, it has been resolved in FLINK-21435

* E.g CONSTRAINT pk PRIMARY KEY (`f0`, `f1`) NOT ENFORCED
* </pre>
*/
public final String asCanonicalString() {
Member:

I would suggest calling it asSerializableString, to keep it aligned with others, e.g. ObjectIdentifier#asSerializableString.

sb.append("PARTITIONED BY (")
.append(
catalogTable.getPartitionKeys().stream()
.map(key -> String.format("`%s`", key))
Member:

Use EncodingUtils.escapeIdentifier to escape, which can handle "`" in field names; see the example below.
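
For example (assuming EncodingUtils.escapeIdentifier wraps the name in backticks and doubles any embedded backtick):

import org.apache.flink.table.utils.EncodingUtils;

// A field literally named my`name should serialize as `my``name`.
String escaped = EncodingUtils.escapeIdentifier("my`name");
System.out.println(escaped); // expected: `my``name`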

column.getName()))));
} else {
DataType dataType = column.getDataType();
String type = dataType.toString();
Member:

Use dataType#asSerializableString instead of toString/asSummaryString. asSerializableString doesn't show timestamp kind and we don't need the following special logic then.

Actually, we should always use asSerializableString, rather than asSummaryString, because only the serialized string is fully supported to be parsed, see LogicalTypeParser.

Besides, it would be better to avoid using toString and explicitly use asSerializableString; otherwise, it's hard to trace where asSerializableString/asSummaryString are used.

Contributor:

Use dataType#asSerializableString instead of toString/asSummaryString. asSerializableString doesn't show timestamp kind and we don't need the following special logic then.

Actually, we should always use asSerializableString, rather than asSummaryString, because only the serialized string is fully supported to be parsed, see LogicalTypeParser.

Besides, it would be better to avoid using toString and explicitly use asSerializableString; otherwise, it's hard to trace where asSerializableString/asSummaryString are used.

Hi Jark, I agree with you that asSerializableString goes first. Just one thing to confirm: LogicalTypeParser#parseTypeByKeyword uses VarCharType.MAX_LENGTH as the varchar length, and VarCharType#asSerializableString will always print the STRING data type as VARCHAR(2147483647).

Contributor:

If this is acceptable, then column.getDataType().getLogicalType().asSerializableString() is definitely better.

Member:

I think we can use STRING as the serializable string for VarCharType.MAX_LENGTH too. What do you think @twalthr?

Contributor:

Another case is ts TIMESTAMP_LTZ(3): the summary format is TIMESTAMP_LTZ(%d), while the serializable format is TIMESTAMP(%d) WITH LOCAL TIME ZONE.

Contributor:

Since the case-by-case check is not attractive, I'm just wondering: do we need to preserve the original statement made by users as much as possible? If so, maybe we should add another API method to fulfill this requirement.

Member:

I think we can leave this to another issue. STRING, BYTES, and TIMESTAMP_LTZ are just abbreviations; it's fine to just display VARCHAR(MAX) and TIMESTAMP(%d) WITH LOCAL TIME ZONE, because they are the single source of truth.
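
To make the contrast concrete, a small sketch of the two renderings discussed above (the printed values are as reported in this thread):

import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.logical.LogicalType;

public class TypeStringDemo {
    public static void main(String[] args) {
        LogicalType ltz = DataTypes.TIMESTAMP_LTZ(3).getLogicalType();
        System.out.println(ltz.asSummaryString());      // TIMESTAMP_LTZ(3)
        System.out.println(ltz.asSerializableString()); // TIMESTAMP(3) WITH LOCAL TIME ZONE

        LogicalType str = DataTypes.STRING().getLogicalType();
        System.out.println(str.asSerializableString()); // VARCHAR(2147483647)
    }
}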

/** Operation to describe a SHOW CREATE TABLE statement. */
public class ShowCreateTableOperation implements ShowOperation {

private final ObjectIdentifier sqlIdentifier;
Member:

Use a meaningful name here, e.g. tableIdentifier.

@@ -56,6 +56,17 @@ private UniqueConstraint(
return columns;
}

private final String getTypeString() {
Member:

nit: don't need final for private method.

val executedDDL =
"""
|create temporary table TBL1 (
| a bigint not null,
Member:

Add some special field names, e.g.

`__source__` varchar(255),
```myname``` timestamp_ltz(3) metadata from 'timestamp',

1. Remove final modifier in UniqueConstraint#getTypeString
2. Rename sqlIdentifier to tableIdentifier both in ShowCreateTableOperation and TableEnvironmentImpl
3. Use asSerializableString and EncodingUtils#escapeIdentifier
4. Add more data types in CatalogTableITCase
5. Remove unnecessary String#join in WatermarkSpec
6. Update documentation
@LadyForest (Contributor):

@flinkbot run azure

@wuchong (Member) left a comment:

LGTM.

@wuchong closed this in a3437d5 Apr 21, 2021
wuchong pushed a commit that referenced this pull request Apr 21, 2021
[FLINK-16384][table][sql-client] Improve implementation of 'SHOW CREATE TABLE' statement.

This commit tries to
1. resolve the conflicts
2. revert the changes made on old planner
3. apply spotless formatting
4. fix DDL missing `TEMPORARY` keyword for temporary table
5. display table's full object path as catalog.db.table
6. support displaying the expanded query for view
7. add view test in CatalogTableITCase, and adapt sql client test to the new test framework
8. adapt docs

This closes #13011
xinggit pushed a commit to xinggit/flink that referenced this pull request Apr 22, 2021
sangyo7 added a commit to sangyo7/flink that referenced this pull request May 18, 2021
* [hotfix][docs][python] Fix the invalid url in state.md

* [hotfix][table-planner-blink] Fix bug for window join: plan is wrong if join condition contains 'IS NOT DISTINCT FROM'

Fix Flink-22098 caused by a mistake when rebasing

This closes apache#15695

* [hotfix][docs] Fix the invalid urls in iterations.md

* [FLINK-16384][table][sql-client] Support 'SHOW CREATE TABLE' statement

This closes apache#13011

* [FLINK-16384][table][sql-client] Improve implementation of 'SHOW CREATE TABLE' statement.

This commit tries to
1. resolve the conflicts
2. revert the changes made on old planner
3. apply spotless formatting
4. fix DDL missing `TEMPORARY` keyword for temporary table
5. display table's full object path as catalog.db.table
6. support displaying the expanded query for view
7. add view test in CatalogTableITCase, and adapt sql client test to the new test framework
8. adapt docs

This closes apache#13011

* [hotfix][docs][python] Add missing pages for PyFlink documentation

* [FLINK-22348][python] Fix DataStream.execute_and_collect which doesn't declare managed memory for Python operators

This closes apache#15665.

* [FLINK-22350][python][docs] Add documentation for user-defined window in Python DataStream API

This closes apache#15702.

* [FLINK-22354][table-planner] Fix the timestamp function return value precision isn't matching with declared DataType

This closes apache#15689

* [FLINK-22354][table] Remove the strict precision check of time attribute field

This closes apache#15689

* [FLINK-22354][table] Fix TIME field display in sql-client

This closes apache#15689

* [hotfix] Set default value of resource-stabilization-timeout to 10s

Additionally, set speed up the AdaptiveScheduler ITCases by configuring a very low
jobmanager.adaptive-scheduler.resource-stabilization-timeout.

* [FLINK-22345][coordination] Remove incorrect assertion in scheduler

When this incorrect assertion is violated, the scheduler can trip into an unrecoverable
failover loop.

* [hotfix][table-planner-blink] Fix the incorrect ExecNode's id in testIncrementalAggregate.out

After FLINK-22298 is finished, the ExecNode's id should always start from 1 in the json plan tests, while the testIncrementalAggregate.out was overrided by FLINK-20613

This closes apache#15698

* [FLINK-20654] Fix double recycling of Buffers in case of an exception on persisting

Exception can be thrown for example if task is being cancelled. This was leading to
same buffer being recycled twice. Most of the times that was just leading to an
IllegalReferenceCount being thrown, which was ignored, as this task was being cancelled.
However on rare occasions this buffer could have been picked up by another task
after being recycled for the first time, recycled second time and being picked up
by another (third task). In that case we had two users of the same buffer, which
could lead to all sort of data corruptions.

* [hotfix] Regenerate html configuration after 4be9aff

* [FLINK-22001] Fix forwarding of JobMaster exceptions to user

[FLINK-XXXX] Draft separation of leader election and creation of JobMasterService

[FLINK-XXXX] Continued work on JobMasterServiceLeadershipRunner

[FLINK-XXXX] Integrate RunningJobsRegistry, Cancelling state and termination future watching

[FLINK-XXXX] Delete old JobManagerRunnerImpl classes

[FLINK-22001] Add tests for DefaultJobMasterServiceProcess

[FLINK-22001][hotfix] Clean up ITCase a bit

[FLINK-22001] Add missing check for empty job graph

[FLINK-22001] Rename JobMasterServiceFactoryNg to JobMasterServiceFactory

This closes apache#15715.

* [FLINK-22396][legal] Remove unnecessary entries from sql hbase connector NOTICE file

This closes apache#15706

* [FLINK-22396][hbase] Remove unnecessary entries from shaded plugin configuration

* [FLINK-19980][table-common] Add a ChangelogMode.upsert() shortcut

* [FLINK-19980][table-common] Add a ChangelogMode.all() shortcut

* [FLINK-19980][table] Support fromChangelogStream/toChangelogStream

This closes apache#15699.

* [hotfix][table-api] Properly deprecate StreamTableEnvironment.execute

* [FLINK-22345][coordination] Catch pre-mature state restore for Operator Coordinators

* [FLINK-22168][table] Partition insert can not work with union all

This closes apache#15608

* [FLINK-22341][hive] Fix describe table for hive dialect

This closes apache#15660

* [FLINK-22000][io] Set a default character set in InputStreamReader

* [FLINK-22385][runtime] Fix type cast error in NetworkBufferPool

This closes apache#15693.

* [FLINK-22085][tests] Update TestUtils::tryExecute() to cancel the job after execution failure.

This closes apache#15713

* [hotfix][runtime] fix typo in ResourceProfile

This closes apache#15723

* [hotfix] Fix typo in TestingTaskManagerResourceInfoProvider

* [FLINK-21174][coordination] Optimize the performance of DefaultResourceAllocationStrategy

This commit optimize the performance of matching requirements with available/pending resources in DefaultResourceAllocationStrategy.

This closes apache#15668

* [FLINK-22177][docs][table] Add documentation for time functions and time zone support

This closes apache#15634

* [FLINK-22398][runtime] Fix incorrect comments in InputOutputFormatVertex (apache#15705)

* [FLINK-22384][docs] Fix typos in Chinese "fault-tolerance/state"  page (apache#15694)

* [FLINK-21903][docs-zh] Translate "Importing Flink into an IDE" page into Chinese

This closes apache#15721

* [FLINK-21659][coordination] Properly expose checkpoint settings for initializing jobs

* [hotfix][tests] Remove unnecessary field

* [FLINK-20723][tests] Ignore expected exception when retrying

* [FLINK-20723][tests] Allow retries to be defined per class

* [FLINK-20723][cassandra][tests] Retry on NoHostAvailableException

* [hotfix][docs] Fix typo in jdbc execution options

* [hotfix][docs] Fix typos

* [FLINK-12351][DataStream] Fix AsyncWaitOperator to deep copy StreamElement when object reuse is enabled apache#8321 (apache#8321)

* [FLINK-22119][hive][doc] Update document for hive dialect

This closes apache#15630

* [FLINK-20720][docs][python] Add documentation about output types for Python DataStream API

This closes apache#15733.

* [FLINK-20720][python][docs] Add documentation for ProcessFunction in Python DataStream API

This closes apache#15733.

* [FLINK-22294][hive] Hive reading fail when getting file numbers on different filesystem nameservices

This closes apache#15624

* [FLINK-22356][hive][filesystem] Fix partition-time commit failure when watermark is applied defined TIMESTAMP_LTZ column

This closes apache#15709

* [FLINK-22433][tests] Make CoordinatorEventsExactlyOnceITCase work with Adaptive Scheduler.

The test previously relied on an implicit contract that instances of OperatorCoordinators are never recreated
on the same JobManager. That implicit contract is no longer true with the Adaptive Scheduler.

This change adjusts the test to no longer make that assumption.

This closes apache#15739

* [hotfix][tests] Remove timout rule from KafkaTableTestBase

That way, we get thread dumps from the CI infrastructure when the test hangs.

* [FLINK-21247][table] Fix problem in MapDataSerializer#copy when there exists custom MapData

This closes apache#15569

* [FLINK-21923][table-planner-blink] Fix ClassCastException in SplitAggregateRule when a query contains both sum/count and avg function

This closes apache#15341

* [FLINK-19449][doc] Fix wrong document for lead and lag

* [FLINK-19449][table] Introduce LinkedListSerializer

* [FLINK-19449][table] Pass isBounded to AggFunctionFactory

* [FLINK-19449][table-planner] LEAD/LAG cannot work correctly in streaming mode

This closes apache#15747

* [FLINK-18199][doc] translate FileSystem SQL Connector page into chinese

This closes apache#13459

* [FLINK-22444][docs] Drop async checkpoint description of state backends in Chinese docs

* [FLINK-22445][python][docs] Add documentation for row-based operations

This closes apache#15757.

* [FLINK-22463][table-planner-blink] Fix IllegalArgumentException in WindowAttachedWindowingStrategy when two phase is enabled for distinct agg

This closes apache#15759

* [hotfix] Do not use ExecutorService.submit since it can swallow exceptions

This commit changes the KubernetesLeaderElector to use ExecutorService.execute instead of submit
which ensures that potential exceptions are forwarded to the fatal uncaught exeception handler.

This closes apache#15740.

* [FLINK-22431] Add information when and why the AdaptiveScheduler restarts or fails jobs

This commit adds info log statements to tell the user when and why it restarts or fails a job.

This closes apache#15736.

* [hotfix] Add debug logging to the states of the AdaptiveScheduler

This commit adds debug log statements to the states of the AdaptiveScheduler to log
whenever we ignore a global failure.

* [hotfix] Harden against FLINK-21376 by checking for null failure cause

In order to harden the AdaptiveScheduler against FLINK-21376, this commit checks whether a task
failure cause is null or not. In case of null, it will replace the failure with a generic cause.

* [FLINK-22301][runtime] Statebackend and CheckpointStorage type is not shown in the Web UI

[FLINK-22301][runtime] Statebackend and CheckpointStorage type is not shown in the Web UI

This closes apache#15732.

* [hotfix][network] Remove unused method BufferPool#getSubpartitionBufferRecyclers

* [FLINK-22424][network] Prevent releasing PipelinedSubpartition while Task can still write to it

This bug was happening when a downstream tasks were failing over or being cancelled. If all
of the downstream tasks released subpartition views, underlying memory segments/buffers could
have been recycled, while upstream task was still writting some records.

The problem is fixed by adding the writer (result partition) itself as one more reference
counted user of the result partition

* [FLINK-22085][tests] Remove timeouts from KafkaSourceLegacyITCase

* [FLINK-19606][table-runtime-blink] Refactor utility class JoinConditionWithFullFilters from AbstractStreamingJoinOperator

This closes apache#15752

* [hotfix][coordination] Add log for slot allocation in FineGrainedSlotManager

This closes apache#15748

* [FLINK-22074][runtime][test] Harden FineGrainedSlotManagerTest#testRequirementCheckOnlyTriggeredOnce in case deploying on a slow machine

This closes apache#15751

* [FLINK-22470][python] Make sure that the root cause of the exception encountered during compiling the job was exposed to users in all cases

This closes apache#15766.

* [hotfix][docs] Removed duplicate word

This closes apache#15756 .

* [hotfix][python][docs] Add missing debugging page for PyFlink documentation

* [FLINK-22136][e2e] Odd parallelism for resume_externalized_checkpoints was added to run-nightly-tests.sh.

* [FLINK-22136][e2e] Exception instead of system out is used for errors in DataStreamAllroundTestProgram

* [hotfix][python][docs] Clarify python.files option (apache#15779)

* [FLINK-22476][docs] Extend the description of the config option `execution.target`

This closes apache#15777.

* [FLINK-18952][python][docs] Add "10 minutes to DataStream API" documentation

This closes apache#15769.

* [hotfix][table-planner-blink] Fix unstable itcase in OverAggregateITCase#testRowNumberOnOver

This closes apache#15782

* [FLINK-22378][table] Derive type of SOURCE_WATERMARK() from time attribute

This closes apache#15730.

* [hotfix][python][docs] Fix compile issues

* [FLINK-22469][runtime-web] Fix NoSuchFileException in HistoryServer

* [FLINK-22289] Update JDBC XA sink docs

* [FLINK-22428][docs][table] Translate "Timezone" page into Chinese (apache#15753)

* [FLINK-22471][connector-elasticsearch] Remove commads from list

This is a responsibility of the Formatter implementation.

* [FLINK-22471][connector-elasticsearch] Do not repeat default value

* [FLINK-22471][connector-jdbc] Improve spelling and avoid repeating default values

* [FLINK-22471][connector-kafka] Use proper Description for connector options

* [FLINK-22471][connector-kinesis] Remove required from ConfigOption

This is defined by where the factory includes the option.

* [FLINK-22471][connector-kinesis] Use proper Description for connector options

* [FLINK-22471][table-runtime-blink] Remove repetition of default values

* [FLINK-22471][table-runtime-blink] Use proper Description for connector options

This closes apache#15764.

* [hotfix][python][docs] Fix flat_aggregate example in Python Doc

* [hotfix][python][docs] Add documentation to remind users to bundle Python UDF definitions when submitting the job

This closes apache#15790.

* [FLINK-17783][orc] Add array,map,row types support for orc row writer

This closes apache#15746

* [FLINK-22489][webui] Fix displaying individual subtasks backpressure-level

Previously (incorrectly) backpressure-level from the whole task was being displayed for
each of the subtasks.

* [FLINK-22479[Kinesis][Consumer] Potential lock-up under error condition

* [FLINK-22304][table] Refactor some interfaces for TVF based window to improve the extendability

This closes apache#15745

* [hotfix][python][docs] Fix python.archives docs

This closes apache#15783.

* [FLINK-20086][python][docs] Add documentation about how to override open() in UserDefinedFunction to load resources

This closes apache#15795.

* [FLINK-22438][metrics] Add numRecordsOut metric for Async IO

This closes apache#15791.

* [FLINK-22373][docs] Add Flink 1.13 release notes

This closes apache#15687

* [FLINK-22495][docs] Add Reactive Mode section to K8s

* [FLINK-21967] Add documentation on the operation of blocking shuffle

This closes apache#15701

* [FLINK-22232][tests] Add UnalignedCheckpointsStressITCase

* [FLINK-22232][network] Add task name to output recovery tracing.

* [FLINK-22232][network] Add task name to persist tracing.

* [FLINK-22232][network] More logging of network stack.

* [FLINK-22109][table-planner-blink] Resolve misleading exception message in invalid nested function

This closes apache#15523.

* [FLINK-22426][table] Fix several shortcomings that prevent schema expressions

This fixes a couple of critical bugs in the stack that prevented Table API
expressions to be used for schema declaration. Due to time constraints,
this PR could not make it into the 1.13 release but should be added to
the next bugfix release for a smoother user experience. For testing and
consistency, it exposes the Table API expression sourceWatermark() in Java,
Scala, and Python API.

This closes apache#15798

* [FLINK-22493] Increase test stability in AdaptiveSchedulerITCase.

This addresses the following problem in the testStopWithSavepointFailOnFirstSavepointSucceedOnSecond() test.

Once all tasks are running, the test triggers a savepoint, which intentionally fails, because of a test exception in a Task's checkpointing method. The test then waits for the savepoint future to fail, and the scheduler to restart the tasks. Once they are running again, it performs a sanity check whether the savepoint directory has been properly removed. In the reported run, there was still the savepoint directory around.

The savepoint directory is removed via the PendingCheckpoint.discard() method. This method is executed using the i/o executor pool of the CheckpointCoordinator. There is no guarantee that this discard method has been executed when the job is running again (and the executor shuts down with the dispatcher, hence it is not bound to job restarts).

* [FLINK-22510][core] Format durations with highest unit

When converting a configuration value to a string, durations were formatted
in nanoseconds regardless of their values. This produces serialized outputs
which are hard to understand for humans.

The functionality of formatting in the highest unit which allows the value
to be an integer already exists, thus we can simply defer to it to produce
a more useful result.

This closes apache#15809.

* [FLINK-22250][sql-parser] Add missing 'createSystemFunctionOnlySupportTemporary' entry in ParserResource.properties (apache#15582)

* [FLINK-22524][docs] Fix the incorrect Java groupBy clause in Table API docs (apache#15802)

* [FLINK-22522][table-runtime-blink] BytesHashMap prints many verbose INFO level logs (apache#15801)

* [FLINK-22539][python][docs] Restructure the Python dependency management documentation (apache#15818)

* [FLINK-22544][python][docs] Add the missing documentation about the command line options for PyFlink

* Update japicmp configuration for 1.13.0

* [hotfix][python][docs] Correct a few invalid links and typos in PyFlink documentation

* [FLINK-22368] Deque channel after releasing on EndOfPartition

...and don't enqueue the channel if it received EndOfPartition
previously.

Leaving a released channel enqueued may lead to
CancelTaskException which can prevent EndOfPartitionEvent
propagation and the job being stuck.

* [FLINK-14393][webui] Add an option to enable/disable cancel job in web ui

This closes apache#15817.

* [FLINK-22323] Fix typo in JobEdge#connecDataSet (apache#15647)

Co-authored-by: yunhua@dtstack.com <123456Lq>

* [hotfix][python] Add missing space to exception message

* [hotfix][docs] Fix typo

* [FLINK-22535][runtime] CleanUp is invoked for task even when the task fail during the restore

* [FLINK-22535][runtime] CleanUp is invoked despite of fail inside of cancelTask

* [FLINK-22432] Update upgrading.md with 1.13.x

* [FLINK-22253][docs] Update back pressure monitoring docs with new WebUI changes

* [hotfix][docs] Update unaligned checkpoint docs

* [hotfix][network] Remove redundant word from comment

* [FLINK-21131][webui] Alignment timeout displayed in checkpoint configuration(WebUI)

* [FLINK-22548][network] Remove illegal unsynchronized access to PipelinedSubpartition#buffers

After peeking last buffer, this last buffer could have been processed
and recycled by the task thread, before NetworkActionsLogger.traceRecover
would manage to check it's content causing IllegalReferenceCount exceptions.

Further more, this log seemed excessive and was accidentally added after
debugging session, so it's ok to remove it.

* [FLINK-22563][docs] Add migration guide for new StateBackend interfaces

Co-authored-by: Nico Kruber <nico.kruber@gmail.com>

This closes apache#15831

* [FLINK-22379][runtime] CheckpointCoordinator checks the state of all subtasks before triggering the checkpoint

* [FLINK-22488][hotfix] Update SubtaskGatewayImpl to specify the cause of sendEvent() failure when triggering task failover

* [FLINK-22442][CEP] Using scala api to change the TimeCharacteristic of the PatternStream is invalid

This closes apache#15742

* [FLINK-22573][datastream] Fix AsyncIO calls timeout on completed element.

As long as the mailbox is blocked, timers are not cancelled, such that a completed element might still get a timeout.
The fix is to check the completed flag when the timer triggers.

* [FLINK-22573][datastream] Fix AsyncIO calls timeout on completed element.

As long as the mailbox is blocked, timers are not cancelled, such that a completed element might still get a timeout.
The fix is to check the completed flag when the timer triggers.

* [FLINK-22233][table] Fix "constant" typo in PRIMARY KEY exception messages

This closes apache#15656

Co-authored-by: wuys <wuyongsheng@qtshe.com>

* [hotfix][hive] Add an ITCase that checks partition-time commit pretty well

This closes apache#15754

* [FLINK-22512][hive] Fix issue of calling current_timestamp with hive dialect for hive-3.1

This closes apache#15819

* [FLINK-22362][network] Improve error message when taskmanager can not connect to jobmanager

* [FLINK-22581][docs] Keyword CATALOG is missing in sql client doc (apache#15844)

* [hotfix][docs] Replace local failover with partial failover

* [FLINK-22266][hotfix] Do not clean up jobs in AbstractTestBase if the MiniCluster is not running

This closes apache#15829

* [hotfix][docs] Fix code tabs for state backend migration

* [hotfix][docs][python] Fix the example in intro_to_datastream_api.md

* [hotfix][docs][python] Add an overview page for Python UDFs

* [hotfix] Make reactive warning less strong, clarifications

* [hotfix][docs] Re-introduce note about FLINK_CONF_DIR

This closes apache#15845

* [hotfix][docs][python] Add introduction about the open method in Python DataStream API

* [FLINK-21095][ci] Remove legacy slot management profile

* [FLINK-22406][tests] Add RestClusterClient to MiniClusterWithClientResource

* [FLINK-22406][coordination][tests] Stabilize ReactiveModeITCase

* [FLINK-22419][coordination][tests] Rework RpcEndpoint delay tests

* [FLINK-22560][build] Move generic filters/transformers into general shade-plugin configuration

* [FLINK-22560][build] Filter maven metadata directory

* [FLINK-22560][build] Add dedicated name to flink-dist shade-plugin execution

* [FLINK-22555][build][python] Exclude leftover jboss files

* [hotfix] Ignore failing test reported in FLINK-22559

* [hotfix][docs] Fix typo in dependency_management.md

* [hotfix][docs] Mention new StreamTableEnvironment.fromDataStream in release notes

* [hotfix][docs] Mention new StreamTableEnvironment.fromDataStream in Chinese release notes

* [FLINK-22505][core] Limit the scale of Resource to 8

This closes apache#15815

* [FLINK-19606][table-runtime-blink] Introduce WindowJoinOperator and WindowJoinOperatorBuilder

This closes apache#15760

* [FLINK-22536][runtime] Promote the critical log in FineGrainedSlotManager to INFO level

This closes apache#15850

* [FLINK-22355][docs] Fix simple task manager memory model image

This closes apache#15862

* [FLINK-19606][table-planner] Introduce StreamExecWindowJoin and window join it cases

This closes apache#15479

* [hotfix][docs] Correct the examples in Python DataStream API

* [FLINK-17170][kinesis] Move KinesaliteContainer to flink-connector-kinesis.

This testcontainer will be used in an ITCase in the next commit.

Also move system properties required for test into pom.xml.

* [FLINK-17170][kinesis] Fix deadlock during stop-with-savepoint.

During stop-with-savepoint cancel is called under lock in legacy sources. Thus, if the fetcher is trying to emit a record at the same time, it cannot obtain the checkpoint lock. This behavior leads to a deadlock while cancel awaits the termination of the fetcher.
The fix is to mostly rely on the termination inside the run method. As a safe-guard, close also awaits termination where close is always caused without lock.

* [FLINK-21181][runtime] Wait for Invokable cancellation before releasing network resources

* [FLINK-22609][runtime] Generalize AllVerticesIterator

* [hotfix][docs] Fix image links

* [FLINK-22525][table-api] Fix gmt format in Flink from GMT-8:00 to GMT-08:00 (apache#15859)

* [FLINK-22559][table-planner] The consumed DataType of ExecSink should only consider physical columns

This closes apache#15864.

* [hotfix][core] Remove unused import

* [FLINK-22537][docs] Add documentation how to interact with DataStream API

This closes apache#15837.

* [FLINK-22596] Active timeout is not triggered if there were no barriers

The active timeout did not take effect if it elapsed before the first
barrier arrived. The reason is that we did not reset the future for
checkpoint complete on barrier announcement. Therefore we considered the
completed status for previous checkpoint when evaluating the timeout for
current checkpoint.

* [hotfix][test] Adds -e flag to interpret newline in the right way

* [FLINK-22566][test] Adds log extraction for the worker nodes

We struggled to get the logs of the node manager which made it hard to
investigate FLINK-22566 where there was a lag between setting up the YARN
containers and starting the TaskExecutor. Hopefully, the nodemanager logs
located on the worker nodes will help next time to investigate something like
that.

* [FLINK-22577][tests] Harden KubernetesLeaderElectionAndRetrievalITCase

This commit introduces closing logic to the TestingLeaderElectionEventHandler which would
otherwise forward calls after the KubernetesLeaderElectionDriver is closed.

This closes apache#15849.

* [FLINK-22604][table-runtime-blink] Fix NPE on bundle close when task failover after a failed task open

This closes apache#15863

* [hotfix] Disable broken savepoint tests tracked in FLINK-22067

* [FLINK-22407][build] Bump log4j to 2.24.1

- CVE-2020-9488

* [FLINK-22313][table-planner-blink] Redundant CAST in plan when selecting window start and window end in window agg (apache#15806)

* [FLINK-22523][table-planner-blink] Window TVF should throw helpful exception when specifying offset parameter (apache#15803)

* [FLINK-22624][runtime] Utilize the remain resource of new pending task managers to fulfill requirement in DefaultResourceAllocationStrategy

This closes apache#15888

* [FLINK-22413][WebUI] Hide Checkpointing page for batch jobs

* [FLINK-22364][doc] Translate the page of "Data Sources" to Chinese. (apache#15763)

* [FLINK-22586][table] Improve the precision dedivation for decimal arithmetics

This closes apache#15848

* [hotfix][e2e] Output and collect the logs for Kubernetes IT cases

* [FLINK-17857][test] Make K8s e2e tests could run on Mac

This closes apache#14012.

* [hotfix][ci] Use static methods

* [FLINK-22556][ci] Extend JarFileChecker to search for traces of incompatible licenses

* [FLINK-21700][security] Add an option to disable credential retrieval on a secure cluster (apache#15131)

* [FLINK-22408][sql-parser] Fix SqlDropPartitions unparse Error

This closes apache#15894

* [FLINK-21469][runtime] Implement advanceToEndOfEventTime for MultipleInputStreamTask

For stop with savepoint, StreamTask#advanceToEndOfEventTime() is called (in source tasks)
to advance to the max watermark. This PR implments advanceToEndOfEventTime for
MultipleInputStreamTask chained sources.

* [hotfix][docs] Fix all broken images

Co-authored-by: Xintong Song <6509172+xintongsong@users.noreply.github.com>

This closes apache#15865

* [hotfix][docs] Fix typo in k8s docs

This closes apache#15886

* [FLINK-22628][docs] Update state_processor_api.md

This closes apache#15892

* [hotfix] Add missing TestLogger to Kinesis tests

* [FLINK-22574] Adaptive Scheduler: Fix cancellation while in Restarting state.

The Canceling state of Adaptive Scheduler was expecting the ExecutionGraph to be in state RUNNING when entering the state.
However, the Restarting state is cancelling the ExecutionGraph already, thus the ExectionGraph can be in state CANCELING or CANCELED when entering the Canceling state.

Calling the ExecutionGraph.cancel() method in the Canceling state while being in ExecutionGraph.state = CANCELED || CANCELLED is not a problem.

The change is guarded by a new ITCase, as this issue affects the interplay between different AS states.

This closes apache#15882

* [FLINK-22640] [datagen] Fix DataGen SQL Connector does not support defining fields min/max option of decimal type field

This closes apache#15900

* [FLINK-22534][runtime][yarn] Set delegation token's service name as credential alias

This closes apache#15810

* [FLINK-22618][runtime] Fix incorrect free resource metrics of task managers

This closes apache#15887

* [FLINK-15064][table-planner-blink] Remove XmlOutput util class in blink planner since Calcite has fixed the issue

This closes apache#15911

* [FLINK-19796][table] Explicit casting shoule be made if the type of an element in `ARRAY/MAP` not equals with the derived component type

This closes apache#15906

* [FLINK-22475][table-common] Document usage of '#' placeholder in option keys

* [FLINK-22475][table-common] Exclude options with '#' placeholder from validation of required options

* [FLINK-22475][table-api-java-bridge] Add placeholder options for datagen connector

This closes apache#15896.
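
A sketch of the placeholder in use, assuming a TableEnvironment tEnv as in the previous example; the table name and schema are invented:

```java
// 'fields.#.min' / 'fields.#.max' are declared with the '#' placeholder and
// are therefore excluded from required-option validation; concrete keys
// substitute an actual field name. The DECIMAL min/max options rely on the
// FLINK-22640 fix above.
tEnv.executeSql(
        "CREATE TABLE gen_source (\n"
                + "  f0 INT,\n"
                + "  f1 DECIMAL(10, 2)\n"
                + ") WITH (\n"
                + "  'connector' = 'datagen',\n"
                + "  'fields.f0.min' = '1',\n"
                + "  'fields.f0.max' = '100',\n"
                + "  'fields.f1.min' = '0.01',\n"
                + "  'fields.f1.max' = '99.99'\n"
                + ")");
```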

* [FLINK-22511][python] Fix the bug of non-composite result type in Python TableAggregateFunction

This closes apache#15796.

* [hotfix][docs] Changed argument for toDataStream to Table

This closes apache#15923.

* [FLINK-22654][sql-parser] Fix SqlCreateTable#toString()/unparse() losing CONSTRAINTS and watermarks

This closes apache#15918

* [FLINK-22658][table] Remove Deprecated util class TableConnectorUtil (apache#15914)

* [FLINK-22592][runtime] numBuffersInLocal is always zero when using unaligned checkpoints

This closes apache#15915

* [FLINK-22400][hive] Fix NPE when converting Flink objects for Map

This closes apache#15712

* [FLINK-22649][python][table-planner-blink] Support StreamExecPythonCalc json serialization/deserialization

This closes apache#15913.

* [FLINK-22502][checkpointing] Don't tolerate checkpoint retrieval failures on recovery

Ignoring such failures and running with an incomplete
set of checkpoints can lead to consistency violations.

Instead, transient failures should be mitigated by
automatic job restart.

* [FLINK-22667][docs] Add missing slash

* [FLINK-22650][python][table-planner-blink] Support StreamExecPythonCorrelate json serialization/deserialization

This closes apache#15922.

* [FLINK-22652][python][table-planner-blink] Support StreamExecPythonGroupWindowAggregate json serialization/deserialization

This closes apache#15934.

* [FLINK-22666][table] Make structured type's fields more lenient during casting

Compare children individually for anonymous structured types. This
fixes issues with primitive fields and Scala case classes.

This closes apache#15935.
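
A rough illustration of the kind of conversion this fix unblocks; the User POJO is hypothetical:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class StructuredTypeCastExample {

    // Hypothetical POJO; its anonymous structured type has a primitive
    // `int` field, which previously could make the cast from the query's
    // row type fail even though the field types were compatible.
    public static class User {
        public int id;
        public String name;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        Table t = tEnv.sqlQuery("SELECT 1 AS id, 'bob' AS name");
        // Fields of the anonymous structured type are compared individually.
        tEnv.toDataStream(t, User.class).print();
        env.execute();
    }
}
```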

* [hotfix][table-planner-blink] Give more helpful exception for codegen structured types

* [FLINK-22620][orc] Drop BatchTableSource OrcTableSource and related classes

This removes the OrcTableSource and related classes including OrcInputFormat. Use
the filesystem connector with an ORC format as a replacement. It is possible to
read via Table & SQL API and convert the Table to DataStream API if necessary.
DataSet API is not supported anymore.

This closes apache#15891.
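
A minimal sketch of the suggested replacement, assuming a TableEnvironment tEnv; path and schema are invented:

```java
// Filesystem connector with the ORC format instead of OrcTableSource.
tEnv.executeSql(
        "CREATE TABLE orc_source (\n"
                + "  id BIGINT,\n"
                + "  payload STRING\n"
                + ") WITH (\n"
                + "  'connector' = 'filesystem',\n"
                + "  'path' = 'file:///tmp/orc-data',\n"
                + "  'format' = 'orc'\n"
                + ")");
// Read via SQL; with a StreamTableEnvironment, the table can also be
// converted to a DataStream:
// tEnv.toDataStream(tEnv.from("orc_source")).print();
```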

* [FLINK-22651][python][table-planner-blink] Support StreamExecPythonGroupAggregate json serialization/deserialization

This closes apache#15928.

* [FLINK-22653][python][table-planner-blink] Support StreamExecPythonOverAggregate json serialization/deserialization

This closes apache#15937.

* [FLINK-12295][table] Fix comments in MinAggFunction and MaxAggFunction

This closes apache#15940

* [FLINK-20695][ha] Clean ha data for job if globally terminated

At the moment Flink only cleans up the HA data (e.g. K8s ConfigMaps or
ZooKeeper nodes) while shutting down the cluster. This is not enough for
a long-running session cluster to which you submit multiple jobs. In
this commit, we clean up the data for a particular job if it reaches a
globally terminal state.

This closes apache#15561.

* [FLINK-22067][tests] Wait for vertices to be running using API

* [hotfix][tests] Remove try/catch from SavepointWindowReaderITCase.testApplyEvictorWindowStateReader

* [FLINK-22622][parquet] Drop BatchTableSource ParquetTableSource and related classes

This removes the ParquetTableSource and related classes including various ParquetInputFormats.
Use the filesystem connector with a Parquet format as a replacement. It is possible to
read via Table & SQL API and convert the Table to DataStream API if necessary.
DataSet API is not supported anymore.

This closes apache#15895.

* [hotfix] Ignore failing KinesisITCase tracked in FLINK-22613

* [FLINK-19545][e2e] Add e2e test for native Kubernetes HA

The HA e2e test will start a Flink application first and wait for three successful checkpoints. Then it kills the JobManager. A new JobManager should be launched and recover the job from the latest successful checkpoint. Finally, the job is cancelled and all the K8s resources should be cleaned up automatically.

This closes apache#14172.

* [FLINK-22656] Fix typos

* [hotfix][runtime] Fixes JavaDoc for RetrievableStateStorageHelper

RetrievableStateStorageHelper is not only used by ZooKeeperStateHandleStore but
also by KubernetesStateHandleStore.

* [hotfix][runtime] Cleans up unnecessary annotations

* [FLINK-22494][kubernetes] Introduces PossibleInconsistentStateException

We experienced cases where the ConfigMap was updated but the corresponding HTTP
request failed due to connectivity issues. PossibleInconsistentStateException
is used to reflect cases where it's not clear whether the data was actually
written or not.
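
A minimal sketch of the pattern, not the actual Flink implementation (the surrounding class and method names are invented):

```java
// When the HTTP call to update the ConfigMap fails, we cannot tell whether
// the write went through, so a dedicated exception is surfaced instead of
// claiming a clean failure.
final class ConfigMapWriteSketch {

    static void updateConfigMap(Runnable httpUpdateCall)
            throws PossibleInconsistentStateException {
        try {
            httpUpdateCall.run();
        } catch (RuntimeException e) {
            // The update may or may not have been applied on the server.
            throw new PossibleInconsistentStateException(
                    "ConfigMap update may have been partially applied", e);
        }
    }

    // Stand-in for Flink's own exception type of the same name.
    static final class PossibleInconsistentStateException extends Exception {
        PossibleInconsistentStateException(String message, Throwable cause) {
            super(message, cause);
        }
    }
}
```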

* [FLINK-22494][ha] Refactors TestingLongStateHandleHelper to operate on references

The previous implementation stored the state in the StateHandle. This causes
problems when deserializing the state, as a new instance is created that does
not point to the actual state but is a copy of it.

This refactoring introduces LongStateHandle handling the actual state and
LongRetrievableStateHandle referencing this handle.

* [FLINK-22494][ha] Introduces PossibleInconsistentState to StateHandleStore

* [FLINK-22494][runtime] Refactors CheckpointsCleaner to handle also discardOnFailedStoring

* [FLINK-22494][runtime] Adds PossibleInconsistentStateException handling to CheckpointCoordinator

This closes apache#15832.

* [FLINK-22515][docs] Add documentation for GSR-Flink Integration

* [FLINK-20487][table-planner-blink] Remove restriction on StreamPhysicalGroupWindowAggregate which only supported insert-only input nodes

This closes apache#14830

* [FLINK-22623][hbase] Drop BatchTableSource/Sink HBaseTableSource/Sink and related classes

This removes the HBaseTableSource/Sink and related classes including various HBaseInputFormats and
HBaseSinkFunction. It is possible to read via Table & SQL API and convert the Table to DataStream API
(or vice versa) if necessary. DataSet API is not supported anymore.

This closes apache#15905.

* [hotfix][hbase] Fix warnings around decimals in HBaseTestBase

* [FLINK-22636][zk] Group job specific zNodes under /jobs zNode

In order to better clean up job specific HA services, this commit changes the layout of the
zNode structure so that the JobMaster leader, checkpoints and checkpoint counter are now grouped
below the jobs/ zNode.

Moreover, this commit groups the leaders of the cluster components (Dispatcher, ResourceManager,
RestServer) under /leader/process/latch and /leader/process/connection-info.

This closes apache#15893.
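
For illustration, the resulting layout looks roughly as follows (the zNode names here are simplified, not the exact ones):

```
/flink/<cluster-id>/
    jobs/<job-id>/                     JobMaster leader, checkpoints and
                                       checkpoint counter of this job
    leader/process/latch               leader latches of Dispatcher,
    leader/process/connection-info     ResourceManager and RestServer
```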

* [FLINK-22696][tests] Enable Confluent Schema Registry e2e test on jdk 11

Co-authored-by: Dian Fu <dianfu@apache.org>
Co-authored-by: Jing Zhang <beyond1920@126.com>
Co-authored-by: Shengkai <1059623455@qq.com>
Co-authored-by: Jane Chan <qingyue.cqy@gmail.com>
Co-authored-by: huangxingbo <hxbks2ks@gmail.com>
Co-authored-by: Leonard Xu <xbjtdcq@gmail.com>
Co-authored-by: Till Rohrmann <trohrmann@apache.org>
Co-authored-by: Stephan Ewen <sewen@apache.org>
Co-authored-by: godfreyhe <godfreyhe@163.com>
Co-authored-by: Piotr Nowojski <piotr.nowojski@gmail.com>
Co-authored-by: Robert Metzger <rmetzger@apache.org>
Co-authored-by: Dawid Wysakowicz <dwysakowicz@apache.org>
Co-authored-by: Timo Walther <twalthr@apache.org>
Co-authored-by: Jingsong Lee <jingsonglee0@gmail.com>
Co-authored-by: Rui Li <lirui@apache.org>
Co-authored-by: dbgp2021 <80257284+dbgp2021@users.noreply.github.com>
Co-authored-by: sharkdtu <sharkdtu@tencent.com>
Co-authored-by: Dong Lin <lindong28@gmail.com>
Co-authored-by: chuixue <chuixue@dtstack.com>
Co-authored-by: Yangze Guo <karmagyz@gmail.com>
Co-authored-by: kanata163 <35188210+kanata163@users.noreply.github.com>
Co-authored-by: zhaoxing <zhaoxing@dayuwuxian.com>
Co-authored-by: Chesnay Schepler <chesnay@apache.org>
Co-authored-by: Cemre Mengu <cemremengu@gmail.com>
Co-authored-by: paul8263 <xzhangyao@126.com>
Co-authored-by: Jark Wu <jark@apache.org>
Co-authored-by: zhangjunfan <zuston.shacha@gmail.com>
Co-authored-by: Shuo Cheng <mf1533007@smail.nju.edu.cn>
Co-authored-by: Tartarus0zm <zhangmang1@163.com>
Co-authored-by: JingsongLi <lzljs3620320@aliyun.com>
Co-authored-by: Michael Li <brighty916@gmail.com>
Co-authored-by: Yun Tang <myasuka@live.com>
Co-authored-by: SteNicholas <programgeek@163.com>
Co-authored-by: mans2singh <mans2singh@users.noreply.github.com>
Co-authored-by: Anton Kalashnikov <kaa.dev@yandex.ru>
Co-authored-by: yiksanchan <evan.chanyiksan@gmail.com>
Co-authored-by: Tony Wei <tony19920430@gmail.com>
Co-authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Co-authored-by: Yuan Mei <yuanmei.work@gmail.com>
Co-authored-by: hackergin <jinfeng1995@gmail.com>
Co-authored-by: Ingo Bürk <ingo.buerk@tngtech.com>
Co-authored-by: wangwei1025 <jobwangwei@yeah.net>
Co-authored-by: Danny Cranmer <dannycranmer@apache.org>
Co-authored-by: shuo.cs <shuo.cs@alibaba-inc.com>
Co-authored-by: zhangzhengqi3 <zhangzhengqi5@jd.com>
Co-authored-by: Yun Gao <gaoyunhenhao@gmail.com>
Co-authored-by: Arvid Heise <arvid@ververica.com>
Co-authored-by: MaChengLong <592577182@qq.com>
Co-authored-by: zhaown <51357674+chaozwn@users.noreply.github.com>
Co-authored-by: Huachao Mao <huachaomao@gmail.com>
Co-authored-by: GuoWei Ma <guowei.mgw@gmail.com>
Co-authored-by: Roman Khachatryan <khachatryan.roman@gmail.com>
Co-authored-by: fangyue1 <fangyuefy@163.com>
Co-authored-by: Jacklee <lqjacklee@126.com>
Co-authored-by: zhang chaoming <72908278+ZhangChaoming@users.noreply.github.com>
Co-authored-by: sasukerui <199181qr@sina.com>
Co-authored-by: Seth Wiesman <sjwiesman@gmail.com>
Co-authored-by: chennuo <chennuo@didachuxing.com>
Co-authored-by: wysstartgo <wysstartgo@163.com>
Co-authored-by: wuys <wuyongsheng@qtshe.com>
Co-authored-by: 莫辞 <luoyuxia.luoyuxia@alibaba-inc.com>
Co-authored-by: gentlewangyu <yuwang0917@gmail.com>
Co-authored-by: Youngwoo Kim <ywkim@apache.org>
Co-authored-by: HuangXiao <hx36w35@163.com>
Co-authored-by: Roc Marshal <64569824+RocMarshal@users.noreply.github.com>
Co-authored-by: Leonard Xu <xbjtdcq@163.com>
Co-authored-by: Matthias Pohl <matthias@ververica.com>
Co-authored-by: lincoln lee <lincoln.86xy@gmail.com>
Co-authored-by: Senhong Liu <amazingliux@gmail.com>
Co-authored-by: wangyang0918 <danrtsey.wy@alibaba-inc.com>
Co-authored-by: aidenma <aidenma@tencent.com>
Co-authored-by: Kevin Bohinski <kbohinski@users.noreply.github.com>
Co-authored-by: yangqu <quyanghaoren@126.com>
Co-authored-by: Terry Wang <zjuwangg@foxmail.com>
Co-authored-by: wangxianghu <wangxianghu@apache.org>
Co-authored-by: hehuiyuan <hehuiyuan@jd.com>
Co-authored-by: lys0716 <luystu@gmail.com>
Co-authored-by: Yi Tang <ssnailtang@gmail.com>
Co-authored-by: LeeJiangchuan <lijiangchuan_ljc@yeah.net>
Co-authored-by: Linyu <yaoliny@amazon.com>
Co-authored-by: Fabian Paul <fabianpaul@ververica.com>