Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-14129 Add Database Dialect Service #9640

Closed
wants to merge 5 commits into from

Conversation

exceptionfactory
Copy link
Contributor

Summary

NIFI-14129 Adds a DatabaseDialectService Controller Service interface in a new nifi-database-dialect-service-api as a migration path the existing internal DatabaseAdapter interface and associated implementations.

The new nifi-database-dialect-service-nar includes a StandardDatabaseDialectService implementation that provides a generic implementation for the minimum Statement Types of ALTER, CREATE, and SELECT.

The introduction of the DatabaseDialectService applies to the following Processors:

  • PutDatabaseRecord
  • GenerateTableFetch
  • QueryDatabaseTable
  • QueryDatabaseTableRecord
  • UpdateDatabaseTable

The service also applies to the DatabaseParameterProvider.

This pull request provides an interim transitional strategy for moving away from the DatabaseAdapter to the new DatabaseDialectService. The initial set of changes preserve existing Database Type properties that use DatabaseAdapter implementations and wraps interacts with the DatabaseAdapter in a new DatabaseAdapterDatabaseDialectService.

Existing Database Type properties include a new value named Database Dialect Service that enables the dependent Database Dialect Service property for Controller Service implementations.

This wrapping approach avoids the need to convert existing DatabaseAdapter implementations as part of this change, allowing existing Processor configurations to work without changes, while also support incremental migration of DatabaseAdapter implementations to new DatabaseDialectService implementations.

The DatabaseDialectService interface has a simplified contract method named getStatement() which takes a StatementRequest and returns a StatementResponse containing rendered SQL. The StatementRequest interface has standard implementations, and a QueryStatementRequest extended interface for describing queries. The getSupportedStatementTypes() provides a concise method for implementations to indicate supported statements.

Changes include modifications to existing unit tests that involved direct access to Processor methods using the DatabaseAdapter. Existing unit and integration tests against public methods and behaviors remain in place to validate implementation changes.

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

- Added database-dialect-service-api
- Added Standard Database Dialect Service implementation
- Added Database Adapter implementation
- Added Database Dialect Service property descriptor to Database Processors
- Refactored Database Processors with optional Database Dialect Service
Copy link
Contributor

@pvillard31 pvillard31 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge PR, thanks @exceptionfactory !

I have been through extensive testing all day using Postgres and MySQL databases. I have tested many cases before and after this PR (INSERTs, UPDATEs, UPSERTs, DELETEs). Everything is working as expected and I didn't catch any regression / breaking change.

I also tried the new Standard Database Dialect controller service with basic statements and it worked well. I believe this new interface is covering all of the needs we may have in the future while still being very generic.

The approach you used to keep the "old" way and the "new" way is really cool and it is a great addition! Given the size of the PR, I'll wait until some day next week before merging in case someone else in the community wants to give it a try / to provide feedback. I'm a +1 as far as I'm concerned.

@tpalfy
Copy link
Contributor

tpalfy commented Jan 20, 2025

Thanks for this update @exceptionfactory and @pvillard31 for the extensive testing.

I'm doing a review for the code and planning to test it on Oracle as well.
Would need a couple of days to finish if that's okay.

@exceptionfactory
As I see the essence of the backward compatibility is duplicated in the processors:

    protected DatabaseDialectService getDatabaseDialectService(final PropertyContext context) {
        final DatabaseDialectService databaseDialectService;
        final String databaseType = context.getProperty(DB_TYPE).getValue();
        if (DatabaseDialectServiceDatabaseAdapter.NAME.equals(databaseType)) {
            databaseDialectService = context.getProperty(DATABASE_DIALECT_SERVICE).asControllerService(DatabaseDialectService.class);
        } else {
            databaseDialectService = new DatabaseAdapterDatabaseDialectService(databaseType);
        }
        return databaseDialectService;
    }

Could we move the logic into the DatabaseAdapterDescriptor class or somewhere else in one place?

@exceptionfactory
Copy link
Contributor Author

Could we move the logic into the DatabaseAdapterDescriptor class or somewhere else in one place?

Thanks for the initial feedback @tpalfy. The Database Type properties have different names depending on the referencing classes, which required the creation of unique Property Descriptors. I pushed an update to streamline most of the instantiation process, taking the resolved Database Type to avoid confusion related to the Database Type property descriptor.


if (StatementType.ALTER == statementType) {
final List<String> statements = databaseAdapter.getAlterTableStatements(tableDefinition.tableName(), columnDescriptions, true, true);
sql = statements.getFirst();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory a DatabaseAdapter may return a separate "ADD COLUMN" for each new individual column. The Derby adapter does this in practice (although exists only in tests).

Does this work in such cases?

If we rely on this feature being basically deprecated and support only a single ALTER statement, we might want to add some comments/docs to clarify this or change the interface altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for highlighting this point @tpalfy.

To clear up the confusion, and since Apache Derby usage is focused on testing, I changed the DatabaseAdapter interface definition for alter tables to return a single string, aligning the approach with all other statements. This clarifies the behavior and works with existing Derby unit tests that exercise adding a single column.

- Adjusted Derby implementation to throw an exception for more than one column in an ALTER TABLE statement construction
public interface QueryStatementRequest extends StatementRequest {
Optional<String> derivedTable();

Collection<QueryClause> queryClauses();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing.

If I understand correctly, there can be 0 or 1 WHERE and 0 or 1 ORDER BY clause.
When the values for those are read, the same logic is used in StandardDatabaseDialectService and in DatabaseAdapterDatabaseDialectService:

            final Optional<QueryClause> whereQueryClause = queryStatementRequest.queryClauses().stream()
                    .filter(queryClause -> QueryClauseType.WHERE == queryClause.queryClauseType())
                    .findFirst();
            final Optional<QueryClause> orderByQueryClause = queryStatementRequest.queryClauses().stream()
                    .filter(queryClause -> QueryClauseType.ORDER_BY == queryClause.queryClauseType())
                    .findFirst();

Is there a reason why don't have a more explicit and strongly typed approach, with an Optional and an Optional?

This can actually lead to an exception. See my comment on DatabaseParameterProvider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. In the initial stage of refactoring, I considered the possibility for other types of query clauses. As the implementation played out, I agree that switching from the collection to optional properties for whereClause and orderByClause would be easier to follow and maintain. I will make adjustments and push an updated version.

import java.util.stream.Collectors;

/**
* Transitional implementation of Database Dialect Service bridging to existing Database Adapters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't clear that this is an irregular ControllerService in a sense that it doesn't have a framework-managed lifecycle. Could be useful to add this technical information to the doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the refactoring, this class does not need to public at all, as the instantiation is localized to the DatabaseAdapterDescriptor. I will update the comment to indicate that it is an internal implementation for clarity.

"""
)
@Tags({ "Relational", "Database", "JDBC", "SQL" })
public class StandardDatabaseDialectService extends AbstractControllerService implements DatabaseDialectService {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought for consideration: This DialectService is similar in concept to the GenericDatabaseAdapter - could be easier to get the whole picture with the name GenericDatabaseDialectService

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered the Generic prefix for this service implementation, but decided to go with Standard as that is more common for other service implementations. I also considered other names like ANSI, but it seemed like Standard was clear, without necessarily exposing some of the previous particular naming conventions around the internal DatabaseAdapter interface.


Nullable nullable();

Optional<String> defaultValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The defaultValue() doesn't seem to be called from anywhere. Is it intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. This was carried over from the current structure, but as you noted, it is not used, so I will look at removing it.

* @param pageRequest Page Request can be empty
*/
public record StandardQueryStatementRequest(
StatementType statementType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the StatementType anything else but SELECT for a QueryStatementRequest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at this time, so it could be defaulted to SELECT. Other use cases would be theoretical right now, perhaps such as a SELECT INTO or something along those lines.

List.of(new QueryClause(QueryClauseType.WHERE, whereClause)),
Optional.empty()
);
final StatementResponse statementResponse = databaseDialectService.getStatement(queryStatementRequest);
Copy link
Contributor

@tpalfy tpalfy Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the optionalSQL WHERE clause property is not set (null), the generated select statement will have a "WHERE null" clause, resulting in an exception (at least it did when testing on Oracle).

- Replaced Collection of QueryClauses with Optional values for where clause and order by clause in QueryStatementRequest
- Removed defaultValue() from ColumnDefinition
- Moved internal DatabaseAdapterDatabaseDialectService and set to package-private visibility for internal use
@exceptionfactory
Copy link
Contributor Author

Thanks for the helpful feedback @tpalfy, I pushed an update that addresses your comments. Replacing the QueryClause record with Optional values for the clauses and removing the defaultValue from ColumnDefinition helped simplify the implementation in several places.

@asfgit asfgit closed this in 253a076 Jan 22, 2025
@tpalfy
Copy link
Contributor

tpalfy commented Jan 22, 2025

LGTM
Thank you for your work @exceptionfactory and @pvillard31 for the review.
Merged into main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants