Add Amazon Comprehend Document Classifier #40287

Merged


gopidesupavan
Member

This PR adds Amazon Comprehend document classifier support: docs, operator, sensor, trigger, waiter, unit tests, and a system test.

Manually tested in Breeze with:

- wait_for_completion=False with a Sensor
- deferrable=True
- wait_for_completion=True
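
The three modes above differ mainly in who waits for training to finish. As a rough, library-free sketch (not the provider's actual code), `wait_for_completion=True` amounts to a blocking poll on the classifier status, while `wait_for_completion=False` returns immediately and leaves polling to a Sensor, and `deferrable=True` hands it to a Trigger. The `describe` callable and the `wait_until_trained` helper are hypothetical stand-ins introduced for illustration; the status names are those Amazon Comprehend reports for document classifiers, and the grouping into success and failure states is an assumption.

```python
import time
from typing import Callable

# Status values Amazon Comprehend reports for a document classifier.
# Which ones count as failures is an assumption made for this sketch.
SUCCESS_STATES = {"TRAINED", "TRAINED_WITH_WARNING"}
FAILURE_STATES = {"IN_ERROR", "STOPPED", "STOP_REQUESTED", "DELETING"}


def wait_until_trained(
    describe: Callable[[], str],
    delay: float = 60.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `describe` until the classifier reaches a terminal state.

    `describe` is a hypothetical callable returning the current status string.
    """
    for _ in range(max_attempts):
        status = describe()
        if status in SUCCESS_STATES:
            return status
        if status in FAILURE_STATES:
            raise RuntimeError(f"Classifier training failed with status {status}")
        sleep(delay)
    raise TimeoutError(f"Classifier not trained after {max_attempts} attempts")


# Simulated status sequence standing in for real API responses.
statuses = iter(["SUBMITTED", "TRAINING", "TRAINING", "TRAINED"])
final_status = wait_until_trained(
    lambda: next(statuses), delay=0, sleep=lambda _: None
)
print(final_status)
```

In a real DAG the blocking variant holds a worker slot for the whole training run, which is the usual reason to prefer the Sensor or deferrable modes for jobs that take several minutes.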

For the system test, I used two documents from the AWS samples and created multiple copies of each, since the classifier requires a minimum of 10 training documents per label. With this limited number of labels and documents, training the classifier takes at most 10 to 15 minutes. This is the minimal setup I was able to get running, so it can be executed in the daily system test suite.
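
The duplication step described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a two-column `label,text` CSV layout, and the labels, sample texts, and helper names (`build_training_rows`, `to_csv`) are hypothetical. The system test in this PR may prepare its input differently.

```python
import csv
import io
from collections import Counter

MIN_DOCS_PER_LABEL = 10  # minimum per label, as stated in the PR description


def build_training_rows(
    samples: dict[str, str], copies: int = MIN_DOCS_PER_LABEL
) -> list[tuple[str, str]]:
    """Repeat each (label, text) sample so every label has `copies` rows."""
    if copies < MIN_DOCS_PER_LABEL:
        raise ValueError(f"Need at least {MIN_DOCS_PER_LABEL} documents per label")
    return [(label, text) for label, text in samples.items() for _ in range(copies)]


def to_csv(rows: list[tuple[str, str]]) -> str:
    """Serialize rows as CSV text with no header row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical labels and texts standing in for the two AWS sample documents.
samples = {
    "LABEL_A": "Text of the first sample document",
    "LABEL_B": "Text of the second sample document",
}
rows = build_training_rows(samples)
training_csv = to_csv(rows)
print(Counter(label for label, _ in rows))
```

Exact copies satisfy the per-label minimum but carry no extra signal, which is acceptable for a system test that only checks that training completes.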



@gopidesupavan gopidesupavan changed the title Add comprehend document classifier Add Amazon Comprehend Document Classifier Jun 17, 2024
Contributor

@vincbeck vincbeck left a comment


Just 2 comments but overall fantastic work!!

@gopidesupavan
Member Author

> Just 2 comments but overall fantastic work!!

Thank you 😄

@vincbeck vincbeck merged commit d5fb711 into apache:main Jun 19, 2024
51 checks passed
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
2 participants