Our current approach to starters is not cohesive and has no clear strategy. Investigate ways to change this to help improve adoption numbers. This builds on concept 2 of the modular Kedro motivation work described in #2388
Concept 2: Improve starter journey to increase accessibility of Kedro
This ticket describes an alternative approach to starters that is complementary to our Kedro Utilities proposal. As part of the Kedro utilities work, it came up that the proposed utilities and our starters had some similarities. Upon breaking down the structure of the existing starters, some patterns and inconsistencies started to emerge.
Furthermore, the current list of starters felt disparate and broad in its goals. There was a general leaning towards showcasing how Kedro could integrate with other libraries, such as pyspark and astro-airflow, through the use of an ‘example starter project’.
Possible Implementation
I propose that we combine our current concept of starters with our new utility modules workflow. At project creation, users will be asked to choose from different components that they want to add to their project.
Continuing with the theme of a simplified project starter with add-ons, every resulting project would start from the same basic template. Building on this, if our team chooses to enforce a more consistent way of providing ‘example code’ (i.e. default node and pipeline code and a consistent test directory), this would also improve our users’ ability to mix and match examples.
Technical Details: cookiecutter allows you to initialise and add code based on booleans. This feature should enable us to adapt the ‘basic’ template based on a set of flags provided by the user on project creation.
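As a rough illustration of the boolean-flag idea, the sketch below prunes files from a generated project when the matching add-on was not selected. It is only a sketch under stated assumptions: the add-on names, the file paths in `ADD_ON_PATHS`, and the function `prune_unselected` are all hypothetical and not part of Kedro or cookiecutter. In a real cookiecutter template, logic of this kind would typically live in `hooks/post_gen_project.py`, with the flags rendered into the hook through Jinja variables.

```python
import shutil
from pathlib import Path

# Hypothetical mapping from add-on flag to the template paths it owns.
# None of these names or paths are taken from the real Kedro template.
ADD_ON_PATHS = {
    "testing": ["tests"],
    "docs": ["docs"],
    "pyspark": ["conf/base/spark.yml"],
}


def prune_unselected(project_dir, flags):
    """Delete the paths of every add-on whose flag is missing or False.

    Returns the relative paths that were actually removed.
    """
    project_dir = Path(project_dir)
    removed = []
    for add_on, rel_paths in ADD_ON_PATHS.items():
        if flags.get(add_on, False):
            continue  # add-on was selected, keep its files
        for rel in rel_paths:
            target = project_dir / rel
            if target.is_dir():
                shutil.rmtree(target)
                removed.append(rel)
            elif target.exists():
                target.unlink()
                removed.append(rel)
    return removed
```

Starting from one full template and deleting what was not requested keeps a single source of truth for the ‘basic’ project, at the cost of the hook needing to know which files each add-on owns.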
Integration Add-Ons
Goal: allow Kedro to support third-party libraries
databricks
pyspark
astro-airflow
flake8, black, isort (linting)
pytest (testing)
Logging
Kedro-Viz (may have dependency on example pipeline)
experiment tracking
plotly integration
matplotlib integration
Example Projects
Goal: showcase Kedro features; as a team, we show others how to use Kedro
Spaceflights
A complete data-processing pipeline
A complete project with Kedro-Viz experiment tracking set up
Initial Prototype (WIP)
Project Add-Ons
================
Here you can select which add-ons you'd like to include.
Don't worry if you change your mind; you can always add or remove these later.
To read more about these utilities and what they do visit: kedro.org/
Add-Ons
1) Linting : Provides linting set up with Flake8, Black and isort
2) Testing : Provides testing set up with pytest
3) Logging : Provides more logging options
4) Documentation: Provides documentation setup with Sphinx
5) Databricks: Provides set up for working with Databricks
6) PySpark: Provides set up configuration for working with PySpark
7) Airflow: Provides minimal setup to deploy a pipeline to Airflow using Astronomer
8) Kedro-Viz: Provides Kedro's native visualisation tool
8a) Plotly: Provides interactive pipeline visualisations
8b) Experiment-Tracking: Sets up experiment tracking, to compare runs
Which add-ons would you like to include in your project? [1-4/all/1,3/none]:
Would you like to include an example pipeline? [y/n]:
Note: The flow for kedro-viz as an add-on needs further work and I am working with @NeroOkwa and the Viz team on this.
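The selection prompt in the prototype accepts a range (`1-4`), a comma-separated list (`1,3`), `all`, or `none`. A minimal sketch of how such input could be parsed is shown below. The function name `parse_add_on_selection` and the `ADD_ONS` table are hypothetical, the numbering simply mirrors the prototype menu, and the sub-options `8a` and `8b` are deliberately left out because their flow is still undecided.

```python
# Numbering mirrors the prototype menu; sub-options 8a/8b are not handled.
ADD_ONS = {
    1: "Linting",
    2: "Testing",
    3: "Logging",
    4: "Documentation",
    5: "Databricks",
    6: "PySpark",
    7: "Airflow",
    8: "Kedro-Viz",
}


def parse_add_on_selection(selection, available=ADD_ONS):
    """Turn input such as '1-4', '1,3', 'all' or 'none' into sorted add-on numbers.

    Raises ValueError for anything that is not a known add-on number.
    """
    text = selection.strip().lower()
    if text == "none":
        return []
    if text == "all":
        return sorted(available)

    chosen = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"Invalid range: {part!r}")
            numbers = range(start, end + 1)
        else:
            numbers = [int(part)]
        for number in numbers:
            if number not in available:
                raise ValueError(f"Unknown add-on: {number}")
            chosen.add(number)
    return sorted(chosen)
```

For example, `parse_add_on_selection("1,3")` returns `[1, 3]`, and mixed input such as `"1-2,5"` is also accepted, which would make the add-ons stackable from a single prompt.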
Beyond the Add-Ons
Add-ons will be supported by documentation on how to insert these integrations manually
A phase 2 rollout can investigate the ability to plug add-ons into existing projects
the starter repo can be used to showcase each ‘add-on’ and also facilitate the git-clone workflow
add-ons should be stackable
the example pipeline will need to account for the user's integration choices
list of add-ons should grow as we support more third-party libraries
Create your own starter
community plugins can be installed and then displayed as an option on project creation
this would expand on our current ‘custom starter workflow’
Allow community participation (GetInData)
MLflow
Snowflake
We could keep a list of ‘approved’ community starters to be shared around, so that we don't have to do all the integration work ourselves.
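One possible way to surface installed community plugins at project creation is Python's entry point mechanism. The sketch below only lists the names registered under a group. The group name `kedro.add_ons` and the function `discover_add_ons` are assumptions made for illustration and are not an existing Kedro API.

```python
from importlib.metadata import entry_points


def discover_add_ons(group="kedro.add_ons"):
    """Return the sorted names of entry points registered under `group`.

    The default group name is hypothetical; a plugin would declare it in
    its own packaging metadata.
    """
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # Newer Python versions expose a selectable collection.
        selected = all_entry_points.select(group=group)
    else:
        # Older versions return a plain mapping of group name to entries.
        selected = all_entry_points.get(group, [])
    return sorted(entry_point.name for entry_point in selected)
```

With no community plugins installed, the function returns an empty list, so the creation flow could fall back to the built-in add-ons only.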
Design Next Steps
Prototype new creation flow options
Investigate details of Kedro-Viz as an add-on
Investigate standalone-datacatalog journey separately as part of incremental user journey work.
Concept 2 breaks down into the following goals:
a starter journey that incrementally adds in more components of Kedro
increase discoverability of Kedro starters
create starters for more advanced use cases, providing example code rather than referring users to the docs flow