Homework for DWBI class. TU Berlin, winter term 2018/2019
This application mines frequent patterns from a data set of retail transactions. Specifically, it implements the ECLAT (Equivalence Class Transformation) algorithm, a depth-first search algorithm based on set intersection.
The tasks run as a batch job within the Apache Flink execution environment, mostly in parallel.
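The core idea of ECLAT can be sketched in plain Java, without Flink. This is a minimal single-threaded illustration and not the code in ECLATJob: the class name `EclatSketch`, its methods, and the sample transactions are hypothetical. It converts transactions into a vertical layout (item to set of transaction ids) and then extends itemsets depth-first by intersecting those id sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class; not part of this project.
class EclatSketch {

    /** Builds the vertical layout: each item maps to the ids of the transactions containing it. */
    static Map<String, Set<Integer>> toVertical(List<List<String>> transactions) {
        Map<String, Set<Integer>> vertical = new TreeMap<>();
        for (int tid = 0; tid < transactions.size(); tid++) {
            for (String item : transactions.get(tid)) {
                vertical.computeIfAbsent(item, k -> new TreeSet<>()).add(tid);
            }
        }
        return vertical;
    }

    /** Returns every itemset with support >= minSupport, keyed by its comma-joined items. */
    static Map<String, Integer> mine(List<List<String>> transactions, int minSupport) {
        List<String> items = new ArrayList<>();
        List<Set<Integer>> tidSets = new ArrayList<>();
        // Keep only the frequent single items as starting points.
        for (Map.Entry<String, Set<Integer>> e : toVertical(transactions).entrySet()) {
            if (e.getValue().size() >= minSupport) {
                items.add(e.getKey());
                tidSets.add(e.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        extend("", items, tidSets, minSupport, result);
        return result;
    }

    /** Depth-first step: tidSets.get(i) holds the transactions containing prefix plus items.get(i). */
    static void extend(String prefix, List<String> items, List<Set<Integer>> tidSets,
                       int minSupport, Map<String, Integer> result) {
        for (int i = 0; i < items.size(); i++) {
            String itemset = prefix.isEmpty() ? items.get(i) : prefix + "," + items.get(i);
            result.put(itemset, tidSets.get(i).size());

            // Intersect with every later item to form the next equivalence class.
            List<String> nextItems = new ArrayList<>();
            List<Set<Integer>> nextTidSets = new ArrayList<>();
            for (int j = i + 1; j < items.size(); j++) {
                Set<Integer> shared = new TreeSet<>(tidSets.get(i));
                shared.retainAll(tidSets.get(j));
                if (shared.size() >= minSupport) {
                    nextItems.add(items.get(j));
                    nextTidSets.add(shared);
                }
            }
            if (!nextItems.isEmpty()) {
                extend(itemset, nextItems, nextTidSets, minSupport, result);
            }
        }
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the Online Retail Data Set.
        List<List<String>> transactions = Arrays.asList(
                Arrays.asList("bread", "milk"),
                Arrays.asList("bread", "butter", "milk"),
                Arrays.asList("butter", "milk"),
                Arrays.asList("bread", "butter"));
        System.out.println(mine(transactions, 2));
    }
}
```

With a minimum support of 2, the sample data yields the three single items and the three pairs, while the triple bread, butter, milk is pruned because only one transaction contains all three items. The support of an itemset is simply the size of its intersected id set, so no second pass over the transactions is needed.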
- Download the data set from https://tubcloud.tu-berlin.de/s/ZtfgnxMCZ5cjJf8 (or a mirror)
- Place it under src/main/resources
- Run the main() method of the ECLATJob class
- Find the results in src/main/resources
- Apache Flink - framework and distributed processing engine for stateful computations
- Online Retail Data Set - a data set of online retail transactions