Sajjadur Rahman edited this page Dec 18, 2020 · 3 revisions

Leam is a research prototype that combines the strengths of spreadsheets, computational notebooks, and interactive visualizations to facilitate integrated text analytics. Leam implements a visual text algebra to facilitate extensible and expressive analysis, supporting diverse tasks ranging from data cleaning to visualization. It also enables declarative specification of interactive coordination across views of data, code, and visualizations.

Motivation

Visual interactive text analytics (VITA hereafter) is an iterative, non-linear, multistage process. It involves tasks such as data preprocessing and transformation, model building, hypothesis testing, and insight exploration, all of which require multiple iterations to obtain satisfactory outcomes. A few commercial tools (e.g., Tableau, Power BI) and open-source tools (e.g., NLTK, Gensim, spaCy) can support different stages of VITA. For example, spreadsheets allow directly processing and manipulating data, computational notebooks enable flexible exploratory analysis and modeling, and visualization systems, typically based on chart templates, facilitate quick interactive visual analysis. An ideal system would unify these affordances. There are also many customized visual text analytics tools, often in the form of research prototypes, that focus on specific use cases like review exploration, sentiment analysis, and text summarization. Unfortunately, none of these solutions accommodates the inherently cyclic, trial-and-error nature of VITA pipelines end to end in an integrated manner.

While end-to-end VITA systems are much desired, designing and building them is difficult. The primary challenge is the number and diversity of the tasks that need to be supported. Programmatic tools such as computational notebooks can, in part, provide the extensibility and expressivity to incrementally build such support, but they often lack interactivity, accessibility, and scalability, impeding non-linear iterative analysis. We built Leam to address these challenges while enabling an integrated experience for VITA.

Wiki Outline

The outline of the Leam Wiki is as follows: