Home
Leam is a research prototype that combines the strengths of spreadsheets, computational notebooks, and interactive visualizations to facilitate integrated text analytics. Leam implements a visual text algebra to facilitate extensible and expressive analysis, supporting diverse tasks ranging from data cleaning to visualization. It also enables declarative specification of interactive coordination across views of data, code, and visualizations.
Visual interactive text analytics (VITA hereafter) is an iterative, non-linear, multistage process. It involves tasks like data preprocessing and transformation, model building, hypothesis testing, and insight exploration, all of which require multiple iterations to obtain satisfactory outcomes. There are a few commercial (e.g., tableau, powerbi) and open-source tools (e.g., nltk, gensim, spacy) that can support different stages of VITA. More generally, each class of tool offers a distinct affordance: spreadsheets allow directly processing and manipulating data, computational notebooks enable flexible exploratory analysis and modeling, and visualization systems, typically based on chart templates, facilitate quick interactive visual analysis. An ideal system would unify these affordances. There are also many customized visual text analytics tools, often in the form of research prototypes, that focus on specific use cases like review exploration, sentiment analysis, and text summarization. Unfortunately, none of these solutions accommodates the inherently cyclic, trial-and-error nature of VITA pipelines end to end in an integrated manner.
While end-to-end VITA systems are much desired, designing and building them is difficult. The primary challenge is the number and diversity of the tasks that need to be supported. Programmatic tools such as computational notebooks can provide some of the extensibility and expressivity needed to build such support incrementally, but they often lack interactivity, accessibility, and scalability, which impedes non-linear iterative analysis. We built Leam to address these challenges while enabling an integrated experience for VITA.
The following is the outline of the Leam Wiki: