
TitanTadm

StephanOepen edited this page Jun 16, 2009 · 12 revisions

Background

This page documents a sub-task of the HPC adaptation project at UiO; please see the TitanTop page for background.

In order to eliminate the current biggest bottleneck in the DELPH-IN toolchain, turn-around times in machine learning experiments need to be reduced substantially. The Toolkit for Advanced Discriminative Modeling ([http://tadm.sf.net TADM]) is the main machine learning component used in DELPH-IN research to date. TADM estimates the parameters of discriminative, log-linear (or exponential) statistical models; the estimated model can subsequently be used to rank competing hypotheses probabilistically, for example to direct the parser towards the most probable analysis.
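To make the ranking step concrete, the following is a minimal sketch of how a log-linear model scores competing hypotheses, assuming the standard formulation in which the probability of a hypothesis is proportional to exp(sum_i lambda_i * f_i). It does not reproduce TADM's code or file formats; the function name `rank_hypotheses`, the feature ids, and the weights are hypothetical and chosen only for illustration.

```python
import math

def rank_hypotheses(weights, hypotheses):
    """Rank hypotheses by their probability under a log-linear model.

    weights:    dict mapping an integer feature id to its weight (lambda_i)
    hypotheses: list of dicts mapping feature id to feature count f_i
    Returns a list of (probability, hypothesis_index), most probable first.
    """
    # Unnormalized log score of each hypothesis: sum_i lambda_i * f_i
    scores = [
        sum(weights.get(feat, 0.0) * count for feat, count in hyp.items())
        for hyp in hypotheses
    ]
    # Subtract the maximum before exponentiating, for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    probs = [e / norm for e in exps]
    return sorted(((p, i) for i, p in enumerate(probs)), reverse=True)

# Hypothetical weights and three competing analyses of one input
weights = {1: 0.5, 2: -1.0, 3: 2.0}
hypotheses = [{1: 2}, {3: 1}, {1: 1, 2: 1}]
ranking = rank_hypotheses(weights, hypotheses)
for prob, index in ranking:
    print(f"hypothesis {index}: p = {prob:.3f}")
```

Estimation, the part that TADM performs, is the search for the weights; ranking with a fixed set of weights, as sketched here, is cheap by comparison.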

TADM is implemented in C++, built on top of the [http://www.mcs.anl.gov/petsc/petsc-2/ PETSc] and [http://www.mcs.anl.gov/research/projects/tao/ TAO] libraries, and was originally developed by [http://www-rohan.sdsu.edu/~malouf/ Rob Malouf] (then at the University of Groningen, The Netherlands). A group of active TADM users, collaborating with Rob, hosted the project at SourceForge around 2004 and consolidated existing patches (including some from UiO). Otherwise, there has been no active TADM development in recent years, and available documentation is sparse.

TADM is applied to training data (typically millions or billions of integer-coded 'features') prepared using the itsdb software (see the TitanItsdb page), and a single estimation run can take several CPU hours. The search for the best-performing model parameters requires testing dozens or hundreds of distinct configurations, each typically by means of ten-fold cross-validation. Hence, TADM throughput is the primary bottleneck in current development.
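The cost of such a parameter search can be illustrated with a short sketch of ten-fold cross-validation and the number of estimation runs it implies. The helper names `k_fold_splits` and `total_runs` are hypothetical, the contiguous fold layout is an assumption rather than a description of how itsdb partitions its data, and the figure of 100 configurations is only an example within the "dozens or hundreds" range mentioned above.

```python
def k_fold_splits(n_items, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold boundaries are computed with integer arithmetic, so fold sizes
    differ by at most one item when n_items is not divisible by k.
    """
    bounds = [i * n_items // k for i in range(k + 1)]
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(0, bounds[i])) + list(range(bounds[i + 1], n_items))
        yield train, test

def total_runs(n_configurations, k=10):
    """Number of separate estimation runs for a k-fold parameter search."""
    return n_configurations * k

splits = list(k_fold_splits(25, k=10))
print(len(splits), "folds")
print(total_runs(100), "estimation runs for 100 configurations")
```

Because every configuration and every fold is an independent estimation run, the total grows multiplicatively, which is why the duration of a single run dominates experiment turn-around.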

Reportedly, a parallel version of TADM was available locally at Groningen in the late 1990s (customized for MPICH and Myrinet). The project will resurrect MPI support in TADM, adapting it as needed for use on TITAN. It will also be necessary to profile some of the core routines and to experiment with different versions of low-level libraries (notably BLAS and LAPACK) and with the Intel compiler suite (rather than the vanilla GNU Compiler Collection), in order to further improve the CPU utilization of TADM.
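One reason this kind of estimation lends itself to MPI is that the gradient of the conditional log-likelihood is a sum over training contexts (observed minus expected feature counts), so each process can compute a partial sum over its own shard of the data and the partial sums can then be combined in a reduction. The sketch below simulates this in a single process, without MPI, purely to show that the sharded result matches the serial one. It is an assumption about the general approach, not a description of how TADM's dormant MPI code is organized; all function names and the toy data are hypothetical, and each context is simplified to a single gold hypothesis.

```python
import math
from collections import defaultdict

def context_gradient(weights, context):
    """Gradient contribution of one context: observed minus expected counts.

    context is (gold_index, hypotheses); each hypothesis maps feature id
    to feature count.
    """
    gold, hyps = context
    scores = [sum(weights.get(f, 0.0) * v for f, v in h.items()) for h in hyps]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    grad = defaultdict(float)
    for f, v in hyps[gold].items():      # observed counts
        grad[f] += v
    for h, e in zip(hyps, exps):         # expected counts under the model
        for f, v in h.items():
            grad[f] -= (e / norm) * v
    return grad

def sum_gradients(grads):
    """Element-wise sum of sparse gradients (stands in for an MPI reduction)."""
    total = defaultdict(float)
    for g in grads:
        for f, v in g.items():
            total[f] += v
    return total

def shard_gradient(weights, shard):
    """Partial gradient over the contexts held by one (simulated) process."""
    return sum_gradients(context_gradient(weights, c) for c in shard)

# Hypothetical toy data: (index of gold hypothesis, competing hypotheses)
data = [
    (0, [{1: 1}, {2: 1}]),
    (1, [{1: 2}, {3: 1}, {2: 1}]),
    (0, [{3: 1}, {1: 1, 2: 1}]),
    (1, [{2: 2}, {1: 1, 3: 1}]),
    (2, [{1: 1}, {2: 1}, {3: 2}]),
]
weights = {1: 0.5, 2: -1.0, 3: 2.0}

serial = shard_gradient(weights, data)
shards = [data[i::3] for i in range(3)]  # round-robin split over 3 "processes"
parallel = sum_gradients(shard_gradient(weights, s) for s in shards)
```

In an actual MPI program the final summation would be a collective reduction across processes; how well this scales depends on the ratio of per-shard computation to communication, which is what the profiling described above would need to establish.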

This work package will be predominantly implemented by VD staff, (re-)enabling the incomplete and currently dormant MPI support in the TADM code base. Once the software modifications are complete, a joint series of experiments of increasing complexity will serve to determine the scalability of the TADM core (numeric optimization, processing huge sparse matrices). The extended TADM software will be integrated with the LOGON tree and contributed to the TADM project repository at SourceForge.
