AGTK: Annotation Graph Toolkit

Annotation graphs are a formal framework for representing linguistic annotations of time series data. They abstract away from file formats, coding schemes and user interfaces, providing a logical layer for annotation systems.
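To make the idea concrete, here is a minimal sketch of an annotation graph as a data structure, assuming the usual description of the formalism: nodes that may be anchored to a time offset in the signal, and labeled arcs between nodes that carry the annotations. The class names (Node, Arc, AnnotationGraph), the method names, and the timing values are hypothetical and chosen for illustration only; this is not the AGLIB API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """A point in the signal (hypothetical name, not an AGLIB class)."""
    id: str
    offset: Optional[float] = None  # seconds into the signal; None if unanchored


@dataclass
class Arc:
    """A labeled annotation spanning two nodes."""
    start: Node
    end: Node
    type: str                                   # e.g. "word", "turn"
    features: Dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotationGraph:
    arcs: List[Arc] = field(default_factory=list)

    def annotate(self, start: Node, end: Node, arc_type: str, **features: str) -> Arc:
        """Add an arc of the given type between two nodes."""
        arc = Arc(start, end, arc_type, dict(features))
        self.arcs.append(arc)
        return arc

    def arcs_of_type(self, arc_type: str) -> List[Arc]:
        """Return all arcs of one annotation type, in insertion order."""
        return [a for a in self.arcs if a.type == arc_type]

    def spanning(self, t: float) -> List[Arc]:
        """Return arcs whose time-anchored span covers time t (start inclusive)."""
        return [
            a for a in self.arcs
            if a.start.offset is not None
            and a.end.offset is not None
            and a.start.offset <= t < a.end.offset
        ]


# Illustrative data: two words inside one speaker turn (offsets are made up).
n0, n1, n2 = Node("n0", 0.0), Node("n1", 0.32), Node("n2", 0.75)
graph = AnnotationGraph()
graph.annotate(n0, n1, "word", label="the")
graph.annotate(n1, n2, "word", label="cat")
graph.annotate(n0, n2, "turn", speaker="A")
```

Because word arcs and turn arcs share nodes, several annotation layers can sit over the same stretch of signal without being tied to any one file format. A call such as graph.spanning(0.5) returns every arc covering that instant, regardless of layer.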

What's New

[2011-01-01] All of our applications have been renamed to carry the "AGTK" prefix: AGTK TableTrans, AGTK MultiTrans, AGTK InterTrans, and AGTK TreeTrans.

[2007-03-31] AGLIB JAVA 1.0 has been released.

[2004-01-15] AGLIB 2.0.1 has been released.

[2003-07-08] AGLIB 2.0 has been released!

A bug-fix version of AGTK TableTrans for Windows (version 1.2) has been released.


All software is available from:

All software is licensed under the OSI-approved Common Public License. Please contact us if you would like the software under another license.


    Applications

  • AGTK MultiTrans: transcribing multi-party conversation
  • AGTK TableTrans: observational coding and annotation of audio
  • AGTK TreeTrans: syntactic annotation
  • AGTK InterTrans: interlinear text transcription

    Content of Distributions

    AGLIB includes:

  • full C++ source distribution (Unix, Windows)
  • Windows binary distribution
  • file format support: xlabel, TIMIT, Penn Treebank, Switchboard, BAS Partitur, CSV, LDC Callhome and AIF (level 0)
  • a Java port is available: AGLIB JAVA

    AGAPPS includes:

  • scripting language support: Tcl/Tk and Python
  • WaveSurfer interface

    AGTK Windows includes:

  • AGLIB compiled library
  • all applications
  • InstallShield installer

    Mailing Lists

  • agtk-announce - sign up for AGTK announcements (moderated, low volume)
  • agtk-devel - send any questions and bug reports to this list (unmoderated)


    Documentation

  • AG Library Documentation [2.0]
  • AG Library API Documentation [1.1] [2.0]
  • AG Library Technical Report (Draft) [ ps | pdf | ps.gz ]
  • XML DTD and examples for AG interchange format
  • PowerPoint presentation

    Research Papers

  • TableTrans, MultiTrans, InterTrans and TreeTrans: Diverse Tools Built on the Annotation Graph Toolkit (Steven Bird, Kazuaki Maeda, Xiaoyi Ma, Haejoong Lee, Beth Randall & Salim Zayat, 2002) (Note: these applications are now named AGTK TableTrans, AGTK MultiTrans, AGTK InterTrans, and AGTK TreeTrans.)
  • Models and Tools for Collaborative Annotation (Ma, Lee, Bird & Maeda, 2002)
  • Creating Annotation Tools with the Annotation Graph Toolkit (Maeda, Bird, Ma & Lee, 2002)
  • An Integrated Framework for Treebanks and Multilayer Annotations (Cotton & Bird, 2002)
  • A formal framework for linguistic annotation (Bird & Liberman, 2001)
  • Towards a query language for annotation graphs (Bird, Buneman & Tan, 2000)
  • Annotation graphs as a framework for multidimensional linguistic data analysis (Bird & Liberman, 1999)
  • More publications relating to annotation graphs.

    Related Resources

  • Linguistic Annotation - an extensive survey of annotation resources.
  • TalkBank - the NSF project funding LDC's work on Annotation Graphs.
  • SourceForge sites for two projects cooperating with the AGTK project: Emu, Transcriber.
  • openNLP - other open source NLP projects (mostly on SourceForge)
  • WaveSurfer and Snack - open source tool and toolkit for sound visualization and manipulation (used by AGTK TableTrans, AGTK MultiTrans and AGTK InterTrans)

    Linguistic Data Consortium

    The AGTK Project is based at the Linguistic Data Consortium at the University of Pennsylvania. The following LDC staff are involved in the development of the toolkit:

  • Steven Bird (coordinator)
  • Xiaoyi Ma (AG library)
  • Haejoong Lee (AG library)
  • Kazuaki Maeda (applications)
  • Beth Randall (applications)
  • Salim Zayat (applications)
  • John Breen (applications)
  • Craig Martell (applications)
  • Chris Osborn (applications)
  • Jonathan Dick (Java port)