Language documentation projects


The Cape York Peninsula Language Documentation Emergency Documentation Team Pilot Project had two main aims: to undertake language documentation work on a selection of endangered Cape York languages, and to train community-based language workers in some of the skills they will need to continue language documentation work independently of a linguist.



The Baure documentation project's aim is a complex documentation of the seriously endangered South Arawakan language Baure, spoken in the Amazonian fringe of Bolivia, close to the border with Brazil.



The Awetí Language Documentation Project aims to comprehensively document the language spoken by the Awetí, an indigenous tribe of just over a hundred people living in the Xingú headwater area of central Brazil.



The Tsúùt’ìnà Gunaha Project includes the digitization, cataloguing, archiving, and curating of historical and contemporary recordings; the installation of a band-funded audio studio at the Tsuu T’ina Nation and the training of community members to prepare archivable recordings; and the delivery of a customized Community Linguist Certificate program at the reserve, so that community members can learn techniques for, and engage in, linguistic analysis, language documentation, language and culture revitalization, and language pedagogy.



The Sora Language Documentation and Talking Dictionary Project aims to record and disseminate information for research while also protecting a threatened minority language.

Vanishing Voices of the Great Andamanese is a major documentation project which aims to produce a detailed descriptive grammar, a sociolinguistic description, a trilingual dictionary in Great Andamanese, Hindi and English, and an extensive archive of folklore, oral texts, and video recordings of the surviving 36 Great Andamanese speakers residing on Strait Island.



The Rongga Documentation Project aims to set up a comprehensive archive of the Rongga language and culture as a resource tool containing the core data and references on which further studies, analyses, and practical language-community programs may be based.


Ivory Coast

Ega Web Archive: an endangered languages documentation initiative. The Ega initiative aims to develop a model for the computational language documentation of an endangered language.



Dogon and Bangime Linguistics: There are approximately twenty Dogon languages. Depending on future funding, we hope to complete by 2014 the documentation and analysis (reference grammar, lexicography, texts, images) of all of these languages and to present the material in an integrated fashion.



The Lacandon Cultural Heritage project documents the language and culture of the northern Lacandones, who live in the Selva Lacandona of Chiapas, Mexico.



The Chintang and Puma Documentation Project (CPDP) aims at the linguistic and ethnographic documentation of two endangered Kiranti languages of Nepal.



The Forum for Language Initiatives is a resource and training centre working to enable the language communities of northern Pakistan to preserve and promote their mother tongues.


Papua New Guinea

Basic Oral Language Documentation (BOLD:PNG) is recording and transcribing indigenous languages of Papua New Guinea, a country which is home to over 800 languages, many with few remaining speakers, and many with minimal linguistic documentation. The team hopes to collect narratives, dialogues and songs for 100 languages, using the technique of "Basic Oral Language Documentation" (BOLD).

Arapesh Grammar and Digital Language Archive Project was conceived as a way of preserving, integrating, and disseminating some of the rich documentary material that has been produced on the Arapesh languages.



The Iquito Language Documentation Project aims to document the Peruvian language of Iquito through the creation of a dictionary, grammar book and teaching aids, and to train members of the San Antonio community as language teachers.



Kola Saami Documentation Project (KSDP): Linguistic and ethnographic documentation of the endangered Kola Saami languages. The aim of the project is to provide comprehensive linguistic and ethnographic documentation of the endangered Saami languages of Russia, which are spoken on the Kola Peninsula in the country's northwesternmost region (Murmansk Region – Murmanskaja oblast').


Sierra Leone

Documenting the Krim and Bom Languages of Sierra Leone will document two dying languages spoken in the coastal tidelands of south-eastern Sierra Leone. The speakers of both Krim and Bom are now shifting to Mende, and both languages are presumed to be moribund. Native and non-native participants will be trained in linguistic fieldwork, in accordance with their backgrounds and abilities.


United States

The Wichita Documentation Project aims to provide linguistic and cultural information about the older and modern-day Wichita language and Wichita people. Wichita is a Native North American language of the Caddoan branch, spoken near Anadarko, Oklahoma, USA.