Academic Activities



ISWS 2019  (30 June - 6 July 2019, Italy)

International Semantic Web Research Summer School

(€970 grant kindly provided by Network Institute - Vrije Universiteit Amsterdam and KNAW Humanities Cluster)

eLex 2019  (1-3 October 2019, Portugal)

Identification of Languages in Linked Open Data: a Case Study of Linguistic Data of French Combining a Diatopic with a Diachronic Perspective

Authors:  Sabine Tittel and Frances Gillis-Webber

23 June 2019

A Model for Language Annotations on the Web

Authors:  Frances Gillis-Webber, Sabine Tittel and C. Maria Keet

Conference:  1st Iberoamerican Knowledge Graphs and Semantic Web Conference (KGSWC 2019)

View: (Communications in Computer and Information Science 2019, 1029)  |  SUPPLEMENTARY MATERIAL


Several annotation models have been proposed to enable a multilingual Semantic Web. Such models home in on the word and its morphology and assume the language tag and URI come from external resources. These resources, such as ISO 639 and Glottolog, have limited coverage of the world’s languages and have a very limited thesaurus-like structure at best, which hampers language annotation, hence constraining research in Digital Humanities and other fields. To resolve this ‘outsourced’ task of the current models, we developed a model for representing information about languages, the Model for Language Annotation (MoLA), such that basic language information can be recorded consistently and therewith queried and analyzed as well. This includes the various types of languages, families, and the relations among them. MoLA is formalized in OWL so that it can integrate with Linguistic Linked Data resources. Sufficient coverage of MoLA is demonstrated with the use case of French.

20 - 23 May 2019

The Shortcomings of Language Tags for Linked Data when Modeling Lesser-Known Languages

Authors:  Frances Gillis-Webber and Sabine Tittel

Conference:  2nd Conference on Language, Data and Knowledge (LDK 2019)

Location:  Leipzig, Germany  |  University of Leipzig in the Assembly Hall and University Church of St. Paul

View: (OASIcs 2019, 70)  |  Download:  PRESENTATION


In recent years, the modeling of data from linguistic resources with Resource Description Framework (RDF), following the Linked Data paradigm and using the OntoLex-Lemon vocabulary, has become a prevalent method to create datasets for a multilingual web of data. An important aspect of data modeling is the use of language tags to mark lexicons, lexemes, word senses, etc. of a linguistic dataset. However, attempts to model data from lesser-known languages show significant shortcomings with the authoritative list of language codes by ISO 639: for many lesser-known languages spoken by minorities and also for historical stages of languages, language codes, the basis of language tags, are simply not available. This paper discusses these shortcomings based on the examples of three such languages, i.e., two varieties of click languages of Southern Africa together with Old French, and suggests solutions for the issues identified.

6 November 2018

Conversion of the English-Xhosa Dictionary for Nurses to a Linguistic Linked Data Framework

Author:  Frances Gillis-Webber

Journal:  Special Issue of Information: Towards the Multilingual Web of Data

View: (Information 2018, 9(11), 274)


The English-Xhosa Dictionary for Nurses (EXDN) is a bilingual, unidirectional printed dictionary in the public domain, with English and isiXhosa as the language pair. By extending the digitisation efforts of EXDN from a human-readable digital object to a machine-readable state, using Resource Description Framework (RDF) as the data model, semantically interoperable structured data can be created, thus enabling EXDN’s data to be reused, aggregated and integrated with other language resources, where it can serve as a potential aid in the development of future language resources for isiXhosa, an under-resourced language in South Africa. The methodological guidelines for the construction of a Linguistic Linked Data framework (LLDF) for a lexicographic resource, as applied to EXDN, are described, where an LLDF can be defined as a framework: (1) which describes data in RDF, (2) using a model designed for the representation of linguistic information, (3) which adheres to Linked Data principles, and (4) which supports versioning, allowing for change. The result is a bidirectional lexicographic resource, previously bounded and static, now unbounded and evolving, with the ability to extend to multilingualism.

10 - 15 September 2018

Summer School:  ISAO 2018, 4th Interdisciplinary School on Applied Ontology

Location:  Cape Town, South Africa  |  University of Cape Town

2 - 6 July 2018

Converting the English-Xhosa Dictionary for Nurses to Linguistic Linked Data

Author:  Frances Gillis-Webber

Conference:  International Congress of Linguists (ICL20)

Location:  Cape Town, South Africa  |  Cape Town International Convention Centre (CTICC)

7 - 12 May 2018

Managing Provenance and Versioning for an (Evolving) Dictionary in Linked Data Format

Author:  Frances Gillis-Webber

Conference:  6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science (LDL-2018), 11th edition of the Language Resources and Evaluation Conference (LREC 2018)

Location:  Miyazaki, Japan  |  Phoenix Seagaia Conference Center



The English-Xhosa Dictionary for Nurses is a unidirectional dictionary with English and isiXhosa as the language pair, published in 1935 and recently converted to Linguistic Linked Data. Using the OntoLex-Lemon model, an ontological framework was created to present each lexical entry as “historically dynamic” instead of “ontologically static” (Veltman, 2006:6, cited in Rafferty, 2016:5); particular attention was therefore given to provenance information and to the generation of linked data for an ontological framework whose instances are constantly evolving. The output is a framework which provides guidelines for similar applications regarding URI patterns, provenance, versioning, and the generation of RDF data.

26 - 30 June 2017

Summer School:  2nd Summer Datathon on Linguistic Linked Open Data (SD-LLOD-17)

Location:  Cercedilla, Spain  |  Residencia Lucas Olazábal of Universidad Politécnica de Madrid

17 - 20 January 2017

Ways to Improve the User Experience in Digital Humanities

Presenter:  Frances Gillis-Webber

Conference:  Inaugural Conference of the Digital Humanities Association of Southern Africa (DHASA 2017)

Location:  Stellenbosch, South Africa  |  Stellenbosch Institute for Advanced Studies (STIAS)

3 - 4 March 2016

Workshop:  OWL & Protégé Tutorial

Location:  Manchester, United Kingdom  |  University of Manchester

