North-West University: SADiLaR’s Strategic Vision Presented to CLARIN Management

Stakeholder engagement is a crucial part of the strategic mission of the South African Centre for Digital Language Resources (SADiLaR). With the adoption of a new five-year strategic plan, the infrastructure is dedicated to promoting its mandate and establishing a local and global presence to attract potential partners in the domains of natural language processing and digital humanities. This commitment was reinforced by the infrastructure’s acceptance as a full member of CLARIN, a distributed European digital infrastructure consortium, as of 1 January 2024.

Professor Justus Roux, South Africa’s official delegate to the CLARIN General Assembly, gave a presentation at Radboud University in Nijmegen on SADiLaR’s strategic vision for the next five years.

SADiLaR – the next five years

Over the next five years, SADiLaR’s infrastructure will prioritise several strategic objectives to strengthen the impact of readily available language-related technologies and digital humanities in driving transformative research. In addition, it aims to support the implementation of language policies in achieving an inclusive and transformed digital future for South Africa.

Research focus

SADiLaR is committed to advancing the scholarship of human language technologies and digital humanities in South Africa and across the African continent. The organisation aims to strengthen the knowledge production and dissemination pathways in the Global South, thereby contributing to global knowledge production.

Technology and resources

SADiLaR aims to enhance the development, deployment, and maintenance of software and technology in the domains of digital language resources and digital humanities by continuously strengthening the technical infrastructure.

Digitisation, enablement, and promotion of South Africa’s official languages

SADiLaR will continue to contribute to and drive the vision of a digital future for all official languages in South Africa. SADiLaR is involved in all three stages of the value chain, ensuring the longer-term availability of data and tools. In simplified form, the value chain from SADiLaR’s perspective can be summarised as follows:

Raw (unprocessed) data

This stage relates to creating data and digitising analogue material. The process includes cleaning and refining the data to ensure that good-quality metadata is included. It also entails the creation, maintenance, and updating of the tools and technologies required to parse African language data.

After the refinement and creation of tools or the processing of data to a final format, the data and tools are released as openly as is practically possible to further downstream innovation.

SADiLaR’s current mandate and the way forward

As it stands, SADiLaR plays a critical role in providing long-term preservation and maintenance of digital language resources through its repository. In this way, SADiLaR provides a place where digital language resource building blocks can be developed in specialised projects, run by or in collaboration with the Centre, and then made openly available for reuse in downstream technologies.

Affiliation with CLARIN

South Africa is the first member country outside of Europe, and SADiLaR is the proud representative body for South Africa. Currently, CLARIN ERIC has 24 EU country members and two observers.

“The CLARIN network aligns impeccably with SADiLaR’s strategic objective of strengthening stakeholder relationships and building mutually beneficial partnerships. The network will therefore extend to increase the impact of the infrastructure in the digital humanities space,” concludes Prof Roux. SADiLaR’s full strategy document is accessible here.

Prof Justus Roux is the official South African delegate to the CLARIN General Assembly and is tasked with strengthening SADiLaR-CLARIN stakeholder programmes.