Repositories

 

Storing language resources and related datasets requires sound organisation and attention to digital sustainability. One of CLARIN's important aims is to ensure that digital language resources are made available to a broad community on a long-term basis. This is achieved by establishing data repositories at the centres, which host digital files and the associated metadata. For reference purposes, these repositories also assign persistent identifiers to the resources, so that a specific dataset can easily be cited, for example in a paper.

Users can inspect the data in such a repository through its local interface. In addition, the metadata is shared with the rest of the CLARIN community by means of metadata harvesting.
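As an illustration of what metadata harvesting involves, the following minimal sketch builds an OAI-PMH ListIdentifiers request and reads one page of the response. It assumes the endpoint speaks OAI-PMH 2.0; the base URL, the sample response and the function names are hypothetical, and the "cmdi" metadata prefix is an assumption about how a given centre exposes its component metadata.

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

def list_identifiers_url(base_url, metadata_prefix="cmdi", resumption_token=None):
    """Build a ListIdentifiers request URL (hypothetical helper).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent.
    """
    if resumption_token:
        params = {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)

def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [el.text.strip() for el in root.iter(f"{{{OAI_NS}}}identifier") if el.text]
    token_el = root.find(f".//{{{OAI_NS}}}resumptionToken")
    token = None
    if token_el is not None and token_el.text and token_el.text.strip():
        token = token_el.text.strip()
    return ids, token

# Made-up response page, used here instead of a live request.
SAMPLE_PAGE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:1</identifier>
      <datestamp>2020-01-01</datestamp>
    </header>
    <header>
      <identifier>oai:repository.example.org:2</identifier>
      <datestamp>2020-01-02</datestamp>
    </header>
    <resumptionToken>page2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_identifiers_url("https://repository.example.org/oai"))
    print(parse_page(SAMPLE_PAGE))
```

A harvester would repeat the request with the returned token until no token is left, at which point the list is complete.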

Although CLARIN is a strong advocate of open access, some resources have to be password-protected, if only to respect legal, privacy and ethical constraints. Even then, federated login makes it easier to request access to these protected collections and to log in once access has been granted. The decision to grant access stays with the resource owner, as authorisation is distributed in the CLARIN infrastructure.

Repository Assessment

The quality of the CLARIN repositories, together with their organisational and technical background, is subject to an assessment procedure. Repositories that have successfully undergone this procedure are granted the CLARIN B-centre label.

Software and Configuration

The Centre Registry lists URIs for the endpoints run by CLARIN centres. You can query these endpoints simply by clicking the hyperlinks, or in a more technical fashion, for example to find out which OAI-PMH server software they run. To learn more about practical experience with particular OAI-PMH server software, please contact the endpoint operators (the centre's technical contact). Centres are free to choose which system they use, as long as it complies with the centre requirements (support for metadata harvesting, component metadata, persistent identifiers and federated login). Popular options are Fedora Commons and DSpace. Some centres have published manuals on how to set up a CLARIN-compliant repository.
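Querying an endpoint "in a more technical fashion" could look like the sketch below, which sends the OAI-PMH Identify verb and looks for the server software in the response. This is only an illustration: the endpoint URL, the sample response and the function names are hypothetical, and it assumes the server advertises its software in the optional toolkit description, which not every endpoint provides.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
# Namespace of the optional "toolkit" description container.
TOOLKIT_NS = "http://oai.dlib.vt.edu/OAI/metadata/toolkit"

def identify_url(base_url):
    """Build the Identify request URL for an OAI-PMH endpoint."""
    return base_url + "?" + urllib.parse.urlencode({"verb": "Identify"})

def parse_identify(xml_text):
    """Extract the repository name, protocol version and, if advertised, the software."""
    root = ET.fromstring(xml_text)
    identify = root.find(f"{{{OAI_NS}}}Identify")
    if identify is None:
        raise ValueError("not an OAI-PMH Identify response")

    def text(parent, ns, tag):
        el = parent.find(f"{{{ns}}}{tag}")
        return el.text.strip() if el is not None and el.text else None

    info = {
        "repositoryName": text(identify, OAI_NS, "repositoryName"),
        "protocolVersion": text(identify, OAI_NS, "protocolVersion"),
        "software": None,  # stays None when no toolkit description is present
    }
    toolkit = identify.find(f".//{{{TOOLKIT_NS}}}toolkit")
    if toolkit is not None:
        parts = [text(toolkit, TOOLKIT_NS, "title"), text(toolkit, TOOLKIT_NS, "version")]
        info["software"] = " ".join(p for p in parts if p) or None
    return info

def fetch_identify(base_url, timeout=30):
    """Query a live endpoint (requires network access)."""
    with urllib.request.urlopen(identify_url(base_url), timeout=timeout) as response:
        return parse_identify(response.read())

# Made-up response, used here instead of a live request.
SAMPLE_IDENTIFY = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://repository.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <description>
      <toolkit xmlns="http://oai.dlib.vt.edu/OAI/metadata/toolkit">
        <title>ExampleServer</title>
        <version>1.0</version>
      </toolkit>
    </description>
  </Identify>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(identify_url("https://repository.example.org/oai"))
    print(parse_identify(SAMPLE_IDENTIFY))
```

If the response carries no toolkit description, the software field stays empty, and contacting the centre's technical contact remains the more reliable route.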

 

Additional Information