Conceptual Resources

Concept-based resources include onomasiological lexical resources such as wordnets, framenets, thesauri and ontologies. The entries in such resources are typically interlinked through semantic relations (e.g. hypernymy, hyponymy). There are 29 conceptual resources in the CLARIN infrastructure. Most (22) of the conceptual resources are monolingual, accounting for 14 languages (Ancient Greek, Danish, Greek, Brazilian Portuguese, Dutch, Estonian, Finnish, Italian, Maltese, Polish, Portuguese, Swedish, Slovenian), while the rest (7) cover bilingual and multilingual language combinations (e.g. Swedish-English, Polish-English). In the vast majority of cases, the conceptual resources can be downloaded directly from the national repositories or queried through easy-to-use online search environments.
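As an informal illustration of how a concept-based resource is organised, the sketch below models a handful of synsets linked by hypernymy and its inverse, hyponymy. The class name, method names, synset identifiers and lemmas are all invented for the example and are not taken from any resource listed on this page; real wordnets use their own identifiers and distribution formats.

```python
from collections import defaultdict


class ToyWordnet:
    """Hypothetical, minimal concept-based lexicon (not a real resource's API)."""

    def __init__(self):
        self.lemmas = {}                    # synset id -> set of lemmas
        self.hypernyms = defaultdict(set)   # synset id -> more general synsets
        self.hyponyms = defaultdict(set)    # synset id -> more specific synsets

    def add_synset(self, synset_id, lemmas):
        self.lemmas[synset_id] = set(lemmas)

    def add_hypernym(self, specific, general):
        # Hyponymy is the inverse of hypernymy, so both directions are stored.
        self.hypernyms[specific].add(general)
        self.hyponyms[general].add(specific)

    def lemmas_for(self, synset_id):
        # Onomasiological lookup: from a concept to the words that express it.
        return sorted(self.lemmas.get(synset_id, ()))

    def synsets_for(self, lemma):
        # Reverse lookup: from a word to the concepts it can express.
        return sorted(s for s, ls in self.lemmas.items() if lemma in ls)

    def hyponyms_of(self, synset_id):
        return sorted(self.hyponyms.get(synset_id, ()))

    def ancestors(self, synset_id):
        # Follow hypernymy links transitively to collect all broader concepts.
        seen = set()
        stack = [synset_id]
        while stack:
            current = stack.pop()
            for parent in self.hypernyms.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return sorted(seen)


# Invented toy data, for illustration only.
net = ToyWordnet()
net.add_synset("animal.n", ["animal", "beast"])
net.add_synset("mammal.n", ["mammal"])
net.add_synset("dog.n", ["dog", "hound"])
net.add_hypernym("mammal.n", "animal.n")
net.add_hypernym("dog.n", "mammal.n")

print(net.lemmas_for("dog.n"))   # words expressing the concept
print(net.ancestors("dog.n"))    # all broader concepts
```

Storing each relation in both directions keeps lookups simple in either direction, at the cost of some duplication.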

To comment, suggest changes to the existing content or propose new resources, send us an email at resource-families [at] clarin.eu.

Conceptual resources in the CLARIN infrastructure

Monolingual Resources

Corpus Language Description Availability

Open Ancient Greek WordNet 0.5

Size: 7,447 synsets
Licence: CC-BY-SA 4.0

Ancient Greek

This is a wordnet that is available for download through ILC4CLARIN.

Download

Ontology for the area of Nanoscience and Nanotechnology

Size: 511 entries
Annotation: semantic relations
Licence: MS NC-NoReD-ND

Brazilian Portuguese

This is an ontology of concepts related to nanoscience and nanotechnology available for download from CLARIN PORTULAN.

Download

DanNet, Danish Wordnet (v 2.2)

Size: 65,000 entries
Annotation: hypernymy, hyponymy
Licence: DanNet 1.0 Licence

Danish

This wordnet is available for download from the CLARIN-DK repository. The resource is available in the .csv and .owl formats.

For the relevant publication, see Pedersen et al. (2009)

Download

STO semantics (The Danish SIMPLE Lexicon) - LMF format

Size: 12,609 entries
Licence: CC-BY 4.0

Danish

This resource presents a unified, ontology-based semantic model – the so-called SIMPLE model – representing an extended Qualia Structure from the Danish SIMPLE Lexicon. The resource is available for download from CLARIN-DK.

Download

Cornetto-LMF

Size: 130,000 entries
Annotation: semantic relationships and combinatorial information
Licence: other

Dutch

This is a resource that combines two resources with different semantic organisations: the Dutch Wordnet with its synset organisation and the Dutch Reference Lexicon which includes definitions, usage constraints, selectional restrictions, syntactic behaviours, illustrative contexts, etc. The resource is available for online browsing through a dedicated webpage.

Browse

Estonian Wordnet 2.1

Size: 115,318 keywords, 84,150 synsets
Annotation: PoS-tags, synonymy, antonymy, hypernymy, hyponymy, meronymy
Licence: CC-BY-SA

Estonian

This is a wordnet that is available for download from META-SHARE (CELR distribution).

Download

Finnish FrameNet

Size: 866 frames
Licence: CC-BY 4.0

Finnish

This is a framenet that is available for online browsing through FIN-CLARIN.

For the relevant publication, see Lindén et al. (2017)

Browse

GermaNet

Size: 61,659 synsets
Annotation: MSD-tags, lexical relations (hypernymy, hyponymy)
Licence: CLARIN ACA

German

This wordnet is available for download from a dedicated webpage hosted by Uni Tübingen/CLARIN-D.

For the relevant publication, see Henrich et al. (2011)

Download

Polytropon EL Conceptual Lexicon

Size: 4,000 multi-word units; 15,000 tokens
Annotation: lemmas, MSD-tags, lexical relations
Licence: under negotiation

Greek

This lexicon encodes lexical-semantic relations (e.g. synonymy, antonymy), lexical relations (e.g. word families, allomorphs, syntactic variants) and morphosyntactic features (PoS, gender, declension, etc.). It is not yet available for download.

The Icelandic Wordweb

Annotation: lemmas, concept-based relations
Licence: CC BY 4.0

Icelandic

This is a database of words, categorizations and word relations. The new version consists of a single, large RDF file that houses the Wordweb’s content and is encoded with a standardized vocabulary.

The Wordweb is available for download through the CLARIN.IS repository and is available for online browsing through a dedicated webpage.

For the relevant publication, see Jónsson and Úlfarsdóttir (2011)

Browse

Download

ItalWordNet Kyoto

Size: 49,514 synsets
Licence: CC-BY-NC-SA 4.0

Italian

This is a wordnet available for download from ILC4CLARIN.

Download

ItalWordNet v.2

Size: 49,350 synsets
Annotation: equivalence relations between Italian synsets and closest concepts in an Inter-Lingual index
Licence: CC-BY-NC-SA 4.0

Italian

This is a wordnet available for download from ILC4CLARIN.

For the relevant publication, see Bartolini et al. (2014)

Download

IWN-LOD

Size: 49,350 synsets

Italian

This is an RDF–Linguistic Open Data version of the ItalWordNet v.2. The resource is available for download through ILC4CLARIN.

Download

Maltese automatically produced distributional thesaurus

Size: 36,034 entries
Annotation: lemmas
Licence: CC-BY-SA

Maltese

This is a thesaurus that is available for download from CLARIN PORTULAN.

Download

NE_SUMO_PLWN_mapping

Size: 120 terms
Licence: CC-BY-SA 4.0

Polish

This conceptual resource provides a mapping between named entities types, SUMO categories and plWordNet synsets. The resource is available for download from the CLARIN-PL repository.

Download

PLWordNet to Sumo mapping

Size: 175,635 synsets mapped to SUMO
Licence: CC-BY-NC-SA 3.0

Polish

This conceptual resource provides a mapping of plWordNet onto the SUMO ontology. The resource is available for download from the CLARIN-PL repository.

Download

Geo-Net-PT 02

Size: 701,209 concepts
Annotation: qualia structure and lexical relations (hyponyms, synonyms)
Licence: CC-BY

Portuguese

This is an ontology of Portuguese geographic concepts. It is available for download from CLARIN PORTULAN.

Download

MWNPT-International WordNet of Portuguese

Size: 17,200 synsets
Annotation: hyponymy and hypernymy
Licence: MS NC-NoReD-ND

Portuguese

This is a wordnet that is available on request from the resource manager.

Thesaurus of Modern Slovene 1.0

Size: 105,473 entries
Annotation: core and near synonymy
Licence: CC BY-SA 4.0

Slovenian

This is a thesaurus of the modern Slovenian language that is available for download from CLARIN.SI.

For the relevant publication, see Krek et al. (2017)

Download

Bring (2015-05-08)

Size: 148,815 entries
Licence: CC-BY 4.0

Swedish

This is a digital version of Bring's thesaurus (1930) that is available for download from the SWE-CLARIN repository.

Download

Saldo 

Size: 131,020 entries
Licence: CC-BY 4.0

Swedish

This is an extensive lexical resource for modern written Swedish. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP.

Browse

Download

Swedish FrameNet (2017-10-16)

Size: 1,195 entries
Licence: CC-BY 4.0

Swedish

This is a Swedish conceptual resource that employs the FrameNet++ annotation. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP.

Browse

Download

Swesaurus (2017-10-16)

Size: 15,010 entries
Licence: CC-BY 4.0

Swedish

This is a Swedish wordnet that can be downloaded from the SWE-CLARIN repository and can be queried online through KARP.

Browse

Download

Multilingual Resources

Corpus Language Description Availability

Finnish WordNet

Size: 170,000 synsets
Annotation: PoS-tags, synonymy, antonymy, hypernymy, hyponymy, meronymy
Licence: CC-BY 3.0

Finnish, English

This is a wordnet that is available for download from FIN-CLARIN as well as for online browsing.

For the relevant publication, see Lindén and Carlson (2010)

Browse

Download

The Sanat Version of the Finnish TransFrameNet

Size: 866 frames
Licence: CC-BY 4.0

Finnish, English

This is a framenet that is available for online browsing through FIN-CLARIN.

For the relevant publication, see Lindén et al. (2019)

Browse

Prolex

Size: 72,572 lexical relations
Annotation: inflected forms
Licence: GNU Lesser General Public License (LGPL)

French, English, Polish, Serbian

This is an ontology of place names available for download from Ortolang.

Download

plWordNet 4.0

Size: 506,815 senses, 347,564 synsets
Licence: plwordnet-2

Polish, English

This is Słowosieć, a lexico-semantic network that reflects the lexical system of the Polish language, with a projection onto English. The resource is available for download and browsing.

For the relevant publication, see Maziarz et al. (2016)

Browse

Download

Hontology

Size: 282 concepts
Annotation: terms correlation, rules (lexical patterns) and synonyms
Licence: CC-BY-NC-SA

Portuguese, English, Spanish, French

This is an ontology of concepts from the accommodation sector that is available for download from CLARIN PORTULAN.

Download

Semantic lexicon of Slovene sloWNet 3.1

Size: 43,460 synsets
Annotation: lexical semantic relations
Licence: CC-BY-SA 4.0

Slovenian, English

This is a wordnet available for download from CLARIN.SI and for online browsing through a dedicated environment.

Browse

Download

WordNet-SALDO (2017-10-16)

Size: 6,989 entries
Licence: CC-BY 4.0

Swedish, English

This wordnet represents a link between SALDO senses and Core WordNet. The resource can be downloaded from the SWE-CLARIN repository and queried online through KARP.

Browse

Download

Publications

[Bartolini et al. 2014] Roberto Bartolini, Valeria Quochi, Irene De Felice, Irene Russo, and Monica Monachini. 2014. From Synsets to Videos: Enriching ItalWordNet Multimodally.

[Henrich et al. 2011] Verena Henrich, Erhard Hinrichs, and Tatiana Vodolazova. 2011. Aligning GermaNet Senses with Wiktionary Sense Definitions.

[Jónsson and Úlfarsdóttir 2011] Jón Hilmar Jónsson and Þórdís Úlfarsdóttir. 2011. Íslenskt orðanet: Et skritt mot en allmennspråklig onomasiologisk ordbok.

[Krek et al. 2017] Simon Krek, Cyprian Laskowski, and Marko Robnik-Šikonja. 2017. From Translation Equivalents to Synonyms: Creation of a Slovene Thesaurus Using Word co-occurrence Network Analysis.

[Lindén and Carlson 2010] Krister Lindén and Lauri Carlson. 2010. FinnWordNet – WordNet på finska via översättning.

[Lindén et al. 2017] Krister Lindén, Heidi Haltia, Juha Luukkonen, Antti O. Laine, Henri Roivainen, and Niina Väisänen. 2017. FinnFN 1.0: The Finnish frame semantic database.

[Lindén et al. 2019] Krister Lindén, Heidi Haltia, Antti Laine, Juha Luukkonen, Jussi Piitulainen, and Niina Väisänen. 2019. FinnTransFrame: translating frames in the FinnFrameNet project.

[Maziarz et al. 2016] Marek Maziarz, Maciej Piasecki, Ewa Rudnicka, Stan Szpakowicz, and Paweł Kędzia. 2016. plWordNet 3.0 – a Comprehensive Lexical-Semantic Resource.

[Pedersen et al. 2009] Bolette S. Pedersen, Sanni Nimb, Jørg Asmussen, Nicolai Hartvig Sørensen, Lars Trap-Jensen, and Henrik Lorentzen. 2009. DanNet: the challenge of compiling a wordnet for Danish by reusing a monolingual dictionary.