The CLDF Ontology
- Namespace:
- http://cldf.clld.org/v1.0/terms.rdf#
- Version info:
- http://cldf.clld.org/v1.3 (supersedes http://cldf.clld.org/v1.2)
Modules ⇫
Wordlist ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#Wordlist
A dataset according to the CLDF Wordlist specification
Dictionary ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#Dictionary
A dataset according to the CLDF Dictionary specification
StructureDataset ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#StructureDataset
A dataset according to the CLDF Structure Dataset specification
- csvw:name:
- "StructureDataset"
- rdfs:subClassOf:
- http://www.w3.org/ns/dcat#Distribution
ParallelText ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#ParallelText
A dataset according to the CLDF Parallel Text specification
- csvw:name:
- "ParallelText"
- rdfs:subClassOf:
- http://www.w3.org/ns/dcat#Distribution
TextCorpus ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#TextCorpus
A dataset according to the CLDF Text Corpus specification
Components ⇫
BorrowingTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#BorrowingTable
The Borrowing Table stores information about borrowings or loanwords by linking two rows in the Form Table. It acts as an associative entity in which additional information about the particular case of borrowing can be provided.
CognateTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#CognateTable
The table of cognate judgements accompanying a CLDF Wordlist. If the only thing we know about cognate sets is the set of members, a Cognate Table can be used without a corresponding Cognateset Table, otherwise it will become the associative table between Form Table and Cognateset Table.
MediaTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#MediaTable
- csvw:name:
- "MediaTable"
- csvw:url:
- "media.csv"
- dc:hasVersion:
- http://cldf.clld.org/v1.1
ContributionTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#ContributionTable
- csvw:name:
- "ContributionTable"
- csvw:url:
- "contributions.csv"
- dc:hasVersion:
- http://cldf.clld.org/v1.1
FunctionalEquivalentTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#FunctionalEquivalentTable
TreeTable ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#TreeTable
- csvw:name:
- "TreeTable"
- csvw:url:
- "trees.csv"
- dc:hasVersion:
- http://cldf.clld.org/v1.2
ParameterNetwork ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#ParameterNetwork
- csvw:name:
- "ParameterNetwork"
- csvw:url:
- "parameter_network.csv"
- dc:hasVersion:
- http://cldf.clld.org/v1.3
Properties ⇫
ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#id
A unique identifier for a row in a table.
To allow identifiers to be used as path components of URLs, IDs must contain only alphanumeric characters, underscores and hyphens.
- dc:extent:
- singlevalued
- csvw:name:
- "ID"
- csvw:datatype:
- {"base": "string", "format": "[a-zA-Z0-9_\\-]+"}
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/identifier
- rdfs:seeAlso:
- http://dublincore.org/documents/dcmi-terms/#http://purl.org/dc/elements/1.1/identifier
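The format constraint above can be checked with a regular expression. The following is a minimal sketch, assuming the whole cell value must match the pattern; the helper name `is_valid_id` is hypothetical and not part of CLDF.

```python
import re

# Pattern taken from the csvw:datatype "format" above (JSON escaping removed).
ID_PATTERN = re.compile(r"[a-zA-Z0-9_\-]+")


def is_valid_id(value: str) -> bool:
    """Return True if the complete string matches the ID format."""
    return ID_PATTERN.fullmatch(value) is not None
```

Using `fullmatch` rather than `search` ensures that an ID such as `a b` is rejected even though it contains valid substrings.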
Source ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#source
List of source specifications of the form <source_ID>[<pages>], e.g. http://glottolog.org/resource/reference/id/318814[34], or meier2015[3-12], where meier2015 is a citation key in the accompanying BibTeX file.
- csvw:name:
- "Source"
- csvw:separator:
- ";"
- csvw:datatype:
- {"base": "string"}
- dc:extent:
- multivalued
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/source
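As an illustration, a raw cell using the ";" separator can be split into pairs of source ID and bracketed context. This is a sketch only: the function name `parse_sources` is hypothetical, and the sketch assumes the bracketed part is optional and always closes the specification.

```python
import re
from typing import List, Optional, Tuple

# A source ID, optionally followed by a bracketed context such as "[3-12]".
SOURCE_SPEC = re.compile(r"^(?P<id>.+?)(?:\[(?P<context>[^\]]*)\])?$")


def parse_sources(cell: str) -> List[Tuple[str, Optional[str]]]:
    """Split a raw Source cell on ';' and separate each ID from its context."""
    specs = []
    for item in cell.split(";"):
        item = item.strip()
        if not item:
            continue  # ignore empty items, e.g. from a trailing separator
        match = SOURCE_SPEC.match(item)
        specs.append((match.group("id"), match.group("context")))
    return specs
```

A specification without brackets yields `None` as its context, so callers can tell "no context given" apart from an empty pair of brackets.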
Position ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#position
A position represents the placement of an item in a series or sequence of items. Although an integer is the recommended datatype, any datatype that supports a total ordering (where the order is transparent, such as alphabetic order for strings) is acceptable. It is also possible to have a list-valued column for this property, which can be useful for implementing multi-level orderings. In such cases, the typical order for tuples is assumed.
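A multi-level ordering with list-valued positions might be handled as follows. The sketch reads "the typical order for tuples" as lexicographic comparison, which is how Python compares tuples; the rows and their IDs are invented for illustration.

```python
# Hypothetical rows whose Position column is list-valued (two ordering levels).
rows = [
    {"ID": "c", "Position": [2, 1]},
    {"ID": "a", "Position": [1, 2]},
    {"ID": "b", "Position": [1, 10]},
]

# Tuples compare element by element, so the first level takes precedence
# and the second level breaks ties.
ordered = sorted(rows, key=lambda row: tuple(row["Position"]))
ordered_ids = [row["ID"] for row in ordered]
```

Integer positions avoid the pitfall of string ordering, where "10" would sort before "2".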
ISO639P3code ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#iso639P3code
- owl:equivalentProperty:
- http://lexvo.org/ontology#iso639P3Code
- rdfs:seeAlso:
- http://www.sil.org/ISO639-3/
- csvw:name:
- "ISO639P3code"
- csvw:datatype:
- {"base": "string", "format": "[a-z]{3}"}
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/identifier
Glottocode ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#glottocode
A Glottocode denoting a languoid described in Glottolog.
- csvw:name:
- "Glottocode"
- csvw:datatype:
- {"base": "string", "format": "[a-z0-9]{4}[1-9][0-9]{3}"}
- csvw:valueUrl:
- "http://glottolog.org/resource/languoid/id/{Glottocode}"
- rdfs:seeAlso:
- https://content.iospress.com/articles/semantic-web/sw212843
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/identifier
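The format and the valueUrl template above can be combined to validate a Glottocode and expand it to its Glottolog URL. This is a minimal sketch; the function name `glottocode_url` is hypothetical, and the template expansion is done with plain string formatting rather than a full URI template library.

```python
import re

# Pattern and URL template copied from the property description above.
GLOTTOCODE = re.compile(r"[a-z0-9]{4}[1-9][0-9]{3}")
VALUE_URL = "http://glottolog.org/resource/languoid/id/{Glottocode}"


def glottocode_url(code: str) -> str:
    """Validate a Glottocode and expand the valueUrl template for it."""
    if GLOTTOCODE.fullmatch(code) is None:
        raise ValueError(f"not a well-formed Glottocode: {code!r}")
    return VALUE_URL.format(Glottocode=code)
```

The check is purely syntactic: a string can match the pattern without denoting a languoid that exists in Glottolog.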
Parent_Language_Glottocode ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#parentLanguageGlottocode
A Glottocode denoting the language-level languoid that is a parent languoid of the languoid described by the row in LanguageTable.
- csvw:name:
- "Parent_Language_Glottocode"
- csvw:datatype:
- {"base": "string", "format": "[a-z0-9]{4}[1-9][0-9]{3}"}
- csvw:valueUrl:
- "http://glottolog.org/resource/languoid/id/{Glottocode}"
- rdfs:seeAlso:
- https://glottolog.org/glottolog/glottologinformation
- dc:hasVersion:
- http://cldf.clld.org/v1.3
Latitude ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#latitude
A latitude in the WGS 84 standard coordinate system, specified as a decimal number of degrees.
- csvw:name:
- "Latitude"
- csvw:datatype:
- {"base": "decimal", "minimum": -90, "maximum": 90}
- rdfs:subPropertyOf:
- http://www.w3.org/2003/01/geo/wgs84_pos#lat
Longitude ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#longitude
A longitude in the WGS 84 standard coordinate system, specified as a decimal number of degrees.
- csvw:name:
- "Longitude"
- csvw:datatype:
- {"base": "decimal", "minimum": -180, "maximum": 180}
- rdfs:subPropertyOf:
- http://www.w3.org/2003/01/geo/wgs84_pos#long
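The minimum and maximum bounds given for Latitude and Longitude can be enforced when reading raw cell values. The sketch below is one possible way to do this; the helper names are hypothetical, and `Decimal` is used so that the decimal representation in the file is preserved exactly.

```python
from decimal import Decimal, InvalidOperation


def parse_coordinate(raw: str, minimum: int, maximum: int) -> Decimal:
    """Parse a decimal number of degrees and enforce the inclusive bounds."""
    try:
        value = Decimal(raw)
    except InvalidOperation:
        raise ValueError(f"not a decimal number: {raw!r}") from None
    # Reject NaN and infinities before comparing against the bounds.
    if not value.is_finite() or not (minimum <= value <= maximum):
        raise ValueError(f"{raw!r} is outside [{minimum}, {maximum}]")
    return value


def parse_latitude(raw: str) -> Decimal:
    return parse_coordinate(raw, -90, 90)


def parse_longitude(raw: str) -> Decimal:
    return parse_coordinate(raw, -180, 180)
```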
ColumnSpec ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#columnSpec
A column specification given as a JSON representation of a CSVW column description. This column specification may be used by CLDF consumers to read a parameter's value as typed data.
Note that a CSVW datatype description is not sufficient, because parsing a string value must also be informed by the column properties null and separator.
- csvw:name:
- "ColumnSpec"
- csvw:datatype:
- {"base": "json"}
- dc:extent:
- singlevalued
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/type
- dc:hasVersion:
- http://cldf.clld.org/v1.2
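To illustrate why the null and separator properties matter, the following sketch reads a raw string value according to a column specification. It is a deliberately simplified reading of CSVW, not a complete implementation: only the integer and decimal base datatypes are converted, and the function name `read_value` is hypothetical.

```python
import json
from decimal import Decimal

# Only two base datatypes are converted in this sketch; others stay strings.
CONVERTERS = {"integer": int, "decimal": Decimal}


def read_value(raw: str, column_spec: str):
    """Read a raw string as typed data using a JSON column description."""
    spec = json.loads(column_spec)
    nulls = spec.get("null", "")
    if isinstance(nulls, str):
        nulls = [nulls]  # CSVW allows a single string or a list of strings
    datatype = spec.get("datatype", "string")
    base = datatype.get("base", "string") if isinstance(datatype, dict) else datatype
    convert = CONVERTERS.get(base, str)

    def convert_item(item: str):
        return None if item in nulls else convert(item)

    separator = spec.get("separator")
    if separator is None:
        return convert_item(raw)
    if raw == "":
        return []
    return [convert_item(part) for part in raw.split(separator)]
```

With a datatype alone, the string `?` in an integer column would fail to parse; with `"null": ["?", ""]` in the specification it is read as a missing value instead.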
Citation ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#citation
A full citation for a citeable unit of a dataset, preferably following the rules of the Unified Style Sheet for Linguistics Journals or the best practices for Linguistics Data Citation.
Edge_Is_Directed ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#edgeIsDirected
Flag signaling whether an edge in a graph is directed or not.
- csvw:name:
- "Edge_Is_Directed"
- rdfs:seeAlso:
- https://en.wikipedia.org/wiki/Mixed_graph
- csvw:datatype:
- {"base": "boolean", "format": "Yes|No"}
- dc:hasVersion:
- http://cldf.clld.org/v1.3
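In CSVW, a boolean format of the form `Yes|No` names the string for true first and the string for false second. A small sketch of reading such a flag follows; the function name `parse_flag` is hypothetical, and treating the empty string as a missing value is an assumption of this sketch.

```python
from typing import Optional


def parse_flag(raw: str, fmt: str = "Yes|No") -> Optional[bool]:
    """Map a raw cell to True/False according to a CSVW boolean format."""
    true_value, false_value = fmt.split("|")
    if raw == "":
        return None  # assumption: an empty cell means "not specified"
    if raw == true_value:
        return True
    if raw == false_value:
        return False
    raise ValueError(f"expected {true_value!r} or {false_value!r}, got {raw!r}")
```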
Tree_Type ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#treeType
The type of a tree (summary or sample) describes how the tree can be used.
Summary (or consensus) trees can be analysed in isolation and should have type summary.
Trees resulting from a method that creates multiple trees, and which thus should be analysed as a whole (or sampled appropriately), should have type sample.
- csvw:name:
- "Tree_Type"
- csvw:datatype:
- {"base": "string", "format": "summary|sample"}
- dc:hasVersion:
- http://cldf.clld.org/v1.2
Tree_Is_Rooted ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#treeIsRooted
Flag signaling whether a tree is rooted or not.
- csvw:name:
- "Tree_Is_Rooted"
- rdfs:seeAlso:
- https://en.wikipedia.org/wiki/Tree_(graph_theory)#Rooted_tree
- csvw:datatype:
- {"base": "boolean", "format": "Yes|No"}
- dc:hasVersion:
- http://cldf.clld.org/v1.2
Tree_Branch_Length_Unit ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#treeBranchLengthUnit
The unit used to measure evolutionary time in phylogenetic trees.
- csvw:name:
- "Tree_Branch_Length_Unit"
- csvw:datatype:
- {"base": "string", "format": "change|substitutions|years|centuries|millennia"}
- dc:hasVersion:
- http://cldf.clld.org/v1.2
Media_Type ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#mediaType
A media type (also known as a Multipurpose Internet Mail Extensions or MIME type) as defined by IETF's RFC 6838.
- csvw:name:
- "Media_Type"
- rdfs:subPropertyOf:
- http://www.w3.org/ns/dcat#mediaType
- csvw:datatype:
- {"base": "string", "format": "[^/]+/.+"}
- rdfs:seeAlso:
- http://cldf.clld.org/v1.0/terms.rdf#
- dc:hasVersion:
- http://cldf.clld.org/v1.1
Download_URL ⇫ ¶
- csvw:name:
- "Download_URL"
- csvw:datatype:
- {"base": "anyURI"}
- rdfs:subPropertyOf:
- http://www.w3.org/ns/dcat#downloadUrl
- dc:hasVersion:
- http://cldf.clld.org/v1.1
Analyzed_Word ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#analyzedWord
The morpheme-pattern analysis of a word in an example.
- csvw:name:
- "Analyzed_Word"
- csvw:separator:
- "\t"
- dc:extent:
- multivalued
- rdfs:subPropertyOf:
- http://purl.org/linguistics/gold/GrammarUnit
Gloss ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#gloss
A gloss corresponding to the morpheme-pattern analysis of a word in an example.
- csvw:name:
- "Gloss"
- csvw:separator:
- "\t"
- dc:extent:
- multivalued
- rdfs:seeAlso:
- http://purl.org/linguistics/gold/hasGlosses
Translated_Text ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#translatedText
The translated text of an example.
- csvw:name:
- "Translated_Text"
- rdfs:subPropertyOf:
- http://purl.org/linguistics/gold/hasTranslationLine
LGR_Conformance ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#lgrConformance
The level of conformance of the example with the Leipzig Glossing Rules.
The following levels are distinguished:
- WORD_ALIGNED: Analyzed text and glosses obey LGR rule 1, "word-by-word alignment".
- MORPHEME_ALIGNED: Analyzed text and glosses obey LGR rule 2, "morpheme-by-morpheme correspondence".
The absence of information about LGR conformance should be signaled with an empty string, i.e. a null value for the property.
While more information is needed to assess how to interpret IGT (e.g. whether rule 4a is followed to group gloss elements for unsegmentable morphemes), the two levels considered here are essential for decisions about automated re-use.
- rdfs:seeAlso:
- https://doi.org/10.5281/zenodo.10275705
- csvw:name:
- "LGR_Conformance"
- csvw:datatype:
- {"base": "string", "format": "WORD_ALIGNED|MORPHEME_ALIGNED"}
- rdfs:subPropertyOf:
- http://purl.org/dc/terms/conformsTo
- dc:hasVersion:
- http://cldf.clld.org/v1.3
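A consumer might want to cross-check a declared conformance level against the tab-separated Analyzed_Word and Gloss values. The heuristic below is only a sketch: it counts words for rule 1 and counts morphemes per word for rule 2, treating "-" and "=" as morpheme boundaries, which is an assumption of this example. The function name and the sample strings are invented.

```python
import re
from typing import List, Optional

# Assumption: hyphens and clitic markers ("=") delimit morphemes.
BOUNDARY = re.compile(r"[-=]")


def lgr_conformance(analyzed: List[str], glosses: List[str]) -> Optional[str]:
    """Return the strictest level the two aligned lists appear to satisfy."""
    if len(analyzed) != len(glosses):
        return None  # not even word-by-word aligned
    for word, gloss in zip(analyzed, glosses):
        if len(BOUNDARY.split(word)) != len(BOUNDARY.split(gloss)):
            return "WORD_ALIGNED"
    return "MORPHEME_ALIGNED"


# Invented cell contents; both properties use a tab as separator.
analyzed_words = "ab-c\tde".split("\t")
gloss_words = "X-Y\tZ".split("\t")
level = lgr_conformance(analyzed_words, gloss_words)
```

A result of `None` corresponds to the case where no conformance level should be declared and the property is left empty.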
Grammaticality_Judgement ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#grammaticalityJudgement
A judgement about the (un)grammaticality of the example.
A non-null value for this property flags an example as ungrammatical or unacceptable. The actual string value is the typographical symbol(s) or text which is to be used to mark the example when formatting it in text (e.g. *).
Note: Ungrammatical examples should link (via languageReference) to special item(s) in LanguageTable with an empty Glottocode to prevent data aggregators from inadvertently assigning such an example to a proper language (if they fail to honour grammaticalityJudgement).
- csvw:name:
- "Grammaticality_Judgement"
- dc:hasVersion:
- http://cldf.clld.org/v1.3
Value ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#value
The value (a.k.a. datapoint or measurement) of a language for a structural feature.
For features with a limited, discrete set of valid values (a.k.a. categorical variables) it is recommended to relate items of ValueTable to the respective code in CodeTable.
- csvw:name:
- "Value"
- csvw:null:
- ["?", ""]
- rdfs:subPropertyOf:
- http://purl.org/linguistics/gold/feature
Alignment ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#alignment
An alignment represents segments which are grouped into a common cognate set as a matrix in which cognate segments are placed in the same column while gap characters are introduced in those sound sequences missing a certain counterpart.
- dc:extent:
- multivalued
- csvw:name:
- "Alignment"
- csvw:separator:
- " "
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
- dc:source:
- List2014d
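The matrix view of alignments can be made concrete by splitting each space-separated Alignment value into a row and then reading the columns. This is a sketch under stated assumptions: the sample sound sequences are invented, "-" is used as the gap character, and the function name `alignment_columns` is hypothetical.

```python
from typing import List, Tuple


def alignment_columns(alignments: List[str]) -> List[Tuple[str, ...]]:
    """Turn aligned, space-separated rows into a list of matrix columns."""
    rows = [cell.split(" ") for cell in alignments]
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all alignments in a cognate set need the same length")
    # zip(*rows) transposes the matrix: one tuple per column.
    return list(zip(*rows))


# Invented alignments for three forms in one cognate set.
columns = alignment_columns(["t a k", "t a -", "d a k"])
```

Each resulting column groups the segments judged to correspond to one another, with the gap character marking forms that lack a counterpart.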
Segment_Slice ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#segmentSlice
List of segment indices or segment ranges forming the target of a partial cognacy judgement.
- dc:extent:
- multivalued
- csvw:name:
- "Segment_Slice"
- csvw:datatype:
- {"base": "string", "format": "\\d+(:\\d+)?"}
- csvw:separator:
- " "
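A Segment_Slice value can be resolved against the Segments of a form roughly as follows. The sketch assumes that indices are 1-based and that a range `i:j` includes both end points; this interpretation, the function name `resolve_slice` and the sample segments are assumptions of the example and are not stated in the description above.

```python
from typing import List


def resolve_slice(segments: List[str], spec: str) -> List[str]:
    """Select the segments addressed by a space-separated slice specification."""
    picked = []
    for item in spec.split(" "):
        start, _, end = item.partition(":")
        first = int(start)
        last = int(end) if end else first  # a bare index is a one-element range
        # Assumption: 1-based indices, inclusive ranges.
        picked.extend(segments[first - 1:last])
    return picked


# Invented segments of a form; the slice picks segments 1 to 2 and segment 4.
selected = resolve_slice(["t", "a", "k", "o"], "1:2 4")
```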
Form ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#form
A lexical unit is any collection of word forms corresponding to a certain meaning which can be found in comparative datasets.
Ideally, a lexical unit would just present itself as one single form. However, in practice, scholars often list speech variants and at times even non-cognate alternatives for their preferred form.
Motivation_Structure ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#motivationStructure
The motivation structure of a word form gives glosses for each of its morphemes. In this it is similar to an instance of interlinear glossed text which describes the underlying semantic motivation for a given word form.
As an example, consider Chinese shùpí "bark (of a tree)" which is a compound consisting of shù "tree" and pí "skin", and whose motivation structure could be rendered as tree skin.
- csvw:name:
- "Motivation_Structure"
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
Prosodic_Structure ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#prosodicStructure
The prosodic structure of a word form labels similar prosodic contexts which may recur even within the same word. Prosodic structures for a given language may have an underlying template that describes which syllables are possible. In Chinese dialects, for example, one could describe the basic template of most syllables as consisting of initial, medial, nucleus, coda, and tone, of which the nucleus and the tone as a suprasegmental element are usually the only required elements.
- csvw:name:
- "Prosodic_Structure"
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
Segments ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#segments
A list of segments (aka a sound sequence) is understood as the strict segmental representation of a form unit of a language, which is usually given in phonetic transcription. Suprasegmental elements, like tone or accent, of sound sequences are usually represented in a sequential form, although they are usually co-articulated along with the segmental elements of a sound sequence. Alternatively, suprasegmental aspects could also be represented as part of the prosodic structure of a word form.
- csvw:name:
- "Segments"
- csvw:separator:
- " "
- dc:extent:
- multivalued
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
Reference Properties ⇫
Media_ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#speakerArea
An identifier referencing a media resource by providing a foreign key to MediaTable.
This property can be used in LanguageTable to point to a media resource describing the speaker area of a language, i.e. the geographic area where the speakers of the language live.
The linked media resource may be an image of a map depicting the area, or some other multimedia content for human consumption. But it may also be a GeoJSON resource (i.e. a media resource with mediaType application/geo+json).
In the latter case, the GeoJSON object MUST contain a feature with a geometry of type Polygon or MultiPolygon and a key cldf:languageReference in its properties object with the linking language's id as value.
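The requirement on GeoJSON resources can be checked with a short lookup. The sketch below searches a parsed GeoJSON object for a feature that carries the linking language's id and a polygonal geometry, using the geometry type names Polygon and MultiPolygon as spelled in the GeoJSON specification; the function name `find_speaker_area` and the sample data are hypothetical.

```python
from typing import Optional

POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def find_speaker_area(geojson: dict, language_id: str) -> Optional[dict]:
    """Return the first polygonal feature linked to the given language id."""
    if geojson.get("type") == "FeatureCollection":
        features = geojson.get("features", [])
    else:
        features = [geojson]  # a single Feature object
    for feature in features:
        geometry = feature.get("geometry") or {}
        properties = feature.get("properties") or {}
        if (
            geometry.get("type") in POLYGON_TYPES
            and properties.get("cldf:languageReference") == language_id
        ):
            return feature
    return None


# Invented example: one feature linked to a language with id "lang1".
example = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"cldf:languageReference": "lang1"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
            },
        }
    ],
}
area = find_speaker_area(example, "lang1")
```

A return value of `None` indicates that the resource does not satisfy the requirement for the given language.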
FunctionalEquivalentset_ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#functionalEquivalentsetReference
A functional equivalent set is a group of strings from different languages that express a similar function. This is an identifier referencing a functional equivalent set either
- by providing a foreign key to FunctionalEquivalentsetTable, or
- by using a known encoding scheme.
Concepticon_ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#concepticonReference
An identifier of a Concepticon concept set.
A concept set groups a number of concept labels which are used in different questionnaires and were judged to denote the same concept despite potential differences among the concrete concept labels (be it their spelling, or the language in which they were originally created).
- csvw:name:
- "Concepticon_ID"
- csvw:datatype:
- {"base": "string", "format": "[0-9]+"}
- csvw:valueUrl:
- "http://concepticon.clld.org/parameters/{Concepticon_ID}"
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
- dc:source:
- List2016a
- rdfs:seeAlso:
- http://concepticon.clld.org
CLTS_ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#cltsReference
An identifier of a sound described in the CLTS dataset.
A sound identifier is the last path component of the sound's URL at https://clts.clld.org/parameters, e.g. short_neutral_tone for https://clts.clld.org/parameters/short_neutral_tone.
References a sound in the Cross-Linguistic Transcription Systems database. Suitable to mark parameters as phonemes, and consequently values as elements of phoneme inventories. E.g. voiced_bilabial_nasal_consonant.
To mark sounds that cannot be mapped to any sound defined in the current CLTS version, the ID "NA", corresponding to the "unknown" sound https://clts.clld.org/parameters/NA, should be used.
- csvw:name:
- "CLTS_ID"
- csvw:datatype:
- {"base": "string", "format": "[a-z_-]+|NA"}
- csvw:valueUrl:
- "https://clts.clld.org/parameters/{CLTS_ID}"
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
- rdfs:seeAlso:
- https://clts.clld.org
GBIF_ID ⇫ ¶
http://cldf.clld.org/v1.0/terms.rdf#gbifReference
A numeric identifier for a unit in GBIF's Backbone Taxonomy.
References a taxonomic unit in GBIF's Backbone Taxonomy. Can be used for example in ParameterTable to mark a lexical concept as biological species. E.g. 5219404.
- csvw:name:
- "GBIF_ID"
- csvw:datatype:
- {"base": "string", "format": "[0-9]+"}
- csvw:valueUrl:
- "https://www.gbif.org/species/{GBIF_ID}"
- rdfs:range:
- http://www.w3.org/2000/01/rdf-schema#Literal
- rdfs:seeAlso:
- https://www.gbif.org/dataset/d7dddbf4-2cf0-4f39-9b2a-bb099caae36c#description