Merge pull request #41 from zazuko/lang-string
Filtering language tagged strings
Showing 12 changed files with 386 additions and 17 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
# Literals with language tags | ||
|
||
The `.out()` method can be restricted to literals in specific languages by passing `{ language }` as its second parameter. | ||
|
||
When that parameter is defined, only string literal nodes will be returned. | ||
|
||
For any given subject, all strings in the chosen language will be returned. | ||
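
The filtering rule described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the term objects are simplified stand-ins for RDF/JS terms (their `datatype` is a bare IRI string here), `stringsInLanguage` is a hypothetical helper rather than part of clownface, and only the exact-match case is covered.

```js
// Datatype IRIs that mark a literal as a string
const XSD_STRING = 'http://www.w3.org/2001/XMLSchema#string'
const RDF_LANG_STRING = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#langString'
const XSD_INTEGER = 'http://www.w3.org/2001/XMLSchema#integer'

// Simplified stand-ins for the objects of a single subject (assumption)
const objects = [
  { termType: 'NamedNode', value: 'http://example.org/apple' },
  { termType: 'Literal', value: '42', language: '', datatype: XSD_INTEGER },
  { termType: 'Literal', value: 'apple', language: 'en', datatype: RDF_LANG_STRING },
  { termType: 'Literal', value: 'Apfel', language: 'de', datatype: RDF_LANG_STRING },
  { termType: 'Literal', value: 'Apfelfrucht', language: 'de', datatype: RDF_LANG_STRING }
]

// Hypothetical helper: keep only string literals whose tag equals the requested one
function stringsInLanguage (terms, language) {
  return terms.filter(term =>
    term.termType === 'Literal' &&
    (term.datatype === XSD_STRING || term.datatype === RDF_LANG_STRING) &&
    term.language.toLowerCase() === language.toLowerCase())
}

// Named nodes and non-string literals are dropped; every German string is kept
const germanLabels = stringsInLanguage(objects, 'de').map(term => term.value)
console.log(germanLabels)
```

Comparing tags in lower case mirrors the case-insensitive lookup in the language mapper added by this commit.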
|
||
## Finding specific language | ||
|
||
To find string literals in a given language, pass an object with a string `language` key as the second argument. | ||
|
||
<run-kit> | ||
|
||
```js | ||
const cf = require('clownface') | ||
const RDF = require('@rdfjs/dataset') | ||
const { literal } = require('@rdfjs/data-model') | ||
const { rdf, rdfs } = require('@tpluscode/rdf-ns-builders') | ||
|
||
// create two labels for a resource | ||
const apple = cf({ dataset: RDF.dataset() }) | ||
.node(rdf.Resource) | ||
.addOut(rdfs.label, literal('apple', 'en')) | ||
.addOut(rdfs.label, literal('Apfel', 'de')) | ||
|
||
// find German label | ||
apple.out(rdfs.label, { language: 'de' }).value | ||
``` | ||
|
||
</run-kit> | ||
|
||
## Finding plain literals | ||
|
||
Using an empty string for the `language` parameter will find strings without a language. | ||
|
||
<run-kit> | ||
|
||
```js | ||
const cf = require('clownface') | ||
const RDF = require('@rdfjs/dataset') | ||
const { literal } = require('@rdfjs/data-model') | ||
const { rdf, rdfs } = require('@tpluscode/rdf-ns-builders') | ||
|
||
// create two labels for a resource | ||
const apple = cf({ dataset: RDF.dataset() }) | ||
.node(rdf.Resource) | ||
.addOut(rdfs.label, literal('apple')) | ||
.addOut(rdfs.label, literal('Apfel', 'de')) | ||
|
||
// find literal without language tag | ||
apple.out(rdfs.label, { language: '' }).value | ||
``` | ||
|
||
</run-kit> | ||
|
||
## Finding from a choice of potential languages | ||
|
||
It is possible to look up literals in several alternative languages by providing an array of languages instead. The first language that matches any literal will be used. | ||
|
||
<run-kit> | ||
|
||
```js | ||
const cf = require('clownface') | ||
const RDF = require('@rdfjs/dataset') | ||
const { literal } = require('@rdfjs/data-model') | ||
const { rdf, rdfs } = require('@tpluscode/rdf-ns-builders') | ||
|
||
// create two labels for a resource | ||
const apple = cf({ dataset: RDF.dataset() }) | ||
.node(rdf.Resource) | ||
.addOut(rdfs.label, literal('apple', 'en')) | ||
.addOut(rdfs.label, literal('Apfel', 'de')) | ||
|
||
// there is no French translation so English will be returned | ||
apple.out(rdfs.label, { language: ['fr', 'en'] }).value | ||
``` | ||
|
||
</run-kit> | ||
|
||
A wildcard (asterisk) can also be used to fall back to an arbitrary literal if the preceding choices did not yield any results. It looks similar to the previous example. | ||
|
||
```js | ||
apple.out(rdfs.label, { language: ['fr', '*'] }).value | ||
``` | ||
|
||
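!> The result can be either English or German. Which one is returned depends on the order in which the literals are read from the dataset, so it should not be relied upon. | ||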
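
Judging by the language mapper added in this commit, the wildcard returns the literals of whichever language was encountered first. The sketch below illustrates that behaviour with a `Map`, which iterates in insertion order. The literal objects are simplified stand-ins and `firstLanguageGroup` is a hypothetical name, not a clownface function.

```js
// Group literals by lower-cased language tag and return the first group
function firstLanguageGroup (literals) {
  const byLanguage = new Map()

  for (const literal of literals) {
    const language = literal.language.toLowerCase()
    if (!byLanguage.has(language)) byLanguage.set(language, [])
    byLanguage.get(language).push(literal)
  }

  // a Map iterates in insertion order, so this is the first language seen
  const [first] = byLanguage.values()
  return first
}

// Simplified stand-ins for literals; only `value` and `language` matter here
const english = { value: 'apple', language: 'en' }
const german = { value: 'Apfel', language: 'de' }

// The same two literals, read in a different order, give a different winner
const englishFirst = firstLanguageGroup([english, german])
const germanFirst = firstLanguageGroup([german, english])
console.log(englishFirst[0].value, germanFirst[0].value)
```

Because a dataset makes no promise about the order of its quads, code that needs a particular language should name it before the wildcard.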
|
||
## Matching subtags | ||
|
||
Literals tagged with [subtags](https://tools.ietf.org/html/bcp47#section-2.2), such as `de-CH`, can be matched to a more general language. The same applies to tags with any number of subtags, because the lookup falls back to a "starts with" match. | ||
|
||
For example, in the snippet below the more specific tag `de-CH-1996` is matched by the more general Swiss German `de-CH`. | ||
|
||
<run-kit> | ||
|
||
```js | ||
const cf = require('clownface') | ||
const RDF = require('@rdfjs/dataset') | ||
const { literal } = require('@rdfjs/data-model') | ||
const { rdf, rdfs } = require('@tpluscode/rdf-ns-builders') | ||
|
||
// create two labels for a resource | ||
const bicycle = cf({ dataset: RDF.dataset() }) | ||
.node(rdf.Resource) | ||
.addOut(rdfs.label, literal('Fahrrad', 'de')) | ||
.addOut(rdfs.label, literal('Velo', 'de-CH-1996')) | ||
|
||
// finds a Swiss translation | ||
bicycle.out(rdfs.label, { language: 'de-CH' }).value | ||
``` | ||
|
||
</run-kit> | ||
|
||
!> However, an exact match always takes precedence over a subtag match. | ||
|
||
<run-kit> | ||
|
||
```js | ||
const cf = require('clownface') | ||
const RDF = require('@rdfjs/dataset') | ||
const { literal } = require('@rdfjs/data-model') | ||
const { rdf, rdfs } = require('@tpluscode/rdf-ns-builders') | ||
|
||
// create two labels for a resource | ||
const bicycle = cf({ dataset: RDF.dataset() }) | ||
.node(rdf.Resource) | ||
.addOut(rdfs.label, literal('Fahrrad', 'de')) | ||
.addOut(rdfs.label, literal('Velo', 'de-CH-1996')) | ||
|
||
// finds the standard German label | ||
bicycle.out(rdfs.label, { language: 'de' }).value | ||
``` | ||
|
||
</run-kit> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
const RDF = require('@rdfjs/data-model') | ||
const namespace = require('./namespace') | ||
|
||
const ns = namespace(RDF) | ||
|
||
function mapLiteralsByLanguage (map, current) { | ||
const notLiteral = current.termType !== 'Literal' | ||
// true for rdf:langString and xsd:string literals | ||
const isStringLiteral = ns.langString.equals(current.datatype) || ns.xsd.string.equals(current.datatype) | ||
|
||
if (notLiteral || !isStringLiteral) return map | ||
|
||
const language = current.language.toLowerCase() | ||
|
||
if (map.has(language)) { | ||
map.get(language).push(current) | ||
} else { | ||
map.set(language, [current]) | ||
} | ||
|
||
return map | ||
} | ||
|
||
function createLanguageMapper (objects) { | ||
const literalsByLanguage = objects.reduce(mapLiteralsByLanguage, new Map()) | ||
const langMapEntries = [...literalsByLanguage.entries()] | ||
|
||
return language => { | ||
const languageLowerCase = language.toLowerCase() | ||
|
||
if (languageLowerCase === '*') { | ||
return langMapEntries[0] && langMapEntries[0][1] | ||
} | ||
|
||
const exactMatch = literalsByLanguage.get(languageLowerCase) | ||
if (exactMatch) { | ||
return exactMatch | ||
} | ||
|
||
const secondaryMatches = langMapEntries.find(([entryLanguage]) => entryLanguage.startsWith(languageLowerCase)) | ||
|
||
return secondaryMatches && secondaryMatches[1] | ||
} | ||
} | ||
|
||
module.exports = { | ||
createLanguageMapper | ||
} |
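
The diff shown here does not include the code that calls `createLanguageMapper`. As a rough sketch of how a list of preferred languages might be resolved with it, the block below uses a condensed copy of the mapper (the datatype check is left out because the stand-in terms are all strings) and a hypothetical `pickByPreference` helper that is not taken from the commit.

```js
// Condensed copy of the mapper above, without the datatype check
function createLanguageMapper (objects) {
  const literalsByLanguage = objects.reduce((map, current) => {
    const language = current.language.toLowerCase()
    if (!map.has(language)) map.set(language, [])
    map.get(language).push(current)
    return map
  }, new Map())
  const langMapEntries = [...literalsByLanguage.entries()]

  return language => {
    const languageLowerCase = language.toLowerCase()

    // wildcard: literals of the first language that was seen
    if (languageLowerCase === '*') {
      return langMapEntries[0] && langMapEntries[0][1]
    }

    // an exact match wins over a "starts with" match
    const exactMatch = literalsByLanguage.get(languageLowerCase)
    if (exactMatch) return exactMatch

    const prefixMatch = langMapEntries.find(([entryLanguage]) => entryLanguage.startsWith(languageLowerCase))
    return prefixMatch && prefixMatch[1]
  }
}

// Hypothetical helper: try each preferred language in order, return the first hit
function pickByPreference (objects, languages) {
  const mapper = createLanguageMapper(objects)

  for (const language of languages) {
    const found = mapper(language)
    if (found) return found
  }

  return []
}

// Simplified stand-ins for string literals (assumption)
const labels = [
  { value: 'Fahrrad', language: 'de' },
  { value: 'Velo', language: 'de-CH-1996' },
  { value: 'bicycle', language: 'en' }
]

const swiss = pickByPreference(labels, ['fr', 'de-CH']).map(label => label.value)
const standardGerman = pickByPreference(labels, ['DE']).map(label => label.value)
const noMatch = pickByPreference(labels, ['fr', 'it'])
console.log(swiss, standardGerman, noMatch)
```

Building the mapper once per set of objects keeps each lookup cheap, since every requested language is resolved against the same pre-grouped `Map`.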
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,16 @@ | ||
+const namespace = require('@rdfjs/namespace') | ||
|
||
-const ns = (factory) => ({ | ||
-first: factory.namedNode('http://www.w3.org/1999/02/22-rdf-syntax-ns#first'), | ||
-nil: factory.namedNode('http://www.w3.org/1999/02/22-rdf-syntax-ns#nil'), | ||
-rest: factory.namedNode('http://www.w3.org/1999/02/22-rdf-syntax-ns#rest') | ||
-}) | ||
+const ns = (factory) => { | ||
+const xsd = namespace('http://www.w3.org/2001/XMLSchema#', { factory }) | ||
+const rdf = namespace('http://www.w3.org/1999/02/22-rdf-syntax-ns#', { factory }) | ||
|
||
+return { | ||
+first: rdf.first, | ||
+nil: rdf.nil, | ||
+rest: rdf.rest, | ||
+langString: rdf.langString, | ||
+xsd | ||
+} | ||
+} | ||
|
||
module.exports = ns |
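
The change above replaces hand-written `factory.namedNode(...)` calls with property access on a namespace object. The following sketch shows one way such a builder can work, assuming property access is translated into named nodes through a `Proxy`. Both `namespace` and `factory` here are hypothetical minimal stand-ins, not the `@rdfjs/namespace` or `@rdfjs/data-model` sources.

```js
// Hypothetical stand-in for a namespace builder: every property read
// becomes a named node whose IRI is the base IRI plus the property name
function namespace (baseIRI, { factory }) {
  return new Proxy({}, {
    get: (target, property) => factory.namedNode(baseIRI + String(property))
  })
}

// Hypothetical minimal factory producing term-like objects
const factory = {
  namedNode: value => ({
    termType: 'NamedNode',
    value,
    equals (other) {
      return !!other && other.termType === 'NamedNode' && other.value === value
    }
  })
}

const rdf = namespace('http://www.w3.org/1999/02/22-rdf-syntax-ns#', { factory })
const xsd = namespace('http://www.w3.org/2001/XMLSchema#', { factory })

// Terms are created on demand, so rdf.langString needs no explicit declaration
const langString = rdf.langString
console.log(langString.value)
console.log(xsd.string.equals(langString))
```

With a builder of this kind, exposing the whole `xsd` namespace costs nothing up front, which is why the new `ns` can return it alongside the individual `rdf` terms.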