Mount Everest elevation missing - due to Wikidata reference? #767

Open
AJKellmann opened this issue Sep 30, 2024 · 7 comments

@AJKellmann

Issue validity

The issue still persists, here is the link:
http://dief.tools.dbpedia.org/server/extraction/en/extract?title=Mount+Everest&revid=&format=n-triples&extractors=custom

Error Description

When I wrote a SPARQL query to find the highest mountain in the world, I realized that Mount Everest was missing from the results. The issue appears to be due to the lack of an http://dbpedia.org/ontology/elevation property in the extracted data for Mount Everest.

The Wikipedia page contains the correct height, but it is referenced from Wikidata rather than being directly included in the Infobox as is common for other mountains.
[Screenshot: Wikipedia]
This might also affect other entries in Wikipedia that load their values from Wikidata instead of stating it explicitly.

Pinpointing the source of the error

The issue was discovered in the SPARQL endpoint at http://dbpedia.org/sparql.
Here is the SPARQL query that highlights the problem:

SELECT DISTINCT ?mountain ?height
WHERE {
  ?mountain <http://dbpedia.org/ontology/elevation> ?height .
  ?mountain a <http://schema.org/Mountain> .
}
ORDER BY DESC(?height)
LIMIT 10

Expected result: Mount Everest should appear with an elevation of 8848.86 meters. Actual result: Mount Everest does not appear in the list, indicating the elevation data is missing.
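
The check can also be scripted instead of run through the web form. A minimal Python sketch, assuming the endpoint accepts the standard SPARQL protocol `query` parameter and that a `format` parameter selects SPARQL JSON results (that parameter is an assumption here). The helper names are illustrative, and the sample response is made up to show the result shape; it is not real DBpedia output.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"

# Same query as above, with the IRIs wrapped in angle brackets.
QUERY = """
SELECT DISTINCT ?mountain ?height
WHERE {
  ?mountain <http://dbpedia.org/ontology/elevation> ?height .
  ?mountain a <http://schema.org/Mountain> .
}
ORDER BY DESC(?height)
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a GET URL for a SPARQL query (the `format` parameter is an assumption)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


def resources_in_results(results_json, variable="mountain"):
    """Extract the IRIs bound to `variable` from a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return [row[variable]["value"]
            for row in doc["results"]["bindings"]
            if variable in row]


# Hypothetical response, for illustration only (not real endpoint output).
SAMPLE = json.dumps({
    "head": {"vars": ["mountain", "height"]},
    "results": {"bindings": [
        {"mountain": {"type": "uri", "value": "http://dbpedia.org/resource/K2"},
         "height": {"type": "typed-literal", "value": "8611.0"}},
    ]},
})

found = resources_in_results(SAMPLE)
# True would mean Mount Everest is present in the result list.
print("http://dbpedia.org/resource/Mount_Everest" in found)
```

Fetching `build_query_url(ENDPOINT, QUERY)` with any HTTP client and passing the response body to `resources_in_results` would repeat the check against the live endpoint.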

Details

Wrong triples / missing data:
There is no http://dbpedia.org/ontology/elevation triple for the resource Mount Everest in the current DBpedia data.

Expected corrected RDF outcome:
http://dbpedia.org/resource/Mount_Everest http://dbpedia.org/ontology/elevation 8848.86 (xsd:double)
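
Written out as an N-Triples line, the expected statement needs angle brackets around the IRIs and an explicit datatype on the literal. A small Python sketch of that serialisation follows; the helper name is illustrative and is not part of the extraction framework.

```python
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def typed_literal_triple(subject, predicate, lexical, datatype):
    """Serialise one triple with a typed literal object as an N-Triples line."""
    # Escape backslashes and double quotes inside the literal's lexical form.
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}"^^<{datatype}> .'


line = typed_literal_triple(
    "http://dbpedia.org/resource/Mount_Everest",
    "http://dbpedia.org/ontology/elevation",
    "8848.86",
    XSD_DOUBLE,
)
print(line)
```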

@debx4

debx4 commented Dec 17, 2024

I would love to work on this issue, but I can't find the file where the data is missing.
It would be my pleasure if @AJKellmann could help.

@AJKellmann
Author

AJKellmann commented Dec 17, 2024

Hi @debx4,

Thanks for picking up this issue!

The respective triple is missing from the DBpedia page for Mount Everest (https://dbpedia.org/resource/Mount_Everest). The issue seems to arise because the elevation value is not explicitly present in the Wikipedia Infobox; it is pulled dynamically from Wikidata.

This might be a limitation in how the DBpedia extraction framework handles importing the data.
For other mountains, it is often a concrete value, not a reference to another source.

It would be pretty simple to write a SPARQL query like

INSERT DATA { <http://dbpedia.org/resource/Mount_Everest> <http://dbpedia.org/ontology/elevation> "8848.86"^^xsd:double . }
But that would only fix this single case, not the underlying issue.
(I have not checked the extraction-framework code, so I don't know where the relevant pieces are.)

@debx4

debx4 commented Dec 18, 2024

I tried to resolve the problem at http://dbpedia.org/sparql.
I hope that has resolved it.

@AJKellmann
Author

Hi @debx4,

Thanks for following up! I checked the SPARQL endpoint again, but unfortunately, there are still no changes regarding Mount Everest.

Just to clarify, direct edits to the database aren't possible for regular users of the SPARQL endpoint (http://dbpedia.org/sparql). Do you have write permissions for the DBpedia server? (I don't.)
These kinds of changes usually require access to the Virtuoso server running DBpedia, which isn’t publicly accessible.
If I had those permissions, I’d have applied a quick fix myself.
For such updates, you'd need to log in via the Virtuoso Conductor interface (http://dbpedia.org/conductor or its internal equivalent), which is not accessible publicly.

That said, even a quick fix would only address this one case. The actual issue lies in how the extraction framework processes Wikipedia pages that use fetchwikidata to pull data from Wikidata. Values like the elevation for Mount Everest aren’t currently included in the RDF output. Fixing this properly would require changes to the framework.
That's why I posted this issue here in this GitHub repository.
And I'm not sure whether this would be something that is easy to fix.
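
To get a feel for how many pages are affected, one could scan article wikitext for infoboxes that set `fetchwikidata` but leave the field of interest empty. A rough Python sketch, assuming one `| name = value` parameter per line and using `elevation_m` as the field name. The sample wikitext is invented for illustration, and none of this reflects how the extraction framework itself parses templates.

```python
import re

# Matches a template parameter line such as "| elevation_m = 8848".
PARAM_LINE = re.compile(r"^\s*\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")


def infobox_params(wikitext):
    """Collect `| name = value` pairs, assuming one parameter per line."""
    params = {}
    for line in wikitext.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


def wikidata_backed(wikitext, wanted):
    """Return the wanted parameters that have no local value while fetchwikidata is set."""
    params = infobox_params(wikitext)
    if not params.get("fetchwikidata"):
        return []
    return [name for name in wanted if not params.get(name)]


# Invented example wikitext, not copied from any real article.
SAMPLE = """{{Infobox mountain
| name = Example Peak
| fetchwikidata = ALL
| elevation_m =
| range = Example Range
}}"""

print(wikidata_backed(SAMPLE, ["elevation_m", "range"]))
```

Any parameter this reports would have to be filled from Wikidata during extraction to end up in the RDF output.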

@debx4

debx4 commented Dec 18, 2024

Yeah, I actually tried to solve the issue that way, and the endpoint accepted my request. But as you mentioned, the database isn't writable by the public, so I think that's why the changes are not shown.

@AJKellmann
Author

You're right, the SPARQL query is valid, but without write access, DBpedia won't apply the change.
The deeper issue is still how fetchwikidata values are handled during extraction.
If you want to keep trying, don’t hesitate to ask me.
We’ll keep this issue open for now.

@JyotiP24

Hi, can you assign this to me?
