Constructing a WordNet-like tree from the Wikidata dataset?

Question

Is there an approach to constructing a semantic tree somewhat like WordNet, but using the vastly more structured and useful Wikidata data instead? How could this work? Could we place similar entities together, and dissimilar ones farther away? Does this need to be supervised in any way by a human, or is there an approach to similarity using only the properties that Wikidata provides?

There already is BabelNet... See e.g. this question on SO. – Stanislav Kralin Jan 24 '18 at 11:47 — Stanislav Kralin, Jan 24 '18 at 11:47

score 4 · Accepted Answer · answered Aug 14 '17 at 16:26

I would say that the P279-property (subclass of) will span a semantic "tree" (it is not necessarily a tree but a directed graph). You might be able to build semantic trees from other Wikidata information too.

The P279 graph can, for instance, be obtained from the Wikidata Query Service (WDQS). The below tree is based on a query around 'novel' (Q8261).

The graph can be rendered with the following SPARQL query on WDQS:

#defaultView:Graph
PREFIX gas: <http://www.bigdata.com/rdf/gas#>

SELECT DISTINCT ?item1 ?item1Label ?item2 ?item2Label
WHERE {
  { 
    SELECT ?item1 ?item2
    WHERE {
      SERVICE gas:service {
        gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.BFS" ;
        gas:in wd:Q8261 ; 
        gas:traversalDirection "Forward" ;
        gas:out ?item1 ;
        gas:out1 ?depth1 ;
        gas:out2 ?item2 ;
        gas:linkType wdt:P279 ;
      }
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,da,sv,jp,zh,ru,fr,de" .  } 
}

Link to Wikidata Query service for P279 graph on 'novel'

Similarity information is not readily available with Wikidata and WDQS. I have been experimenting with Wikidata and graph embedding to make this possible. A work in progress prototype called "Wembedder" is available at https://tools.wmflabs.org/wembedder/. For instance, a query on 'short story' is available here: https://tools.wmflabs.org/wembedder/most-similar/Q49084#language=en The current Wembedder returns "essay" and "narrative poetry" as the most similar other items. The web service is described further in Wembedder: Wikidata entity embedding web service.

Constructing a WordNet-like tree from the Wikidata dataset?

1 Answers1