Saturday, August 29, 2015

Presenting WikiDataDotNet - Client API for WikiData

WikiData

WikiData is one of those things that sets the mind boggling at the possibilities of the internet. It's a project, started by the Wikimedia Foundation, to collect structured data on everything. If you are doing anything related to machine learning, it is the best source of data I have found so far.

It aims to contain an item for everything and, for each item, a collection of statements describing aspects of it and its relationships to other items. Everything makes more sense with an example, so here is its record for the item Italy, which can be found in the API like so: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q38

This will return a JSON file with sections like:

       "id": "Q38",
       "labels": {  
          "en": {  
           "language": "en",
           "value": "Italy"
         }, 

Here we see the id of the item, in this case Q38, which is what you use to look Italy up. Then labels contains the name of Italy in each language. Further down there is also an aliases section that contains alternate names for Italy in each language.
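As a rough sketch of reading those two fields from C#, the snippet below parses a hand-trimmed copy of the fragment above with System.Text.Json. The sample string is my own cut-down version (wrapped in braces so it parses on its own), not a full API response, and the variable names are just for illustration.

```csharp
using System;
using System.Text.Json;

// Hand-trimmed copy of the Q38 fragment shown above, wrapped in braces
// so that it is valid JSON on its own. This is not a full API response.
const string json = @"{
  ""id"": ""Q38"",
  ""labels"": {
    ""en"": { ""language"": ""en"", ""value"": ""Italy"" }
  }
}";

using JsonDocument doc = JsonDocument.Parse(json);
JsonElement entity = doc.RootElement;

// The id is a plain string property on the entity.
var id = entity.GetProperty("id").GetString();

// Labels are keyed by language code; each entry holds the name in "value".
var englishLabel = entity
    .GetProperty("labels")
    .GetProperty("en")
    .GetProperty("value")
    .GetString();

Console.WriteLine($"{id}: {englishLabel}"); // prints "Q38: Italy"
```

Swapping "en" for another language code would pick out that language's label, assuming the record has one.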

Further down we get to the really interesting stuff, claims.

          "P36": [  
           {  
             "mainsnak": {  
               "snaktype": "value",  
               "property": "P36",  
               "datavalue": {  
                 "value": {  
                   "entity-type": "item",  
                   "numeric-id": 220  
                 },  
                 "type": "wikibase-entityid"  
               },  
               "datatype": "wikibase-item"  
             },  
             "type": "statement",  
             "qualifiers": {  
               "P580": [  

These are a series of statements about different aspects of the item. For example, the above P36 is a claim about what the capital of Italy is. The properties that claims are filed under are also entities in the API, so they can be looked up too, like so: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=P36
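The lookup URLs in this post all follow one pattern, so a tiny helper can build them. This is only a sketch: BuildEntityUrl is a hypothetical name, and the format=json parameter and the "|" separator for requesting several ids at once are MediaWiki API features I am assuming here rather than anything shown above.

```csharp
using System;

// Hypothetical helper: builds a wbgetentities URL for one or more entity ids.
static string BuildEntityUrl(params string[] ids)
{
    // Several ids can share one request when joined with "|"; the separator
    // is percent-encoded so the query string stays valid.
    string joined = Uri.EscapeDataString(string.Join("|", ids));
    return "https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=" + joined;
}

// One item on its own, then an item and a property in a single request.
Console.WriteLine(BuildEntityUrl("Q38"));
Console.WriteLine(BuildEntityUrl("Q38", "P36"));
```

Items (Q ids) and properties (P ids) go through the same call, which is why the helper does not distinguish between them.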

mainsnak is the main statement associated with this claim (a Snak in WikiData is any basic assertion that can be made about an item). These all have a value and a type. In this case, the claim about Italy's capital, the value is a reference to another WikiData item, which can again be looked up if you prepend a Q to the numeric id. You may have already worked out what the entity here is: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q220
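To make the "prepend a Q" step concrete, here is a minimal sketch that pulls the capital's id out of the P36 claim with System.Text.Json. The sample is a hand-trimmed copy of the claim shown earlier with the qualifiers left out, and it only looks at the first statement listed under P36.

```csharp
using System;
using System.Text.Json;

// Hand-trimmed copy of the P36 claim shown earlier (qualifiers omitted),
// wrapped in braces so that it is valid JSON on its own.
const string json = @"{
  ""P36"": [
    {
      ""mainsnak"": {
        ""snaktype"": ""value"",
        ""property"": ""P36"",
        ""datavalue"": {
          ""value"": { ""entity-type"": ""item"", ""numeric-id"": 220 },
          ""type"": ""wikibase-entityid""
        },
        ""datatype"": ""wikibase-item""
      },
      ""type"": ""statement""
    }
  ]
}";

using JsonDocument doc = JsonDocument.Parse(json);

// Take the first statement listed under P36 and read its main snak.
JsonElement mainsnak = doc.RootElement.GetProperty("P36")[0].GetProperty("mainsnak");

// Only snaks whose snaktype is "value" carry a datavalue to read.
string capitalId = "(no value)";
if (mainsnak.GetProperty("snaktype").GetString() == "value")
{
    int numericId = mainsnak
        .GetProperty("datavalue")
        .GetProperty("value")
        .GetProperty("numeric-id")
        .GetInt32();

    // Item ids are the numeric id with a Q in front.
    capitalId = "Q" + numericId;
}

Console.WriteLine(capitalId); // prints "Q220"
```

The resulting id can then be fed back into the same wbgetentities lookup to fetch the capital's own record.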

Other claims on Italy include location, who it shares a border with, public holidays, provinces, basic form of government, head of state, population (across history), head of government; the list is endless (no wait, actually it's 64 entries long).


Presenting WikiDataDotNet

I've been working on a project that needed to query WikiData from .NET. The only existing .NET API for this I could find is Wikibase.NET, which is meant for writing wiki bots. It hasn't been updated in a while and unfortunately a quick test reveals it no longer works. At a future date I may fix it up, but in the meantime I've created this quick, query-only API: WikiDataDotNet

Usage

It currently provides the ability to request entities:
F#
 let italy = WikiDataDotNet.Request.request_entity "Q38"   
C#
 var italy = WikiDataDotNet.Request.request_entity("Q38");  

and do a text search against WikiData:
F#
 let search_result = WikiDataDotNet.Request.search "en" "Headquarters of the U.N"  
C#
 var searchResult = WikiDataDotNet.Request.search("en", "Headquarters of the U.N");  

That's it for functionality so far. My next plans are to make it easier to look up Claims against items and to cache Claims. Some kind of LINQ-style querying interface would also be nice.
