A Look Inside the Think Tank...Private by Thomas Steiner
Disabling Blog Comments
I have given up. The (barely ever) used commenting function of my blog got repeatedly spammed. As a consequence, I have removed it from my home-brew blog software. It was fun while it lasted. Looking back, I had added a simple Turing test back in 2005, but apparently the spammers have caught up and care enough to spam even my super low traffic blog. If you're a spammer and you read this: you win. Also, fuck you!
If you care enough to comment on any of the items on this blog, you know how to reach me: @tomayac or +tomayac.
What I do for a living in my PhD
Benvingut a Munic, Pep! Guardiola joins FC Bayern. Preview of an automatically generated social media gallery. That's the stuff I work on in my PhD.
Finally something to show to people :-).
Shutting down the Open Knowledge Graph
The Open Knowledge Graph, an attempt to open up the Google Knowledge Graph by means of crowdsourcing, is history. From its initial announcement on August 11 to the current day, the graph has grown from 0 triples to exactly 2,850,510 RDF triples. This impressive figure has been reached solely through passionate users who participated in the Search for Embedded Knowledge Items effort (SEKI@home) by sharing their Google search activities. In the view of the authors, knowledge bases like DBpedia or Freebase over-deliver facts. In contrast, the Open Knowledge Graph made accessible only a subset of the most interesting facts about entities, derived from the Google Knowledge Graph. This happened in a machine-readable way through the SPARQL protocol.
However, Google clarified for us that by design the data in the Knowledge Graph is available only via a consumer interface. Jack Menzel, Product Management Director at Google, contacted us with the following statement:
"First, the reason we can't put all the data we have into Freebase is that we've acquired it from other sources who have not granted us the rights to redistribute. Much of the local and books data, for example, was given to us with terms that we would not immediately syndicate or provide it to others for free.
Other pieces of data are used, but only with attribution. For example, some data, like images, we feel comfortable using only in the context of search (as it is a preview of content that people will be finding with that search) and some data like statistics from the World Bank should only be shown with proper attribution.
With regards to automatic access to extract the ranking of the content: we block this kind of access to Google because our ranking is the proprietary core of what Google provides whenever you use search—users should access Google via the interfaces we provide."
As a consequence, we are shutting down the Open Knowledge Graph, which means that we will no longer provide access to the data via the SPARQL endpoint previously located at http://openknowledgegraph.org/sparql. We will keep the SEKI@home Chrome extension online for future use (so if you have it installed, please do not uninstall it quite yet), but we will remove the Google-scraping functionality from it. We will also keep the main Open Knowledge Graph homepage (http://openknowledgegraph.org/) online, as our paper titled SEKI@home, or Crowdsourcing an Open Knowledge Graph was accepted for publication at the 1st International Workshop on Knowledge Extraction and Consolidation from Social Media (KECSM2012), co-located with the 11th International Semantic Web Conference (ISWC2012).
The good news is that where there is shadow, there is light: the folks over at Freebase let us know that the Freebase team has always been committed to supporting the Linked Open Data community and that they have plans in the works to make the Freebase dumps that they already provide available in RDF.
Thanks to all RDF triple scrobblers for contributing to the Open Knowledge Graph. It was fun while it lasted.
Tom and Stefan
(i) This post was reviewed, but not edited for content, by D. Price, Product Counsel at Google, and J. Menzel, Product Management Director at Google.
(ii) T. Steiner is a Google employee. S. Mirea is a Google intern at the time of writing. The Open Knowledge Graph was developed in their own time as an independent research project, with affiliations to Universitat Politècnica de Catalunya and Jacobs University Bremen, respectively.
SEKI@home, or Crowdsourcing an Open Knowledge Graph
In May 2012, the Web search engine Google introduced the so-called Knowledge Graph, a graph that understands real-world entities and their relationships to one another. It currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. Soon after its announcement, people started to ask for a Knowledge Graph Application Programming Interface (API); as of today, however, Google does not provide one. With SEKI@home, which stands for Search for Embedded Knowledge Items, we propose a browser extension-based approach to crowdsource such an API. As people with the extension installed search on Google.com, the extension sends extracted anonymous Knowledge Graph facts from Search Engine Results Pages (SERPs) to a centralized, publicly accessible triple store, and thus over time creates a SPARQL-queryable Open Knowledge Graph. The SPARQL endpoint for the Open Knowledge Graph is available at http://openknowledgegraph.org/sparql. This prototype browser extension is tailored to the Google Knowledge Graph, but we note that the concept of SEKI@home can be generalized to other knowledge bases.
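To make the idea more concrete, here is a minimal sketch in Python of the two steps described above: turning an extracted fact into an RDF triple, and building a SPARQL protocol request against an endpoint. This is an illustration only, not the extension's actual code. The `http://example.org/okg/` namespace, the function names, and the sample fact are all assumptions made for the example; the vocabulary really used by the extension is described in the paper, not here. The sketch only builds strings and does not contact any server.

```python
from urllib.parse import quote, urlencode

# Hypothetical namespace for illustration only; the vocabulary actually
# used by the extension is not given in this post.
EXAMPLE_NS = "http://example.org/okg/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def slug(text: str) -> str:
    """Turn a human-readable label into a URI path segment."""
    return quote(text.strip().replace(" ", "_"), safe="_")


def fact_to_ntriple(entity: str, prop: str, value: str) -> str:
    """Serialize one (entity, property, value) fact as an N-Triples line."""
    subject = f"<{EXAMPLE_NS}entity/{slug(entity)}>"
    predicate = f"<{EXAMPLE_NS}property/{slug(prop)}>"
    obj = f'"{escape_literal(value)}"'
    return f"{subject} {predicate} {obj} ."


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL (the query goes in the 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Made-up sample fact, as it might be read from a Knowledge Graph panel.
    print(fact_to_ntriple("Munich", "Country", "Germany"))
    print(sparql_get_url(
        "http://openknowledgegraph.org/sparql",
        "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
    ))
```

Literal values are stored as plain strings here for simplicity; a real implementation would probably want to link values that are themselves entities as URIs rather than as literals.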
A paper describing the technical details of this extension has been submitted to the First International Workshop on Knowledge Extraction and Consolidation from Social Media (KECSM2012).
The extension was mainly developed by Stefan Mirea, steven.mirea (at) gmail [dot] com.
Disclaimer: the Open Knowledge Graph API and SPARQL endpoint are in NO way associated with Google. Make fair use of it.
Knowledge Graph Socializer Chrome extension
In May 2012, the Web search engine Google introduced the so-called Knowledge Graph, a graph that understands real-world entities and their relationships to one another. Entities covered by the Knowledge Graph include landmarks, celebrities, cities, sports teams, buildings, movies, celestial objects, works of art, and more. The graph enhances Google search in three main ways: by disambiguation of search queries, by search log-based summarization of key facts, and by explorative search suggestions.
With the Knowledge Graph Socializer Chrome extension, we suggest a fourth way of enhancing Web search: through the addition of real-time coverage of what people say about real-world entities on social networks. This browser extension seamlessly adds relevant microposts from the social networking sites Google+, Facebook, and Twitter to Knowledge Graph entities in the form of a panel. In true Linked Data fashion, we interlink detected concepts in microposts with Freebase entities.
Please note: you need a freely available Google API key if you want to use the Knowledge Graph Socializer extension.