DBpedia: Difference between revisions

From Meta, a Wikimedia project coordination wiki
Updated page for the September 2007 release of the dataset (Chris Bizer).
'''DBpedia''' is a community effort to extract structured information from [[Wikipedia]] and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia and to interlink other datasets on the Web with DBpedia data.
'''dbpedia''' is a project advertised in January 2007 by Sören Auer (Universität Leipzig) et al. that digs out data from Wikipedia's articles and makes it searchable in an RDF metadata framework. It appears to deliver some of the promises of [[Wikidata]], albeit with a different technology.


== The DBpedia Dataset ==
==External links==
*http://dbpedia.org/
*[http://lists.wikimedia.org/pipermail/wikitech-l/2007-January/029061.html dbpedia.org - Querying Wikipedia like a Semantic Database], announcement on wikitech-l
*Sören Auer, Jens Lehmann, ''[http://www.informatik.uni-leipzig.de/~auer/publication/ExtractingSemantic.pdf What have Innsbruck and Leipzig in common? Extracting Semantic from Wiki Content]'' (PDF)


Wikipedia articles consist mostly of free text, but also contain different types of structured information, such as infobox templates, categorisation information, images, geo-coordinates and links to external Web pages. This structured information can be extracted from Wikipedia and can serve as a basis for enabling sophisticated queries against Wikipedia content.

The DBpedia dataset describes 1,950,000 “things”, including at least 80,000 persons, 70,000 places, 35,000 music albums and 12,000 films. It contains 657,000 links to images, 1,600,000 links to relevant external web pages, 180,000 external links into other RDF datasets, 207,000 Wikipedia categories and 75,000 YAGO categories.

The DBpedia project uses the [http://en.wikipedia.org/wiki/Resource_Description_Framework Resource Description Framework] as a flexible data model for representing extracted information and for publishing it on the Web. As of September 2007, the DBpedia dataset consists of around 103 million RDF triples, which have been extracted from the English, German, French, Spanish, Italian, Portuguese, Polish, Swedish, Dutch, Japanese, Chinese, Russian, Finnish and Norwegian versions of Wikipedia.
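
To illustrate the data model, the following minimal Python sketch represents RDF triples as subject, predicate and object, and writes them out in the N-Triples line format. The resource and property URIs are assumptions chosen for illustration and are not taken from the dataset description above; the escaping covers only backslashes and double quotes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str
    obj_is_literal: bool = False  # False means the object is a URI


def to_ntriples(t: Triple) -> str:
    """Serialise one triple as a single N-Triples line."""
    if t.obj_is_literal:
        # Simplified escaping: backslash and double quote only.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Hypothetical identifiers, for illustration only.
EXAMPLE_TRIPLES = [
    Triple(
        "http://dbpedia.org/resource/Leipzig",
        "http://example.org/property/country",
        "http://dbpedia.org/resource/Germany",
    ),
    Triple(
        "http://dbpedia.org/resource/Leipzig",
        "http://www.w3.org/2000/01/rdf-schema#label",
        "Leipzig",
        True,
    ),
]

if __name__ == "__main__":
    for triple in EXAMPLE_TRIPLES:
        print(to_ntriples(triple))
```

Each statement is independent of the others, which is what makes the model flexible: new kinds of extracted information can be added as further triples without changing a schema.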

The DBpedia dataset is available under the terms of the GNU Free Documentation License.

The DBpedia dataset is interlinked at the RDF level with various other Open Data datasets on the Web. This enables applications to enrich DBpedia data with data from these datasets.
As of June 2007, DBpedia is interlinked with the following datasets: GeoNames, MusicBrainz, the CIA World Factbook, DBLP, Project Gutenberg, DBtune Jamendo and Eurostat, as well as US Census data.
See the [http://dbpedia.org/docs/ DBpedia website] and the [http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData W3C SWEO Linking Open Data Community Project] for details about the interlinked datasets.
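
How an application might use such links can be sketched in Python as follows. The sketch assumes that a link is expressed as an <code>owl:sameAs</code> statement pointing from a DBpedia resource to a resource in another dataset, which the text above does not specify; the function name <code>enrich</code> and all URIs in the usage note are hypothetical.

```python
from collections import defaultdict

# Assumption: interlinks are expressed with owl:sameAs.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def enrich(dbpedia_triples, external_triples):
    """Return the DBpedia triples plus external facts reachable via owl:sameAs.

    Both arguments are lists of (subject, predicate, object) string tuples.
    """
    # Index the external dataset by subject for quick lookup.
    facts_by_subject = defaultdict(list)
    for subject, predicate, obj in external_triples:
        facts_by_subject[subject].append((predicate, obj))

    enriched = list(dbpedia_triples)
    for subject, predicate, obj in dbpedia_triples:
        if predicate == OWL_SAME_AS:
            # Re-attach each external fact to the DBpedia subject.
            for ext_predicate, ext_obj in facts_by_subject.get(obj, []):
                enriched.append((subject, ext_predicate, ext_obj))
    return enriched
```

Given one link from a hypothetical DBpedia resource to a hypothetical external resource, and one fact about that external resource, <code>enrich</code> returns the original link plus that fact restated with the DBpedia resource as its subject.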

== Accessing the DBpedia Dataset ==

The DBpedia dataset can be accessed in three ways:

* SPARQL Endpoint. There is a [http://dbpedia.org/sparql public SPARQL endpoint] which enables you to query the dataset using the [http://en.wikipedia.org/wiki/SPARQL SPARQL] query language. You can use the [http://DBpedia.org/snorql SNORQL query explorer] to run queries against the endpoint (the explorer does not work with Internet Explorer). Several example queries can be found on the [http://dbpedia.org/docs/ DBpedia website].
* Linked Data Interface. DBpedia is also served as [http://en.wikipedia.org/wiki/LinkedData Linked Data], meaning that you can use Semantic Web browsers like [http://www.w3.org/2005/ajar/tab Tabulator], [http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/disco/ DISCO] or the [http://demo.openlinksw.com/DAV/JS/rdfbrowser/index.html Open Link Data Browser] to navigate the dataset.
* Downloads. The DBpedia dataset can also be downloaded from the [http://dbpedia.org/docs/ DBpedia website].
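
As a minimal sketch of the first mechanism, the following Python code builds a request URL for the public SPARQL endpoint, passing the query text in the <code>query</code> parameter defined by the SPARQL protocol. The example query and the resource URI in it are assumptions for illustration, and whether the endpoint returns JSON in response to the <code>Accept</code> header used in <code>run_query</code> has not been verified here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SPARQL_ENDPOINT = "http://dbpedia.org/sparql"

# Hypothetical query: list up to ten property/value pairs of one resource.
EXAMPLE_QUERY = """
SELECT ?property ?value
WHERE { <http://dbpedia.org/resource/Leipzig> ?property ?value }
LIMIT 10
""".strip()


def build_query_url(query: str, endpoint: str = SPARQL_ENDPOINT) -> str:
    """Return a GET URL carrying the URL-encoded query text."""
    return f"{endpoint}?{urlencode({'query': query})}"


def run_query(query: str, timeout: float = 30.0) -> dict:
    """Send the query over HTTP and parse the response as JSON.

    Assumes the endpoint honours the Accept header below.
    """
    request = Request(
        build_query_url(query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_query_url(EXAMPLE_QUERY))
```

Only <code>build_query_url</code> runs without network access; <code>run_query</code> contacts the live endpoint and may fail if the service is unavailable or its response format differs.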

== External links ==
=== Web Pages ===
* [http://dbpedia.org/ DBpedia Project] - Official website
* [http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData W3C SWEO Linking Open Data Community Project]

=== Publications ===
* Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, Zachary Ives: [http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Auer-Bizer-ISWC2007-DBpedia.pdf DBpedia: A Nucleus for a Web of Open Data]. 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, November 2007.
* Christian Bizer et al.: [http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/DBpedia-WWW2007-draft-slides.pdf DBpedia - Querying Wikipedia like a Database]. Developers track presentation at WWW2007.
* Sören Auer, Jens Lehmann: [http://www.informatik.uni-leipzig.de/~auer/publication/ExtractingSemantics.pdf What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content]. Paper at ESWC 2007.
* Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum: [http://www2007.org/papers/paper391.pdf Yago: A Core of Semantic Knowledge - Unifying WordNet and Wikipedia]. Paper at WWW2007.
* Christian Bizer et al.: [http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/LinkingOpenData.pdf Interlinking Open Data on the Web] ([http://linkeddata.org/documents/eswc2007-poster-linking-open-data.pdf Poster]). Poster at ESWC 2007.

[[Category:Free software culture and documents]]
[[Category:Open access]]
[[Category:World Wide Web]]
[[Category:Semantic web]]
[[Category:German engineering]]
[[Category:Research]]

Revision as of 09:43, 6 September 2007
