RDF: Difference between revisions

From Meta, a Wikimedia project coordination wiki

Revision as of 03:34, 30 December 2005

This page is about an effort to provide an extensive RDF interface for MediaWiki. The idea is to create a flexible framework that goes beyond the abilities of the code described at RDF metadata. I (User:Duesentrieb) am working on this together with User:Evan. There's a sample implementation at http://wikitravel.org/~evan/mw-rdf-0.3.tar.gz, which is in production on Wikitravel (see WikiTravel:Wikitravel:RDF).

Scope

The idea is to provide a way to query different kinds of information about a wiki page in RDF format. This may include

  • Basic metadata as defined by the Dublin Core standard and/or the Creative Commons project
  • List of all authors
  • List of all links to and from a page
  • List of keywords (Categories for a page)
  • List of members of a category
  • ...and a plugin interface for more.

Also, the RDF could be delivered in different notations, like

  • RDF/XML
  • N-Triples
  • Turtle
  • ...and maybe others.
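
To give a feel for the simplest of these notations, here is a minimal sketch that writes one statement as an N-Triples line. The helper names (ntriplesTerm, ntriplesLine) and the example URIs are made up for illustration and are not part of RAP; a real implementation would presumably use RAP's own serializers. Since N-Triples is a subset of Turtle, the same line is also valid Turtle.

```php
<?php
// Hypothetical helper, not part of RAP: format one term for N-Triples.
// A term is either ['uri' => ...] or ['literal' => ...].
function ntriplesTerm(array $term): string {
    if (isset($term['uri'])) {
        return '<' . $term['uri'] . '>';
    }
    // Escape backslashes first, then quotes and line breaks.
    $escaped = str_replace(
        ["\\", '"', "\n", "\r"],
        ["\\\\", '\\"', '\\n', '\\r'],
        $term['literal']
    );
    return '"' . $escaped . '"';
}

// One statement per line: subject, predicate, object, then " ."
function ntriplesLine(array $s, array $p, array $o): string {
    return ntriplesTerm($s) . ' ' . ntriplesTerm($p) . ' ' . ntriplesTerm($o) . " .\n";
}

echo ntriplesLine(
    ['uri' => 'http://example.org/wiki/Main_Page'],
    ['uri' => 'http://purl.org/dc/elements/1.1/title'],
    ['literal' => 'Main Page']
);
```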

This will probably be implemented as an extension that provides a special page which, when called without parameters, presents a form where the user can specify what data is wanted in what format.
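
As a rough sketch of how such a special page might interpret its request, the function below returns null when no parameters are given (so the caller can show the form) and otherwise a normalized list of data sets and a notation. The parameter names (models, format), the format identifiers, and the default are assumptions for illustration; nothing here is fixed by the proposal.

```php
<?php
// Hypothetical format identifiers; the proposal only names the notations.
const KNOWN_FORMATS = ['rdfxml', 'ntriples', 'turtle'];

// $params would typically be the request's query parameters.
function parseRdfRequest(array $params): ?array {
    if (!isset($params['models']) && !isset($params['format'])) {
        return null; // no parameters: the caller should show the query form
    }
    // 'models' is assumed to be a comma-separated list of data set names.
    $models = array_values(array_filter(
        array_map('trim', explode(',', $params['models'] ?? '')),
        'strlen'
    ));
    $format = strtolower($params['format'] ?? 'rdfxml');
    if (!in_array($format, KNOWN_FORMATS, true)) {
        $format = 'rdfxml'; // fall back to a default notation
    }
    return ['models' => $models, 'format' => $format];
}

var_dump(parseRdfRequest([]) === null); // the "show the form" case
print_r(parseRdfRequest(['models' => 'dc, links', 'format' => 'Turtle']));
```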

Another possible feature would be to allow users to add custom RDF information to any page, in Turtle format, inside a <rdf>...</rdf> block.

Implementation

The implementation will probably be based on the RAP framework [1].

core

The core function will take a list of "models", i.e. datasets wanted, and return a RAP model which can then be serialized to create the actual output.

Here is the code for that function, as suggested by User:Evan:

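 function getRdf($article, $modelNames = null) {
    global $wgModelFunctions, $wgDefaultModelNames;

    # PHP does not allow a variable as a default argument, so fall back here
    if ($modelNames === null) {
        $modelNames = $wgDefaultModelNames;
    }

    # empty model (MemModel is RAP's in-memory model class)
    $fullModel = new MemModel();

    foreach ($modelNames as $modelName) {
        if (!isset($wgModelFunctions[$modelName])) {
            # print error
            continue;
        }
        $modelFunction = $wgModelFunctions[$modelName];
        $model = $modelFunction($article);
        $fullModel->addModel($model); # merge in all statements (assuming RAP's MemModel::addModel)
    }

    return $fullModel;
 }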
 

$wgModelFunctions may be changed to contain a human-readable description of each data set in addition to the function name. This description would then be displayed on the query form where the user chooses the data sets.
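
One possible shape for such a registry is sketched below. The array layout, the keys (function, description), the data set names, and the two helper functions are assumptions for illustration only; the proposal does not fix any of them.

```php
<?php
// Hypothetical registry layout: data set name => function name plus description.
$wgModelFunctions = [
    'dc'    => ['function' => 'DublinCoreModel', 'description' => 'Dublin Core base data'],
    'links' => ['function' => 'LinkingModel',    'description' => 'Links to and from the page'],
];

// Build the labels the query form could show next to each choice.
function modelChoices(array $registry): array {
    $choices = [];
    foreach ($registry as $name => $entry) {
        $choices[$name] = $entry['description'];
    }
    return $choices;
}

// Look up the function name for a data set, or null if it is not registered.
function modelFunction(array $registry, string $name): ?string {
    return $registry[$name]['function'] ?? null;
}

print_r(modelChoices($wgModelFunctions));
```

With this layout, the core function would call modelFunction() instead of reading the function name from the array directly, and could report an error whenever it gets null back.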

some example functions

Here's an example function for creating Dublin Core base data (code suggested by User:evan):

 function DublinCoreModel($article) {
    global $DC_creator, $DC_date; # available from RAP system

    $model = new MemModel(); # RAP's in-memory model class

    $resource = getArticleResource($article); # Gets a RAP resource for the article; we'll have some utilities like this
    $user = getUserResource($article->getUser()); # another utility

    $model->add(new Statement($resource, $DC_creator, $user));
    $model->add(new Statement($resource, $DC_date, new Literal($article->getDate())));

    # etc.

    return $model;
 }
 

Example function for listing all links from a page (also suggested by User:evan):

 function LinkingModel($article) {
 
    global $DCMES, $DCTERM, $DCMI_types;
 
    $model = new MemModel(); # RAP's in-memory model class
 
    $resource = getArticleResource($article); # here's that utility again
 
    $linkFromTitles = $article->links(); # actually, I'm pretty sure this doesn't exist, but it should. B-)
 
    foreach ($linkFromTitles as $linkFromTitle) {
        # the Dublin Core term is spelled "references" (lower case)
        $model->add(new Statement($resource, $DCTERM['references'], titleToResource($linkFromTitle)));
    }
     
    $linkToTitles = $article->whatLinksHere(); # another function that never was
 
    foreach ($linkToTitles as $linkToTitle) {
        $model->add(new Statement($resource, $DCTERM['isReferencedBy'], titleToResource($linkToTitle)));
    }
 
    # ... more for Image links, etc.
 
    return $model;
 }
 

It may, however, be better to have a more generic function that builds such a list directly from an SQL query. This would make it very easy to add new data sets.
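
A minimal sketch of that idea follows. It assumes the query has already been run and its rows are available as associative arrays; the function name, the column name, and the example URIs are hypothetical, and plain arrays stand in for RAP Statement objects so the sketch runs without RAP or a database.

```php
<?php
// Hypothetical generic helper: one triple per result row.
// Each triple is a [subject, predicate, object] array standing in for a RAP Statement.
function triplesFromRows(string $subject, string $predicate, iterable $rows,
                         string $column, callable $toUri): array {
    $triples = [];
    foreach ($rows as $row) {
        $triples[] = [$subject, $predicate, $toUri($row[$column])];
    }
    return $triples;
}

// Rows as they might come back from a query on a links table (column name is illustrative).
$rows = [
    ['target_title' => 'Paris'],
    ['target_title' => 'Rome'],
];

// Stand-in for the titleToResource() utility used in the examples above.
$titleToUri = function (string $title): string {
    return 'http://example.org/wiki/' . rawurlencode($title);
};

$triples = triplesFromRows(
    'http://example.org/wiki/France',
    'http://purl.org/dc/terms/references',
    $rows,
    'target_title',
    $titleToUri
);
print_r($triples);
```

A new data set would then only need a query, a predicate, and the name of the column that holds the linked title.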

custom RDF

Function for building a model from custom RDF blocks on the wiki page:

 function InTextTurtleModel($article) {

    $text = $article->getText();

    # Get stuff between <rdf> tags; the s modifier lets . match newlines
    preg_match_all('/<rdf>(.*?)<\/rdf>/s', $text, $turtleBits);

    $turtle = implode("\n", $turtleBits[1]); # ...and join it into one big string

    $turtleParser = new N3Parser(); # RAP's "N3" parser is really a Turtle parser

    $model = $turtleParser->parse2model($turtle); # FIXME: handle errors here

    return $model; # That's it!
 }
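
The extraction step can be tried on its own, without RAP. The sketch below uses a hypothetical helper name (extractRdfBlocks) and made-up page text; it only collects the Turtle source and does not parse it.

```php
<?php
// Collect the contents of all <rdf>...</rdf> blocks in a page's wikitext.
function extractRdfBlocks(string $text): string {
    // The 's' modifier lets '.' match newlines; '.*?' keeps each match within one block.
    preg_match_all('/<rdf>(.*?)<\/rdf>/s', $text, $matches);
    // $matches[1] holds the captured block bodies; trim and join them.
    return implode("\n", array_map('trim', $matches[1]));
}

$text = "Intro text.\n<rdf>\n<> dc:source <http://www.example.com/a> .\n</rdf>\nMore text.\n"
      . "<rdf><> dc:creator \"Anne Example-Person\" .</rdf>";

echo extractRdfBlocks($text), "\n";
```

A page without any <rdf> block yields an empty string, so the caller may want to skip the parser in that case.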
 

The code on the wiki page would use Turtle notation. The example below says that some of the text of the page was copied from another Wikipedia article, and that another part of the page was copied from some other random URL.

 <rdf>
    <> dc:source <http://www.example.com/some/upstream/document.txt>, Wikipedia:AnotherArticle .

    <http://www.example.com/some/upstream/document.txt>
      a cc:Work;
      dc:creator "Anne Example-Person", "Anne Uther-Person";
      dc:contributor "Yadda Nudda Person";
      dc:dateCopyrighted "14 Mar 2005";
      cc:license cc:by-sa-1.0 .
 </rdf>
 

Links

See also