18.2. URIQA Semantic Web Enabler

Virtuoso supports the URIQA (URI Query Agent) extension of the HTTP/WebDAV protocol. URIQA adds three new methods to HTTP to retrieve, add, and remove RDF metadata about a given subject. The subject is identified by its URI. If the subject is a DAV resource, URIQA will usually reuse the DAV URI of the resource. If the subject is not a resource but something else (a physical entity, an imaginary thing, or a vocabulary item), URIQA can still be used to process metadata about the subject even if the subject itself cannot be accessed via HTTP.

The URIQA-specific HTTP methods are MGET (to retrieve existing metadata), MPUT (to add or update RDF triples), and MDELETE (to remove some or all triples). A single URIQA request usually deals with a single subject, which is specified by the request URI. The MGET response, however, can return metadata about more than one RDF subject; e.g., a request about a book can return data about the book itself as well as some data about the persons who are known as its authors.

In addition to URIQA-specific HTTP methods, Virtuoso implements a semantic web service interface that allows plain HTTP clients to access metadata using traditional GET or POST HTTP methods.

The Virtuoso URIQA implementation allows flexible configuration using an ordered list of request handlers. Every handler has a pattern for URIs; if the URI in the request does not match the pattern then the handler is ignored, otherwise a callback function of the handler is called to process the request. The default configuration of Virtuoso server will try three sorts of actions.

Note:

URIQA is not yet a stable standard. Virtuoso implements a draft of the URIQA proposal from Nokia, dated 2004. As URIQA evolves, future versions of Virtuoso will implement updated versions of the specification. There is no warranty that future implementations will be compatible with the current one.

See Also: External References

The Nokia URI Query Agent Model

18.2.1. URIQA HTTP Methods

All three methods have a set of HTTP header parameters to specify the precise URI of the subject. HTTP does not require that every resource be accessible via a single valid URI, so many equivalent URLs can point to the same resource, and the result of a typical HTTP request does not change if one of the equivalent URLs is replaced with another. Unlike HTTP GET, HTTP PUT, etc., metadata methods may return different results for different URLs even if these URLs are equivalent for other methods. URIQA rules are very simple.

Examples of MGET Requests

The following requests are all equivalent:

Request 1. 'URIQA-uri' is used, the rest does not matter.

MGET /foo HTTP/1.1
Host: example.com
URIQA-uri: http://example.com/foo

Request 2. 'URIQA-uri' is missing, 'Host' is used, the host name www.example.com is ignored.

MGET http://www.example.com/foo HTTP/1.1
Host: example.com

Request 3. The URI from the first line is used verbatim. This is unsafe, because proxy servers can alter the URI, e.g. by adding port number.

MGET http://example.com/foo HTTP/1.1

Request 4. The URI from the first line is used, but the host name is taken from the 'DefaultHost' URIQA configuration parameter. If the parameter is set to example.com then the request is equivalent to the previous one.

MGET /foo HTTP/1.1
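
The four requests above suggest an order of precedence for finding the subject URI. The following Python sketch is one reading of that order, not Virtuoso's actual code: the function name resolve_subject_uri is hypothetical, and the "http" scheme used when the request line carries none is an assumption.

```python
from urllib.parse import urlsplit


def resolve_subject_uri(request_target, headers, default_host=None):
    """Return the subject URI for a URIQA request, following Requests 1-4."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Request 1: an explicit URIQA-uri header wins over everything else.
    if "uriqa-uri" in lowered:
        return lowered["uriqa-uri"]

    target = urlsplit(request_target)
    scheme = target.scheme or "http"  # assumption: plain HTTP when unspecified
    path = target.path or "/"
    if target.query:
        path += "?" + target.query

    # Request 2: the Host header replaces any host named in the request line.
    if "host" in lowered:
        return f"{scheme}://{lowered['host']}{path}"

    # Request 3: an absolute URI in the request line is used verbatim.
    if target.netloc:
        return request_target

    # Request 4: fall back to the DefaultHost configuration parameter.
    if default_host:
        return f"{scheme}://{default_host}{path}"
    raise ValueError("cannot determine the subject URI")
```

Under these assumptions, each of the four requests resolves to the same subject, http://example.com/foo.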

18.2.1.1. MGET Method

An MGET request contains a subject URI, and the response consists of an RDF/XML representation of an RDF graph with metadata about the subject. In many cases, the returned graph is a Concise Bounded Description of the resource or something similar, but it can be of any sort.

There are no integrity rules. E.g., if the response to a request about subject A contains some data about B, then a request about B may return the same or different data, or even report that B does not exist. If the URI refers to a non-existing resource, or even to a non-existing server or protocol, then the response can be a 'not found' error, an empty graph, or even a non-empty graph, depending on the handler that processed the request.

Usually an MGET request consists only of the subject URI specification, but it can contain other parameters, such as authentication data or even an HTTP request body with extra data for some particular handler. For Virtuoso DAV resources, MGET needs read permission on the subject resource, because the resulting RDF is retrieved from the 'http://local.virt/DAV-RDF' property of the resource.
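
As a minimal client-side sketch, an MGET request can be sent from Python with the standard http.client module, which accepts arbitrary method names. The function name mget and the optional connection parameter are hypothetical. The sketch assumes the metadata is served by the host named in the subject URI, and it sends no authentication, which a protected DAV resource would require.

```python
import http.client
from urllib.parse import urlsplit


def mget(subject_uri, connection=None):
    """Send an MGET request for subject_uri and return (status, body bytes)."""
    parts = urlsplit(subject_uri)
    if connection is None:
        # Assumption: the host named in the subject URI answers URIQA requests.
        connection = http.client.HTTPConnection(parts.netloc)
    # The URIQA-uri header states the subject explicitly (see Request 1).
    connection.request("MGET", parts.path or "/",
                       headers={"URIQA-uri": subject_uri})
    response = connection.getresponse()
    return response.status, response.read()
```

A call such as mget("http://example.com/foo") would return the HTTP status and the RDF/XML body, if the server supports the method.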


18.2.1.2. MPUT Method

An MPUT request contains an HTTP header that describes the subject URI and includes Content-Length; the body must be RDF/XML that consists of the triples to be added. The server will try to add the new RDF triples from the body to the description of the subject. In some cases, the server will replace obsolete triples with triples from the body, e.g., if an RDF Schema in use states that a predicate cannot have more than one value for any given subject.

There are no integrity rules. If MPUT request with subject A submits data about resource B then the updated data may become visible via MGET request with subject A and stay unchanged if retrieved directly by MGET with subject B. For instance, the default request handler for DAV will update only 'http://local.virt/DAV-RDF' DAV property of the subject resource not touching any DAV properties of resources named in the request.

A client application cannot use MPUT with a subject URI that refers to a non-DAV Virtuoso resource, because disk-resident resources do not have DAV properties, including DAV metadata properties. MPUT can refer to a non-existing Virtuoso DAV resource only if the name of this resource has already been locked for uploading. The most reliable way, however, is to upload the resource first and update the metadata only after the upload. There are two reasons to do the operations in this sequence. First, Virtuoso can automatically extract some metadata from the content of the uploaded resource, and if MPUT happens after the upload then the MPUT data can properly overwrite the automatically extracted values. Second, uploading sets the MIME type of the resource and may associate some RDF Schemas with it; hence MPUT can properly update some triples instead of storing multiple values for a predicate that should have only one value according to the RDF Schema.

For Virtuoso DAV resources, MPUT will need both read and write permissions on the subject resource, because 'http://local.virt/DAV-RDF' property of the resource is first retrieved and then updated.


18.2.1.3. MDELETE Method

An MDELETE request contains an HTTP header that describes the subject URI and may contain a body. If present, the body must be RDF/XML that consists of the triples to be deleted. If the body is missing entirely then MDELETE removes all metadata associated with the subject URI.

There are no integrity rules. If MDELETE request with subject A removes triples about resource B then these triples may stay visible if retrieved directly by MGET with subject B. For instance, the default request handler for DAV will update only 'http://local.virt/DAV-RDF' DAV property of the subject resource not touching any DAV properties of resources named in the request.

For Virtuoso DAV resources, MDELETE will need both read and write permissions on the subject resource, because the 'http://local.virt/DAV-RDF' property of the resource is first retrieved and then updated.
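
The shape of MPUT and MDELETE requests can be sketched by composing the raw request bytes in Python. The function name build_metadata_request is hypothetical, and the Content-Type value application/rdf+xml (the registered media type for RDF/XML) is an assumption; the text above only requires the subject URI and, for a request with a body, Content-Length.

```python
def build_metadata_request(method, subject_uri, host, path, rdf_xml=None):
    """Compose the raw bytes of an MPUT or MDELETE request."""
    if method not in ("MPUT", "MDELETE"):
        raise ValueError("expected MPUT or MDELETE")
    if method == "MPUT" and rdf_xml is None:
        raise ValueError("MPUT requires an RDF/XML body")

    body = b"" if rdf_xml is None else rdf_xml.encode("utf-8")
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        f"URIQA-uri: {subject_uri}",
    ]
    if rdf_xml is not None:
        # Assumption: the body is labelled with the RDF/XML media type.
        lines.append("Content-Type: application/rdf+xml")
        # Content-Length counts the bytes of the encoded body, not characters.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

Calling the function with method "MDELETE" and no rdf_xml produces a request without a body, which corresponds to removing all metadata associated with the subject URI.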



18.2.2. URIQA Web Service

Virtuoso provides the '/uriqa/' web service for clients that do not support URIQA-specific methods. Instead of passing the URI and the method name in HTTP header lines, web service calls pass them as part of the web service URI. The path can be anything that starts with '/uriqa/' or '/URIQA/'. The following two requests both retrieve metadata about 'http://example.com/foo'.

GET /uriqa?uri=http%3a%2f%2fexample%2ecom%2ffoo HTTP/1.1
GET /uriqa?uri=http%3a%2f%2fexample%2ecom%2ffoo&method=MGET HTTP/1.1

The following request header is for MPUT:

GET /uriqa?uri=http%3a%2f%2fexample%2ecom%2ffoo&method=MPUT HTTP/1.1

The URIQA web service does not need complicated rules for URI passing because the request cannot be significantly changed by any proxy. The value of the 'uri' parameter should be an absolute URI.
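
A web service URI of this form can be built with Python's standard urllib.parse.urlencode. This is a minimal sketch, and the function name uriqa_service_url is hypothetical. Note that urlencode emits uppercase hex digits and leaves '.' unescaped, which is equivalent to the hand-encoded requests above.

```python
from urllib.parse import urlencode


def uriqa_service_url(subject_uri, method=None):
    """Build the path and query string for a '/uriqa' web service call."""
    params = {"uri": subject_uri}
    if method is not None:
        params["method"] = method  # e.g. 'MGET' or 'MPUT'
    # urlencode percent-encodes ':' and '/' so the absolute subject URI
    # travels as a single query parameter.
    return "/uriqa?" + urlencode(params)
```

For example, uriqa_service_url("http://example.com/foo", "MPUT") yields /uriqa?uri=http%3A%2F%2Fexample.com%2Ffoo&method=MPUT.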


18.2.3. URIQA Section in Virtuoso Configuration File

By default, the Virtuoso server acts only as a URIQA proxy, i.e. it redirects incoming requests to other servers without trying to return metadata about DAV resources or other data stored on the server itself. To let URIQA retrieve local metadata, the Virtuoso server must know the names that clients can use to access it. The Virtuoso configuration file, e.g., virtuoso.ini, can contain these names as parameters in the "[URIQA]" section.

These configuration parameters are "sticky". If they are found in the configuration file then they are preserved in the database registry. If the configuration file has changed then the new values will be used after server restart. If a database dump is replayed on a server whose configuration file does not contain these parameters then the values from the dump will stay in the registry. If a database dump is replayed on a server whose configuration file contains other values then the values from the dump will stay in the registry until server restart.
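
The "sticky" behaviour can be modelled in a few lines of Python. This is an illustrative sketch only: the registry is represented as a plain dict, the function name apply_uriqa_section is hypothetical, and DefaultHost is the only parameter name taken from this chapter.

```python
import configparser


def apply_uriqa_section(ini_text, registry):
    """Model a server start: copy [URIQA] settings from the file into the registry."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep names such as DefaultHost case-sensitive
    parser.read_string(ini_text)
    if parser.has_section("URIQA"):
        # Values present in the file overwrite what the registry holds.
        registry.update(parser["URIQA"])
    # Parameters absent from the file keep their registry values.
    return registry
```

In this model, a registry value restored from a dump survives a start whose configuration file has no [URIQA] entry for it, and is replaced at the next start if the file defines a different value.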


18.2.4. URI Matching Rules

A simple installation does not require any special URIQA configuration except specifying server names in the [URIQA] section of the configuration file (virtuoso.ini). However, complex applications may need more from URIQA than simple retrieval of the metadata of DAV resources. As with HTTP virtual hosts, URIQA may require different processing for different URIs, so Virtuoso offers appropriate tools.

When the URIQA server gets a URI to process, it reads the system table WS.WS.URIQA_HANDLER to find the procedure that can access metadata about some range of URIs. This table is defined as follows:

create table WS.WS.URIQA_HANDLER
(
  UH_ID integer not null primary key,
  UH_ORDER integer not null,
  UH_NAME varchar not null unique,
  UH_MATCH_COND varchar not null,
  UH_MATCH_ENV any,
  UH_HANDLER varchar not null,
  UH_HANDLER_ENV any
)
create index URIQA_HANDLER_ORDER_NAME on WS.WS.URIQA_HANDLER (UH_ORDER, UH_NAME)
;

The server scans the table in order of ascending values in UH_ORDER column, and checks whether the request URI matches the condition specified by UH_MATCH_COND and UH_MATCH_ENV. As soon as an appropriate row is found, a function with name specified by UH_HANDLER is called with parameters that describe the request plus any extra application-specific data as stored in UH_HANDLER_ENV. The function should either compose a response and set a flag to 1 or do nothing and set a flag to 0. If 1 is set then the processing of the request is complete, otherwise the server resumes table scan.
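
The scan described above can be sketched in Python as follows. This models the control flow only and is not Virtuoso's implementation: rows are plain dicts keyed by column name, matches stands in for the matching operations of UH_MATCH_COND, handlers maps handler names to callables that return an (is_final, status) pair, and the tie-break on UH_NAME is an assumption based on the index definition.

```python
def dispatch(rows, uri, matches, handlers):
    """Scan handler rows in UH_ORDER and stop at the first handler that finishes."""
    # Assumption: rows with equal UH_ORDER are tried in UH_NAME order.
    for row in sorted(rows, key=lambda r: (r["UH_ORDER"], r["UH_NAME"])):
        if not matches(row["UH_MATCH_COND"], row["UH_MATCH_ENV"], uri):
            continue  # the URI does not match: this handler is ignored
        handler = handlers[row["UH_HANDLER"]]
        is_final, status = handler(uri, row["UH_HANDLER_ENV"])
        if is_final == 1:
            return status  # a response was composed: processing is complete
        # is_final == 0: resume the scan with the next row
    return None  # no handler produced a response
```

A handler that declines a request by returning 0 as its flag lets a later row, for example a proxying fallback, process the same URI.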

At server startup, up to three records are automatically added into WS.WS.URIQA_HANDLER.

Applications can add more rows to the table to handle different sorts of URIs via different application-specific functions. The name of the function should begin with "WS.WS.URIQA_HANDLER_"; the rest is as specified by UH_HANDLER of the row. The signature of the function should be:

function WS.WS.URIQA_HANDLER_myexample (
  inout op varchar,   -- operation name, 'MGET', 'MPUT' or 'MDELETE';
  inout uri varchar,  -- request URI;
  inout split any,    -- request URI split by WS.WS.PARSE_URI into parts;
  inout body any,     -- the body of the request;
  inout params any,   -- get_keyword style vector of parameters of the request;
  inout lines any,    -- vector of lines of HTTP request header;
  inout app_env any,  -- any application-specific data from UH_HANDLER_ENV;
  inout is_final integer -- status flag. Function sets the flag to 1 to report that the request response is prepared.
  ) returns any          -- returns a status vector, see below.

The status vector describes either the reason why the request has failed or the success status. It consists of four elements:

In case of a DAV error, elements 3 and 4 can be set to NULL to generate proper values automatically.

Examples are:

vector ('00000', 0, '200', 'OK');
vector ('URIQA', 0, '500', 'The remote URIQA server returned an invalid header');
vector ('URIQA', -1, '404', 'Invalid URI; Ill formed or missing path to the resource');
vector ('URIQA', -12, null, null);

The current version of Virtuoso supports the following names of matching operations for use in UH_MATCH_COND: