INSERT OR REPLACE support

About

SPARQL Update has INSERT and DELETE. To update an existing triple in RDF you need to DELETE it first. You already have INSERT SILENT in Tracker's RDF store, but that just ignores certain errors; it doesn’t replace triples.

With normal SPARQL Update, a large amount of the update time goes to the application having to delete triples first and, internally, to the INSERT having to look up preexisting values. For this reason we came up with the idea of providing a replace feature on top of standard SPARQL 1.1 Update.

Available since: 0.10.6

The example of Mark and his dogs

Inserting Mark and his first dogs

In SPARQL Update you insert data with INSERT { Turtle-formatted data }. An example:

INSERT {
  <Max> a <Dog> ;
        <hasName> 'Max' ;
        <hasAge> 10 .
  <Mimi> a <Dog> ;
        <hasName> 'Mimi' ;
        <hasAge> 11 .
  <Mark> a <Person> ;
         <hasName> 'Mark' ;
         <hasAge> 30 ;
         <owns> <Max>, <Mimi>
}

In the example we are using both single value properties and multiple value properties. You can have only one name and one age, so <hasName> and <hasAge> are single value properties. But you can own more than one dog, so <owns> is a multiple value property.

Ambiguity for multiple value properties

The ambiguity of a replace feature for SPARQL Update lies in multiple value properties. Does it need to replace the entire list of values? Does it need to append to the list? Does it need to update just one item in the list? And which one? This probably explains why it’s not specified in SPARQL Update.

For single value properties there’s no ambiguity. For multiple value properties on a resource where the particular triple already exists, there’s also no ambiguity: RDF doesn’t allow duplicate triples. This means that in RDF you can’t own <Max> twice. This is also true for separate insert executions.

In the next two examples the first query is equivalent to the second query. Keep this in mind because it will matter for the INSERT OR REPLACE feature:

INSERT { <Mark> <owns> <Max>, <Max>, <Mimi> }

Is the same as

INSERT { <Mark> <owns> <Max>, <Mimi> }

There is no ambiguity for single value properties so we can use replace for single value properties like this:

INSERT OR REPLACE {
  <Max> a <Dog> ;
        <hasName> 'Max' ;
        <hasAge> 11 .
  <Mimi> a <Dog> ;
        <hasName> 'Mimi' ;
        <hasAge> 12 .
  <Mark> a <Person> ;
         <hasName> 'Mark' ;
         <hasAge> 31 ;
         <owns> <Max>, <Mimi>
}

As mentioned earlier, RDF doesn’t allow duplicate triples, so nothing changes in Mark’s ownerships. However, had we added a new dog, it would have been added to Mark’s ownerships, just as if OR REPLACE were not there. The following example actually adds Morm to Mark’s dogs (this differs from single value properties, which are overwritten instead).

INSERT OR REPLACE {
  <Morm> a <Dog> ;
        <hasName> 'Morm' ;
        <hasAge> 2 .
  <Max> a <Dog> ;
        <hasName> 'Max' ;
        <hasAge> 12 .
  <Mimi> a <Dog> ;
        <hasName> 'Mimi' ;
        <hasAge> 13 .
  <Mark> a <Person> ;
         <hasName> 'Mark' ;
         <hasAge> 32 ;
         <owns> <Max>, <Mimi>, <Morm>
}

We know that this looks a bit strange, but in RDF it kinda makes sense too. Note again that our replace feature is not part of standard SPARQL 1.1 Update (and will probably never be).
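The rule described above (overwrite single value properties, append to multiple value properties) can be modelled in a few lines. This is only a minimal Python sketch, not how Tracker implements it: the `Store` class, its `insert_or_replace` method and the `SINGLE_VALUED` set are hypothetical names, and the cardinality of each property is assumed from the example of Mark and his dogs.

```python
# Toy model of INSERT OR REPLACE; not Tracker's actual code.
# Assumption: hasName and hasAge are single-valued, everything else is multi-valued.
SINGLE_VALUED = {"hasName", "hasAge"}


class Store:
    def __init__(self):
        # Maps (subject, predicate) to the set of objects stored for that pair.
        self.values = {}

    def insert_or_replace(self, triples):
        for subject, predicate, obj in triples:
            current = self.values.setdefault((subject, predicate), set())
            if predicate in SINGLE_VALUED:
                # Single-valued: the old value is overwritten.
                current.clear()
            # Multi-valued: the value is added. A set cannot hold duplicates,
            # just as RDF cannot hold duplicate triples.
            current.add(obj)


store = Store()
store.insert_or_replace([
    ("Mark", "hasAge", 30),
    ("Mark", "owns", "Max"),
    ("Mark", "owns", "Mimi"),
])
store.insert_or_replace([
    ("Mark", "hasAge", 32),
    ("Mark", "owns", "Max"),
    ("Mark", "owns", "Mimi"),
    ("Mark", "owns", "Morm"),
])

age = store.values[("Mark", "hasAge")]
dogs = sorted(store.values[("Mark", "owns")])
print(age)   # {32}
print(dogs)  # ['Max', 'Mimi', 'Morm']
```

Mark's age ends up replaced, while Morm is simply added to the dogs he already owned.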

Resetting the entire list behind a multiple value property on a resource

If for some reason you want to completely overwrite Mark’s ownerships then you need to precede the insert with a delete. If you also want to remove the dogs from the store (let’s say because, however unfortunate, they died), then you also have to remove their rdfs:Resource type:

DELETE { <Mark> <owns> ?dog . ?dog a rdfs:Resource }
WHERE { <Mark> <owns> ?dog }
INSERT OR REPLACE {
  <Fred> a <Dog> ;
        <hasName> 'Fred' ;
        <hasAge> 1 .
  <Mark> a <Person> ;
         <hasName> 'Mark' ;
         <hasAge> 32 ;
         <owns> <Fred> .
}

Example that uses the Nepomuk ontology

The example above will not work out of the box because we don't install the ontology that Mark and his dogs use. We do install the Nepomuk ontology so I'll translate some of the INSERT OR REPLACE concepts in the first example to classes and properties in the Nepomuk ontology.

InformationElement with a title and two keywords

We start with an insert of a resource that has one single value property filled in, and one multi value property filled in twice:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title';
             nie:keyword 'keyw1';
             nie:keyword 'keyw2' }

A quick query to verify, and yes it’s in:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title, keyw1
  title, keyw2

If we repeat the insert a second time, the old-values check turns it into a no-op:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title';
             nie:keyword 'keyw1';
             nie:keyword 'keyw2' }

And a quick query to verify that, and indeed nothing has changed:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title, keyw1
  title, keyw2

Changing the title, adding keywords

If we run that last insert again, but with different values, we get this:

INSERT { <r> a nie:InformationElement ;
             nie:title 'title new';
             nie:keyword 'keyw4';
             nie:keyword 'keyw3' }

SparqlError.Constraint: Unable to insert multiple values for subject
`r' and single valued property `dc:title' (old_value: 'title', new
 value: 'title new')

Note that the two nie:keyword triples would have been accepted on their own, but each query is a transaction, and because the nie:title part failed, those two aren’t written either.
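The all-or-nothing behaviour of a plain INSERT can be sketched as follows. This is an illustrative Python model under stated assumptions, not Tracker's code: `insert`, `ConstraintError` and `SINGLE_VALUED` are hypothetical names, and a scratch copy of the store stands in for the transaction. The keywords are listed before the title here so that the rollback is visible.

```python
import copy

# Assumption for this sketch: nie:title is single-valued, nie:keyword is not.
SINGLE_VALUED = {"nie:title"}


class ConstraintError(Exception):
    """Stands in for the SparqlError.Constraint shown above."""


def insert(store, triples):
    """Plain INSERT: apply every triple, or none if any of them fails."""
    working = copy.deepcopy(store)  # scratch copy acts as the transaction
    for subject, predicate, obj in triples:
        current = working.setdefault((subject, predicate), set())
        if predicate in SINGLE_VALUED and current and obj not in current:
            raise ConstraintError(f"{predicate} already has a value for {subject}")
        current.add(obj)  # re-adding an existing value changes nothing
    store.clear()
    store.update(working)  # commit only after every triple succeeded


store = {}
insert(store, [("r", "nie:title", "title"),
               ("r", "nie:keyword", "keyw1"),
               ("r", "nie:keyword", "keyw2")])

try:
    insert(store, [("r", "nie:keyword", "keyw4"),
                   ("r", "nie:keyword", "keyw3"),
                   ("r", "nie:title", "title new")])
    failed = False
except ConstraintError:
    failed = True

keywords = sorted(store[("r", "nie:keyword")])
print(failed)    # True
print(keywords)  # ['keyw1', 'keyw2']
```

The two new keywords were valid on their own, but they are discarded together with the failed title.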

Let’s now try the same with INSERT OR REPLACE:

INSERT OR REPLACE { <r> a nie:InformationElement ;
                        nie:title 'title new';
                        nie:keyword 'keyw4';
                        nie:keyword 'keyw3' }

And a quick query now yields:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title new, keyw1
  title new, keyw2
  title new, keyw3
  title new, keyw4

You can see that nie:title behaved differently from nie:keyword. That’s because nie:title is a single value property and nie:keyword is a multi value property.

Resetting the list of keywords to a new list

What if we do want to reset the multi value property and insert a completely new list? Simple, just do this as a single query (space or newline delimited):

DELETE { <r> nie:keyword ?k } WHERE { <r> nie:keyword ?k }
INSERT OR REPLACE { <r> a nie:InformationElement ;
                        nie:title 'title new';
                        nie:keyword 'keyw4';
                        nie:keyword 'keyw3' }

And a quick query now yields:

SELECT ?t ?k { <r> nie:title ?t; nie:keyword ?k }
Results:
  title new, keyw3
  title new, keyw4
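The delete-then-replace pattern above can be modelled like this. It is a rough Python sketch rather than Tracker's implementation: `delete_all`, `insert_or_replace` and `SINGLE_VALUED` are hypothetical names, and the store is a plain dictionary seeded with the state from the previous step.

```python
# Assumption for this sketch: nie:title is single-valued, nie:keyword is not.
SINGLE_VALUED = {"nie:title"}


def delete_all(store, subject, predicate):
    """Models DELETE { s p ?k } WHERE { s p ?k }: drop every value of p on s."""
    store.pop((subject, predicate), None)


def insert_or_replace(store, triples):
    """Overwrite single-valued properties, append to multi-valued ones."""
    for subject, predicate, obj in triples:
        current = store.setdefault((subject, predicate), set())
        if predicate in SINGLE_VALUED:
            current.clear()
        current.add(obj)


# State after the earlier INSERT OR REPLACE: four keywords are stored.
store = {
    ("r", "nie:title"): {"title new"},
    ("r", "nie:keyword"): {"keyw1", "keyw2", "keyw3", "keyw4"},
}

delete_all(store, "r", "nie:keyword")
insert_or_replace(store, [("r", "nie:title", "title new"),
                          ("r", "nie:keyword", "keyw4"),
                          ("r", "nie:keyword", "keyw3")])

keywords = sorted(store[("r", "nie:keyword")])
print(keywords)  # ['keyw3', 'keyw4']
```

Without the `delete_all` step the result would still contain all four keywords.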

The keywords being removed here are literals, not rdfs:Resource instances, so to fully delete them we didn't also need to remove an rdfs:Resource rdf:type as in the first example. If your multiple value property had rdfs:Resource as its range, you would of course (as in the example of Mark and his dogs) have to add rdfs:Resource to the delete:

DELETE { <r> nie:relatedTo ?k . ?k a rdfs:Resource }
WHERE { <r> nie:relatedTo ?k }
INSERT OR REPLACE { _:r1 a nie:DataObject .
                    _:r2 a nie:DataObject .
                    <r> a nie:InformationElement ;
                        nie:title 'title new';
                        nie:relatedTo _:r1;
                        nie:relatedTo _:r2 }

GraphUpdated signal for replaced values

The GraphUpdated signal for when a value gets replaced will contain an entry in the deletes and an entry in the inserts array.
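To illustrate the idea, here is a small Python sketch of a replace that reports one delete entry and one insert entry. It is not the real signal: `replace_single_value` is a hypothetical helper, the entries are plain tuples chosen for readability rather than the signal's actual payload, and only the case where an old value is replaced by a different one is exercised.

```python
def replace_single_value(store, subject, predicate, new_value):
    """Replace a single-valued property and return (deletes, inserts)."""
    deletes, inserts = [], []
    old_value = store.get((subject, predicate))
    if old_value is not None:
        # The value being overwritten is reported as a delete.
        deletes.append((subject, predicate, old_value))
    store[(subject, predicate)] = new_value
    # The value taking its place is reported as an insert.
    inserts.append((subject, predicate, new_value))
    return deletes, inserts


store = {("r", "nie:title"): "title"}
deletes, inserts = replace_single_value(store, "r", "nie:title", "title new")

print(deletes)  # [('r', 'nie:title', 'title')]
print(inserts)  # [('r', 'nie:title', 'title new')]
```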

When used with the limited support for graphs

Let's explain this by example. Say we start with a resource with a nie:title in a graph test://graph-1:

INSERT { GRAPH <test://graph-1> { 
         <test://instance-6> a nie:InformationElement ;
                             nie:title 'title 1'
} }

Suppose we then try to overwrite the graph value for the nie:title using a normal INSERT:

INSERT { GRAPH <test://graph-2> { 
         <test://instance-6> nie:title 'title 1'
} }

Because our limited GRAPH support can only associate a single graph to a specific statement this would be ignored. This means that this query will effectively yield test://graph-1 for variable ?g:

SELECT ?g ?t { GRAPH ?g {
               <test://instance-6> nie:title ?t
} }

However, when you add OR REPLACE to the INSERT, it will overwrite the graph value. Note that this only works as described here for single value properties. For rdf:type (or its shorthand a), for example, the behaviour of a plain INSERT (without OR REPLACE) is used!

INSERT OR REPLACE { GRAPH <test://graph-2> {
                    <test://instance-6> nie:title 'title 1'
} }

The SELECT query will now yield test://graph-2 for the variable ?g:

SELECT ?g ?t WHERE { GRAPH ?g {
                     <test://instance-6> nie:title ?t
} }

INSERT OR REPLACE will also replace values if you replace the graph value at the same time:

INSERT OR REPLACE { GRAPH <test://graph-3> {
                    <test://instance-6> nie:title 'title 2'
} }

That means that the query will now yield test://graph-3 for ?g and title 2 for ?t:

SELECT ?g ?t { GRAPH ?g {
               <test://instance-6> nie:title ?t
} }
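The graph walkthrough above can be replayed with a toy model in which each statement carries exactly one graph. This Python sketch is an illustration under that assumption only: `insert` and `insert_or_replace` are hypothetical functions, and the model covers single value properties such as nie:title, not rdf:type.

```python
# Each (subject, predicate) of a single-valued property maps to (object, graph).
def insert(statements, graph, subject, predicate, obj):
    """Plain INSERT: an already existing statement keeps its original graph."""
    key = (subject, predicate)
    if key in statements:
        old_obj, _old_graph = statements[key]
        if old_obj == obj:
            return  # statement exists, so the new graph is ignored
        raise ValueError("single-valued property already has a different value")
    statements[key] = (obj, graph)


def insert_or_replace(statements, graph, subject, predicate, obj):
    """INSERT OR REPLACE: overwrite both the value and its graph."""
    statements[(subject, predicate)] = (obj, graph)


s, p = "test://instance-6", "nie:title"
statements = {}

insert(statements, "test://graph-1", s, p, "title 1")
insert(statements, "test://graph-2", s, p, "title 1")
after_insert = statements[(s, p)]

insert_or_replace(statements, "test://graph-2", s, p, "title 1")
after_replace = statements[(s, p)]

insert_or_replace(statements, "test://graph-3", s, p, "title 2")
after_new_value = statements[(s, p)]

print(after_insert)     # ('title 1', 'test://graph-1')
print(after_replace)    # ('title 1', 'test://graph-2')
print(after_new_value)  # ('title 2', 'test://graph-3')
```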

Full examples

Projects/Tracker/Documentation/InsertOrReplace (last edited 2014-09-17 16:51:14 by DebarshiRay)