SPARQL Tips & Tricks

The performance of tracker, as seen by an application, can vary a lot depending on the quality of the queries it sends. Here are some tips to make those queries faster.

Retrieve ONLY what you need NOW

  • Every variable in the SELECT part has its values retrieved from the DB and transferred to the client. So, if some data is not strictly needed, do NOT ask for it. For example, suppose the application shows a grid of images in which every tile has the same width and height:
    ### BAD: 
    # 'height', 'width' and 'flash' are not relevant in the UI. 
    # 'u' is the internal tracker uri... probably 'loc' is enough for the application.
    #
    SELECT ?u ?loc ?height ?width ?flash WHERE
    {
       ?u a nmm:Photo ;
         nie:url ?loc ;
         nmm:flash ?flash ;
         nfo:width ?width ;
         nfo:height ?height .
    }
    In this case only this is needed:
    ### GOOD 
    # only 'loc' is needed in this case.
    #
    SELECT ?loc WHERE
    {
       ?u a nmm:Photo ;
          nie:url ?loc.
    }

Minimize the number of queries

  • The roundtrip cost of sending a query and receiving its results is high, and it grows faster than the query time itself. In a few words: it is better to run ONE COMPLEX query than multiple simple queries.
     # Todo: add example
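
     A possible illustration, assuming an application that lists music pieces together with their titles. The URI <urn:example:song1> is a made-up placeholder for one of the results of the first query:

     ### BAD: one query for the list, then one more query per result
     SELECT ?u WHERE { ?u a nmm:MusicPiece }
     # ...and then, once for every ?u returned above:
     SELECT ?title WHERE { <urn:example:song1> nie:title ?title }

     ### GOOD: a single query returns everything in one roundtrip
     SELECT ?u ?title WHERE
     {
        ?u a nmm:MusicPiece ;
           nie:title ?title .
     }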

OPTIONALS are a performance killer

  • OPTIONAL is the SPARQL keyword that indicates that a value for the enclosed properties is not mandatory. Internally it is translated into quite inefficient SQL (not because our translation is bad, but because of the SQL semantics of joins). So, try to avoid OPTIONAL in queries. For that you have mostly two options, depending on the context:

tracker:coalesce function

  • There is a function in tracker (tracker:coalesce) similar to the COALESCE function in SQL: it returns the first non-null value in its list of parameters. This function can be used to reduce the data returned. For example, if we want to show the title of a music piece, we want the title coming from the metadata. If the song doesn't have metadata, we prefer to use the filename. If for some reason that is not available either, we fall back to a default message.

    ### BAD: retrieving both values.
    SELECT ?loc ?title ?filename WHERE
    {
     ?u a nmm:MusicPiece ;
        nie:url ?loc .
     OPTIONAL { ?u nie:title ?title }
     OPTIONAL { ?u nfo:fileName ?filename }
    }

    But we can use tracker:coalesce:

    ### GOOD: using tracker:coalesce
    SELECT ?loc tracker:coalesce (?title, ?filename, "unknown title") WHERE
    {
     ?u a nmm:MusicPiece ;
        nie:url ?loc .
     OPTIONAL { ?u nie:title ?title }
     OPTIONAL { ?u nfo:fileName ?filename }
    }

Use property functions

  • Due to the internal translation of SPARQL into SQL, tracker can be more efficient if some fields are requested using property functions in the SELECT part of the query, instead of the usual pattern in the WHERE part:
     ### BAD Not using property-functions
     SELECT ?u ?title WHERE 
     {
       ?u a nmm:MusicPiece ;
         nie:title ?title;
         nie:url "file:///home/user/a.mp3"
     }
    Instead, we prefer this:
     ### GOOD
     SELECT nie:title(?u) WHERE 
     {
       ?u a nmm:MusicPiece ;
         nie:url "file:///home/user/a.mp3"
     }

Syntactic sugar and performance

FIXME: Format this with examples!

Q: Is there a difference between OPTIONAL{foo}.OPTIONAL{bar} and OPTIONAL{foo. bar} ?
A: Yes. OPTIONAL { foo . bar } matches nothing if only one of the two is set, while OPTIONAL { foo } . OPTIONAL { bar } still matches the one that is set.

 # These two queries are NOT equivalent!

 SELECT ?a ?b WHERE {
   OPTIONAL {
      ?x nie:title ?a .
      ?x nie:description ?b .
   }
 }

 ---
 
 SELECT ?a ?b WHERE {
   OPTIONAL {
      ?x nie:title ?a .
   }
   OPTIONAL {
      ?x nie:description ?b .
   }
 }
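
 A small worked case may make the difference concrete. It assumes a store holding a single made-up resource with a title and no description, and it follows standard SPARQL semantics; the result was not checked against tracker itself:

 # Assumed data (hypothetical):
 #   <urn:example:photo1> nie:title "Holiday" .
 #   (no nie:description is set)
 #
 # First query, one OPTIONAL with both patterns:
 #   the block fails as a whole, so neither ?a nor ?b is bound.
 #
 # Second query, two separate OPTIONAL blocks:
 #   ?a is bound to "Holiday" and only ?b is left unbound.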

Q: Is there a difference between GRAPH x {foo . bar} and GRAPH x {foo}.GRAPH x {bar} ?
A: They should be equivalent.

 FIXME: Example here
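
 A sketch of the two forms being compared. The graph name <urn:example:graph> is a made-up placeholder; the properties are the ones used elsewhere on this page:

 # These two queries should be equivalent:

 SELECT ?a ?b WHERE {
   GRAPH <urn:example:graph> {
      ?x nie:title ?a .
      ?x nie:description ?b .
   }
 }

 ---

 SELECT ?a ?b WHERE {
   GRAPH <urn:example:graph> {
      ?x nie:title ?a .
   }
   GRAPH <urn:example:graph> {
      ?x nie:description ?b .
   }
 }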

Q: For the graph case, is there a perf advantage of having only 1 GRAPH{} ? or is it only marginal?
A: The overhead is the same as for {foo} {bar} compared to {foo . bar}. In many cases it is marginal; in some it might be significant.

 FIXME Example here
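
 For reference, the {foo} {bar} versus {foo . bar} comparison mentioned in the answer could look like this. It is only an illustration of the two shapes, with no measurements behind it:

 # Two separate group patterns: same results, possibly some extra overhead
 SELECT ?a ?b WHERE {
   { ?x nie:title ?a }
   { ?x nie:description ?b }
 }

 # One single group pattern
 SELECT ?a ?b WHERE {
   { ?x nie:title ?a .
     ?x nie:description ?b }
 }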

Q: Same question with "resource prop1 value1.resource prop2 value2" VS "resource prop1 value1; prop2 value2" ?
A: That makes no difference at all

 # These two queries are equivalent, no performance difference:

 SELECT ?a ?b WHERE {
   ?x nie:title ?a .
   ?x nie:description ?b .
 }

 SELECT ?a ?b WHERE {
   ?x nie:title ?a ;
      nie:description ?b .
 }

Q: Does DELETE {foo rdf:type ?v} WHERE {foo rdf:type ?v} completely remove the resource foo ?
A: No, DELETE { <foo> a rdfs:Resource } does.
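
 Written out as full statements, with <foo> standing for the resource URI as in the answer above:

 # According to the answer above, this does NOT completely remove <foo>:
 DELETE { <foo> rdf:type ?v } WHERE { <foo> rdf:type ?v }

 # This removes the resource <foo> completely:
 DELETE { <foo> a rdfs:Resource }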

Projects/Tracker/Documentation/SparqlTipsTricks (last edited 2013-12-19 16:33:45 by LuisMenina)