Full Text Search usage in Tracker's SPARQL queries

Introduction

The following examples show how the FTS syntax works when using fts:match in SPARQL queries.

Some notes:

  • All these rules also apply to the arguments passed to tracker-search.
  • Searches enforce a MINIMUM number of characters per word (3 by default). A word shorter than the minimum is not used in the search.
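
The minimum-length rule can be pictured with a small Python sketch. This is only a toy model of the behaviour described in the note above, not Tracker's implementation: the names MIN_WORD_LENGTH and usable_terms are hypothetical, and splitting on whitespace is a simplifying assumption.

```python
# Toy model of the minimum word length rule (not Tracker's actual code).
MIN_WORD_LENGTH = 3  # the default quoted in the note above


def usable_terms(query, min_length=MIN_WORD_LENGTH):
    """Return the words of `query` that are long enough to be searched."""
    # Assumption: words are separated by whitespace only.
    return [word for word in query.split() if len(word) >= min_length]


print(usable_terms("a red ox"))  # ['red']
```

In this model, 'a' and 'ox' are silently dropped and only 'red' takes part in the search.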

For all examples, a file (colors.txt in the results below) with the following contents is used:

red
green
blue

The simplest case is a search for a single word:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red' }"
Results: 1
  file:///home/user/colors.txt

In addition to exact matches, we can also look for word prefixes by appending a star (*) to the word. In this example we look for files that contain at least one word starting with 'gree':

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'gree*' }"
Results: 1
  file:///home/user/colors.txt
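
As a rough illustration of the difference between an exact match and a prefix match, here is a minimal Python sketch. It only models the behaviour shown above; FILE_WORDS and term_matches are hypothetical names, and the real matching is done by the FTS engine.

```python
# Toy model of exact versus prefix matching (not Tracker's actual code).
FILE_WORDS = ["red", "green", "blue"]  # words of the example file


def term_matches(term, words=FILE_WORDS):
    """Return True if `term` matches at least one word in `words`."""
    if term.endswith("*"):
        # A trailing star turns the term into a prefix search.
        prefix = term[:-1]
        return any(word.startswith(prefix) for word in words)
    # Without the star, only a whole-word match counts.
    return term in words


print(term_matches("gree*"))  # True
print(term_matches("gree"))   # False
```

Without the star, 'gree' is treated as a whole word and does not match 'green' in this model.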

Using the implicit AND operator

When several terms are passed to fts:match, they are combined with an implicit AND. That is, the FTS search looks for items where ALL words are found. For example, the following query looks for files that contain all three of 'red', 'blue' and 'green'. Our test file matches because it contains all of those words.

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red blue green' }"
Results: 1
  file:///home/user/colors.txt

If one of the words in the query doesn't exist in our test file, FTS will not return any result, even if all the other words do exist. The example file doesn't contain the word 'yellow', so no result is returned:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red blue green yellow' }"
Results: 0

Using the explicit AND operator

An explicit AND operator can also be passed between terms instead of relying on the implicit one:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red AND blue AND green' }"
Results: 1
  file:///home/user/colors.txt
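
The equivalence between the implicit and the explicit AND can be sketched in a few lines of Python. This is a simplified model under the assumption that terms are separated by whitespace; FILE_WORDS and matches_all are hypothetical names, not part of Tracker.

```python
# Toy model of AND semantics (not Tracker's actual code).
FILE_WORDS = {"red", "green", "blue"}  # words of the example file


def matches_all(query, words=FILE_WORDS):
    """Return True if every term of `query` is present in `words`."""
    # An explicit AND adds nothing to the implicit one, so drop it.
    terms = [term for term in query.split() if term != "AND"]
    return all(term in words for term in terms)


print(matches_all("red blue green"))          # True
print(matches_all("red AND blue AND green"))  # True
print(matches_all("red blue green yellow"))   # False
```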

Using the OR operator

If the OR keyword separates terms, FTS looks for items that match either term. The example file doesn't contain the word 'yellow', but the query returns a result because the file contains 'red'.

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red OR yellow' }"
Results: 1
  file:///home/user/colors.txt

Mixing the OR and AND operators

The OR operator takes precedence over the implicit or explicit AND.

In the following example, we are looking for red AND (blue OR yellow):

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red blue OR yellow' }"
Results: 1
  file:///home/user/colors.txt

In the following example, we are looking for (red OR blue) AND yellow, so no results are returned:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red OR blue yellow' }"
Results: 0
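
To make the precedence rule concrete, the following Python sketch groups the terms joined by OR first and then requires every group to match. It is a toy evaluator written for this page, assuming whitespace-separated terms and no NOT operator; FILE_WORDS, or_groups and matches are hypothetical names.

```python
# Toy evaluator where OR binds tighter than AND (not Tracker's actual code).
FILE_WORDS = {"red", "green", "blue"}  # words of the example file


def or_groups(query):
    """Split `query` into groups of alternatives joined by OR."""
    groups = []
    join_previous = False
    for token in query.split():
        if token == "AND":
            continue  # explicit AND behaves like the implicit one
        if token == "OR":
            join_previous = True
            continue
        if join_previous and groups:
            groups[-1].append(token)  # alternative to the previous term
        else:
            groups.append([token])    # start of a new AND-ed group
        join_previous = False
    return groups


def matches(query, words=FILE_WORDS):
    """Every group must match, and a group matches if any alternative does."""
    return all(any(term in words for term in group)
               for group in or_groups(query))


print(or_groups("red blue OR yellow"))  # [['red'], ['blue', 'yellow']]
print(matches("red blue OR yellow"))    # True
print(matches("red OR blue yellow"))    # False
```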

The NOT operator (hyphen)

The hyphen (-) is used as the NOT operator in the FTS syntax.

In the following example, we are looking for red AND green but NOT blue. The test file contains blue, so no results are returned:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red green -blue' }"
Results: 0

Please note that FTS applies word-breaking to the input arguments of fts:match. This means that the following query, even though there is no whitespace between the middle word and the hyphen, is equivalent to the previous one: it looks for red AND green but NOT blue.

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red green-blue' }"
Results: 0

Note that you cannot do an FTS query using only a NOT operator:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '-blue' }"
Could not run query, SQL logic error or missing database

In the following additional example, the same FTS error is returned. The search term is split into two words, a and aaa. The first word is skipped because it doesn't reach the minimum length, which leaves the negated aaa as the only item in the query, and as already said that is not allowed. So the following two queries end up being equivalent:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'a-aaa' }"
Could not run query, SQL logic error or missing database
$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '-aaa' }"
Could not run query, SQL logic error or missing database
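
The interaction between word-breaking, the minimum word length and the NOT operator can be modelled with the Python sketch below. It is only an approximation written for illustration: the regular expression is a stand-in for the real word-breaker, the ValueError stands in for the FTS error shown above, and FILE_WORDS, MIN_WORD_LENGTH, parse_terms and matches are hypothetical names.

```python
# Toy model of NOT handling after word-breaking (not Tracker's actual code).
import re

FILE_WORDS = {"red", "green", "blue"}  # words of the example file
MIN_WORD_LENGTH = 3                    # default minimum word length


def parse_terms(query, min_length=MIN_WORD_LENGTH):
    """Split `query` into (positive, negative) term lists."""
    positive, negative = [], []
    # A hyphen directly in front of a word marks that word as negated.
    for hyphen, word in re.findall(r"(-?)(\w+)", query):
        if len(word) < min_length:
            continue  # too short, skipped entirely
        (negative if hyphen else positive).append(word)
    if not positive:
        # Models the error returned for queries with only negated terms.
        raise ValueError("query needs at least one non-negated term")
    return positive, negative


def matches(query, words=FILE_WORDS):
    positive, negative = parse_terms(query)
    return (all(word in words for word in positive)
            and not any(word in words for word in negative))


print(parse_terms("red green-blue"))  # (['red', 'green'], ['blue'])
print(matches("red green -blue"))     # False
```

In this model both '-aaa' and 'a-aaa' raise the error, because 'a' is dropped before the negated term is considered.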

Sentences

FTS also allows looking for sentences, that is, for words which must be present in a specific order. Double quotes (which must be properly escaped) are used to enclose the words of a sentence.

In the following example, we are looking for the red green blue sentence:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '\"red green blue\"' }"
Results: 1
  file:///home/user/colors.txt

If we change the order of the words in the sentence, no result should come up:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '\"red blue green\"' }"
Results: 0

As previously said, input terms are word-broken before being used in FTS, so we could also look for the given sentence like this (see the extra commas):

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '\"red, green, blue\"' }"
Results: 1
  file:///home/user/colors.txt

If any FTS-syntax operator appears inside the sentence (between the escaped double quotes), it is NOT treated as an operator and is removed from the query. In this example, the last -blue is word-broken and finally parsed as blue. The hyphen, the OR and the AND are skipped, just as the commas are. So, the following two queries are fully equivalent:

$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '\"red, OR green, AND -blue\"' }"
Results: 1
  file:///home/user/colors.txt
$> tracker sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match '\"red green blue\"' }"
Results: 1
  file:///home/user/colors.txt
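
Sentence matching can be sketched in Python as a search for a contiguous run of words. This is a toy model of the behaviour described in this section, not Tracker's implementation: the regular expression stands in for the real word-breaker, the minimum word length is left out for brevity, and FILE_WORDS, phrase_words and contains_phrase are hypothetical names.

```python
# Toy model of sentence (phrase) matching (not Tracker's actual code).
import re

FILE_WORDS = ["red", "green", "blue"]  # words of the example file, in order


def phrase_words(sentence):
    """Word-break `sentence`, dropping punctuation and the AND/OR keywords."""
    return [word for word in re.findall(r"\w+", sentence)
            if word not in ("AND", "OR")]


def contains_phrase(sentence, words=FILE_WORDS):
    """Return True if the words of `sentence` appear consecutively in `words`."""
    phrase = phrase_words(sentence)
    size = len(phrase)
    return size > 0 and any(words[start:start + size] == phrase
                            for start in range(len(words) - size + 1))


print(phrase_words("red, OR green, AND -blue"))  # ['red', 'green', 'blue']
print(contains_phrase("red, green, blue"))       # True
print(contains_phrase("red blue green"))         # False
```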

Attic/Tracker/Documentation/Examples/SPARQL/FTS (last edited 2023-08-14 12:50:13 by CarlosGarnacho)