An Introduction to SPARQL Queries

What is SPARQL?

SPARQL (SPARQL Protocol and RDF Query Language, pronounced “sparkle”) is a query language used to retrieve and manipulate linked data stored in RDF (Resource Description Framework). In RDF, data are stored as triples, each made up of a subject, a predicate, and an object. Together these assert that the subject has a particular property or relationship, indicated by the predicate, to the object. For example, the statement that a person has the gender woman is one triple: the person is the subject, "has gender" is the predicate, and "woman" is the object.

Example:

The best way to get started is to write a simple query:

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
}
				

This query will return all the triples in the dataset.

Components of a Query

Let's break down this query further.

?:

Any word following a question mark is a variable.

SELECT:

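The variables listed after SELECT are returned as the columns of a results table. Other query forms return other kinds of results: CONSTRUCT returns a new set of triples (an RDF graph), ASK returns true or false, and DESCRIBE returns triples that describe a resource.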
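
As a small illustration of one of these other query forms, the sketch below uses ASK. It reuses the generic pattern from the first example, so it assumes nothing about the dataset beyond it being queryable; the answer is true if the dataset holds at least one triple and false if it is empty.

```sparql
# ASK returns a single true/false answer instead of a table of results.
# Question asked here: does the dataset contain at least one triple?
ASK WHERE {
    ?subject ?predicate ?object .
}
```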

WHERE:

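The WHERE clause contains the graph pattern: one or more triple patterns, enclosed in braces, that the data must match. A SELECT query must have this pattern (strictly speaking, the keyword WHERE itself may be left out, but it is conventionally written). Variables that are not listed after SELECT may also be used here; they take part in the matching but are not displayed in the results.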
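
To make the difference between SELECT and WHERE concrete, here is a minimal sketch in which only one of the three variables is displayed. DISTINCT is an extra keyword not introduced above; it is added here only to remove duplicate rows, since a subject that appears in many triples would otherwise be listed once per triple.

```sparql
# Only ?subject is listed after SELECT, so the result has a single column.
# ?predicate and ?object are still matched inside WHERE but are not shown.
SELECT DISTINCT ?subject WHERE {
    ?subject ?predicate ?object .
}
```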

Note: If a query may return a large number of results, modifiers such as LIMIT can be used to restrict how many are returned. This topic will be explored in greater detail in the next section. The example below returns at most 10 results.

SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
}
LIMIT 10
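
Without an explicit sort order, a SPARQL endpoint is free to return matching results in any order, so which 10 results come back may vary. As a hedged sketch of one way to make paging repeatable, the variant below combines LIMIT with two other standard modifiers, ORDER BY and OFFSET; the choice of sort keys is only an illustration.

```sparql
# Sort on all three variables so the order of results is fully determined,
# skip the first 10 rows, then return the next 10 (the "second page").
SELECT ?subject ?predicate ?object WHERE {
    ?subject ?predicate ?object .
}
ORDER BY ?subject ?predicate ?object
LIMIT 10
OFFSET 10
```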
				

Prefixes and URIs

The data are stored as triples in subject, predicate, object order. The parts of a triple are usually identified by a Uniform Resource Identifier (URI), although an object may instead be a literal value such as a string or a number. For example, the gender term "woman" in the CWRC Ontology is represented by the following URI: <http://sparql.cwrc.ca/ontologies/cwrc#woman>

URIs in queries are often shortened using a prefix, which stands for a namespace and is declared at the beginning of the query with the PREFIX keyword. The prefix stands in for most of the URI, leaving only the local name at the end, which designates a term from that ontology or vocabulary.

Thus the same term above could be represented in a triple as: cwrc:woman
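
The sketch below shows both notations side by side in a single query, using the cwrc: namespace and the cwrc:hasGender property that appear in the queries in this introduction. The predicate is written as a full URI in angle brackets and the object in prefixed form; the two notations are interchangeable, so either term could be written either way.

```sparql
# Declare the prefix once at the top of the query.
PREFIX cwrc: <http://sparql.cwrc.ca/ontologies/cwrc#>
SELECT ?person WHERE {
    # Predicate: full URI. Object: prefixed name, which expands to
    # <http://sparql.cwrc.ca/ontologies/cwrc#woman>.
    ?person <http://sparql.cwrc.ca/ontologies/cwrc#hasGender> cwrc:woman .
}
```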

Interpreting a simple SPARQL query

PREFIX cwrc: <http://sparql.cwrc.ca/ontologies/cwrc#>
SELECT ?person WHERE { 
    ?person cwrc:hasGender cwrc:woman . 
    ?person cwrc:hasOccupation ?occupation .
    ?occupation a cwrc:teacher .
}

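This query searches the CWRC dataset for every person who satisfies the following requirements: the person is a woman and has an occupation that is a teacher. Only ?person is listed after SELECT, so the results show the matching people; ?occupation is used solely to link the second and third patterns.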

In plain English, it could be translated as follows:

There is a person who has the gender woman

That same person has an occupation

That occupation is a teacher
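
The keyword a in the third pattern is SPARQL's built-in shorthand for the property rdf:type, which states that a resource is an instance of a class. As a sketch, the same query can be written with that property spelled out; the rdf: prefix below is the standard RDF namespace, and the cwrc: terms are those used in the query above.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX cwrc: <http://sparql.cwrc.ca/ontologies/cwrc#>
SELECT ?person WHERE {
    ?person cwrc:hasGender cwrc:woman .
    ?person cwrc:hasOccupation ?occupation .
    # "a" and rdf:type are interchangeable in the predicate position.
    ?occupation rdf:type cwrc:teacher .
}
```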

Notes on Style

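Every complete triple pattern in a query should end in a period. The period after the last pattern before the closing brace is optional, but including it keeps queries consistent.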

If consecutive triple patterns share the same subject, the subject can be written once: end the first pattern with a semicolon instead of a period, and omit the subject from the next pattern. Example:

PREFIX cwrc: <http://sparql.cwrc.ca/ontologies/cwrc#>
SELECT ?person ?occupation WHERE {
    ?person cwrc:hasGender cwrc:woman ;
            cwrc:hasOccupation ?occupation .
}
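
In SPARQL, a semicolon after a triple pattern means that the next pattern reuses the same subject, and a period closes the group. As a sketch of how this combines with patterns about a different subject, here is the earlier woman-and-teacher query in that style; all terms come from the queries above.

```sparql
PREFIX cwrc: <http://sparql.cwrc.ca/ontologies/cwrc#>
SELECT ?person WHERE {
    # Two patterns about ?person: the semicolon carries the subject over.
    ?person cwrc:hasGender cwrc:woman ;
            cwrc:hasOccupation ?occupation .
    # A new subject starts after the period, so it is written out in full.
    ?occupation a cwrc:teacher .
}
```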