SPARQL chunks and queries
André Ourednik - https://ourednik.info
2022-06-01
Source:vignettes/use.Rmd
This package allows you to query SPARQL endpoints in two different ways:
- Run SPARQL chunks in R Markdown files.
- Use inline functions to send SPARQL queries to a user-defined endpoint and retrieve the data as a data frame (sparql2df) or as a list (sparql2list).
Endpoints can be reached from behind corporate firewalls on Windows machines thanks to automatic proxy detection. See Execute SPARQL chunks in R Markdown.
Use
To use the full potential of the package you need to load the library and tell knitr that a SPARQL engine exists:
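The setup code does not appear on this page. A minimal sketch of what it might look like follows, assuming the package is installed under the name SPARQLchunks and exports a knitr engine function named eng_sparql. Both names are assumptions not confirmed by the text here; knitr::knit_engines$set() is the standard knitr mechanism for registering a custom engine.

```r
# Assumption: the package is called SPARQLchunks and exports eng_sparql.
library(SPARQLchunks)

# Register the engine so that knitr knows how to evaluate {sparql} chunks.
knitr::knit_engines$set(sparql = SPARQLchunks::eng_sparql)
```

Such a chunk would normally sit near the top of the R Markdown file, before the first SPARQL chunk.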
Once you have done so, you can run SPARQL chunks:
Chunks
Retrieve a data frame
- output.var: the name of the data frame in which to store the results
- endpoint: the URL of the SPARQL endpoint
- autoproxy: whether to attempt automatic proxy detection
Example 1 (Swiss administration endpoint)
```{sparql output.var="queryres_csv", endpoint="https://lindas.admin.ch/query"}
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
  ?sub a schema:DataCatalog .
  ?subtype a schema:DataType .
}
```
Example 2 (UniProt endpoint)
Note the attempt at automatic proxy detection, enabled with autoproxy=TRUE.
```{sparql output.var="tes5", endpoint="https://sparql.uniprot.org/sparql", autoproxy=TRUE}
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?taxon
FROM <http://sparql.uniprot.org/taxonomy>
WHERE {
  ?taxon a up:Taxon .
} LIMIT 500
```
Example 3 (Wikidata endpoint)
```{sparql output.var="res.df", endpoint="https://query.wikidata.org/sparql"}
SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel ?linkTo ?linkToLabel
WHERE {
  ?item wdt:P1142 ?linkTo .
  ?linkTo wdt:P31 wd:Q12909644 .
  VALUES ?type { wd:Q7278 wd:Q24649 }
  ?item wdt:P31 ?type .
  ?item wdt:P17 ?country .
  MINUS { ?item wdt:P576 ?abolitionDate }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```
Inline code
The inline functions sparql2df and sparql2list both take the same pair of arguments: a SPARQL endpoint and a SPARQL query. Queries can be multi-line:
```r
endpoint <- "https://lindas.admin.ch/query"
query <- "PREFIX schema: <http://schema.org/>
SELECT * WHERE {
  ?sub a schema:DataCatalog .
  ?subtype a schema:DataType .
}"
```
Retrieve a data frame
```r
result_df <- sparql2df(endpoint, query)
```
The same, with an attempt at automatic proxy detection:
```r
result_df <- sparql2df(endpoint, query, autoproxy = TRUE)
```
Retrieve a list
```r
result_list <- sparql2list(endpoint, query)
```
The same, with an attempt at automatic proxy detection:
```r
result_list <- sparql2list(endpoint, query, autoproxy = TRUE)
```
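The direct call and the autoproxy call can be combined so that proxy detection is attempted only when the direct request fails. The following is a minimal sketch, not part of the package: query_with_fallback and its fetch parameter are hypothetical names. It also assumes that the query function signals an R error when the endpoint cannot be reached, which this page does not confirm.

```r
# Hypothetical helper (not part of the package): try a direct request first,
# then retry with automatic proxy detection if the first attempt errors.
# `fetch` is any function that accepts (endpoint, query, autoproxy = ...),
# for example sparql2df or sparql2list.
query_with_fallback <- function(endpoint, query, fetch) {
  tryCatch(
    fetch(endpoint, query),
    error = function(e) {
      # Assumption: a failed request surfaces as an R error condition.
      message("Direct request failed, retrying with autoproxy: ",
              conditionMessage(e))
      fetch(endpoint, query, autoproxy = TRUE)
    }
  )
}
```

A call such as query_with_fallback(endpoint, query, fetch = sparql2df) would then return the same kind of object as the underlying function. Passing the query function as an argument keeps the helper usable for both the data frame and the list variant.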