# Service descriptions

This document contains a short introduction to working with [SPARQL service descriptions](https://www.w3.org/TR/sparql11-service-description/) using [rudof](https://rudof-project.github.io/).


## Preliminaries: Install and configure rudof

The Python bindings for rudof are available as the `pyrudof` package.

In [1]:
!pip install pyrudof

Collecting pyrudof
  Downloading pyrudof-0.1.130-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.9 kB)
Downloading pyrudof-0.1.130-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.5 MB)
Installing collected packages: pyrudof
Successfully installed pyrudof-0.1.130


The main entry point is a class called `Rudof`, through which most of the functionality is provided.

In [2]:
from pyrudof import Rudof, RudofConfig, ServiceDescription

To initialize that class, you can pass a `RudofConfig` instance, which holds the configuration parameters used to customize its behavior.

In [3]:
rudof = Rudof(RudofConfig())

## Obtaining information about SPARQL endpoints

### When the SPARQL endpoint directly generates the service description in RDF (UniProt)

It is possible to get information from SPARQL endpoints that publish [service descriptions](https://www.w3.org/TR/sparql11-service-description/) using the [VoID](https://www.w3.org/TR/void/) vocabulary.

In [4]:
rudof.reset_all()
from pyrudof import ReaderMode, RDFFormat, ServiceDescriptionFormat, QueryResultFormat

In [5]:
service = "https://sparql.uniprot.org/sparql"

rudof.read_service_description(service, RDFFormat.Turtle, None, ReaderMode.Strict)
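
As background, the SPARQL 1.1 Service Description specification recommends that an endpoint return its description when its URL is dereferenced with a plain HTTP GET and no query parameters. The sketch below uses only the Python standard library and is independent of rudof; it builds such a request without sending it. The helper name `service_description_request` is hypothetical, and it assumes the endpoint honours an `Accept: text/turtle` header.

```python
from urllib.request import Request


def service_description_request(endpoint: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) a GET request for an endpoint's service description."""
    # No query string is added: a bare GET on the endpoint URL asks for the
    # description itself, and the Accept header selects the RDF serialization.
    return Request(endpoint, headers={"Accept": media_type}, method="GET")


request = service_description_request("https://sparql.uniprot.org/sparql")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing `request` to `urllib.request.urlopen` would perform the actual download, which can be handy for inspecting the raw Turtle that an endpoint publishes.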


The service description can now be serialized, for example as JSON:

In [6]:
service = rudof.get_service_description()
service_str = service.serialize(ServiceDescriptionFormat.Json)
print(f"Service description in JSON:\n{service_str}")


Service description in JSON:
{
  "title": "UniProt",
  "endpoint": "https://sparql.uniprot.org/sparql",
  "default_dataset": {
    "id": {
      "Iri": "https://sparql.uniprot.org/sparql#sparql-default-dataset"
    },
    "default_graph": {
      "id": {
        "Iri": "https://sparql.uniprot.org/.well-known/void#sparql-default-graph"
      },
      "triples": 217505202099
    },
    "named_graphs": [
      {
        "id": {
          "Iri": "http://sparql.uniprot.org/proteomes"
        },
        "name": "http://sparql.uniprot.org/proteomes",
        "graphs": [
          {
            "id": {
              "Iri": "https://sparql.uniprot.org/.well-known/void#_graph_proteomes!94c2c578"
            },
            "triples": 8984258,
            "classes": 11,
            "property_partition": [
              {
                "id": {
                  "Iri": "https://sparql.uniprot.org/.well-known/void#proteomes!370b02de!primaryTopicOf"
                },
                "property": "ht

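Because the serialization is plain JSON, it can be post-processed with Python's `json` module. The following is a minimal sketch that sums the triple counts per named graph. The `sample` string is a hand-trimmed excerpt that mirrors the fields visible in the output above (in the notebook, `service_str` would be passed instead), and the helper name `triples_per_named_graph` is hypothetical.

```python
import json

# Hand-trimmed sample mirroring the shape of the JSON printed above.
sample = """
{
  "title": "UniProt",
  "endpoint": "https://sparql.uniprot.org/sparql",
  "default_dataset": {
    "default_graph": {"triples": 217505202099},
    "named_graphs": [
      {
        "name": "http://sparql.uniprot.org/proteomes",
        "graphs": [{"triples": 8984258, "classes": 11}]
      }
    ]
  }
}
"""


def triples_per_named_graph(description_json: str) -> dict:
    """Map each named graph to the total triple count of its graph entries."""
    description = json.loads(description_json)
    named_graphs = description.get("default_dataset", {}).get("named_graphs", [])
    return {
        ng["name"]: sum(g.get("triples", 0) for g in ng.get("graphs", []))
        for ng in named_graphs
    }


counts = triples_per_named_graph(sample)
print(counts)
```

The `.get(...)` defaults keep the helper tolerant of descriptions that omit some of these fields.
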
rudof also supports converting service descriptions to MIE format files:

In [7]:
mie = service.as_mie()
print(f"Service description in MIE format as YAML:\n{mie.as_yaml()}")

Service description in MIE format as YAML:
---
schema_info:
  title: UniProt
  endpoint: "https://sparql.uniprot.org/sparql"
prefixes:
  pav: "http://purl.org/pav/"
  dcterms: "http://purl.org/dc/terms/"
  xsd: "http://www.w3.org/2001/XMLSchema#"
  formats: "http://www.w3.org/ns/formats/"
  "": "http://www.w3.org/ns/sparql-service-description#"
  void: "http://rdfs.org/ns/void#"
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  void_ext: "http://ldf.fi/void-ext#"
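
The `prefixes` section maps each prefix to a namespace IRI, with the empty key `""` standing for the default prefix. As a small illustration of how such a map is read, the sketch below expands prefixed names into full IRIs. The dictionary is copied by hand from a subset of the output above, and the `expand` helper is hypothetical rather than part of `pyrudof`.

```python
# Subset of the prefix map shown in the MIE output above, copied by hand.
PREFIXES = {
    "void": "http://rdfs.org/ns/void#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dcterms": "http://purl.org/dc/terms/",
    "": "http://www.w3.org/ns/sparql-service-description#",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Expand `prefix:local` into a full IRI using the given prefix map."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("void:triples"))
print(expand(":Service"))
```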


### Using SPARQL CONSTRUCT queries

Some endpoints, like RDFPortal, keep their service description in a named graph that can be retrieved with a SPARQL CONSTRUCT query. In the case of RDFPortal, the service description is available through the endpoint [https://plod.dbcls.jp/repositories/RDFPortal_VoID](https://plod.dbcls.jp/repositories/RDFPortal_VoID).

In [8]:
endpoint = "https://plod.dbcls.jp/repositories/RDFPortal_VoID"


In [9]:
construct_query = """
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sd: <http://www.w3.org/ns/sparql-service-description#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT WHERE {
  [
    a sd:Service ;
    sd:defaultDataset [
       a sd:Dataset ;
       sd:namedGraph [
         sd:name <http://sparql.uniprot.org/uniprot> ;
         a sd:NamedGraph ;
         sd:endpoint ?ep_url ;
         sd:graph [
           a void:Dataset ;
           void:triples ?total_count ;
           void:classes ?class_count ;
           void:properties ?property_count ;
           void:distinctObjects ?uniq_object_count ;
           void:distinctSubjects ?uniq_subject_count ;
           void:classPartition [
             void:class ?class_name ;
             void:entities ?class_triple_count
           ] ;
           void:propertyPartition [
             void:property ?property_name ;
             void:triples ?property_triple_count
           ]
         ]
       ]
     ]
  ] .
}
"""

It is possible to run SPARQL CONSTRUCT queries in rudof using:

In [11]:
rudof.use_endpoint(endpoint)

In [12]:
dbcls_turtle = rudof.run_query_construct_str(construct_query, QueryResultFormat.Turtle)
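
The query used here hard-codes the graph name `<http://sparql.uniprot.org/uniprot>`. To fetch the description of a different named graph, the query text could be generated from a template. The following is a minimal sketch with a shortened, illustrative pattern (not the full query from the notebook); `QUERY_TEMPLATE` and `construct_query_for` are hypothetical names.

```python
from string import Template

# Shortened, illustrative query with the graph name left as a placeholder.
# string.Template uses "$", so SPARQL's "?variables" and braces are left alone.
QUERY_TEMPLATE = Template("""
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX sd: <http://www.w3.org/ns/sparql-service-description#>

CONSTRUCT WHERE {
  ?named_graph a sd:NamedGraph ;
    sd:name <$graph_iri> ;
    sd:endpoint ?ep_url ;
    sd:graph ?graph .
  ?graph a void:Dataset ;
    void:triples ?total_count .
}
""")

# Characters that the SPARQL grammar does not allow between "<" and ">".
FORBIDDEN_IRI_CHARS = set('<>"{}|^`\\')


def construct_query_for(graph_iri: str) -> str:
    """Fill in the named-graph IRI, rejecting characters SPARQL forbids in IRIs."""
    if any(ch in FORBIDDEN_IRI_CHARS or ch <= " " for ch in graph_iri):
        raise ValueError(f"not usable as a SPARQL IRI: {graph_iri!r}")
    return QUERY_TEMPLATE.substitute(graph_iri=graph_iri)


query = construct_query_for("http://sparql.uniprot.org/uniprot")
print(query)
```

Validating the IRI before substitution avoids producing a malformed query, or an injected one, when the graph name comes from user input.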



In [13]:
print(dbcls_turtle)

@prefix void: <http://rdfs.org/ns/void#> .
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdf4j: <http://rdf4j.org/schema/rdf4j#> .
@prefix sesame: <http://www.openrdf.org/schema/sesame#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix fn: <http://www.w3.org/2005/xpath-functions#> .

<http://rdfportal.org/ns/svcroot> a sd:Service;
  sd:defaultDataset <http://rdfportal.org/ns/root> .

<http://rdfportal.org/ns/root> a sd:Dataset;
  sd:namedGraph <http://rdfportal.org/ns/2c0544f688656cff12b9b69af828aaf3> .

<http://rdfportal.org/ns/2c0544f688656cff12b9b69af828aaf3> a sd:NamedGraph, void:Dataset;
  sd:name <http://sparql.uniprot.org/uniprot>;
  sd:endpoint <https://rdfportal.org/sib/sparql>;
  void:triples 53650641556;
  void:classes 109;
  void:properties 85;
  void:distinctObjects 102943

In [None]:
rudof.read_service_description_str(dbcls_turtle)

The service description can now be retrieved and serialized as before:

In [14]:
service = rudof.get_service_description()
service_str = service.serialize(ServiceDescriptionFormat.Json)
print(f"Service description in JSON:\n{service_str}")


Service description in JSON:
{
  "title": "UniProt",
  "endpoint": "https://sparql.uniprot.org/sparql",
  "default_dataset": {
    "id": {
      "Iri": "https://sparql.uniprot.org/sparql#sparql-default-dataset"
    },
    "default_graph": {
      "id": {
        "Iri": "https://sparql.uniprot.org/.well-known/void#sparql-default-graph"
      },
      "triples": 217505202099
    },
    "named_graphs": [
      {
        "id": {
          "Iri": "http://sparql.uniprot.org/proteomes"
        },
        "name": "http://sparql.uniprot.org/proteomes",
        "graphs": [
          {
            "id": {
              "Iri": "https://sparql.uniprot.org/.well-known/void#_graph_proteomes!94c2c578"
            },
            "triples": 8984258,
            "classes": 11,
            "property_partition": [
              {
                "id": {
                  "Iri": "https://sparql.uniprot.org/.well-known/void#proteomes!370b02de!primaryTopicOf"
                },
                "property": "ht