ShEx
This document contains a short introduction to ShEx using rudof.
Preliminaries: Install and configure rudof
First, we install and configure rudof.
!pip install pyrudof
from pyrudof import Rudof, RudofConfig, RudofError
rudof = Rudof(RudofConfig())
Requirement already satisfied: pyrudof in /opt/hostedtoolcache/Python/3.11.15/x64/lib/python3.11/site-packages (0.2.9)
Validating RDF using ShEx
ShEx (Shape Expressions) is a concise and human-readable language to describe and validate RDF data.
A ShEx schema contains several shape declarations and can be written in several formats: a compact format (ShExC), a JSON-LD format (ShExJ) and an RDF-based format (ShExR).
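To give a feel for how the formats differ, take a shape that in ShExC reads `:User { :name xsd:string ; :knows @:User * }`. The sketch below builds what we believe is the corresponding ShExJ document, following the JSON layout of the ShEx 2.1 specification as we understand it (newer versions of the specification wrap each shape in a ShapeDecl object). It uses only the Python standard library, and we have not checked whether rudof accepts this exact document.

```python
import json

EX = "http://example.org/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# ShExJ sketch of:  :User { :name xsd:string ; :knows @:User * }
shexj = {
    "@context": "http://www.w3.org/ns/shex.jsonld",
    "type": "Schema",
    "shapes": [
        {
            "id": EX + "User",
            "type": "Shape",
            "expression": {
                "type": "EachOf",
                "expressions": [
                    {
                        # :name xsd:string (default cardinality: exactly one)
                        "type": "TripleConstraint",
                        "predicate": EX + "name",
                        "valueExpr": {"type": "NodeConstraint", "datatype": XSD + "string"},
                    },
                    {
                        # :knows @:User *  (min 0, max -1 meaning unbounded)
                        "type": "TripleConstraint",
                        "predicate": EX + "knows",
                        "valueExpr": EX + "User",
                        "min": 0,
                        "max": -1,
                    },
                ],
            },
        }
    ],
}

shexj_text = json.dumps(shexj, indent=2)
print(shexj_text)
```

The compact syntax is clearly easier to read and write by hand, which is why the rest of this document uses ShExC.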
Let’s start by defining a simple ShEx schema:
rudof.read_shex("""
prefix : <http://example.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
:User {
:name xsd:string ;
:birthDate xsd:date ;
:knows @:User * ;
:worksFor @:Company *
}
:Company {
:name xsd:string ;
:code xsd:integer ;
:employee @:User *
}
""")
And let’s define some RDF data.
rudof.read_data("""
prefix : <http://example.org/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
:alice a :Person ;
:name "Alice" ;
:birthDate "2005-03-01"^^xsd:date ;
:worksFor :acme ;
:knows :bob .
:bob a :Person ;
:name "Robert Smith" ;
:birthDate "2003-01-02"^^xsd:date ;
:worksFor :acme ;
:knows :alice .
:acme a :Company ;
:name "Acme Inc." ;
:code 23 .
""")
In order to validate nodes in a graph, ShEx uses a ShapeMap, which associates a node selector with a shape.
The simplest shape map is a pair node@shape.
Rudof keeps the current shape map in a structure so it can be reused.
The method read_shapemap(shapemap) reads a shape map from a string and stores its value as the current shape map.
In the previous example, if we want to validate :alice as a :User we can use:
rudof.read_shapemap(":alice@:User")
Once the ShEx schema and the Shapemap have been added to rudof, it is possible to validate the current RDF data with the validate_shex() method:
from pyrudof import ResultShexValidationFormat
rudof.validate_shex()
results = rudof.serialize_shex_validation_results(format=ResultShexValidationFormat.Compact)
print(results)
Results:
╭────────┬───────┬────────╮
│ Node │ Shape │ Status │
├────────┼───────┼────────┤
│ :alice │ :User │ OK │
╰────────┴───────┴────────╯
If we instead try to validate :alice as a :Company, validation fails, since :alice lacks the :code value that :Company requires:
from pyrudof import ResultShexValidationFormat
rudof.read_shapemap(":alice@:Company")
rudof.validate_shex()
results = rudof.serialize_shex_validation_results(format=ResultShexValidationFormat.Details)
print(results)
Results:
╭────────┬──────────┬────────┬─────────────────────────────────────────────────────────────────────╮
│ Node │ Shape │ Status │ Details │
├────────┼──────────┼────────┼─────────────────────────────────────────────────────────────────────┤
│ :alice │ :Company │ FAIL │ Error Shape 1 failed for node http://example.org/alice with errors │
│ │ │ │ │
╰────────┴──────────┴────────┴─────────────────────────────────────────────────────────────────────╯
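The Details output does not name the constraint that failed. Reading the schema, :Company requires exactly one :code arc, and :alice has none. The toy check below illustrates that reasoning. It is not rudof's algorithm: it only counts outgoing arcs per predicate, ignores datatypes and shape references, and treats shapes as open (predicates not mentioned in the shape, such as rdf:type, are ignored). The names Constraint and cardinality_errors are made up for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """One triple constraint, reduced to a predicate and a cardinality."""
    predicate: str
    min_count: int = 1
    max_count: Optional[int] = 1  # None stands for "unbounded" (the * operator)


# Hand-written summaries of the :User and :Company shapes from the schema.
USER = [
    Constraint(":name"),
    Constraint(":birthDate"),
    Constraint(":knows", 0, None),
    Constraint(":worksFor", 0, None),
]
COMPANY = [
    Constraint(":name"),
    Constraint(":code"),
    Constraint(":employee", 0, None),
]

# Outgoing arcs of :alice as (predicate, object) pairs.
ALICE = [
    ("rdf:type", ":Person"),
    (":name", "Alice"),
    (":birthDate", "2005-03-01"),
    (":worksFor", ":acme"),
    (":knows", ":bob"),
]


def cardinality_errors(arcs, shape):
    """Return one message per constraint whose arc count is out of range."""
    errors = []
    for c in shape:
        count = sum(1 for predicate, _ in arcs if predicate == c.predicate)
        too_few = count < c.min_count
        too_many = c.max_count is not None and count > c.max_count
        if too_few or too_many:
            errors.append(f"{c.predicate}: found {count} arc(s)")
    return errors


print(cardinality_errors(ALICE, USER))     # []
print(cardinality_errors(ALICE, COMPANY))  # [':code: found 0 arc(s)']
```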
The shape map can contain a list of node and shape pairs, so several validations can be run at once:
rudof.read_shapemap(":alice@:User, :bob@:User")
rudof.validate_shex()
results = rudof.serialize_shex_validation_results()
print(results)
Results:
╭────────┬───────┬────────┬──────────────────────────────────────────────────────────────────────────────────╮
│ Node │ Shape │ Status │ Details │
├────────┼───────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :alice │ :User │ OK │ Shape passed. Node :alice, shape 0: :User = {(:name xsd:string ; :birthDate xsd: │
│ │ │ │ date ; :knows @0* ; :worksFor @1* ; )} │
│ │ │ │ │
├────────┼───────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :bob │ :User │ OK │ Shape passed. Node :bob, shape 0: :User = {(:name xsd:string ; :birthDate xsd:da │
│ │ │ │ te ; :knows @0* ; :worksFor @1* ; )} │
│ │ │ │ │
╰────────┴───────┴────────┴──────────────────────────────────────────────────────────────────────────────────╯
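A shape map of this simple kind is just a comma-separated list of node@shape pairs. To make that structure explicit, here is a small sketch of a hypothetical helper, parse_fixed_shapemap, which is not part of pyrudof. It only handles prefixed names as used in this document; the full ShapeMap syntax also allows, for example, IRIs in angle brackets and node selectors based on triple patterns, which this sketch does not attempt to parse.

```python
def parse_fixed_shapemap(text: str) -> list[tuple[str, str]]:
    """Split 'node@shape, node@shape' into a list of (node, shape) pairs."""
    pairs = []
    for association in text.split(","):
        association = association.strip()
        if not association:
            continue  # tolerate a trailing comma
        node, separator, shape = association.partition("@")
        node, shape = node.strip(), shape.strip()
        if not separator or not node or not shape:
            raise ValueError(f"Expected node@shape, got: {association!r}")
        pairs.append((node, shape))
    return pairs


print(parse_fixed_shapemap(":alice@:User, :bob@:User"))
# [(':alice', ':User'), (':bob', ':User')]
```

Each pair corresponds to one row of the results table above.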
We reset the status of the ShEx schema, the Shapemap and the current RDF data for the next section.
rudof = Rudof(RudofConfig())
Validating SPARQL endpoints
It is also possible to validate RDF data that is not local but is available through a SPARQL endpoint such as Wikidata or DBpedia. Let’s start with Wikidata:
rudof.read_data(endpoint="wikidata")
We can declare a simple schema for Wikidata entities as follows:
rudof.read_shex("""
prefix : <http://example.org/>
prefix wd: <http://www.wikidata.org/entity/>
prefix wdt: <http://www.wikidata.org/prop/direct/>
:Researcher {
wdt:P31 [ wd:Q5 ] ; # Instance of Human
wdt:P19 @:Place ; # BirthPlace
}
:Place {
wdt:P17 @:Country * ; # Country
}
:Country {}
""")
rudof.read_shapemap("wd:Q80@:Researcher")
rudof.validate_shex()
results = rudof.serialize_shex_validation_results()
print(results)
Results:
╭──────────────────────────────────────┬─────────────┬────────┬──────────────────────────────────────────────────────────────────────────────────╮
│ Node │ Shape │ Status │ Details │
├──────────────────────────────────────┼─────────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ <http://www.wikidata.org/entity/Q80> │ :Researcher │ OK │ Shape passed. Node <http://www.wikidata.org/entity/Q80>, shape 0: :Researcher = │
│ │ │ │ {(wdt:P31 [wd:Q5 ] ; wdt:P19 @1 ; )} │
│ │ │ │ │
╰──────────────────────────────────────┴─────────────┴────────┴──────────────────────────────────────────────────────────────────────────────────╯
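When the data sits behind an endpoint, the validator has to fetch the arcs of each node it checks, such as the wdt:P31 and wdt:P19 values of wd:Q80. We have not checked which queries rudof actually sends, so the following is only an illustration of the idea: a hypothetical helper, outgoing_arcs_query, that builds a SPARQL query returning all outgoing arcs of a node. It builds the query string and does not contact any endpoint.

```python
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def outgoing_arcs_query(node_iri: str) -> str:
    """Build a SPARQL query that lists every (predicate, object) of a node."""
    return f"SELECT ?p ?o WHERE {{ <{node_iri}> ?p ?o }}"


print(outgoing_arcs_query(WIKIDATA_ENTITY + "Q80"))
# SELECT ?p ?o WHERE { <http://www.wikidata.org/entity/Q80> ?p ?o }
```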
Visualizing ShEx schemas
rudof can be used to convert ShEx schemas to diagrams in a UML-like style. The converter generates a PlantUML string, which can be written to a file and converted to an image using the PlantUML tool.
from pyrudof import ConversionMode, ResultConversionMode, ConversionFormat, ResultConversionFormat
shex_schema = """
prefix : <http://example.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
:User {
:name xsd:string ;
:worksFor @:Company * ;
:address @:Address ;
:knows @:User
}
:Company {
:name xsd:string ;
:code xsd:string ;
:employee @:User
}
:Address {
:name xsd:string ;
:zip_code xsd:string
}
"""
plant_uml = rudof.convert_schemas(
shex_schema,
input_mode=ConversionMode.ShEx,
output_mode=ResultConversionMode.Uml,
input_format=ConversionFormat.ShExC,
output_format=ResultConversionFormat.PlantUML
)
with open('out.puml', 'w') as _f:
    _f.write(plant_uml)
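For readers who have not seen PlantUML before, the generated string is plain text that describes classes and the associations between them. The snippet below hand-writes a small class diagram in that notation for two of the shapes. It is not rudof's actual output, whose layout, naming and styling may well differ; it only shows the kind of text that ends up in out.puml.

```python
# Hand-written PlantUML class diagram, for illustration only.
example_puml = "\n".join([
    "@startuml",
    "class User {",
    "  name : xsd:string",
    "}",
    "class Company {",
    "  name : xsd:string",
    "  code : xsd:string",
    "}",
    # An association from User to Company, with '*' cardinality on the target.
    'User --> "*" Company : worksFor',
    "@enduml",
])

print(example_puml)
```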
Now we install the tools needed to render the generated PlantUML file as an image.
!pip install plantuml
!pip install ipython
!python -m plantuml out.puml
from IPython.display import Image
[{'filename': 'out.puml', 'gen_success': True}]
Image("out.png")