Introduction to DCTAP
This document contains a short introduction to DCTAP using rudof.
!pip install pyrudof
from pyrudof import Rudof, RudofConfig, RudofError
rudof = Rudof(RudofConfig())
!pip install ipython # If not already installed
!pip install plantuml
from IPython.display import Image # For displaying images
What is DCTAP?
DCTAP (Dublin Core Tabular Application Profiles) is a model for defining metadata application profiles in a tabular format.
Profiles can therefore be written in CSV and then converted to other schema technologies such as ShEx or SHACL.
Converting DCTAP to ShEx
Rudof supports DCTAP: it can read DCTAP profiles from CSV or Excel files and convert them to other schema languages.
DCTAP represents shapes as a table, either in CSV or in a spreadsheet format such as XLSX. As an example, the following CSV data defines two shapes, Person and Company:
dctap_str = """shapeId,propertyId,Mandatory,Repeatable,valueDatatype,valueShape
Person,name,true,false,xsd:string,
,birthdate,false,false,xsd:date,
,worksFor,false,true,,Company
Company,name,true,false,xsd:string,
,employee,false,true,,Person
"""
rudof.read_dctap(dctap_str)
print(rudof.serialize_dctap())
Shape(Person)
name xsd:string
birthdate xsd:date ?
worksFor @Company *
Shape(Company)
name xsd:string
employee @Person *
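In the table above, a row with a blank shapeId belongs to the most recently named shape, which is why birthdate and worksFor end up under Person. The following is a minimal sketch of that grouping rule using only the Python standard library. It is not how rudof parses DCTAP; the function name `group_by_shape` is hypothetical, and rejecting a table whose first row has no shapeId is a simplification made for this sketch.

```python
import csv
import io

# Same table as above, repeated so the sketch is self-contained.
DCTAP_CSV = """shapeId,propertyId,Mandatory,Repeatable,valueDatatype,valueShape
Person,name,true,false,xsd:string,
,birthdate,false,false,xsd:date,
,worksFor,false,true,,Company
Company,name,true,false,xsd:string,
,employee,false,true,,Person
"""


def group_by_shape(text):
    """Group property ids by shape; a blank shapeId continues the previous shape."""
    shapes = {}
    current = None
    for row in csv.DictReader(io.StringIO(text)):
        shape_id = row["shapeId"].strip()
        if shape_id:
            current = shape_id  # a non-blank cell starts (or resumes) a shape
        if current is None:
            # Simplification for this sketch: require the first row to name a shape.
            raise ValueError("the first row must name a shape")
        shapes.setdefault(current, []).append(row["propertyId"])
    return shapes


shapes = group_by_shape(DCTAP_CSV)
print(shapes)
# {'Person': ['name', 'birthdate', 'worksFor'], 'Company': ['name', 'employee']}
```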
The DCTAP model that was just read can be converted to ShEx:
from pyrudof import ConversionMode, ResultConversionMode, ConversionFormat, ResultConversionFormat
shex_schema = rudof.convert_schemas(
    dctap_str,
    input_mode=ConversionMode.Dctap,
    output_mode=ResultConversionMode.ShEx,
    input_format=ConversionFormat.Csv,
    output_format=ResultConversionFormat.ShExC
)
print(shex_schema)
prefix base: <http://base/>
prefix dct: <http://purl.org/dc/terms/>
prefix ex: <http://example.org/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix schema: <http://schema.org/>
prefix sdo: <https://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
ex:Person { ex:name .; ex:birthdate . ?; ex:worksFor @ex:Company * }
ex:Company { ex:name .; ex:employee @ex:Person * }
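Comparing the CSV with the ShEx output, the Mandatory and Repeatable columns appear to determine the cardinality marker on each property: name (true, false) has no marker, birthdate (false, false) gets `?`, and worksFor (false, true) gets `*`. The sketch below spells out that mapping. The helper `cardinality` is hypothetical, and the `+` case for a mandatory, repeatable property follows the usual ShEx meaning of "one or more"; it is an assumption, since the example above does not exercise it.

```python
def cardinality(mandatory: bool, repeatable: bool) -> str:
    """Return the ShEx cardinality marker for a DCTAP row (illustrative only)."""
    if mandatory:
        # Exactly one is the ShEx default (no marker); "+" is assumed for one or more.
        return "+" if repeatable else ""
    # Optional properties: zero or more when repeatable, otherwise zero or one.
    return "*" if repeatable else "?"


# (propertyId, Mandatory, Repeatable) for the Person rows of the table above.
person_rows = [
    ("name", True, False),
    ("birthdate", False, False),
    ("worksFor", False, True),
]

markers = {prop: cardinality(m, r) for prop, m, r in person_rows}
print(markers)
# {'name': '', 'birthdate': '?', 'worksFor': '*'}
```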
Validating data with the ShEx generated from DCTAP
rudof.read_shex(shex_schema)
rudof.read_data("""
prefix : <http://example.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
:alice :name "Alice" ;
:birthdate "1970-01-01"^^xsd:date ;
:worksFor :acme .
:acme :name "ACME INC." .
:bob :name 23 .
""")
rudof.read_shapemap(":alice@ex:Person, :bob@ex:Person")
rudof.validate_shex()
print(rudof.serialize_shex_validation_results())
Results:
╭────────┬───────────┬────────┬──────────────────────────────────────────────────────────────────────────────────╮
│ Node │ Shape │ Status │ Details │
├────────┼───────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :alice │ ex:Person │ OK │ Shape passed. Node :alice, shape 0: ex:Person = {(ex:name . ; ex:birthdate .? ; │
│ │ │ │ ex:worksFor @1* ; )} │
│ │ │ │ │
├────────┼───────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :bob │ ex:Person │ OK │ Shape passed. Node :bob, shape 0: ex:Person = {(ex:name . ; ex:birthdate .? ; ex │
│ │ │ │ :worksFor @1* ; )} │
│ │ │ │ │
╰────────┴───────────┴────────┴──────────────────────────────────────────────────────────────────────────────────╯
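Note that :bob passes even though its :name is the integer 23. The generated ShEx above constrains ex:name with `.`, which accepts any value, so for this schema only the number of values per property matters. The following is a deliberately simplified sketch of such a cardinality-only check. It is not rudof's validation algorithm: the names `PERSON`, `ALICE`, `BOB`, and `conforms` are hypothetical, and shape references such as `@ex:Company` and any extra properties are ignored.

```python
# Cardinality bounds (min, max) read off the generated ex:Person shape;
# None means there is no upper bound.
PERSON = {"name": (1, 1), "birthdate": (0, 1), "worksFor": (0, None)}

# Values per property for the two nodes in the data above.
ALICE = {"name": ["Alice"], "birthdate": ["1970-01-01"], "worksFor": [":acme"]}
BOB = {"name": [23]}


def conforms(node, shape):
    """Check only how many values each property has, never what the values are."""
    for prop, (low, high) in shape.items():
        count = len(node.get(prop, []))
        if count < low or (high is not None and count > high):
            return False
    return True


print(conforms(ALICE, PERSON))  # True
print(conforms(BOB, PERSON))    # True: one name value is enough, whatever its type
print(conforms({}, PERSON))     # False: the mandatory name is missing
```

Under this reading, rejecting :bob would require a value constraint such as `xsd:string` on ex:name in the schema.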
Visualizing DCTAP content as a UML diagram
from pyrudof import ConversionMode, ResultConversionMode, ConversionFormat, ResultConversionFormat
plant_uml = rudof.convert_schemas(
    shex_schema,
    input_mode=ConversionMode.ShEx,
    output_mode=ResultConversionMode.Uml,
    input_format=ConversionFormat.ShExC,
    output_format=ResultConversionFormat.PlantUML
)
# Save the PlantUML source so it can be rendered to an image
with open('out.puml', 'w') as _f:
    _f.write(plant_uml)
# Render out.puml to out.png; the plantuml package sends the diagram to a PlantUML server over HTTP, so this step needs network access
!python -m plantuml out.puml
from IPython.display import Image
Image("out.png")