Introduction to DCTAP#

This document contains a short introduction to DCTAP using rudof.

!pip install pyrudof
!pip install ipython  # If not already installed
!pip install plantuml

from pyrudof import Rudof, RudofConfig, RudofError
from IPython.display import Image  # For displaying images

rudof = Rudof(RudofConfig())

What is DCTAP?#

DCTAP (Dublin Core Tabular Application Profiles) is a model for defining metadata application profiles in a tabular format.

A model can therefore be written as a CSV file and then converted to other schema languages such as ShEx or SHACL.

Converting DCTAP to ShEx#

Rudof supports DCTAP: it can read DCTAP files in CSV or Excel format and convert those models to other schema languages.

DCTAP represents shapes in tabular form, using CSV or a spreadsheet format like XLSX. As an example, the following CSV data defines two shapes, Person and Company:

dctap_str = """shapeId,propertyId,Mandatory,Repeatable,valueDatatype,valueShape
Person,name,true,false,xsd:string,
,birthdate,false,false,xsd:date,
,worksFor,false,true,,Company
Company,name,true,false,xsd:string,
,employee,false,true,,Person
"""
rudof.read_dctap(dctap_str)
print(rudof.serialize_dctap())
Shape(Person)  
 name xsd:string 
 birthdate xsd:date ?
 worksFor @Company *
Shape(Company)  
 name xsd:string 
 employee @Person *
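
To make the table layout concrete, the following is a minimal sketch that reads the same kind of CSV with Python's standard csv module only. It is not how rudof parses DCTAP; the helper names `cardinality` and `group_by_shape` are hypothetical, and the sketch assumes that a row with an empty shapeId belongs to the most recent shape and that a blank Mandatory or Repeatable cell counts as false.

```python
import csv
import io

# Illustrative table with the same columns as the example above.
SAMPLE = """shapeId,propertyId,Mandatory,Repeatable,valueDatatype,valueShape
Person,name,true,false,xsd:string,
,birthdate,false,false,xsd:date,
,worksFor,false,true,,Company
Company,name,true,false,xsd:string,
,employee,false,true,,Person
"""


def cardinality(mandatory: str, repeatable: str) -> str:
    """Map the Mandatory/Repeatable flags to a ShEx-style cardinality marker."""
    is_mandatory = mandatory.strip().lower() == "true"
    is_repeatable = repeatable.strip().lower() == "true"
    if is_mandatory and is_repeatable:
        return "+"  # one or more
    if is_mandatory:
        return ""   # exactly one
    if is_repeatable:
        return "*"  # zero or more
    return "?"      # zero or one


def group_by_shape(text: str) -> dict[str, list[str]]:
    """Group property rows under the shape they belong to (hypothetical helper)."""
    shapes: dict[str, list[str]] = {}
    current = None
    for raw in csv.DictReader(io.StringIO(text)):
        # Lower-case the column names so that "Mandatory" and "mandatory" both work.
        row = {k.strip().lower(): (v or "").strip() for k, v in raw.items() if k}
        if row.get("shapeid"):
            current = row["shapeid"]  # a non-empty shapeId starts a new shape
        if current is None:
            continue  # property row before any shape was named
        if row.get("valuedatatype"):
            value = row["valuedatatype"]
        elif row.get("valueshape"):
            value = "@" + row["valueshape"]  # reference to another shape
        else:
            value = "."  # no constraint on the value
        marker = cardinality(row.get("mandatory", ""), row.get("repeatable", ""))
        shapes.setdefault(current, []).append(
            f'{row.get("propertyid", "")} {value} {marker}'.strip()
        )
    return shapes


for shape, properties in group_by_shape(SAMPLE).items():
    print(shape, properties)
```

Under these assumptions the grouping and cardinality markers line up with the summary printed by rudof above.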

The DCTAP model can then be converted to ShEx:

from pyrudof import ConversionMode, ResultConversionMode, ConversionFormat, ResultConversionFormat

shex_schema = rudof.convert_schemas(
    dctap_str, 
    input_mode=ConversionMode.Dctap,
    output_mode=ResultConversionMode.ShEx,
    input_format=ConversionFormat.Csv,
    output_format=ResultConversionFormat.ShExC
)
print(shex_schema)
prefix base: <http://base/>
prefix dct: <http://purl.org/dc/terms/>
prefix ex: <http://example.org/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix schema: <http://schema.org/>
prefix sdo: <https://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
ex:Person { ex:name .; ex:birthdate . ?; ex:worksFor @ex:Company * }
ex:Company { ex:name .; ex:employee @ex:Person * }
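
The generated schema refers to shapes and properties through prefixed names such as ex:Person. As a rough illustration of what the prefix declarations mean, the sketch below expands prefixed names into full IRIs using plain Python. The `expand` function is a hypothetical helper, not part of pyrudof, and it only handles simple `prefix:local` names. The empty prefix is included because the RDF data used for validation in the next section declares it with the same namespace as ex:.

```python
# A subset of the prefix declarations from the schema above, plus the empty
# prefix as declared in the RDF data used for validation.
PREFIXES = {
    "ex": "http://example.org/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "": "http://example.org/",
}


def expand(pname: str, prefixes: dict[str, str]) -> str:
    """Expand a simple prefixed name into a full IRI (hypothetical helper)."""
    prefix, sep, local = pname.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {pname!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


print(expand("ex:Person", PREFIXES))
# ex:name in the schema and :name in the data denote the same property.
print(expand("ex:name", PREFIXES) == expand(":name", PREFIXES))
```

Because both prefixes point to the same namespace, the properties in the data match the ones named in the shapes.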

Validating data with the ShEx generated from DCTAP#

The generated schema can be loaded back into rudof to validate RDF data. Note that the generated shape constrains ex:name with the wildcard `.`, which accepts any value, so the xsd:string datatype from the DCTAP table is not checked: :bob, whose name is the integer 23, conforms to ex:Person just as :alice does.

rudof.read_shex(shex_schema)
rudof.read_data("""
prefix : <http://example.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
:alice :name "Alice" ;
       :birthdate "1970-01-01"^^xsd:date ;
       :worksFor :acme .
:acme  :name "ACME INC." .

:bob   :name 23 .
""")
rudof.read_shapemap(":alice@ex:Person, :bob@ex:Person")
rudof.validate_shex()
print(rudof.serialize_shex_validation_results())
Results:
╭────────┬───────────┬────────┬──────────────────────────────────────────────────────────────────────────────────╮
│ Node   │ Shape     │ Status │ Details                                                                          │
├────────┼───────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :alice │ ex:Person │ OK     │ Shape passed. Node :alice, shape 0: ex:Person = {(ex:name . ; ex:birthdate .? ;  │
│        │           │        │ ex:worksFor @1* ; )}                                                             │
│        │           │        │                                                                                  │
├────────┼───────────┼────────┼──────────────────────────────────────────────────────────────────────────────────┤
│ :bob   │ ex:Person │ OK     │ Shape passed. Node :bob, shape 0: ex:Person = {(ex:name . ; ex:birthdate .? ; ex │
│        │           │        │ :worksFor @1* ; )}                                                               │
│        │           │        │                                                                                  │
╰────────┴───────────┴────────┴──────────────────────────────────────────────────────────────────────────────────╯
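
The shape map passed to read_shapemap pairs each node with the shape it should be checked against, using the form node@shape with pairs separated by commas. The sketch below splits such a string into pairs with plain Python. It is only an illustration of the syntax: `parse_fixed_shapemap` is a hypothetical helper, not a pyrudof function, and it assumes simple prefixed names (full IRIs in angle brackets or literals containing commas or @ are not handled).

```python
def parse_fixed_shapemap(text: str) -> list[tuple[str, str]]:
    """Split 'node@shape, node@shape' into (node, shape) pairs."""
    pairs = []
    for association in text.split(","):
        association = association.strip()
        if not association:
            continue  # tolerate a trailing comma
        node, sep, shape = association.partition("@")
        node, shape = node.strip(), shape.strip()
        if not sep or not node or not shape:
            raise ValueError(f"expected node@shape, got {association!r}")
        pairs.append((node, shape))
    return pairs


for node, shape in parse_fixed_shapemap(":alice@ex:Person, :bob@ex:Person"):
    print(f"validate {node} against {shape}")
```

Each pair corresponds to one row of the results table above.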

Visualizing DCTAP content as a UML diagram#

from pyrudof import ConversionMode, ResultConversionMode, ConversionFormat, ResultConversionFormat

plant_uml = rudof.convert_schemas(
    shex_schema, 
    input_mode=ConversionMode.ShEx,
    output_mode=ResultConversionMode.Uml,
    input_format=ConversionFormat.ShExC,
    output_format=ResultConversionFormat.PlantUML
)

with open('out.puml', 'w') as _f:
    _f.write(plant_uml)
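# Render out.puml to out.png. The plantuml package sends the diagram to a
# PlantUML server over HTTP, so this step needs network access.
!pip install plantuml  # If not already installed
!python -m plantuml out.puml
from IPython.display import Image  # For displaying images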
Image("out.png")
_images/c64c0eceb61eb72ad922e5febf8c566cb43ae73bd21a5b82db63c43436cf001a.png