Tag Archives: semantic web
Using Jena and SHACL to validate RDF Data
RDF enables users to capture data in a way that is intuitive to them. This means that data is often captured without conforming to any schema. Nevertheless, it is often useful to know that an RDF dataset conforms to some (potentially partial) schema. This is where SHACL (Shapes Constraint Language), a W3C standard, comes into play. It is a language for describing and validating RDF graphs. In this post I will give a brief overview of how to use SHACL to validate RDF data using the Jena implementation of SHACL.
A SHACL Example
We will use an example from the SHACL specification. Assume we have a file person.ttl that contains the following data:

Example RDF data
To validate this data we create a shape definition in personShape.ttl containing:

Person shape definition
A Code Example using Jena
To validate our RDF data using our SHACL shape we will use the Jena implementation of SHACL. Start by adding the SHACL dependency to your Maven pom.xml. Note that you do not need to add Jena separately, since the SHACL pom already includes Jena.

SHACL Maven dependency
In the code we will assume the person.ttl and personShape.ttl files are in $Project/src/main/resources/. The code for doing the validation is then as follows:

Java code using Jena implementation of SHACL
Running the Code
Running the code will cause a report.ttl file to be written out to $Project/src/main/resources/. We can determine that our data does not conform by checking the sh:conforms property. We have 4 violations of our ex:PersonShape:
- For ex:Alice the ex:ssn property does not conform to the pattern defined in the shape.
- ex:Bob has 2 ex:ssn properties.
- ex:Calvin works for a company that is not of type ex:Company.
- ex:Calvin has a property ex:birthDate that is not allowed by ex:PersonShape since the shape is closed by sh:closed true.
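To make the first two violations more concrete, here is a minimal plain-Java sketch of what a pattern constraint and a maximum-count constraint on ex:ssn check. It does not use the Jena SHACL API. The class name, the regular expression, the limit of one value and the sample values are all assumptions made for illustration, since the data and shape listings are not reproduced above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Plain-Java illustration of two property constraints; this is NOT the Jena SHACL API.
class SsnConstraintSketch {

    // Assumed to mirror the shape's sh:pattern for ex:ssn (###-##-####).
    static final Pattern SSN_PATTERN = Pattern.compile("^\\d{3}-\\d{2}-\\d{4}$");

    // Assumed to mirror an sh:maxCount of 1 on ex:ssn.
    static final int MAX_COUNT = 1;

    /** Returns one message per violation found in the given ex:ssn values. */
    static List<String> check(Map<String, List<String>> ssnValuesByPerson) {
        List<String> violations = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : ssnValuesByPerson.entrySet()) {
            String person = entry.getKey();
            List<String> values = entry.getValue();
            // Cardinality is checked once per person (the focus node).
            if (values.size() > MAX_COUNT) {
                violations.add(person + ": more than " + MAX_COUNT + " ex:ssn value");
            }
            // The pattern is checked once per value.
            for (String value : values) {
                if (!SSN_PATTERN.matcher(value).find()) {
                    violations.add(person + ": ex:ssn \"" + value + "\" does not match the pattern");
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, chosen only to trigger both kinds of violation.
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("ex:Alice", List.of("987-65-432A"));
        data.put("ex:Bob", List.of("123-45-6789", "124-35-6789"));
        check(data).forEach(System.out::println);
    }
}
```

With these sample values the sketch reports one pattern violation for ex:Alice and one cardinality violation for ex:Bob. A SHACL engine reports the same kind of result per focus node in its validation report.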
A corrected version of our person data may look as follows:

Person data that conforms to our person shape
Conclusion
In this post I have given a brief overview of how SHACL can be used to validate RDF data using the SHACL implementation of Jena. This code example is available at shacl tutorial.
DBPedia Extraction Framework and Eclipse Quick Start
I recently tried to compile the DBPedia Extraction Framework. What was not immediately clear to me was whether I needed to have Scala installed. It turns out that having Scala installed natively is not necessary, seeing as the scala-maven-plugin is sufficient.
The steps to compile DBPedia Extraction Framework from the command line are:
- Ensure you have the JDK 1.8.x installed.
- Ensure Maven 3.x is installed.
- mvn package
Steps to compile DBPedia Extraction Framework from the Scala IDE (which can be downloaded from Scala-ide.org) are:
- Ensure you have the JDK 1.8.x installed.
- Ensure you have the Scala IDE installed.
- mvn eclipse:eclipse
- mvn package
- Import the existing Maven project into the Scala IDE.
- Run mvn clean install from within the IDE.
Associations between Classes
Thus far we have only considered UML classes where the attributes are primitive types rather than classes. Here we will consider UML classes that have classes as attributes. Assume we want to model projects. Assume a project must have one name and one sponsor, who must be a manager, and it must have a team of between 3 and 10 employees. In UML this can be stated using attributes (see Fig. 1(a)) or associations (see Fig. 1(b)). As a matter of interest, Wazlawick [1] suggests using attribute notation for data types and associations for classes. His motivation is that associations make dependencies between classes more apparent. I usually follow this guideline myself.

Fig. 1
The OWL representation for these 2 class diagrams is given in Fig. 2. The first thing to notice is that we use ObjectProperty instead of DataProperty to represent the sponsor attribute/association. The same holds for the team attribute/association. Our property definitions also now have Domain and Range restrictions. When we say that Susan is the sponsor for ABC, we can infer that Susan is a manager and ABC is a project. This information can be captured through Domain and Range restrictions. For the purpose of finding modeling errors it is preferable to add Domain and Range restrictions.

Fig. 2
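The effect of Domain and Range restrictions can be sketched in plain Java as follows. This is only an illustration of the inference, not an OWL reasoner. The class names are hypothetical, the direction of the sponsor property (from project to manager) and the domain of team are assumptions, and the individual Joe is invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration of domain/range inference; this is NOT an OWL reasoner.
class DomainRangeSketch {

    /** An object property with its declared domain and range classes. */
    static final class ObjectProperty {
        final String name;
        final String domain;
        final String range;

        ObjectProperty(String name, String domain, String range) {
            this.name = name;
            this.domain = domain;
            this.range = range;
        }
    }

    /** A property assertion of the form: subject --property--> object. */
    static final class Assertion {
        final String subject;
        final ObjectProperty property;
        final String object;

        Assertion(String subject, ObjectProperty property, String object) {
            this.subject = subject;
            this.property = property;
            this.object = object;
        }
    }

    /** Every subject gets the property's domain as a type, every object its range. */
    static Map<String, Set<String>> inferTypes(List<Assertion> assertions) {
        Map<String, Set<String>> types = new LinkedHashMap<>();
        for (Assertion a : assertions) {
            types.computeIfAbsent(a.subject, k -> new LinkedHashSet<>()).add(a.property.domain);
            types.computeIfAbsent(a.object, k -> new LinkedHashSet<>()).add(a.property.range);
        }
        return types;
    }

    public static void main(String[] args) {
        // Assumed direction: a Project has a sponsor that is a Manager.
        ObjectProperty sponsor = new ObjectProperty("sponsor", "Project", "Manager");
        // Assumed domain: a Project has a team of Employees.
        ObjectProperty team = new ObjectProperty("team", "Project", "Employee");

        List<Assertion> assertions = List.of(
                new Assertion("ABC", sponsor, "Susan"),
                new Assertion("ABC", team, "Joe")); // Joe is a hypothetical individual

        inferTypes(assertions).forEach((individual, classes) ->
                System.out.println(individual + " is inferred to be " + classes));
    }
}
```

This is also why the restrictions help with finding modeling errors: if an assertion were accidentally stated the wrong way around, Susan would be inferred to be a project, which a reasoner can flag as inconsistent when the Project and Manager classes are declared disjoint.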
To limit the number of employees on a team to between 3 and 10 employees we use the property cardinality restrictions team min 3 owl:Thing and team max 10 owl:Thing. It may seem strange that we use team max 10 owl:Thing rather than team max 10 Employee. Surely we want to restrict team members to employees? Well true, but that is achieved through our range restriction on the team object property. Here we are restricting our team to at most 10 individuals of whatever class, and the range restriction allows us to infer that each team member must be of type Employee.
References
1. R. S. Wazlawick, Object-oriented Analysis and Design for Information Systems: Modeling with UML, OCL and IFML, Morgan Kaufmann, 2014.