Henriette's Notes

Yearly Archives: 2017

DBPedia Extraction Framework and Eclipse Quick Start

I recently tried to compile the DBPedia Extraction Framework. What was not immediately clear to me was whether I needed to have Scala installed. It turns out that a native Scala installation is not necessary, since the scala-maven-plugin is sufficient.

The steps to compile DBPedia Extraction Framework from the command line are:

  1. Ensure you have the JDK 1.8.x installed.
  2. Ensure Maven 3.x is installed.
  3. mvn package

The steps to compile DBPedia Extraction Framework from the Scala IDE (which can be downloaded from scala-ide.org) are:

  1. Ensure you have the JDK 1.8.x installed.
  2. Ensure you have the Scala IDE installed.
  3. mvn eclipse:eclipse
  4. mvn package
  5. Import existing Maven project into Scala IDE.
  6. Run mvn clean install from within the IDE.

Associations between Classes

Thus far we have only considered UML classes whose attributes are primitive types rather than classes. Here we will consider UML classes that have classes as attributes. Assume we want to model projects. Assume a project must have exactly one name and one sponsor, who must be a manager, and a team of between 3 and 10 employees. In UML this can be stated using attributes (see Fig. 1(a)) or associations (see Fig. 1(b)). As a matter of interest, Wazlawick [1] suggests using attribute notation for data types and associations for classes. His motivation is that associations make dependencies between classes more apparent. I usually follow this guideline myself.

Fig. 1

The OWL representation for these 2 class diagrams is given in Fig. 2. The first thing to notice is that we use ObjectProperty instead of DataProperty to represent the sponsor attribute/association, and similarly for the team attribute/association. Our property definitions also now have Domain and Range restrictions. When we say that Susan is the sponsor for ABC, we can infer that Susan is a manager and ABC is a project. This information can be captured through Domain and Range restrictions. For the purpose of finding modeling errors, it is preferable to add Domain and Range restrictions.

Fig. 2: Association between classes in Manchester syntax
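The domain and range inference described above can be illustrated with a minimal Python sketch. This is not how an OWL reasoner is implemented; the PROPERTIES table and the infer_types function are hypothetical names introduced only for illustration, and the class names are taken from the example in the text.

```python
# Hypothetical lookup table: property name -> (domain class, range class).
# It mirrors the Domain and Range restrictions described in the text.
PROPERTIES = {
    "sponsor": ("Project", "Manager"),
    "team": ("Project", "Employee"),
}


def infer_types(triples):
    """Infer class membership from (subject, property, object) assertions.

    The subject of a property is inferred to belong to the property's domain,
    and the object is inferred to belong to the property's range.
    """
    inferred = {}
    for subject, prop, obj in triples:
        domain, range_ = PROPERTIES[prop]
        inferred.setdefault(subject, set()).add(domain)
        inferred.setdefault(obj, set()).add(range_)
    return inferred


# "Susan is the sponsor for ABC", written as: ABC sponsor Susan.
types = infer_types([("ABC", "sponsor", "Susan")])
print(types)
```

From the single assertion, ABC is inferred to be a Project and Susan a Manager, without either type being stated explicitly.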

To limit the number of employees on a team to between 3 and 10, we use the property cardinality restrictions team min 3 owl:Thing and team max 10 owl:Thing. It may seem strange that we use team max 10 owl:Thing rather than team max 10 Employee. Surely we want to restrict team members to employees? True, but that is achieved through the range restriction on the team object property. Here we restrict the team to at most 10 individuals of any class, and the range restriction lets us infer that each team member must be of type Employee.
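The separation between the cardinality restriction and the range restriction can be sketched in Python as two independent checks. This is only an analogy under stated assumptions: the dataclasses below are hypothetical, and the code performs closed-world validation in the UML sense, whereas an OWL reasoner works under the open-world assumption and infers types rather than rejecting data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Manager:
    name: str


@dataclass
class Employee:
    name: str


@dataclass
class Project:
    name: str
    sponsor: Manager
    team: List[Employee]

    def __post_init__(self):
        # Range-style checks: these constrain the type of the values.
        if not isinstance(self.sponsor, Manager):
            raise TypeError("sponsor must be a Manager")
        if not all(isinstance(member, Employee) for member in self.team):
            raise TypeError("team members must be Employees")
        # Cardinality check: this only counts members, whatever their type,
        # analogous to "team min 3 owl:Thing" and "team max 10 owl:Thing".
        if not 3 <= len(self.team) <= 10:
            raise ValueError("team must have between 3 and 10 members")


team = [Employee(f"employee-{i}") for i in range(3)]
project = Project("ABC", Manager("Susan"), team)
print(project.name, len(project.team))
```

Keeping the count check free of any type test is the design choice the text describes: the type of the team members is handled in one place only.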

References

1. R. S. Wazlawick, Object-oriented Analysis and Design for Information Systems: Modeling with UML, OCL and IFML, Morgan Kaufmann, 2014.

 

fast.ai: A Fresh take on Learning and Teaching ML

In this video Rachel Thomas provides an interesting take on learning ML: instead of promoting the typical bottom-up approach, fast.ai promotes a top-down approach. From a pedagogical perspective this seems counterintuitive. Surely you need to know the building blocks before you can move on to the theory that builds on them? Indeed, that is how traditional education proceeds. However, when consultants provide feedback to executives they tend to take a top-down approach. Why is that?

The main reason for taking a top-down approach when writing up or presenting technical findings is that you can provide a roadmap of where you are heading. When you then step into the details, the stakeholders already have a map and can see how the details relate to the bigger picture. This is precisely why I think the fast.ai approach to learning ML can be effective. Rachel Thomas provides further motivation for their approach in their video: How to Learn Deep Learning (when you’re not a computer science PhD).