Henriette's Notes

Add Some More Attributes

Person with added attributes

 

In this post I want to add some attributes to the Person class of the previous post. The important thing to understand is that each attribute you add to a class is in effect an additional constraint, which causes the set of objects that can be of that type to shrink. This is illustrated in the Venn diagram below. Note that our Person class is now a subset of the intersection of three sets: the objects with a name attribute, the objects with a surname attribute and the objects with an age attribute.

 

Person with more attributes subset

 

If we now consider the OWL 2 representation of this class in Manchester syntax, it matches our Venn diagram exactly. It states that name, surname and age are properties, and that individuals of the Person class have a name property of type xsd:string, a surname property of type xsd:string and an age property of type xsd:integer.

 

Add some attributes OWL
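The shrinking effect described above can be sketched in plain Java. This is only an illustration under stated assumptions: the domain objects (alice, acme, rex, invoice) and the helper objectsWith are made up for this example, and each object is reduced to the set of attribute names it carries.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class AttributeSets {
    // Hypothetical domain: each object is described only by the attributes it has.
    static final Map<String, Set<String>> DOMAIN = Map.of(
        "alice",   Set.of("name", "surname", "age"),
        "acme",    Set.of("name"),
        "rex",     Set.of("name", "age"),
        "invoice", Set.of("amount"));

    // Returns the objects that carry every one of the required attributes.
    static Set<String> objectsWith(String... required) {
        return DOMAIN.entrySet().stream()
            .filter(e -> e.getValue().containsAll(List.of(required)))
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Each additional required attribute can only keep or shrink the set.
        System.out.println(objectsWith("name").size());                   // 3
        System.out.println(objectsWith("name", "age").size());            // 2
        System.out.println(objectsWith("name", "surname", "age").size()); // 1
    }
}
```

Requiring all three attributes leaves only the objects in the intersection of the three sets, which is where the Person class lives in the Venn diagram.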

A Simple Class

Let us start with a simple example. Assume we have a Person class, which models a person that has a name. Let us think about what this means. If we list all the objects in our domain of interest, some of them will belong to a subset of the domain called the Person set, which is represented by our Person class. Our Person class also has a name attribute of type String, but other classes in our domain are likely to have a name attribute of type String as well. Thus, the Person class represents objects that are a subset of all the objects in the domain that have a name attribute of type String. This is shown in the Venn diagram below.

Person Subset

 

Note that the Person class is not necessarily a strict subset of the objects that have a name attribute of type String. It is possible that the Person class is the only class in our domain that has a name attribute of type String, in which case these two sets are in fact equal.

The OWL 2 equivalent representation in Manchester syntax is given in the image below. Note that for the name attribute in the UML class we have defined a related DataProperty. Furthermore, a Person class is defined as SubClassOf: name some xsd:string. This means that individuals that belong to the Person class also belong to the class of individuals that have a name property of type xsd:string. Thus, the Person class is a subclass of the class representing individuals that have a name property of type xsd:string.

Person Manchester
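A rough Java analogue of this subclass relation is sketched below. The Named interface and the Company class are assumptions introduced for illustration; they stand for "everything that has a name of type String" and "some other class in the domain that also has a name". Note that Java requires the relation to be declared, whereas in OWL membership of the class name some xsd:string follows from having a name value.

```java
// Hypothetical interface: the set of all objects with a name attribute of type String.
interface Named {
    String getName();
}

// Every Person is Named, mirroring "Person SubClassOf: name some xsd:string".
class Person implements Named {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

// Another class that has a name: it is Named, but it is not a Person.
class Company implements Named {
    private final String name;

    Company(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }
}

class SimpleClassDemo {
    public static void main(String[] args) {
        Named ann = new Person("Ann");
        Named acme = new Company("Acme");
        System.out.println(ann instanceof Person);  // true
        System.out.println(acme instanceof Person); // false
    }
}
```

If Company did not exist, Person and Named would describe the same set of objects, which is the case where the two sets are equal.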

The Correspondence between DLs/OWL and OO

The analogy between DLs and object-orientation can be observed when it is considered that the basic task in constructing an ontology is classification. Explicit subsumption relationships between concepts can be defined in the TBox. In object-orientation this can be achieved by definition of an inheritance hierarchy between classes. Classification is further solidified as the basis of DLs in that the core reasoning capabilities they provide are subsumption and instance checking. Subsumption computes a subsumption hierarchy, which essentially categorizes concepts into superconcept/subconcept relationships. Instance checking verifies whether a given individual is an instance of a specific concept [1].

In object-orientation the domain of interest is described in terms of classes that have properties, which are defined via attributes and/or associations. Classes in essence have a set-theoretic semantics, i.e. a class represents a set of objects in the domain of interest that share attributes. Objects that are classified by a class are called instances of the class. The analogy with DLs is that classes, attributes/associations and instances (sometimes called objects) correspond respectively to concepts, roles and individuals in DLs, which in OWL correspond respectively to classes, properties and individuals.

This correspondence between object orientation, DLs and OWL 2 is summarized in the table below.

Object orientation       DLs          OWL 2
Class                    Concept      Class
Attribute/association    Role         Property
Object                   Individual   Individual
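On the object-oriented side, the two reasoning tasks mentioned above have rough counterparts in Java reflection. The sketch below is only an analogy: the Student subclass is hypothetical, and Java merely looks up the declared inheritance hierarchy, whereas a DL reasoner infers subsumptions from the axioms.

```java
class Person {}

// Hypothetical subclass, introduced only for this illustration.
class Student extends Person {}

class ClassificationDemo {
    // Analogue of subsumption checking: is every instance of 'specific' also an instance of 'general'?
    static boolean subsumes(Class<?> general, Class<?> specific) {
        return general.isAssignableFrom(specific);
    }

    // Analogue of instance checking: is the individual an instance of the concept?
    static boolean isInstance(Class<?> concept, Object individual) {
        return concept.isInstance(individual);
    }

    public static void main(String[] args) {
        Object ann = new Student();
        System.out.println(subsumes(Person.class, Student.class)); // true
        System.out.println(subsumes(Student.class, Person.class)); // false
        System.out.println(isInstance(Person.class, ann));         // true
    }
}
```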

Bibliography

[1] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi and P. F. Patel-Schneider, The Description Logic Handbook: Theory, Implementation and Applications, Cambridge University Press, 2007.

Theoretical Basis of Object-oriented Analysis

Classification is the core activity of object-oriented analysis. Classification is the means by which people order knowledge according to the similarities they recognize between different objects they observe in the world. The specific classification approach that is applied when doing object-oriented analysis is called classical categorization. Classification, and classical categorization in particular, does not pertain to object-orientation alone; rather, it reflects how people think about the world in general [1] [2] [3]. Olive explains the need for and use of classification as follows [2]:

Classification provides cognitive economy because it allows us to structure knowledge about objects into two levels: concept and instance. At the concept level, we find the properties (both defining and nondefining) common to all instances of the concept. At the instance level, we find only the concept of which the object is an instance, and the particular properties of that instance. In the absence of classification, we would have to associate every instance with all of its properties. Classification reduces the amount of information we have to remember, communicate, and process; the extent to which it is reduced depends on the number of properties of the concept.

It is precisely this cognitive economy, provided by a complete and consistent object-oriented conceptual schema, that is the essence of enabling efficient communication between stakeholders of software development projects. When software development projects decide to forgo the creation of conceptual schemas, it is at the cost of efficient communication.

 

Notice that classification is inherently set theoretic. A set has a characteristic function that determines, from a possible universe of elements, which elements belong to the set. With classification, commonalities between instances are recognized, which causes us to classify these instances as belonging to the same concept. In this way concepts and sets are equivalent.
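The idea of a characteristic function can be sketched in Java with a Predicate. The names below (extension, the universe of integers and the "even" concept) are assumptions chosen to keep the example small.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class CharacteristicFunction {
    // Returns the elements of the universe for which the characteristic function holds.
    static <T> List<T> extension(List<T> universe, Predicate<T> isMember) {
        return universe.stream().filter(isMember).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> universe = List.of(1, 2, 3, 4, 5, 6);
        // The concept "even number" is fully determined by its characteristic function.
        Predicate<Integer> isEven = n -> n % 2 == 0;
        System.out.println(extension(universe, isEven)); // [2, 4, 6]
    }
}
```

The predicate plays the role of the concept, and the returned list is the set of instances it classifies.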

 

Bibliography

[1] G. Booch, R. A. Maksimchuk, M. W. Engel, B. J. Young, J. Conallen and K. A. Houston, Object-oriented analysis and design with applications, Addison-Wesley Professional, 2007.

[2] A. Olive, Conceptual modeling of information systems, Springer, 2007.

[3] G. Lakoff, Women, fire and dangerous things: what categories reveal about the mind, Chicago: University of Chicago Press, 1990.

What are Description Logics?

Description logics (DLs) are syntactic variants of fragments of first-order logic that are specifically designed for the conceptual representation of an application domain in terms of concepts and relationships between concepts [1].

 
Expressions in DLs are constructed from atomic concepts (unary predicates), atomic roles (binary predicates) and individuals (constants). Complex expressions can be built inductively from these atomic elements using concept constructors. Formally a concept represents a set of individuals and a role a binary relation between individuals [2].

 

Formally every DL ontology consists of a set of axioms that are based on finite sets of concepts, roles and individuals. Axioms in a DL ontology are divided into the TBox, the RBox and the ABox. A TBox is used to define concepts and relationships between concepts (that is the terminology or taxonomy) and an ABox is used to assert knowledge regarding the domain of interest (i.e. that an individual is a member of a concept). Depending on the expressivity of the DL used, an ontology may include an RBox. An RBox is used to define relations between roles as well as properties of roles [2].

 

A feature of DLs is that they have decidable reasoning procedures for standard reasoning tasks. This means these reasoning procedures are guaranteed to terminate with an answer, unlike reasoning procedures for undecidable logics, which may not terminate and thus may not give an answer. A fundamental goal of DL research is to preserve decidability, to the point that decidability is considered a precondition for claiming that a formalism is a DL. Standard DL reasoning algorithms are sound and complete and, even though the worst-case computational complexity of these algorithms is ExpTime or worse, they are well-behaved in practical applications [3].

 

Standard reasoning procedures for DLs are the following [2].

  • Satisfiability checking checks that every concept in an ontology can be instantiated. Concepts that cannot be instantiated indicate that modelling errors exist within the ontology.
  • Consistency checking checks whether there are axioms that contradict each other, which again is indicative of modelling errors.
  • Subsumption checking checks whether one concept subsumes another concept, which is used for classifying concepts into a parent-child taxonomy.

 

Various DLs exist with different levels of expressivity and computational complexity. The most widely supported DL is SROIQ(D), which forms the mathematical basis of the W3C OWL 2 standard [4]. In OWL, concepts are referred to as classes, roles are referred to as properties and individuals are still referred to as individuals.

 

In subsequent posts I will provide an intuitive understanding of OWL 2 and explain some of its uses. If you are using OWL or other semantic technologies, I would love to hear from you. Please leave a comment and feel free to explain the novel ways in which you use semantic technologies.

 

Bibliography

[1] D. Berardi, D. Calvanese and G. De Giacomo, “Reasoning on UML class diagrams,” Artificial Intelligence, vol. 168, no. 1-2, pp. 70–118, 2005.

[2] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi and P. F. Patel-Schneider, The Description Logic Handbook: Theory, Implementation and Applications, Cambridge University Press, 2007.

[3] F. Baader, “What’s new in Description Logics,” Informatik-Spektrum, vol. 34, no. 5, pp. 434–442, 2011.

[4] W3C, “OWL 2 Web Ontology Language – Document Overview (Second Edition),” W3C, 11 December 2012. [Online]. Available: https://www.w3.org/TR/owl2-overview/. [Accessed 9 September 2017].

The Rectangle/Square Controversy

A software design problem that has often been discussed in the literature is whether a square should inherit from a rectangle or whether a rectangle should inherit from a square. Two opposing arguments are made by Meyer [1] and Martin et al. [2], both of which seem to have some merit. How is a software developer to decide which is the correct design? In this post I will discuss why I believe both approaches are wrong and I will provide the solution I prefer.

Meyer [1] proposes the design where square inherits from rectangle, with square imposing the invariant that the width must equal the height. The conceptual design is shown in Figure 1. This design makes sense since a square “is-a” rectangle. Remember that “is-a” is the litmus test for inheritance [3]. To state this more precisely: inheritance defines a subset relation between a child and a parent class. Thus, the Square class represents the set of squares, which is a subset of the set of rectangles, which is represented by the Rectangle class. This is shown in Figure 2. For a detailed discussion on the mathematical semantics of inheritance, see Meyer [1]. The implementation is shown in Figure 3.

Figure 1: Square extends Rectangle

Figure 2: The set of squares is a subset of the set of rectangles

Figure 3: The implementation of a Square extending a Rectangle
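Since the figure is an image, here is a minimal Java sketch of what such an implementation might look like. It is an assumption, not a reproduction of the figure: the field names, the area method and the invariantHolds check are all chosen for illustration.

```java
class Rectangle {
    protected int width;
    protected int height;

    Rectangle(int width, int height) {
        this.width = width;
        this.height = height;
    }

    void setWidth(int width) {
        this.width = width;
    }

    void setHeight(int height) {
        this.height = height;
    }

    int area() {
        return width * height;
    }
}

// A Square "is-a" Rectangle whose width must equal its height.
class Square extends Rectangle {
    Square(int side) {
        super(side, side);
    }

    // The invariant that Square adds on top of Rectangle.
    boolean invariantHolds() {
        return width == height;
    }
}

class SquareExtendsRectangleDemo {
    public static void main(String[] args) {
        Square square = new Square(4);
        System.out.println(square.invariantHolds()); // true
        // The inherited setter lets a client break the invariant.
        square.setWidth(5);
        System.out.println(square.invariantHolds()); // false
    }
}
```

In this sketch nothing stops a client from calling an inherited setter with a value that breaks the invariant, so keeping width and height equal is left to the caller.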

Martin’s criticism of the design of Figure 1 is that it violates the Liskov substitution principle, which states that subtypes must be substitutable for their base types [2]. Clients of instances of the Square class cannot set the width and height of a square as if they are using a rectangle. In particular, they need to ensure that the width and height of instances of Square remain equal. The main problem is that a square does not require both a width and a height, and it therefore needs to ignore one of the two attributes.

An alternative design that Martin discusses is where the setWidth (respectively setHeight) method is changed to set the height equal to the width (respectively to set the width equal to the height). However, the problem is that a call to setWidth (respectively setHeight) overwrites the value of the height (respectively width) (see Figure 4). Hence, a client that first sets the width to 5 and then the height to 3 will expect the perimeter to be 2 × (5 + 3) = 16, but instead it will be 2 × (3 + 3) = 12. This again violates the Liskov substitution principle.

Figure 4: Forcing the width and height to be equal
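The surprise for the client can be reproduced with the following Java sketch. It assumes a perimeter method and a helper resize that only knows about Rectangle; both names are made up for this example and are not taken from the figure.

```java
class Rectangle {
    protected int width;
    protected int height;

    void setWidth(int width) {
        this.width = width;
    }

    void setHeight(int height) {
        this.height = height;
    }

    int perimeter() {
        return 2 * (width + height);
    }
}

class Square extends Rectangle {
    // Setting one side silently overwrites the other to keep them equal.
    @Override
    void setWidth(int width) {
        this.width = width;
        this.height = width;
    }

    @Override
    void setHeight(int height) {
        this.width = height;
        this.height = height;
    }
}

class SubstitutionDemo {
    // Client code written against Rectangle: it expects a 5 by 3 shape afterwards.
    static int resize(Rectangle rectangle) {
        rectangle.setWidth(5);
        rectangle.setHeight(3);
        return rectangle.perimeter();
    }

    public static void main(String[] args) {
        System.out.println(resize(new Rectangle())); // 16
        System.out.println(resize(new Square()));    // 12
    }
}
```

The same client code gives a different result depending on which subtype it receives, which is exactly what the Liskov substitution principle forbids.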

The solution that Martin [2] proposes is that the Rectangle class must inherit from the Square class (see Figure 5). The objection that Meyer has against this design is that it violates the “is-a” relation in that a rectangle is not a square. Now you may reason that a rectangle is indeed a square when its width and height are equal, but the crucial point is that a rectangle is not always a square. Remember, inheritance enforces a subset relation between a child and a parent. That is, all instances of the child are necessarily instances of the parent, but not all instances of the parent are necessarily instances of the child.

The flaw in the solution that Martin proposes is that he uses inheritance for the sole purpose of reuse without there being an “is-a” relation. Meyer calls this “convenience inheritance” which should be avoided.

Figure 5: Rectangle extends Square

From a conceptual perspective a way around these objections is to introduce a class (i.e. Quadrilateral) from which both the Rectangle and Square classes can inherit. The related UML class diagram is shown in Figure 6. The {incomplete, overlapping} annotation indicates that other quadrilaterals may exist and that rectangles may occasionally be squares (when width=height).

Figure 6: Conceptual solution for Square and Rectangle problem

The implementation of this conceptual solution can be done where Quadrilateral is either an interface or an abstract class. Figure 7 shows the implementation where Quadrilateral is defined as an interface.

Figure 7: A correct implementation for the Square and Rectangle classes
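As the figure is an image, the following is a minimal Java sketch of the interface-based variant. The operations on Quadrilateral (perimeter and area) and the setSide method are assumptions made for illustration.

```java
import java.util.List;

// Assumed common contract: only operations that make sense for both shapes.
interface Quadrilateral {
    int perimeter();

    int area();
}

class Rectangle implements Quadrilateral {
    private int width;
    private int height;

    Rectangle(int width, int height) {
        this.width = width;
        this.height = height;
    }

    void setWidth(int width) {
        this.width = width;
    }

    void setHeight(int height) {
        this.height = height;
    }

    @Override
    public int perimeter() {
        return 2 * (width + height);
    }

    @Override
    public int area() {
        return width * height;
    }
}

// A Square has a single side, so there is no width or height to keep in sync.
class Square implements Quadrilateral {
    private int side;

    Square(int side) {
        this.side = side;
    }

    void setSide(int side) {
        this.side = side;
    }

    @Override
    public int perimeter() {
        return 4 * side;
    }

    @Override
    public int area() {
        return side * side;
    }
}

class QuadrilateralDemo {
    public static void main(String[] args) {
        List<Quadrilateral> shapes = List.of(new Rectangle(5, 3), new Square(3));
        for (Quadrilateral shape : shapes) {
            System.out.println(shape.perimeter()); // 16, then 12
        }
    }
}
```

Because neither class inherits setters from the other, a client holding a Rectangle can never be handed a Square that changes the meaning of setWidth or setHeight.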

Naturally I am not the first to point this solution out. [5] gives a general discussion of the problem and [4] provides a solution similar to mine.

Bibliography

[1] B. Meyer, Object-oriented software construction, Prentice Hall, 1997.

[2] R. C. Martin and M. Martin, Agile principles, patterns, and practices in C#, Prentice Hall, 2006.

[3] G. Booch, R. A. Maksimchuk, M. W. Engel, B. J. Young, J. Conallen and K. A. Houston, Object-oriented analysis and design with applications, Addison-Wesley Professional, 2007.

[4] R. Carr, “Is a Square a Rectangle?,” [Online]. Available: http://www.blackwasp.co.uk/SquareRectangle.aspx.

[5] H. Makabee, “When a square is not a rectangle,” [Online]. Available: http://effectivesoftwaredesign.com/2010/09/20/when-a-square-is-not-a-rectangle/.

Software Entropy: The Case for Organizational Change

In general, enterprise software consists of large numbers of systems with complex cross-dependencies and a high level of heterogeneity and redundancy. The ability of these software systems to address the needs of the enterprise deteriorates over time. In order to understand the forces at work I will first consider entropy at the system level and then at the enterprise level.

System Entropy

Initially, after a software system is delivered, making changes to the system is usually relatively easy. The longer the system is in maintenance and the more changes are incorporated into the code base, the more the ability to make changes to the software system deteriorates [1] [2]. There are various reasons for the deterioration of software systems. Some of the reasons are due to technology, but mostly the reasons are organizational in nature [1].

From a technology perspective various best practices exist to counter entropy of software systems. For legacy technologies, roadmaps exist to incorporate the latest systems design thinking to address shortcomings of earlier technologies. As an example, IBM provides various strategies to improve the maintainability of RPG programs [3] [4].

At a system level the single most significant contributor to the decay of the code base is the lack of routine refactoring to ensure the long term maintainability of the system [1]. These refactoring exercises can for instance be geared towards ensuring the soundness of the design in the face of changing business needs or taking advantage of technology improvements. The lack of timely refactoring may be due to factors such as lack of skilled staff or time-to-market demands.

As such, technology is not entirely to blame for the entropy of software systems. Rather, organizational level challenges are the most important contributors to the decay of software systems. Therefore, in order to address software system entropy, change firstly needs to be effected at the organizational level and subsequently at the technology level if appropriate. For example, if the organizational challenges are not addressed, replacing the existing RPG programs with the latest technology will in time inevitably lead to entropy in the new technology as well.

Further factors that contribute to the decay of software systems originate at an enterprise level.

Enterprise Software Entropy

The level of entropy found at the enterprise level is significantly higher than that of a single isolated software system. Typically the enterprise architecture is an accidental architecture. That is, the enterprise architecture came about organically due to the independent evolution of the various systems and their cross-dependencies rather than being planned. Since the various systems have been designed and developed in isolation, redundancies between systems are rife. Many enterprise architectures have grown over many years, which results in a plethora of programming languages (often based on disparate programming paradigms) used across systems. With such a lengthy history these systems are frequently hosted on different kinds of servers running a variety of operating systems.

A characteristic of an accidental architecture is the occurrence of stovepipes (or silos). Stovepipes are systems that have been developed to fulfill a particular task without regard for the context of the task within the larger enterprise business process [2]. Since these systems have not been designed to interoperate [5], this approach sabotages efficiency at an enterprise level [6]. The most prevalent approach to alleviating the effect of silos in the enterprise is through different manifestations of point-to-point integrations. This results in the intertwined enterprise architecture as seen in Figure 1.

Figure 1: Typical Intertwined Enterprise Architecture

The effects of stovepipes are usually acutely noticeable between business units, while the effects within a business unit are often largely hidden from the enterprise. Understandably, this leaves business and management with only a vague idea of the true complexity of integration projects. This contributes to integration projects invariably being hugely underestimated in terms of cost and timelines [7]. Given this background it is no wonder that an estimated 70% of Enterprise Application Integration (EAI) projects fail [8]. The problems around integration are further exacerbated when e-Business and its associated need for business-to-business communication is considered.

If we want to address the problems of enterprise software effectively, we have to understand the forces contributing to entropy. These factors can be categorized into day-to-day operations and specific events [1].

The daily maintenance processes are geared towards the continuous delivery of economic value in the face of ever changing business needs. Enterprise systems generate economic value through value networks. These value networks frequently span multiple applications [9] since most applications or off-the-shelf solutions are not solving cross-functional or inter-enterprise needs [10] [11] [12]. In order to support the value network these applications need to be integrated [12] [11]. Day-to-day operations are typically project and short term focussed instead of being enterprise and long term focussed. This explains the proliferation of point-to-point integrations found in many of today’s enterprises.

Besides day-to-day operations, specific events contribute to the decay of enterprise systems [1]. These events include but are not limited to the following:

  • The maturation of organizations causes new specialized groups to be included in the Software Development Life Cycle (SDLC) [5].
  • End-of-life events of a supported product [1] may cause skills to be available only at a premium.
  • An enterprise may expand into new territories as a result of offshoring or globalization [5].
  • Mergers or acquisitions can introduce new systems and technologies into the organization. This is likely to result in redundancies at various levels [1] [5] [9].
  • New lines of business may be introduced to take advantage of new business opportunities [9]. Often business units have their own IT departments with opposing agendas causing infighting between business units [5].
  • Obsolescence or limitations of legacy technologies may drive technology modernization exercises [9]. Legacy applications further pose challenges in managing a retiring workforce and an ever decreasing skills base [5].
  • Government legislation may change the way business is conducted [9].

It is significant to note again that the role technology plays in driving entropy at the enterprise level is negligible. Thus, in order to find solutions to enterprise software entropy we cannot consider technical change alone; change must be effected at the organizational level as well.

Conclusion

The traditional solution to the decay of software is to start from a clean slate and rewrite the application using a new technology. Often technology is blamed for the ills of the software industry, but technology is merely a tool at the disposal of the enterprise. The enterprise chose to use the technology because it has a business value. When a newer technology comes along it brings new opportunities for increased efficiencies – it does not nullify the value of the older technology. Therefore a rewrite should not be considered merely on technical grounds (for example the availability of a new technology), but should have a clear business case.

Continuous refactoring can be used to modify an application to take advantage of the latest design principles and stay relevant in the face of changing business needs. However, the prevalence of short term planning in organizations often causes the continuous refactoring step to be skipped altogether. In the long term, omitting the refactoring step leads to code that becomes too costly to refactor, hence the need for a rewrite.

Bibliography

[1] D. Krafzig, K. Banke and D. Slama, “Enterprise SOA: Service-Oriented Architecture Best Practices,” Amazon.com, 2004. [Online]. Available: http://www.amazon.com/Enterprise-SOA-Service-Oriented-Architecture-Practices/dp/0131465759/.
[2] M. Juric, P. Sarang and B. Mathew, “Business Process Execution Language for Web Services BPEL and BPEL4WS 2nd Edition,” Amazon.com, 2006. [Online].
[3] L. Patterson, H. Araki, S. Bramley, G. Cobb, J. Eikenhorst, S. Milligan, J. Simons and M. Tregear, “Modernizing and Improving the Maintainability of RPG Applications Using X-Analysis Version 5.6,” http://www.redbooks.ibm.com/, 2006. [Online]. Available: http://www.redbooks.ibm.com/redpapers/pdfs/redp4046.pdf.
[4] D. Cruikshank, “Modernizing Database Access: The Madness Behind the Methods,” IBM, 2006. [Online]. Available: http://www-03.ibm.com/systems/resources/systems_i_software_db2_pdf_Performance_DDS_SQL.pdf.
[5] M. Matsumura, B. Brauel and J. Shah, “SOA Adoption for Dummies,” Software AG, 2009. [Online]. Available: http://zy.xjgame.com/SOBPAO/SOA%20Adoption%20for%20Dummies.pdf.
[6] J. W. Ross, P. Weill and D. C. Robertson, “Enterprise Architecture As Strategy: Creating a Foundation for Business Execution,” Amazon.com, 2006. [Online]. Available: http://www.amazon.com/Enterprise-Architecture-Strategy-Foundation-Execution/dp/1591398398.
[7] R. Schmelzer, “Understanding the Real Costs of Integration,” zapThink, 2002. [Online]. Available: http://www.zapthink.com/2002/10/23/understanding-the-real-costs-of-integration/.
[8] G. Trotta, “Business Process Management (BPM) Best Practices: Dancing Around EAI ‘Bear Traps’,” ebizQ: The Insider’s Guide to Next-Generation BPM, 2003. [Online]. Available: http://www.ebizq.net/topics/int_sbp/features/3463.html.
[9] F. A. Cummins, “Building the Agile Enterprise: With SOA, BPM and MBM,” Amazon.com, 2008. [Online]. Available: http://www.amazon.com/Building-Agile-Enterprise-SOA-Press/dp/0123744458.
[10] F. Kuglin and R. Hood, “Using Technology to Transform the Value Chain,” Amazon.com, 2008. [Online]. Available: http://www.amazon.com/Using-Technology-Transform-Value-Chain/dp/1420047590.
[11] M. Weske, “Business Process Management: Concepts, Languages, Architectures,” Amazon.com, 2012. [Online]. Available: http://www.amazon.com/Business-Process-Management-Languages-Architectures/dp/3642286151.
[12] C. C. Poirier, L. Ferrara, F. Hayden and D. Neal, “The Networked Supply Chain: Applying Breakthrough BPM Technology to Meet Relentless Customer Demands,” Amazon.com, 2003. [Online]. Available: http://www.amazon.com/The-Networked-Supply-Chain-Breakthrough/dp/1932159088.