Chapter 5. Information Modeling

Table of Contents

1. Classes with Properties and Methods
2. Connecting Classes with Associations
3. From a Conceptual Model via a Design Model to Class Models
4. Excursion: Formalizing Information Models with RDF and OWL
4.1. RDF vocabularies
4.2. RDF fact statements
4.3. Expressing structured data in web documents
4.4. OWL vocabularies and constraints
4.5. Usability issues of RDF and OWL
5. Summary
6. Exercises

UML class diagrams provide a visual syntax for expressing UML class models, which allow defining information and data models. They can be used both at the more abstract level of conceptual modeling for requirements engineering and at the more detailed level of design modeling for designing the model classes of an app. Their main building blocks are class rectangles and association lines.

A class rectangle has one, two or three compartments, containing the name of the class, its properties, and its methods. The purpose of a class is to classify objects and to define their properties and the methods that can be invoked on them.

An association line connects two class rectangles. The purpose of an association is to classify relationships (links) between objects. While UML classes have a direct counterpart in the class concepts of object-oriented programming (OOP) languages, UML associations do not, which often makes them harder for developers to understand. Only in the special case of a unidirectional functional association is there a direct OOP counterpart: a reference property for referencing the objects that are linked to a given object by the association.
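The special case of a reference property can be sketched in JavaScript as follows. This is only a minimal illustration: the Publisher class, the publisher property and the sample values are hypothetical, and the sketch assumes that each book is linked to at most one publisher.

```javascript
// Hypothetical classes and made-up data, for illustration only
class Publisher {
  constructor(name) {
    this.name = name;  // string
  }
}
class Book {
  constructor(isbn, title, publisher) {
    this.isbn = isbn;    // string
    this.title = title;  // string
    // reference property: holds the Publisher object (if any)
    // that is linked to this book by the association
    this.publisher = publisher || null;
  }
}
const pub = new Publisher("Example Press");
const book = new Book("0123456789", "Example Title", pub);
console.log(book.publisher.name);  // prints "Example Press"
```

The association is unidirectional in this sketch because only Book holds a reference: a Publisher object has no property pointing back to its books.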

From a logical point of view, a class model defines a vocabulary, or language, for expressing various types of fact statements about objects. The knowledge representation languages RDF and OWL allow formalizing the vocabularies defined by class models, as well as the fact statements made when instantiating their classes by creating objects. In this way, they help in understanding the semantics of information models.

1. Classes with Properties and Methods

In a UML class diagram, a class has a name (shown in the first compartment of the class rectangle), and it may have properties (shown in the second compartment) and methods (shown in the third compartment). Properties and methods may be described with or without details. The following diagrams illustrate these options using the example of a class books or Book for describing books as information objects.

A class can be expressed in UML by just providing its name, without any further detail, like so:

This option is useful for making sketches and overview diagrams. Using an ordinary English plural name like books makes the class diagram more readable for non-tech-savvy people.

A more informative description of a class is obtained by listing its properties, possibly without any further detail, like in the following example:

However, to better understand the meaning of properties, and to be able to code a class in an OO programming language, we need to know the range of each property, which is the type of its values. The range of a property can be either a primitive datatype or another class.
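Since JavaScript does not declare property ranges, one way to make them tangible is a runtime check. The following is a minimal sketch under stated assumptions: the helper isInRange, the Author class and the sample values are hypothetical, and only the two datatype names "String" and "Integer" are handled.

```javascript
// Hypothetical helper: tests whether a value belongs to a range,
// given either as a datatype name or as a class (constructor)
function isInRange(value, range) {
  switch (range) {
    case "String": return typeof value === "string";
    case "Integer": return Number.isInteger(value);
    default:
      // otherwise the range is assumed to be a class
      return typeof range === "function" && value instanceof range;
  }
}
class Author {  // hypothetical class serving as a property range
  constructor(name) { this.name = name; }
}
console.log(isInRange("0123456789", "String"));  // true
console.log(isInRange(2024, "Integer"));  // true
console.log(isInRange(20.5, "Integer"));  // false
console.log(isInRange(new Author("A. Writer"), Author));  // true
console.log(isInRange("A. Writer", Author));  // false
```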

In the following diagram we use general implementation-agnostic datatype names (like “Integer”), for which a specific programming language may have specific names (like “int” in Java). Notice that we now use a common OOP naming convention of giving classes a capitalized singular (mixed-case) name like Book (or LearningUnit). This allows saying that “an instance of a class C is a C (object)”, like “an instance of Book is a book (object)”.

Notice how the standard identifier attribute isbn is marked with the keyword id appended to the property declaration in curly braces. This is the UML syntax for defining several kinds of property constraints discussed in the next chapter.

Finally, we can also define the methods and functions of a class in a third compartment, like so:

In this example, the Book class has a function checkISBN, which returns a string. Given a class diagram in this form, it is straightforward to code it in an OO programming language like JavaScript or Java.

Recall that in JavaScript a class is defined in the form of a constructor function that assigns the values of its parameters to the properties of the newly created object, like so:

// **JavaScript code**
var Book = function (i, t, y) {
  this.isbn = i;  // string
  this.title = t; // string
  this.year = y;  // number (integer)
}

In JavaScript, the (instance-level) methods of a class are defined as method slots of the constructor’s built-in prototype object. This is how we code the checkISBN method:

// **JavaScript code**
Book.prototype.checkISBN = function () {
  // regular expression pattern matching test
  if (!/^\d{9}(\d|X)$/.test( this.isbn)) {
    return "The ISBN must be a 10-digit string or " +
        "a 9-digit string followed by 'X'!";
  } else return "";
}

If we don’t have to care about older web browsers, such as Internet Explorer 9, we can also use the class definition syntax introduced in ES6 (ECMAScript 2015) and combine the definition of properties and methods in one piece of code:

class Book  {   // **JavaScript (ES6) code**
  constructor( i, t, y) { 
    this.isbn = i;  // string 
    this.title = t; // string 
    this.year = y; // number (integer) 
  }
  // instance-level methods
  checkISBN() { 
    // regular expression pattern matching test 
    if (!/^\d{9}(\d|X)$/.test( this.isbn)) {
      return "The ISBN must be a 10-digit string or " + 
          "a 9-digit string followed by 'X'!"; 
    } else return "";
  } 
}
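As a usage illustration, the following sketch repeats the class definition so that it can be run on its own, then creates two Book objects and invokes checkISBN on them. The sample ISBNs, titles and years are made up. The pattern is anchored with ^ and $ so that the whole string must match.

```javascript
class Book {
  constructor(i, t, y) {
    this.isbn = i;   // string
    this.title = t;  // string
    this.year = y;   // number (integer)
  }
  checkISBN() {
    // anchored pattern: the whole string must match
    if (!/^\d{9}(\d|X)$/.test(this.isbn)) {
      return "The ISBN must be a 10-digit string or " +
          "a 9-digit string followed by 'X'!";
    } else return "";
  }
}
// made-up sample data
const b1 = new Book("0123456789", "Example Title", 2020);
const b2 = new Book("12345", "Another Title", 2021);
console.log(b1.checkISBN() === "");  // true: no error message
console.log(b2.checkISBN());  // logs the error message
console.log(b1 instanceof Book);  // true
```

An empty string as the return value signals that the ISBN passed the check, while a non-empty string carries the error message.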

Unlike JavaScript, Java has always had a dedicated language element, the class keyword, for defining classes:

public class Book {   // **Java code**
  private String isbn;
  private String title;
  private int year;

  // Constructor
  public Book( String i, String t, int y) {
    this.isbn = i;
    this.title = t; 
    this.year = y;
  }
  // instance-level methods
  public String checkISBN() {
    ... 
  } 
}

We need to be aware of the ambiguity of the term “object”. We have to distinguish between objects in the sense of real-world objects (also called “business objects” or “entities”) and objects in an OO program, such as JS objects or Java objects. When we want to manage information about business objects of some type in an app, we represent them in the form of JS/Java objects instantiating a JS/Java class that represents their (business) object type. We call these classes model classes for two reasons: first because they implement the classes defined in an app’s data model, and second because they represent the ‘model’ part of an app's Model-View-Controller codebase architecture.

Therefore, in a JS/Java app, every business object is represented by a JS/Java object, but not every JS/Java object represents a business object, because we use JS/Java objects for many purposes (e.g., in JavaScript, an array is a JS object, but it’s not a business object). The same applies to classes: (business) object types are represented as model classes, but not every JS/Java class is a model class, because we may also use JS/Java classes for other purposes (e.g., in Java, a class can be used as a container for a method library, but such a class is not a model class).
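The distinction can be illustrated with a small JavaScript sketch. The StringUtil class and the sample data are hypothetical. StringUtil plays the role of a method library container, here expressed with a static method as a JavaScript analog of the Java case mentioned above.

```javascript
// model class representing the business object type "book"
class Book {
  constructor(isbn, title) { this.isbn = isbn; this.title = title; }
}
// hypothetical utility class: a container for library methods,
// not a model class
class StringUtil {
  static capitalize(s) {
    return s.charAt(0).toUpperCase() + s.slice(1);
  }
}
const book = new Book("0123456789", "example title");  // made-up data
const titles = [book.title];

console.log(typeof titles);  // "object"
console.log(Array.isArray(titles));  // true
console.log(book instanceof Book);  // true
console.log(StringUtil.capitalize(book.title));  // "Example title"
```

Here book is a JS object representing a business object, whereas titles is a JS object (an array) that merely serves as a data structure.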