Chapter 10. Enumerations and Enumeration Attributes

Chapter 10. Enumerations and Enumeration Attributes
Prev	Part III. Enumerations	Next

Table of Contents

1. Enumerations

1.1. Simple enumerations
1.2. Code lists
1.3. Record enumerations
1.4. Extensible enumerations

2. Enumeration Attributes

3. Enumerations in Computational Languages

3.1. Enumerations in SQL
3.2. Enumerations in XML Schema
3.3. Enumerations in JavaScript

4. Dealing with Enumeration Attributes in a Design Model

5. Quiz Questions

5.1. Question 1: Languages Supporting Enumerations
5.2. Question 2: UI Widget for Single-Valued Enum Attribute
5.3. Question 3: UI Widget for Multi-Valued Enum Attribute

1. Enumerations

In all application domains, there are string-valued attributes with a fixed list of possible string values. These attributes are called enumeration attributes, and the fixed value lists defining their possible string values are called enumerations. For instance, when we have to manage data about people, we often need to include information about their gender. The possible values of a gender attribute may be restricted to one of the enumeration labels "male","female" and "undetermined", or to one of the enumeration codes "M", "F" and "U". Whenever we deal with codes, we also need to have their corresponding labels, at least in a legend explaining the meaning of each code.

Instead of using the enumeration string values as the internal values of an enumeration attribute, it is preferable to use a simplified internal representation for them, such as the positive integers 1, 2, 3, etc., which enumerate the possible values. However, since these integers do not reveal their meaning (which is indicated by the enumeration label) in program code, for readability we rather use special constants, called enumeration literals, such as MALE or M, prefixed by the name of the enumeration like in this.gender = GenderEL.MALE. Notice that we follow the convention that the names of enumeration literals are written all upper case, and that we also use the convention to suffix the name of an enumeration datatype with "EL" standing for "enumeration literal" (such that we can recognize from the name GenderEL that each instance of this datatype is a "gender enumeration literal").

There are also enumerations having records as their instances, such that one of the record fields provides the name of the enumeration literals. An example of such an enumeration is the following list of units of measurement:

Table 10.1. Representing an enumeration of records as a table

Units of Measurement
Unit Symbol	Unit Name	Dimension
m	meter	length
kg	kilogram	mass
g	gram	mass
s	second	time
ms	milisecond	time

Notice that since both the "Unit Symbol" and the "Unit Name" fields are unique, either of them could be used for the name of the enumeration literals.

In summary, we can distinguish between the following three forms of enumerations:

simple enumerations define a list of self-explanatory enumeration labels;
code lists define a list of code/label pairs.
record enumerations consist of a list of records, so they are defined like classes with simple attributes defining the record fields.

These three forms of enumerations are discussed in more detail below.

Notice that, since enumerations are used as the range of enumeration attributes, they are considered to be datatypes.

Enumerations may have further features. For instance, we may want to be able to define a new enumeration by extending an existing enumeration. In programming languages and in other computational languages, enumerations are implemented with different features in different ways. See also the Wikipedia article on enumerations.

1.1. Simple enumerations

A simple enumeration defines a fixed list of self-explanatory enumeration labels, like in the example of a GenderEL enumeration shown in the following UML class diagram:

Since the labels of a simple enumeration are being used, in capitalized form, as the names of the corresponding enumeration literals (GenderEL.MALE, GenderEL.FEMALE, etc.), we may also list the (all upper case) enumeration literals in the UML enumeration datatype, instead of the corresponding (lower or mixed case) enumeration labels.

1.2. Code lists

A code list is an enumeration that defines a fixed list of code/label pairs. Unfortunately, the UML concept of an enumeration datatype does not support the distinction between codes as enumeration literals and their labels. For defining both codes and labels in a UML class diagram in the form of an enumeration datatype, we may use the attribute compartment of the data type rectangle and use the codes as attribute names defining the enumeration literals, and set their initial values to the corresponding label. This approach results in a visual representation as in the following diagram:

In the case of a code list, we can use both the codes or the labels as the names of enumeration literals, but using the codes seems preferable for brevity (GenderEL.M, GenderEL.F, etc.). For displaying the value of an enumeration attribute, it's an option to show not only the label, but also the code, like "male (M)", provided that there is sufficient space. If space is an issue, only the code can be shown.

1.3. Record enumerations

A record enumeration defines a record type with a unique field designated to provide the enumeration literals, and a fixed list of records of that type. In general, a record type is defined by a set of field definitions (in the form of primitive datatype attributes), such that one of the unique fields is defined to be the enumeration literal field, and a set of operation definitions.

Unfortunately, record enumerations, as the most general form of an enumeration datatype, are not supported by the current version of UML (2.5) where the general form of an enumeration is defined as a special kind of datatype (with optional field and operation definitions) having an additional list of unique strings as enumeration literals (shown in a fourth compartment). The UML definition does neither allow designating one of the unique fields as the enumeration literal field, nor does it allow populating an enumeration with records.

Consequently, for showing a record enumeration in a UML class diagram, we need to find a workaround. For instance, if our modeling tool allows adding a drawing, we could draw a rectangle with four compartments, such that the first three of them correspond to the name, properties and operations compartments of a datatype rectangle, and the fourth one is a table with the names of properties/fields defined in the second compartment as column headers, as shown in the following figure.

UnitEL
«el» unitSymbol: String unitName: String dimension: String
Unit Symbol	Unit Name	Dimension
m	meter	length
kg	kilogram	mass
g	gram	mass
s	second	time
ms	millisecond	time

1.4. Extensible enumerations

Figure 10.1. An example of an extensible enumeration

There may be cases of enumerations that need to be extensible, that is, it must be possible to extend their list of enumeration values (labels or code/label pairs) by adding a new one. This can be expressed in a class diagram by appending an ellipsis to the list of enumeration values, as shown in Figure 10.1.

Since enumeration values are internally represented by enumeration literals, which are normally stored as plain positive integers in a database, a new enumeration value can only be added at the end of the value list such that it can be assigned a new index integer without re-assigning the indexes of other enumeration values. Otherwise, the mapping of enumeration indexes to corresponding enumeration values would not be preserved.

Alternatively, if new enumeration values have to be inserted in-between other enumeration values, and their indexes re-assigned, this implies that

enumeration indexes are plain sequence numbers and do no longer identify an enumeration value;
the value of an enumeration literal can no longer be an enumeration index, but rather has to be an identifying string: preferably the enumeration code in the case of a code list, or the enumeration label, otherwise.