In all application domains, there are string-valued attributes with
a fixed list of possible string values. These attributes are called
enumeration attributes, and the fixed
value lists defining their possible string values are called enumerations. For instance, when we have to
manage data about people, we often need to include information about their
gender. The possible values of a gender
attribute may be
restricted to one of the enumeration labels "male", "female" and
"undetermined", or to one of the enumeration codes "M", "F" and "U".
Whenever we deal with codes, we also need to have their corresponding
labels, at least in a legend explaining the meaning of each code.
Instead of using the enumeration string values as the internal
values of an enumeration attribute, it is preferable to use a simplified
internal representation for them, such as the positive integers 1, 2, 3,
etc., which enumerate the possible values. However, since these integers
do not reveal their meaning (which is indicated by the enumeration label)
in program code, for readability we prefer to use special constants, called
enumeration literals, such as MALE or M, prefixed by the name of the
enumeration, as in this.gender = GenderEL.MALE. Notice that we follow the
convention that the names of enumeration literals are written in all upper
case, and that we also use the convention of suffixing the name of an
enumeration datatype with "EL", standing for "enumeration literal" (so
that we can recognize from the name GenderEL that each instance of this
datatype is a "gender enumeration literal").
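The idea of integer-valued enumeration literals can be sketched in JavaScript as follows. This is only a minimal illustration under stated assumptions: the helper makeSimpleEnum, the Person class and its genderLabel getter are hypothetical names introduced here, not part of any library.

```javascript
// Hypothetical helper: builds a simple enumeration from a list of labels.
function makeSimpleEnum(labels) {
  const en = { labels: [...labels] };
  labels.forEach((label, i) => {
    // Literal name = label in upper case, whitespace replaced by underscores
    const literalName = label.toUpperCase().replace(/\s+/g, "_");
    en[literalName] = i + 1; // internal values are the positive integers 1, 2, 3, ...
  });
  return Object.freeze(en);
}

const GenderEL = makeSimpleEnum(["male", "female", "undetermined"]);

// Hypothetical class with an enumeration attribute
class Person {
  constructor(name, gender) {
    this.name = name;
    this.gender = gender; // stored as an integer, written as a literal
  }
  // Maps the internal integer back to its enumeration label
  get genderLabel() {
    return GenderEL.labels[this.gender - 1];
  }
}

const p = new Person("Alice", GenderEL.FEMALE);
console.log(p.gender, p.genderLabel); // 2 female
```

The program code only ever mentions GenderEL.FEMALE, while the stored value is the plain integer 2.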
There are also enumerations having records as their instances, such that one of the record fields provides the name of the enumeration literals. An example of such an enumeration is the following list of units of measurement:
Table 10.1. Representing an enumeration of records as a table
Units of Measurement | ||
---|---|---|
Unit Symbol | Unit Name | Dimension |
m | meter | length |
kg | kilogram | mass |
g | gram | mass |
s | second | time |
ms | millisecond | time |
Notice that since both the "Unit Symbol" and the "Unit Name" fields are unique, either of them could be used for the name of the enumeration literals.
In summary, we can distinguish between the following three forms of enumerations:
simple enumerations define a list of self-explanatory enumeration labels;
code lists define a list of code/label pairs;
record enumerations consist of a list of records, so they are defined like classes with simple attributes defining the record fields.
These three forms of enumerations are discussed in more detail below.
Notice that, since enumerations are used as the range of enumeration attributes, they are considered to be datatypes.
Enumerations may have further features. For instance, we may want to be able to define a new enumeration by extending an existing enumeration. In programming languages and in other computational languages, enumerations are implemented with different features in different ways. See also the Wikipedia article on enumerations.
A simple enumeration defines a fixed list of self-explanatory enumeration
labels, as in the example of a GenderEL enumeration shown in the following
UML class diagram:
Since the labels of a simple enumeration are used, in capitalized form, as
the names of the corresponding enumeration literals (GenderEL.MALE,
GenderEL.FEMALE, etc.), we may also list the (all upper case) enumeration
literals in the UML enumeration datatype, instead of the corresponding
(lower or mixed case) enumeration labels.
A code list is an enumeration that defines a fixed list of code/label pairs. Unfortunately, the UML concept of an enumeration datatype does not support the distinction between codes as enumeration literals and their labels. For defining both codes and labels in a UML class diagram in the form of an enumeration datatype, we may use the attribute compartment of the data type rectangle and use the codes as attribute names defining the enumeration literals, and set their initial values to the corresponding label. This approach results in a visual representation as in the following diagram:
In the case of a code list, we can use either the codes or the labels as
the names of enumeration literals, but using the codes seems preferable
for brevity (GenderEL.M, GenderEL.F, etc.). For displaying the value of an
enumeration attribute, one option is to show not only the label, but also
the code, as in "male (M)", provided that there is sufficient space.
If space is an issue, only the code can be shown.
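A code list with codes as literal names, and the two display options just described, might be sketched like this. The names makeCodeListEnum and display are hypothetical, and the compact flag is an assumption standing in for "space is an issue".

```javascript
// Code/label pairs; the codes double as enumeration literal names.
const genderCodeList = { M: "male", F: "female", U: "undetermined" };

// Hypothetical helper: builds a code list enumeration from code/label pairs.
function makeCodeListEnum(codeList) {
  const codes = Object.keys(codeList);
  const en = { codes, labels: Object.values(codeList) };
  // Each code becomes a literal whose value is its 1-based position
  codes.forEach((code, i) => { en[code] = i + 1; });
  return Object.freeze(en);
}

const GenderEL = makeCodeListEnum(genderCodeList);

// Shows "label (code)" by default, or only the code in compact mode.
function display(en, value, compact = false) {
  const code = en.codes[value - 1];
  const label = en.labels[value - 1];
  return compact ? code : `${label} (${code})`;
}

console.log(display(GenderEL, GenderEL.M));       // male (M)
console.log(display(GenderEL, GenderEL.F, true)); // F
```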
A record enumeration defines a record type with a unique field designated to provide the enumeration literals, and a fixed list of records of that type. In general, a record type is defined by a set of field definitions (in the form of primitive datatype attributes), where one of the unique fields is designated as the enumeration literal field, together with a set of operation definitions.
Unfortunately, record enumerations, as the most general form of an enumeration datatype, are not supported by the current version of UML (2.5), where the general form of an enumeration is defined as a special kind of datatype (with optional field and operation definitions) having an additional list of unique strings as enumeration literals (shown in a fourth compartment). The UML definition neither allows designating one of the unique fields as the enumeration literal field, nor allows populating an enumeration with records.
Consequently, for showing a record enumeration in a UML class diagram, we need to find a workaround. For instance, if our modeling tool allows adding a drawing, we could draw a rectangle with four compartments, such that the first three of them correspond to the name, properties and operations compartments of a datatype rectangle, and the fourth one is a table with the names of properties/fields defined in the second compartment as column headers, as shown in the following figure.
UnitEL | ||
---|---|---|
«el» unitSymbol: String unitName: String dimension: String |
||
Unit Symbol | Unit Name | Dimension |
m | meter | length |
kg | kilogram | mass |
g | gram | mass |
s | second | time |
ms | millisecond | time |
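Outside of UML, a record enumeration like UnitEL can be approximated in code. The following is a rough sketch, assuming a hypothetical helper makeRecordEnum that takes the name of the enumeration literal field and the fixed list of records. The unit symbols are used verbatim as literal names here, rather than in upper case, which is a choice made for this illustration.

```javascript
// Hypothetical helper: builds a record enumeration. The field named by
// literalField must be unique; its values become the literal names.
function makeRecordEnum(literalField, records) {
  const en = { records: records.map((r) => Object.freeze({ ...r })) };
  const seen = new Set();
  en.records.forEach((rec, i) => {
    const name = rec[literalField];
    if (seen.has(name)) throw new Error(`Duplicate literal: ${name}`);
    seen.add(name);
    en[name] = i + 1; // literal value = 1-based index of the record
  });
  return Object.freeze(en);
}

const UnitEL = makeRecordEnum("unitSymbol", [
  { unitSymbol: "m", unitName: "meter", dimension: "length" },
  { unitSymbol: "kg", unitName: "kilogram", dimension: "mass" },
  { unitSymbol: "g", unitName: "gram", dimension: "mass" },
  { unitSymbol: "s", unitName: "second", dimension: "time" },
  { unitSymbol: "ms", unitName: "millisecond", dimension: "time" }
]);

// Look up the full record behind a literal
const unit = UnitEL.records[UnitEL.kg - 1];
console.log(unit.unitName, unit.dimension); // kilogram mass
```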
There may be cases of enumerations that need to be extensible, that is, it must be possible to extend their list of enumeration values (labels or code/label pairs) by adding a new one. This can be expressed in a class diagram by appending an ellipsis to the list of enumeration values, as shown in Figure 10.1.
Since enumeration values are internally represented by enumeration literals, which are normally stored as plain positive integers in a database, a new enumeration value can only be added at the end of the value list such that it can be assigned a new index integer without re-assigning the indexes of other enumeration values. Otherwise, the mapping of enumeration indexes to corresponding enumeration values would not be preserved.
Alternatively, if new enumeration values have to be inserted between other enumeration values, and their indexes re-assigned, this implies that
enumeration indexes are plain sequence numbers and no longer identify an enumeration value;
the value of an enumeration literal can no longer be an enumeration index, but rather has to be an identifying string: preferably the enumeration code in the case of a code list, or the enumeration label otherwise.
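The difference between appending and inserting can be made concrete with a small sketch. It reuses the unit symbols from the table above and assumes a hypothetical new value "km"; labelOf and the stored values are illustrative names only.

```javascript
// Resolves a stored 1-based enumeration index against a value list.
const labelOf = (values, index) => values[index - 1];

const original = ["m", "kg", "g", "s", "ms"];
const storedIndex = 2; // a database value that originally meant "kg"

// Appending at the end keeps all existing indexes valid.
const appended = [...original, "km"];
console.log(labelOf(appended, storedIndex)); // kg

// Inserting in between shifts the indexes, so the stored 2 is misread.
const inserted = ["m", "km", "kg", "g", "s", "ms"];
console.log(labelOf(inserted, storedIndex)); // km

// With an identifying string as the stored value, insertion is harmless:
// the index becomes a mere sequence number that can be recomputed.
const storedSymbol = "kg";
console.log(inserted.indexOf(storedSymbol) + 1); // 3
```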