3.1. Creating a Relational Database

First we have to choose and install an RDBMS. For learning how to create and deal with an RDB, we can use a single-user RDBMS, such as Microsoft Access or the DB Browser for SQLite. A multi-user RDBMS, such as MySQL or PostgreSQL, comes with a much greater installation and administration effort.

As an illustrative example, we are going to create a movie database with three tables:

  1. An entity table for information about people with just two columns: ID and name.
  2. An entity table for information about movies with four columns: ID, title, release date and director.
  3. A relationship table for information about which actors play in which movies with two columns: person ID and movie ID.

In the following hands-on exercise, you are going to create a DB in three steps:

  1. Get prepared:

    1. If you don't have it yet, install the RDBMS of your choice (MS Access or DB Browser for SQLite) on your computer.
    2. Start the RDBMS.
    3. Create a new DB file "MovieDB" in a suitable folder (like "databases") on a storage medium (such as a hard disk or SSD) of your computer.
  2. Create the two entity tables people and movies and populate them by inserting person records and movie records.
  3. Create the relationship table people_as_actors_in_movies and populate it by inserting pairs of person IDs and movie IDs.

Exercise: Create a Movie DB with the SQLite DB Browser

1 Get prepared

After installing the SQLite DB Browser (normally by clicking their download link "DB.Browser.for.SQLite-x.y.z-win64.msi" on their download page), start it and click the "New Database" menu option. Then choose the folder where to store the DB file, enter its name ("MovieDB") and confirm.

You'll see the full name of your DB file (including the file system path), such as D:\databases\MovieDB.db, on the top of the DB Browser window, as shown in Figure 1-3.

2 Create the entity table people

Click on Create Table and enter "people" as the name of the new table. Then click Add field, where "field" means the same as attribute or column.

Enter "id" as the name of the first field and keep "INTEGER" as its type. Since "id" is supposed to be the primary key for people records, check the "PK" checkbox. Since "id" is supposed to be an automatically assigned sequence number, check also the "AI" checkbox (where "AI" stands for "AUTOINCREMENT").

Then click Add field again for defining the "name" attribute of the "people" table. Choose "TEXT" as its type and check the "NN" checkbox for defining the "name" attribute to be mandatory, which is called "NOT NULL" in SQL.

Notice the SQL code that has been generated automatically on the lower part of the Edit table definition wizard:

1
2
3
4
CREATE TABLE "people" (
  "id" INTEGER PRIMARY KEY AUTOINCREMENT,
  "name" TEXT NOT NULL
);

This SQL data definition statement defines the schema (or structure) of the "people" table.

Since we are done with defining the "people" table schema, you can click OK, now, and proceed with populating the table by entering records as table rows. Choose three of your favorite movies and enter the names of their directors and at least two actors for each of them (alternatively, you can also enter the data records shown in Figure 3-2).

For entering "people" records, click the Browse Data tab while the table "people" is selected. Then click the New Record button repeatedly for entering the names of people. Notice that the id number, as an auto-incremented primary key, is automatically pre-filled.

3 Create the entity table movies

Go back to the Database Structure tab and click again on Create Table. Enter "movies" as the name of the new table. Then click Add field for defining the attributes of movie records.

As before, enter "id" as the name of the first field and keep "INTEGER" as its type. Since "id" is supposed to be the primary key consisting of an automatically generated sequence number, check the "PK" and the "AI" checkboxes.

Then click Add field again for defining the "title" attribute. Choose "TEXT" as its type and check the "NN" checkbox for defining the attribute to be mandatory ("NOT NULL").

Since SQLite does not have a datatype for calendar dates, we break the "release date" attribute up into a combination of three SQLite fields: "release_date_D", "release_date_M" and "release_date_Y", for allowing to enter a combination of three numbers for the day, month and year of a release date. Click Add field for defining each of these three fields with "INTEGER" as their type and the "NN" checkbox checked.

Finally, add the field "director" with type "INTEGER" and "NN" checked. This field will hold the ID value of the person that is the director of the movie.

Notice the SQL code that has been generated automatically on the lower part of the Edit table definition wizard:

1
2
3
4
5
6
7
8
CREATE TABLE "movies" (
  "id" INTEGER PRIMARY KEY AUTOINCREMENT,
  "title" TEXT NOT NULL,
  "release_date_D" INTEGER NOT NULL,
  "release_date_M" INTEGER NOT NULL,
  "release_date_Y" INTEGER NOT NULL,
  "director" INTEGER NOT NULL
);

This SQL data definition statement defines the schema (or structure) of the "movies" table.

Now click OK and proceed with populating the table by going to the Browse Data tab.

4 Create the relationship table people_as_actors_in_movies

Go back to the Database Structure tab and click again on Create Table. Enter (or copy & paste) "people_as_actors_in_movies" as the name of the new table. Then define two attributes: "person_id" and "movie_id" with Add field. Since the attribute "person_id" is supposed to hold the ID number of a person who is an actor in the movie having the ID provided by "movie_id", both fields are of type "INTEGER" and mandatory (their "NN" checkbox has to be checked).

The generated SQL code for defining the table schema is as follows:

1
2
3
4
CREATE TABLE "people_as_actors_in_movies" (
  "person_id" INTEGER NOT NULL,
  "movie_id" INTEGER NOT NULL
);

Now click OK and proceed with populating the table by going to the Browse Data tab. Enter the data shown in Figure 3-1 below.

Figure 3-1. The relationship data: which people are actors in which movies.

Having only the ID numbers of people and movies is all the information needed for relating people as actors to movies, but seeing only the ID numbers is not very well readable for human users. Therefore, in a user interface, we should show additional information, such as the names of people and the titles of movies, for allowing users to more easily understand these people_as_actors_in_movies relationships.

In Figure 3-2 the contents of the movie DB is shown in the form of three tables.

Figure 3-2. The contents of the SQLite movie database
Table 3-1. people
idname
1Stephen Frears
2George Lucas
3Quentin Tarantino
4Uma Thurman
5John Travolta
6Ewan McGregor
7Natalie Portman
8Keanu Reeves
Table 3-2. people_as_actors_in_movies
person_idmovie_id
41
51
62
72
83
43
Table 3-3. movies
idtitlerelease_date_Drelease_date_Mrelease_date_Ydirector
1Pulp Fiction12519943
2Star Wars25519772
3Dangerous Liaisons161219881

Using numbers as references to entities comes with the risk of entering a number that does not reference an existing entity record. This would be a violation of referential integrity. For avoiding this risk, an SQL DBMS allows defining foreign keys, which require the entered values to reference existing records. This issue will be discussed in Part III.