3.1. Creating a Relational Database
First we have to choose and install an RDBMS. For learning how to create and deal with an RDB, we can use a single-user RDBMS, such as Microsoft Access or the DB Browser for SQLite. A multi-user RDBMS, such as MySQL or PostgreSQL, comes with a much greater installation and administration effort.
As an illustrative example, we are going to create a movie database with three tables:
- An entity table for information about people with just two columns: ID and name.
- An entity table for information about movies with four columns: ID, title, release date and director.
- A relationship table for information about which actors play in which movies with two columns: person ID and movie ID.
In the following hands-on exercise, you are going to create a DB in three steps:
Get prepared:
- If you don't have it yet, install the RDBMS of your choice (MS Access or DB Browser for SQLite) on your computer.
- Start the RDBMS.
- Create a new DB file "MovieDB" in a suitable folder (like "databases") on a storage medium (such as a hard disk or SSD) of your computer.
- Create the two entity tables
people
andmovies
and populate them by inserting person records and movie records. - Create the relationship table
people_as_actors_in_movies
and populate it by inserting pairs of person IDs and movie IDs.
Exercise: Create a Movie DB with the SQLite DB Browser
1 Get prepared
After installing the SQLite DB Browser (normally by clicking their download link "DB.Browser.for.SQLite-x.y.z-win64.msi" on their download page), start it and click the "New Database" menu option. Then choose the folder where to store the DB file, enter its name ("MovieDB") and confirm.
You'll see the full name of your DB file (including the file system path), such as D:\databases\MovieDB.db, on the top of the DB Browser window, as shown in Figure 1-3.
2 Create the entity table people
Click on
and enter "people" as the name of the new table. Then click , where "field" means the same as attribute or column.Enter "id" as the name of the first field and keep "INTEGER" as its type. Since "id" is supposed to be the primary key for people records, check the "PK" checkbox. Since "id" is supposed to be an automatically assigned sequence number, check also the "AI" checkbox (where "AI" stands for "AUTOINCREMENT").
Then click
again for defining the "name" attribute of the "people" table. Choose "TEXT" as its type and check the "NN" checkbox for defining the "name" attribute to be mandatory, which is called "NOT NULL" in SQL.Notice the SQL code that has been generated automatically on the lower part of the Edit table definition wizard:
1 2 3 4 | CREATE TABLE "people" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT NOT NULL ); |
This SQL data definition statement defines the schema (or structure) of the "people" table.
Since we are done with defining the "people" table schema, you can click Figure 3-2).
, now, and proceed with populating the table by entering records as table rows. Choose three of your favorite movies and enter the names of their directors and at least two actors for each of them (alternatively, you can also enter the data records shown inFor entering "people" records, click the Browse Data tab while the table "people" is selected. Then click the id
number, as an auto-incremented primary key, is automatically pre-filled.
3 Create the entity table movies
Go back to the Database Structure tab and click again on
. Enter "movies" as the name of the new table. Then click for defining the attributes of movie records.As before, enter "id" as the name of the first field and keep "INTEGER" as its type. Since "id" is supposed to be the primary key consisting of an automatically generated sequence number, check the "PK" and the "AI" checkboxes.
Then click
again for defining the "title" attribute. Choose "TEXT" as its type and check the "NN" checkbox for defining the attribute to be mandatory ("NOT NULL").Since SQLite does not have a datatype for calendar dates, we break the "release date" attribute up into a combination of three SQLite fields: "release_date_D", "release_date_M" and "release_date_Y", for allowing to enter a combination of three numbers for the day, month and year of a release date. Click
for defining each of these three fields with "INTEGER" as their type and the "NN" checkbox checked.Finally, add the field "director" with type "INTEGER" and "NN" checked. This field will hold the ID value of the person that is the director of the movie.
Notice the SQL code that has been generated automatically on the lower part of the Edit table definition wizard:
1 2 3 4 5 6 7 8 | CREATE TABLE "movies" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "title" TEXT NOT NULL, "release_date_D" INTEGER NOT NULL, "release_date_M" INTEGER NOT NULL, "release_date_Y" INTEGER NOT NULL, "director" INTEGER NOT NULL ); |
This SQL data definition statement defines the schema (or structure) of the "movies" table.
Now click
and proceed with populating the table by going to the Browse Data tab.4 Create the relationship table people_as_actors_in_movies
Go back to the Database Structure tab and click again on
. Enter (or copy & paste) "people_as_actors_in_movies" as the name of the new table. Then define two attributes: "person_id" and "movie_id" with . Since the attribute "person_id" is supposed to hold the ID number of a person who is an actor in the movie having the ID provided by "movie_id", both fields are of type "INTEGER" and mandatory (their "NN" checkbox has to be checked).The generated SQL code for defining the table schema is as follows:
1 2 3 4 | CREATE TABLE "people_as_actors_in_movies" ( "person_id" INTEGER NOT NULL, "movie_id" INTEGER NOT NULL ); |
Now click Figure 3-1 below.
and proceed with populating the table by going to the Browse Data tab. Enter the data shown inHaving only the ID numbers of people and movies is all the information needed for relating people as actors to movies, but seeing only the ID numbers is not very well readable for human users. Therefore, in a user interface, we should show additional information, such as the names of people and the titles of movies, for allowing users to more easily understand these people_as_actors_in_movies
relationships.
In Figure 3-2 the contents of the movie DB is shown in the form of three tables.
id | name |
---|---|
1 | Stephen Frears |
2 | George Lucas |
3 | Quentin Tarantino |
4 | Uma Thurman |
5 | John Travolta |
6 | Ewan McGregor |
7 | Natalie Portman |
8 | Keanu Reeves |
person_id | movie_id |
---|---|
4 | 1 |
5 | 1 |
6 | 2 |
7 | 2 |
8 | 3 |
4 | 3 |
id | title | release_date_D | release_date_M | release_date_Y | director |
---|---|---|---|---|---|
1 | Pulp Fiction | 12 | 5 | 1994 | 3 |
2 | Star Wars | 25 | 5 | 1977 | 2 |
3 | Dangerous Liaisons | 16 | 12 | 1988 | 1 |
Using numbers as references to entities comes with the risk of entering a number that does not reference an existing entity record. This would be a violation of referential integrity. For avoiding this risk, an SQL DBMS allows defining foreign keys, which require the entered values to reference existing records. This issue will be discussed in Part III.