As we already discussed, a database table stores the information that is related to a specific entity type. Different entities are stored in different tables, with each table having its own structure, and own attributes related to that entity. In most situations, the entities in these tables are not isolated, but are linked with some kind of relationship. For example, in a corporate environment, we might have a table for employees and a table for departments. However, departments contain employees, or employees work in departments. That is the link between the two entity types.

In databases, this “link” is represented by what we call a foreign key relationship. For example, consider the two tables shown. We have a customers table and an Orders table. Each table has its own primary key (which is called ID in both tables). So what is the link between those tables? Well, each order is made by one customer. A customer can make any number of orders. Notice the attribute Cust_ID in the Orders table. This attribute indicates the customer that made each particular order. So for example, order no. 2 is made by customer whose ID is “024″ (i.e., Bauer). The values in the Cust_ID column refer to the values in the ID column of the Customer table. The Cust_ID column is called a Foreign Key, since it refers to the key (primary key) of another (foreign) table.
So why do we need to do that? Why not just put the customer names in the Order table? The actual reason is called Normalization, which as a whole different topic that we might discuss later on. But for now, think of it as follows: Let’s say the company found a spelling mistake in Mr. Jack Bauer’s name (customer no. 024). If the customer name was placed with each single order, then the company would have to go and correct that spelling mistake in every order made my that customer. However, using foreign keys, the company will have to correct the mistake only in the Customers table, and the Orders will still be referring to the correct person.
This leads us to a closely-related concept, which is Referential Integrity. As we mentioned, the Cust_ID column (foreign key) references the ID column (primary key) of the Customer table. The referential integrity rule states that each value in the referring column (Cust_ID) must exist in the referenced column (ID). In other words, we cannot place an order with Cust_ID = 034 without having a customer in the Customers tables with ID = 034.
However, referential integrity might be violated for example, if we try to delete a row from the customer table (when there are orders made my that customer). Databases have multiple ways of ensuring that referential integrity still holds. One common way is to prevent (restrict) deletion in this case. For example, if a user tries to delete customer no. 024 from the Customer table, the database will not perform that action, but will give an error message instead. Another common way is to cascade the deletion process. So, in the above example, when the user tries to delete customer no. 024, the database will perform that action, but it will also delete all orders placed by that customer. So in the end, referential integrity still holds. The choice of which strategy to use can be defined by the user, and depends on the context of the data.
