What is a Database? (Part 3 of 4)

Related Articles

Relational Databases

Of course in the 80's the "Relational Database Model" became the rage. The Relational Model developed out of the work done by Dr. E. F. Codd at IBM in the late 1960s who was looking for ways to solve the problems with the existing models.

Because he was a mathematician, he naturally built the model on mathematical concepts which he expounded in the famous work called "A Relational Model of Data for Large Shared Databanks".

At the core of the relational model is the concept of a table (also called a relation) in which all data is stored. Each table is made up of records (horizontal rows also known as tuples) and fields (vertical columns also known as attributes).

It is important to note that how or where the tables of data are stored makes no difference. Each table can be identified by a unique name and that name can be used by the database to find the table behind the scenes. As a user, all you need to know is the table name in order to use it. You do not need to worry about the complexities of how the data is stored on the hard drive.

This is quite a bit different from the hierarchical and network models in which the user had to have an understanding of how the data was structured within the database in order to retrieve, insert, update, or delete records from the database.

So how do you find data in a relational database if there is no map (like a hierarchy defined by pointers) to follow?

Well, in the relational model, operations that manipulate data do so on the basis of the data values themselves. Thus, if you wish to retrieve a row from a table for example, you do so by comparing the value stored within a particular column for that row to some search criteria.

For example, you might say (not getting into syntax yet) "Give me all the rows from the 'STUDENTS' table which have 'Selena' in the 'FIRST_NAME' column."

The database might return a list which looks essentially like this:

Selena Sol SID-001 213-456-7890
Selena Roberts SID-268 818-934-5069
Selena Smith SID-991 310-234-6475

You could then use the data from a retrieved row to query another table. For example, you might say "Okay, I want to know what grade 'Selena Sol' received in 'Underwater Basket Weaving 101'. So I will now use the 'Student ID' number from my previous query as the keyword in my next query. I want the row in the 'Underwater Basket Weaving Course' table where student ID equals 'SID-001'.

This data access methodology makes the relational model a lot different from and better than the earlier database models because it is a much simpler model to understand. This is probably the main reason for the popularity of relational database systems today.

Another benefit of the relational system is that it provides extremely useful tools for database administration. Essentially, tables can not only store actual data but they can also be used as the means for generating meta-data (data about the table and field names which form the database structure, access rights to the database, integrity and data validation rules etc).

Thus everything within the relational model can be stored in tables. This means that many relational systems can use operations recursively in order to provide information about the database. In other words, a user can query information concerning table names, access rights, or some data and the results of these queries would then be presented to the user in the form of a table.

This makes database administration as easy as usage!

Client/Server Databases

As we said before, most databases that you will come across these days will be relational databases. However, there are many types of relational databases and not all of them will be useful for web applications.

In particular, it will be the client/server databases rather than the stand-alone packages that you will use for the web.

A client/server database works like this: A database server is left running 24 hours a day, and 7 days a week. Thus, the server can handle database requests at any hour. Database requests come in from "clients" who access the database through its command line interface or by connecting to a database socket. Requests are handled as they come in and multiple requests can be handled at one time.

For web applications which must be available for world wide time zone usage, it is essential to build upon a client/server database which can run all the time.

For the most part, these are the only types of databases that Internet Service Providers will even provide. However if you are serving web pages yourself, you should consider many of the excellent freeware, shareware or commercial products around. I myself like postgres for UNIX since I prefer a UNIX-based web server. However, there are plenty of good applications for PC and Mac as well.

Software:

Good Database Design

It is most likely that as a web developer, you will be working with one of the modern relational databases and that you will be able to work in conjunction with an existing database administrator. That is, this tutorial is limited to the "use" of databases rather than to the creation and administration of them. In fact, the creation and administration of databases is a topic well beyond the scope of this tutorial and probably well beyond the scope of your job. After all, database administration is its own entire job description.

However, we have been spending a lot of time going through general database theory because although you may not be designing databases yourself, in order to take the most advantage of them, it is important that you understand how they work. Likewise, it is important that you have a feel for good database design. After all, a database's usefulness is directly proportional to the sleekness and efficiency of its design. And your ability to take advantage of a database is directly proportional to your ability to decipher and internalize that design.

When thinking about good database design, it is important that you keep data retrieval, storage and modification efficiency in mind. It will pay off one thousand fold if you take a week or two to simply play with different arrangements of data. You will find that certain table structures will provide easier and more intuitive access than others.

Tables should describe only one subject, have distinct fields, contain no redundant data, and have a field with unique values so that the table can be related to others.

You should also keep in mind future expansion of the database and make sure that your design is easily extensible.

Typically, you will go through a requirements phase in which you should simply sit with the problem, interview users, and achieve an intuition about the data and the project.

Next, you should spend time modeling the data, preferably using some standard methodology like ER Diagramming. However, even if you do not model in any traditional way, you can still play with different ideas and think about the pros and cons.

Finally, you should try out your ideas and hone them through limited trials.

Hopefully, you will also choose a database with full- functionality such as security and concurrency control (making sure that two users cannot simultaneously change a record). There are many excellent choices available in the market today from freeware to commercial products.

Of course, as we said above, you will probably be coming onto a project with an already existing database. This is the case for most web developers. In this case, you should work closely with the database administrator to define the database.

At the very least you should sit down over a brew one evening and discuss the database design thoroughly.

Talking to Databases

Once the database is up and running and populated with data, you will need a way of talking to it. Essentially, there are two ways of doing that: connecting to the database directly using a command shell or by connecting to the database over the network using sockets (if that database allows such interaction).

Command shells are pretty common and come bundled with just about every database. These are usually simple filters which allow you to log on to the database, execute commands and receive output. They can either be very simple command-line shells or nice, fanciful graphical user interfaces. Whichever the case, command shells provide access to the database from the machine on which the database is running. We will see more examples of this methodology when we discuss CGI-based web databases in part three.

Socket-based middleware does the exact same thing, but over a network. Thus, with a socket-based system, I could send commands to my database in Los Angeles while I am working in Singapore by perhaps using TCP/IP over the internet to connect to my database. We will see more examples of this when we look at JDBC-based web databases in part four.

You will almost certainly want to get access to a database and its command shell just to try things out during this tutorial. Hopefully your ISP will provide a database to test, but if not, you can probably install a database like Microsoft Access on your local system just to play with SQL commands.

Of course, whether you choose a command shell or a socket based connection, you will need to know the language of the database in order to send commands. For this tutorial, we are going to focus on SQL which is actually the de facto language of databases today.

SQL comes in many flavors depending on the proprietary database system implementing it. However, regardless of the proprietary extensions, the core of the language is the same in every environment. Fortunately, for most web uses, generic SQL will be more than sufficient.

SQL protects us from the implementation of data storage and retrieval. Essentially, so long as we use the standard commands such as:

SELECT * FROM database1

It is up to the database itself to figure out what to do. We don't need to know any of the internal workings of the database or how it stores data on the file system.

The Basics of the SQL Database

As we said, SQL (Structured Query language) is the language of choice for most modern multi-user, relational databases. That is because SQL provides the syntax and idioms (language) you need to talk to (query) relational databases in a standardized, cross-platform/product way (structured).

The beauty of SQL is that it idiomizes the relational model. Rather than refer to data as a set of pointers, SQL provides predefined procedures to allow you to use any value in a table to relate other tables in a database. So long as a database is structured using the relational model, SQL will be a natural fit because SQL was designed to make sense in a relational system. SQL by its very design is a language that can be used to talk about relating tables.

For the rest of Part One and Two, we will examine how you will use SQL to access relational databases. However, first we should say a little bit about the structure of SQL databases before we plunge into usage.

SQL databases (most modern relational databases), as you will recall, are composed of a set of row/column-based "tables", indexed by a "data dictionary". To access data in the tables, you simply use SQL to navigate the system and produce "views" based on search criteria defined in the SQL query.

Okay, that was quite a bit of jargon all at once. Let's step back for a moment and look at each of these terms.

SQL Tables

We have already discussed the concept of tables in the last part, but let's just refresh our memory in terms of how tables relate to SQL. A table is a systematic way to store data. For the most part, a table is just like a spreadsheet. Tables are composed of rows (records) and each row is composed of columns (fields).

Employee Table
Employee ID Number Employee Name Employee Phone Salary
001 Lim Li Chuen 654-3210 90,000 USD
002 Lim Sing Yuen 651-0987 40,000 USD
003 Loo Soon Keat 123-4567 50,000 USD

How the tables are stored by the database you are using does not really make a difference for you. The beauty of SQL is that it works independently of the internal structure of the database. The tables could be stored as simple flat files on a local PC or as complex, networked, compressed, encrypted and proprietary data structures.

All you need to know is the table's name. If you know the name, you can use SQL to call up the table.

We'll look at manipulating tables in detail a bit later. But first, let's look at the data dictionary.

The Data Dictionary

How does the database know where all of these tables are located? Well, behind the scenes, the database maintains a "data dictionary" (a.k.a. catalog) which contains a list of all the tables in the database as well as pointers to their locations.

Essentially, the data dictionary is a table of tables containing a list of all the tables in the database, as well as the structure of the tables and often, special information about the database itself.

When you use SQL to talk to the database and provide a table name, the database looks up the table you referred to in the data dictionary. Of course, you needn't worry about the data dictionary; the database does all the searching itself. As we said before, you just need to know the name of the table you want to look at.

It is interesting to note that because the data dictionary is a table, in many databases, you can even query the data dictionary itself to get information about your environment. This can often be a very useful tool when exploring a new database.

Okay, so how do you actually grab table data using the data dictionary? Well, in an SQL database you create "views". Let's examine views a bit.

Views >>


Publication Date: Friday 1st August, 2003
Author: Selena Sol View profile

Related Articles