Internet

SQL & MySQL Terminologies

�

Database Theory

�

The earliest known use of the term 'data base' was in July 1973, when the System Development Corporation sponsored a symposium under the title Development and Management of a Computer-centered Data Base.

�

Database as a single word became common in Europe in the early 1970s and by the end of the decade it was being used in major American newspapers. (Databank, a comparable term, had been used in the Washington Post newspaper as early as 1966.)

�

In computing, a database can be defined as a structured collection of records or data that is stored in a computer so that a program can consult it to answer queries. The records retrieved in answer to queries become information that can be used to make decisions.

�

The computer program used to manage and query a database is known as a database management system (DBMS). The properties and design of database systems are included in the study of information science.

�

The term "database" originated within the computing discipline. Although its meaning has been broadened by popular use, even to include non-electronic databases, this article is about computer databases.

�

Database-like records have been in existence since well before the industrial revolution in the form of ledgers, sales receipts and other business related collections of data.

�

Broader Definition

�

There is scarcely to be found a term that is less precise than database. A database can be a list of addresses residing in a spreadsheet program (such as Excel), or it can be the administration files of a telecommunications firm in which several million calls are registered daily, their charges accurately calculated, monthly bills computed, and warning letters sent to those who are in arrears.

�

A simple database can be a stand-alone operation (residing locally on a computer for a single user), while others may be used simultaneously by thousands of users, with the data parceled out among several computers and dozens of hard drives. The size of a database can range from a few kilobytes into the terabytes.

�

In ordinary usage, the word "database" is used to refer to the actual data, the resulting database files, the database system (such as MySQL or Oracle), or a database client (such as a PHP script or a program written in C++). Thus there arises a great potential for confusion as soon as two people begin to converse on the subject of databases.

�

Relations, Database Systems, Servers, and Clients

�

A database is an ordered collection of data, which is normally stored in one or more associated files. The data are structured as tables, where cross references among tables are possible. The existence of such relations among the tables leads to the database being called a relational database.

�

Let us clarify matters with an example. A database might consist of a table with data on a firm's customers (name, address, etc.), a table with data on the products the firm offers, and finally, a table containing the firm's orders. Through the table of orders it is possible to access the data in the other two tables (for example, via customer and product numbers).

�

MySQL, Oracle, the Microsoft SQL server, and IBM DB2 are examples of relational database systems. Such a system includes the programs for managing relational databases.

�

Among the tasks of a relational database system are not only the secure storage of data, but also such jobs as the processing of commands for querying, analyzing, and sorting existing data and for storing new data. All of this should be able to take place not only on a single computer, but over a network as well. Instead of a database system we shall often speak of a database server.

�

Where there are servers, there are clients. Every program that is connected to the database system is called a database client. Database clients have the job of simplifying the use of the database for the end user. No user of a database system in his or her right mind would wish to communicate directly with the database server.

�

That is much too abstract and inconvenient. (Let programmers worry about such direct communication!) Instead, the user has a right to expect convenient tables, listboxes, and so on to enable the location of data or to input new data.

�

Database clients can assume a variety of forms, and indeed, they are often not recognized by the user as database programs at all. Some examples of this type of client are HTML pages for the display and input of messages in an on-line discussion group, a traditional program with several windows for managing addresses and appointments, and a Perl script for executing administrative tasks. There is thus wide scope for database programming.

�

Relational vs Object-Oriented Database Systems

�

Relational databases have dominated the database world for decades, and they are particularly well suited for business data, which usually lend themselves to structuring in the form of tables. Except for the following two paragraphs, this entire book discusses only relational databases (though we shall not always stress this point).

�

Another kind of database is the object-oriented database. Such databases can store free-standing objects (without having to arrange them in tables). Although in recent years there has been a trend in the direction of object-oriented programming languages (such as Object-Store, O2, Cach�), object-oriented databases have found only a small market niche.

�

Note that relational databases can be accessed by means of object-oriented programming languages. However, that does not turn a relational database into an object-oriented one.

�

Object-oriented database systems enable direct access to objects defined in the programming language in question and the storage of such objects in the database without conversion (persistency). It is precisely this that is not possible with relational database systems, in which everything must be structured in tables.

�

Tables, Records, Fields, Queries, SQL, Index, Keys

�

We have already mentioned tables, which are the structures in which the actual data are located. Every line in such a table is called a data record, or simply record, where the structure of each record is determined by the definition of the table.

�

For example, in a table of addresses every record might contain fields for family name, given name, street, and so on. For every field there are precise conditions on the type of information that can be stored (such as a number in a particular format, or a character string with apredetermined maximum number of characters).

�

Note Sometimes, instead of data records with fields, we speak of rows and columns. The meaning is the same.

�

The description of a database consisting of several tables with all of its fields, relations, and indexes (see below) is called a database model. This model defines the construction of the data structures and at the same time provides the format in which the actual data are to be stored.

�

Tables usually contain their data in no particular order (more precisely, the order is usually that in which the data have been entered or modified). However, for efficient use of the data it is necessary that from these unordered data a list can be created that is ordered according to one or more criteria.

�

It is frequently useful for such a list to contain only a selection of the data in the table. For example, one could obtain a list of all of one's customers, ordered by ZIP code, who have ordered a rubber ducky within the past twelve months.

�

To create such a list, one formulates queries. The result of the query is again a table; however, it is one that exists in active memory (RAM) and not on the hard drive.

�

To formulate a query one uses SQL instructions, which are commands for selecting and extracting data. The abbreviation SQL stands for Structured Query Language, which has become a standard in the formulation of database queries. Needless to say, every producer of a database system offers certain extensions to this standard, which dilutes the goal of compatibility among various database systems.

�

When tables get large, the speed at which a query can be answered depends significantly on whether there is a suitable index giving the order of the data fields. An index is an auxiliary table that contains only information about the order of the records. An index is also called a key.

�

An index speeds up access to data, but it has disadvantages as well. First, every index increases the amount of storage on the hard drive necessary for the database file, and second, the index must be updated each time the data are altered, and this costs time.

�

(Thus an index saves time in the reading of data, but it costs time in entering and altering data. It thus depends on the use to which the data are to be put whether an index is on the whole a net plus or minus in the quest for efficiency.)

�

A special case of an index is a primary index,or primary key,which is distinguished in that the primary index must ensure a unique reference to a record. Often, for this purpose one simply uses a running index number (ID number). Primary indexes play a significant role in relational databases, and they can speed up access to data considerably.

�