The database administrator (DBA) for an organization has several responsibilities, which vary according to the size and operations of the organization, supporting staff, and so on. Depending on the particular organization's structure, if you are the organization's DBA, your responsibilities might include the following:
► Installing and maintaining database servers — You might install and maintain the database software. Maintenance can involve installing patches as well as upgrading the software at the appropriate times. As DBA, you might need to have root access to your system and know how to manage software (refer to Chapter 2, "Fedora Quick Start"). You also need to be aware of kernel, file system, and other security issues.
► Installing and maintaining database clients — The database client is the program used to access the database (you'll learn more about that later in this chapter, in the section "Database Clients"), either locally or remotely over a network. Your responsibilities might include installing and maintaining these client programs on users' systems. This chapter discusses how to install and work with the clients from both the Linux command line and through its graphical interface database tools.
► Managing accounts and users — Account and user management includes adding and deleting users from the database, assigning and administering passwords, and so on. In this chapter, you will learn how to grant and revoke user privileges and passwords for MySQL and PostgreSQL while using Fedora.
► Ensuring database security — To ensure database security, you need to be concerned with things such as access control, which ensures that only authorized people can access the database, and permissions, which ensure that people who can access the database cannot do things they should not do. In this chapter, you will learn how to manage Secure Shell (SSH), web, and local GUI client access to the database. Planning and overseeing the regular backup of an organization's database and restoring data from those backups are other critical components of securing the database.
► Ensuring data integrity — Of all the information stored on a server's hard disk storage, chances are the information in the database is the most critical. Ensuring data integrity involves planning for multiple-user access and ensuring that changes are not lost or duplicated when more than one user is making changes to the data base at the same time.
A Brief Review of Database Basics
Database services under Linux that use the software discussed in this chapter are based on a client/server model. Database clients are often used to input data and to query or display query results from the server. You can use the command line or a graphical client to access a running server. Databases generally come in two forms: flat file and relational. A flat file database can be as simple as a text file with a space, tab, or some other character delimiting different parts of the information. One example of a simple flat file database is the Fedora /etc/passwd file. Another example could be a simple address book that might look something like this:
Doe-John-505 Some Street-Anytown-NY-12345-555-555-1212
You can use standard Unix tools such as grep, awk, and perl to search for and extract information from this primitive database. Although this might work well for a small data base such as an address book that only one person uses, flat file databases of this type have several limitations:
► They do not scale well — You do not have random access to data in flat file data bases. You have only sequential access. This means that any search function has to scan each line in the file, one by one, to look for specific information. As the size of the database grows, access times increase and performance decreases.
► Flat file databases are unsuitable for multi-user environments — Depending on
how the database is set up, it either enables only one user to access it at a time or allows two users to make changes simultaneously, making changes that could end up overwriting each other and causing data loss.
These limitations obviously make the flat file database unsuitable for any kind of serious work in even a small business — much less in an enterprise environment. Relational databases, or relational database management systems (RDBMSs), to give them their full name, are good at finding the relationships between individual pieces of data. An RDBMS stores data in tables with fields much like those in spreadsheets, making the data searchable and sortable. RDBMSs are the focus of this chapter.
Oracle, DB2, Microsoft SQL Server, and the freely available PostgreSQL and MySQL are all examples of RDBMSs. The following sections discuss how relational databases work and provide a closer look at some of the basic processes involved in administering and using databases. You will also learn about SQL, the standard language used to store, retrieve, and manipulate database data.
How Relational Databases Work
An RDBMS stores data in tables, which you can visualize as spreadsheets. Each column in the table is a field; for example, a column might contain a name or an address. Each row in the table is an individual record. The table itself has a name you use to refer to that table when you want to get data out of it or put data into it. Figure 18.1 shows an example of a simple relational database that stores name and address information.
FIGURE 18.1 In this visualization of how an RDBMS stores data, the database stores four records (rows) that include name and address information, divided into seven fields (columns) of data.
In the example shown in Figure 18.1, the database contains only a single table. Most RDBMS setups are much more complex than this, with a single database containing multiple tables. Figure 18.2 shows an example of a database named sample_database that contains two tables.
In the sample_database example, the phonebook table contains four records (rows) and each record hold three fields (columns) of data. The cd_collection table holds eight records, divided into five fields of data.
If you are thinking that there is no logical relationship between the phonebook table and the cd_collection table in the sample_database example, you are correct. In a relational database, users can store multiple tables of data in a single database — even if the data in one table is unrelated to the data in others.
For example, suppose that you run a small company that sells widgets and you have a computerized database of customers. In addition to storing each customer's name, address, and phone number, you want to be able to look up outstanding order and invoice information for any of your customers. You could use three related tables in an RDBMS to store and organize customer data for just those purposes. Figure 18.3 shows an example of such a database.
FIGURE 18.2 A single database can contain two tables — in this case, phonebook and cd_collection.
In the example in Figure 18.3, we have added a Customer ID field to each customer record. This field holds a customer ID number that is the unique piece of information that can be used to link all other information for each customer to track orders and invoices. Each customer is given an ID unique to him; two customers might have the same data in their name fields, but their ID field values will never be the same. The Customer ID field data in the Orders and Overdue tables replaces the Last Name, First Name, and Shipping Address field information from the Customers table. Now, when you want to run a search for any customer's order and invoice data, you can search based on one key rather than multiple keys. You get more accurate results in faster, easier-to-conduct data searches.