:year (parse-year (year id3))
:id3-size (size id3))))
You don't have to worry about normalizing the values since insert-row takes care of that for you. You do, however, have to convert the string values returned by the track and year into numbers. The track number in an ID3 tag is sometimes stored as the ASCII representation of the track number and sometimes as a number followed by a slash followed by the total number of tracks on the album. Since you care only about the actual track number, you should use the :end argument to PARSE-INTEGER to specify that it should parse only up to the slash, if any.[294]
(defun parse-track (track)
(when track (parse-integer track :end (position #\/ track))))
(defun parse-year (year)
(when year (parse-integer year)))
Finally, you can put all these functions together, along with walk-directory from the portable pathnames library and mp3-p from the ID3v2 library, to define a function that loads an MP3 database with data extracted from all the MP3 files it can find under a given directory.
(defun load-database (dir db)
(let ((count 0))
(walk-directory
dir
#'(lambda (file)
(princ #\.)
(incf count)
(insert-row (file->row file) db))
:test #'mp3-p)
(format t "~&Loaded ~d files into database." count)))
Querying the Database
Once you've loaded your database with data, you'll need a way to query it. For the MP3 application you'll need a slightly more sophisticated query function than you wrote in Chapter 3. This time around you want not only to be able to select rows matching particular criteria but also to limit the results to particular columns, to limit the results to unique rows, and perhaps to sort the rows by particular columns. In keeping with the spirit of relational database theory, the result of a query will be a new table object containing the desired rows and columns.
The query function you'll write, select, is loosely modeled on the SELECT statement from Structured Query Language (SQL). It'll take five keyword parameters: :from, :columns, :where, :distinct, and :order-by. The :from argument is the table object you want to query. The :columns argument specifies which columns should be included in the result. The value should be a list of column names, a single column name, or a T, the default, meaning return all columns. The :where argument, if provided, should be a function that accepts a row and returns true if it should be included in the results. In a moment, you'll write two functions, matching and in, that return functions appropriate for use as :where arguments. The :order-by argument, if supplied, should be a list of column names; the results will be sorted by the named columns. As with the :columns argument, you can specify a single column using just the name, which is equivalent to a one-item list containing the same name. Finally, the :distinct argument is a boolean that says whether to eliminate duplicate rows from the results. The default value for :distinct is NIL.
Here are some examples of using select:
;; Select all rows where the :artist column is "Green Day"
(select :from *mp3s* :where (matching *mp3s* :artist "Green Day"))
;; Select a sorted list of artists with songs in the genre "Rock"
(select
:columns :artist
:from *mp3s*
:where (matching *mp3s* :genre "Rock")
:distinct t
:order-by :artist)
The implementation of select with its immediate helper functions looks like this:
(defun select (&key (columns t) from where distinct order-by)
(let ((rows (rows from))
(schema (schema from)))
(when where
(setf rows (restrict-rows rows where)))
(unless (eql columns 't)
(setf schema (extract-schema (mklist columns) schema))
(setf rows (project-columns rows schema)))
(when distinct
(setf rows (distinct-rows rows schema)))
(when order-by
(setf rows (sorted-rows rows schema (mklist order-by))))
(make-instance 'table :rows rows :schema schema)))
(defun mklist (thing)
(if (listp thing) thing (list thing)))
(defun extract-schema (column-names schema)
(loop for c in column-names collect (find-column c schema)))
(defun find-column (column-name schema)
(or (find column-name schema :key #'name)
(error "No column: ~a in schema: ~a" column-name schema)))
(defun restrict-rows (rows where)
(remove-if-not where rows))
(defun project-columns (rows schema)
(map 'vector (extractor schema) rows))
(defun distinct-rows (rows schema)
(remove-duplicates rows :test (row-equality-tester schema)))
(defun sorted-rows (rows schema order-by)
(sort (copy-seq rows) (row-comparator order-by schema)))
Of course, the really interesting part of select is how you implement the functions extractor, row-equality-tester, and row-comparator.
As you can tell by how they're used, each of these functions must return a function. For instance, project-columns uses the value returned by extractor as the function argument to MAP. Since the purpose of project-columns is to return a set of rows with only certain column values, you can infer that extractor returns a function that takes a row as an argument and returns a new row containing only the columns specified in the schema it's passed. Here's how you can implement it:
(defun extractor (schema)
(let ((names (mapcar #'name schema)))
#'(lambda (row)
(loop for c in names collect c collect (getf row c)))))
Note how you can do the work of extracting the names from the schema outside the body of the closure: since the closure will be called many times, you want it to do as little work as possible each time it's called.
The functions row-equality-tester and row-comparator are implemented in a similar way. To decide whether two rows are equivalent, you need to apply the appropriate equality predicate for each column to the appropriate column values. Recall from Chapter 22 that the LOOP clause always will return NIL as soon as a pair of values fails their test or will cause the LOOP to return T.
(defun row-equality-tester (schema)
(let ((names (mapcar #'name schema))
294
If any MP3 files have malformed data in the track and year frames, PARSE-INTEGER could signal an error. One way to deal with that is to pass PARSE-INTEGER the :junk-allowed argument of T, which will cause it to ignore any non-numeric junk following the number and to return NIL if no number can be found in the string. Or, if you want practice at using the condition system, you could define an error and signal it from these functions when the data is malformed and also establish a few restarts to allow these functions to recover.