We will build a database using the following datafiles and query it to answer questions about the demands on the distribution center. Each of the following datafiles is a plain ASCII file with tab-delimited fields and with field names in the first row. To make this as easy as possible, the data has been purged of serious flaws. (Warning: Some flaws remain. For example, some skus are listed as having 0 units (pieces) per carton (case) and cartons are listed as having physical dimensions of 0. For such skus we may interpret this to mean that the unit (piece) is the only unit of measure that is handled by the warehouse.)
If you already have a preferred database program, import the text files above into your database and continue with the next section. Otherwise, here are instructions for setting up hsqldb, which is fast, small, sufficient to the purpose, open source, runs on the Java platform, and includes a (rudimentary) user interface. Other open-source databases, which are generally more than we need, but still quite good, include MySQL, Derby (formerly known as Cloudscape), and Firebird.
There are many excellent on-line SQL references available; here is one and another. In addition, see the hsql documentation, which describes how hsqldb realizes the SQL standard(s).
For our purposes the most important part of SQL is the
SELECT statement.
SELECT * FROM Skus;
Then, click "Execute" to execute the query. (Note that a SQL command is terminated by a semicolon.)
SkuID and UnitWeight, type the following
into the command panel:
SELECT SkuID, UnitWeight FROM Skus;
If you would like the database to save its work for future
reference, use the SELECT...INTO... construct,
as in the following.
SELECT SkuID, UnitWeight INTO Weights FROM Skus;
This will save the results of the query as a table named
Weights, which will become part of the database (until you
choose to DROP it). To examine this table,
execute the query
SELECT * FROM Weights;
WHERE specifies the search
conditions. For example, to select all the rows corresponding
to heavy skus, execute the query:
SELECT * FROM Skus WHERE UnitWeight > 25;
If you want only the fields corresponding to sku ID and unit weight then execute the query:
SELECT Sku, UnitWeight FROM Skus WHERE UnitWeight
> 25;
SQL also allows allows use of = (equal),
<> (not equal), < (less
than), <= (less than or equal to), and
>= (greater than or equal to).
LIKE command can be used. For instance, to
find all skus with ID beginning with "AAGD", execute the query:
SELECT * FROM Skus WHERE SkuID LIKE 'AAGD%';
The % character is used as a wildcard. Note
that strings of text must be surrounded by single quotes,
whereas no quotes are used with numbers.
There are many ways to transmit SQL commands to the database. Most databases include some type of query input facility. For example, MS Access has a QueryBuilder tool; and hsqldb comes with SqlTool. SQuirreL is a nicer and more powerful input facility that is also open source and can connect to most common database engines and so can be worth learning for future use. SQuirreL also comes with a SQL syntax verifier that can help you avoid SQL errors. You can also copy results directly from SQuirreL to paste into reporting tools such as spreadsheets.
Most front ends will also allow you to submit a file of SQL commands to be executed in batch, once you have debugged them.
Use the sales history, description of skus and their locations to answer the following. (SQL queries are provided for some questions to help you get started. Sometimes the query can be rewritten to run faster: You are urged to experiment.)
--
-- This is a comment. Be sure to document each query
--
SELECT Max( UnitLength ) FROM Skus;
Check your answer: 105.5
Write similar queries for the remaining dimensions. Or you can combine multiple queries into a single one.
--
-- Sum all requests for each sku and list skus from most
popular to least popular
--
SELECT Sales.SkuID, Count(Sales.SkuID) AS TotalPicks
INTO NumberRequests
FROM Sales
GROUP BY Sales.SkuID
ORDER BY TotalPicks DESC;
Check your answer: skus PAP747-01 and SMD68670 were requested 27 times apiece.
--
-- Number of orders per day, in calendar sequence
--
SELECT Date, COUNT(Sales.OrderNumber) AS Orders
INTO OrdersPerDate
FROM Sales
GROUP BY Date
ORDER BY Date;
--
-- Number of picks per zone listed from most busy to
least
--
SELECT Locations.Zone, COUNT(Sales.SkuID) AS Picks
FROM Locations INNER JOIN Sales
ON Locations.SkuID = Sales.SkuID
GROUP BY Locations.Zone
ORDER BY Picks DESC;
Check your answer: 1,510 picks were from zone A.
--
-- Sum quantities ordered of each sku
--
SELECT SkuID, SUM(OrderQuantity) AS TotalQuantity
INTO Quantities
FROM Sales
GROUP BY SkuID;
First compute the volume of each sku and then look this up to avoid recomputing for each query. (Since dimensions were given in linear inches, we must divide by 12³ to convert to cubic feet.)
--
-- Compute volume ordered of each sku and save into a new
table
-- for future reference.
--
SELECT Skus.SkuID, Skus.VendorID, (Skus.UnitWidth *
Skus.UnitHeight * Skus.UnitLength * OrderQuantity / 1728.0)
AS CubicFeet
INTO Volumes
FROM Skus INNER
JOIN Sales ON Skus.SkuID = Sales.SkuID;
Now submit the query:
--
-- Volumes shipped by vendor (Note that we must JOIN the two
-- tables Volumes and Skus to put the information together.)
--
SELECT Skus.VendorID, SUM(Volumes.CubicFeet) AS
VendorVolume
FROM Volumes INNER JOIN Skus ON Skus.SkuID = Volumes.SkuID
GROUP BY Skus.VendorID
ORDER BY VendorVolume DESC;
Check your answer: Customer 552321 ordered the greatest volume.
--
-- To get the volume from each zone we must JOIN
-- three tables.
--
SELECT Locations.Zone, SUM(Volumes.CubicFeet) AS ZoneVolume
FROM (Locations INNER JOIN Volumes ON Locations.SkuID = Volumes.SkuID
GROUP BY Locations.Zone
ORDER by ZoneVolume;
Check your answer: The total volume picked from zone B was 1,022 cubic feet.
The following questions are harder and may require direct processing of the data files via some general programming language (Java recommended).
Thanks to Aloysius CHEN Yan Kai for help in preparing the queries on this page.