(Latest Revision: 03/01/2005)
Radix Sort: Using Queues and Structured Data
RATIONALE:
Writing this program will give you exposure to an advanced sorting algorithm,
practice using queues in a non-trivial application, and an opportunity to
invent additional data structures to suit the needs of the problem at hand.
THE ASSIGNMENT:
Write a program that
- implements a queue data type whose elements are arrays of eight characters,
- implements a set of ten of the above-described queues, giving random access
to each queue in the set, and
- uses the queue set to do a radix sort of a non-empty series of eight-digit
non-negative integers.
INPUT:
The input will consist of a series of one or more lines of text. The text on
each line will consist of one eight-digit integer, left justified. Each
integer will be non-negative. Each integer will have exactly eight digits --
no more, no less.
The program will read all input from standard input.
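As a rough illustration of the input format, the sketch below checks one line of text and copies it into an eight-character buffer. The name parseLine and the constant DIGITS are hypothetical, and the validation goes beyond what the assignment requires, since the input is promised to be well formed.

```cpp
#include <cctype>
#include <string>

const int DIGITS = 8;  // every input integer has exactly eight digits

// Hypothetical helper: copies one input line into an eight-character buffer.
// Returns false, leaving out untouched, unless the line is exactly eight
// decimal digits.
bool parseLine(const std::string& line, char out[DIGITS]) {
    if (static_cast<int>(line.size()) != DIGITS) return false;
    for (int i = 0; i < DIGITS; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(line[i]))) return false;
    }
    for (int i = 0; i < DIGITS; ++i) out[i] = line[i];
    return true;
}
```

A caller could read lines with std::getline(std::cin, line) and pass each one to parseLine. An empty line, such as one typed just before the end-of-input signal, is rejected rather than enqueued.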
SAMPLE INPUT:
54897255
70191409
58952946
07097884
92239748
32697899
48118297
67622567
16415236
61447055
01159999
34216297
OUTPUT:
The program will sort the numbers into ascending order and
print the sorted list to standard output. The output will be
formatted identically to the input.
SAMPLE OUTPUT:
01159999
07097884
16415236
32697899
34216297
48118297
54897255
58952946
61447055
67622567
70191409
92239748
HOW THE USER WILL OPERATE THE PROGRAM:
Suppose the executable program is called rsort. If the user wants to
enter the input from the keyboard, s/he just enters the command:
rsort
and then enters the desired series of integers. After entering the last
integer, the user presses the Enter key and then types Control-D to signal
the end-of-input.
Suppose the user wants the input to come from a prepared file called
input01. In that case s/he just enters:
rsort < input01
In this case, where redirection of standard input to a file is done, there is
no need to enter the end-of-input signal.
The user may wish to save the output of the program to a file. In that case,
s/he can use this command:
rsort < input01 > output01
When I test your program, I will test it on several prepared files. Some
input files will contain a large quantity of integers to sort. Other files
will be small.
THE ALGORITHM:
There are ten queues: queue[0], queue[1], ..., queue[9].
Each input integer is an eight-digit number. The radix sort makes one
pass for each of these eight digits.
The first pass sorts according to the least significant digit (ones-digit).
In this pass, the program inserts each number into the queue corresponding to
the number's least significant digit. In the 7 succeeding passes the program
takes numbers off the queues and re-queues them according to their
"tens-digit", "hundreds-digit", and so on, finishing with their most
significant digit ("ten-millions-digit"). When it performs one of these passes
the program
first takes all the elements from the previous pass off queue[0] and re-queues
them. It then does the same thing for queue[1], queue[2], and so on, up to
queue[9]. After the eighth and final pass, the program dequeues the numbers
and writes them to the output. They come out in sorted order. (If you are
"still confused" you can look at this example. I'll discuss the example in
class.)
Besides the queue set and one array of 8 characters to use as an input
buffer, your program will not need any additional storage to hold the
integers to be sorted. Immediately after inputting each individual
integer, the program may enqueue it according to the value of its
ones-digit. After the last pass of the sort is complete the integers will
be in the queues, distributed according to their "ten-millions-digit". The
program can then write the sorted list to the standard output merely by
emptying the queues, starting at the 0-queue and proceeding in numerical
order to the 9-queue.
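The passes described above can be sketched as follows. This is only an outline under several assumptions: std::queue stands in for the linked-list queue, the names Number, Queues and radixSort are hypothetical, and vectors carry the input and output purely so the function is easy to test. To separate one pass from the next, this sketch records how many numbers each queue holds before the pass begins. That is a substitute for the marker technique the assignment calls for, which is described in the next paragraph.

```cpp
#include <array>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

using Number = std::array<char, 8>;                 // one integer, stored as characters
using Queues = std::array<std::queue<Number>, 10>;  // stand-in for the queue set

// Sorts eight-digit strings with one pass per digit, least significant first.
// Assumes every string holds exactly eight decimal digits.
std::vector<std::string> radixSort(const std::vector<std::string>& input) {
    Queues queues;

    // Pass 1: enqueue each number by its ones-digit (index 7).
    for (const std::string& s : input) {
        Number n;
        for (int i = 0; i < 8; ++i) n[i] = s[i];
        queues[n[7] - '0'].push(n);
    }

    // Passes 2 to 8: re-queue by the tens-digit (index 6) and so on,
    // ending with the ten-millions-digit (index 0).
    for (int pos = 6; pos >= 0; --pos) {
        // Remember each queue's length at the start of the pass so that
        // numbers re-queued during this pass are not moved a second time.
        std::array<std::size_t, 10> count;
        for (int q = 0; q < 10; ++q) count[q] = queues[q].size();

        for (int q = 0; q < 10; ++q) {
            for (std::size_t k = 0; k < count[q]; ++k) {
                Number n = queues[q].front();
                queues[q].pop();
                queues[n[pos] - '0'].push(n);
            }
        }
    }

    // Empty queue 0 through queue 9 in order to obtain the sorted list.
    std::vector<std::string> sorted;
    for (int q = 0; q < 10; ++q) {
        while (!queues[q].empty()) {
            const Number& n = queues[q].front();
            sorted.emplace_back(n.data(), n.size());
            queues[q].pop();
        }
    }
    return sorted;
}
```

Because each queue hands numbers back in the order it received them, numbers that agree on the current digit keep the relative order established by the earlier passes. That property is what makes the final result fully sorted.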
One element is still missing from the description of the algorithm. Since the
program takes integers out of queues and re-enqueues them into the same set of
queues, it needs a way to tell the integers left over from the previous pass
apart from those enqueued during the current pass. To do this, we enqueue a
marker in each queue before beginning a new pass. When the program dequeues a
marker, it knows that it has finished dequeueing all the integers that were in
that queue at the start of the pass.
For the marker, you may choose any eight-character array that cannot be
confused with a list element. For example, the array ZZZZZZZZ could be the
marker.
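A single marker-delimited pass might look like the sketch below. It is an illustration only: std::queue<std::string> stands in for the linked-list queue of eight-character arrays, and the names Queues, MARKER and requeuePass are hypothetical.

```cpp
#include <array>
#include <queue>
#include <string>

using Queues = std::array<std::queue<std::string>, 10>;  // stand-in for the queue set

const std::string MARKER = "ZZZZZZZZ";  // cannot be mistaken for eight digits

// Performs one pass: re-queues every number according to the digit at index
// pos (0 = most significant digit, 7 = least significant digit).
void requeuePass(Queues& queues, int pos) {
    // Put a marker at the back of every queue before moving anything.
    for (auto& q : queues) q.push(MARKER);

    for (auto& q : queues) {
        // Everything ahead of the marker was in this queue when the pass began.
        while (q.front() != MARKER) {
            std::string n = q.front();
            q.pop();
            queues[n[pos] - '0'].push(n);  // lands behind that queue's marker, if any
        }
        q.pop();  // discard the marker; what remains arrived during this pass
    }
}
```

Each queue is guaranteed to contain its marker until the loop reaches it, so the call to front() never touches an empty queue.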
IMPLEMENTATION OF THE NUMBERS AND QUEUES:
The program must read each eight-digit integer as a sequence of eight
characters, not as an integer variable. The program must internally
represent each eight-digit integer as an array of eight characters. The
program has to select individual digits from each integer as part of the
sort. The value of the digit determines which queue the integer will be
inserted into next. When we represent the integers as arrays of
characters, it makes it particularly easy to perform this digit selection.
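For instance, digit selection on a character array can be as small as the sketch below. The helper name digitAt and its pass numbering (0 for the first pass) are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical helper: returns the digit examined on a given pass as a queue
// index from 0 to 9. Pass 0 selects the ones-digit (array index 7) and
// pass 7 selects the ten-millions-digit (array index 0).
int digitAt(const char number[8], int pass) {
    assert(pass >= 0 && pass < 8);
    return number[7 - pass] - '0';  // character code minus '0' gives the digit value
}
```

With the number 70191409, digitAt returns 9 on pass 0, 0 on pass 1, and 7 on pass 7.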
Use a linked list implementation of the queue data type. I recommend that
you use the queue ADT header file and queue ADT implementation file in the
assignment directory.
In order to use the recommended queue implementation, you will have to
define an itemClass in an item.h file or modify the QueueP.h file slightly
to make the queue element type an array of eight characters.
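One possible shape for such an item type is sketched below. It is a guess at what could work, not the contents of the sample header: the struct layout and the helpers makeItem and sameItem are hypothetical. The motivation is that a bare char[8] cannot be copied by assignment in C++, whereas a struct that contains the array can, and queue code typically copies items that way.

```cpp
#include <cstring>

const int DIGITS = 8;

// Hypothetical item type: wraps the eight characters in a struct so that
// queue code can copy an item with ordinary assignment.
struct itemClass {
    char digits[DIGITS];
};

// Hypothetical helper: builds an item from the first eight characters of src.
itemClass makeItem(const char* src) {
    itemClass item;
    std::memcpy(item.digits, src, DIGITS);
    return item;
}

// Hypothetical helper: true when two items hold the same eight characters.
// Useful for recognizing an end-of-pass marker.
bool sameItem(const itemClass& a, const itemClass& b) {
    return std::memcmp(a.digits, b.digits, DIGITS) == 0;
}
```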
You may want to get some ideas from this sample header file for an item class.
Since there are ten related queues to be implemented, they must be
organized into an appropriate data structure. An array of ten queues will
work well. This sample header file for a "queue set" has some ideas that you
can consider using.
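To make the idea of an array of ten queues concrete, here is a minimal sketch of a queue set. It does not reproduce the sample header: the class name QueueSet and its member functions are hypothetical, and std::queue<std::string> stands in for the linked-list queue of eight-character items.

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>

// Hypothetical queue set: ten queues addressed by a digit value from 0 to 9.
class QueueSet {
public:
    // Adds item to the back of the chosen queue.
    void enqueue(int index, const std::string& item) {
        assert(index >= 0 && index < 10);
        queues_[index].push(item);
    }

    // Removes and returns the front item of the chosen queue.
    std::string dequeue(int index) {
        assert(index >= 0 && index < 10 && !queues_[index].empty());
        std::string item = queues_[index].front();
        queues_[index].pop();
        return item;
    }

    // Reports whether the chosen queue currently holds no items.
    bool isEmpty(int index) const {
        assert(index >= 0 && index < 10);
        return queues_[index].empty();
    }

private:
    std::array<std::queue<std::string>, 10> queues_;
};
```

Indexing the set by digit value gives the random access the assignment asks for: the digit selected from a number is used directly as the queue index.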
ADDITIONAL SPECIFICATIONS:
Before you do any assignment for me, you need to read the programming
assignment rules. This document contains my general rules regarding form and
style. The document also describes my grading criteria.
You also need to consult the other documents here:
http://www.cs.csustan.edu/~john/Classes/General_Info/progAsgRules/
to make sure you are correctly applying the top-down design methodology and to
make sure that you are including everything necessary in the programs and
scripts you send me.
HELP WITH TESTING:
There is a makeList program in the assignment directory that you can use to
generate lists of integers to sort.
When you test the algorithm on a long list of integers you can
use the unix sort and diff commands to verify
that your program sorted the list correctly.
For example, suppose that you have a file called data
containing 1000 integers in random order. Suppose that the
name of your executable radix sort program is rsort.
If you execute the following commands
rsort < data > myoutput
sort -n < data > sortoutput
diff myoutput sortoutput
then rsort sorted the numbers correctly if and only if
diff found no differences between myoutput
and sortoutput. (When there are no differences
between the files, diff has no output at all, or
possibly just some blank lines.)
I expect to see your use of rsort and diff in
the test scripts you make.
MORE HELP:
Come to class to get help and hints.
WHAT TO TURN IN:
You will turn in two "phases" of this assignment:
- a level 2 version, and
- a final version.
For each phase of the assignment, you will turn in a printer output
(hardcopy) and you will send me an e-mail message.
Please follow these rules:
- Always send me e-mail as plain text in the main message body.
Never send me attachments.
- Always use the exact subject line I specify for each
message. (I often get hundreds of e-mail messages in a week. The
subject line allows me to find, filter and sort messages.) You will lose
a significant number of points on the assignment if you use the wrong
subject line.
- Be very careful when you send the e-mail. You may use the
instructions in your Hello World! lab exercise for guidance. Of course, you
will need to make the obvious changes to those directions -- you have to use
the correct subject line and filename.
- Always send yourself a copy of each e-mail message you send to me,
and check immediately to see if you receive the message intact.
You are responsible for sending e-mail correctly.
Here is the list of things you have to turn in:
- At the start of class on the first due date, place the following item on
  the "counter" in front of me:
  - a hardcopy of your level 2 (or greater) program (all the source code,
    i.e. all the *.h and *.cpp files). Make sure all the code is properly
    formatted and that it all shows on the paper.
- Using the subject line: CS3100,prog2.2 send the following item to me by
  e-mail before midnight on the first due date:
  One shell archive file (only one) containing items 1-4.
  1. All source files for your level 2 program (everything I will need to
     compile and run it: all *.cpp files and *.h files, including Queue code)
  2. Your test script showing adequate testing of your level 2 program.
  3. A file named 'README' containing the compilation command one should
     use to compile your program.
  4. A copy of your 'makefile' if you used one.
- At the start of class on the second due date, place the following item on
  the "counter" in front of me:
  - a hardcopy of your final level program (all the source code). Make sure
    all the code is properly formatted and that it all shows on the paper.
- Using the subject line: CS3100,prog2.f send the following item to me by
  e-mail before midnight on the second due date:
  One shell archive file (only one) containing items 1-4.
  1. All source files for your final level program (everything I will need to
     compile and run it: all *.cpp files and *.h files, including Queue code)
  2. Your test script showing adequate testing of your final level program.
  3. A file named 'README' containing the compilation command one should
     use to compile your program.
  4. A copy of your 'makefile' if you used one.
Note that there are no spaces in the subject lines given. It is important
that you do not insert any spaces. My e-mail address is: john@ishi.csustan.edu.
DUE DATES:
For due dates, see the class schedule.