Selasa, 29 Januari 2008

Examples using structs

17 Examples using structs
This chapter contains a few simple examples illustrating the use of structs in programs.
In future programs, you will use structs mainly when communicating with existing C
libraries. For example, you will get to write code that uses the graphics primitives on
your system. The "applications programmer interface" (API) for the graphics will be
defined by a set of C functions (in some cases based on an earlier interface defined by a
set of Pascal functions). If you a programming on Unix, you will probably be using the
Xlib graphics package, on an Apple system you would be using Quickdraw, and on an
Intel system you would use a Windows API. The functions in the API will make
extensive use of simple structs. Thus, the Xlib library functions use instances of the
following structs:
typedef struct {
short x, y;
} XPoint;
typedef struct {
short x, y;
unsigned short width, height;
} XRectangle;
and
typedef struct {
short x, y;
unsigned short width, height;
short angle1, angle2;
} XArc;
So, if you are using the Xlib library, your programs will also define and use variables
that are instances of these struct types.
For new application-specific code you will tend to use variables of class types more
often than variables of struct types.
17
Using structs with
existing libraries
Your own structs?
476 Examples using structs
The examples in this chapter are intended merely to illustrate how to declare struct
types, use instances of these types, access fields and so forth. The example programs in
section 17.3, introduce the idea of a "file of records". These programs have a single
struct in memory and a whole set of others in a disk file. The struct in memory gets
loaded with data from a requested record on disk. Record files are extremely common
in simple business data processing applications, and represent a first step towards the
more elaborate "databases" used in industry. You will learn more about "files of
records" and "databases" in later subjects. While you can continue to use C/C++ when
working with database systems, you are more likely to use specialized languages like
COBOL and SQL.
17.1 REORDERING THE CLASS LIST AGAIN
The example in section 13.2 illustrated a "sorting" algorithm that was applied to reorder
pupil records. The original used two arrays, pupils and marks:
typedef char Name[40];
Name pupils[] = {
"Armstrong, Alice S.",

"Zarra, Daniela"
};
int marks[] = {
57,

78
};
But the data in the two arrays "belong together" – Miss Armstrong's mark is 57 and she
shouldn't be separated from it.
This is a typical case where the introduction of a simple struct leads to code that
reflects the problem structure more accurately (and is also slightly clearer to read). The
struct has to package together a name and a mark:
struct PupilRec {
Name fName;
int fMark;
};
(Naming conventions again: use 'f' as the first character of the name of a data member
in all structs and classes that are defined for a program.)
With this struct declared, an initialized array of PupilRecs can be defined:
Reordering the class list 477
PupilRec JuniorYear[] = {
{ "Armstrong, Alice S.", 57 } ,
{ "Azur, Hassan M.", 64 } ,
{ "Bates, Peter", 33 } ,

{ "Zarra, Daniela", 81 }
};
The function MakeOrderedCopy() can be rewritten to use these PupilRec structs:
void MakeOrderedCopy(const PupilRec orig[],int num,
PupilRec reord[])
{
int mark_count[101];
for(int i = 0; i< 101; i++)
mark_count[i] = 0;
// Count number of occurrences of each mark
for(i = 0; i < num; i++) {
int mark = orig[i].fMark;
mark_count[mark]++;
}
// Make that count of number of pupils with marks less than
// or equal to given mark
for(i=1; i<101; i++)
mark_count[i] += mark_count[i-1];
for(i=num - 1; i >= 0; i--) {
int mark = orig[i].fMark;
int position = mark_count[mark];
position--; // correct to zero based array
// copy data
reord[position] = orig[i];
mark_count[mark]--;
}
}
The function takes two arrays of PupilRec structs. The orig parameter is the array
with the records in alphabetic order; it is an "input parameter" so it is specified as
const. The reord parameter is the array that is filled with the reordered copy of the
data.
An expression like:
orig[i].fMark
illustrates how to access a data member of a chosen element from an array of structures.
Note the structure assignment:
Arrays of structs as
"input" and
"output" parameters
Accessing data
member of one of
structs in the array
Assignment of structs
478 Examples using structs
reord[position] = orig[i];
The compiler "knows" the size of structs so allows struct assignment while assignment
of arrays is not allowed.
17.2 POINTS AND RECTANGLES
The following structs and functions illustrate the functionality that you will find in the
graphics libraries for your system. Many of the basic graphics functions will use points
and rectangles; for example, a line drawing function might take a "start point" and an
"end point" as its parameters, while a DrawOval() function might require a rectangle to
frame the oval.
The graphics packages typically use short integers to represent coordinate values. A
"point" will need two short integer data fields:
struct Point {
short fx;
short fy;
};
A struct to represent a rectangle can be defined in a number of alternative ways,
including:
struct Rectangle {
short ftop;
short fleft;
short fwidth;
short fheight;
};
and
struct Rectangle {
Point fcorner;
short fwidth;
short fheight;
};
Provided that struct Point has been declared before struct Rectangle, it is
perfectly reasonable for struct Rectangle to have data members that are instances
of type Point. The second declaration will be used for the rest of the examples in this
section.
The graphics libraries define numerous functions for manipulating points and
rectangles. The libraries would typically include variations of the following functions:
Points and Rectangles 479
int Point_In_Rect(const Point& pt, const Rectangle& r);
Returns "true" (1) if point pt lies with Rectangle r.
int Equal_Points(const Point& p1, const Point& p2);
Returns "true" if points p1 and p2 are the same.
void Add_Point(const Point& p1, const Point& p2, Point& res);
Changes the "output parameter" res to be the 'vector sum'
of points p1 and p2.
Point MidPoint(const Point& p1, const Point& p2);
Returns the mid point of the given points.
int ZeroRect(const Rectangle& r)
Return "true" if r's width and height are both zero.
Rectangle UnionRect(const Rectangle& r1, const Rectangle& r2);
Returns the smallest circumscribing rectangle of the given
rectangles r1 and r2.
Rectangle IntersectRect(const Rectangle& r1,
const Rectangle& r2);
Returns a rectangle that represents the intersection of the
given rectangles, or a zero rectangle if they don't
intersect.
Rectangle Points2Rect(const Point& p1, const Point& p2);
Returns a rectangle that includes both points p1 and p2
within its bounds or at least on its perimeter
These functions are slightly inconsistent in their prototypes, some return structs as
results while others have output parameters. This is deliberate. The examples are
meant to illustrate the different coding styles. Unfortunately, it is also a reflection of
most of the graphics libraries! They tend to lack consistency. If you were trying to
design a graphics library you would do better to decide on a style; functions like
Add_Point() and MidPoint() should either all have struct return types or all have
output parameters. In this case, there are reasons to favour functions that return structs.
Because points and rectangles are both small, it is acceptable for the functions to return
structs as results. Code using functions that return structs tends to be a little more
readable than code using functions with struct output parameters.
This example is like the "curses" and "menu selection" examples in Chapter 12. We
need a header file that describes the facilities of a "points and rectangles" package, and
an implementation file with the code of the standard functions. Just so as to illustrate
the approach, some of the simpler functions will be defined as "inline" and their
definitions will go in the header file.
The header file would as usual have its contents bracketed by conditional
compilation directives:
480 Examples using structs
#ifndef __MYCOORDS__
#define __MYCOORDS__
the interesting bits
#endif
As explained in section 12.1, these conditional compilation directives protect against
the possibility of erroneous multiple inclusion.
The structs would be declared first, then the function prototypes would be listed.
Any "inline" definitions would come toward the end of the file:
#ifndef __MYCOORDS__
#define __MYCOORDS__
struct Point {
short fx;
short fy;
};
struct Rectangle {

};
int Point_In_Rect(const Point& pt, const Rectangle& r);

inline int Equal_Points(const Point& p1, const Point& p2)
{
return ((p1.fx == p2.fx) && (p1.fy == p2.fy));
}
inline int ZeroRect(const Rectangle& r)
{
return ((r.fwidth == 0) && (r.fheight == 0));
}
#endif
"Inline" functions have to be defined in the header. After all, if the compiler is to
replace calls to the functions by the actual function code it must know what that code is.
When compiling a program in another file that uses these functions, the compiler only
knows the information in the #included header.
The definitions of the other functions would be in a separate mycoords.cp file:
#include "mycoords.h"
Header file
"mycoords.h"
inline functions
defined in the header
file
Points and Rectangles 481
int Point_In_Rect(const Point& pt, const Rectangle& r)
{
// Returns "true" if point pt lies with Rectangle r.
if((pt.fx < r.fcorner.fx) || (pt.fy < r.fcorner.fy))
return 0;
int dx = pt.fx - r.fcorner.fx;
int dy = pt.fy - r.fcorner.fy;
if((dx > r.fwidth) || (dy > r.fheight))
return 0;
return 1;
}
A rectangle has a Point data member which itself has int fx, fy data members.
The name of the data member that represents the x-coordinate of the rectangle's corner
is built up from the name of the rectangle variable, e.g. r1, qualified by the name of its
point data member fcorner, qualified by the name of the field, fx.
Function Add_Point() returns its result via the res argument:
void Add_Point(const Point& p1, const Point& p2, Point& res)
{
// Changes the "output parameter" res to be the 'vector sum'
// of points p1 and p2.
res.fx = p1.fx + p2.fx;
res.fy = p1.fy + p2.fy;
}
while MidPoint() returns its value via the stack (it might as well use Add_Point() in
its implementation).
Point MidPoint(const Point& p1, const Point& p2)
{
// Returns the mid point of the given points.
Point m;
Add_Point(p1, p2, m);
m.fx /= 2;
m.fy /= 2;
return m;
}
Function Points2Rect() is typical of the three Rectangle functions:
Rectangle Points2Rect(const Point& p1, const Point& p2)
{
// Returns a rectangle that includes both points p1 and p2
// within its bounds or at least on its perimeter
Note multiply
qualified names of
data members
482 Examples using structs
Rectangle r;
int left, right, top, bottom;
if(p1.fx < p2.fx) { left = p1.fx; right = p2.fx; }
else { left = p2.fx; right = p1.fx; }
if(p1.fy < p2.fy) { top = p1.fy; bottom = p2.fy; }
else { top = p2.fy; bottom = p1.fy; }
r.fcorner.fx = left;
r.fcorner.fy = top;
r.fwidth = right - left;
r.fheight = bottom - top;
return r;
}
If you were developing a small library of functions for manipulation of points and
rectangles, you would need a test program like:
#include
#include "mycoords.h"
int main()
{
Point p1;
p1.fx = 1;
p1.fy = 7;
Rectangle r1;
r1.fcorner.fx = -3;
r1.fcorner.fy = -1;
r1.fwidth = 17;
r1.fheight = 25;
if(Point_In_Rect(p1, r1))
cout << "Point p1 in rect" << endl;
else cout << "p1 seems to be lost" << endl;
Point p2;
p2.fx = 11;
p2.fy = 9;
Point p3;
Add_Point(p1, p2, p3);
cout << "I added the points, getting p3 at ";
cout << p3.fx << ", " << p3.fy << endl;
Point p4;
p4.fx = 12;
Points and Rectangles 483
p4.fy = 16;
if(Equal_Points(p3, p4))
cout << "which is where I expected to be" << endl;
else cout << "which I find surprising!" << endl;


Rectangle r4 = UnionRect(r2, r3);
cout << "I made rectangle r4 to have a corner at ";
cout << r4.fcorner.fx << ", " << r4.fcorner.fy << endl;
cout << "and width of " << r4.fwidth <<
", height " << r4.fheight << endl;
Rectangle r5 = IntersectRect(r4, r1);
cout << "I made rectangle r5 to have a corner at ";
cout << r5.fcorner.fx << ", " << r5.fcorner.fy << endl;
cout << "and width of " << r5.fwidth <<
", height " << r5.fheight << endl;
return 0;
}
The test program would have calls to all the defined functions and would be organized
to check the results against expected values as shown.
You may get compilation errors relating to struct Point. Some IDEs
automatically include the system header files that define the graphics functions, and
these may already define some other struct Point. Either find how to suppress this
automatic inclusion of headers, or change the name of the structure to Pt.
17.3 FILE OF RECORDS
Rather than writing their own program, anyone who really wants to keep files of
business records would be better off using the "database" component of one of the
standard "integrated office packages". But someone has to write the "integrated office
package" of the future, maybe you. So you had better learn how to manipulate simple
files of records.
The idea of a file of records was briefly previewed in section 9.6 and is again
illustrated in Figure 17.1.
A file record will be exactly the same size as a memory based struct. Complete
structs can be written to, and read from, files. The i/o transfer involves simply copying
bytes (there is no translation between internal binary forms of data and textual strings).
The operating system is responsible for fitting the records into the blocks on disk and
keeping track of the end of the file.
484 Examples using structs
"Record structured" file:
Customer-1 Customer-2 Customer-3
End of file
Get (and/or) Put
pointer
Figure 17.1 A file of "customer records"
Files of record can be processed sequentially. The file can be opened and records
read one by one until the end of file marker is reached. This is appropriate when you
want to all records processed. For example, at the end of the month you might want to
run through the complete file identifying customers who owed money so that letters
requesting payment could be generated and dispatched.
Files of records may also be processed using "random access". Random access does
not mean that any data found at random will suffice (this incorrect explanation was
proposed by a student in an examination). The "randomness" lies in the sequence in
which records get taken from file and used. For example, if you had a file with 100
customer records, you might access them in the "random" order 25, 18, 49, 64, 3, ….
File systems allow "random access" because you can move the "get" (read) or "put"
(write) pointer that the operating system associates with a file before you do the next
read or write operation. You can read any chosen record from the file by moving the
get pointer to the start of that record. This is easy provided you know the record
number (imagine that the file is like an array of records, you need the index number into
this array). You simply have to multiply the record number by the size of the record,
this gives the byte position where the "get pointer" has to be located.
You have to be able to identify the record that you want. You could do something
like assign a unique identifying number (in the range 0…?) to each customer and
require that this is specified in all correspondence, or you could make use of an
auxiliary table (array) of names and numbers.
You would want to use "random access" if you were doing something like taking
new orders as customers came to, or phoned the office. When a customer calls, you
want to be able to access their record immediately, you don't want to read all the
preceding records in the file. (Of course reading the entire file wouldn't matter if you
have only 100 records; but possibly you have ambitions and wish to grow to millions.)
Sequential access
Random access
File of records 485
New iostream and
fstream features
Working with record files and random access requires the use of some additional
facilities from the iostream and fstream libraries.
First, the file that holds the data records will be an "input-output" file. If you need to
do something like make up an order for a customer, you need to read (input) the
customer's record, change it, then save the changed record by writing it back to file
(output). Previously we have used ifstream objects (for inputs from file) and
ofstream objects (for outputs to file), now we need an fstream object (for
bidirectional i/o). We will need an fstream variable:
fstream gDataFile;
which will have to be connected to a file by an open() request:
gDataFile.open(gFileName, ios::in | ios::out );
The open request would specify ios::in (input) and ios::out (output); in some
environments, the call to open() might also have to specify ios::binary. As this file
is used for output, the operating system should create the file if it does not already exist.
Next, we need to use the functions that move the "get" and "put" pointers.
(Conceptually, these are separate pointers that can reference different positions in the
file; in most implementations, they are locked together and will always refer to the same
position.) The get pointer associated with an input stream can be moved using the
seekg() function; similarly, the put pointer can be moved using the seekp() function.
These functions take as arguments a byte offset and a reference point. The reference
point is defined using an enumerated type defined in the iostream library; it can be
ios::beg (start of file), ios::cur (current position), or ios::end (end of the file).
So, for example, the call:
gDataFile.seekg(600, ios::beg);
would position the get pointer 600 bytes after the beginning of the file; while the call:
gDataFile.seekp(0, ios::end);
would position the put pointer at the end of the file.
The libraries also have functions that can be used to ask the positions of these
pointers; these are the tellp() and tellg() functions. You can find the size of a file,
and hence the number of records in a record file, using the following code:
gDataFile.seekg(0, ios::end);
long pos = gDataFile.tellg();
gNumRecs = pos / sizeof(Customer);
fstream
Positioning the
get/put pointers
Finding your current
position in a file
486 Examples using structs
The call to seekg() moves the get pointer to the end of the file; the call to tellg()
returns the byte position where the pointer is located, and so gives the length of the file
(i.e. the number of bytes in the file). The number of records in the file can be obtained
by dividing the file length by the record size.
Data bytes can be copied between memory and file using the read and write
functions:
read(void *data, int size);
write(const void *data, size_t size);
(The prototypes for these functions may be slightly different in other versions of the
iostream library. The type size_t is simply a typedef equivalent for unsigned int.)
These functions need to be told the location in memory where the data are to be placed
(or copied from) and the number of bytes that must be transferred.
The first argument is a void*. In the case of write(), the data bytes are copied
from memory to the disk so the memory data are unchanged, so the argument is a const
void*. These void* parameters are the first example that we've seen of pointers.
(Actually, the string library uses pointers as arguments to some functions, but we
disguised them by changing the function interfaces so that they seemed to specify
arrays).
A pointer variable is a variable that contains the memory address of some other data
element. Pointers are defined as derived types based on built in or programmer defined
struct (and class) types:
int *iptr;
double *dptr;
Rectangle *rptr;
and, as a slightly special case:
void *ptr_to_somedata;
These definitions make iptr a variable that can hold the address of some integer data
item, dptr a variable that holds the address of a double, and rptr a variable that holds
the address where a Rectangle struct is located.
The variable, ptr_to_somedata, is a void*. This means that it can hold the
address of a data item of any kind. (Here void is not being used to mean empty, it is
more that it is "unknown" or at least "unspecified").
As any C programmer who has read this far will have noted, we have been carefully
avoiding pointers. Pointers are all right when you get to know them, but they can be
cruel to beginners. From now on, almost all your compile time and run-time errors are
going to relate to the use of pointers.
But in these calls to the read() and write() functions, there are no real problems.
All that these functions require is the memory address of the data that are involved in
Read and write
transfers
Memory address
"Pointers"
File of records 487
the transfer. In this example, that is going to mean the address of some variable of a
struct type Customer.
In C and C++, you can get the address of any variable by using the & "address-of"
operator. Try running the following program (addresses are by convention displayed in
hexadecimal rather than as decimal numbers):
float pi = 3.142;
int array[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int main()
{
int data = 101;
void *ptr;
ptr = &data;
cout << "data is at " << hex << ptr << endl;
ptr = π
cout << "pi is at " << hex << ptr << endl;
ptr = &(array[2]);
cout << "array[2] is at " << hex << ptr << endl;
ptr = &(array[3]);
cout << "array[3] is at " << hex << ptr << endl;
return 0;
}
You should be able to relate the addresses that you get to the models that you now have
for how programs and data are arranged in memory. (If you find it hard to interpret the
hexadecimal numbers, switch back to decimal outputs.)
The & operator is used to get the addresses that must be passed to the read() and
write() functions. If we have a variable:
Customer c;
we can get it loaded with data from a disk file using the call:
gDataFile.read(&c, sizeof(Customer));
or we can save its contents to disk by the call:
gDataFile.write(&c, sizeof(Customer));
(Some compilers and versions of the iostream library may require "(char*)" before the
& in these calls. The & operator and related matters are discussed in Chapter 20.)
The & "address-of"
operator
488 Examples using structs
Specification
Write a program that will help a salesperson keep track of customer records.
The program is to:
1 Maintain a file with records of customers. These records are to include the
customer name, address, postcode, phone number, amount owed, and details of any
items on order.
2 Support the following options:
• list all customers
• list all customers who currently owe money
• show the record of a named customer
• create a record for a new customer
• fill in an order for a customer (or clear the
record for goods delivered and paid for)
• quit.
3 Have a list of all products stocked, along with their costs, and should provide a
simple way for the salesperson to select a product by name when making up a
customer order.
The program is not to load the entire contents of the data file into memory. Records are
to be fetched from the disk file when needed. The file will contain at most a few
hundred records so sophisticated structures are not required.
Design
The overall structure of the program will be something like:
Open the file and do any related initializations
Interact with the salesperson, processing commands until
the Quit command is entered
Close the file
These steps obviously get expanded into three main functions.
The "Close File" function will probably be trivial, maybe nothing more than actually
closing the file. Apart from actually opening the file (terminating execution if it won't
open) the "Open File" function will probably have to do additional work, like
determining how many records exist. During design of the main processing loop, we
may identify data that should be obtained as the file is opened. So, function "open file"
should be left for later consideration.
The main processing loop in the routine that works with the salesperson will be
along the following lines:
First Iteration
File of records 489
"Run the shop"
quitting = false
while (not quitting)
prompt salesperson for a command
use command number entered to select
• list all customers
• list debtors

• deal with quit command, i.e. quitting = true
Obviously, each of the options like "list all customers" becomes a separate function.
Several of these functions will have to "get" a record from the file or "put" a record
to file, so we can expect that they will share "get record", "put record" and possibly
some other lower level functions.
The UT functions developed in Chapter 12 (selection from menu, and keyword
lookup) might be exploited. Obviously, the menu selection routine could handle the
task of getting a command entered by the salesperson. In fact, if that existing routine is
to be used there will be no further design work necessary for "Run the shop". The
sketched code is easy to implement.
So, the next major task is identifying the way the functions will handle the tasks like
listing customers and getting orders. The routines for listing all customers and listing
debtors are going to be very similar:
"list all customers"
if there are no customers
just print some slogan and return
for i = 0 , i < number of customers,. i++
get customer record i
print details of customer
"list debtors"
initialize a counter to zero
for i = 0 , i < number of customers,. i++
get customer record i
if customer owes money
print details of customer
increment counter
if counter is zero report "no debtors"
You might be tempted to fold these functions into one, using an extra boolean argument
to distinguish whether we are interested in all records or just those with debts.
However, if the program were later made more elaborate you would probably find
greater differences in the work done in these two cases (e.g. you might want the "list
Second Iteration
The "list …"
functions
490 Examples using structs
debtors" operation to generate form letters suggesting that payment would be
appreciated). Since the functions can be expected to become increasingly different, you
might as well code them as two functions from the start.
The sketched pseudo-code uses for loops to work through the file and relies on a
(global?) variable that holds the number of records. As noted in the introduction to this
section, it is easy to work out the number of records in a file like this; this would get
done when the file is opened.
The two functions could work with a local variable of a struct type "Customer".
This would get passed as a reference parameter to a "get record" function, and as a
"const reference" to a "print details" function.
The two additional auxiliary functions, "get record" and "print details", also need to
be sketched:
"get record"
convert record number to byte position
read from file into record
if i/o error
print error message and exit
"print details"
output name, address, postcode, phone number
if amount owed
print the amount
else ? (either blank, or "nothing owing")
if any items on order
print formatted list of items
The specification didn't really tie down how information was to be displayed; we can
chose the formats when doing the implementation.
The other required functions are those to add a new customer record to the file, show
the record of a customer identified by name, and place an order for a customer (the
specification is imprecise, but presumably this is again going to be a customer identified
by name).
Adding a customer should be easy. The "add customer" routine would prompt the
salesperson for the name, address, postcode, and phone number of the new customer.
These data would be filled into a struct, then the struct would be written to the end of
the existing file. The routine would have to update the global counter that records the
number of records so that the listing routines would include the new record.
"add customer"
? check any maximum limits on the number of customers
prompt for data like name, filling in a struct,
set amount owed, and items on order to zero
Auxiliary "get
record" and "print
details" functions
Functions for other
processing options
File of records 491
write the struct to the file
update number of customers
The file itself would not limit the number of records (well, not until it got to contain so
many millions of records that it couldn't fit on a disk). But limits could (and in this case
do) arise from other implementation decisions.
Writing the struct to file would be handled by a "put record" function that matches
the "get record":
"put record"
convert record number to byte position
copy data from memory struct to file
if i/o error
print error message and exit
The other two functions, "show customer" and "record order", both apparently need
to load the record for a customer identified by name. The program could work reading
each record from the file until it found the correct one. This would be somewhat costly
even for little files of a few hundred records. It would be easier if the program kept a
copy of the customer names in memory. This should be practical, the names would
require much less storage than the complete records.
It would be possible for the "open file" routine to read all existing records, copying
the customer names into an array of names. This only gets done when the program
starts up so later operations are not slowed. When new records are added to the file, the
customer name gets added to this array.
The memory array could hold small structures combining a name and a record
number; these could be kept sorted by name so allowing binary search. This would
necessitate a sort operation after the names were read from file in the "open file"
routine, and an "insert" function that would first find the correct place for a new name
and then move other existing names to higher array locations so as to make room for it.
Alternatively, the array could just contain the names; the array indexes would
correspond to the record numbers. Since the names wouldn't be in any particular order,
the array would have to be searched linearly to find a customer. Although crude, this
approach is quite acceptable in this specific context. These searches will occur when
the salesperson enters a "show customer" or "place order" command. The salesperson
will be expecting to spend many seconds reading the displayed details, or minutes
adding new orders. So a tiny delay before the data appear isn't going to matter. A
program can check many hundreds of names in less than a tenth of a second; so while a
linear search of the table of names might be "slow" in computer terms, it is not slow in
human terms and the times of human interaction are going to dominate the workings of
this program.
Further, we've got that UT_PickKeyWord() function. It searches an array of
"words", either finding the match or listing the different possible matches. This would
Getting the record for
a named customer
492 Examples using structs
actually be quite useful here. If names can be equated with the UT_Words we can use
the pick key word function to identify a name from its first few characters. Such an
"intelligent" interactive response would help the salesperson who would only have to
enter the first few characters of a name before either getting the required record or a list
of the name and similar names.
So, decisions: Customer names will be UT_Words (this limits them to less than 15
characters which could be a problem), things like addresses might as well be UT_Texts.
There will be a global array containing all the names. Functions that require the
salesperson to select a customer will use the keyword search routine (automatically
getting its ability to handle abbreviations.
These decisions lead to the following design sketches for the two remaining
processing options:
"show customer"
use keyword picker function to prompt for data (customer name)
and find its match in names array
(returns index of match)
use Get Record to get the record corresponding to index
Print Details of record got from file
and
"record order"
use keyword picker function to prompt for data (customer name)
and find its match in names array
(returns index of match)
use get record to get the record corresponding to index
reset customer record to nothing ordered, nothing owed
loop
get next order item
update amount owed
until either no more order items or maximum number ordered
put updated record back in file
The loop getting items in function "record order" is once again a candidate for
promotion to being a separate function. If it had to do all the work, function "record
order" would become too complex. Its role should be one of setting up the context in
which some other function can get the order items. So, its sketch should be revised to:
"record order"
use keyword picker function to prompt for data (customer name)
and find its match in names array
File of records 493
(returns index of match)
use get record to get the record corresponding to index
get items
put updated record back in file
This new "get items" function now has to be planned. The specification stated that
this routine should either make up an order for a customer or clear the record for goods
delivered and paid for. This might not prove ideal in practice because it means that you
can't deliver partial orders, and you can't accept supplementary orders; but it will do to
start with. It means that the "get items" routine should start by clearing any existing
record of amount owed and items ordered. Then the routine should ask the user
whether any item is to be ordered. While the reply is "yes", the routine should get
details of the item, add it to the record and its cost to the amount owed, and then again
ask whether another item is to be ordered. The Customer struct can have an array to
hold information about ordered items. Since this array will have a fixed size, the loop
asking for items would need to terminate if the array gets filled up.
The specification requires the program to have a table of goods that can be ordered
and a convenient mechanism allowing the salesperson to enter the name of a chosen
article. The key word picking function can again be pressed into service. There will
need to be a global array with the names of the goods articles, and an associated array
with their costs.
An initial sketch for "get items" is:
"get items"
change amount owing data member of record to zero
ask (YesNo() function) whether another item to be ordered
while item to be ordered and space left
use keyword picker function to prompt for
data (item name) and find its match
in goods array
(returns index of match)
copy name of item into record
update amount owed by cost of item
update count of items ordered
ask (YesNo()) whether another item needed
This has sketch has identified another function, YesNo(), that gets a yes/no input from
the user.
494 Examples using structs
Open File revisited Earlier consideration of the open file function was deferred. Now, we have a better
idea of what it must do. It should open the file (or terminate the program if the file
won't open). It should then determine the number of records. Finally, it should scan
through all records copying the customer names into an array in memory.
The array for names has a fixed size. So there will be a limit on the number of
customers. This will have to be catered for in the "add customer" function.
What is a Customer? It is about time to make some decisions regarding the data.
A Customer had better be a struct that has data members for:
1 customer name (already decided to use a UT_Word, i.e. up to about 15 characters);
2 customer address (a UT_Text, i.e. up to 60 characters);
3 postcode and phone, these could also be UT_Words;
4 amount owing (a double);
5 a count of items on order;
6 an array with the names of the items on order, need to fix a size for this.
Other fields might need to be added later. The following structs declarations should
suffice:
#ifndef __MYCUSTOMER__
#define __MYCUSTOMER__
#include "UT.h"
struct Date {
int fDay, fMonth, fYear;
};
const int kMAXITEMS = 5;
struct Customer {
UT_Word fName;
UT_Text fAddress;
UT_Word fPostcode;
UT_Word fDialcode;
UT_Word fOrders[kMAXITEMS];
int fNumOrder;
double fAmountOwing;
Date fLastOrder;
};
#endif
Third iteration
through the design
File of records 495
An extra Date struct has been declared and a Date data member has been included in
the Customer struct. This code doesn't use dates; one of the exercises at the end of the
chapter involves implementing a code to handle dates.
As a Customer uses UT_Word and UT_Text this header file has to include the UT.h
header that contains the typedef defining these character array types.
The main implementation file is going to contain a number of global arrays:
• gStock[], an array of UT_Words with names of items stocked by shop;
• gItemCosts[], an array of doubles with the costs of items;
• gCommands[], an array with the phrases that describe the commands that the
salesperson can select;
• gNames[], an array to hold customer names.
Other global (or filescope) variables will be needed for the file name, the fstream object
that gets attached to the file, and for a number of integer counters (number of records,
number of commands, number of items in stock list).
The functions have already been considered in detail and don't require further
iterations of design. Their prototypes can now be defined:
void GetRecord(Customer& rec, int cNum);
void PutRecord(Customer& rec, int cNum);
void PrintDetails(const Customer& c);
void ShowCustomer(void);
void ListAll(void);
void ListDebtors(void);
void AddCustomer(void);
int YesNo(void);
void GetItems(Customer& c);
void RecordOrder(void);
void RunTheShop(void);
void OpenTheDataFile(void);
void CloseTheDataFile(void);
int main();
Function prototypes
496 Examples using structs
Other linked files The code written specifically for this problem will have to be linked with the UT code
from Chapter 12 so as to get the key word and menu selection functions.
Quite a large number of systems header files will be needed. The programs uses
both iostream and fstream libraries for files. The exit() function will get used to
terminate the program if an i/o error occurs with the file accesses, so stdlib is also
needed. Strings representing names will have to be copied using strcpy() from the
string library. The "yes/no" function will have to check characters so may need the
ctype header that defines standard functions like tolower(). Error checking might use
assert.
Implementation
The implementation file will start with the #includes:
#include
#include
#include
#include
#include
#include
#include "UT.h"
#include "mycustomer.h"
(If you think about it carefully, you will see that we end up #including UT.h twice; that
sort of thing happens very easily and it is why we have those #ifdef … #endif brackets
on all header files).
The declarations of globals come next. Several are initialized:
const char gFileName[] = "CustRec.XXX";
const int kMAXCUSTOMERS = 200;
fstream gDataFile;
int gNumRecs;
UT_Word gNames[kMAXCUSTOMERS];
UT_Word gStock[] = {
"Floppy disks",

"Marker pens",
"Laser pointer"
};
double gItemCosts[] = {
18.50, /* disks $18.50 per box */

6.50, /* pens */
Header files needed
File of records 497
180.0 /* laser pointer */
};
int gNStock = sizeof(gStock) / sizeof(UT_Word);
UT_Text gCommands[] = {
"Quit",
"List all customers",
"List all debtors",
"Add Customer",
"Show Customer",
"Record Order"
};
int gNCommands = sizeof(gCommands) / sizeof(UT_Text);
In many ways it would be better to introduce a new struct that packages together an
item name and its cost. Then we could have an array of these "costed item" structs
which would reduce the chance of incorrect costs being associated with items.
However, that would preclude the use of the existing keyword functions that require a
simple array of UT_Words.
The function definitions come next:
void GetRecord(Customer& rec, int cNum)
{
/* cNum is customer number (0-based) */
/* convert into offset into file */
long where = cNum * sizeof(Customer);
gDataFile.seekg(where, ios::beg);
gDataFile.read(&rec, sizeof(Customer));
if(!gDataFile.good()) {
cout << "Sorry, can't read the customer file"
<< endl;
exit(1);
}
}
void PutRecord(Customer& rec, int cNum)
{
long where = cNum * sizeof(Customer);
gDataFile.seekp(where, ios::beg);
gDataFile.write(&rec, sizeof(Customer)); //maybe
(char*)&rec
if(!gDataFile.good()) {
cout << "Sorry, can't write to the customer file"
<< endl;
exit(1);
}
498 Examples using structs
}
Terminating the program after an i/o error may seem a bit severe, but really there isn't
much else that we can do in those circumstances.
void PrintDetails(const Customer& c)
{
cout << "----" << endl;
cout << "Customer Name : " << c.fName << endl;
cout << "Address : " << c.fAddress << ", "
<< c.fPostcode << endl;
cout << "Phone : " << c.fDialcode << endl;
if(c.fAmountOwing > 0.0)
cout << "Owes : $" << c.fAmountOwing
<< endl;
else cout << "Owes nothing" << endl;
if(c.fNumOrder == 1) cout << "On order: " << c.fOrders[0]
<< endl;
else
if(c.fNumOrder >0) {
cout << "On order" << endl;
for(int j = 0; j < (c.fNumOrder-1); j++)
cout << c.fOrders[j] << ", ";
cout << "and ";
cout << c.fOrders[c.fNumOrder-1] << endl;
}
cout << "---" << endl;
}
The PrintDetails() function gets a little elaborate, but it is trying to provide a nice
listing of items with commas and the word "and" in the right places.
void ShowCustomer(void)
{
UT_Text aPrompt = "Customer Name : ";
int who = UT_PickKeyWord(aPrompt, gNames, gNumRecs);
if(who < 0)
return;
Customer c;
GetRecord(c, who);
PrintDetails(c);
}
There is one problem in using the "pick keyword" function. If the salesperson enters
something that doesn't match the start of any of the names, the error message is "there is
no keyword …". Possibly the "pick keyword" function should have been designed to
take an extra parameter that would be used for the error message.
void ListAll(void)
File of records 499
{
if(gNumRecs == 0) {
cout << "You have no customers" << endl;
return;
}
for(int i = 0; i < gNumRecs; i++) {
Customer c;
GetRecord(c, i);
PrintDetails(c);
}
}
void ListDebtors(void)
{
int count = 0;
for(int i = 0; i < gNumRecs; i++) {
Customer c;
GetRecord(c, i);
if(c.fAmountOwing > 0.0) {
PrintDetails(c);
count++;
}
}
if(count == 0) cout << "Nothing owed" << endl;
}
Function AddCustomer() checks whether the program's array of names is full and
prevents extra names being added. The input statements use getline(). This is
because addresses are going to be things like "234 High Street" which contain spaces.
If we tried to read an address with something like cin >> c.fAddress, the address
field would get "234", leaving "High …" etc to confuse the input to post code. The
routine isn't robust; you can cause lots of troubles by entering names that are too long to
fit in the specified data member.
void AddCustomer(void)
{
if(gNumRecs == kMAXCUSTOMERS) {
cout << "Sorry, you will have to edit program "
"before it can handle" << endl;
cout << " more customers." << endl;
return;
}
Customer c;
// N.B. This input routine is "unsafe"
// there are no checks successful reads etc
cout << "Name : ";
cin.getline(c.fName, UT_WRDLENGTH-1, '\n');
500 Examples using structs
cout << "Address : ";
cin.getline(c.fAddress, UT_TXTLENGTH-1, '\n');
cout << "Post code : ";
cin.getline(c.fPostcode, UT_WRDLENGTH-1, '\n');
cout << "Phone : ";
cin.getline(c.fDialcode, UT_WRDLENGTH-1, '\n');
c.fNumOrder = 0;
c.fAmountOwing = 0.0;
PutRecord(c, gNumRecs);
strcpy(gNames[gNumRecs], c.fName);
gNumRecs++;
}
int YesNo(void)
{
char ch;
cout << "Order an item? (Y or N)";
cin >> ch;
ch = tolower(ch);
return (ch == 'y');
}
void GetItems(Customer& c)
{
c.fAmountOwing = 0.0;
int count = 0;
while(YesNo() && (count UT_Text aPrompt = "Identify Type of Goods";
int which =
UT_PickKeyWord(aPrompt, gStock, gNStock);
strcpy(c.fOrders[count], gStock[which]);
c.fAmountOwing += gItemCosts[which];
count++;
}
c.fNumOrder = count;
}
Look a bug in GetItems()! Can you spot it? It isn't serious, the program won't
crash. But it makes the user interaction clumsy.
If you can't spot the bug, run the code and try to order more than the limit of five
items. You should then observe a certain clumsiness.
The bug can be fixed by a trivial change to the code.
File of records 501
void RecordOrder(void)
{
UT_Text aPrompt = "Enter Customer Name";
int who = UT_PickKeyWord(aPrompt, gNames, gNumRecs);
Customer c;
GetRecord(c, who);
GetItems(c);
PutRecord(c, who);
}
void RunTheShop(void)
{
UT_Text aPrompt = "Command";
int quitting = 0;
while(!quitting) {
int command =
UT_MenuSelect(aPrompt,gCommands,
gNCommands, 0);
// Need to consume any trailing spaces or newlines
// after the command number
char ch;
cin.get(ch);
while(ch != '\n')
cin.get(ch);
switch(command) {
case 0: /* Quit */
quitting = 1;
break;
case 1: /* List all customers */
ListAll();
break;
case 2: /* List all debtors */
ListDebtors();
break;
case 3: /* Add Customer */
AddCustomer();
break;
case 4: /* Show Customer */
ShowCustomer();
break;
case 5: /* Record Order */
RecordOrder();
break;
}
}
502 Examples using structs
}
The RunTheShop() function has one complication. The salesperson has to enter a
number when picking the required menu option. The digits get read but any trailing
spaces and newlines will still be waiting in the input stream. These could confuse
things if the next function called needed to read a string, a character, or a complete line.
So the input stream is cleaned up by reading characters until get the '\n' at the end of the
input line.
void OpenTheDataFile(void)
{
// Open the file, allow creation if not already there
gDataFile.open(gFileName, ios::in | ios::out );
// may need also ios::binary
if(!gDataFile.good()) {
cout << "? Couldn't open the file. Sorry." << endl;
exit(1);
}
gDataFile.seekg(0, ios::end);
long pos = gDataFile.tellg();
gNumRecs = pos / sizeof(Customer);
assert(gNumRecs <= kMAXCUSTOMERS);
for(int i = 0; i < gNumRecs; i++) {
Customer c;
GetRecord(c, i);
strcpy(gNames[i], c.fName);
}
}
Logically, there is no way that the file could hold more than the maximum number of
records (the file has to be created by this program and the "add customer" function
won't let it happen).
Don't believe it. Murphy's law applies. The program can go wrong if the file is too
large, so it will go wrong. (Something will happen like a user concatenating two data
files to make one large one.) Since the program will overwrite arrays if the file is too
large, it better not even run. Hence the assert() checking the number of records.
void CloseTheDataFile(void)
{
gDataFile.close();
}
int main()
{
OpenTheDataFile();
RunTheShop();
File of records 503
CloseTheDataFile();
return 0;
}
On the whole, the program runs OK:
Command
Enter option number in range 1 to 6, or ? for help
6
Enter Customer Name
Ga
Possible matching keywords are:
Gates, B.
Garribaldi, J
Gamble,P
Gam
i.e. Gamble,P
Order an item? (Y or N)y
Identify Type of Goods
Las
i.e. Laser pointer
Order an item? (Y or N)Y
Identify Type of Goods
Tone
i.e. Toner
Order an item? (Y or N)n
Command
Enter option number in range 1 to 6, or ? for helpEnter option
number in range 1 to 6, or ? for help
3
----
Customer Name : Gates, B.
Address : The Palace, Seattle, 923138
Phone : 765 456 222
Owes : $186.5
On order
Laser pointer, and Marker pens
---
----
Customer Name : Gamble,P
Address : 134 High St, 89143
Phone : 1433
Owes : $220
On order
Laser pointer, and Toner
---
504 Examples using structs
EXERCISES
1 Fix the "bug" in GetItems() from 17.3.
2 Change the way that the Customer records program handles the names to the alternative
approach described in the text (sorted array of name-index number structures).
3 Implement code to support the Date data structure and arrange that Customer orders include a
Date.

Enum, Struct, and Union

16 Enum, Struct, and Union
This chapter introduces three simple kinds of programmer defined data types.
You aren't limited to the compiler provided char, int, double data types and their
derivatives like arrays. You can define your own data types. Most of Part IV of this
text is devoted to various forms of programmer defined type. Here we introduce just
three relatively simple kinds. These are almost the same as the equivalents provided in
C; in contrast, C has no equivalent to the more sophisticated forms of programmer
defined type introduced in Part IV.
Enumerated types, treated in section 16.1, are things that you should find useful in
that they can improve the readability of your programs and they allow the compiler to
do a little extra type checking that can eliminate certain forms of error. Although
useful, enumerated types are not that major a factor in C++ programming.
Structs are the most important of the three language elements introduced here,
section 16.2. They do have wider roles, but here we are interested in their primary role
which is the grouping of related data.
Unions: treat them as a "read only" feature of C++. You will sometimes see unions
being employed in library code that you use, but it is unlikely that you will find any real
application for unions in the programs that you will be writing.
16
16.1 ENUMERATED TYPES
You often need to use simple integer constants to represent domain specific data. For
example, suppose you needed to represent the colour of an automobile. You could have
the following:
const int cRED = 0;
const int cBLUE = 1;

int auto_colour;
460 Enum, struct, and union
auto_colour = cBLUE;
This is quite workable. But there are no checks. Since variable auto_colour is an
integer, the following assignments are valid:
auto_colour = -1;

auto_colour = rand();
But of course, these statements lose the semantics; variable auto_colour doesn't any
longer represent a colour, it is just an integer.
It is nice to be able to tell the compiler:
"Variable auto_colour is supposed to represent a colour, that is one of the
defined set of choices RED, BLUE, …."
and then have the compiler check, pretty thoroughly, that auto_colour is only used in
this way throughout the code.
This is the role of enums or enumerated types. If you think how they are actually
implemented, they are just an alternative way of declaring a set of integer constants and
defining some integer variables. But they also introduce new distinct types and allow
the compiler to do type checking. It is this additional type checking that makes enums
worthwhile.
The following is a simple example of an enum declaration:
enum Colour { eRED, eBLUE, eYELLOW, eGREEN, eSILVERGREY,
eBURGUNDY };
The entries in the enumeration list are just names (of constant values). The same rules
apply as for any other C++ names: start with a letter, contain letters digits and
underscores. However, by convention, the entries in the enum list should have names
that start with 'e' and continue with a sequence of capital letters. This makes enum
values stand out in your code.
With Colour now defined, we can have variables of type Colour:
Colour auto_colour;

auto_colour = eBURGUNDY;
The compiler will now reject things like auto_colour = 4. Depending on the
compiler you are using you may get just "Warning – anachronism", or you may get an
error (really, you should get an error).
What about the "enumerators" eRED, eBLUE etc? What are they?
enums and type
checking
Naming convention
for enums
enumerators
Enumerated types 461
Really, they are integers of some form. The compiler may chose to represent them
using shorts, or as unsigned chars. Really, it isn't any of your business how your
compiler represents them.
The compiler chooses distinct values for each member of an enumeration.
Normally, the first member has value 0, the second is 1, and so forth. So in this
example, eRED would be a kind of integer constant 0, eSILVERGREY would be 4.
Note the effect of the type checking:
auto_colour = 4; // Wrong, rejected or at least
// warned by compiler
auto_colour = eSILVERGREY; // Fine, set auto_colour to
// (probably) the value 4
It isn't the values that matter, it is the types. The value 4 is an integer and can't be
directly assigned to a Colour variable. The constant eSILVERGREY is a Colour
enumerator and can be assigned to a Colour variable.
You can select for yourself the integer values for the different members of the
enumeration, provided of course that you keep them all distinct. You won't have cause
to do this yourself; but you should be able to read code like:
enum MagicEnum { eMERLIN = 17, eGANDALF = 103, eFRED = 202 };
Treat this as one of those "read only" features of C++. It is only in rare circumstances
that you want to define specific values for the members of the enumeration. (Defining
specific values means that you aren't really using the enum consistently; at some places
in your code you intend to treat enums as characters or integers.)
Output of enums
Enums are fine in the program, but how do you get them transferred using input and
output statements?
Well, with some difficulty!
An enum is a form of integer. You can try:
cout << auto_colour;
and you might get a value like 0, or 3 printed. More likely, the compiler will give you
an error message (probably something rather obscure like "ambiguous reference to
overloaded operator function"). While the specific message may not be clear, the
compiler's unhappiness is obvious. It doesn't really know how you want the enum
printed.
You can tell the compiler that it is OK to print the enum as an integer:
462 Enum, struct, and union
cout << int(auto_colour);
The function like form int(auto_colour) tells the compiler to convert the data value
auto_colour to an integer. The statement is then cout << integer which the
compiler knows to convert into a call to a PrintInteger() routine. (The compiler
doesn't need to generate any code to do the conversion from enum to integer. This
conversion request is simply a way for the programmer to tell the compiler that here it
is intended that the enum be regarded as just another integer).
Printing an enum as an integer is acceptable if the output is to a file that is going to
be read back later by the program. Human users aren't going to be pleased to get output
like:
Engine: 1.8 litre
Doors: 4
Colour: 5
If you are generating output that is to be read by a human user, you should convert the
enum value into an appropriate character string. The easiest way is to use a switch
statement:
switch(auto_colour) {
eRED: cout << "Red"; break;
eBLUE: cout << "Blue"; break;

eBURGUNDY:
cout << "Burgundy"; break;
}
Input
If your program is reading a file, then this will contain integer values for enums; for
example, the file could have the partial contents
1.8 4 5
for an entry describing a four door, burgundy coloured car with 1.8 litre engine But you
can't simply have code like:
double engine_size;
int num_doors;
Colour auto_colour;

input >> engine_size;
input >> num_doors ;
input >> auto_colour;
Enumerated types 463
The compiler gets stuck when it reaches input >> auto_colour;. The compiler's
translation tables let it recognize input >> engine_size as a form of "input gives to
double" (so it puts in a call to a ReadDouble() routine), and similarly input >>
num_doors can be converted to a call to ReadInt(). But the compiler's standard
translation tables say nothing about input >> Colour.
The file contains an integer; so you had better read an integer:
int temp;
input >> temp;
Now you have an integer value that you hope represents one of the members of the
enum Colour. Of course you must still set the Colour variable auto_colour. You
can't simply have an assignment:
auto_colour = temp;
because the reason we started all of this fuss was to make such assignments illegal!
Here, you have to tell the compiler to suspend its type checking mechanisms and
trust you. You can say "I know this integer value will be a valid Colour, do the
assignment." The code is
auto_colour = Colour(temp);
Input from file will use integers, but what of input from a human user?
Normally, if you are working with enumerated types like this, you will be prompting
the user to make a selection from a list of choices:
Colour GetColour()
{
cout << "Please enter preferred colour, select from "
<< endl;
cout << "1\tRed" << endl;
cout << "2\tBlue" << endl;

cout << "6\tBurgundy" << endl;
for(;;) {
int choice;
cin >> choice;
switch(choice) {
case 1: return eRED;
case 2: return eBLUE;

case 6: return eBURGUNDY;
}
cout << "There is no choice #" << choice << endl;
464 Enum, struct, and union
cout << "Please select from 1 Red, 2 Blue … "
" 6, Burgundy" << endl;
}
}
(Note, internally eRED may be 0, eBLUE may be 1 etc. But you will find users generally
prefer option lists starting with 1 rather than 0. So list the choices starting from 1 and
make any adjustments necessary when converting to internal form.)
Other uses of enums
You do get cases like Colour auto_colour where it is appropriate to use an
enumerated type; but they aren't that common (except in text books). But there is
another place where enums are very widely used.
Very often you need to call a routine specifying a processing option from a fixed set:
DrawString – this function needs to be given the string to be
displayed and a style which should be one of
Plain
Bold
Italic
Outline
AlignObjects – this function needs to be given an array with the
object identifiers, a count, and an alignment specification which
should be one of
LeftAligned
Centred
RightAligned
You can define integer constants:
const int cPLAIN = 0;
const int cBOLD = 1;

and have an integer argument in your argument list:
void DrawString(char txt[], int style);
but the compiler can't check that you only use valid styles from the set of defined
constants and so erroneous code like
DrawString("silly", 12345);
Enumerated types 465
gets through the compiler to cause problems at run time.
But, if you code using an enumerated type, you do get compile time checks:
enum TextStyles { ePLAIN, eBOLD, eITALIC, eOUTLINE };
void DrawString(char txt[], TextStyles style);
Now, calls like:
DrawString("Home run", eBOLD);
are fine, while erroneous calls like
DrawString("Say what?", 59);
are stomped on by the compiler (again, you may simply get a warning, but it is more
likely that you will get an error message about a missing function).
16.2 STRUCTS
It is rare for programs to work with simple data values, or even arrays of data values,
that are individually meaningful. Normally, you get groups of data values that belong
together.
Let's pick on those children again. This time suppose we want records of children's
heights in cm, weights in kilos, age (years and months), and gender. We expect to have
a collection of about one thousand children and need to do things like identify those
with extreme heights or extreme weights.
We can simply use arrays:
const int kMAXCHILDREN = 1000;
double heights[kMAXCHILDREN];
double weights[kMAXCHILDREN];
int years[kMAXCHILDREN];
int months[kMAXCHILDREN];
char gender[kMAXCHILDREN];

// read file with data, file terminated by sentinel data
// value with zero height
count = 0;
infile >> h;
while(h > 0.0) {
heights[count] = h;
infile >> weights[count] >> years[count] >>
months[count] >> gender[count];
The need for
"structs"
466 Enum, struct, and union
count++;
infile >> h;
}

Now heights[5], weights[5], …, gender[5] all relate to the same child. But if
we are using arrays, this relationship is at most implicit.
There is no guarantee that we can arrange that the data values that logically belong
together actually stay together. After all, there is nothing to stop one from sorting the
heights array so that these values are in ascending order, at the same time losing the
relationship between the data value in heights[5] and those in weights[5] …
gender[5].
A programming language must provide a way for the programmer to identify groups
of related data. In C++, you can use struct.
A C++ struct declaration allows you to specify a grouping of data variables:
struct Child {
double height;
double weight;
int years;
int months;
char gender;
};
After reading such a declaration, the compiler "knows what a Child is"; for the
purposes of the rest of the program, the compiler knows that a Child is a data structure
containing two doubles, two integers, and a character. The compiler works out the
basic size of such a structure (it would probably be 25 bytes); it may round this size up
to some larger size that it finds more convenient (e.g. 26 bytes or 28 bytes). It adds the
name Child to its list of type names.
This grouping of data is the primary role of a struct. In C++, structs are simply a
special case of classes and they can have perform roles other than this simple grouping
of data elements. However it is useful to make a distinction. In the examples in this
book, structs will only be used as a way of grouping related data elements.
The term "record" is quite often used instead of struct when describing programs.
The individual data elements within a struct are said to be "fields", or "data members".
The preferred C++ terminology is "data members".
A declaration doesn't create any variables, it just lets the compiler know about a new
type of data element that it should add to the standard char, long, double etc. But, once
a struct declaration has been read, you can start to define variables of the new type:
Child cute;
Child kid;
Child brat;
Have to be able to
group related data
elements
Declaring structs
"record", "field",
"data member"
Defining variables of
struct types
Definitions of some
variables of struct
type
Structs 467
and arrays of variables of this type:
Child surveyset[kMAXCHILDREN];
Each of these Child variables has its own doubles recording height and weight, its own
ints for age details, and a char gender flag.
The definition of a variable of struct type can include the data needed to initialize its
data members:
Child example = {
125.0, 32.4, 13, 2, 'f'
};
An instance of a struct can be defined along with the declaration:
struct Rectangle {
int left, top;
int width, height;
} r1, r2, r3;
This declares the form of a Rectangle and defines three instances. This style is widely
use in C programs, but it is one that you should avoid. A declaration should introduce a
new data type; you should make this step separate from any variable definitions. The
construct is actually a source of some compile-time errors. If you forget the ';' that
terminates a structure declaration, the compiler can get quite lost trying to interpret the
next few program elements as being names of variables (e.g. the input struct Rect {
… } struct Point {…} int main() { … } will make a compiler quite unhappy).
Programs have to be able to manipulate the values in the data members of a struct
variable. Consequently, languages must provide a mechanism for referring to a
particular data member of a given struct variable.
Most programming languages use the same approach. They use compound names
made up of the variable name and a qualifying data member name. For example, if you
wanted to check whether brat's height exceeded a limit, then in C++ you would write:
if(brat.height > h_limit)

Similarly, if you want to set cute's weight to 35.4 kilos, you would write:
cute.weight = 35.4;
Most statements and expressions will reference individual data members of a struct,
but assignment of complete structures is permitted:
Child Tallest;
Initialization of
structs
Accessing data
members of a
variable of a struct
type
Fields of a variable
identified using
compound names
Assignment of structs
468 Enum, struct, and union
Tallest.height = 0;
for(int i = 0; i < NumChildren; i++)
if(surveyset[i].height > Tallest.height)
Tallest = surveyset[i];
Compilers generally handle such assignments by generating code using a "blockmove".
A blockmove (which is often an actual instruction built into the machine hardware)
copies a block of bytes from one location to another. The compiler knows the size of
the structs so it codes a blockmove for the appropriate number of bytes.
In C++, a struct declaration introduces a new type. Once you have declared:
struct Point {
int x;
int y;
};
You can define variables of type Point:
Point p1, p2;
In C, struct (and enum) declarations don't make the struct name (or enum name) a
new type. You must explicitly tell the compiler that you want a new type name to be
available. This is done using a typedef. There are a variety of styles. Two common
styles are:
struct Point {
int x;
int y;
};
typedef struct Point Point;
or:
typedef struct _xpt {
int x;
int y;
} Point;
Similarly, enums require typedefs.
enum Colour { eRED, eBLUE, eGREEN };
typedef enum Colour Colour;
You will see such typedefs in many of the C libraries that you get to use from your C++
programs.
Struct and enum
declarations in C
libraries
typedef
Structs 469
Structs and functions A function can have arguments of struct types. Like simple variables, structs can be
passed by value or by reference. If a struct is passed by value, it is handled like an
assignment – a blockmove is done to copy the bytes of the structure onto the stack.
Generally, because of the cost of the copying and the need to use up stack space, you
should avoid passing large structs by value. If a function uses a struct as an "input
parameter", its prototype should specify the struct as const reference, e.g.:
void PrintChildRecord(const struct& theChild)
{
cout << "Height " << theChild.height << …

}
Functions can have structs as their return values. The following illustrates a function
that gets an "car record" filled in:
struct car {
double engine_size;
int num_doors;
Colour auto_colour;
};
car GetAutoDetails()
{
car temp;
cout << "Select model, GL (1), GLX (2), SX (3), TX2(4) : "
;
int model;
cin >> model;
while((model <>4)) {
cout << "Model value must be in range 1…4" << endl;
cout << "Select model :"
cin >> model;
}
switch(model) {
case 1: temp.engine_size = 1.8; temp.num_doors = 3; break;
case 2: …
}
temp.colour = GetColour();
return temp;
}
Although this is permitted, it should not be overused. Returning a struct result doesn't
matter much with a small structure like struct car, but if your structures are large
this style becomes expensive both in terms of space and data copying operations.
Code using the GetAutoDetails() function would have to be something like:
car purchasers_choice;
Function with a
struct as a returned
value
Local variable of
return type defined
Data entered into
fields of local struct
variable
return the local struct
as the value
470 Enum, struct, and union

purchasers_choice = GetAutoDetails();
The instructions generated would normally be relatively clumsy. The stack frame setup
for the function would have a "return value area" sufficient in size to hold a car record;
there would be a separate area for the variable temp defined in the function. The return
statement would copy the contents of temp into the "return value area" of the stack
frame. The data would then again be copied in the assignment to purchasers_choice.
If you needed such a routine, you might be better to have a function that took a
reference argument:
void GetAutoDetails(struct car& temp)
{
cout << "Select model, GL (1), GLX (2), SX (3), TX2(4) : "
;

temp.colour = GetColour();
return ;
}
with calling code like:
car purchasers_choice;

GetAutoDetails(purchasers_choice);
16.3 UNIONS
Essentially, unions define a set of different interpretations that can be placed on the data
content area of a struct. For you, "unions" should be a "read-only" feature of C++. It
may be years before you get to write code where it might be appropriate for you to
define a new union. However, you will be using libraries of C code, and some C++
libraries, where unions are employed and so you need to be able to read and understand
code that utilizes unions.
Unions are most easily understood from real examples The following examples are
based on code from Xlib. This is a C library for computers running Unix (or variations
like Mach or Linux). The Xlib library provides the code needed for a program running
on a computer to communicate with an X-terminal. X-terminals are commonly used
when you want a multi-window style of user interface to Unix.
An X-terminal is a graphics display device that incorporates a simple
microprocessor and memory. The microprocessor in the X-terminal does part of the
work of organizing the display, so reducing the computational load on the main
computer.
Unions 471
When the user does something like move the mouse, type a character, or click an
action button, the microprocessor packages this information and sends it in a message to
the controlling program running on the main computer.
In order to keep things relatively simple, all such messages consist of a 96 byte
block of data. Naturally, different actions require different data to be sent. A mouse
movement needs a report of where the mouse is now located, a keystroke action needs
to be reported in terms of the symbol entered.
Xlib-based programs use XEvent unions to represent these 96 byte blocks of data.
The declaration for this union is
typedef union _XEvent {
int type;
XAnyEvent xany;
XButtonEvent xbutton;
XMotionEvent xmotion;
XCreateWindowEvent xcreatewindow;


} XEvent;
This declaration means that an XEvent may simply contain an integer (and 92 bytes of
unspecified data), or it may contain an XAnyEvent, or it may contain an XButtonEvent,
or …. There are about thirty different messages that an Xterminal can send, so there are
thirty different alternative interpretations specified in the union declaration.
Each of these different messages has a struct declaration that specifies the data that
that kind of message will contain. Two of these structs are:
typedef struct {
int type;
unsigned long serial;
Bool send_event;
Display *display;
Window window;
Window root;
Window subwindow;
Time time;
int x, y;
int x_root, y_root;
unsigned int state;
unsigned int button;
Bool same_screen;
} XButtonEvent;
typedef struct {
int type;
unsigned long serial;
Bool send_event;
union declaration
Declaration of
alternative structs
that can be found in
an XEvent
472 Enum, struct, and union
Display *display;
Window window;
int x, y;
int width, height;
int border_width;
Bool override_redirect;
} XCreateWindowEvent;
As illustrated in Figure 16.1, the first part of any message is a type code. The way
that the rest of the message bytes are used depends on the kind of message.
type type type type
serial
send_event
window
root
subwindow
time
x
y
x_root
y_root
state
button
same_screen
serial
send_event
window
x
y
width
height
border
override
serial
send_event
window
root
subwindow
time
x
y
x_root
y_root
state
hint
same_screen
XCreateWindowEvent XEvent
XButtonEvent XMotionEvent
Figure 16.1 XEvents – an example of a union from the Xlib library.
Unions 473
If you have an XEvent variable ev, you can access its type field using the "."
operator just like accessing a data field in a normal structure:
XEvent ev;

switch(ev.type) {

}
If you know that ev really encodes an XCreateWindowEvent and you want to work
out the area of the new window, you can use code like:
area = ev.xcreatewindow.width * eve.xcreatewindow.height;
The appropriate data fields are identified using a doubly qualified name. The variable
name is qualified by the name of the union type that is appropriate for the kind of record
known to be present (so, for an XCreateWindowEvent, you start with
ev.xcreatewindow). This name is then further qualified by the name of the data
member that should be accessed (ev.xcreatewindow.width).
A programmer writing the code to deal with such messages knows that a message
will start with a type code. The Xlib library has a series of #defined constants, e.g.
ButtonPress, DestroyNotify, MotionNotify; the value in the type field of a message will
correspond to one of these constants. This allows messages from an Xterminal to be
handled as follows:
XEvent eV;

/* Code that gets calls the Unix OS and
gets details of the next message from the Xterminal
copied into eV */

switch(ev.type) {
caseButtonPress:
/* code doing something depending on where the button was
pressed, access using xbutton variant from union */
if((ev.xbutton.x > x_low) && (ev.xbutton.x < x_high) && …
break;
case MotionNotify:
/* user has moved pointer, get details of when this
happened
and decide what to do, access using xmotion variant from
union */
thetime = ev.xmotion.time;

Doubly qualified
names to access data
members of variants
in union
474 Enum, struct, and union
break;
case CreateNotify:
/* We have a new window, code here that looks at where and
what size, access using xcreatewindow variant of union */
int hpos = ev.xcreatewindow.x;

break;

}

Design C++

15 Design and
documentation : 1
The examples given in earlier chapters, particularly Chapter 12, have in a rather
informal way illustrated a design style known as "top-down functional decomposition".
"Top down functional decomposition" is a limited approach to design. It works very
well for simple scientific and engineering applications that have the basic structure:
initialize
get the input data
process the data
print the results
and where the "data" are something simple and homogeneous, like the array of double
precision numbers in the heat diffusion example. Top down functional decomposition
is not a good strategy for the overall design of more complex programs, such as those
that are considered in Parts IV and V. However, it does reappear there too, in a minor
role, when defining the individual "behaviours of objects".
Although the approach is limited, it is simple and it does apply to a very large
number of simple programs. So, it is worth learning this design approach at least as a
beginning.
The first section in this chapter simply summarizes materials from earlier examples
using them to illustrate a basic strategy for program development. The second section
looks briefly at some of the ways that you can document design decisions. Here, simple
textual documentation is favoured.
15
15.1 TOP DOWN FUNCTIONAL DECOMPOSITION
As it says, you start the top with "program", decompose it into functions, then you
iterate along the list of functions going down one level into each to decompose into
452 Design and documentation
auxiliary functions. The process is repeated until the auxiliary functions are so simple
that they can be coded directly.
Note the focus on functions. You don't usually have to bother much about the data
because the data will be simple – a shared array or something similar.
You begin with a phrase or one sentence summary that defines the program:
• The program models a two-dimensional heat diffusion experiment. e.g..
• The program plays the game of "hangman".
and try to get a caricature sketch of the main() function. This should be some slight
variation of the code at the start of this chapter.
Figure 15.1 illustrates initial decompositions for main() in the two example
programs.
Program Heat
Program Hangman
main
Initialize PlayGame AnotherGame
Initialize Heat Diffuse Show
main
Figure 15.1 From program specification to initial functional decomposition.
The sketched functions are:
Program Heat's main():
get iteration limit and print frequency
call Initialize() to initialize grid
loop
call HeatDiffuse() to update values in grid
if time to print, call Show()
Program Hangman's main():
Initialize()
Beginning
Top down functional decomposition 453
do
PlayGame()
while AnotherGame()
You aim for this first stage is to get this initial sketch for main() and a list of "top
level functions", each of which should be characterized by a phrase or sentence that
summarizes what it does:
AnotherGame()
Get and use a Yes/No input from user,
return "true" if Yes was input
PlayGame()
Organize the playing of one complete hangman game
HeatDiffuse()
Recalculate temperature at each point on grid.
You now consider each of these top-level functions in isolation. It is all very well to
say "Organize playing of game" or "Recalculate temperatures" but what do these
specifications really mean.
It is essentially the same process as in the first step. You identify component
operations involved in "organizing the playing of the game" or whatever, and you work
out the sequence in which these operations occur. This gives you a caricature sketch for
the code of the top-level function that you are considering and a list of second level
functions. Thus, HeatDiffuse() gets expanded to:
HeatDiffuse
get copy of grid values
double loop
for each row do
for each col do
using copy work out average temp.
of environment of grid pt.
calculate new temp based on current
and environment
store in grid
reset centre point to flame temperature
with a number of additional functions identified: CopyGrid(), Averageof-
Neighbors(), and NewTemperature().
The products of this design step are once again the sketches of functions and the lists
of the additional auxiliary functions together with one sentence specifications for each:
CopyGrid()
Copy contents of grid into separate data area.
AverageofNeighbors()
Products of
beginning step
Second step
454 Design and documentation
For a given grid point, this finds the average of the
temperatures of the eight neighboring points.
This process is repeated until no more functions need be introduced. The
relationships amongst the functions can be summarized in a "call graph" like that shown
in Figure 15.2.
Program Heat
main
Initialize Heat Diffuse Show
CopyGrid AverageofNeighbors NewTemperature CodeTemperature
TemperatureAt
Figure 15.2 "Call graph" for a program.
The next stage in the design process really focuses on data. You have to resolve
how the functions communicate and decide whether they share access to any global (or
filescope) data. For example, you may have decided that function TemperatureAt()
gets the temperature at a specified grid point, but you have still to determine how the
function "knows" which point and how it accesses the grid.
The "input/output" argument lists of all the functions, together with their return
types should now be defined. A table summarizing any filescope or global data should
be made up.
Generally, you need to plan some tests. A program like Hangman does not require
any elaborate preplanned tests; the simple processing performed can be easily checked
through interaction with the program. The Heat program was checked by working out
in advance the expected results for some simple cases (e.g. the heated point at 1000°C
and eight neighbors at 25°C) and using the debugger to check that the calculated "new
temperatures" were correct. Such methods suffice for simple programs but as you get
to move elaborate programs, you need more elaborate preplanned tests. These should
be thought about at this stage, and a suitable test plan composed.
The final outputs that you want from the design process will include:
1. A sketch for main() summarising the overall processing sequence.
2. A list of function prototypes along with one sentence descriptions of what these
functions do.
N-steps
Sorting out data and
communications
Planning test data
Finalising the design
Top down functional decomposition 455
char CodeTemperature(double temp);
Converts temperature to a letter code; different
letters correspond to different temperature ranges.
void Show(const Grid g);
Iterates over grid printing contents row by row,
uses CodeTemperature() to convert temp. to letter.
3. A list of any typedef types (e.g. typedef double Grid[kROWS][kCOLS]) and a list
defining constants used by the program.
4. A table summarizing global and filescope data.
5. A more detailed listing of the functions giving the pseudo-code outlines for each.
6. A test plan.
It is only once you have this information that it becomes worth starting coding.
Although you will often complete the design for the entire program and then start on
implementation, if the program is "large" like the Hangman example then it may be
worth designing and implementing a simplified version that is then expanded to
produce the final product.
15.2 DOCUMENTING A DESIGN
For most programs, the best documentation will be the design outlines just described.
You essentially have two documents. The first contains items 1…4. It is a kind of
executive summary and is what you would provide to a manager, to coworkers, or to the
teaching assistant who will be marking your program. The second document contains
the pseudo-code outlines for the functions. This you will use to guide your
implementation. Pseudo-code is supposed to be easier to read than actual code, so you
keep this document for use by future "maintenance programmers" who may have to
implement extensions to your code (they should also be left details of how to retest the
code to check that everything works after alterations).
This documentation is all textual. Sometimes, diagrams are required. Call graphs,
like that in Figure 15.2, are often requested. While roughly sketched call graphs are
useful while you are performing the design process and need to keep records as you
move from stage to stage, they don't really help that much in documenting large
programs. Figure 15.3 illustrates some of the problems with "call graphs".
The call graph for the sort program using Quicksort may be correct but it really fails
to capture much of the behaviour of that recursive system. The call graph for the
Hangman program is incomplete. But even the part shown is overwhelming in its
complexity.
456 Design and documentation
Program Sort
Program Hangman
main
Initialize PlayGame AnotherGame
Partition
main
SelectionSort
Quicksort
CG_Initialize
ShowGuess PickWord GetGuessedChar ShowCartoonPart CG_FrameWindow
srand
TickCount
strlen
CG_PutCharacter
Figure 15.3 Further examples of "call graphs".
Psychologists estimate that the human brain can work with about seven items of
information; if you attempt to handle more, you forget or confuse things. A complete
call graph has too much information present and so does not provide a useful abstract
overview.
You could break the graph down into subparts, separating PlayGame's call graph
from the overall program call graph. But such graphs still fail to convey any idea of the
looping structure (the fact that some functions are called once, others many times).
Overall, call graph diagrams aren't a particularly useful form of documentation.
Sometimes, "flowcharts" are requested rather than pseudo-code outlines for
functions. "Flowcharts" are old; they are contemporary with early versions of
FORTRAN. Flowcharting symbols correspond pretty much to the basic programming
constructs of that time, and lack convenient ways of representing multiway selections
(switch() statements etc).
At one time, complete programs were diagrammed using flowcharts. Such program
flowcharts aren't that helpful, for the same reason that full call graphs aren't that helpful.
There is too much detail in the diagram, so it doesn't effectively convey a program's
structure.
It is possible to "flowchart" individual functions, an example in flowcharting style is
shown in Figure 15.4. Most people find pseudo-code outlines easier to understand.
Documenting a design 457
Frame window
Pick word
Show guess
initialize counts
etc
Is game
over?
Get guessed character
Check for match
Match any?
ShowGuess
increment count of matched
Show Cartoon Part
increment count of errors
work out setting of game-
over flag
No Yes
No Yes
Figure 15.4 "Flowchart" representation, an alternative to a pseudo-code outline of a
function.
On the whole, diagrams aren't that helpful. The lists of functions, the pseudo-code
outlines etc are easier to understand and so it isn't worth trying to generate
diagrammatic representations of the same information.
However, you may have access to a CASE ("Computer Assisted Software
Engineering") program. These programs have a user interface somewhat similar to a
drawing package. There is a palette of tools that you can use to place statement
sequences, conditional tests, iterative constructs etc. Dialogs allow you to enter details
of the operations involved.
The CASE tool can generate the final function prototypes, and the code for their
implementation, from the information that you provide in these dialogs. If you use such
a program, you automatically get a diagrammatic representation of your program along
with the code.
458 Design and documentation

Tools C++

14 Tools
This chapter introduces two powerful support tools.
14
14.1 THE "CODE COVERAGE" TOOL
A "code coverage" tool helps you determine whether you have properly tested your
program. Think about all the if … else … and switch() statements in your code.
Each involves optionally executed code, with calls to other functions where there are
further conditional constructs. These conditional constructs mean that there can be a
very large number of different execution paths through your code.
You can never be certain that you have tested all paths through a complex program;
so there will always be chances that some paths have errors where data are combined
incorrectly. But if you have never tested certain paths, there can be gross errors that
will crash your program.
The simple programs that we have considered so far don't really justify the use of a
code coverage tool; after all, things simulating the π-cannon don't involve that many
choices and there is very little dependence on different inputs. But as you move to
more complex things, like some of the "tree algorithms" in Part IV, code coverage tools
become more and more necessary. The "tree algorithms" build up complicated data
structures that depend on the input data. Rules implemented in the algorithms define
how these structures are to change as new data elements get added. Some of these rules
apply to relatively rare special cases that occur only for specific combinations of data
values. The code for the "tree algorithms" has numerous paths, including those for
these rare special cases.
How can you test all paths? Basically, you have to chose different input data so as
to force all processing options to be executed. Choosing the data is sometimes hard;
you may think that you have included examples that cover all cases, but it is difficult to
be certain.
This is where code coverage tools get involved. The basic idea is that the compiler
and run-time system should help you check that your tests have at least executed the
Role of a code
coverage tool
446 Tools
code on every conditional branch of your program. Unfortunately, a "code coverage"
tool is not yet a standard part of the IDEs available for personal computers. The
example in this section uses the "tcov" tool on Sun Unix.
Special compile time options have to be specified when preparing a program for this
form of analysis. These options direct the compiler to add extra instructions to those
that would be generated from the program's source code. As well as extra instructions
embedded in the code, the compiler adds an extra data table, a startup function and a
termination function. The extra instructions, and related components, arrange for
counts to be kept of the number of times that each conditional block is executed, when
the program terminates these counts are saved to a data file.
You can imagine the process as being one where the compiler converts the following
simple program:
#include
int main()
{
cout << "Enter Number ";
int num;
cin >> num;
if(num >= 0)
cout << "that was positive" << endl;
else
cout << "that was negative" << endl;
return 0;
}
into an "instrumented" version:
#include
int __counters[3];
extern void __InitializeCounters(int d[], int n);
extern void __SaveCountersToFile(int d[], int n);
int main()
{
__InitializeCounters(__counters, 3);
__counters[0]++;
cout << "Enter Number ";
int num;
cin >> num;
if(num >= 0) {
__counters[1]++;
cout << "that was positive" << endl;
}
else {
__counters[2]++;
cout << "that was negative" << endl;
Compiler adds record
keeping code
Code Coverage Tool 447
}
__SaveCountersToFile(__counters, 3);
return 0;
}
The "instrumented" program will run just like the original, except that it saves the
counts to a file before terminating. In a properly implemented system, each time the
program is run the new counts are added to those in any existing record file.
When you have run your program several times on different data, you get the
analysis tool to interpret the contents of the file with the counts. This analysis tool
combines the counts data with the original source code to produce a listing that
illustrates the number of times each different conditional block of code has been
executed. Sections of code that have never been executed are flagged, warning the
tester that checks are incomplete. The following output from the Unix tcov tool run on
a version of the Quicksort program from Chapter 13.
const int kSMALL_ENOUGH = 15;
const int kBIG = 50000;
int data[kBIG];
void SelectionSort(int data[], int left, int right)
6246 -> {
for(int i = left; i < right; i++) {
43754 -> int min = i;
for(int j=i+1; j<= right; j++)
248487 -> if(data[j] < data[min]) min =
35251 -> int temp = data[min];
43754 -> data[min] = data[i];
data[i] = temp;
}
6246 -> }
int Partition( int d[], int left, int right)
6245 -> {
int val =d[left];
int lm = left-1;
int rm = right+1;
for(;;) {
152418 -> do
rm--;
518367 -> while (d[rm] > val);
152418 -> do
lm++;
412418 -> while( d[lm] < val);
152418 -> if(lm146173 -> int tempr = d[rm];
d[rm] = d[lm];
d[lm] = tempr;
}
Analysis tool
448 Tools
else
6245 -> return rm;
##### -> }
##### -> }
void Quicksort( int d[], int left, int right)
12491 -> {
if(left < (right-kSMALL_ENOUGH)) {
6245 -> int split_pt = Partition(d,left,
Quicksort(d, left, split_pt);
Quicksort(d, split_pt+1, right);
}
6246 -> else SelectionSort(d, left, right);
6246 -> }
int main()
1 -> {
int i;
long sum = 0;
for(i=0;i 50000 -> sum += data[i] = rand() % 15000;
50000 -> Quicksort(data, 0, kBIG-1);
1 -> int last = -1;
long sum2 = 0;
for(i = 0; i < kBIG; i++)
50000 -> if(data[i] < last) {
##### -> cout << "Oh ....; the data
cout << "Noticed the problem at
exit(1);
}
50000 -> else {
sum2 += last = data[i];
}
50000 -> if(sum != sum2) {
##### -> cout << "Oh ...; we seem to have
}
1 -> return 0;
All the parts of the program that we want executed have indeed been executed.
For real programs, code coverage tools are an essential part of the test and
development process.
14.2 THE PROFILER
A code coverage tool helps you check that you have tested your code, a "Profiler" helps
you make your code faster.
Profiler 449
First some general advice on speeding up programs:
General advice 1:
1. Make it work.
2. Make it fast.
Step 2 is optional.
General advice 2:
The slow bit isn't the bit you thought would be slow.
General advice 3:
Fiddling with "register" declarations, changing from arrays to pointers and general
hacking usually makes not one measurable bit of difference.
General advice 4
The only way to speed something up significantly will involve a fundamental change
of algorithm (and associated data structure).
General advice 5
Don't even think about making changes to algorithms until you have acquired some
detailed timing statistics.
Most environments provide some form of "profiling" tool that can be used to get the
timing statistics that show where a program really is spending its time. To use a
profiling tool, a program has to be compiled with special compile-time flags set. When
these flags are set, the compiler and link-loader set up auxiliary data structures that
allow run time addresses to be mapped back to source code. They also arrange for
some run-time component that can be thought of as regularly interrupting a running
program and asking where its program counter (PC) is. This run-time component takes
the PC value and works out which function it corresponds to and updates a counter for
that function. When the program finishes, the counts (and mapping data) are saved to
file. The higher the count associated with a particular bit of code, the greater the time
spent in that code.
A separate utility program can take the file with counts, and the source files and
produce a summary identifying which parts of a program use most time. The following
data were obtained by using the Unix profiler to analyze the performance of the
extended Quicksort program:
450 Tools
%Time Seconds Cumsecs Name
44.7 0.42 0.42 __0FJPartitionPiiTC
23.4 0.22 0.64 __0FNSelectionSortPiiTC
11.7 0.11 0.75 main
6.4 0.06 0.81 .rem
6.4 0.06 0.87 .mul
1.1 0.01 0.94 rand
Both the Symantec and the Borland IDEs include profilers that can provide information
to that illustrated.
The largest portion of this program's time is spent in the Partition() function, the
Quicksort() function itself doesn't even register because it represents less than 1% of
the time. (The odd names, like __0FNSelectionSortPiiTC are examples of
"mangled" names produced by a C++ compiler.)
Once you know where the program spends its time, then you can start thinking about
minor speed ups (register declarations etc) or major improvements (better algorithms).