This is a C Programming Tutorial for people who have a little experience with an interpreted programming language, such as Emacs Lisp or a GNU shell.
Edition 4.02
Copyright © 1987,1999 Mark Burgess
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Every program is limited by the language which is used to write it. C is a programmer's language. Unlike BASIC or Pascal, C was not written as a teaching aid, but as an implementation language. C is a computer language and a programming tool which has grown popular because programmers like it! It is a tricky language but a masterful one. Sceptics have said that it is a language in which everything which can go wrong does go wrong. True, it does not do much hand holding, but also it does not hold anything back. If you have come to C in the hope of finding a powerful language for writing everyday computer programs, then you will not be disappointed. C is ideally suited to modern computers and modern programming.
This book is a tutorial. Its aim is to teach C to a beginner, but with enough of the details so as not to be outgrown as the years go by. It presumes that you have some previous acquaintance with programming -- you need to know what a variable is and what a function is -- but you do not need much experience. It is not essential to follow the order of the chapters rigorously, but if you are a beginner to C it is recommended. When it comes down to it, most languages have basically the same kinds of features: variables, ways of making loops, ways of making decisions, ways of accessing files etc. If you want to plan your assault on C, think about what you already know about programming and what you expect to look for in C. You will most likely find all of those things and more, as you work through the chapters.
The example programs range from quick one-function programs,
which do no more than illustrate the sole use of one simple feature,
to complete application examples occupying several pages. In places
these examples make use of features before they have properly been
explained. These programs serve as a taster of what is to come.
Mark Burgess. 1987, 1999
This book was first written in 1987; this new edition was updated and rewritten in 1999. The book was originally published by Dabs Press. Since the book has gone out of print, David Atherton of Dabs and I agreed to release the manuscript, as per the original contract. This new edition is written in Texinfo, which is a documentation system that uses a single source file to produce both on-line information and printed output. You can read this tutorial online, using either the Emacs Info reader, the standalone Info reader, or a World Wide Web browser, or you can read this same text as a typeset, printed book.
What is C? What is it for? Why is it special?
Any kind of object that is sufficiently complicated can be thought of as having levels of detail; the amount of detail we see depends upon how closely we scrutinize it. A computer falls definitely into the category of complex objects and it can be thought of as working at many different levels. The terms low level and high level are often used to describe these onion-layers of complexity in computers. Low level is perhaps the easiest to understand: it describes a level of detail which is buried down amongst the working parts of the machine: the low level is the level at which the computer seems most primitive and machine-like. A higher level describes the same object, but with the detail left out. Imagine stepping back from the complexity of the machine level pieces and grouping together parts which work together, then covering up all the details. (For instance, in a car, a group of nuts, bolts, pistons can be grouped together to make up a new basic object: an engine.) At a high level a computer becomes a group of black boxes which can then be thought of as the basic components of the computer.
C is called a high level, compiler language. The aim of any high
level computer language is to provide an easy and natural way of giving
a programme of instructions to a computer (a computer program). The
language of the raw computer is a stream of numbers called machine
code. As you might expect, the action which results from a single
machine code instruction is very primitive and many thousands of them
are required to make a program which does anything substantial. It is
therefore the job of a high level language to provide a new set of
black box instructions, which can be given to the computer without us
needing to see what happens inside them - and it is the job of a
compiler to fill in the details of these "black boxes" so that the final
product is a sequence of instructions in the language of the computer.
C is one of a large number of high level languages which can be
used for general purpose programming, that is, anything from writing
small programs for personal amusement to writing complex applications. It
is unusual in several ways. Before C, high level languages were
criticized by machine code programmers because they shielded the user
from the working details of the computer, with their black box approach,
to such an extent that the languages became inflexible: in other words,
they did not allow programmers to use all the facilities which the
machine has to offer. C, on the other hand, was designed to give access
to any level of the machine down to raw machine code and because of this
it is perhaps the most flexible of all high level languages.
Surprisingly, programming books often ignore an important role of
high level languages: high level programs are not only a way to express
instructions to the computer, they are also a means of communication
among human beings. They are not merely monologues to the machine, they
are a way to express ideas and a way to solve problems. The C
language has been equipped with features that allow programs to be
organized in an easy and logical way. This is vitally important for
writing lengthy programs because complex problems are only manageable
with a clear organization and program structure. C allows meaningful
variable names and meaningful function names to be used in programs
without any loss of efficiency and it gives a complete freedom of style;
it has a set of very flexible loop constructions (for,
while, do) and neat ways of making decisions. These
provide an excellent basis for controlling the flow of programs.
Another unusual feature of C is the way it can express ideas concisely. The richness of a language shapes what it can talk about. C gives us the apparatus to build neat and compact programs. This sounds, first of all, either like a great bonus or something a bit suspect. Its conciseness can be a mixed blessing: the aim is to try to seek a balance between the often conflicting interests of readability of programs and their conciseness. Because this side of programming is so often presumed to be understood, we shall try to develop a style which finds the right balance.
C allows things which are disallowed in other languages: this is no defect, but a very powerful freedom which, when used with caution, opens up possibilities enormously. It does mean however that there are aspects of C which can run away with themselves unless some care is taken. The programmer carries an extra responsibility to write a careful and thoughtful program. The reward for this care is that fast, efficient programs can be produced.
C tries to make the best of a computer by linking as closely as possible to the local environment. It is no longer necessary to put up with hopelessly inadequate input/output facilities (a legacy of the timesharing/mainframe computer era): one can use everything that a computer has to offer. Above all it is flexible. Clearly no language can guarantee intrinsically good programs: there is always a responsibility on the programmer, personally, to ensure that a program is neat, logical and well organized, but C can give a framework in which it is easy to do so.
The aim of this book is to convey some of the C philosophy in a practical way and to provide a comprehensive introduction to the language by appealing to a number of examples and by sticking to a strict structuring scheme. It is hoped that this will give a flavour of the kind of programming which C encourages.
What to do with a compiler. What can go wrong.
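Using a compiler language is not the same as using an interpreted language like BASIC or a GNU shell. It differs in a number of ways. To begin with, a C program has to be created in two stages: first the source text is typed in with an editor, and then it is translated by a compiler into a form which the computer can execute.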
Compiler languages do not usually contain their own editor, nor do they
have words like RUN with which to execute a finished program. You
use a screen editor to create the words of a program (program text) and
run the final program in its compiled form usually by simply typing the
name of the executable file.
A C program is made by running a compiler which takes the
typed source program and converts it into an object file that the
computer can execute. A compiler usually operates in two or more phases
(and each phase may have stages within it). These phases must be
executed one after the other. As we shall see later, this approach
provides a flexible way of compiling programs which are split into many
files.
A two-phase compiler works in the following way:
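Phase 1 compiles each source file into a file of object code, which is not yet executable. With the GNU C compiler this phase is run with gcc -c; the output is one or more .o files.
Phase 2 links the object files with library code to make a complete executable program. With the GNU C compiler this phase is run with gcc -o or ld.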
To avoid the irritation of typing two or three separate commands (which
are often cumbersome) you will normally find a simple interface
for executing the compiler. Traditionally this is an executable
program called cc for C Compiler:
cc filename
gcc filename
On GNU systems, this results in the creation of an executable
program with the default name a.out. To tell the compiler
what you would like the executable program to be called,
use the -o option for setting the name of the object code:
gcc -o program-name filename
For example, to create a program called myprog from a file
called myprog.c, write
gcc -o myprog myprog.c
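Errors are mistakes which we the programmers make. There are different kinds of error. The first kind which most beginners meet is the syntax error: a statement which breaks the grammatical rules of C.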
For example, suppose you write sin (x) y = ; in a program instead
of y = sin (x);, which assigns the value of the sine of x
to y. Upon compilation, you would see this error message:
eg.c: In function `main':
eg.c:12: parse error before `y'
(If you compile the program in Emacs, you can jump directly to the error.)
A program with syntax errors will cause a compiler program to stop trying to generate machine code and will not create an executable. However, a compiler will usually not stop at the first error it encounters but will attempt to continue checking the syntax of a program right to the last line before aborting, and it is common to submit a program for compilation only to receive a long and ungratifying list of errors from the compiler.
It is a shock to everyone using a compiler for the first time how a single error can throw the compiler off course and result in a huge and confusing list of non-existent errors, following a single true culprit. The situation thus looks much worse than it really is. You'll get used to this with experience, but it can be very disheartening.
As a rule, look for the first error, fix that, and then
recompile. Of course, after you have become experienced, you will
recognize when subsequent error messages are due to independent problems
and when they are due to a cascade. But at the beginning, just look for
and fix the first error.
If the compilation of a program is successful, then a new file is created. This file will contain machine code which can be executed according to the rules of the computer's local operating system.
When a programmer wants to make alterations and corrections to a C program, these have to be made in the source text file itself using an editor; the program, or the salient parts, must then be recompiled.
One of the reasons why the compiler can fail to produce the executable file for a program is that you have mistyped something, even through the careless use of upper and lower case characters. The C language is case sensitive. Unlike languages such as Pascal and some versions of BASIC, the C compiler distinguishes between small letters and capital letters. This is a potential source of quite trivial errors which can be difficult to spot. If a letter of a name is typed in the wrong case, the compiler will usually complain and it will not produce an executable program.
Compiler languages require us to make a list of the names and types of all variables which are going to be used in a program and provide information about where they are going to be used. This is called declaring variables. It serves two purposes: firstly, it provides the compiler with a definitive list of the variables, enabling it to cross check for errors, and secondly, it informs the compiler how much space must be reserved for each variable when the program is run. C supports a variety of variable types (variables which hold different kinds of data) and allows one type to be converted into another. Consequently, the type of a variable is of great importance to the compiler. If you fail to declare a variable, the compiler will report an error; if you declare it to be the wrong type, you will often see a compilation error or warning.
C programs are constructed from a set of reserved words which provide
control and from libraries which perform special functions. The basic
instructions are built up using a reserved set of words, such as
for, if, while, default,
double, extern, and int, to name just a
few. These words may not be used in just any old way: C demands that
they are used only for giving commands or making statements. You cannot
use default, for example, as the name of a variable. An attempt
to do so will result in a compilation error.
See All the Reserved Words, for a complete list of the reserved words.
Words used in included libraries are also, effectively, reserved. If you use a word which has already been adopted in a library, there will be a conflict between your choice and the library.
Libraries provide frequently used functionality and, in practice,
at least one library must be included in every program: the so-called C
library, of standard functions. For example, the stdio library,
which is part of the C library, provides standard facilities for input
to and output from a program.
In fact, most of the facilities which C offers are provided as libraries that are included in programs as plug-in expansion units. While the features provided by libraries are not strictly a part of the C language itself, they are essential and you will never find a version of C without them. After a library has been included in a program, its functions are defined and you cannot use their names for anything else.
The printf() function
One invaluable function provided by the standard input/output
library is called printf or `print-formatted'. It provides a
superbly versatile way of printing text. The simplest way to use it is
to print out a literal string:
printf ("..some string...");
Text is easy, but we also want to be able to print out the contents of
variables. These can be inserted into a text string by using a `control
sequence' inside the quotes and listing the variables after the string;
each variable is inserted into the string in place of its control
sequence. To print out an integer, the control sequence %d is used:
printf ("Integer = %d",someinteger);
The variable someinteger is printed instead of %d. The
printf function is described in full detail in the relevant
chapter, but we'll need it in many places before that. The example
program below is a complete program. If you are reading this in Info,
you can copy this to a file, compile and execute it.
/***********************************************************/
/* Short Poem */
/***********************************************************/
#include <stdio.h>
/***********************************************************/
int main ()    /* Poem */
{
  printf ("Astronomy is %dderful \n",1);
  printf ("And interesting %d \n",2);
  printf ("The ear%d volves around the sun \n",3);
  printf ("And makes a year %d you \n",4);
  printf ("The moon affects the sur %d heard \n",5);
  printf ("By law of phy%d great \n",6);
  printf ("It %d when the stars so bright \n",7);
  printf ("Do nightly scintill%d \n",8);
  printf ("If watchful providence be%d \n",9);
  printf ("With good intentions fraught \n");
  printf ("Should not keep up her watch divine \n");
  printf ("We soon should come to %d \n",0);
  return 0;
}
The output of this program is:
Astronomy is 1derful
And interesting 2
The ear3 volves around the sun
And makes a year 4 you
The moon affects the sur 5 heard
By law of phy6 great
It 7 when the stars so bright
Do nightly scintill8
If watchful providence be9
With good intentions fraught
Should not keep up her watch divine
We soon should come to 0
Where is a C program born? How is it created?
The basic control of a computer rests with its operating system. This is a layer of software which drives the hardware and provides users with a comfortable environment in which to work. An operating system has two main components which are of interest to users: a user interface (often a command language) and a filing system. The operating system is the route to all input and output, whether it be to a screen or to files on a disk. A programming language has to get at this input and output easily so that programs can send out and receive messages from the user and it has to be in contact with the operating system in order to do this. In C the link between these two is very efficient.
Operating systems vary widely but most have a command language or shell which can be used to type in commands. Recently the tendency has been to try to eliminate typing completely by providing graphical user interfaces (GUIs) for every purpose. GUIs are good for carrying out simple procedures like editing, but they are not well suited to giving complicated instructions to a computer. For that one needs a command language. In the network version of this book we shall concentrate on Unix shell commands since they are the most important to programmers. On microcomputers command languages are usually very similar in concept, though more primitive, with only slightly different words for essentially the same commands. (This is a slightly superficial view).
When most compiler languages were developed, they were intended to be run on large mainframe computers which operated on a multi-user, time-sharing principle and were incapable of interactive communication with the user. Many compiler languages still have this inadequacy when carried over to modern computers, but C is an exception, because of its unique design. Input and output are not actually defined as a fixed, unchanging part of the C language. Instead there is a standard file which has to be included in programs and defines the input/output commands that are supported by the language for a particular computer and operating system. This file is called a standard C library. (See the next chapter for more information.) The library is standard in the sense that C has developed a set of functions which all computers and operating systems must implement, but which are specially adapted to your system.
The filing system is also a part of input/output. In many operating systems all routes in and out of the computer are treated by the operating system as though they were files or data streams (even the keyboard!). C does this implicitly (it comes from Unix). The file from which C normally gets its input is called stdin or standard input file and it is usually the keyboard. The corresponding route for output is called stdout or standard output file and is usually a monitor screen. Both of these are parts of stdio or standard input output. The keyboard and the monitor screen are not really files, of course, they are `devices' (it is not possible to re-read what has been sent to the monitor, or write to the keyboard), but devices are represented by files with special names, so that the keyboard is treated as a read-only file and the monitor as a write-only file. The advantage of treating devices like this is that it is not necessary to know how a particular device works, only that it exists somewhere, connected to the computer, and can be written to or read from. In other words, it is exactly the same to read or write from a device as it is to read or write from a file. This is a great simplification of input/output! The filenames of devices (often given the lofty title `pseudo device names') depend upon your particular operating system. For instance, the printer might be called "PRN" or "PRT". You might have to open it explicitly as a file. When input is taken solely from the keyboard and output is always to the screen then these details can just be forgotten.
The compiler uses a special convention for the file names, so that we do
not confuse their contents.
The name of a source program (the code which you write) is
filename.c. The compiler generates a file of object code from
this called filename.o, as yet unlinked. The final program,
when linked to libraries is called filename on Unix-like
operating systems, and filename.EXE on Windows derived
systems. The libraries themselves are also files of object code, typically
called liblibraryname.a or liblibraryname.so.
Header files are always called libname.h.
The endings `dot something' (called file extensions) identify the
contents of files for the compiler. The dotted endings mean that the
compiler can generate an executable file with the same name as the
original source - just a different ending. Any intermediate file and the object
file are only working files and should be deleted by the compiler at the
end of compilation. The .c suffix is to tell the compiler that
the file contains a C source program and similarly the other letters
indicate non-source files in a convenient way. To execute the compiler
you type,
cc filename
For example,
cc foo.c
In order to do anything with a compiler or an editor you need to know a little about the command language of the operating system. This means the instructions which can be given to the system itself rather than the words which make up a C program. e.g.
ls -l
less filename
emacs filename
In a large operating system (or even a relatively small one) it can be a major feat of recollection to know all of the commands. Fortunately it is possible to get by with knowing just a handful of the most common ones and having the system manual around to leaf through when necessary.
Another important object is the `panic button' or program interruption key. Every system will have its own way of halting or terminating the operation of a program or the execution of a command. Commonly this will involve two simultaneous key presses, such as CTRL-C, CTRL-Z or CTRL-D. In GNU/Linux, CTRL-C is used.
Plug-in C expansions. Header files.
The core of the C language is small and simple. Special functionality is provided in the form of libraries of ready-made functions. This is what makes C so portable. Some libraries are provided for you, giving you access to many special abilities without needing to reinvent the wheel. You can also make your own, but to do so you need to know how your operating system builds libraries. We shall return to this later.
Libraries are files of ready-compiled code which we can merge with a C
program at compilation time. Each library comes with a number of
associated header files which make the functions easier to use.
For example, there are libraries of mathematical functions, string
handling functions and input/output functions and graphics libraries. It
is up to every programmer to make sure that libraries are added at
compilation time by giving an extra option to the compiler. For
example, to merge with the math library libm.a you would type
cc -o program_name prog.c -lm
when you compile the program.
The -lm means: add in libm. If we wanted to add in the socket
library libsocket.a to do some network programming as well, we
would type
cc -o program_name prog.c -lm -lsocket
and so on.
Why are these libraries not just included automatically? Because it would be a waste for the compiler to add on lots of code for maths functions, say, if they weren't needed. When library functions are used in programs, the appropriate library code is included by the compiler, making the resulting object code often much longer.
Libraries are supplemented by header files which define macros, data types and external data to be used in conjunction with the libraries. Once a header file has been included, it has effectively added to the list of reserved words and commands in the language. You cannot then use the names of functions or macros which have already been defined in libraries or header files to mean anything other than what the library specifies.
The most commonly used header file is that of the standard input/output
library, which is called stdio.h. This belongs to a subset of the standard
C library which deals with file handling. The math.h header file
belongs to the mathematics library libm.a. Header files for
libraries are included by adding to the source code:
#include <header.h>
at the top of a program file. For instance:
#include "myheader.h"
includes a personal header file which is in the current directory. Or
#include <stdio.h>
includes a file which lies in a standard directory like /usr/include.
The #include directive is actually a command to the C
preprocessor, which is dealt with more fully later; see Preprocessor.
Some functions can be used without having to include library files or special
libraries explicitly since every program is always merged with the
standard C library, which is called libc.
#include <stdio.h>
int main ()
{
  printf ("C standard I/O file is included\n");
  printf ("Hello world!\n");
  return 0;
}
A program wishing to use a mathematical function such as sin
would need to include a mathematics library header file.
#include <stdio.h>
#include <math.h>
int main ()
{
  double x = 0.5, y;    /* x must be given a value before it is used */
  y = sin (x);
  printf ("Maths library ready\n");
  return 0;
}
A particular operating system might require its own special library for certain operations such as using a mouse or for opening windows in a GUI environment, for example. These details will be found in the local manual for a particular C compiler or operating system.
Although there is no limit, in principle, to the number of libraries which can be included in a program, there may be a practical limit: namely memory, since every library adds to the size of both source and object code. Libraries also add to the time it takes to compile a program. Some operating systems are smarter than others when running programs and can load in only what they need of the large libraries. Others have to load in everything before they can run a program at all, so many libraries would slow them down.
To know what names libraries have in a particular operating system you have to search through its documentation. Unix users are lucky in having an online manual which is better than most written ones.
The shape of programs to come.
C is actually a free format language. This means that there are no rules about how it must be typed, when to start new lines, where to place brackets or whatever. This has both advantages and dangers. The advantage is that the user is free to choose a style which best suits him or her and there is freedom in the way in which a program can be structured. The disadvantage is that, unless a strict style is adopted, very sloppy programs can be the result. A well structured style is what keeps long programs manageable and understandable.
No simple set of rules can ever provide the ultimate solution to writing good programs. In the end, experience and good judgement are the factors which decide whether a program is well or poorly written. The main goal of any style is to achieve clarity. Previously restrictions of memory size, power and of particular compilers often forced restrictions upon style, making programs cluttered and difficult. Most computers today are equipped with more than enough memory for their purposes, and compilers have very good optimizers which can produce faster code than most programmers could write themselves without help, so there are few good reasons not to make programs as clear as possible.
What goes into a C program? What will it look like?
C is made up entirely of building blocks which have a particular `shape'
or form. The form is the same everywhere in a program, whether it is the
form of the main program or of a subroutine. A program is made up of
functions; functions are made up of statements and declarations
surrounded by curly braces { }.
The basic building block in a C program is the function. Every C
program is a collection of one or more functions, written in some
arbitrary order. One and only one of these functions in the program must
have the name main(). This function is always the starting point of a C
program, so the simplest C program would be just a single function
definition:
int main ()
{
}
The parentheses () which follow the name of the function must be
included even though they apparently serve no purpose at this
stage. This is how C distinguishes functions from ordinary variables.
The function main() does not have to be at the top of a program
so a C program does not necessarily start at line 1. It always starts
where main() is. Although C does not actually forbid calling main() from
another function in the program, this is almost never done. Normally only
the operating system calls the function main(): this is how a C program
is started.
The next simplest C program is perhaps a program which calls a
function do_nothing and then ends.
/******************************************************/
/* */
/* Program : do nothing */
/* */
/******************************************************/
void do_nothing ();   /* Tells the compiler the function exists */
int main ()           /* Main program */
{
  do_nothing();
}
/******************************************************/
void do_nothing ()    /* Function called */
{
}
The program now consists of two functions, one of which is called by the
other. There are several new things to notice about this
program. Firstly the function do_nothing() is called by typing
its name followed by the characteristic () brackets and a
semi-colon. This is all that is required to transfer control to the new
function. In some languages, words like CALL or PROC are used, or even
a symbol like &. No such thing is needed in C.
The semi-colon is vital however. All instructions in C must end with a
semi-colon. This is a signal to inform the compiler that the end of a
statement has been reached and that anything which follows is meant to
be a part of another statement. This helps the compiler diagnose errors.
The `brace' characters { and } mark out a block
into which instructions are written. When the program meets the closing
brace } of do_nothing() it transfers back to main() where it meets
another } brace and the program ends. This is the simplest way
in which control flows between functions in C. All functions have the
same status as far as a program is concerned. The function main()
is treated just as any other function. When a program is compiled, each
function is compiled as a separate entity and then at the end the linker
phase in the compiler attempts to sew them all together.
The examples above are obviously very simple but they illustrate how control flows in a C program. Here are some more basic elements which we shall cover.
The skeleton plan of a program, shown below, helps to show how the elements of a C program relate. The following chapters will then expand upon this as a kind of basic plan.
/****************************************************/
/* */
/* Skeleton program plan */
/* */
/****************************************************/
#include <stdio.h> /* Preprocessor defns */
#include <myfile.c>
#define SCREAM "arghhhhh"
#define NUMBER_OF_BONES 123
/****************************************************/
int function1 ();                /* Function declarations */
void function2 (int a, int b);
/****************************************************/
int main ()                      /* Main program & start */
{
  int a,b;                       /* declaration */
  a=random();
  b=function1();
  function2(a,b);
}
/****************************************************/
int function1 ()                 /* Purpose */
{
  ....
}
/****************************************************/
void function2 (int a, int b)    /* Purpose */
{
  ....
}
Neither comments nor preprocessor commands have a special place in this
list: they do not have to be in any one particular place within the
program.
Annotating programs.
Comments are a way of inserting remarks and reminders into a program without affecting its content. Comments do not have a fixed place in a program: the compiler treats them as though they were white space or blank characters and they are consequently ignored. Programs can contain any number of comments without losing speed. This is because comments are stripped out of a source program by the compiler when it converts the source program into machine code.
Comments are marked out or delimited by the following pairs of characters:
/* ...... comment ......*/
Because a comment is skipped over as though it were a single space, it can be placed anywhere where spaces are valid characters, even in the middle of a statement, though this is not to be encouraged. You should try to minimize the use of comments in a program while trying to maximize the readability of the program. If there are too many comments you obscure your code and it is the code which is the main message in a program.
main () /* The almost trivial program */
{
/* This little line has no effect */
/* This little line has none */
/* This little line went all the way down
to the next line */
/* And so on ... */
}
#include <stdio.h> /* header file */
#define NOTFINISHED 0
/**********************************************/
/* A bar like the one above can be used to */
/* separate functions visibly in a program */
main ()
{ int i; /* declarations */
do
{
/* Nothing !!! */
}
while (NOTFINISHED);
}
What happens if a programmer starts a comment with /* but forgets the */ to close it?
Making black boxes. Solving problems. Getting results.
A function is a module or block of program code which deals with a particular task. Making functions is a way of isolating one block of code from other independent blocks of code. Functions serve two purposes. They allow a programmer to say: `this piece of code does a specific job which stands by itself and should not be mixed up with anything else', and they make a block of code reusable since a function can be reused in many different contexts without repeating parts of the program text.
Functions help us to organize a program in a simple way; in Kernighan & Ritchie C they are always written in the following form:
identifier (parameter1,parameter2,..)
types of parameters
{ variable declarations
statements..
......
....
}
For example
Pythagoras(x,y,z)
double x,y,z;
{ double d;
d = sqrt(x*x+y*y+z*z);
printf("The distance to your point was %f\n",d);
}
In the newer ANSI standard, the same function is written slightly differently:
Pythagoras(double x, double y, double z)
{ double d;
d = sqrt(x*x+y*y+z*z);
printf("The distance to your point was %f\n",d);
}
You will probably see both styles in C programs.
Each function has a name or identifier by which it is referred to
in a program. A function can accept a number of parameters or values
which pass information from outside, and consists of a number
of statements and declarations, enclosed by curly braces { }, which
make up the doing part of the object. The declarations and `type of
parameter' statements are formalities which will be described in good
time.
The name of a function in C can be anything from a single letter to a
long word. The name of a function must begin with an alphabetic letter
or the underscore _ character but the other characters in the
name can be chosen from the following groups:
a .. z
A .. Z
0 .. 9
_
This means that sensible names can easily be chosen for functions
making a program easy to read. Here is a real example function which adds
together two integer numbers a and b and prints the result
c. All the variables are chosen to be integers to keep things
simple and the result is printed out using the print-formatted function
printf, from the standard library, with a "%d" to indicate that it
is printing an integer.
Add_Two_Numbers (a,b) /* Add a and b */
int a,b;
{ int c;
c = a + b;
printf ("%d",c);
}
Notice the position of the function name and where braces and semi-colons are placed: they are crucial. The details are quickly learned with practice and experience.
This function is not much use standing alone. It has to be called from somewhere. A function is called (i.e. control is passed to the function) by using its name with the usual brackets () to follow it, along with the values which are to be passed to the function:
main ()
{ int c,d;
c = 1;
d = 53;
Add_Two_Numbers (c,d);
Add_Two_Numbers (1,2);
}
The result of this program would be to print out the number 54 and then the number 3 and then stop. Here is a simple program which makes use of some functions in a playful way. The structure diagram shows how this can be visualized and the significance of the program `levels'. The idea is to illustrate the way in which the functions connect together:
Level 0: main ()
|
Level 1: DownOne ()
/ \
/ \
Level 2: DownLeft() DownRight()
Note: not all functions fit into a tidy hierarchy like these. Some functions
call themselves, while others can be called from anywhere in a program.
Where would you place the printf function in this hierarchy?
/***********************************************/
/* */
/* Function Snakes & Ladders */
/* */
/***********************************************/
#include <stdio.h>
/***********************************************/
/* Level 0 */
/***********************************************/
main ()
{
printf ("This is level 0: the main program\n");
printf ("About to go down a level \n");
DownOne ();
printf ("Back at the end of the start!!\n");
}
/************************************************/
/* Level 1 */
/************************************************/
DownOne () /* Branch out! */
{
printf ("Down here at level 1, all is well\n");
DownLeft (2);
printf ("Through level 1....\n");
DownRight (2);
printf ("Going back up a level!\n");
}
/************************************************/
/* Level 2 */
/************************************************/
DownLeft (a) /* Left branch */
int a;
{
printf ("This is deepest level %d\n",a);
printf ("On the left branch of the picture\n");
printf ("Going up!!");
}
/************************************************/
DownRight (a) /* Right branch */
int a;
{
printf ("And level %d again!\n",a);
}
In other languages and in mathematics a function is understood to be something which produces a value or a number. That is, the whole function is thought of as having a value. In C it is possible to choose whether or not a function will have a value. It is possible to make a function hand back a value to the place at which it was called. Take the following example:
bill = CalculateBill(data...);
The variable bill is assigned the value returned by the function
CalculateBill() and data are some data which are passed to
the function. This statement makes it look as though
CalculateBill() is a number. When this statement is executed in a
program, control will be passed to the function CalculateBill()
and, when it is done, this function will then hand control back. The
value of the function is assigned to "bill" and the program continues.
Functions which work in this way are said to return a value.
In C, returning a value is a simple matter. Consider the function CalculateBill() from the statement above:
CalculateBill(starter,main,dessert) /* Adds up values */
int starter,main,dessert;
{ int total;
total = starter + main + dessert;
return (total);
}
As soon as the return statement is met CalculateBill() stops
executing and assigns the value total to the function. If there were
no return statement the program could not know which value it should
associate with the name CalculateBill and so it would not be
meaningful to speak of the function as having one value. Forgetting a
return statement can ruin a program. For instance if CalculateBill
had just been:
CalculateBill (starter,main,dessert) /* WRONG! */
int starter,main,dessert;
{ int total;
total = starter + main + dessert;
}
then the value bill would just be garbage (no predictable value),
presuming that the compiler allowed this to be written at all. On the
other hand if the first version were used (the one which did use the
return(total) statement) and furthermore no assignment were made:
main ()
{
CalculateBill (1,2,3);
}
then the value of the function would just be discarded, quite
legitimately. This is usually what is done with the input output
functions printf() and scanf() which actually return
values. So a function in C can return a value but it does not have to be
used; on the other hand, a value which has not been returned cannot be
used safely.
NOTE : Functions do not have to return integers: you can decide whether they should return a different data type, or even no value at all. (See next chapter)
Suppose that a program is in the middle of some awkward process
in a function which is not main(), perhaps two or three loops working
together, for example, and suddenly the function finds its
answer. This is where the beauty of the return statement becomes
clear. The program can simply execute return(value) anywhere in the
function and control will jump out of any number of loops or whatever
and pass the value back to the calling statement without having to
finish the function up to the closing brace }.
myfunction (a,b) /* breaking out of functions early */
int a,b;
{
while (a < b)
{
if (a > b)
{
return (b);
}
a = a + 1;
}
}
The example shows this. The function is entered with some values for
a and b and, assuming that a is less than b,
it starts to execute one of C's loops called while. In that loop,
is a single if statement and a statement which increases a by one
on each loop. If a becomes bigger than b at any point the
return(b) statement gets executed and the function
myfunction quits, without having to arrive at the end brace
}, and passes the value of b back to the place it was
called.
exit() function
The function called exit() can be used to terminate a program at
any point, no matter how many levels of function calls have been
made. This is called with a return code, like this:
#define CODE 0
exit (CODE);
This function also calls a number of other functions which perform tidy-up duties such as closing open files etc.
All the variables and values used up to now have been integers. But what happens if a function is required to return a different kind of value such as a character? A statement like:
bill = CalculateBill (a,b,c);
can only make sense if the variable bill and the value of the
function CalculateBill() are the same kind of object: in other
words if CalculateBill() returns a floating point number, then
bill cannot be a character! Both sides of an assignment must
match.
In fact this is done by declaring functions to return a particular type of data. So far no declarations have been needed because C assumes that all values are integers unless you specifically choose something different. Declarations are covered in the next section.
Storing data. Discriminating types. Declaring data.
A variable is a named piece of storage (its name is also called its
identifier). A name or identifier in C can be anything from a
single letter to a word. The name of a variable must begin with an
alphabetic letter or the underscore _ character but the other
characters in the name can be chosen from the following groups:
a .. z
A .. Z
0 .. 9
_
Some examples of valid variable names are:
a total Out_of_Memory VAR integer etc...
In C variables do not only have names: they also have types. The
type of a variable conveys to the compiler what sort of data
will be stored in it. In BASIC and in some older, largely obsolete
languages, like PL/1, a special naming convention is used to determine
the sort of data which can be held in particular variables. e.g. the
dollar symbol $ is commonly used in BASIC to mean that a variable is a
string and the percentage % symbol is used to indicate an integer. No
such convention exists in C. Instead we specify the types
of variables in their declarations. This serves two purposes:
There are many different possible types in C. In fact it is possible for us to define our own, but there is no need to do this right away: there are some basic types which are provided by C ready for use. The names of these types are all reserved words in C and they are summarized as follows:
char
short
short int
int
long
long int
float
long float
double
void
enum
volatile
There is some repetition in these words. In
addition to the above, the word unsigned can also be placed in
front of any of the integer types. Unsigned means that only positive or zero
values can be used (i.e. there is no minus sign). The advantage of
using this kind of variable is that storing a minus sign takes up some
memory, so that if no minus sign is present, larger numbers can be
stored in the same kind of variable. The ANSI standard also
allows the word signed to be placed in front of any of these types,
to indicate the opposite of unsigned. The integer types are signed
by default, except that on some systems a plain char is signed whereas
on others it is not.
To declare a variable in a C program one writes the type followed by a list of variable names which are to be treated as being that type:
typename variablename1,..,..,variablenameN;
For example:
int i,j;
char ch;
double x,y,z,fred;
unsigned long int Name_of_Variable;
Failing to declare a variable is more risky than passing through customs and failing to declare your six tonnes of Swiss chocolate. A compiler is markedly more efficient than a customs officer: it will catch a missing declaration every time and will terminate a compiling session whilst complaining bitterly, often with a host of messages, one for each use of the undeclared variable.
There are two kinds of place in which declarations can be made (see Scope). For now it will do to simply state what these places are.
Outside all of the functions (after the #include lines, for example).
Variables declared here are called global variables. They are also
called static and external variables in special cases.
#include <stdio.h>
int globalinteger; /* Here! outside {} */
float global_floating_point;
main ()
{
}
main ()
{ int a;
float x,y,z;
/* statements */
}
or
function ()
{ int i;
/* .... */
while (i < 10)
{ char ch;
int g;
/* ... */
}
}
When a variable is declared in C, the language allows a neat piece of syntax which means that variables can be declared and assigned a value in one go. This is no more efficient than doing it in two stages, but it is sometimes tidier. The following:
int i = 0;
char ch = 'a';
are equivalent to the more longwinded
int i;
char ch;
i = 0;
ch = 'a';
This is called initialization of the variables. C always allows the programmer to write declarations/initializers in this way, but it is not always desirable to do so. If there are just one or two declarations then this initialization method can make a program neat and tidy. If there are many, then it is better to initialize separately, as in the second case. A lot means when it starts to look as though there are too many. It makes no odds to the compiler, nor (ideally) to the final code whether the first or second method is used. It is only for tidiness that this is allowed.
char
A character type is a variable which can store a single ASCII
character. Groups of char form strings.
In C single characters are written enclosed by single
quotes, e.g. 'c'! (This is in contrast to strings of many characters which use
double quotes, e.g. "string") For instance, if ch is the name
of a character:
char ch;
ch = 'a';
would give ch the value of the character a. The same effect can also
be achieved by writing:
char ch = 'a';
A character can be any ASCII character, printable or not printable
from values -128 to 127. (But only 0 to 127 are used.) Control
characters i.e. non printable characters are put into programs by
using a backslash \ and a special character or number. The characters
and their meanings are:
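\b      backspace
\f      form feed
\n      new line (line feed)
\r      carriage return
\t      horizontal tab
\v      vertical tab
\"      double quote
\'      single quote
\\      backslash
\ddd    character with octal code ddd
\xddd   character with hexadecimal code ddd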
/***************************************************/
/* */
/* Special Characters */
/* */
/***************************************************/
#include <stdio.h>
main ()
{
printf ("Beep! \7 \n");
printf ("ch = \'a\' \n");
printf (" <- Start of this line!! \r");
}
The output of this program is:
Beep! (and the BELL sound )
ch = 'a'
<- Start of this line!!
and the text cursor is left where the arrow points. It is also possible to have the type:
unsigned char
This admits ASCII values from 0 to 255, rather than -128 to 127.
There are five integer types in C and they are called char, int,
long, long long and short. The difference between these is the size
of the integer which each can hold and the amount of storage required
for them. The sizes of these objects depend on the operating system of
the computer. Even different flavours of Unix can have varying sizes for
these objects. Usually, the two to remember are int and
short. int means a `normal' integer and short means
a `short' one, not that that tells us much. On a typical 32 bit
microcomputer the size of these integers is the following:
Type             Bits   Possible Values
short            16     -32768 to 32767
unsigned short   16     0 to 65535
int              32     -2147483648 to 2147483647
long             32     (ditto)
unsigned int     32     0 to 4294967295
long long        64     -9e18 to +8e18
Increasingly though, 64 bit operating systems are appearing and long integers are 64 bits long. You should always check these values. Some mainframe operating systems are completely 64 bit, e.g. Unicos has no 32 bit values. Variables are declared in the usual way:
int i,j;
i = j = 0;
or
short i=0,j=0;
There are also long and short floating point numbers in C.
All the mathematical functions which C can use require
double or long float arguments so it is common to use the type
float for storage only of small floating point numbers and to use
double elsewhere. (This is not always true since the C `cast' operator
allows temporary conversions to be made.) On a typical 32 bit
implementation the different types would be organized as follows:
Type          Bits   Possible Values
float         32     +/- 10E-37 to +/- 10E38
double        64     +/- 10E-307 to +/- 10E308
long float    32     (ditto)
long double   ???
Typical declarations:
float x,y,z;
x = 0.1;
y = 2.456E5;
z = 0;
double bignum,smallnum;
bignum = 2.36E208;
smallnum = 3.2E-300;
The sort of procedure that you would adopt when choosing variable names is something like the following:
Some local variables are only used temporarily, for controlling loops for instance. It is common to give these short names (single characters). A good habit to adopt is to keep to a consistent practice when using these variables. A common one, for instance is to use the letters:
int i,j,k;
to be integer type variables used for counting. (There is no particular reason why this should be; it is just common practice.) Other integer values should have more meaningful names. Similarly names like:
double x,y,z;
tend to make one think of floating point numbers.
Variables can be assigned to numbers:
var = 10;
and assigned to each other:
var1 = var2;
In either case the objects on either side of the = symbol should normally
be of the same type. If they are not, C converts the value to the type of
the variable on the left, so it is possible (though not usually sensible)
to assign a floating point number to a character for instance. So
int a, b = 1; a = b;
is a valid statement, and:
float x = 1.4; char ch; ch = x;
is a valid statement, since the truncated value 1 can be assigned to
ch. This is a questionable practice though. It is unclear why
anyone would choose to do this. Numerical values and characters will
interconvert because characters are stored by their ASCII codes (which
are integers!) Thus the following will work:
int i;
char ch = 'A';
i = ch;
printf ("The ASCII code of %c is %d",ch,i);
The result of this would be:
The ASCII code of A is 65
It is worth mentioning briefly a very valuable operator in C: it is called the cast operator and its function is to convert one type of value into another. For instance it would convert a character into an integer:
int i;
char ch = '\n';
i = (int) ch;
The value of the integer would be the ASCII code of the character. This is the only integer which it would make any sense to talk about in connection with the character. Similarly floating point and integer types can be interconverted:
float x = 3.3;
int i;
i = (int) x;
The value of i would be 3 because an integer cannot represent a
fractional part, so the cast operator truncates the number: it discards
the fraction rather than rounding to the nearest whole number. There is
no such problem the other way around.
float x;
int i = 12;
x = (float) i;
The general form of the cast operator is therefore:
(type) variable
It does not always make sense to convert types. This will be seen particularly with regard to structures and unions. Cast operators crop up in many areas of C. This is not the last time they will have to be explained.
/***************************************************/
/* */
/* Demo of Cast operator */
/* */
/***************************************************/
#include <stdio.h>
main () /* Use int float and char */
{ float x;
int i;
char ch;
x = 2.345;
i = (int) x;
ch = (char) x;
printf ("From float x =%f i =%d ch =%c\n",x,i,ch);
i = 45;
x = (float) i;
ch = (char) i;
printf ("From int i=%d x=%f ch=%c\n",i,x,ch);
ch = '*';
i = (int) ch;
x = (float) ch;
printf ("From char ch=%c i=%d x=%f\n",ch,i,x);
}
static and extern
Sometimes C programs are written in more than one text file. If this is the case then, on occasion, it will be necessary to get at variables which were defined in another file. If the word extern is placed in front of a variable then it can be referenced across files:
File 1                  File 2

int i;

main ()                 function ()
{                       {
                        extern int i;
}                       }
In this example, the function function() in file 2 can use the
variable i which is defined in file 1.
Another class is called static. The name static is given to
variables which can hold their values between calls of a function:
they are allocated once and once only and their values are preserved
between any number of function calls. Space is allocated for static
variables in the program code itself and it is never disposed of
unless the whole program is. NOTE: Every global variable, defined
outside functions, has static storage automatically. The opposite of
static is auto.
Functions do not always have to return values which are integers
despite the fact that this has been exclusively the case up to
now. Unless something special is done to force a function to return a
different kind of value C will always assume that the type of a
function is int.
If you want this to be different, then a function has to be declared to be a certain type, just as variables have to be. There are two places where this must be done:
float function1 ()
{
return (1.229);
}
A function which returns a character:
char function2 ()
{
return ('*');
}
If the two functions above were called from main(), they would have to be
declared in the variables section as:
main ()
{ char ch, function2 ();
float x, function1 ();
x = function1 ();
ch = function2 ();
}
If a function whose type is not integer is not declared like
this, then compilation errors will result! Notice also that the
function must be declared inside every function which calls it, not
just main().
Ralph23
80shillings
mission_control
A%
A$
_off
i and j.
float and double.
int and unsigned int?
long float, it must
be done in, at least, two places. Where are these?
Ways in and out of functions.
Not all functions will be as simple as the ones which have been given so
far. Functions are most useful if they can be given information to work
with and if they can reach variables and data which are defined outside
of them. Examples of this have already been seen in a limited way. For
instance the function CalculateBill accepted three values
a,b and c.
CalculateBill (a,b,c)
int a,b,c;
{ int total;
total = a + b + c;
return total;
}
When variable values are handed to a function, by writing them
inside a function's brackets like this, the function is said to accept
parameters. In mathematics a parameter is a variable which controls
the behaviour of something. In C it is a variable which carries some
special information. In CalculateBill the "behaviour" is the addition
process. In other words, the value of total depends upon the
starting values of a,b and c.
Parameters are about communication between different functions in a program. They are like messengers which pass information to and from different places. They provide a way of getting information into a function, but they can also be used to hand information back. Parameters are usually split into two categories: value parameters and variable parameters. Value parameters are one-way communication carrying information into a function from somewhere outside. Variable parameters are two-way.
A function was defined by code which looks like this:
identifier (parameters...)
types of parameters
{
}
Parameters, like variables and functions, also have types which must be declared. For instance:
function1 (i,j,x,y)
int i,j;
float x,y;
{
}
or
char function2 (x,ch)
double x;
char ch;
{ char ch2 = '*';
return (ch2);
}
Notice that they are declared outside the block braces.
A value parameter is the most common kind of parameter. All of
the examples up to now have been examples of value parameters.
When a value parameter passes information to a function its
value is copied to a new place which is completely isolated from the
place that the information came from. An example helps to show
this. Consider a function which is called from main() whose purpose is
to add together two numbers and to print out the result.
#include <stdio.h>
main ()
{
add (1,4);
}
/*******************************************/
add (a,b)
int a,b;
{
printf ("%d", a+b);
}
When this program is run, two new variables are automatically created by
the language, called a and b. The value 1 is copied into a
and the value 4 is copied into b. Obviously if a and b were given
new values in the function add() then this could not change the
values 1 and 4 in main(), because 1 is always 1 and 4 is always
4. They are constants. However if instead the program had been:
main ()
{ int a = 1, b = 4;
add (a,b);
}
/**************************************/
add (a,b)
int a,b;
{
printf ("%d", a+b);
}
then it is less clear what will happen. In fact exactly the same thing happens:
When add() is called from main(), two new variables a and
b are created by the language (which have nothing to do
with the variables a and b in main() and are
completely isolated from them).
The value of a in main() is copied into the value of a in
add().
The value of b in main() is copied into the value of b in
add().
Now, any reference to a and b within the function
add() refers only to the two parameters of add and not to
the variables with the same names which appeared in main(). This means that if
a and b are altered in add() they will not affect
a and b in main(). More advanced computing texts
have names for the old and the new a and b:
Here are some points about value parameters.
#include <stdio.h>
main ()
{ int a = 1, b = 4;
add (a,b);
}
/*******************************************/
add (i,j)
int i,j;
{
printf ("%d", i+j);
}
In this case the value of a in main() would be copied to
the value of i in add() and the value of b in main()
would be copied to the value of j in add().
main ()
{
function ('*',1.0);
}
/********************************/
function (ch,i)
char ch;
int i;
{
}
is probably wrong because 1.0 is a floating point value, not an integer.
sin(3.41415);
cos(a+b*2.0);
strlen("The length of this string");
The value returned by a function can be used directly as a value parameter. It does not have to be assigned to a variable first. For instance:
main ()
{
PrintOut (SomeValue());
}
/*********************************************/
PrintOut (a) /* Print the value */
int a;
{
printf ("%d",a);
}
/**********************************************/
SomeValue () /* Return an arbitrary no */
{
return (42);
}
This often gives a concise way of passing a value to a function.
/**************************************************/
/* */
/* Value Parameters */
/* */
/**************************************************/
/* Toying with value parameters */
#include <stdio.h>
/**************************************************/
/* Level 0 */
/**************************************************/
main () /* Example of value parameters */
{ int i,j;
double x,x_plus_one();
char ch;
i = 0;
x = 0;
printf (" %f", x_plus_one(x));
printf (" %f", x);
j = resultof (i);
printf (" %d",j);
}
/***************************************************/
/* level 1 */
/***************************************************/
double x_plus_one(x) /* Add one to x ! */
double x;
{
x = x + 1;
return (x);
}
/****************************************************/
resultof (j) /* Work out some result */
int j;
{
return (2*j + 3); /* why not... */
}
/******************************************************/
/* */
/* Program : More Value Parameters */
/* */
/******************************************************/
/* Print out mock exam results etc */
#include <stdio.h>
/******************************************************/
main () /* Print out exam results */
{ int pupil1,pupil2,pupil3;
int ppr1,ppr2,ppr3;
float pen1,pen2,pen3;
pupil1 = 87;
pupil2 = 45;
pupil3 = 12;
ppr1 = 200;
ppr2 = 230;
ppr3 = 10;
pen1 = 1;
pen2 = 2;
pen3 = 20;
analyse (pupil1,pupil2,pupil3,ppr1,ppr2,
ppr3,pen1,pen2,pen3);
}
/*******************************************************/
analyse (p1,p2,p3,w1,w2,w3,b1,b2,b3)
int p1,p2,p3,w1,w2,w3;
float b1,b2,b3;
{
printf ("Pupil 1 scored %d percent\n",p1);
printf ("Pupil 2 scored %d percent\n",p2);
printf ("Pupil 3 scored %d percent\n",p3);
printf ("However: \n");
printf ("Pupil1 wrote %d sides of paper\n",w1);
printf ("Pupil2 wrote %d sides\n",w2);
printf ("Pupil3 wrote %d sides\n",w3);
if (w2 > w1)
{
printf ("Which just shows that quantity");
printf (" does not imply quality\n");
}
printf ("Pupil1 used %f biros\n",b1);
printf ("Pupil2 used %f \n",b2);
printf ("Pupil3 used %f \n",b3);
printf ("Total paper used = %d", total(w1,w2,w3));
}
/*****************************************************/
total (a,b,c) /* add up total */
int a,b,c;
{
return (a + b + c);
}
(As a first time reader you may wish to omit this section until you have read about Pointers and Operators.)
One way to hand information back is to use the return statement.
This statement is slightly limited however in
that it can only hand the value of one variable back at a time. There
is another way of handing back values which is less restrictive, but
more awkward than this. This is by using a special kind of parameter,
often called a variable parameter. It is most easily explained with
the aid of an example:
#include <stdio.h>
main ()
{ int i,j;
GetValues (&i,&j);
printf ("i = %d and j = %d",i,j);
}
/************************************/
GetValues (p,q)
int *p,*q;
{
*p = 10;
*q = 20;
}
To understand fully what is going on in this program requires a knowledge of pointers and operators, which are covered in later sections, but a brief explanation can be given here, so that the method can be used.
There are two new things to notice about this program: the
symbols & and *.
The ampersand & symbol should be read as "the address of..".
The star * symbol should be read as "the contents of the
address...". This is easily confused with the multiplication symbol
(which is identical). The difference is only in the context in which
the symbol is used. Fortunately this is not ambiguous since
multiplication always takes place between two numbers or variables,
whereas the "contents of a pointer" applies only to a single variable
and the star precedes the variable name.
So, in the program above, it is not the variables themselves which are
being passed to the procedure but the addresses of the variables. In
other words, information about where the variables are stored in the
memory is passed to the function GetValues(). These addresses are
copied into two new variables p and q, which are said to
be pointers to i and j. So, with variable parameters, the
function does not receive a copy of the variables themselves, but
information about how to get at the original variable which was
passed. This information can be used to alter the "actual parameters"
directly and this is done with the * operator.
*p = 10;
means: Make the contents of the address held in p equal to
10. Recall that the address held in p is the address of the
variable i, so this actually reads: make i equal to
10. Similarly:
*q = 20;
means make the contents of the address held in q equal to 20. Other
operations are also possible (and these are detailed in the section on
pointers) such as finding out the value of i and putting it into a new
variable, say, a:
int a; a = *p; /* is equivalent to a = i */
Notice that the * symbol is required in the declaration of these parameters.
/**************************************************/
/* */
/* Program : Variable Parameters */
/* */
/**************************************************/
/* Scale some measurements on a drawing, say */
#include <stdio.h>
/**************************************************/
main () /* Scale measurements*/
{ int height,width;
height = 4;
width = 5;
ScaleDimensions (&height,&width);
printf ("Scaled height = %d\n",height);
printf ("Scaled width = %d\n",width);
}
/****************************************************/
ScaleDimensions (h,w) /* return scaled values */
int *h, *w;
{ int hscale = 3; /* scale factors */
int wscale = 1;
*h = *h * hscale;
*w = *w * wscale;
}
Where a program's fingers can't reach.
From the computer's point of view, a C program is nothing more than a collection of functions and declarations. Functions can be thought of as sealed capsules of program code which float on a background of white space, and are connected together by means of function calls. White space is the name given to the white of an imaginary piece of paper upon which a program is written, in other words the spaces and new line characters which are invisible to the eye. The global white space is only the gaps between functions, not the gaps inside functions. Thinking of functions as sealed capsules is a useful way of understanding the difference between local and global objects and the whole idea of scope in a program.
Another analogy is to think of what goes on in a function as being like watching a reality on television. You cannot go in and change the TV reality, only observe the output, but the television show draws its information from the world around it. You can send a parameter (e.g. switch channels) to make some choices. A function called by a function is like seeing someone watching a television in a television show.
Global variables are declared in the white space between functions. If every function is a ship floating in this sea of white space, then global variables (data storage areas which also float in this sea) can enter any ship and also enter anything inside any ship (see the diagram). Global variables are available everywhere; they are created when a program is started and are not destroyed until a program is stopped. They can be used anywhere in a program: there is no restriction about where they can be used, in principle.
Local variables are more interesting. They cannot enter just any region of the program because they are trapped inside blocks. To use the ship analogy: if it is imagined that on board every ship (which means inside every function) there is a large swimming pool with many toy ships floating inside, then local variables will work anywhere in the swimming pool (inside any of the toy ships), but cannot get out of the large ship into the wide beyond. The swimming pool is just like a smaller sea, but one which is restricted to being inside a particular function. Every function has its own swimming pool! The idea can be taken further too. What about swimming pools on board the toy ships? (Meaning functions or blocks inside the functions!)
/* Global white space "sea" */
function ()
{
/* On board ship */
{
/* On board a toy ship */
}
}
The same rules apply for the toy ships. Variables can reach anywhere inside them but they cannot get out. They cannot escape their block braces {}. Whenever a pair of block braces is written into a program it is possible to make variable declarations inside the opening brace. Like this:
{ int locali;
char localch;
/* statements */
}
These variables do not exist outside the braces. They are only created when the opening brace is encountered and they are destroyed when the closing brace is executed, or when control jumps out of the block. Because they only work in this local area of a program, they are called local variables. It is a matter of style and efficiency to use local variables when it does not matter whether variables are preserved outside of a particular block, because the system automatically allocates and disposes of them. The programmer does not have to think about this.
Where a variable is and is not defined is called the scope of that variable. It tells a programmer what a variable's horizons are!
If functions were sealed capsules and no local variables could ever communicate with other parts of the program, then functions would not be very useful. This is why parameters are allowed. Parameters are a way of handing local variables to other functions without letting them out! Value parameters (see last section) make copies of local variables without actually using them. The copied parameter is then a local variable in another function. In other words, it can't get out of the function to which it is passed ... unless it is passed on as another parameter.
Notice about the example that if there are two variables of the
same name, which are both allowed to be in the same place (c in the
example below) then the more local one wins. That is, the last
variable to be defined takes priority. (Technically adept readers will
realize that this is because it was the last one onto the variable
stack.)
/***************************************************************/
/* */
/* SCOPE : THE SEALED CAPSULES */
/* */
/***************************************************************/
#include <stdio.h>
/***************************************************************/
main ()
{ int a = 1, b = 2, c = 3;
if (a == 1)
{ int c;
c = a + b;
printf ("%d",c);
}
handdown (a,b);
printf ("%d",c);
}
/**************************************************************/
handdown (a,b) /* Some function */
int a,b;
{
...
}
Some programmers complain about the use of global variables in a program. One complaint is that it is difficult to see what information is being passed to a function unless all that information is passed as parameters. Sometimes global variables are very useful however, and this problem need not be crippling. A way to make this clear is to write global variables in capital letters only, while writing the rest of the variables in mainly small letters.
int GLOBALINTEGER;
....
{ int localinteger;
}
This allows global variables to be spotted easily. Another reason for restricting the use of global variables is that it is easier to debug a program if only local variables are used. The reason is that once a function capsule is tested and sealed it can be guaranteed to work in all cases, provided it is not affected by any other functions from outside. Global variables punch holes in the sealed function capsules because they allow bugs from other functions to creep into tried and tested ones. An alert and careful programmer can usually control this without difficulty.
The following guidelines may help the reader to decide whether to use local or global data:
All the programs in this book which are longer than a couple of lines are written in an unusual way: with a levelled structure. There are several good reasons for this. One is that the sealed capsules are shown to be sealed, by using a comment bar between each function.
/**************************************/
Another good reason is that any function hands parameters down by only
one level at a time and that any return() statement hands values up a
single level. The global variables are kept to a single place at the
head of each program so that they can be seen to reach into
everything.
The diagram shows how the splitting of levels implies something about the scope of variables and the handing of parameters.
{} )
a "sealed capsule"?
number_of_hats,counter which are GLOBAL and two float variables
called x_coord,y_coord which are LOCAL inside the function
main(). Then add another function called another() and pass
x_coord,y_coord to this function. How many different storage spaces
are used when this program runs? (Hint: are x_coord,y_coord and their
copies the same?)
Making programming versatile.
C is unusual in that it has a pre-processor. This comes from its
Unix origins. As its name might suggest, the preprocessor is a phase
which occurs prior to compilation of a program. The preprocessor has two
main uses: it allows external files, such as header files, to be
included and it allows macros to be defined. This useful feature
traditionally allowed constant values to be defined in Kernighan and
Ritchie C, which had no constants in the language.
Pre-processor commands are distinguished by the hash (number) symbol
#. One example of this has already been encountered for the
standard header file stdio.h.
#include <stdio.h>
is a command which tells the preprocessor
to treat the file stdio.h as if it were actually part of the
program text, in other words to include it as part of the program to
be compiled.
Macros are words which can be defined to stand in place of something complicated: they are a way of reducing the amount of typing in a program and a way of making long ungainly pieces of code into short words. For example, the simplest use of macros is to give constant values meaningful names: e.g.
#define TELEPHNUM 720663
This allows us to use the word TELEPHNUM in the program
to mean the number 720663. In this particular case, the word is
clearly not any shorter than the number it will replace, but it is
more meaningful and would make a program read more naturally than if
the raw number were used. For instance, a program which deals with
several different fixed numbers like a telephone number, a postcode
and a street number could write:
printf("%d %d %d",TELEPHNUM,postcode,streetnum);
instead of
printf("%d %d %d",720663,345,14);
Using the macros instead makes the actions much clearer and allows the programmer to forget about what the numbers actually are. It also means that a program is easy to alter because to change a telephone number, or whatever, it is only necessary to change the definition, not to retype the number in every single instance.
The important feature of macros is that they are not merely numerical constants which are referenced at compile time, but are strings which are physically replaced before compilation by the preprocessor! This means that almost anything can be defined:
#define SUM 1 + 2 + 3 + 4
would allow SUM to be used instead of 1+2+3+4. Or
#define STRING "Mary had a little lamb..."
would allow a commonly used string to be called by the identifier STRING instead of typing it out afresh each time. The idea of a define statement then is:
#define macroname definition on rest of line
Macros cannot define more than a single line to be substituted into a program but they can be used anywhere, except inside strings. (Anything enclosed in string quotes is assumed to be complete and untouchable by the compiler.) Some macros are defined already in the file
stdio.h such as:
EOF
NULL
A more advanced use of macros is also permitted by the
preprocessor. This involves macros which accept parameters and hand
back values. This works by defining a macro with some dummy parameter,
say x. For example: a macro which is usually defined in one of the
standard libraries is abs() which means the absolute or unsigned value
of a number. It is defined below:
#define ABS(x) (((x) < 0) ? -(x) : (x))
The result of this is to give the positive (or unsigned) part of any
number or variable. This would be no problem for a function which could
accept parameters, and it is, in fact, no problem for macros. Macros can
also be made to take parameters. Consider the ABS() example. If a
programmer were to write ABS(4) then the preprocessor would
substitute 4 for x. If a program read ABS(i) then the
preprocessor would substitute i for x and so on. (There is
no reason why macros can't take more than one parameter too. The
programmer just includes two dummy parameters with different names. See
the example listing below.) Notice that this definition uses a curious
operator which belongs to C:
<test> ? <true result> : <false result>
This is like a compact way of writing an if..then..else
statement, ideal for macros. But it is also slightly different: it is
an expression which returns a value, whereas an if..then..else
is a statement with no value.
Firstly the test is made. If the test is
true then the first expression is evaluated, otherwise the second is
evaluated. As a memory aid, it could be read as:
if <test> then <true result> else <false result>
(Do not be confused by the above statement which is meant to show what a programmer might think. It is not a valid C statement.) Older compilers could sometimes produce more efficient code for this construction than for a corresponding if-else statement, though a modern optimizing compiler will usually generate much the same code for both.
It is tempting to forget about the distinction between macros and functions, thinking that it can be ignored. To some extent this is true for absolute beginners, but it is not a good idea to hold on to. It should always be remembered that macros are substituted whole at every place where they are used in a program: this is potentially a very large amount of repetition of code. The advantage of a macro, however, is speed. No time is taken up in passing control over to a new function, because control never leaves the home function when a macro is used: it just makes the function a bit longer. There is a pitfall with macros though. A parameter is substituted as text wherever it appears in the definition, so an argument which is a function call, such as:
ABS(function())
may cause the function to be called more than once: here, once for the test and once again for the result. This matters whenever the function is slow or has side effects. Macros are also severely restricted in complexity by the limitations of the preprocessor. It is simply not viable to copy complicated sequences of code all over programs.
Choosing between functions and macros is a matter of personal judgement. No simple rules can be given. In the end (as with all programming choices) it is experience which counts towards the final ends. Functions are easier to debug than macros, since they allow us to single step through the code. Errors in macros are very hard to find, and can be very confusing.
/************************************************************/
/* */
/* MACRO DEMONSTRATION */
/* */
/************************************************************/
#include <stdio.h>
#define STRING1 "A macro definition\n"
#define STRING2