Chapter 3. C: Pointers
The concept of a pointer is relatively distinctive to C: It
allows you to have a variable that represents the memory address of some
data. The type name for such a variable is represented by the type name
for the data to which it points followed by an asterisk
(`*');
for example, an int* variable will hold the memory address of an
integer.
To get the memory address of a variable, you can use the ampersand
(`&') operator: For example, the value of the expression &i
is the memory address of i.
Conversely, to access the memory referenced by a pointer, you can use
the asterisk (`*') operator — this is called
dereferencing the pointer.
Consider the following example.
int i;
int *p;
i = 4;
p = &i;
*p = 5;
printf("%d\n", i);
In this fragment, we have declared two variables: i, which
holds an integer, and p, which holds the memory address of an
integer.
We first initialize i with the value 4
and p with the value &i.
Then we write *p = 5;
which alters the memory referenced by p (that is, i)
to hold 5.
Finally, we print the value of i, which is now 5.
This is a contrived example. A less contrived usage of pointers is when you want a function to change the value of a parameter. For example, say we want to write a function to swap two values. You might be tempted to write the following.
void swap(int i, int j) { /* This won't work!! */
    int t;
    t = i;
    i = j;
    j = t;
}
It won't work, though, because C (like Java) passes all parameters by
value: If I call swap(x, y), the values contained by x
and y are copied into the i and j variables.
The swap() function will swap the values contained by
i and j, but this will have no effect on x
and y.
We can get around this by passing pointers instead.
void swap(int *ip, int *jp) {
    int t;
    t = *ip;
    *ip = *jp;
    *jp = t;
}
Now we would have to call swap(&x, &y) (not
swap(x, y)). The following figure illustrates how
it works; an explanation is below.
Figure 3.1: The swap() function in action, shown in three panels: (a), (b), and (c).
The value copied into ip will be the address of
x, and the value copied into jp will be the address of
y (Figure 3.1(a)). The line t = *ip; will
copy the value referenced by ip (that is, x) into
t. The next line will copy the value referenced by jp
(that is, y) into the memory referenced by ip (that is,
x) (Figure 3.1(b)). And the final line will copy the value of t (the
original value of x) into the memory referenced by jp
(that is, y) (Figure 3.1(c)). So the values contained by x and
y will be swapped. [This is the only way to write such a
function in C, where all parameters are passed by value.
Some languages have a feature where you can designate a
parameter to be an implicit pointer — it's called
call by reference as opposed to the
call by value used by C.
Such a feature was added into C++; it was not retained by
Java.]
Suppose the function said the following instead.
t = *ip;
ip = jp; /* Before, this said: *ip = *jp; */
*jp = t;
This would still compile, but the second line would in fact change the
pointer only, so that both ip and jp point to the same
place: the variable y. After this line, nothing in the function would
reference x any longer, and the final line would simply copy t into y.
Thus, the actual value of x would not change with this
attempt at implementation.
In C, the null pointer is called NULL. Its use is similar to
null in Java: It indicates a pointer that points to
nothing.
3.1. The scanf() function
We've already seen the printf() function that allows you to
output information to the screen.
printf("The value of i is %d.", i);
There is also a scanf() function that allows you to read
information from the user. Suppose, for example, that you wanted to read
a number from the user.
You can write the following.
printf("Type a number. ");
scanf("%d", &i);
The scanf() function, like the printf() function,
takes a format string indicating what sort of data the function will
read from the user. The parameters following should be the memory
addresses where the data read from the user should be placed.
In this example, the format string "%d"
indicates that
the program should read an integer, written in decimal, from the user.
The second parameter, &i, indicates that the value read
should be placed into the i variable.
The important thing to remember about the scanf() function
is that it wants memory addresses of variables, not the value
of variables: Those ampersands are important. Of course, the reason
it wants memory addresses is so that scanf() can save the
user's typed data where the calling function wants them.
3.2. Arrays
Arrays in C must be given a fixed length at the time they are declared.
int main() {
    int arr[50];
    int i;
    for(i = 0; i < 50; i++) arr[i] = i;
    return 0;
}
Once you create the array variable, you're stuck with its length.
Also, C provides no run-time facility for asking an array its length
(as with arr.length in Java); the program must keep track of the length itself.
In C, an array is basically a pointer whose value cannot be changed. In fact, when you pass an array as a parameter, the only thing that really gets passed is the memory address of the first element of the array. So you can write something like the following.
void setToZero(int *arr, int n) {
    int i;
    for(i = 0; i < n; i++) arr[i] = 0;
}

int main() {
    int grades[50];
    setToZero(grades, 50);
    return 0;
}
In this program, the setToZero() function takes a pointer to an
integer as its first parameter. When we call it with
setToZero(grades, 50), the address of the first number in
grades is copied into the arr parameter
variable. The bracket operator can also be applied to pointers as if
they referenced the first item in an array, so the
line arr[i] = 0; is legal. (Alternatively, you could
write *(arr + i) = 0; instead. Adding the integer i
to the pointer arr would compute the address where index
i would be located if arr were an array, and the
asterisk would dereference this address.)
3.3. Writing outside an array
Actually, that brings up an important point. In Java, each access to an array is checked, and if you access an array out of bounds, you see the friendly ArrayIndexOutOfBoundsException message. C is not nearly so nice. When you access beyond an array's bounds, C performs no check: It blindly reads or writes whatever memory happens to lie at that location.
This can lead to peculiar behavior. For example, consider the following program.
#include <stdio.h>

int main() {
    int i;
    int vals[5];
    /* BUG: i <= 5 writes to vals[5], one past the end of the array */
    for(i = 0; i <= 5; i++) vals[i] = 0;
    printf("%d\n", i);
    return 0;
}
Some systems (including at least one version of Linux)
would place i in memory just after the vals array;
thus, when i reaches 5 and the computer executes
vals[i] = 0, it in fact resets the memory corresponding to
i to 0.
As a result, the for loop has reset, and the program goes
through the loop again, and again, repeatedly.
The program never reaches the printf function call.
In more complicated programs, this can lead to very difficult bugs, where a variable's value changes mysteriously somewhere within hundreds of functions, and you as the programmer must determine where an array index was accessed out of bounds. This is the type of bug that takes a lot of time to uncover and repair.
That's why you should consider Java's ArrayIndexOutOfBoundsException message as friendly: Not only does it determine the cause of a problem, it even tells you exactly which line of the program was at fault. This saves you vast amounts of debugging time.
Every once in a while, you'll see a C program crash, with a message like
Segmentation Fault or Bus Error. (It won't helpfully include any
indication of what part of the program is at fault.) Such errors
usually mean that the
program attempts to access an invalid memory location.
This may indicate an attempt to access
an invalid array index, but typically the index needs to be pretty far
out of bounds for this to occur; more frequently, it indicates an
attempt to reference an uninitialized pointer or
a NULL pointer.
3.4. Strings
C includes very minimal support for strings. Basically, a string in C is simply an array of characters. You could easily write the following in a C program.
char *str;
str = "hello";
Here, we've made str be a pointer to a character. In the next
line, we made it point to the array of characters
hello.
Actually, there's also a hidden character to mark the end of the string.
This marker is NUL, the ASCII character whose value is 0. [Although they are spelled similarly,
the distinction between NUL and NULL is significant:
NUL is a character value, while NULL is a pointer
value.]
So, actually, "hello" refers to an array of six
characters, with the sixth character being '\0'
— that
is, NUL.
If you wanted to copy all the letters from the string src to
another string dst, you could use the following for
loop.
for(i = 0; src[i] != '\0'; i++) dst[i] = src[i];
dst[i] = '\0';
This copies all of the characters in src up to,
but not including, the NUL character.
Then it places a NUL character at the end of dst so that the
copied string has the terminator also.
In practice, I'd never write such a for loop in a program.
Instead, I'd use the built-in strcpy() function.
The string.h header file contains prototypes for many library
functions built into C for working with strings. Following are
three.
char *strcpy(char *dst, const char *src)
    Copies all the characters of src into dst, including the terminating NUL character, and returns dst.

size_t strlen(const char *src)
    Returns the number of characters in src (not including the terminating NUL character).

int strcmp(const char *a, const char *b)
    Returns zero if a and b are identical, a negative number if a comes before b in lexicographic order, and a positive number if a comes after b. (Lexicographic order refers to the ordering based on ASCII codes. For example, Abc comes lexicographically after ABC, since the first characters match but the second characters do not, and the ASCII value of 'B' (66) is less than the ASCII value for 'b' (97).)
C has no support for strings of indefinite length. You can move the NUL character up the string to make it shorter, but you can't move it past the end of an array.
3.5. Example: Tokenizing a string
Figure 3.2 below shows a useful function that we'll explore
as an illustration of many of the concepts we've covered so far. It
defines a function that takes three parameters: a string referenced
by buf, an array of pointers to strings referenced by
argv, and an integer max_args. The function is to
split the string buf into separate words, placing pointers
to successive words into argv and returning the number of
words found. The max_args parameter indicates how long the
array is.
Figure 3.2: Splitting a string.
#include <ctype.h>
/* splitLine
* Breaks a string into a sequence of words. The pointer to each
* successive word is placed into an array. The function
* returns the number of words found.
*
* Parameters:
* buf - the string to be broken up into words
* argv - the array where pointers to the separate words should go.
* max_args - the maximum number of pointers that the array can hold.
*
* Returns:
* the number of words found in the string.
*/
int splitLine(char *buf, char **argv, int max_args) {
    int arg;

    /* isspace() expects an unsigned char value, hence the casts below */
    while(isspace((unsigned char) *buf)) buf++; /* skip over initial spaces */
    for(arg = 0; arg < max_args && *buf != '\0'; arg++) {
        argv[arg] = buf;
        while(*buf != '\0' && !isspace((unsigned char) *buf)) {
            buf++; /* skip past letters in word */
        }
        if(*buf != '\0') { /* if we're not at sentence's end, */
            *buf = '\0'; /* mark word's end and continue */
            buf++;
        }
        while(isspace((unsigned char) *buf)) buf++; /* skip over extra spaces */
    }
    return arg;
}
For example, suppose we wanted to use this function to split the
sentence "The dog is agog."
into words. We'd place this into an array of
characters and pass this string as buf into the function.
We'd also create an array of string pointers to pass as argv,
with max_args being the length of this array.
The function's job is to place pointers into argv to the
individual words.
In this case, the function should return 4, since there are four words in the sentence.
The function accomplishes this by replacing spaces in the sentence
with NUL characters and pointing the array entries referenced by
argv into the sentence's array.
It uses the isspace() function for identifying space
characters; this function's prototype is in the ctype.h
header file included on the first line of Figure 3.2.







