Pointers, Classes, and Virtual Functions

Charles E. Oyibo

We should recall from Chapter 2 that C++'s data types are classified into three categories: simple, structured, and pointers. We have discussed the first two. Pointers are the subject matter of this chapter.

Pointer Data Type and Pointer Variables

Recall that Chapter 2 defined a data type as a set of values together with a set of operations. Recall too that the set of values is called the domain of the data type. Furthermore, the data types we have encountered so far each have a name associated with them. For example, the data type int has a set of integer values that, on a typical platform with a 32-bit int, range between -2147483648 and 2147483647. To manipulate numeric integer data within that range, we declare variables using the word int.

The values belonging to pointer data types are the memory addresses of our computer. There is no name (like int or char) associated with the pointer data type in C++. Because the domain--that is, the set of values of a pointer data type--consists of addresses (memory locations), a pointer variable is a variable whose content is an address--that is, a memory location.

Declaring Pointer Variables

When we declare a pointer variable we invariably specify the data type of the value to be stored in the memory location pointed to by the pointer variable.

In C++, we declare a pointer variable by using the asterisk (*) symbol between the data type and the variable name. The general syntax to declare a pointer variable is:

dataType *identifier;

Consider:

int *p;
char *ch;

Here, both p and ch are pointer variables. The content of p (when properly assigned) points to a memory location of the type int, and the content of ch (also when properly assigned) points to a memory location of the type char. Usually, p is called a pointer variable of the type int, and ch is called a pointer variable of the type char.

In the first declaration above, the asterisk * can appear anywhere between int and p; however for consistency, and to eliminate confusion, we attach * to the (pointer) variable name.

C++ provides two operators--the address operator (&) and the dereferencing operator (*)--to work with pointers.

Address of Operator (&)

In C++, the ampersand, &, called the address of operator, is a unary operator that returns the address of its operand. Given the statements:

int x;
int *p;

the statement:

p = &x;

assigns the address of x to p. That is, x and the value of p refer to the same memory location.

Dereferencing Operator (*)

We have used the asterisk character, *, as a binary multiplication operator previously. C++ also uses * as a unary operator. When used as a unary operator, *, commonly referred to as the dereferencing operator or indirection operator, refers to the object to which its operand (that is, the pointer) points. For example, given the statements:

int x = 25;
int *p;
p = &x; // store the address of x in p

the statement:

cout << *p << endl;

prints the value stored in the memory space pointed to by p, which is the value of x. Also, the statement:

*p = 55;

stores 55 in the location pointed to by p--that is, in x.

Given p, &p, and *p, where p is a pointer variable, the following are true:

p, &p, and *p all have different meanings.
p means the content of p
&p means the address of p
*p means the content of the memory location pointed to by p
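
The three expressions can be told apart in a small sketch. The function name writeThroughPointer and the extra variable pp (a pointer to a pointer, used only to hold &p) are illustrative assumptions and do not come from the notes.

```cpp
#include <cassert>

// Hypothetical helper: writes 55 into x through a pointer and returns x.
int writeThroughPointer()
{
    int x = 25;
    int *p = &x;        // p holds the address of x

    int **pp = &p;      // &p is the address of the pointer variable p itself
    assert(*pp == p);   // following &p leads back to the content of p
    assert(*p == 25);   // following p leads to the content of x

    *p = 55;            // stores 55 in the location p points to, that is, in x
    return x;
}
```

Calling writeThroughPointer() returns the value of x after the write through *p.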

Let us consider the declaration of an int pointer variable p thus: int *p. At this point:
-- the value returned by p is unknown (that is, p holds "garbage")
-- the value returned by &p is the memory address of the pointer variable p
-- the value returned by *p does not exist (or is undefined, as p does not point to any memory variable yet)

To reiterate,

int *p;
int x;

x = 50; //assigns 50 to x
p = &x; //assigns the address of x to p, so that *p and x refer to the same memory space, x
*p = 38; //assigns 38 to *p, which refers to x; i.e. *p and x refer to the same memory location, to which 38 is assigned

The declaration int *p; allocates memory for p only, not for *p. We will see how to allocate memory for *p shortly. We make this note because it is important to distinguish between the use of the * operator in a pointer variable declaration and its use as part of a non-declaration statement.

Note, too, that in the statements above, after the declaration int *p; the content of p points to an undefined memory location of the type int; at this point using *p in a (non-declaration) statement is invalid and meaningless. The assignment statement p = &x; assigns a value, specifically the address of x, to p. After this assignment, *p is meaningful.

Classes, Structs, and Pointer Variables

Consider the following declaration of a struct:

struct studentType
{
    char name[27];
    double GPA;
    int studID;
    char grade;
};

studentType student;
studentType* studentPtr;

...continue discussion on classes, structs, and pointer variables.

Initializing Pointer Variables

Pointer variables are initialized using the constant value 0, called the null pointer. Thus, the statement p = 0; stores the null pointer in p; that is, p points to nothing. We can also use the named constant NULL to initialize pointer variables, as in p = NULL;, which is equivalent to p = 0;.

The number 0 is the only number that can be directly assigned to a pointer variable.
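
One practical use of the null pointer is that it can be tested before dereferencing. The following is a minimal sketch; the function valueOr and its parameter fallback are hypothetical names introduced for illustration.

```cpp
#include <cstddef>   // defines NULL

// Hypothetical helper: returns the value p points to,
// or fallback when p holds the null pointer.
int valueOr(const int *p, int fallback)
{
    if (p == NULL)       // equivalent to p == 0: p points to nothing
        return fallback; // never dereference a null pointer
    return *p;
}
```

With int *p = 0; the call valueOr(p, -1) takes the first branch; after p = &x; it returns the value of x.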

Dynamic Variables

We now discuss how to allocate and deallocate memory during program execution using pointers.

Variables that are created during program execution are called dynamic variables. C++ provides two operators, new and delete, to create and destroy dynamic variables, respectively.

Operator new

The operator new has two forms: one to allocate a single variable, and the other to allocate an array of variables. The syntax:

new dataType; // to allocate a single (memory) variable
new dataType[intExp]; // to allocate an array of (memory) variables

The operator new allocates memory (a variable) of the designated type and returns a pointer to it--that is, the address of this allocated memory. Moreover, the allocated memory is uninitialized. The statement:

p = new int;

creates a variable during program execution somewhere in memory, and stores the address of the allocated memory in p. The allocated memory is accessed via pointer dereferencing--namely, *p. Similarly, the statement:

q = new char[16];

creates an array of 16 components of the type char and stores the base address of the array in q.

Because a dynamic array is unnamed, it cannot be accessed directly. It is accessed indirectly through the pointer returned by new.
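
Both forms of new can be put together in a short sketch. The function name newDemo and the string stored in the array are assumptions made for illustration; the delete statements at the end use the operator covered in the next section.

```cpp
#include <cstring>   // std::strcpy, std::strlen

// Hypothetical helper: allocates a single int and a char array,
// uses both through their pointers, then releases them.
int newDemo()
{
    int *p = new int;           // one dynamic variable, uninitialized
    *p = 54;                    // accessed only through *p

    char *q = new char[16];     // q holds the base address of 16 chars
    std::strcpy(q, "pointer");  // 7 characters plus '\0' fit in 16

    int result = *p + static_cast<int>(std::strlen(q));

    delete p;                   // release the single variable
    delete [] q;                // release the array
    return result;
}
```

The returned value is the stored integer plus the length of the stored string.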

Operator delete

Consider:

p = new int; //L1
*p = 54; //L2
p = new int; //L3
*p = 73; //L4

L1 allocates a memory location and stores its address in p. L2 assigns 54 to the memory location allocated by L1. L3 allocates a new memory location and stores its address in p (that is, it overwrites the address of the memory location allocated by L1). L4 assigns 73 to the new memory location.

Question: What happens to the memory location allocated by L1? Obviously p no longer points to it as that pointer variable now holds the address of the new memory location allocated in L3.

Answer: That memory location is now inaccessible, and moreover, cannot be reallocated. This is called a memory leak. That is, there is inaccessible memory that cannot be allocated.

How do we avoid memory leaks? When a dynamic variable is no longer needed, it can be destroyed; that is, its memory can be deallocated. The C++ operator delete is used to destroy or deallocate dynamic variables. The syntax has two forms:

delete pointerVariable; //to deallocate a single dynamic variable
delete [] pointerVariable; //to deallocate a dynamically created array

It is advisable to set a pointer to NULL after the delete operation, as in: p = NULL; Otherwise the pointer still holds the address of memory that has already been deallocated.
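
As a sketch of how the leak in statements L1 through L4 could be avoided, the first allocation is released before p is overwritten. The function name noLeak is a hypothetical wrapper added so the statements can be compiled together.

```cpp
#include <cstddef>   // defines NULL

// Hypothetical wrapper around the L1..L4 statements, with the leak removed.
int noLeak()
{
    int *p = new int;   // L1
    *p = 54;            // L2

    delete p;           // release the memory from L1 while p still points to it

    p = new int;        // L3: p can now be reused safely
    *p = 73;            // L4

    int last = *p;      // keep the value before releasing the memory
    delete p;           // release the memory from L3
    p = NULL;           // p no longer holds a stale address
    return last;
}
```

Every new in this version is matched by exactly one delete, so no allocated memory becomes inaccessible.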

Operations on Pointer Variables

The operations that are allowed on pointer variables are the assignment and relational operations and some limited arithmetic operations. Suppose we have int *p, *q;. The statement p = q; copies the value of q into p. After this, both p and q point to the same memory location, so that any change made to *p automatically changes the value of *q, and vice versa.

The expression p == q evaluates to true if p and q have the same value--that is, if they point to the same memory location. Similarly, p != q evaluates to true if p and q point to different memory locations.
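
A brief sketch of assignment and comparison follows. The function name sharedAfterAssignment and the variables x and y are assumptions used only for illustration.

```cpp
// Hypothetical helper: returns true when the behavior described above holds.
bool sharedAfterAssignment()
{
    int x = 5;
    int y = 5;
    int *p = &x;
    int *q = &y;

    // p != q compares addresses, not the values stored there:
    // x and y hold equal values but occupy different memory locations.
    bool differentBefore = (p != q);

    p = q;      // p now holds the same address as q (the address of y)
    *p = 9;     // changes y, so the change is visible through *q as well

    return differentBefore && (p == q) && (*q == 9) && (x == 5);
}
```

Note that x is untouched by the write through *p, because p no longer points to it after the assignment.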

The arithmetic operators that are allowed differ from the arithmetic operations on numbers. Consider:

int *p;
double *q;
char *chPtr;
studentType *stdPtr; //studentType is as defined before

Recall that, on a typical platform, the size of the memory allocated for an int variable is 4 bytes, a double variable is 8 bytes, and a char variable is 1 byte. The members of a variable of the type studentType then add up to 40 bytes (27 + 8 + 4 + 1). We use this figure below, although a compiler may insert padding between members, so the actual value of sizeof(studentType) can be larger.

p++; or p = p + 1; increments the value of p by 4 bytes because p is a pointer of the type int. Similarly, q++;, chPtr++;, and stdPtr++; increment the values of q, chPtr, and stdPtr by 8, 1, and 40 bytes, respectively (more precisely, by the size of the type each one points to). Moreover, the statement p = p - 2; decrements the value of p by 8 bytes (i.e. 2 x 4 bytes).

Pointer arithmetic can be very dangerous, as the program could accidentally access the memory location of other variables (such as when we increment or decrement) and change their content without warning... It is important to exercise extra care if one must perform pointer arithmetic.
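
The byte counts above can be checked on a given platform with a small sketch. The function template bytesMoved is a hypothetical helper, not part of the notes; it measures the distance in bytes between a pointer and the same pointer moved by a number of elements, and it assumes both positions lie within the same array.

```cpp
#include <cstddef>   // std::ptrdiff_t

// Hypothetical helper: number of bytes between start and start + steps.
// Assumes start and start + steps both lie within the same array.
template <typename T>
std::ptrdiff_t bytesMoved(const T *start, int steps)
{
    const T *moved = start + steps;   // arithmetic in units of sizeof(T)

    // Viewing both addresses as char pointers gives the distance in bytes.
    return reinterpret_cast<const char *>(moved)
         - reinterpret_cast<const char *>(start);
}
```

For an int array a, bytesMoved(a, 1) equals sizeof(int), whatever that size is on the platform in use.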

Dynamic Arrays

The arrays discussed in Ch9: Arrays and Strings are called static arrays because their sizes are fixed at compile time. A major limitation of the static arrays is that because the size of the array is fixed, it might not be possible to use the same array to process different data sets of the same type. One way to handle this limitation is by declaring an array that is large enough to process a variety of data sets. However, if the array is big and the data set is small, such a declaration would result in memory waste. On the other hand, it would be helpful, if during program execution, we could prompt the user to enter the size of the array and then create an array of the appropriate size...

An array created during the execution of the program is called a dynamic array. To create a dynamic array, we use the second form of the new operator:

int *p; //declares p to be a pointer variable of the type int.
p = new int[10];
/* allocates 10 contiguous memory locations, each of the type int, and stores the address of the first memory location (the base address) into p.*/
*p = 25; //stores 25 into the first memory location
p++; //p points to the next array component
*p = 35; //stores 35 in the second memory location

We can also use the array notation to access these memory locations. With p holding the base address of the array (that is, without the increment p++; shown above):

p[0] = 25;
p[1] = 35;

In general, p[i] refers to the (i + 1)th array component, because array indices start at 0. The expression p[i] is equivalent to *(p + i).

Of course, we can manipulate dynamic arrays using for loops in the same way we manipulated static arrays in Ch9.

The following illustrates how to obtain a user's response to get the array size and create a dynamic array during program execution:

int *intList;
int arraySize;

cout << "Enter the array size: ";
cin >> arraySize;
cout << endl;

intList = new int[arraySize];
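
Once the array exists it can be processed with for loops and must eventually be released. The following is a minimal sketch; the function sumOfSquares and the choice to fill the array with squares are assumptions made for illustration, and the size is passed as a parameter instead of being read from cin.

```cpp
// Hypothetical helper: creates a dynamic array of arraySize components,
// fills it with 0, 1, 4, 9, ... and returns the sum of the components.
int sumOfSquares(int arraySize)
{
    if (arraySize <= 0)     // guard: nothing to allocate or sum
        return 0;

    int *intList = new int[arraySize];

    for (int i = 0; i < arraySize; i++)
        intList[i] = i * i;         // array notation on the pointer

    int sum = 0;
    for (int i = 0; i < arraySize; i++)
        sum = sum + intList[i];

    delete [] intList;      // array form of delete matches new int[...]
    return sum;
}
```

In a complete program, the argument would be the arraySize value entered by the user.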

Functions and Pointers

A pointer variable can be passed as a parameter to a function either by value or by reference. To declare a pointer as a value parameter in a function heading, we use the same mechanism as we use to declare a variable. To make a formal parameter a reference parameter, we use & when we declare the formal parameter in the function heading. We must also include a * to make the identifier a pointer, and the * comes before the &. Consider:

void example(int* &p, double *q)
{
...
}

Here, both p and q are pointers. The parameter p is a reference parameter; the parameter q is a value parameter.
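
The practical difference between the two kinds of parameter can be sketched as follows. The function reseat has the same parameter list as example above, but its name and body are hypothetical.

```cpp
#include <cstddef>   // defines NULL

// Hypothetical function with one reference pointer parameter (p)
// and one value pointer parameter (q).
void reseat(int* &p, double *q)
{
    p = new int;    // p is a reference: the caller's pointer receives this address
    *p = 42;

    *q = 2.5;       // q is a copy, but it points to the caller's double,
                    // so the caller sees the new value
    q = NULL;       // changes only the local copy; the caller's pointer is unchanged
}
```

After a call such as reseat(ptr, dPtr); the caller's ptr points to the newly allocated int (which the caller must later delete), while dPtr still holds its original address.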

Pointers and Function Return Values

In C++, a function can return a value of a pointer type. For example, the return type of the function:

int* testExp(/*parameters*/)
{
...
}

is a pointer of the type int.
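
A common reason to return a pointer is to hand a dynamically created array back to the caller. This is a minimal sketch; the function makeFilledArray and its parameters are hypothetical names, and the function assumes size is not negative.

```cpp
// Hypothetical helper: creates a dynamic array of size components,
// sets each component to value, and returns the base address.
// Assumes size >= 0. The caller is responsible for delete [].
int* makeFilledArray(int size, int value)
{
    int *list = new int[size];

    for (int i = 0; i < size; i++)
        list[i] = value;

    return list;    // the memory outlives the function; only the
                    // local pointer variable list goes out of scope
}
```

A caller would write int *a = makeFilledArray(3, 7); and, when finished with the array, delete [] a;. Returning the address of a local (non-dynamic) variable instead would be an error, because that variable is destroyed when the function returns.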

Shallow versus Deep Copy and Pointers

Consider:

int *first;
int *second;

first = new int[10];

/* some statement that stores some meaningful
data in the array pointed to by first */

The statement:

second = first;

copies the value of first (that is, the base address of the array) into second, so that after this statement executes, both first and second point to the same array. This is called a shallow copy.

Formally, in a shallow copy, two or more pointers of the same type point to the same memory location; that is, they point to the same data.

Now, if we say:

delete [] second;

the array pointed to by second is deleted. But that array is the array also pointed to by first; so, both first and second become dangling pointers (that is, they are undefined). If the program later tries to access the memory pointed to by first, either the program will access the wrong memory or it will terminate in an error.

On the other hand, suppose we have:

second = new int[10];

for (int j = 0; j < 10; j++)
    second[j] = first[j];

We create an array of 10 components of type int and the base address of the array is stored in second. Our loop then copies the array pointed to by first into the array pointed to by second. After the execution of the loop, both first and second now point to their own data. If second deletes its memory, there is no effect on first. This is called a deep copy.

Formally, in a deep copy, two or more pointers have their own data.
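
The deep copy can be packaged as a function so that it works for any array size. This is a sketch under stated assumptions: the function name deepCopy and its parameters are hypothetical, and size is assumed to be the number of components in the source array.

```cpp
// Hypothetical helper: returns the base address of a new array holding
// a copy of the first size components of source.
// Assumes size >= 0 and that source has at least size components.
int* deepCopy(const int *source, int size)
{
    int *copy = new int[size];      // the copy gets its own memory

    for (int j = 0; j < size; j++)
        copy[j] = source[j];        // copy the data, not the address

    return copy;                    // caller must delete [] the result
}
```

With second = deepCopy(first, 10); a later delete [] second; leaves the array pointed to by first intact, whereas after the shallow copy second = first; the same delete would invalidate both pointers.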

The preceding discussion underscores the importance of knowing when to use a shallow copy and when to use a deep copy.

... discussions about classes and pointers; inheritance, pointers, and virtual functions, etc.

Page Last Updated: Saturday February 12, 2005 10:21 AM