Friday, July 11, 2008

2.2 Data Types

Integers

C++ requires that all variables used in a program be given a data type. We have already seen the data type int. Variables of this type are used to represent integers (whole numbers). Declaring a variable to be of type int signals to the compiler that it must associate enough memory with the variable's identifier to store the integer values the variable takes as the program executes. But there is a (system dependent) limit on the largest and smallest integers that can be stored. Hence C++ also supports the data types short int and long int which typically represent, respectively, a smaller and a larger range of integer values than int (the language guarantees only that the range of short int is no larger, and that of long int no smaller, than that of int). Adding the prefix unsigned to any of these types means that you wish to represent non-negative integers only. For example, the declaration

    unsigned short int year_now, age_now, another_year, another_age;

reserves memory for representing four relatively small non-negative integers.
Some rules have to be observed when writing integer values in programs:
Decimal points cannot be used; although 26 and 26.0 have the same value, "26.0" is not of type "int".
Commas cannot be used in integers, so that (for example) 23,897 has to be written as "23897".
Integers cannot be written with leading zeros. The compiler will, for example, interpret "011" as an octal (base 8) number, with value 9.
Real numbers

Variables of type "float" are used to store real numbers. Plus and minus signs for data of type "float" are treated exactly as with integers, and trailing zeros to the right of the decimal point are ignored. Hence "+523.5", "523.5" and "523.500" all represent the same value. The computer also accepts real numbers in floating-point form (or "scientific notation"). Hence 523.5 could be written as "5.235e+02" (i.e. 5.235 x 10 x 10), and -0.0034 as "-3.4e-03". In addition to "float", C++ supports the types "double" and "long double", which give increasingly precise representation of real numbers, but at the cost of more computer memory.

Type Casting

Sometimes it is important to guarantee that a value is stored as a real number, even if it is in fact a whole number. A common example is where an arithmetic expression involves division. When applied to two values of type int, the division operator "/" signifies integer division, so that (for example) 7/2 evaluates to 3. In this case, if we want an answer of 3.5, we can simply add a decimal point and zero to one or both numbers - "7.0/2", "7/2.0" and "7.0/2.0" all give the desired result. However, if both the numerator and the divisor are variables, this trick is not possible. Instead, we have to use a type cast. For example, we can convert "7" to a value of type double using the expression "static_cast<double>(7)". Hence in the expression

    answer = static_cast<double>(numerator) / denominator

the "/" will always be interpreted as real-number division, even when both "numerator" and "denominator" have integer values. Other type names can also be used for type casting. For example, "static_cast<int>(14.35)" has an integer value of 14.
Characters

Variables of type "char" are used to store character data. In standard C++, data of type "char" can only be a single character (which could be a blank space). These characters come from an available character set which can differ from computer to computer. However, it always includes upper and lower case letters of the alphabet, the digits 0, ... , 9, and some special symbols such as #, !, +, -, etc. Perhaps the most common collection of characters is the ASCII character set (see for example Savitch, page 978).
Character constants of type "char" must be enclosed in single quotation marks when used in a program, otherwise they will be misinterpreted and may cause a compilation error or unexpected program behaviour. For example, "'A'" is a character constant, but "A" will be interpreted as a program variable. Similarly, "'9'" is a character, but "9" is an integer.
There is, however, an important (and perhaps somewhat confusing) technical point concerning data of type "char". Characters are represented as integers inside the computer. Hence the data type "char" is simply a subset of the data type "int". We can even do arithmetic with characters. For example, the following expression is evaluated as true on any computer using the ASCII character set:

    '9' - '0' == 57 - 48

and both sides have the value 9. (The two comparisons cannot be chained into the single C++ expression "'9' - '0' == 57 - 48 == 9": the first comparison would give true, i.e. 1, which would then be compared with 9, giving false.) The ASCII code for the character '9' is decimal 57 (hexadecimal 39) and the ASCII code for the character '0' is decimal 48 (hexadecimal 30), so this is stating that

    57(dec) - 48(dec) = 39(hex) - 30(hex) = 9
It is often regarded as better to use the ASCII codes in their hexadecimal form.
However, declaring a variable to be of type "char" rather than type "int" makes an important difference as regards the type of input the program expects, and the format of the output it produces. For example, the program

    #include <iostream>
    using namespace std;

    int main()
    {
        int number;
        char character;

        cout << "Type in a character:\n";
        cin >> character;

        // Assigning a char to an int stores the character's code.
        number = character;

        cout << "The character '" << character;
        cout << "' is represented as the number ";
        cout << number << " in the computer.\n";

        return 0;
    }

Program 2.2.1

produces output such as

    Type in a character:
    9
    The character '9' is represented as the number 57 in the computer.

We could modify the above program to print out the whole ASCII table of characters using a "for loop". The "for loop" is an example of a repetition statement - we will discuss these in more detail later. The general syntax is:

    for (initialisation; repetition_condition; update) {
        Statement1;
        ...
        ...
        StatementN;
    }
C++ executes such statements as follows: (1) it executes the initialisation statement. (2) it checks to see if repetition_condition is true. If it isn't, it finishes with the "for loop" completely. But if it is, it executes each of the statements Statement1 ... StatementN in turn, and then executes the expression update. After this, it goes back to the beginning of step (2) again.
We can also 'manipulate' the output to produce the hexadecimal code. Hence to print out the ASCII table, the program above can be modified to:

    #include <iostream>
    using namespace std;

    int main()
    {
        int number;
        char character;

        for (number = 32 ; number <= 126 ; number = number + 1) {

            character = number;
            cout << "The character '" << character;
            cout << "' is represented as the number ";
            // "dec" and "hex" switch the base used to print integers;
            // "uppercase" prints hexadecimal digits as A-F.
            cout << dec << number << " decimal or ";
            cout << hex << uppercase << number << " hex.\n";

        }

        return 0;
    }

Program 2.2.2

which produces the output:

    The character ' ' is represented as the number 32 decimal or 20 hex.
    The character '!' is represented as the number 33 decimal or 21 hex.
    ...
    ...
    The character '}' is represented as the number 125 decimal or 7D hex.
    The character '~' is represented as the number 126 decimal or 7E hex.
Strings

Our example programs have made extensive use of the type "string" in their output. As we have seen, in C++ a string constant must be enclosed in double quotation marks. Hence we have seen output statements such as

    cout << "' is represented as the number ";

in programs. In fact, "string" is not a fundamental data type such as "int", "float" or "char". Instead, strings are represented as arrays of characters, so we will return to the subject of strings later, when we discuss arrays in general.

User Defined Data Types

Later in the course we will study the topic of data types in much more detail. We will see how the programmer may define his or her own data types. This facility provides a powerful programming tool when complex structures of data need to be represented and manipulated by a C++ program.
