5.1 Introduction to integer security: the integers consist of the natural numbers including 0 (0, 1, 2, 3, ...) together with the negatives of the nonzero natural numbers (-1, -2, -3, ...).

5.2 Integer data types: an integer type provides a model of a finite subset of the mathematical set of integers. The value of an object of integer type is the mathematical value attached to the object. The representation of that value is the particular encoding of the value as a bit pattern in the storage allocated for the object.

Each integer type object in C requires a fixed number of bytes of storage. The constant expression CHAR_BIT in the <limits.h> header gives the number of bits in a byte; it must be at least 8 but may be larger, depending on the implementation. With the exception of the unsigned char type, not all of the bits need to be used to represent the value. Unused bits are called padding.
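
The relationship between storage bits, value bits, and padding can be checked on a given implementation. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Total bits of storage occupied by an unsigned int object. */
size_t storage_bits_uint(void) {
    return sizeof(unsigned int) * CHAR_BIT;
}

/* Number of value bits in unsigned int, counted by shifting UINT_MAX
   right until it reaches zero. */
size_t value_bits_uint(void) {
    size_t bits = 0;
    for (unsigned int v = UINT_MAX; v != 0; v >>= 1) {
        ++bits;
    }
    return bits;
}

/* Padding bits are whatever storage is left over. */
size_t padding_bits_uint(void) {
    return storage_bits_uint() - value_bits_uint();
}
```

On most current desktop platforms padding_bits_uint() is expected to return 0, but the standard does not require that.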

The standard integer types consist of a set of signed integer types and the corresponding unsigned integer types.

Unsigned integer types: C requires that values of unsigned integer types be represented using a pure binary system with no offset. Unsigned integers are the natural choice for counters. The standard unsigned integer types (in order of non-decreasing width) are: unsigned char, unsigned short int, unsigned int, unsigned long int, unsigned long long int. The keyword int can be omitted unless it is the only integer type keyword present.

Compiler- and platform-specific integer limits are documented in the <limits.h> header. Keep in mind that these values are platform specific. For portability, use the named constants rather than the actual values in your code.

Wraparound: a computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value the type can represent. Because of wraparound, an unsigned integer expression can never evaluate to less than zero.

```cpp
#include <limits.h>
#include <stdio.h>

// Wraparound: calculations involving unsigned operands never overflow
void test_integer_security_wrap_around()
{
    unsigned int ui = UINT_MAX;
    fprintf(stdout, "ui value 1: %u\n", ui); // 4294967295 (32-bit unsigned int)
    ui++;
    fprintf(stdout, "ui value 2: %u\n", ui); // 0
    ui = 0;
    fprintf(stdout, "ui value 3: %u\n", ui); // 0
    ui--;
    fprintf(stdout, "ui value 4: %u\n", ui); // 4294967295

    // for (unsigned i = n; --i >= 0;) // this loop will never terminate

    unsigned int i = 0, j = 0, sum = 0;
    // ... some assignment operations on i, j, sum
    if (sum + i > UINT_MAX) { } // never true, because sum + i wraps around
    if (i > UINT_MAX - sum) { } // much better
    if (sum - j < 0) { } // never true, because sum - j wraps around
    if (j > sum) { } // correct
}
```

Unless the exact-width types defined in <stdint.h> are used, the width at which wraparound occurs is implementation dependent, so results can vary from platform to platform.
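
As an illustration of wraparound at a fixed width, the following sketch uses the exact-width types from <stdint.h>. The function names are hypothetical, and the code assumes the implementation provides uint32_t and uint16_t (both are optional in the standard).

```c
#include <stdint.h>

/* Increments a 32-bit sequence number. The result wraps to 0 after
   UINT32_MAX on every platform that provides uint32_t. */
uint32_t next_sequence(uint32_t seq) {
    return seq + 1u;
}

/* Adds two 16-bit values. The operands are promoted before the addition,
   so the cast back to uint16_t is what reduces the sum modulo 2^16. */
uint16_t add_u16(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}
```

For example, add_u16(65000, 1000) yields 66000 - 65536 = 464 regardless of the width of int.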

Signed integer types: signed integers are used to represent positive and negative values. The range of values depends on the number of bits allocated to the type and on the representation. In C, each unsigned type except _Bool has a corresponding signed type that occupies the same amount of storage. The standard signed integer types (in order of non-decreasing width; for example, long long int cannot be narrower than long int) are: signed char, short int, int, long int, long long int. Except for the char types, the keyword signed can be omitted (plain char behaves like either unsigned char or signed char, depending on the implementation, and for historical reasons is considered a separate type). The keyword int can be omitted unless it is the only keyword present.

All sufficiently small nonnegative values have the same representation in corresponding signed and unsigned types. One bit, called the sign bit and treated as the highest-order bit, indicates whether the represented value is negative. The C standard allows three ways of representing negative numbers: sign and magnitude, ones' complement, and two's complement.

(1). Sign and magnitude: the sign bit indicates whether the value is negative (sign bit set to 1) or positive (sign bit set to 0), and the other value bits (excluding padding) give the magnitude of the value in pure binary notation (the same as for the unsigned type). To negate a sign and magnitude value, just toggle the sign bit. For example, with a width of 10 bits, the binary number 0000101011 equals decimal 43 in pure binary or in sign and magnitude. To negate it, set the sign bit: binary 1000101011 equals decimal -43.

(2). Ones' complement: the sign bit has the weight -(2^(N-1) - 1), and the other value bits have the same weights as in the unsigned type. For example, in ones' complement the binary number 1111010100 equals decimal -43. Assuming a width of 10 bits, the sign bit has the weight -(2^9 - 1) = -511, and the remaining bits sum to 468, so 468 - 511 = -43. To negate a ones' complement value, toggle every bit (including the sign bit).

(3). Two's complement: the sign bit has the weight -(2^(N-1)), and the other value bits have the same weights as in the unsigned type. For example, in two's complement the binary number 1111010101 equals decimal -43. Assuming a width of 10 bits, the sign bit has the weight -(2^9) = -512, and the remaining bits sum to 469, so 469 - 512 = -43. To negate a two's complement value, first form the ones' complement negation and then add 1 (with carries as required).

For the mathematical value 0, sign and magnitude and ones' complement each have two representations: normal zero and negative zero. Logical operations may produce a negative zero, but no arithmetic operation may produce one unless one of the operands has a negative zero representation. The following table shows the sign and magnitude, ones' complement, and two's complement representations of some interesting values, assuming a width of 10 bits and ignoring padding bits. On a computer using two's complement, the range of a signed integer of width N is -2^(N-1) to 2^(N-1) - 1. With ones' complement or sign and magnitude, the lower bound becomes -2^(N-1) + 1, while the upper bound is unchanged.
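
The three worked examples above can be checked mechanically. The following is a minimal sketch; the function names are hypothetical and not from the text, and the width is assumed to be between 2 and 31 bits.

```c
#include <stdint.h>

/* Hypothetical helpers (not from the text): interpret the low `width` bits
   of `bits` under each representation. Assumes 2 <= width <= 31. */

/* Sign and magnitude: the sign bit only selects the sign. */
int32_t decode_sign_magnitude(uint32_t bits, unsigned width) {
    uint32_t sign_mask = (uint32_t)1 << (width - 1);
    int32_t magnitude = (int32_t)(bits & (sign_mask - 1));
    return (bits & sign_mask) ? -magnitude : magnitude;
}

/* Ones' complement: the sign bit has weight -(2^(width-1) - 1). */
int32_t decode_ones_complement(uint32_t bits, unsigned width) {
    uint32_t sign_mask = (uint32_t)1 << (width - 1);
    int32_t rest = (int32_t)(bits & (sign_mask - 1));
    return (bits & sign_mask) ? rest - (int32_t)(sign_mask - 1) : rest;
}

/* Two's complement: the sign bit has weight -(2^(width-1)). */
int32_t decode_twos_complement(uint32_t bits, unsigned width) {
    uint32_t sign_mask = (uint32_t)1 << (width - 1);
    int32_t rest = (int32_t)(bits & (sign_mask - 1));
    return (bits & sign_mask) ? rest - (int32_t)sign_mask : rest;
}
```

With a width of 10, the patterns 1000101011 (0x22B), 1111010100 (0x3D4), and 1111010101 (0x3D5) decode to -43 under sign and magnitude, ones' complement, and two's complement respectively. The all-ones pattern 0x3FF decodes to 0 in ones' complement, which is the negative zero mentioned above.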

Value range of signed integers: the minimum magnitude column in the following table gives the portable range guaranteed for each standard signed integer type. An implementation may replace these magnitudes with implementation-defined magnitudes of the same sign, such as those shown for the x86-32 architecture. The C standard requires the following minimum widths for the standard signed types: signed char (8), short (16), int (16), long (32), long long (64). The actual width on a given implementation can be inferred from the maximum representable values defined in <limits.h>. The size of objects of these types (the number of bytes of storage) is given by sizeof(typename) and includes padding bits, if any. The minimum and maximum values of an integer type depend on the type's representation, signedness, and width.

Integer overflow: overflow occurs when the result of a signed integer operation cannot be represented in the resulting type. Signed integer overflow in C is undefined behavior, which allows an implementation to silently wrap (the most common behavior), trap, or both. In two's complement, the negation of the most negative value of a given type cannot be represented in that type.

```cpp
#include <limits.h>
#include <stdio.h>
#include <cstdlib>
#include <iostream>

// Signed integer overflow (undefined behavior; the outputs shown are typical of silent wrapping)
void test_integer_security_overflow()
{
    int i = INT_MAX; // 2147483647, INT_MAX
    i++;
    fprintf(stdout, "i = %d\n", i); // -2147483648, INT_MIN
    i = INT_MIN; // -2147483648, INT_MIN
    i--;
    fprintf(stdout, "i = %d\n", i); // 2147483647, INT_MAX

    std::cout << "abs(INT_MIN): " << std::abs(INT_MIN) << std::endl; // -2147483648

    // Because two's complement is asymmetric (0 is represented as a "positive" number),
    // the negation of the most negative value of a type cannot be represented in that type.
    // For the most negative value, the result of this macro is undefined or incorrect:
    #define abs(n) ((n) < 0 ? -(n) : (n))
    #undef abs
}
```
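
One way to avoid the unrepresentable negation is to test the operand before negating it. The following is a minimal sketch; checked_negate and checked_abs are hypothetical helpers, not part of any standard library.

```c
#include <limits.h>

/* Hypothetical helper: stores -x in *result and returns 1, or returns 0
   without negating when -x is not representable in int. The comparison
   x < -INT_MAX can only be true when INT_MIN == -INT_MAX - 1, that is,
   under two's complement. */
int checked_negate(int x, int *result) {
    if (x < -INT_MAX) {
        return 0;
    }
    *result = -x;
    return 1;
}

/* Hypothetical helper: absolute value with the same failure convention. */
int checked_abs(int x, int *result) {
    if (x < 0) {
        return checked_negate(x, result);
    }
    *result = x;
    return 1;
}
```

A caller checks the return value before using *result, for example `if (!checked_abs(v, &a)) { /* handle error */ }`.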

Character types: use only explicitly signed char or unsigned char when a character type holds a numeric value. It is recommended that only signed char and unsigned char be used to store and use small numeric values (that is, values between SCHAR_MIN and SCHAR_MAX, or between 0 and UCHAR_MAX, respectively), because this is the only portable way to guarantee the signedness of the character type. Plain char should never be used to store numeric values, because the compiler is free to define char with the same range, representation, and behavior as either signed char or unsigned char.

```cpp
#include <stdio.h>

// Character types
void test_integer_security_char()
{
    {
        // Variable c of type char may be signed or unsigned.
        // If char is signed and 8 bits wide, the initial value 200 (which has type int) cannot be
        // represented in char, and the result of the conversion is implementation-defined.
        // Many compilers convert 200 to -56 by reducing it modulo 2^8
        char c = 200;
        int i = 1000;
        fprintf(stdout, "i/c = %d\n", i / c); // typically -17 on Windows/Linux: 1000 / -56 = -17
    }

    {
        // Declaring c as unsigned char makes the division independent of the signedness of char,
        // so the result is predictable
        unsigned char c = 200;
        int i = 1000;
        fprintf(stdout, "i/c = %d\n", i / c); // 5
    }
}
```

Data models: for a given compiler, the data model defines the sizes assigned to the standard data types. Data models are usually named with a pattern such as XXXn, where each X refers to a C type and n refers to a size (usually 32 or 64). For example, ILP64 means that int, long, and pointer types are 64 bits wide; LP32 means that long and pointer types are 32 bits wide.
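
A program can identify the data model of its own implementation from the storage sizes of int, long, and pointers. The sketch below is illustrative only: data_model_name is a hypothetical helper, and the names LP64, LLP64, and ILP32 are other commonly used data model names that follow the same pattern but are not mentioned in the text above.

```c
#include <limits.h>

/* Hypothetical helper: names the data model of the current implementation
   from the storage widths of int, long, and void *. */
const char *data_model_name(void) {
    const unsigned int_bits  = (unsigned)(sizeof(int) * CHAR_BIT);
    const unsigned long_bits = (unsigned)(sizeof(long) * CHAR_BIT);
    const unsigned ptr_bits  = (unsigned)(sizeof(void *) * CHAR_BIT);

    if (int_bits == 64 && long_bits == 64 && ptr_bits == 64) return "ILP64";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 16 && long_bits == 32 && ptr_bits == 32) return "LP32";
    return "other";
}
```

The function compares storage widths, so padding bits (if any) are counted; that is sufficient for naming the model but not for determining value ranges.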

Other integer types: C also defines other integer types in the standard headers <stdint.h>, <inttypes.h>, and <stddef.h>. These include the extended integer types, which are optional, implementation-defined, fully supported extensions that, together with the standard integer types, make up the general class of integer types. Identifiers such as whatever_t defined in the standard headers are typedefs, that is, synonyms for existing types rather than new types.

size_t: an unsigned integer type that is the type of the result of the sizeof operator, defined in the standard header <stddef.h>. A variable of type size_t is guaranteed to have enough precision to represent the size of an object. The maximum value of size_t is given by the SIZE_MAX macro.

ptrdiff_t: a signed integer type that is the type of the result of subtracting two pointers, defined in the standard header <stddef.h>. When two pointers are subtracted, the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t. The lower and upper limits of ptrdiff_t are given by PTRDIFF_MIN and PTRDIFF_MAX.

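```cpp
#include <stddef.h>
#include <stdio.h>

void test_integer_security_ptrdiff_t()
{
    typedef int T;
    T a[2] = { 5, 6 };
    // Pointer subtraction is defined only for pointers into the same array object
    T *p = &a[0], *q = &a[1];
    ptrdiff_t d = p - q;
    fprintf(stdout, "pointer diff: %td\n", d); // -1
    fprintf(stdout, "sizeof(ptrdiff_t): %zu\n", sizeof(ptrdiff_t)); // 8 on typical 64-bit platforms
}
```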

intmax_t and uintmax_t: the integer types of greatest width, which can represent any value representable by any other integer type of the same signedness. They allow conversion between programmer-defined integer types (of the same signedness) and intmax_t or uintmax_t.

```cpp
#include <stdint.h>
#include <stdio.h>

void test_integer_security_intmax_t()
{
    typedef unsigned long long mytypedef_t; // Pretend mytypedef_t is a 128-bit unsigned integer (it is not)
    fprintf(stdout, "mytypedef_t length: %zu\n", sizeof(mytypedef_t));

    mytypedef_t x = 0xffff;
    uintmax_t temp;
    temp = x; // Always safe

    mytypedef_t x2 = 0xffffffffffffffff;
    fprintf(stdout, "x2: %ju\n", (uintmax_t)x2); // Prints the correct value of x2 regardless of its width
}
```

The formatted I/O functions can be used to input and output values of the greatest-width integer types. The j length modifier in a format string indicates that a following d, i, o, u, x, X, or n conversion specifier applies to an argument of type intmax_t or uintmax_t.
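
A short sketch of the j length modifier in use. The wrapper names format_signed and format_unsigned are hypothetical; the only library call involved is the standard snprintf.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a signed value of any integer width by
   passing it as intmax_t and using the j length modifier.
   Returns the value returned by snprintf. */
int format_signed(char *buf, size_t size, intmax_t value) {
    return snprintf(buf, size, "%jd", value);
}

/* Hypothetical helper: the unsigned counterpart, using %ju. */
int format_unsigned(char *buf, size_t size, uintmax_t value) {
    return snprintf(buf, size, "%ju", value);
}
```

Because the parameter types are intmax_t and uintmax_t, a caller can pass a value of any narrower integer type of the same signedness without knowing its width.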

intptr_t and uintptr_t: the C standard does not guarantee that an integer type exists that is large enough to hold a pointer to an object. If such a type does exist, however, its signed version is called intptr_t and its unsigned version is called uintptr_t. Arithmetic on values of these types is not guaranteed to produce a useful value.
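
The intended use of these types is a round trip from pointer to integer and back. The following sketch assumes the implementation provides the optional uintptr_t type; pointer_round_trips is a hypothetical name.

```c
#include <stdint.h>

/* Assumes uintptr_t exists on this implementation (it is optional).
   Converts a pointer to an integer and back, and reports whether the
   result compares equal to the original pointer. */
int pointer_round_trips(void *p) {
    uintptr_t bits = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)bits;       /* integer -> pointer */
    return back == p;
}
```

Only the round trip is relied on here; the integer value itself is not used for arithmetic, in line with the caveat above.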

Platform-independent integer types of controlled width: C introduces integer types in the headers <stdint.h> and <inttypes.h> that provide typedefs giving programmers better control over width. These integer types are implementation-defined and include the following:

(1). int#_t, uint#_t: where # is an exact width, for example int8_t, uint32_t.

(2). int_least#_t, uint_least#_t: where # is a width value, for example int_least32_t, uint_least16_t.

(3). int_fast#_t, uint_fast#_t: where # is a width value for the fastest integer types, for example int_fast16_t, uint_fast64_t.

The <stdint.h> header also defines constant macros giving the corresponding maximum (and, for signed types, minimum) values of these types.

Platform-specific integer types: in addition to the integer types defined by the C standard, vendors often define platform-specific integer types. For example, the Microsoft Windows API defines a large number of integer types, including __int8, __int16, BOOL, CHAR, LONG64, and so on.

5.3 Integer conversions

Converting integers: conversion is a change in the underlying data type used to represent a value, resulting from an assignment, a type cast, or a computation. Converting from a type of one width to a type of greater width generally preserves the mathematical value. Converting in the opposite direction, however, can easily lose high-order bits (or worse, when signed integer types are involved) unless the magnitude of the value is always small enough to be represented correctly. Conversions occur explicitly as the result of a cast or implicitly as required by an operation. Although implicit conversions simplify programming, they can also lead to lost or misinterpreted data.

The C standard specifies how a C compiler should handle conversions, including the integer promotions, integer conversion rank, and the usual arithmetic conversions.

Integer conversion rank: every integer type has an integer conversion rank that determines how conversions are performed. The C standard defines the following rules for determining integer conversion rank:

(1). No two different signed integer types have the same rank, even if they have the same representation.

(2). The rank of a signed integer type is greater than the rank of any signed integer type with less precision.

(3). The rank of long long int is greater than the rank of long int, which is greater than the rank of int, which is greater than the rank of short int, which is greater than the rank of signed char.

(4). The rank of any unsigned integer type equals the rank of the corresponding signed integer type, if any.

(5). The rank of any standard integer type is greater than the rank of any extended integer type with the same width.

(6). The rank of _Bool is less than the rank of all other standard integer types.

(7). The rank of char equals the rank of signed char and unsigned char.

(8). The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation-defined but still subject to the other rules for determining integer conversion rank.

(9). For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.

The C standard recommends that the types used for size_t and ptrdiff_t should not have an integer conversion rank greater than that of signed long int unless the implementation supports objects large enough to make this necessary.

Integer promotions: an object or expression of an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int is promoted when it is used in an expression where an int or unsigned int is required. The integer promotions are applied as part of the usual arithmetic conversions.

```cpp
#include <limits.h>
#include <stdio.h>

void test_integer_security_promotion()
{
    {
        int sum = 0;
        char c1 = 'a', c2 = 'b';
        // The integer promotions require c1 and c2 to be promoted to int.
        // The two int values are then added, yielding an int that is stored in sum
        sum = c1 + c2;
        fprintf(stdout, "sum: %d\n", sum); // 195
    }

    {
        signed char cresult, c1, c2, c3;
        c1 = 100; c2 = 3; c3 = 4;
        // On platforms where signed char is an 8-bit two's complement type, the product of c1 and c2
        // exceeds the maximum value (+127) of signed char. Because of the integer promotions,
        // however, c1, c2, and c3 are all converted to int, so the whole expression is evaluated
        // successfully. The result is then truncated and stored in cresult. Because the final
        // result is within the range of signed char, the truncation neither loses nor misinterprets data
        cresult = c1 * c2 / c3;
        fprintf(stdout, "cresult: %d\n", cresult); // 75
    }

    {
        unsigned char uc = UCHAR_MAX; // 0xFF
        // As the operand of the bitwise complement operator "~", uc is zero-extended to 32 bits
        // and promoted to signed int. On the x86-32 platform this operation therefore always
        // yields a negative value of type signed int
        int i = ~uc;
        fprintf(stdout, "i: %x\n", (unsigned int)i); // ffffff00
    }
}
```

Integer promotions preserve value, including sign. If an int can represent all values of the original smaller type, the value is converted to int; otherwise, it is converted to unsigned int.

Integer promotions are needed mainly to avoid arithmetic errors caused by the overflow of intermediate values, and to allow operations to be performed at the natural size for the architecture.

Usual arithmetic conversions: a set of rules that yield a common type for the operands of an operator. A balancing conversion involves two operands of different types; either or both operands may be converted. Many operators that accept integer operands perform the usual arithmetic conversions on them, including *, /, %, +, -, <, >, <=, >=, ==, !=, &, ^, |, and the conditional operator (?:). After the integer promotions have been applied to both operands, the following rules are applied to the promoted operands:

(1). If both operands have the same type, no further conversion is needed.

(2). If both operands have signed integer types or both have unsigned integer types, the operand of the type with the lesser integer conversion rank is converted to the type of the operand with the greater rank. For example, if a signed int operand is paired with a signed long operand, the signed int operand is converted to signed long.

(3). If the operand of unsigned integer type has a rank greater than or equal to the rank of the type of the other operand, the operand of signed integer type is converted to the type of the unsigned operand. For example, if a signed int operand is paired with an unsigned int operand, the signed int operand is converted to unsigned int.

(4). If the type of the operand of signed integer type can represent all of the values of the type of the operand of unsigned integer type, the unsigned operand is converted to the type of the signed operand. For example, if a 64-bit two's complement signed long operand is paired with a 32-bit unsigned int operand, the unsigned int operand is converted to signed long.

(5). Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand of signed integer type.
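
Rule (3) is the one that most often surprises: comparing an int with an unsigned int converts the int to unsigned int first. The sketch below contrasts the implicit conversion with a value-preserving comparison; both function names are hypothetical.

```c
/* Naive comparison: by rule (3), s is converted to unsigned int before
   the comparison, so a negative s becomes a very large value. */
int naive_less(int s, unsigned int u) {
    return s < u;
}

/* Value-preserving comparison: a negative s is less than any unsigned
   value; otherwise the conversion to unsigned int keeps the value. */
int safe_less(int s, unsigned int u) {
    return s < 0 || (unsigned int)s < u;
}
```

With these definitions, naive_less(-1, 1u) is 0 while safe_less(-1, 1u) is 1. Many compilers can warn about the mixed-sign comparison in naive_less.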

Conversions from unsigned integer types: converting from a smaller unsigned integer type to a larger unsigned integer type is always safe and is typically done by zero-extending the value. When an expression contains unsigned integer operands of different widths, the C standard requires that the result of each operation have the type (and range of representation) of the wider operand. If the corresponding mathematical operation produces a result within the range of the result type, the resulting represented value is that mathematical value. If the mathematical result cannot be represented in the result type, two cases arise: unsigned, loss of precision; and unsigned to signed:

```cpp
#include <limits.h>
#include <stdio.h>

void test_integer_security_unsigned_conversion()
{
    {
        // Unsigned, loss of precision
        unsigned int ui = 300;
        // When the value stored in ui is assigned to uc, 300 is reduced modulo 2^8: 300 - 256 = 44
        unsigned char uc = ui;
        fprintf(stdout, "uc: %u\n", uc); // 44
    }

    {
        // Unsigned to signed
        unsigned long int ul = ULONG_MAX;
        signed char sc;
        sc = ul; // May cause a truncation error
        fprintf(stdout, "sc: %d\n", sc); // -1 (typical; the result is implementation-defined)
    }

    {
        // Validate the range when converting from an unsigned type to a signed type
        unsigned long int ul = ULONG_MAX;
        signed char sc;
        if (ul <= SCHAR_MAX) {
            sc = (signed char)ul; // The cast eliminates the warning
        } else { // Handle the error condition
            fprintf(stderr, "fail\n");
        }
    }
}
```

(1). Unsigned, loss of precision: for unsigned integer types only, C specifies that the value is reduced modulo 2^w(type), where 2^w(type) is one greater than the largest value the result type can represent. Converting a value of an unsigned integer type to a narrower unsigned type is therefore well defined as reduction modulo the narrower width. This is done by truncating the larger value and keeping its low-order bits. If the value cannot be represented in the new type, data is lost. Whenever a value cannot be represented in the new type, a conversion between signed and unsigned integer types of any size can result in lost or misinterpreted data.

(2). Unsigned to signed: when a large unsigned value is converted to a signed type of the same width, the C standard specifies that if the original value cannot be represented in the new (signed) type, either the result is implementation-defined or an implementation-defined signal is raised. Type range errors, including loss of data (truncation) and loss of sign (sign errors), can occur when converting from an unsigned type to a signed type. When a large unsigned integer is converted to a smaller signed integer type, the value is truncated and the high-order bit becomes the sign bit. The resulting value may be negative or positive, depending on the value of the high-order bit after truncation. If the value cannot be represented in the new type, data is lost (or misinterpreted). Validate the range when converting from an unsigned type to a signed type.

The following table summarizes the conversion of unsigned integer types in x86-32 architecture:

Conversions from signed integer types: converting from a smaller signed integer type to a larger signed integer type is always safe and can be done in two's complement representation by sign-extending the value:

```cpp
#include <limits.h>
#include <stdio.h>

void test_integer_security_signed_conversion()
{
    {
        // Signed, loss of precision
        signed long int sl = LONG_MAX;
        signed char sc = (signed char)sl; // The cast eliminates the warning
        fprintf(stdout, "sc: %d\n", sc); // -1 (typical; the result is implementation-defined)
    }

    {
        // Validate the range when converting from a signed type to a signed type with less precision
        signed long int sl = LONG_MAX;
        signed char sc;
        if ((sl < SCHAR_MIN) || (sl > SCHAR_MAX)) { // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            sc = (signed char)sl; // The cast eliminates the warning
            fprintf(stdout, "sc: %d\n", sc);
        }
    }

    {
        // Comparing a negative value with an unsigned value
        unsigned int ui = UINT_MAX;
        signed char c = -1;
        // c is promoted to int and then, by the usual arithmetic conversions, converted to
        // unsigned int with the value 0xFFFFFFFF, i.e. 4294967295
        if (c == ui) {
            fprintf(stderr, "why is -1 = 4294967295\n");
        }
    }

    {
        // Type range errors, including loss of data (truncation) and loss of sign (sign errors),
        // can occur when converting from a signed type to an unsigned type
        signed int si = INT_MIN;
        // Causes loss of sign
        unsigned int ui = (unsigned int)si; // The cast eliminates the warning
        fprintf(stderr, "ui: %u\n", ui); // 2147483648
    }

    {
        // Validate the range when converting from a signed type to an unsigned type
        signed int si = INT_MIN;
        unsigned int ui;
        if (si < 0) { // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            ui = (unsigned int)si; // The cast eliminates the warning
            fprintf(stdout, "ui: %u\n", ui);
        }
    }
}
```

(1). Signed, loss of precision: the result of converting a value of a signed integer type to a narrower signed type is implementation-defined, or an implementation-defined signal may be raised. A common implementation is to truncate to the smaller size. In that case, the resulting value may be negative or positive, depending on the value of the high-order bit after truncation. If the value cannot be represented in the new type, data is lost (or misinterpreted). Validate the range when converting from a signed type to a signed type with less precision; both the upper and lower bounds must be checked.

(2). Signed to unsigned: when signed and unsigned integer types are mixed in an operation, the usual arithmetic conversions determine the common type, which is at least as wide as the widest type involved. C requires that if the mathematical result is representable in that width, that value is produced. When a value of a signed integer type is converted to an unsigned integer type, one more than the maximum value of the new type (2^N for a width of N) is repeatedly added or subtracted until the result is within the representable range. When a signed integer value is converted to an unsigned integer type of equal or greater width and the signed value is not negative, the value is unchanged.

When a signed integer type is converted to an unsigned integer type of equal width in two's complement, no data is lost because the bit pattern is preserved. However, the high-order bit loses its function as a sign bit. If the signed value is not negative, the value is unchanged. If the value is negative, the resulting unsigned value is interpreted as a large unsigned integer. For example, if the signed value is -2, the corresponding unsigned int value is UINT_MAX - 1. Validate the range when converting from a signed type to an unsigned type.

The following table summarizes the conversion of signed integer types on x86-32 platforms:

Conversion implications: implicit conversions simplify C programming, but conversions carry the potential for lost or misinterpreted data. Avoid conversions that result in (1) loss of value: conversion to a type that cannot represent the magnitude of the value; or (2) loss of sign: conversion from a signed type to an unsigned type that loses the sign.

The only integer type conversion guaranteed to be safe for all data values and all conforming implementations is a conversion to a wider type of the same signedness.

5.4 Integer operations: integer operations can result in exceptional conditions such as overflow, wraparound, and truncation. An exceptional condition occurs when the result of an operation cannot be represented in the resulting type. The following table shows the exceptional conditions possible for each integer operation, excluding errors caused by the usual arithmetic conversions applied when the operands are balanced to a common type:

Assignment: in a simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand. Assigning a signed integer to an unsigned integer, or an unsigned integer to a signed integer of equal width, can cause the resulting value to be misinterpreted. Truncation occurs when a value is assigned or cast from a wider type to a narrower type. If the value cannot be represented in the resulting type, data may be lost.

```cpp
#include <stdio.h>

int f_5_4(void) { return 66; }

void test_integer_security_assignment()
{
    {
        char c;
        // The int value returned by f_5_4 may be truncated when stored in the char and is then
        // converted back to int width before the comparison. On implementations where plain char
        // has the same range as unsigned char, the converted value cannot be negative, so the
        // operands below can never compare equal. For full portability, declare c as int
        if ((c = f_5_4()) == -1) {}
    }

    {
        char c = 'a';
        int i = 1;
        long l;
        // The value of i is converted to the type of the assignment expression c = i, that is, char.
        // The value of the parenthesized expression is then converted to the type of the outer
        // assignment expression, long int. If the value of i is not in the range of char, the
        // comparison l == i is not true after this series of assignments
        l = (c = i);
    }

    {
        // Assigning a signed integer to an unsigned integer, or an unsigned integer to a signed
        // integer of equal width, can cause the resulting value to be misinterpreted
        int si = -3;
        // Because the new type is unsigned, the value is converted by repeatedly adding or
        // subtracting one more than the maximum value of the new type until it is in range.
        // Accessed as an unsigned value, the result is misinterpreted as a large positive value
        unsigned int ui = si;
        fprintf(stdout, "ui = %u\n", ui); // 4294967293
        fprintf(stdout, "ui = %d\n", (int)ui); // -3
        // On most implementations the original value is easily recovered by the reverse operation
        si = ui;
        fprintf(stdout, "si = %d\n", si); // -3
    }

    {
        unsigned char sum, c1, c2;
        c1 = 200; c2 = 90;
        // The sum of c1 and c2 is outside the range of unsigned char and is truncated when assigned to sum
        sum = c1 + c2;
        fprintf(stdout, "sum = %u\n", sum); // 34
    }
}
```

Addition: can be used to add two arithmetic operands or a pointer and an integer. If both operands are of arithmetic type, the usual arithmetic conversions are performed on them. The result of the binary + operator is the sum of its operands. Incrementing is equivalent to adding 1. When an expression of integer type is added to a pointer, the result is a pointer; this is called pointer arithmetic. The sum of two integers can always be represented using one more bit than the width of the wider of the two operands, so it is representable in any type that is at least one bit wider. Integer addition can overflow or wrap around if the resulting integer type does not have enough bits to represent the result.

```cpp
#include <limits.h>
#include <stdio.h>

void test_integer_security_add()
{
    {
        // Precondition test, two's complement: detects signed overflow.
        // This solution works only on architectures that use two's complement
        signed int si1, si2, sum;
        si1 = -40; si2 = 30;
        unsigned int usum = (unsigned int)si1 + si2;
        fprintf(stdout, "usum = %x, si1 = %x, si2 = %x, int_min = %x\n", usum, si1, si2, INT_MIN);
        // XOR acts as a bitwise "not equal" operation. Only the sign position matters, so the
        // expression is masked with INT_MIN, which has only the sign bit set
        if ((usum ^ si1) & (usum ^ si2) & INT_MIN) { // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            sum = si1 + si2;
            fprintf(stdout, "sum = %d\n", sum);
        }
    }

    {
        // General precondition test
        signed int si1, si2, sum;
        si1 = -40; si2 = 30;
        if ((si2 > 0 && si1 > INT_MAX - si2) || (si2 < 0 && si1 < INT_MIN - si2)) { // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            sum = si1 + si2;
            fprintf(stdout, "sum = %d\n", sum);
        }
    }

    {
        // Precondition test: ensures that wraparound cannot occur
        unsigned int ui1, ui2, usum;
        ui1 = 10; ui2 = 20;
        if (UINT_MAX - ui1 < ui2) { // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            usum = ui1 + ui2;
            fprintf(stdout, "usum = %u\n", usum);
        }
    }

    {
        // Postcondition test
        unsigned int ui1, ui2, usum;
        ui1 = 10; ui2 = 20;
        usum = ui1 + ui2;
        if (usum < ui1) { // Handle the error condition
            fprintf(stderr, "fail\n");
        }
    }
}
```

Avoid or detect signed overflow caused by addition: signed overflow is undefined behavior in C, which permits an implementation to wrap silently (the most common behavior), trap, saturate (clamp at the maximum or minimum value), or do anything else it chooses.

Downcasting from a larger type: the true sum of any two signed integer values of width w can always be represented in w+1 bits. As a result, performing the addition in a wider type always succeeds. The resulting value can be range-checked and then cast down to the original type. In general, this solution is implementation dependent, because the C standard does not guarantee that any standard integer type is wider than another.
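The widen, check, then narrow approach can be sketched as follows. This is a minimal sketch, assuming that long long is strictly wider than int on the target platform (the static_assert enforces that assumption); the helper name add_via_wider_type is hypothetical.

```cpp
#include <cassert>
#include <climits>

// Assumption: long long is strictly wider than int on this platform.
static_assert(sizeof(long long) > sizeof(int),
              "long long must be wider than int for this technique");

// Hypothetical helper: stores a + b in *out and returns true,
// or returns false (leaving *out untouched) if the sum does not fit in int.
bool add_via_wider_type(int a, int b, int* out)
{
    // The addition is performed in the wider type, so it cannot overflow.
    long long wide = static_cast<long long>(a) + static_cast<long long>(b);
    if (wide > INT_MAX || wide < INT_MIN) {
        return false;  // the true sum is out of range for int
    }
    *out = static_cast<int>(wide);  // safe: the value was range-checked
    return true;
}
```

A caller would test the return value before using the result, for example `if (!add_via_wider_type(x, y, &sum)) { /* handle the error */ }`.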

Avoid or detect wraparound caused by addition: when two unsigned values are added, wraparound occurs if the sum of the operands is greater than the maximum value the result type can store. Although unsigned integer wraparound is well defined as modulo behavior in the C standard, unexpected wraparound has led to many software vulnerabilities.

Postcondition test: after the operation is performed, the resulting value is tested to determine whether it is within the valid range. This approach is ineffective if an exceptional condition can produce an apparently valid value; for unsigned addition, however, a postcondition test can always detect wraparound.

Subtraction: like addition, subtraction is an additive operation. For subtraction, both operands must have arithmetic type or be pointers to compatible object types. It is also legal to subtract an integer from a pointer. Decrementing is equivalent to subtracting 1. If the difference of two unsigned operands would be negative, unsigned subtraction wraps around.

```cpp
#include <climits>
#include <cstdio>

void test_integer_security_substruction()
{
    {
        // Precondition test: subtracting two operands with the same sign cannot overflow
        signed int si1, si2, result;
        si1 = 10; si2 = -20;

        // If the operands have different signs and the sign of the result differs from
        // the sign of the first operand, the subtraction has overflowed.
        // XOR is used as a bitwise "not equal" operation. To test only the sign bit,
        // the expression is masked with INT_MIN, which has only the sign bit set.
        // This solution applies only to two's complement architectures.
        if ((si1 ^ si2) & (((unsigned int)si1 - si2) ^ si1) & INT_MIN) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            result = si1 - si2;
            fprintf(stdout, "result = %d\n", result);
        }

        // Portable precondition test
        if ((si2 > 0 && si1 < INT_MIN + si2) || (si2 < 0 && si1 > INT_MAX + si2)) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            result = si1 - si2;
            fprintf(stdout, "result = %d\n", result);
        }
    }

    {
        // Precondition test for unsigned subtraction: guarantees that wraparound cannot occur
        unsigned int ui1, ui2, udiff;
        ui1 = 10; ui2 = 20;
        if (ui1 < ui2) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            udiff = ui1 - ui2;
            fprintf(stdout, "udiff = %u\n", udiff);
        }
    }

    {
        // Postcondition test
        unsigned int ui1, ui2, udiff;
        ui1 = 10; ui2 = 20;
        udiff = ui1 - ui2;
        if (udiff > ui1) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        }
    }
}
```

Multiplication: in C, the binary "*" operator yields the product of its operands. Each operand of the binary "*" operator must have arithmetic type, and the usual arithmetic conversions are performed on the operands. Multiplication is prone to overflow errors, because even relatively small operands can overflow a given integer type when multiplied. In general, the product of two integer operands can always be represented in twice the number of bits used by the wider of the two operands. This means, for example, that the product of two 8-bit operands can always be represented in a 16-bit type, and the product of two 16-bit operands can always be represented in a 32-bit type.

```cpp
#include <climits>
#include <cstdio>

void test_integer_security_multiplication()
{
    {
        // For unsigned multiplication, if high-order bits are needed to represent
        // the product of the two operands, the result has wrapped around
        unsigned int ui1 = 10;
        unsigned int ui2 = 20;
        unsigned int product;

        static_assert(sizeof(unsigned long long) >= 2 * sizeof(unsigned int),
            "Unable to detect wrapping after multiplication");

        unsigned long long tmp = (unsigned long long)ui1 * (unsigned long long)ui2;
        if (tmp > UINT_MAX) {
            // Handle unsigned wraparound
            fprintf(stderr, "fail\n");
        } else {
            product = (unsigned int)tmp;
            fprintf(stdout, "product = %u\n", product);
        }
    }

    {
        // Guarantees that signed overflow cannot occur on systems where
        // long long is at least twice as wide as int
        signed int si1 = 20, si2 = 10;
        signed int result;

        static_assert(sizeof(long long) >= 2 * sizeof(int),
            "Unable to detect overflow after multiplication");

        long long tmp = (long long)si1 * (long long)si2;
        if ((tmp > INT_MAX) || (tmp < INT_MIN)) {
            // Handle signed overflow
            fprintf(stderr, "fail\n");
        } else {
            result = (int)tmp;
            fprintf(stdout, "result = %d\n", result);
        }
    }

    {
        // General precondition test
        unsigned int ui1 = 10, ui2 = 20;
        unsigned int product;

        // The ui2 != 0 guard avoids division by zero inside the check itself
        if (ui2 != 0 && ui1 > UINT_MAX / ui2) {
            // Handle unsigned wraparound
            fprintf(stderr, "fail\n");
        } else {
            product = ui1 * ui2;
            fprintf(stdout, "product = %u\n", product);
        }
    }

    {
        // Prevents signed overflow without casting up to a type twice as wide
        signed int si1 = 10, si2 = 20;
        signed int product;
        bool overflow = false;

        if (si1 > 0) {
            if (si2 > 0) { // si1 and si2 are both positive
                overflow = si1 > (INT_MAX / si2);
            } else {       // si1 is positive, si2 is not positive
                overflow = si2 < (INT_MIN / si1);
            }
        } else {
            if (si2 > 0) { // si1 is not positive, si2 is positive
                overflow = si1 < (INT_MIN / si2);
            } else {       // neither si1 nor si2 is positive
                overflow = (si1 != 0) && (si2 < (INT_MAX / si1));
            }
        }

        if (overflow) {
            // Handle the error condition; the multiplication is not performed
            fprintf(stderr, "fail\n");
        } else {
            product = si1 * si2;
            fprintf(stdout, "product = %d\n", product);
        }
    }
}
```

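A static assertion (static_assert) tests the value of a constant expression at compile time. The multiplication examples use it to confirm that the wider type is at least twice as wide as the operand type.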

Division and remainder: when integers are divided, the result of the "/" operator is the algebraic quotient with any fractional part discarded, and the result of the "%" operator is the remainder. This is often called truncation toward zero. In both operations, if the value of the second operand is 0, the behavior is undefined. Unsigned integer division cannot wrap around, because the quotient is always less than or equal to the dividend. It is less obvious that signed integer division can overflow, because one might expect the quotient never to exceed the dividend in magnitude. However, an integer overflow occurs when the minimum two's complement value is divided by -1.

```cpp
#include <climits>
#include <cstdio>

void test_integer_security_division_remainder()
{
    // Precondition: overflow of signed integer division is prevented by checking whether
    // the numerator is the minimum value of the integer type and the denominator is -1.
    // Division by zero is prevented by making sure the divisor is not 0.
    signed long sl1 = 100, sl2 = 5;
    signed long quotient, result;

    // The same precondition protects the remainder operation against
    // division by zero and (internal) overflow
    if ((sl2 == 0) || ((sl1 == LONG_MIN) && (sl2 == -1))) {
        // Handle the error condition
        fprintf(stderr, "fail\n");
    } else {
        quotient = sl1 / sl2;
        result = sl1 % sl2;
        fprintf(stdout, "quotient = %ld, result = %ld\n", quotient, result);
    }
}
```

The C11 standard specifies that if the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is undefined.
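As a worked illustration of truncation toward zero and of the identity above: -7 / 2 is -3 and -7 % 2 is -1, so (-3) * 2 + (-1) gives back -7. The sketch below checks the identity for a given pair of operands. The helper name division_identity_holds is hypothetical, and the caller is assumed to have already excluded b == 0 and the INT_MIN / -1 case.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: checks that (a/b)*b + a%b == a.
// Precondition (caller's responsibility): b != 0 and !(a == INT_MIN && b == -1),
// otherwise the behavior of a/b and a%b is undefined.
bool division_identity_holds(int a, int b)
{
    int q = a / b;  // algebraic quotient, truncated toward zero
    int r = a % b;  // remainder; takes the sign of the dividend
    // |q * b| <= |a|, and r has the same sign as a, so q * b + r cannot overflow
    return q * b + r == a;
}
```

Note that the remainder takes the sign of the dividend: 7 % -2 is 1, while -7 % 2 is -1.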

Many hardware platforms implement remainder as part of the division instruction, which can overflow. When the dividend equals the minimum (negative) value of the signed integer type and the divisor equals -1, overflow can occur during the remainder operation.

Postcondition: the normal C++ exception handling mechanism does not allow an application to recover from a hardware exception such as an access violation or a divide-by-zero error. Microsoft provides a facility called structured exception handling (SEH) for handling such hardware and other exceptions. Structured exception handling is an operating system facility that is distinct from C++ exception handling. Microsoft also provides a set of extensions to the C language that let C programs handle Win32 structured exceptions. In the Linux environment, hardware exceptions such as division errors are handled through the signal mechanism. In particular, if the divisor is 0 or the quotient is too large for the destination register, the system generates a floating-point exception (SIGFPE). This signal is raised even when the exception is produced by an integer operation rather than a floating-point operation. To prevent the program from terminating abnormally in this case, you can install a signal handler with a call to the signal function.

Unary negation (-): negating a signed integer in two's complement representation can also overflow, because the range of values of a signed integer type is asymmetric: the minimum value has no positive counterpart.
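A precondition test for negation only needs to rule out the single problematic value. The following is a minimal sketch, assuming a two's complement int; the helper name checked_negate is hypothetical.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: stores -value in *out and returns true,
// or returns false if the negation is not representable.
bool checked_negate(int value, int* out)
{
    // In two's complement, INT_MIN == -INT_MAX - 1, so -INT_MIN does not fit in int.
    if (value == INT_MIN) {
        return false;
    }
    *out = -value;
    return true;
}
```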

Shift: shift operations include left shift and right shift. The integer promotions are performed on the operands, each of which must have integer type. The type of the result is that of the promoted left operand. The right operand gives the number of bits to shift by. If that value is negative, or greater than or equal to the width of the promoted left operand, the behavior is undefined. In almost all cases, attempting to shift by a negative number of bits, or by more bits than exist in the operand, indicates a logic error. This is different from overflow, which is a representational deficiency. Do not shift by a negative number of bits or by more bits than exist in the operand.

```cpp
#include <climits>
#include <cstdio>

void test_integer_security_shift()
{
    {
        // Eliminates the possibility of undefined behavior from a left shift of an unsigned integer
        unsigned int ui1 = 1, ui2 = 31;
        unsigned int uresult;
        if (ui2 >= sizeof(unsigned int) * CHAR_BIT) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            uresult = ui1 << ui2;
            fprintf(stdout, "uresult = %u\n", uresult);
        }
    }

    {
        int rc = 0;
        // With "int stringify = 0x80000000;" the right shift of a negative value may
        // sign-extend, "%u" then produces a 10-digit number, and an unbounded sprintf
        // into buf overflows it (the original sprintf call crashed on Windows and Linux).
        unsigned int stringify = 0x80000000;
        char buf[sizeof("256")] = {0};
        // snprintf bounds the write, so the length check below is meaningful
        rc = snprintf(buf, sizeof(buf), "%u", stringify >> 24);
        if (rc < 0 || (size_t)rc >= sizeof(buf)) {
            // Handle the error
            fprintf(stderr, "fail\n");
        } else {
            fprintf(stdout, "value: %s\n", buf); // 128
        }
    }
}
```

Shift left: the result of E1 << E2 is E1 shifted left by E2 bit positions; vacated bits are filled with zeros. If E1 has a signed type and a non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined. Shift operators and other bitwise operators should be used only with unsigned integer operands. A left shift can be used in place of multiplication by a power of 2. Shifting has traditionally been faster than multiplication, but it is best to use left shift only when the goal is bit manipulation.
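The conditions under which a signed left shift is defined can be turned into a precondition test. The following is a minimal sketch, assuming that int has no padding bits (so its width is sizeof(int) * CHAR_BIT); the helper name checked_shift_left is hypothetical.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: stores value << count in *out and returns true,
// or returns false if the shift would be undefined.
bool checked_shift_left(int value, int count, int* out)
{
    // Assumption: int has no padding bits.
    const int width = static_cast<int>(sizeof(int) * CHAR_BIT);

    // Negative left operands and out-of-range shift counts are rejected.
    if (value < 0 || count < 0 || count >= width) {
        return false;
    }
    // value * 2^count must be representable: value <= INT_MAX / 2^count.
    // Right-shifting the non-negative INT_MAX is well defined.
    if (value > (INT_MAX >> count)) {
        return false;
    }
    *out = value << count;
    return true;
}
```

For example, shifting 3 left by 2 yields 12, while shifting 1 left into the sign bit position is rejected.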

Shift right: the result of E1 >> E2 is E1 shifted right by E2 bit positions. If E1 has an unsigned type, or a signed type and a non-negative value, the result is the integral part of the quotient E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined and may be either an arithmetic (sign-extending) shift or a logical shift.

Because a left shift can replace multiplication by a power of 2, it is often assumed that a right shift can replace division by a power of 2. However, this is true only when the shifted value is non-negative, for two reasons. First, whether a right shift of a negative value is arithmetic or logical is implementation-defined. Second, even on a platform known to perform arithmetic right shifts, the result differs from that of division: the shift rounds toward negative infinity, whereas division truncates toward zero. In addition, modern compilers can determine when it is safe to use a shift in place of division, and will do so when shifts are faster on the target architecture. For these reasons, and to keep code clear and readable, use shifts only when the goal is bit manipulation, and use division when performing ordinary arithmetic.
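The difference between division and an arithmetic right shift can be shown without relying on implementation-defined behavior. The sketch below computes both roundings portably; the helper names are hypothetical, and k is assumed to satisfy 0 <= k < width of int - 1 so that 1 << k is representable.

```cpp
#include <cassert>

// Truncating division by 2^k: this is what the "/" operator computes.
// Precondition: 0 <= k < (width of int) - 1.
int trunc_div_pow2(int value, int k)
{
    return value / (1 << k);
}

// Floor division by 2^k: this is what an arithmetic right shift computes
// on platforms that provide one. Same precondition on k.
int floor_div_pow2(int value, int k)
{
    int divisor = 1 << k;
    int q = value / divisor;
    if (value < 0 && value % divisor != 0) {
        --q;  // round toward negative infinity instead of toward zero
    }
    return q;
}
```

For -7 and k = 1, truncating division yields -3 while floor division yields -4; for non-negative values the two agree, which is why the shift is a valid substitute only in that case.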

5.5 integer vulnerabilities: security flaws can result from hardware-level integer error conditions or from faulty program logic involving integers. When combined with other conditions, these security flaws can result in vulnerabilities.

Wraparound: not every unsigned integer wraparound is a security flaw. The well-defined modulo arithmetic of unsigned integers is often used intentionally, for example in hash algorithms and in the example implementation of rand() in the C standard.
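As an illustration of intentional wraparound, the following sketch adapts the example implementation of rand() and srand() given in the C standard. The functions are renamed example_rand and example_srand (hypothetical names) to avoid clashing with the library versions. The multiplication is expected to wrap modulo the range of unsigned long, and that wraparound is part of the design.

```cpp
#include <cassert>

// State of the generator; unsigned arithmetic on it wraps modulo 2^N by definition.
static unsigned long int next_state = 1;

// Adapted from the example in the C standard; RAND_MAX is assumed to be 32767.
int example_rand(void)
{
    // Intentional wraparound: the product may exceed ULONG_MAX and is reduced modulo 2^N.
    next_state = next_state * 1103515245 + 12345;
    return (unsigned int)(next_state / 65536) % 32768;
}

void example_srand(unsigned int seed)
{
    next_state = seed;
}
```

Later values in the sequence depend on the width of unsigned long, because the reduction modulo 2^N happens at a different point on 32-bit and 64-bit implementations.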

```cpp
#include <cstdio>
#include <cstdlib>
#include <iostream>

void test_integer_security_wrap_around2()
{
    {
        // An example of a real vulnerability caused by unsigned integer wraparound
        size_t len = 1;
        const char* src = "comment";
        size_t size;
        size = len - 2;
        // With a 32-bit size_t this prints: 4294967295, ffffffff, 0, 0
        fprintf(stderr, "size = %zu, %zx, %zx, %zu\n", size, size, size + 1, size + 1);

        char* comment = (char*)malloc(size + 1);
        //memcpy(comment, src, size); // crash
        free(comment);
    }

    {
        int element_t;
        int count = 10;
        // The library function calloc takes two arguments: the number of elements and
        // the storage size of the element type. To find the amount of memory required,
        // the number of elements is multiplied by the element size. If the result cannot
        // be represented as an unsigned integer of type size_t, the allocation may appear
        // to succeed while allocating an area that is too small. As a result, the
        // application may write beyond the bounds of the allocated buffer, causing a
        // heap-based buffer overflow.
        char* p = (char*)calloc(count, sizeof(element_t));
        free(p);
    }

    {
        int off = 1, len = 2;
        int type_name;
        // off and len are declared as signed int. The C standard defines the sizeof
        // operator as yielding an unsigned integer type (size_t), so on implementations
        // where size_t is at least as wide as signed int, the integer conversion rules
        // require len - sizeof(type_name) to be computed as an unsigned value.
        // If len is smaller than the value returned by sizeof, the subtraction wraps
        // around and produces a large positive value.
        std::cout << "len - sizeof(type_name): " << len - sizeof(type_name) << std::endl; // 18446744073709551614 with a 64-bit size_t
        if (off > len - sizeof(type_name)) return;

        // To eliminate the problem, the range check can be written as the alternative below.
        // The programmer must still ensure that the addition cannot wrap around, by
        // guaranteeing that the value of off lies within a defined range. To eliminate
        // potential conversion errors, both off and len should be declared as size_t here.
        if ((off + sizeof(type_name)) > len) return;
    }
}
```

Conversion and truncation errors:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

void test_integer_security_conversion_truncation()
{
    {
        // Security vulnerability caused by a conversion error
        int size = 5;
        int MAX_ARRAY_SIZE = 10;
        // If size is negative, this check passes and malloc() is passed a negative size.
        // Because malloc() takes a size_t argument, size is converted to a huge unsigned
        // number. When a signed integer is converted to an unsigned integer type, one more
        // than the maximum value of the new type (2^N) is repeatedly added or subtracted
        // until the result falls within the representable range. The conversion can
        // therefore produce a value greater than MAX_ARRAY_SIZE.
        // The error can be eliminated by declaring size as size_t instead of int.
        if (size < MAX_ARRAY_SIZE) {
            // Initialize the array
            char* array = (char*)malloc(size);
            free(array);
        } else {
            // Handle the error
            fprintf(stderr, "fail\n");
        }
    }

    {
        // Buffer overflow caused by an integer truncation error
        const char* argv[3] = {"", "abc", "123"};
        unsigned short int total;
        // An attacker may supply two arguments whose combined length cannot be represented
        // by the unsigned short integer total, so that the stored value is reduced modulo
        // one more than the largest value the type can represent.
        // strlen returns a result of the unsigned integer type size_t. On most
        // implementations size_t is wider than unsigned short, so a demotion is required.
        // The strcpy and strcat calls then overflow the undersized buffer.
        total = strlen(argv[1]) + strlen(argv[2]) + 1;
        char* buff = (char*)malloc(total);
        strcpy(buff, argv[1]);
        strcat(buff, argv[2]);
        fprintf(stdout, "buff: %s\n", buff);
        free(buff);
    }
}
```

Nonexceptional integer logic errors:

```cpp
#include <cstdlib>

void test_integer_security_integer_logic()
{
    int* table = nullptr;
    int pos = 50, value = 10;

    if (!table) {
        table = (int*)malloc(sizeof(int) * 100);
        if (!table) return;
    }

    // The insertion position pos lacks a complete range check, which creates a
    // vulnerability. Because pos is declared as a signed integer, the value passed
    // to the function can be either positive or negative.
    if (pos > 99) {
        free(table);
        return;
    }

    // If pos is negative, value is written pos * sizeof(int) bytes before the start
    // of the actual buffer.
    // To eliminate the flaw, declare the formal parameter pos as an unsigned integer
    // type, or check both the upper and lower bounds as part of the range check.
    table[pos] = value; // Equivalent to: *(int *)((char *)table + (pos * sizeof(int))) = value;
    free(table);
}
```

5.6 mitigation strategies: integer vulnerabilities result from integer type range errors. For example, integer overflow occurs when an integer operation produces a value outside the range of a particular integer type. A truncation error occurs when a result is stored in a type that is too small to hold it. Conversions, particularly those resulting from assignments or casts, can produce a converted value that is outside the range of the resulting type.
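A truncation error of the kind described above can be avoided by range-checking a value before storing it in a narrower type. The following is a minimal sketch; the helper name narrow_to_ushort is hypothetical, and size_t is assumed to be wider than unsigned short, as it is on most implementations.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: stores value in *out and returns true if it fits in
// unsigned short, or returns false without modifying *out.
bool narrow_to_ushort(std::size_t value, unsigned short* out)
{
    if (value > static_cast<std::size_t>(USHRT_MAX)) {
        return false;  // storing the value would truncate it
    }
    *out = static_cast<unsigned short>(value);
    return true;
}
```

A length computed with strlen, for instance, could be passed through such a check before being assigned to an unsigned short total.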

Selection of integer types: unsigned integer types should be used to represent values that can never be negative, and signed integer types to represent values that can be negative. In general, to save memory, use the smallest signed or unsigned type that can fully represent the range of possible values of a given variable. When memory consumption is not an issue, you may decide to declare variables as signed int or unsigned int to minimize potential conversion errors.

```cpp
#include <cstring>

void test_integer_security_type_selection()
{
    const char* argv = "";
    // Suboptimal: first, a size is never negative, so a signed integer type is
    // unnecessary; second, short may not have enough range for every possible object size
    short total1 = strlen(argv) + 1;

    // The unsigned type size_t was introduced by the C standards committee to represent
    // the size of an object. Variables of this type are guaranteed to have enough
    // precision to represent the size of an object.
    size_t total2 = strlen(argv) + 1;

    // C11 Annex K introduces a new type, rsize_t. It is defined to be size_t but is
    // used explicitly to hold the size of a single object.
#ifdef _MSC_VER
    rsize_t total3 = strlen(argv) + 1;
#endif
}
```

Any variable used to represent the size of an object, including integer values used as sizes, indices, loop counters, and lengths, should be declared as rsize_t if available, and otherwise as size_t.

Abstract data types: data abstraction can support data ranges in ways that the standard and extended integer types cannot.

Arbitrary precision arithmetic: effectively provides a new integer type whose width is limited only by the available memory of the host system. There are many arbitrary precision arithmetic packages available. Although they are mainly used for scientific calculation, they can also be used to solve the problem of integer type range error caused by the lack of precision of representation.

GNU multiple precision arithmetic library (GMP): GNU multiple precision arithmetic library is a portable library written in C, which is used to perform arbitrary precision arithmetic operations on integers, rational numbers and floating-point numbers.

C language solution: a language solution to prevent integer arithmetic overflow can be implemented by adding arbitrary precision integers to the compiler's type system.

Range checking: the CERT C Secure Coding Standard contains several rules for preventing range errors:

(1). Ensure that unsigned integer operations do not wrap around;

(2). Ensure that integer conversions do not result in lost or misinterpreted data;

(3). Ensure that operations on signed integers do not result in overflow.

Range checking is less important where a range error cannot occur.

Precondition and postcondition testing:

```cpp
#include <climits>
#include <cstdio>

void test_integer_security_conditions_test()
{
    {
        // Precondition test for wraparound when adding two unsigned integers
        unsigned int ui1, ui2, usum;
        ui1 = 10; ui2 = 20;
        if (UINT_MAX - ui1 < ui2) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        } else {
            usum = ui1 + ui2;
        }
    }

    {
        // Strictly conforming test that guarantees signed multiplication does not overflow
        signed int si1, si2, result;
        si1 = 10; si2 = -20;
        bool overflow = false;
        if (si1 > 0) {
            if (si2 > 0) {
                overflow = si1 > (INT_MAX / si2);
            } else {
                overflow = si2 < (INT_MIN / si1);
            }
        } else {
            if (si2 > 0) {
                // The bound here is INT_MIN: a non-positive si1 times a positive si2
                overflow = si1 < (INT_MIN / si2);
            } else {
                overflow = (si1 != 0) && (si2 < (INT_MAX / si1));
            }
        }

        if (overflow) {
            // Handle the error condition; the multiplication is not performed
            fprintf(stderr, "fail\n");
        } else {
            result = si1 * si2;
        }
    }

    {
        // A postcondition test can be used to detect unsigned integer wraparound,
        // because these operations are well defined as modulo operations
        unsigned int ui1, ui2, usum;
        ui1 = 10; ui2 = 20;
        usum = ui1 + ui2;
        // Detecting range errors this way can be relatively expensive
        if (usum < ui1) {
            // Handle the error condition
            fprintf(stderr, "fail\n");
        }
    }
}
```

Safe integer library: a safe integer library provides integer operations that either succeed or report an error.
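The shape of such an interface can be sketched as follows. This is only an illustration, not the API of any particular library: the names safe_add, safe_sub, and safe_mul are hypothetical, and the sketch assumes GCC or Clang, which provide the overflow-checking builtins used here as compiler extensions.

```cpp
#include <cassert>
#include <climits>

// Assumption: compiled with GCC or Clang. The __builtin_*_overflow functions
// return true when the mathematically exact result does not fit in *out.
// Each wrapper returns true on success and false on overflow.
// Note: on overflow the builtins still store the wrapped result in *out,
// so callers must not use *out when the wrapper returns false.

bool safe_add(int a, int b, int* out)
{
    return !__builtin_add_overflow(a, b, out);
}

bool safe_sub(int a, int b, int* out)
{
    return !__builtin_sub_overflow(a, b, out);
}

bool safe_mul(int a, int b, int* out)
{
    return !__builtin_mul_overflow(a, b, out);
}
```

Every operation has the same calling convention, so error handling can be written uniformly: `if (!safe_mul(count, size, &total)) { /* report the error */ }`.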

Overflow detection: the C standard defines the <fenv.h> header to support the floating-point exception status flags and directed-rounding control modes required by IEC 60559, along with similar floating-point state information.

Compiler generated runtime checks:

(1). Microsoft Visual Studio runtime error checks: the /RTCc compiler flag enables native runtime checks that detect assignments resulting in data loss. Runtime error checks do not work in release (optimized) builds of a program.

(2). GCC -ftrapv flag: GCC provides the -ftrapv compiler option, which offers limited support for detecting integer overflows at run time.

Verifiably in-range operations: saturation and modulo wrap are algorithms and techniques that always produce integer results within a defined range. This range lies between the integer values MIN and MAX (inclusive), where MIN and MAX are two representable integers and MIN is smaller than MAX.
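Saturation can be sketched for signed addition as follows, taking MIN and MAX to be INT_MIN and INT_MAX. The helper name saturating_add is hypothetical; a narrower range could be supported by substituting other bounds, provided the operands themselves already lie within that range.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: returns a + b, clamped to the range [INT_MIN, INT_MAX].
// The checks run before the addition, so no signed overflow ever occurs.
int saturating_add(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) {
        return INT_MAX;  // the true sum is above the range: clamp to MAX
    }
    if (b < 0 && a < INT_MIN - b) {
        return INT_MIN;  // the true sum is below the range: clamp to MIN
    }
    return a + b;
}
```

Unlike a checked operation, a saturating operation never reports an error; the result is always in range, but it is only an approximation once the limit has been reached.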

As-if infinitely ranged integer model: to bring program behavior closer to the mathematical reasoning commonly used by programmers, the as-if infinitely ranged (AIR) integer model guarantees that either integer values are equivalent to those obtained using infinitely ranged integers, or a runtime exception occurs.

Testing and analysis: static analysis, whether performed by a compiler or by a static analyzer, can be used to detect potential integer range errors in source code. Once these problems are identified, you can fix them by changing the program to use appropriate integer types, or by adding logic to ensure that the range of possible values stays within the range of the types in use. Static analysis is prone to false positives. False positives are programming constructs that the compiler or analyzer incorrectly diagnoses as errors. It is difficult (or impossible) to provide analysis that is both sound (no false negatives) and complete (no false positives). Two examples of freely available open source static analysis tools are ROSE and Splint.

For the complete code of the above code segment, see: GitHub/Messy_Test