Unexpected value when dereferencing pointer in C

ghz 8 months ago ⋅ 124 views

The following code

#include <stdio.h>

int main()
{
    long long data = 0xFFFEABCD11112345;

    char *pData = (char *)&data;
    printf("Value at address %p is %x\n", pData, *pData);

    pData = pData + 5;
    printf("Value at address %p is %x\n", pData, *pData);

    return 0;
}

produces an output similar to

Value at address 00000023515FFC00 is 45
Value at address 00000023515FFC05 is ffffffab

Given that pData is a char *, I was expecting the second value to be ab instead of ffffffab. I believe that the %x format specifier might be the culprit but I do not fully understand it. Where do the leading f's come from?

Answers

The leading f's in the second output come from sign extension, but %x itself does not perform it. printf is a variadic function, so a char argument undergoes the default argument promotions and is converted to int before printf ever sees it. %x then prints the bits of that int as an unsigned int in hexadecimal.

In your case, *pData holds the byte 0xab (your machine is little-endian, so offset 5 lands on that byte). Whether plain char is signed is implementation-defined; on your platform it is signed, so 0xab is read as -85 in two's complement. Promoting -85 to int preserves the value, which fills the upper bits with ones: with a 32-bit int the result is 0xffffffab. The first byte, 0x45, has its top bit clear, so it is positive and prints as 45 either way.

To print the byte as an unsigned integer without sign extension, you can explicitly cast it to an unsigned char before printing it. Here's the modified code:

#include <stdio.h>

int main()
{
    long long data = 0xFFFEABCD11112345;

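    char *pData = (char *)&data;
    /* %p expects a void *; the unsigned char cast keeps the promoted value in 0..255 */
    printf("Value at address %p is %x\n", (void *)pData, (unsigned char)*pData);

    pData = pData + 5;
    printf("Value at address %p is %x\n", (void *)pData, (unsigned char)*pData);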

    return 0;
}

This will print:

Value at address 00000023515FFC00 is 45
Value at address 00000023515FFC05 is ab

Now the byte is read as an unsigned char with the value 171. Promotion to int preserves that value, so the upper bits are zero and %x prints ab as expected.