printf anomaly after "fork()"

ghz 1years ago ⋅ 7882 views

Question

OS: Linux, Language: pure C

I'm moving forward in learning C programming in general, and C programming under UNIX in a special case.

I detected a strange (for me) behaviour of the printf() function after using a fork() call.

Code

#include <stdio.h>
#include <system.h>

int main()
{
    int pid;
    printf( "Hello, my pid is %d", getpid() );

    pid = fork();
    if( pid == 0 )
    {
            printf( "\nI was forked! :D" );
            sleep( 3 );
    }
    else
    {
            waitpid( pid, NULL, 0 );
            printf( "\n%d was forked!", pid );
    }
    return 0;
}

Output

Hello, my pid is 1111
I was forked! :DHello, my pid is 1111
2222 was forked!

Why did the second "Hello" string occur in the child's output?

Yes, it is exactly what the parent printed when it started, with the parent's pid.

But! If we place a \n character at the end of each string we get the expected output:

#include <stdio.h>
#include <system.h>

int main()
{
    int pid;
    printf( "Hello, my pid is %d\n", getpid() ); // SIC!!

    pid = fork();
    if( pid == 0 )
    {
            printf( "I was forked! :D" ); // removed the '\n', no matter
            sleep( 3 );
    }
    else
    {
            waitpid( pid, NULL, 0 );
            printf( "\n%d was forked!", pid );
    }
    return 0;
}

Output :

Hello, my pid is 1111
I was forked! :D
2222 was forked!

Why does it happen? Is it correct behaviour, or is it a bug?


Answer

I note that <system.h> is a non-standard header; I replaced it with <unistd.h> and the code compiled cleanly.

When the output of your program is going to a terminal (screen), it is line buffered. When the output of your program goes to a pipe, it is fully buffered. You can control the buffering mode by the Standard C function setvbuf() and the _IOFBF (full buffering), _IOLBF (line buffering) and _IONBF (no buffering) modes.

You could demonstrate this in your revised program by piping the output of your program to, say, cat. Even with the newlines at the end of the printf() strings, you would see the double information. If you send it direct to the terminal, then you will see just the one lot of information.

The moral of the story is to be careful to call fflush(0); to empty all I/O buffers before forking.


Line-by-line analysis, as requested (braces etc removed - and leading spaces removed by markup editor):

  1. printf( "Hello, my pid is %d", getpid() );
  2. pid = fork();
  3. if( pid == 0 )
  4. printf( "\nI was forked! :D" );
  5. sleep( 3 );
  6. else
  7. waitpid( pid, NULL, 0 );
  8. printf( "\n%d was forked!", pid );

The analysis:

  1. Copies "Hello, my pid is 1234" into the buffer for standard output. Because there is no newline at the end and the output is running in line-buffered mode (or full-buffered mode), nothing appears on the terminal.
  2. Gives us two separate processes, with exactly the same material in the stdout buffer.
  3. The child has pid == 0 and executes lines 4 and 5; the parent has a non-zero value for pid (one of the few differences between the two processes - return values from getpid() and getppid() are two more).
  4. Adds a newline and "I was forked! :D" to the output buffer of the child. The first line of output appears on the terminal; the rest is held in the buffer since the output is line buffered.
  5. Everything halts for 3 seconds. After this, the child exits normally through the return at the end of main. At that point, the residual data in the stdout buffer is flushed. This leaves the output position at the end of a line since there is no newline.
  6. The parent comes here.
  7. The parent waits for the child to finish dying.
  8. The parent adds a newline and "1345 was forked!" to the output buffer. The newline flushes the 'Hello' message to the output, after the incomplete line generated by the child.

The parent now exits normally through the return at the end of main, and the residual data is flushed; since there still isn't a newline at the end, the cursor position is after the exclamation mark, and the shell prompt appears on the same line.

What I see is:

Osiris-2 JL: ./xx
Hello, my pid is 37290
I was forked! :DHello, my pid is 37290
37291 was forked!Osiris-2 JL: 
Osiris-2 JL: 

The PID numbers are different - but the overall appearance is clear. Adding newlines to the end of the printf() statements (which becomes standard practice very quickly) alters the output a lot:

#include <stdio.h>
#include <unistd.h>

int main()
{
    int pid;
    printf( "Hello, my pid is %d\n", getpid() );

    pid = fork();
    if( pid == 0 )
        printf( "I was forked! :D %d\n", getpid() );
    else
    {
        waitpid( pid, NULL, 0 );
        printf( "%d was forked!\n", pid );
    }
    return 0;
}

I now get:

Osiris-2 JL: ./xx
Hello, my pid is 37589
I was forked! :D 37590
37590 was forked!
Osiris-2 JL: ./xx | cat
Hello, my pid is 37594
I was forked! :D 37596
Hello, my pid is 37594
37596 was forked!
Osiris-2 JL:

Notice that when the output goes to the terminal, it is line-buffered, so the 'Hello' line appears before the fork() and there was just the one copy. When the output is piped to cat, it is fully-buffered, so nothing appears before the fork() and both processes have the 'Hello' line in the buffer to be flushed.