read subprocess stdout line by line

ghz 6 months ago ⋅ 131 views

My python script uses subprocess to call a linux utility that is very noisy. I want to store all of the output to a log file and show some of it to the user. I thought the following would work, but the output doesn't show up in my application until the utility has produced a significant amount of output.

# fake_utility.py, just generates lots of output over time
import time
i = 0
while True:
    print(hex(i)*512)
    i += 1
    time.sleep(0.5)

In the parent process:

import subprocess

proc = subprocess.Popen(['python', 'fake_utility.py'], stdout=subprocess.PIPE)
for line in proc.stdout:
    # the real code does filtering here
    print("test:", line.rstrip())

The behavior I really want is for the filter script to print each line as it is received from the subprocess, like tee does but within Python code.

What am I missing? Is this even possible?

Answers

The delay comes from output buffering in the child process, not from your read loop. When stdout is a pipe rather than a terminal, Python block-buffers it, so the output of fake_utility.py reaches the parent only when the buffer fills up or the process exits.
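
One quick way to check this, as a minimal sketch: the child below only reports whether its own stdout is a terminal. The probe string is made up for illustration, and sys.executable stands in for 'python' so the same interpreter is used.

```python
import subprocess
import sys

# Hypothetical one-line child: prints whether its stdout is attached to a terminal.
probe = "import sys; print(sys.stdout.isatty())"

# stdout=PIPE connects the child's stdout to a pipe, just as in the question.
result = subprocess.run(
    [sys.executable, "-c", probe], stdout=subprocess.PIPE, text=True
)
child_sees_tty = result.stdout.strip()
print("child stdout is a terminal:", child_sees_tty)
```

Run this way, the child should report False, which is the situation in which Python switches from line buffering to block buffering.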

To achieve the desired behavior, you need to disable the buffering in the subprocess. You can do this by using the -u option with the Python interpreter to force the subprocess to run in unbuffered mode.

Here’s how you can modify your parent process code:

import subprocess

# Start the subprocess with unbuffered output
proc = subprocess.Popen(['python', '-u', 'fake_utility.py'], stdout=subprocess.PIPE, text=True)

# Read and process each line as it is produced by the subprocess
for line in proc.stdout:
    # The real code does filtering here
    print("test:", line.rstrip())

Alternatively, if you can edit fake_utility.py, flush stdout explicitly by calling sys.stdout.flush() after each print call. Here's the updated fake_utility.py:

import time
import sys

i = 0
while True:
    print(hex(i) * 512)
    sys.stdout.flush()
    i += 1
    time.sleep(0.5)
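
On Python 3, print also accepts flush=True, which has the same effect as calling sys.stdout.flush() right after it. The sketch below is one way to confirm that a flushed line reaches the parent while the child is still running. The child code and the "ready" message are made up for illustration, and no -u flag is passed.

```python
import subprocess
import sys

# Hypothetical child: print one flushed line, then block until stdin is closed.
child_code = 'import sys; print("ready", flush=True); sys.stdin.read()'

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

first_line = proc.stdout.readline().rstrip()  # returns once the child has flushed
still_running = proc.poll() is None           # child is still waiting on stdin

proc.stdin.close()   # end of input lets the child exit
proc.wait()
proc.stdout.close()
print(first_line, still_running)
```

Without flush=True, the readline() call here would be expected to hang, because the line would sit in the child's buffer while the child waits for input.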

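Either change is enough on its own: running the child with the -u flag, or flushing stdout in fake_utility.py, makes each line available to the parent as soon as it is written, so the parent can print it immediately. Note that -u is a flag of the Python interpreter, so it only helps when the child is itself a Python script. For a non-Python utility that uses default C stdio buffering, a tool such as GNU stdbuf (for example, stdbuf -oL) may achieve the same thing.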
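
To connect this back to the original goal (log everything, show only some of it), a tee-like loop could look like the sketch below. The names tee_lines, keep, and log_path are hypothetical, and the short-lived demo child stands in for fake_utility.py so the example terminates. This is one possible arrangement, not the only one.

```python
import os
import subprocess
import sys
import tempfile


def tee_lines(cmd, log_path, keep):
    """Run cmd, write every stdout line to log_path, print lines where keep(line) is true."""
    shown = []
    with open(log_path, "w") as log, subprocess.Popen(
        cmd, stdout=subprocess.PIPE, text=True
    ) as proc:
        for line in proc.stdout:
            log.write(line)   # everything goes to the log file
            log.flush()       # keep the log current while the child runs
            if keep(line):    # only selected lines are shown to the user
                print("test:", line.rstrip())
                shown.append(line.rstrip())
    # Leaving the with-block waits for the child, so returncode is set here.
    return proc.returncode, shown


# Demo child (an assumption for illustration): prints four short lines, unbuffered.
child = [sys.executable, "-u", "-c", "for i in range(4): print(hex(i))"]
log_path = os.path.join(tempfile.mkdtemp(), "utility.log")
returncode, shown = tee_lines(child, log_path, keep=lambda line: "0x2" in line)
```

In real use, cmd would be something like ['python', '-u', 'fake_utility.py'] and keep would hold the filtering logic from the question.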