ghz 7 months ago ⋅ 44 views

How to copy text from a cell in an Excel file to a PDF form with Python?

I'm developing a Python script that reads a specific cell from an Excel file (say B6) and writes its contents into a text box on a PDF form (say the text box is named Formulary unite_3).

Python reads the text in the Excel cell correctly, even prints it, and transfers it to the form, but when I open the PDF file the text is not visible until I click on the field.

I have already tried to fix this. I even wrote a mini script that dumps the properties of the text box so I could inspect its behavior. My guess is that the value I write into the form ends up on a layer behind what is displayed. My analysis is below:

With this, the text doesn't appear (done indirectly with Python):

{'/AP': {'/N': IndirectObject(113, 0)}, '/DA': '/Helv  0 Tf 0 g', '/DR': {'/Encoding': {'/PDFDocEncoding': IndirectObject(110, 0)}, '/Font': {'/Helv': IndirectObject(111, 0)}}, '/F': 4, '/FT': '/Tx', '/P': IndirectObject(106, 0), '/Rect': [120.84, 600.24, 293.28, 617.16], '/Subtype': '/Widget', '/T': 'Formulary unite_3', '/TU': 'First Name', '/Type': '/Annot', '/V': 'Hola Mundo'}

With this, the text appears (done directly with Adobe Acrobat Reader):

{'/AP': {'/N': IndirectObject(48, 0)}, '/DA': '/Helv  0 Tf 0 g', '/DR': {'/Encoding': {'/PDFDocEncoding': IndirectObject(772, 0)}, '/Font': {'/Helv': IndirectObject(773, 0)}}, '/F': 4, '/FT': '/Tx', '/P': IndirectObject(1, 0), '/Rect': [120.84, 600.24, 293.28, 617.16], '/Subtype': '/Widget', '/T': 'Formulary unite_3', '/TU': 'First Name', '/Type': '/Annot', '/V': 'Hola Mundo'}
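
The two dumps are long, so a key-by-key comparison makes the differences easier to spot. The sketch below is only an illustration: flatten and diff_fields are hypothetical helper names, and each IndirectObject(n, 0) from the dumps is written as a plain string so the snippet runs without any PDF library.

```python
def ref(n):
    """Stand-in for the IndirectObject(n, 0) entries seen in the dumps."""
    return f"IndirectObject({n}, 0)"


def flatten(d, prefix=""):
    """Flatten nested dicts into path keys such as '/AP/N'."""
    flat = {}
    for key, value in d.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_fields(a, b):
    """Return {path: (value_in_a, value_in_b)} for every path that differs."""
    fa, fb = flatten(a), flatten(b)
    return {
        path: (fa.get(path), fb.get(path))
        for path in sorted(fa.keys() | fb.keys())
        if fa.get(path) != fb.get(path)
    }


def field(ap, enc, font, page):
    """Build a field dict like the dumps; only the references vary."""
    return {
        "/AP": {"/N": ref(ap)},
        "/DA": "/Helv  0 Tf 0 g",
        "/DR": {"/Encoding": {"/PDFDocEncoding": ref(enc)},
                "/Font": {"/Helv": ref(font)}},
        "/F": 4, "/FT": "/Tx", "/P": ref(page),
        "/Rect": [120.84, 600.24, 293.28, 617.16],
        "/Subtype": "/Widget", "/T": "Formulary unite_3",
        "/TU": "First Name", "/Type": "/Annot", "/V": "Hola Mundo",
    }


python_filled = field(113, 110, 111, 106)   # text not visible
acrobat_filled = field(48, 772, 773, 1)     # text visible

changed = diff_fields(python_filled, acrobat_filled)
for path, (py_val, acro_val) in changed.items():
    print(f"{path}: {py_val} vs {acro_val}")
```

Only the indirect references differ (/AP/N, the /DR entries, and /P); the value /V is the same in both. That suggests the stored value is fine and that the difference lies in the objects those references point to, such as the appearance stream under /AP/N.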

I have tried several libraries without success. I am currently using PyPDF2, which has given the best results so far. My script is below; any help would be much appreciated.

I purposely left out the Excel part because I already have that working, and replaced it with Hola Mundo for simplicity.

Python Script

Part of form where it should appear at a glance

Part of the form where it can only be viewed by clicking

Answers

The issue is probably related to how the text is added to the PDF form. The value is written into a form field (a widget annotation) rather than drawn as actual content on the PDF page, and a viewer may not render a field's value until the field is activated.
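
If the field should stay a fillable form field, one thing that may be worth trying first is asking the viewer to rebuild the field's appearance when the file is opened. The sketch below is an assumption, not the asker's script: it uses pypdf (the successor of PyPDF2) in a recent version, fill_text_field is a hypothetical helper name, and the file names in the usage note are placeholders.

```python
def fill_text_field(src_path, dst_path, field_name, value):
    """Fill one text field and flag the form so viewers redraw it."""
    # Imported lazily; assumes the third-party pypdf package is installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    writer.append(reader)  # copies the pages together with the form definition

    # Sets /NeedAppearances on the form, asking the viewer to regenerate
    # each field's appearance from its stored value.
    writer.set_need_appearances_writer()

    for page in writer.pages:
        # Only pages that carry annotations can contain the widget.
        if "/Annots" in page:
            writer.update_page_form_field_values(page, {field_name: value})

    with open(dst_path, "wb") as fh:
        writer.write(fh)
```

A call might look like fill_text_field("form.pdf", "filled.pdf", "Formulary unite_3", "Hola Mundo"). Whether the text then shows up without clicking depends on the viewer honoring the flag, so this is something to test rather than a guaranteed fix.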

To ensure that the text appears directly on the PDF page and is visible without requiring user interaction, you can try using a library like reportlab to generate a new PDF with the text added as content.

Here's a basic example of how you can use reportlab to create a new PDF with the text:

from reportlab.pdfgen import canvas

# Create a new PDF
pdf_path = 'output.pdf'
c = canvas.Canvas(pdf_path)

# Set font and size
c.setFont("Helvetica", 12)

# Set position to place the text
x, y = 100, 600

# Add the text to the PDF
text = "Hola Mundo"
c.drawString(x, y, text)

# Save the PDF
c.save()

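This script creates a new PDF file (output.pdf) with the text "Hola Mundo" drawn at position (100, 600). reportlab measures coordinates in points from the bottom-left corner of the page.

Because the text is drawn as page content, it should be visible without requiring user interaction. Note that this produces a new, separate PDF; it does not fill in the existing form.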

You can then combine this with your existing code that reads from the Excel file to dynamically populate the text content before generating the PDF.
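
As a rough sketch of that combination, the function below reads one cell and draws its text with reportlab. It assumes openpyxl for the Excel side (the question does not say which library is used there), and cell_text and excel_cell_to_pdf are hypothetical names chosen for this example.

```python
def cell_text(value):
    """Turn a raw cell value into printable text (empty cells become '')."""
    return "" if value is None else str(value)


def excel_cell_to_pdf(xlsx_path, cell_ref, pdf_path, x=100, y=600):
    """Read one cell from the active sheet and draw it on a new PDF page."""
    # Imported lazily; assumes openpyxl and reportlab are installed.
    from openpyxl import load_workbook
    from reportlab.pdfgen import canvas

    # data_only=True returns cached results for formula cells.
    sheet = load_workbook(xlsx_path, data_only=True).active
    text = cell_text(sheet[cell_ref].value)

    c = canvas.Canvas(pdf_path)
    c.setFont("Helvetica", 12)
    c.drawString(x, y, text)  # coordinates in points from the bottom-left
    c.save()
    return text
```

For the case in the question, the call would be along the lines of excel_cell_to_pdf("data.xlsx", "B6", "output.pdf"), where data.xlsx is a placeholder file name.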