在python webapp2中读取excel文件的内容

ghz 11hours ago ⋅ 2 views

I have two files namely sample.csv and sample.xlsx, all those files are stored in blobstore.I am able to read the records of csv file(which is in the blobstore) using the following code

   blobReader = blobstore.BlobReader(blob_key)
   inputFile = BlobIterator(blobReader)
   if inputFile is None:
      values = None
   else:
      try:
         stringReader = csv.reader(inputFile)
         data = []
         columnHeaders = []
         for rowIndex, row in enumerate(stringReader):
            if(rowIndex == 0):
               columnHeaders = row
            else:
               data.append(row)
         values = {'columnHeaders' : columnHeaders, 'data' : data}
      except:
         values = None

      self.response.write(values) 

The output of the above code of a sample.csv file is

{'columnHeaders': ['First Name', 'Last Name', 'Email', 'Mobile'], 'data': [['fx1', 'lx2', 'flx1x2@xxx.com', 'xxx-xxx-xxxx'], ['fy1', 'ly2', 'fly1y2@yyy.com', 'yyy-yyy-yyyy'], ['fz1', 'lz2', 'flz1z2@zzz.com', 'zzz-zzz-zzzz']]}   

Using the xlrd package, i am able to read the excel file contents, but in this i have to specify the exact file location

   book = xlrd.open_workbook('D:/sample.xlsx') 
   first_sheet = book.sheet_by_index(0)
   self.response.write(first_sheet.row_values(0)) 
   cell = first_sheet.cell(0,0) 
   self.response.write(cell.value) 

Is there any way to read the excel file contents from the blobstore, i have tried it with the following code

   blobReader = blobstore.BlobReader(blobKey)
   uploadedFile = BlobIterator(blobReader) 
   book = xlrd.open_workbook(file_contents=uploadedFile)
                    (or)
   book = xlrd.open_workbook(file_contents=blobReader)

But it throws some error TypeError: 'BlobReader' object has no attribute 'getitem'.

Answer

To read an Excel file stored in the blobstore, the issue arises because xlrd.open_workbook() expects a byte-like object or a file path, but you are passing it a BlobReader or an iterator, which is causing the error.

You need to make sure that the content of the Excel file is read as a byte stream, not as an iterator, and then pass it to xlrd.open_workbook().

Here's how you can modify your code to read the Excel file directly from the blobstore:

Solution:

You can use the BlobReader to read the file content into a byte object and then pass that byte object to xlrd.open_workbook().

Here’s an updated version of your code:

import xlrd
from google.appengine.api import blobstore

def read_excel_from_blob(blob_key):
    try:
        # Open the blob from the blobstore
        blob_reader = blobstore.BlobReader(blob_key)
        
        # Read the file content into a byte string
        file_contents = blob_reader.read()
        
        # Now pass the byte string to xlrd to open the Excel file
        book = xlrd.open_workbook(file_contents=file_contents)
        
        # Read the first sheet
        first_sheet = book.sheet_by_index(0)
        
        # Example: Get the first row values
        row_values = first_sheet.row_values(0)
        
        # Example: Get the value of a specific cell
        cell_value = first_sheet.cell(0, 0).value
        
        # Return or process the data
        return {
            'row_values': row_values,
            'cell_value': cell_value
        }
    except Exception as e:
        # Handle any errors
        return {'error': str(e)}

# Example usage
blob_key = "your_blob_key_here"
result = read_excel_from_blob(blob_key)
print(result)

Key Changes:

  1. Reading the Blob Content: blob_reader.read() reads the entire content of the blob and returns it as a byte string. This is what xlrd.open_workbook() expects.

  2. Passing the Byte String to xlrd.open_workbook(): Instead of passing a BlobReader or BlobIterator, you pass the byte string obtained from blob_reader.read() to xlrd.open_workbook().

Explanation:

  • BlobReader(blob_key) opens the blob with the given key.
  • blob_reader.read() reads the entire file content as bytes, which is the format that xlrd expects when using file_contents.
  • xlrd.open_workbook(file_contents=file_contents) opens the Excel file directly from the byte content.
  • You can then interact with the Excel file, just as you would with a local file.

Potential Issues to Keep in Mind:

  1. File Size: Ensure that the file size is manageable, as read() will load the entire content into memory. For very large files, you may need a more memory-efficient approach.

  2. Error Handling: Ensure proper error handling if the file format is incorrect or the file cannot be read, as Excel files can sometimes be corrupted or improperly formatted.

With this approach, you should be able to read Excel files from the Google App Engine Blobstore without any issues.