Writing and reading NumPy objects in plain text

ghz ⋅ yesterday ⋅ 1 view

In Python, I would like to store numpy arrays, matrices and possibly later other objects in plain text format.

My idea was to use ConfigParser and define the conversion functions array2string, matrix2string, string2array and string2matrix (numpy.array2string exists and matrix2string could be built on it, but I couldn't find functions for the reverse direction). Writing would then look like:

config.set('calibration', 'center', array2string(center))
config.set('calibration', 'trans_matrix', matrix2string(trans_matrix))

and reading like:

center = string2array(config.get('calibration', 'center'))
trans_matrix = string2matrix(config.get('calibration', 'trans_matrix'))

What is the best way to write and read the numpy objects?

Answer

To store and read NumPy arrays and matrices in plain text, you're on the right track with ConfigParser holding a string representation of each object. numpy.array2string() covers serialization for both arrays and matrices (NumPy has no matrix2string, so that is a thin wrapper you write yourself). As you noticed, NumPy offers no direct inverse, so string2array and string2matrix have to be implemented by hand.

Here’s a detailed approach to implement this:

1. Storing NumPy Arrays and Matrices

NumPy has built-in functions like numpy.array2string() to convert an array to a string representation. You can define a helper function for arrays and matrices.

For matrices, numpy.array2string() will work since numpy.matrix is just a subclass of numpy.ndarray, but there may be edge cases to handle depending on your matrix dimensions.
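As a quick illustration of what actually gets written, the sketch below builds the strings that numpy.array2string() produces for a 1D array and a 2x2 matrix. The sample values are arbitrary, and passing threshold=sys.maxsize is an assumption added here so that NumPy does not abbreviate large arrays with "...".

```python
import sys
import numpy as np

center = np.array([1.1, 2.2, 3.3])
trans_matrix = np.matrix([[1, 2], [3, 4]])

# separator=',' puts a comma (and no space) between elements
center_text = np.array2string(center, separator=',', threshold=sys.maxsize)
# np.asarray() views the matrix as a plain 2D ndarray before formatting
matrix_text = np.array2string(np.asarray(trans_matrix), separator=',',
                              threshold=sys.maxsize)

# By default NumPy summarizes arrays with more than 1000 elements using "..."
big = np.arange(2000)
abbreviated = np.array2string(big, separator=',')
complete = np.array2string(big, separator=',', threshold=sys.maxsize)

print(center_text)        # [1.1,2.2,3.3]
print(repr(matrix_text))  # '[[1,2],\n [3,4]]'
print('...' in abbreviated, '...' in complete)
```

Note the newline between matrix rows: it has to be replaced (or the value indented) before the string goes into an INI file as a single-line value.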

2. Reading NumPy Arrays and Matrices

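The reverse process uses numpy.fromstring() in text mode (sep=',') to turn the string back into a flat array of floats. It only understands bare, separator-delimited numbers, so the square brackets and padding whitespace written by numpy.array2string() must be stripped first, and the result reshaped afterwards. (numpy.genfromtxt() is aimed at delimited text files rather than single strings.) For matrices, you can first convert the string to a NumPy array and then convert that array to a matrix if needed.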
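Because the text produced by numpy.array2string(..., separator=',') is also a valid Python list literal, one alternative is to parse it with ast.literal_eval(), which keeps the nesting and therefore the shape. This is a minimal sketch, not the only way to do it: the name parse_array_text is hypothetical, and it assumes the string was written with a comma separator, was not abbreviated with "...", and contains no nan or inf values (those are not Python literals).

```python
import ast
import numpy as np

def parse_array_text(text):
    """Parse a bracketed, comma-separated string into an ndarray."""
    # literal_eval only evaluates literals (lists, numbers), never arbitrary code
    nested = ast.literal_eval(text)
    return np.array(nested)

center = parse_array_text('[1.1,2.2,3.3]')
trans = parse_array_text('[[1,2],  [3,4]]')

# Convert to np.matrix only if the rest of the code really needs one
trans_matrix = np.asmatrix(trans)

print(center.shape, trans.shape)
```

Unlike the flat result of numpy.fromstring(), the parsed array already has the right shape, so no assumption about square matrices is needed.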

3. Full Implementation

Let’s start with the helper functions for conversion:

Helper functions

import re
import sys
import numpy as np

# Converts an array to a single-line string such as "[1.1,2.2,3.3]"
def array2string(array):
    # threshold=sys.maxsize stops NumPy from abbreviating large arrays with "...";
    # floatmode='unique' keeps enough digits to round-trip each float exactly
    text = np.array2string(np.asarray(array), separator=',',
                           threshold=sys.maxsize, floatmode='unique')
    return text.replace('\n', ' ')

# Converts a matrix to a string (np.matrix subclasses np.ndarray, so the same logic applies)
def matrix2string(matrix):
    return array2string(matrix)

# Removes the brackets and whitespace that np.fromstring(sep=',') cannot parse
def _numbers_only(string):
    return re.sub(r'[\[\]\s]', '', string)

# Converts a string back to a 1D array (values are parsed as floats)
def string2array(string):
    return np.fromstring(_numbers_only(string), sep=',')

# Converts a string back to a matrix
def string2matrix(string):
    array = np.fromstring(_numbers_only(string), sep=',')
    # The string does not record the shape, so a square matrix is assumed
    size = int(round(np.sqrt(array.size)))
    if size * size != array.size:
        raise ValueError('string2matrix expects a square matrix')
    return np.asmatrix(array.reshape(size, size))

4. Using ConfigParser

Now that we have the conversion functions, we can use ConfigParser to store and retrieve the string representations of arrays and matrices:

import configparser

# Initialize ConfigParser
config = configparser.ConfigParser()

# Some test data
center = np.array([1.1, 2.2, 3.3])
trans_matrix = np.matrix([[1, 2], [3, 4]])

# Writing to the config file
config['calibration'] = {}
config.set('calibration', 'center', array2string(center))
config.set('calibration', 'trans_matrix', matrix2string(trans_matrix))

# Save to a file (optional)
with open('config.ini', 'w') as configfile:
    config.write(configfile)

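# Reading back from the config file into a fresh parser,
# so the values really come from disk and not from memory
config = configparser.ConfigParser()
config.read('config.ini')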

# Retrieve and convert back to numpy objects
center_from_config = string2array(config.get('calibration', 'center'))
trans_matrix_from_config = string2matrix(config.get('calibration', 'trans_matrix'))

# Verifying
print(center_from_config)
print(trans_matrix_from_config)

5. Explanation of Code:

  • array2string and matrix2string: These functions convert NumPy arrays and matrices into strings using numpy.array2string(). The newlines are replaced with spaces to make the data easier to read and write as a single line.

  • string2array and string2matrix: These functions convert a string back into a NumPy array or matrix. np.fromstring() only parses bare comma-separated numbers, so the brackets and padding whitespace emitted by numpy.array2string() have to be removed before parsing, and every value comes back as a float. The flat result is then reshaped if necessary. For matrices, we assume that the matrix is square and compute the size from the total number of elements.

  • ConfigParser usage: We store the string representations of arrays and matrices in the config file using config.set(), and retrieve them with config.get(). Then, we use string2array and string2matrix to convert the strings back into NumPy objects.
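The round trip described above can be checked without touching the file system. The following is a minimal sketch under a few assumptions: to_text and from_text are hypothetical stand-ins for the helper functions, and io.StringIO plus ConfigParser.read_string() replace the real config.ini.

```python
import configparser
import io
import re
import numpy as np

def to_text(array):
    # One-line, bracketed, comma-separated text
    return np.array2string(np.asarray(array), separator=',').replace('\n', ' ')

def from_text(text):
    # Drop brackets and whitespace, then parse the bare numbers as floats
    return np.fromstring(re.sub(r'[\[\]\s]', '', text), sep=',')

center = np.array([1.1, 2.2, 3.3])

# Write the section into an in-memory buffer instead of a file
writer = configparser.ConfigParser()
writer['calibration'] = {'center': to_text(center)}
buffer = io.StringIO()
writer.write(buffer)

# Read it back with a separate parser, as a second program would
reader = configparser.ConfigParser()
reader.read_string(buffer.getvalue())
stored = reader.get('calibration', 'center')
restored = from_text(stored)

print(stored)
print(np.allclose(center, restored))
```

Using a separate reader object makes sure the comparison exercises the written text rather than values still held in memory.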

6. Further Enhancements:

  1. Non-square or higher-dimensional arrays: The reshaping logic in string2matrix only works for square matrices. To handle arbitrary dimensions, store the shape alongside the data in the config file (for example under a second key) and reshape with it when reading.

  2. Error Handling: You may want to add error handling, especially when converting from a string back to an array or matrix, to ensure the format is correct.

  3. Using JSON: If the data is more complex, consider JSON (json.dumps()/json.loads()) instead of a hand-rolled text format. JSON cannot encode ndarray objects directly, but array.tolist() yields nested lists that it can, and np.array() rebuilds the array from them with its shape intact.
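The three enhancements above can be combined in one pair of helpers. This is a sketch under stated assumptions rather than a definitive format: the function names and the record layout (dtype, shape, data) are hypothetical, and it targets real-valued numeric arrays only.

```python
import json
import numpy as np

def array_to_json(array):
    array = np.asarray(array)
    # tolist() turns NumPy scalars into plain Python numbers that json can encode
    return json.dumps({'dtype': str(array.dtype),
                       'shape': list(array.shape),
                       'data': array.ravel().tolist()})

def array_from_json(text):
    try:
        record = json.loads(text)
        array = np.array(record['data'], dtype=record['dtype'])
        return array.reshape(record['shape'])
    except (KeyError, TypeError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError
        raise ValueError(f'not a valid array record: {text!r}') from exc

# A non-square matrix survives the round trip with shape and dtype intact
trans = np.array([[1, 2, 3], [4, 5, 6]])
text = array_to_json(trans)
restored = array_from_json(text)

print(text)
print(restored.shape, restored.dtype)
```

The resulting string is a single line, so it can be stored as a ConfigParser value in the same way as the array2string output.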


Example Output:

Assuming you have the following data:

center = np.array([1.1, 2.2, 3.3])
trans_matrix = np.matrix([[1, 2], [3, 4]])

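Your config.ini would look like this after running the script (separator=',' leaves no space after the commas, and the double space in trans_matrix is where the newline between rows was replaced):

[calibration]
center = [1.1,2.2,3.3]
trans_matrix = [[1,2],  [3,4]]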

When you read it back, np.fromstring() returns floating-point values, so the integer matrix comes back as floats. Printing the two objects gives:

[1.1 2.2 3.3]
[[1. 2.]
 [3. 4.]]

Conclusion

This method allows you to easily store and retrieve NumPy arrays and matrices in plain text format using ConfigParser. You can adapt this approach to work with higher-dimensional arrays or more complex objects if needed.