Dynamic allocation of an unknown matrix in C


Question

I need to read a file whose name is entered by the user and multiply its contents by those of another file. That much I know how to do.

The problem is that one file contains an array and the other contains a matrix.

I need to read the first line of the matrix file to find the size of the matrix, and then dynamically allocate the matrix and the array from the files.

This is what I have so far:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
int main()
{       
    int row1, col1;
        //These values need to be pulled from the first file//
    char filename1[100];
        //Setting the file name for entry and setting the limit to 100//
    FILE* fp1;
        //FILE must be set as a pointer (FILE must also be capitalized)//

    printf("Enter file name including file extension: \n");
        //This will pull in the name entered by the user//
    scanf("%s", filename1);
        //Scans in the name of the first file//

    fp1 = fopen(filename1, "r");
        //This will open the file as entered by the user//
    if (fp1 == NULL)
    {
        printf("\nError, file not found\n");
        exit(0);
    }
        //This is for the first file//

    char filename2[100];
        //Setting the file name for entry and setting the limit to 100//
    FILE* fp2;
        //FILE must be set as a pointer (FILE must also be capitalized)//

    printf("Enter file name including file extension: \n");
        //This will pull in the name entered by the user//
    scanf("%s", filename2);
        //Scans in the name of the second file//

    fp2 = fopen(filename2, "r");
        //This will open the file as entered by the user//
    if (fp2 == NULL)
    {
        printf("\nError, file not found\n");
        exit(0);
    }
        //This is for the second file//

        //**I need to now dynamically allocate the input files**//

    return 0;
} 

Also sorry for seeming like I just left after posting my question as some members have shared in the comments saying that I was code fishing. I'm not; I just didn't realize how active this community is. Thanks for the input so far.

Here is the screenshot of all I have so far including the files that are going to be read in.

Thanks for the suggestions. I was able to figure out the "fgets" function and I used that to pull in the size of the matrix from the first file. After I had that, dynamically allocating that was easy.


Answer

My recommendation is to consider your matrix as having some abstract data type that you want to implement.

A common way might be to use an array of pointers (to arrays representing rows of your matrix). But I feel it is confusing and inefficient.

So what are the operations you want on your matrixes?

  • create a matrix of given dimensions

  • destroy a previously created matrix

  • access some element in a given matrix with given row and column indexes

  • change the value of an element in a given matrix with given row and column indexes

  • etc....

BTW, you might have several variants of them. For example, you could do error checking (e.g. reject a negative index) or you could have unsafe (but slightly faster) functions capable of undefined behavior (and this is very scary). Of course you could define more operations (using other ones), for example matrix multiplication etc.

You should list -on paper or board- all the operations you want on your matrixes and explain them in your documentation (or your comments). In practice you might have many dozens -or even hundreds- of operations on your abstract data type. Document also what happens in error cases.

I usually recommend keeping the dimensions with the matrix (unless you know that one of the dimensions is a constant). A common way of implementing abstract data types in C is to encapsulate them in some struct and use pointers to these.

So I suggest using a flexible array member (as the last element of your struct). Here is my matrix_st structure:

  struct matrix_st {
    unsigned m_h, m_w; // height and width of matrix
    double m_v[]; // values inside the matrixes, there are m_h*m_w of them
  };

so my abstract data type is just pointers to

  typedef struct matrix_st Matrix;

Here are the declarations of the functions implementing my abstract data type:

  Matrix* matrix_create(unsigned height, unsigned width);
  void matrix_destroy(Matrix*mat);
  double matrix_access(Matrix*mat, unsigned i, unsigned j);
  void matrix_change_element(Matrix*mat, unsigned i, unsigned j,double v);

Here are some implementations (since I don't want to deal with pathologically huge matrixes, I define some maximal dimension; computer resources are always finite!):

  #define MATRIX_MAXDIM 10000000 /* ten million */
  Matrix* matrix_create(unsigned height, unsigned width) {
     if (height>MATRIX_MAXDIM || width>MATRIX_MAXDIM) {
        fprintf(stderr, "too huge matrix height=%u width=%u\n",
                height, width);
        exit(EXIT_FAILURE);
     }
     // cast to size_t so the product is not computed in unsigned int
     Matrix* res = 
        calloc(1, sizeof(Matrix) + (size_t)height*width*sizeof(double));
     if (!res) {
         perror("matrix calloc");
         exit(EXIT_FAILURE);
     }
     res->m_h = height;
     res->m_w = width;
     return res; 
  } // end matrix_create

I am using calloc not malloc because I really want some zeroed memory, so the returned matrix contains all zeros. BTW, height*width has to be computed as a size_t (hence the cast), because an unsigned product can wrap around for large dimensions. On computers with a 32-bit size_t (not mine, a PC/Linux/Debian/x86-64 desktop) the whole size computation could still overflow.

Here is the function to access some element. It does some error checking.

  double matrix_access(Matrix*mat, unsigned i, unsigned j) 
  { 
     if (!mat) 
        { fprintf(stderr, "no matrix to access\n"); exit(EXIT_FAILURE); }
     unsigned h = mat->m_h;
     unsigned w = mat->m_w;
     if (i >= h || j >= w)
        { fprintf(stderr, "out-of-bound matrix access\n"); 
          exit(EXIT_FAILURE); }
     // row-major layout: each row holds w values, so row i starts at i*w
     return mat->m_v [(size_t)i*w + j];
  }
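The function changing an element can follow the same pattern. Here is a minimal sketch, assuming the values are stored row by row (so each row holds m_w values). The helper matrix_index is a hypothetical name of mine, not part of the declarations above, and the struct is repeated only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Repeated from the answer so that this sketch compiles on its own. */
struct matrix_st {
  unsigned m_h, m_w; /* height and width of matrix */
  double m_v[];      /* m_h*m_w values, assumed stored row by row */
};
typedef struct matrix_st Matrix;

/* Hypothetical helper: flat index of element (i,j) in row-major order. */
static size_t matrix_index(const Matrix *mat, unsigned i, unsigned j) {
  assert(mat != NULL);
  return (size_t)i * mat->m_w + j;
}

/* Checked write, mirroring the error handling of matrix_access. */
void matrix_change_element(Matrix *mat, unsigned i, unsigned j, double v) {
  if (!mat) {
    fprintf(stderr, "no matrix to change\n");
    exit(EXIT_FAILURE);
  }
  if (i >= mat->m_h || j >= mat->m_w) {
    fprintf(stderr, "out-of-bound matrix change\n");
    exit(EXIT_FAILURE);
  }
  mat->m_v[matrix_index(mat, i, j)] = v;
}
```

Sharing one index helper between the read and write functions keeps the layout decision in a single place.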

Since I made only one calloc the destruction is simple to code:

  void matrix_destroy(Matrix*mat) {   // assert needs #include <assert.h>
    if (!mat) { fprintf(stderr, "no matrix to destroy\n"); exit(EXIT_FAILURE); }
    assert (mat->m_h <= MATRIX_MAXDIM);  // matrix_create accepts MATRIX_MAXDIM itself
    assert (mat->m_w <= MATRIX_MAXDIM);
    free (mat);
  }

The assert statements are in principle useless (they check something which should always be true). But I love defensive programming (this would help me catch bugs in other places that misuse my Matrix). They can be disabled at compilation time by defining NDEBUG (read [assert(3)](http://man7.org/linux/man-pages/man3/assert.3.html)).

BTW, you could declare these functions as inline or static inline (and define them in some included header file). An optimizing compiler is likely to produce efficient code (e.g. compile with gcc -O2 -Wall -march=native when benchmarking).

Since you are reading a matrix from some file, you should define your file format (using, in your documentation, some EBNF notation to describe the syntax in that file is useful) and you could define and implement a function reading and creating a matrix from some opened file handle.
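As an illustration only, here is a sketch of such a reading function. It assumes a file format of my own choosing, not necessarily yours: two unsigned integers (height, then width) followed by height*width numbers in row order, all separated by whitespace. The name matrix_read and the choice of returning NULL on malformed input are assumptions too.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Repeated from the answer so that this sketch compiles on its own. */
struct matrix_st {
  unsigned m_h, m_w; /* height and width of matrix */
  double m_v[];      /* m_h*m_w values, stored row by row */
};
typedef struct matrix_st Matrix;

#define MATRIX_MAXDIM 10000000 /* ten million */

/* Hypothetical reader for the assumed format "height width v0 v1 ...".
   Returns NULL on malformed input or when the allocation fails. */
Matrix *matrix_read(FILE *f) {
  assert(f != NULL);
  unsigned h = 0, w = 0;
  if (fscanf(f, "%u %u", &h, &w) != 2)
    return NULL; /* the two dimensions are missing or not numbers */
  if (h > MATRIX_MAXDIM || w > MATRIX_MAXDIM)
    return NULL;
  /* Refuse sizes whose byte count would not fit in a size_t. */
  const size_t max_values = (SIZE_MAX - sizeof(Matrix)) / sizeof(double);
  if (w != 0 && h > max_values / w)
    return NULL;
  size_t count = (size_t)h * w;
  Matrix *res = calloc(1, sizeof(Matrix) + count * sizeof(double));
  if (!res)
    return NULL;
  res->m_h = h;
  res->m_w = w;
  for (size_t k = 0; k < count; k++) {
    if (fscanf(f, "%lf", &res->m_v[k]) != 1) {
      free(res); /* fewer values than announced, or not a number */
      return NULL;
    }
  }
  return res;
}
```

A caller would pass the handle obtained from fopen and test the result against NULL before using the matrix.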


Coding the other functions is left as an exercise to the reader.

Don't forget to compile [with](https://gcc.gnu.org/onlinedocs/gcc/Invoking-GCC.html) all warnings and debug info, so gcc -Wall -Wextra -g with GCC. Use the debugger gdb (and also valgrind to hunt memory leaks). Read the documentation of every function you use (for example, your code doesn't check the return count of scanf, but it really should). Run several test cases. Try to convince yourself that your code is good (by proving parts of it). Perhaps use some static source code analyzer (e.g. Frama-C, which wants extra annotations in ACSL). If you need to benchmark your program, enable optimizations at compile time (e.g. by passing -O2 -march=native to gcc).
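To make the remark about scanf concrete, here is a small sketch of a checked read of a file name. The helper name read_filename is hypothetical, and the width 99 assumes the 100-byte buffers used in the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word of at most
   99 characters into buf, which must hold at least 100 bytes.
   Returns 1 on success and 0 on end of input or read error. */
int read_filename(FILE *in, char buf[100]) {
  assert(in != NULL && buf != NULL);
  /* fscanf returns the number of items stored, or EOF. The %99s width
     keeps a long word from overflowing the buffer. */
  return fscanf(in, "%99s", buf) == 1;
}
```

In the question's main this would be called as read_filename(stdin, filename1), reporting an error when it returns 0.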


In a code comment you are asking:

 // I need to now dynamically allocate the input files

You don't allocate input files (the operating system is managing them), you allocate some memory zone. Read about C dynamic memory allocation. Notice that memory allocation can fail (e.g. as documented in malloc(3)), because your virtual address space cannot grow indefinitely.
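For the array (as opposed to the matrix), a minimal sketch of a checked allocation could look like this. The name vector_create is hypothetical, and returning NULL to the caller instead of exiting is only one possible design.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n doubles, all zero.
   Returns NULL (after printing a message) when the allocation fails. */
double *vector_create(size_t n) {
  assert(n > 0);
  double *v = calloc(n, sizeof(double));
  if (!v)
    perror("vector calloc");
  return v;
}
```

The caller owns the result and releases it with free once the multiplication is done.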

BTW, the call stack is limited (typically to a megabyte or a few of them on desktop computers), so you generally want to avoid large automatic variables. That is another good reason to avoid putting matrixes in your call frame and to prefer dynamic memory allocation for them.