Find unique rows in numpy.array

ghz 1years ago ⋅ 5080 views

Question

I need to find unique rows in a numpy.array.

For example:

>>> a # I have
array([[1, 1, 1, 0, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [1, 1, 1, 0, 0, 0],
       [1, 1, 1, 1, 1, 0]])
>>> new_a # I want to get to
array([[1, 1, 1, 0, 0, 0],
       [0, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 1, 0]])

I know that i can create a set and loop over the array, but I am looking for an efficient pure numpy solution. I believe that there is a way to set data type to void and then I could just use numpy.unique, but I couldn't figure out how to make it work.


Answer

As of NumPy 1.13, one can simply choose the axis for selection of unique values in any N-dim array. To get unique rows, use np.unique as follows:

unique_rows = np.unique(original_array, axis=0)