TL;DR: the is comparison works with Python bools but not with numpy bool_ values. Are there any other differences?


I ran into some strange behaviour of booleans a couple of days ago, when I tried to use an is comparison on this numpy array:

arr1 = np.array([1,0,2,0], dtype=bool)
arr1

Out[...]: array([ True, False,  True, False])

(These variable names are based on fiction and any similarity to real variable names or production code is purely coincidental)

I saw this result:

arr1 is True

Out[...]: False

That made sense, because arr1 is neither True nor False; it is a numpy array. So I checked this:

arr1 == True

Out[...]: array([ True, False,  True, False])

This worked as expected. I noted this cute behaviour and forgot about it immediately. The next day I checked the truthiness of the array elements:

[elem is False for elem in arr1]

and it returned this!

Out[...]: [False, False, False, False]

I was really confused, because I remembered that with plain Python lists (I thought the problem was in how arrays behave):

arr2 = [True, False, True, False]
[elem is False for elem in arr2]

it works:

Out[...]: [False, True, False, True]

Moreover, it worked with another numpy array of mine:

very_cunning_arr = np.array([1, False, 2, False, []])
[elem is False for elem in very_cunning_arr]

Out[...]: [False, True, False, True, False]

When I dug into my arrays, I found that very_cunning_arr had been given dtype object because of a couple of non-numeric elements, so it contained Python bools, whereas arr1 had dtype numpy.bool_. So I checked how the two behave:

numpy_waka = np.bool_(True)
numpy_waka

Out[...]: True

python_waka = True
python_waka

Out[...]: True

[numpy_waka is True, python_waka is True]

And I finally found the difference:

Out[...]: [False, True]

After all of this I have two questions:

  1. Do numpy.bool_ and bool differ in any other ways in everyday use? (I know that numpy.bool_ has many numpy methods and attributes, such as .T)
  2. How can one check whether a numpy array contains only numpy booleans and no Python bools?

(PS: Yes, NOW I know that comparing to True/False with is is bad):

Don't compare boolean values to True or False using ==.

Yes:   if greeting:
No:    if greeting == True:
Worse: if greeting is True:

Edit 1: As mentioned in another question, numpy has its own bool_ type. But the details of this question are a bit different: I found that is comparisons work differently, but apart from this difference, is there anything else that differs between bool_ and bool in everyday use? If so, what exactly?

vurmux
  • Possible duplicate of [boolean and type checking in python vs numpy](https://stackoverflow.com/questions/18922407/boolean-and-type-checking-in-python-vs-numpy) – Hampus Larsson Apr 29 '19 at 14:52
  • 1
    Thank you for the "any similarity to real variable names..." joke. You got me. – spaghEddie Sep 27 '21 at 16:56

3 Answers

In [119]: np.array([1,0,2,0],dtype=bool)                                             
Out[119]: array([ True, False,  True, False])

In [120]: np.array([1, False, 2, False, []])                                         
Out[120]: array([1, False, 2, False, list([])], dtype=object)

Note the dtype. With object dtype, the elements of the array are Python objects, just like they are in the source list.

In the first case the array dtype is boolean. The elements represent boolean values, but they are not, themselves, Python True/False objects. Strictly speaking Out[119] does not contain np.bool_ objects. Out[119][1] is type bool_, but that's the result of the 'unboxing'. It's what ndarray indexing produces when you ask for an element. (This 'unboxing' distinction is true for all non-object dtypes.)
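As a rough illustration of that 'unboxing' (a sketch only; the variable names are made up for the example, and dtype=object is passed explicitly because recent NumPy versions refuse to infer it for ragged input), the type you get back depends on how you pull the element out:

```python
import numpy as np

bool_arr = np.array([1, 0, 2, 0], dtype=bool)
obj_arr = np.array([1, False, 2, False, []], dtype=object)  # explicit object dtype

indexed = bool_arr[1]             # indexing produces a NumPy boolean scalar
as_python = bool_arr.item(1)      # item() converts it to a plain Python bool
from_list = bool_arr.tolist()[1]  # tolist() converts every element the same way

print(isinstance(indexed, np.bool_))     # NumPy scalar
print(type(as_python), type(from_list))  # Python bools
print(type(obj_arr[1]))                  # object arrays hand back the stored Python object
```

So if you need genuine Python bools from a bool-dtype array, item() or tolist() is probably the most direct route.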

Normally we don't create NumPy scalar objects such as np.bool_(True) directly, preferring np.array(True), but to follow your example:

In [124]: np.bool_(True)                                                             
Out[124]: True
In [125]: type(np.bool_(True))                                                       
Out[125]: numpy.bool_
In [126]: np.bool_(True) is True                                                     
Out[126]: False
In [127]: type(True)                                                                 
Out[127]: bool

is is a strict test, not just for equality but for identity. Objects of different classes never satisfy an is test. Objects can satisfy the == test without satisfying the is test.
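A minimal sketch of that distinction, using a made-up variable name np_true: the identity test fails, while equality and plain truthiness behave the way you would expect.

```python
import numpy as np

np_true = np.bool_(True)

identity = np_true is True          # different objects of different classes
equality = np_true == True          # equal in value
converted = bool(np_true) is True   # bool() gives back the Python singleton

print(identity, bool(equality), converted)

# Truthiness needs no comparison at all, which is what PEP 8 recommends
if np_true:
    print("truthy")
```

This is why a plain `if value:` test works for both kinds of boolean, whereas `value is True` only works for the Python one.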

Let's play with the object dtype array:

In [129]: np.array([1, False, 2, np.bool_(False), []])                               
Out[129]: array([1, False, 2, False, list([])], dtype=object)
In [130]: [i is False for i in _]                                                    
Out[130]: [False, True, False, False, False]

In the Out[129] display, the two False objects display the same, but the Out[130] test shows they are different.


To focus on your questions.

  • np.bool_(False) is a unique object, but distinct from False. As you note it has many of the same attributes/methods as np.array(False).

  • If the array dtype is bool it does not contain Python bool objects. It doesn't even contain np.bool_ objects. However indexing such an array will produce a bool_. And applying item() to that in turn produces a Python bool.

  • If the array has object dtype, it will most likely contain Python bool objects, unless you've taken special steps to include bool_ objects.
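One possible way to turn those points into a check is sketched below. bool_kinds is a hypothetical helper written for this answer, not a NumPy function: it looks at the dtype first and only inspects individual elements when the array is not bool dtype.

```python
import numpy as np

def bool_kinds(arr):
    """Hypothetical helper: which kinds of boolean does indexing this array yield?"""
    if arr.dtype == np.bool_:
        # Native boolean storage: every element is unboxed as np.bool_
        return {"numpy"}
    kinds = set()
    for x in arr.flat:
        if isinstance(x, np.bool_):
            kinds.add("numpy")
        elif isinstance(x, bool):
            kinds.add("python")
        else:
            kinds.add("other")
    return kinds

native = np.array([1, 0, 2, 0], dtype=bool)
python_only = np.array([True, False], dtype=object)
mixed = np.array([False, np.bool_(False)], dtype=object)

print(bool_kinds(native))
print(bool_kinds(python_only))
print(bool_kinds(mixed))
```

For most purposes checking `arr.dtype == np.bool_` alone is enough; the element-by-element loop only matters for object arrays.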

hpaulj

There is some confusion about the variables: what is happening is a mix-up between numpy's types and Python's built-in ones. Use isinstance(variable, type) to check what a value is and whether it is usable in your code.

Creating a single variable as a bool works just fine; Python reads it correctly:

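import numpy as np

# np.bool was an alias for Python's built-in bool up to NumPy 1.23 (deprecated
# in 1.20, removed in 1.24). In NumPy 2.x the name refers to the NumPy boolean
# scalar type instead, so the first check prints False there.
np_bool = np.bool(True)
py_bool = True

print(isinstance(np_bool, bool)) # True with the old alias
print(isinstance(py_bool, bool)) # True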

But with arrays it can be different: the elements of a numpy bool array are not Python bool values, as you can see in this example:

# Regular list of int
arr0 = [-2, -1, 0, 1, 2]

# Python list of bool
arr1 = [True, False, True, False]

# Numpy list of bool, from int / bool
arr3_a = np.array([-2, -1, 0, 1, 2], dtype=bool)
arr3_b = np.array([True, False, True, False], dtype=bool)

print(isinstance(arr0[0], int))    # True
print(isinstance(arr1[0], bool))   # True

print(isinstance(arr3_a[0], bool)) # False
print(isinstance(arr3_b[0], bool)) # False
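If the goal is simply to accept either kind of boolean, one option is to pass both types to isinstance. The helper name is_bool_like below is an assumption made for this sketch, not something numpy provides:

```python
import numpy as np

def is_bool_like(value):
    """Hypothetical helper: True for Python bools and NumPy boolean scalars."""
    return isinstance(value, (bool, np.bool_))

arr = np.array([-2, -1, 0, 1, 2], dtype=bool)

print(is_bool_like(arr[0]))  # NumPy boolean scalar
print(is_bool_like(True))    # Python bool
print(is_bool_like(1))       # a plain int is not a bool
```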

To use an element of the numpy array in an is comparison, it first has to be converted with bool():

arr3_a = np.array([-2, -1, 0, 1, 2], dtype=bool)

x = (bool(arr3_a[0]) is True)
print(isinstance(x, bool)) # True

Quick example of use:

arr3_a = np.array([-2, -1, 0, 1, 2], dtype=bool)

for i, value in enumerate(arr3_a):
    if bool(value):
        print("List value {} is True".format(i))
    else:
        print("List value {} is False".format(i))
SrPanda

Another difference is that automatic casting to integers happens with Python's built-in bool (which np.bool was an alias for in older NumPy versions) but not with np.bool_.

This matters, for example, here:

>>> np.bool(False) - np.bool(True)
-1

>>> np.bool_(False) - np.bool_(True)
TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.
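If you do need the integer result, a possible workaround is to cast explicitly before subtracting, or to use the XOR operator that the error message suggests. This is only a sketch with made-up variable names:

```python
import numpy as np

a, b = np.bool_(False), np.bool_(True)

# Direct subtraction of NumPy booleans is rejected
try:
    a - b
    raised = False
except TypeError:
    raised = True

as_ints = int(a) - int(b)  # explicit cast, then ordinary integer arithmetic
xor = a ^ b                # the boolean alternative named in the error message

# The same idea for whole arrays: cast to int before doing arithmetic
arr = np.array([True, False, True])
arr_diff = arr.astype(int) - 1

print(raised, as_ints, bool(xor), arr_diff.tolist())
```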

Yohai Devir