
I know that Python has a concept of small integers, the numbers from -5 to 256: if two variables are assigned the same number within this range, they both refer to the same underlying object.

From the CPython source code,

#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS           257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS           5
#endif

/* Small integers are preallocated in this array so that they can be shared. The integers that are preallocated are those in the range -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive). */

Also explained here,

The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)

Example,

a = 255
b = 255
print(id(a))
print(id(b))

gives the same id,

1561854394096
1561854394096

This makes sense and is also explained in this answer: "is" operator behaves unexpectedly with integers

If two numbers are less than -5, they should have different IDs, as follows,

a = -6
b = -6
print(id(a))
print(id(b))

gives,

2827426032208
2827426032272

This makes sense so far.

But any number greater than 256 should also have a different ID, so this should print two different IDs,

a = 257
b = 257
print(id(a))
print(id(b))

But it doesn't:

2177675280112
2177675280112

Even when I use a very large integer, the IDs are the same,

a = 2571299123876321621378
b = 2571299123876321621378
print(id(a))
print(id(b))

gives me,

1956826139184
1956826139184

Can someone tell me why numbers greater than 256 have the same ID, even though the range in the CPython source is -5 to 257 (not inclusive)?

EDIT:

I have tried using PyCharm with both Python 2.7 and 3.6. Also tried on PythonTutor.com

MaverickD
  • From a logical point of view, the information that numbers between -5 and 256 have the same ids doesn't tell you anything about the ids of integers outside of this range. – Eric Duminil Nov 06 '18 at 08:22
  • It used to. If you check the answers to this question, https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers, any number greater than `256` used to have a different `id`. – MaverickD Nov 06 '18 at 08:23
  • @EricDuminil, does that mean Python now caches every possible integer `>= -5`? – MaverickD Nov 06 '18 at 08:24
  • Cannot reproduce on Python 3.6.3 (I get different ids on the last two examples). Same for Python2. Are you sure this is exactly how you are testing this? – kabanus Nov 06 '18 at 08:26
  • I ran it on PyCharm using Python 3.6, also tested it on http://pythontutor.com/ – MaverickD Nov 06 '18 at 08:28
  • I think those are wrapping the runs in a function. I suggest you add this information to the question. – kabanus Nov 06 '18 at 08:29
  • added the details @kabanus – MaverickD Nov 06 '18 at 08:31
  • If you are putting them in a function, the bytecode compiler is probably noticing that the two constants are the same and allocating just one of them. This does not work for `-6` because that's `-` applied to integer `6` at run-time. – torek Nov 06 '18 at 08:32
  • @torek, thank you for your comment. I haven't put it in any function, I am just running it directly in PyCharm. How can I verify how PyCharm is causing this? – MaverickD Nov 06 '18 at 08:37
  • I'm not sure you can. Also, I tested using `dis.dis` on a definition that sets a variable to `-5` and it loads `-5` as a constant, rather than loading `5` and invoking `-` on it, so that theory seems wrong anyway. – torek Nov 06 '18 at 08:44

2 Answers


On a stock Python 3.6.3 (and Python 2 as well) I cannot reproduce this. My guess is that PyCharm or pythontutor wrap the run in something before interpreting it; I cannot see their internals, so I cannot verify this. The reason I think this is true is the following (everything below is stock Python 3):

>>> x=2571299123876321621378
>>> y=2571299123876321621378
>>> print(id(x),id(y))
140671727739528 140671727739808

You can have this:

>>> def bla():
...  x=2571299123876321621378
...  y=2571299123876321621378
...  id(x)
...  print(id(x),id(y))
...
>>> bla()
140671727742528 140671727742528

So wrapping the two integers in something the interpreter can compile as one unit allows for extra optimizations, such as using the same constant object for both assignments. Note that this is limited as well:

>>> def bla():
...  x=2571299123876321621378
...  y=2571299123876321621378
...  print(id(x),id(y))
...  x+=1
...  y+=1
...  print(id(x),id(y))
...
>>> bla()
140671727755592 140671727755592
140671728111088 140671728108808
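
The sharing in the function case can be observed by inspecting the function's code object. This is a minimal sketch, assuming CPython: `co_consts` and the merging of equal constants are implementation details, and the names `BIG`, `shared_constant_demo` and `occurrences` are made up for illustration.

```python
BIG = 2571299123876321621378

def shared_constant_demo():
    # Both literals live in the same code object, so the CPython compiler
    # stores the value once and both names load that single object.
    x = 2571299123876321621378
    y = 2571299123876321621378
    return x is y

# How many times the value appears among the function's constants.
occurrences = shared_constant_demo.__code__.co_consts.count(BIG)
print(occurrences, shared_constant_demo())
```

If the value shows up only once in `co_consts`, both assignments necessarily load the same object, which is why the ids match inside the function.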

I would not write code that depends on this in any way. The only integers CPython reliably shares are -5 to 256, and even that is an implementation detail.
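
To look at the small-integer cache without the compiler's constant sharing getting in the way, the integers can be built at run time instead of written as literals. This is a rough sketch, assuming CPython; the helper name `same_object_at_runtime` is hypothetical.

```python
def same_object_at_runtime(n):
    """Return True if two ints built independently at run time are one object."""
    # int(str(n)) parses the value at run time, so there is no literal
    # constant for the compiler to share between the two names.
    a = int(str(n))
    b = int(str(n))
    return a is b

for value in (-5, 0, 256, 10**20, -10**20):
    print(value, same_object_at_runtime(value))
```

On CPython the values inside the cached range are expected to report a shared object and the very large ones are not. In real code, compare integers with `==`, which does not depend on object identity.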

kabanus

This question is a duplicate; it is answered here.

When you run code in a .py script, the entire file is compiled into a code object before executing it. In this case, CPython is able to make certain optimizations, such as reusing a single object for identical integer constants.
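
The effect of the compilation unit can be imitated with the built-in `compile` and `exec`. This is only a sketch, assuming CPython; the names `SOURCE_A`, `SOURCE_B`, `run_together` and `run_separately` are invented for this example.

```python
SOURCE_A = "a = 2571299123876321621378"
SOURCE_B = "b = 2571299123876321621378"

def run_together():
    # Both statements are compiled into one code object, like a .py file.
    namespace = {}
    code = compile(SOURCE_A + "\n" + SOURCE_B + "\n", "<together>", "exec")
    exec(code, namespace)
    return namespace["a"] is namespace["b"]

def run_separately():
    # Each statement is compiled on its own, like lines typed at the prompt.
    namespace = {}
    for index, source in enumerate((SOURCE_A, SOURCE_B)):
        exec(compile(source, f"<statement {index}>", "exec"), namespace)
    return namespace["a"] is namespace["b"]

print("one compilation unit:", run_together())
print("separate units:", run_separately())
```

Compiled together, the two names are expected to refer to one object. Compiled separately, a standard CPython build typically creates two objects, although that is not guaranteed either.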

Mushfirat Mohaimin
hrdom