
I want to print the memory size of all variables in my scope simultaneously.

Something similar to:

for obj in locals().values():
    print sys.getsizeof(obj)

But with variable names before each value so I can see which variables I need to delete or split into batches.

Ideas?

zsquare
user278016

2 Answers


A bit more code, but it works in Python 3 and gives sorted, human-readable output:

import sys
def sizeof_fmt(num, suffix='B'):
    ''' by Fred Cirera,  https://stackoverflow.com/a/1094933/1870254, modified'''
    for unit in ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f %s%s" % (num, 'Yi', suffix)

for name, size in sorted(((name, sys.getsizeof(value)) for name, value in locals().items()),
                         key=lambda x: -x[1])[:10]:
    print("{:>30}: {:>8}".format(name, sizeof_fmt(size)))

Example output:

                  umis:   3.6 GiB
       barcodes_sorted:   3.6 GiB
          barcodes_idx:   3.6 GiB
              barcodes:   3.6 GiB
                  cbcs:   3.6 GiB
         reads_per_umi:   1.3 GiB
          umis_per_cbc:  59.1 MiB
         reads_per_cbc:  59.1 MiB
                   _40:  12.1 KiB
                     _:   1.6 KiB
jan-glx
  • Nice! Can you explain what these `_40` are? To me it shows multiple `_\d+` rows. Some seem to have the exact same size like a named variable, others don't. – MoRe Feb 13 '19 at 09:52
  • 3
    @MoRe these are (probably) temporary variables holding the output of jupyter notebook cells. [see documentation](https://ipython.org/ipython-doc/3/interactive/reference.html#output-caching-system) – jan-glx Feb 14 '19 at 10:37
  • "This system obviously can potentially put heavy memory demands on your system, since it prevents Python’s garbage collector from removing any previously computed results. You can control how many results are kept in memory with the configuration option `InteractiveShell.cache_size`. If you set it to 0, output caching is disabled. You can also use the `%reset` and `%xdel` magics to clear large items from memory" – jan-glx Feb 14 '19 at 10:39
  • This snippet is really useful, although my variables totaled up to about 5.1 GB, whereas memory usage according to `resource.getrusage` was around 10.9 GB. This is in Google Colab. What could be accounting for the rest of the memory usage? – demongolem Mar 23 '20 at 16:17
  • I used this snippet for a=numpy.zeros((6340,200,200)). It shows a=1.9GB. Is it normal? – gocen Mar 31 '20 at 08:22
  • 1
    @gocen: yes. 6340*200*200 doubles *64 bit/double / (8 bit/byte) / (2^30 bytes per GB) = 1.889 GB – jan-glx Mar 31 '20 at 20:57
  • @demongolem: I don't know, and I don't know enough about how Python works internally to answer. I can imagine libraries allocating memory outside of Python; there might be memory leaks; there might even be Python variables not in `locals()` (e.g. in `globals()`?). Maybe garbage collection helps; try memory profiling and consider asking a separate question. – jan-glx Mar 31 '20 at 21:04
  • @jan-glx but it is 1.6 KiB for this list = [[ ['0.0' for col in range(6340)] for col in range(200)] for row in range(200)] What is the difference? – gocen Apr 01 '20 at 08:43
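The last two comments come down to `sys.getsizeof` being shallow: for a list it counts only the list object and its pointer slots, not the objects those pointers refer to, whereas a numpy array owns its data buffer, so its reported size includes the data. A minimal sketch of the difference (the `deep_sizeof` helper below is illustrative, not part of the answer, and ignores objects it doesn't special-case):

```python
import sys

# A nested pure-Python list: getsizeof sees only the outer list of 100
# pointers, not the 100 inner lists or the floats inside them.
nested = [[0.0] * 100 for _ in range(100)]

def deep_sizeof(obj, seen=None):
    """Recursively total container sizes, counting each object at most once."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size

print(sys.getsizeof(nested))   # shallow: just the outer list
print(deep_sizeof(nested))     # much larger: includes the inner lists too
```

This is why the numpy array in the comments reports ~1.9 GiB while a nested list of similar shape reports only a couple of KiB: the list's real footprint is spread over objects `getsizeof` never visits.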

You can iterate over both the keys and values of a dictionary using `.items()`:

from __future__ import print_function  # for Python2
import sys

local_vars = list(locals().items())  # snapshot, so the dict isn't mutated mid-iteration
for var, obj in local_vars:
    print(var, sys.getsizeof(obj))
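The `list(...)` copy is what makes this safe: the loop itself binds new names (`var`, `obj`), and iterating over `locals().items()` directly while the namespace grows raises `RuntimeError: dictionary changed size during iteration` at module scope. A self-contained sketch with hypothetical variables `x` and `y`:

```python
import sys

x = list(range(1000))  # hypothetical variables to measure
y = "hello"

# list(...) snapshots the items before the loop creates var/obj,
# so the namespace can change freely while we iterate.
sizes = {}
for var, obj in list(locals().items()):
    sizes[var] = sys.getsizeof(obj)
    print(var, sizes[var])
```

Note that, like the other answer, this reports shallow sizes only: a list of large objects will look small because only its pointer array is counted.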
TomDLT
zsquare