3

Let's say I have four files saved on my computer as .npz files: W, X, Y and Z. Assume my computer does not have enough RAM to load more than one of them at a time.

How can I run this command?

 matplotlib.pyplot.boxplot([W, X, Y, Z])

In other words, how can I load W, plot W, delete W, then load X, plot X, delete X, ... and have all four of them on the same figure (not as subplots)?

Thank you !

Magea

2 Answers

5

The matplotlib.axes.Axes.boxplot method actually calls two functions under the hood: one to compute the necessary statistics (matplotlib.cbook.boxplot_stats) and one to draw the plot (matplotlib.axes.Axes.bxp). You can exploit this structure by calling the first one for each dataset (loading one at a time) and then feeding the collected results to the plotting function.

The example below uses three datasets and iterates over them to collect the output of cbook.boxplot_stats, which is small: a handful of summary values per dataset plus any outlier points. A single call to ax.bxp then creates the graph. (In your application you would load one file at a time, call boxplot_stats on it, and delete the data before loading the next file.)

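import matplotlib.cbook as cbook
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins for the real datasets
x = np.random.rand(10,10)
y = np.random.rand(10,10)
z = np.random.rand(10,10)

fig, ax = plt.subplots(1,1)

# Collect one statistics dict per dataset; the raw data is not kept
bxpstats = []
for dataset, label in zip([x, y, z], ['X', 'Y', 'Z']):
    # np.ravel flattens the array so each dataset yields a single box
    bxpstats.extend(cbook.boxplot_stats(np.ravel(dataset), labels=[label]))

# Draw all boxes on the same axes from the precomputed statistics
ax.bxp(bxpstats)
plt.show()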

Result:

[Figure: the resulting box plot, with one box each for X, Y and Z on the same axes]
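Applied to the four .npz files from the question, the same idea might look like the sketch below. The file names, the array key "data" and the helper stats_from_npz are assumptions made for illustration, not something the question specifies; the demo first writes small random files so that it runs on its own.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.cbook as cbook
import matplotlib.pyplot as plt
import numpy as np


def stats_from_npz(path, label, key="data"):
    """Load one array from an .npz file and return its box plot statistics.

    `key` is an assumed array name inside the archive.
    """
    with np.load(path) as archive:
        values = np.ravel(archive[key])
    stats = cbook.boxplot_stats(values, labels=[label])
    # Drop the reference so the array can be freed before the next load
    del values
    return stats


# Demo data: four small files standing in for W, X, Y and Z
tmpdir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
paths = {}
for label in ["W", "X", "Y", "Z"]:
    paths[label] = os.path.join(tmpdir, label + ".npz")
    np.savez(paths[label], data=rng.random((10, 10)))

# Only one dataset is in memory at any time
bxpstats = []
for label, path in paths.items():
    bxpstats.extend(stats_from_npz(path, label))

# One call draws all four boxes on the same axes
fig, ax = plt.subplots()
ax.bxp(bxpstats)
fig.savefig(os.path.join(tmpdir, "boxplot.png"))
```

Note that each statistics dict also stores the outlier values under "fliers", so for very large datasets with many outliers it may be worth passing showfliers=False to ax.bxp and discarding that entry.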

hitzg
  • Wow! It seems that what you propose is exactly what I was looking for. I am going to try it asap and I'll let you know! Thank you! – Magea Apr 28 '15 at 07:26
  • I cannot find boxplot_stats in cbook... I guess I do not have the right version, so I need to look into it further. But I understood what you proposed, and it seems just fine – Magea Apr 28 '15 at 07:39
  • I believe that the boxplot function was heavily refactored in v1.4 of matplotlib, so if you use an older version, this solution will not work. To check your version of matplotlib you can run `import matplotlib; print(matplotlib.__version__)`. If it is older than 1.4, I would suggest updating it – hitzg Apr 28 '15 at 08:07
  • Hi. Yes, I checked that: I am running version 1.3.1. It is not my computer but a colleague's at work, and I do not have the sudo rights to update packages... I need to wait until he can do that for me. But still, thanks! – Magea Apr 28 '15 at 08:16
0

One option is to pass a random sample of your data to the plotting function.

Alternatively, because the boxplot displays only aggregate data, you should consider calculating those aggregate values separately and then applying them to the boxplot visualization.

Looking at the full option list from the documentation, you may be able to construct boxplots by passing aggregate data:

boxplot(self, x, notch=False, sym='b+', vert=True, whis=1.5,
    positions=None, widths=None, patch_artist=False,
    bootstrap=None, usermedians=None, conf_intervals=None,
    meanline=False, showmeans=False, showcaps=True,
    showbox=True, showfliers=True, boxprops=None, labels=None,
    flierprops=None, medianprops=None, meanprops=None,
    capprops=None, whiskerprops=None, manage_xticks=True):

See for example usermedians:

usermedians : array-like or None (default)

An array or sequence whose first dimension (or length) is compatible with x. This overrides the medians computed by matplotlib for each element of usermedians that is not None. When an element of usermedians == None, the median will be computed by matplotlib as normal.
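One way to read "calculating those aggregate values separately" is to compute the quartiles and whiskers yourself and hand them to Axes.bxp, which draws from precomputed statistics (available from matplotlib 1.4 on). The sketch below is an assumption about what that could look like, not this answer's own code: summarize is a hypothetical helper, and it uses the common 1.5 × IQR whisker convention.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def summarize(values, label, whis=1.5):
    """Return the aggregate values that Axes.bxp needs to draw one box."""
    values = np.ravel(values)
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    lo_limit = q1 - whis * iqr
    hi_limit = q3 + whis * iqr
    # Whiskers end at the most extreme data points inside the limits
    inside = values[(values >= lo_limit) & (values <= hi_limit)]
    return {
        "label": label,
        "med": med,
        "q1": q1,
        "q3": q3,
        "whislo": inside.min(),
        "whishi": inside.max(),
        # Points outside the limits are drawn individually as fliers
        "fliers": values[(values < lo_limit) | (values > hi_limit)],
    }


# Demo: summarize each dataset in turn, keeping only the aggregates
rng = np.random.default_rng(1)
aggregates = []
for label in ["W", "X", "Y", "Z"]:
    data = rng.normal(size=(20, 20))  # stand-in for one loaded file
    aggregates.append(summarize(data, label))
    del data  # release the raw data before handling the next dataset

fig, ax = plt.subplots()
ax.bxp(aggregates)
fig.savefig(io.BytesIO(), format="png")
```

Only the small dictionaries in aggregates are kept between iterations, so the raw arrays never need to be in memory together.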

Community
philshem
  • Hi! So if I understand correctly, you mean that I should load W, create this aggregate, delete W, load X, ... and then use boxplot with those aggregates? If I translate what you mean by "aggregated data", you mean no longer having a box defined by all the points contained in it, but only by its "skeleton", i.e. the lines instead of the filled part? If so, I have no clue how to do that properly, especially with the huge amount of data I have (each of W, X, ... is 150 GB of 512*512*256*512 arrays that I flatten with reshape(-1) for boxplot) – Magea Apr 27 '15 at 14:54