
I am trying to use joblib.Memory to cache the loading of a DXF file (a 3D CAD file format), but loading the cached object ends in a recursion error that exhausts the Python stack. I've created an MWE that demonstrates the problem.

from ezdxf import recover
from joblib import Memory

location = './cachedir'
memory = Memory(location, verbose=0)


def _load_dxf_file(filename):
    # recover.readfile() returns the document together with an auditor
    print('Loading DXF file %s' % filename)
    doc, auditor = recover.readfile(filename)
    print('    DXF file load complete.')
    return doc


if __name__ == '__main__':
    # Wrap the loader so its result is cached on disk by joblib
    load_dxf_file = memory.cache(_load_dxf_file)
    filename = 'cube_mesh_2.dxf'
    doc = load_dxf_file(filename)

The sample DXF file (a simple cube) is available at this gist (too large to paste here).

https://gist.github.com/jrjbertram/87e31b3bb0ce2d3771dce7f50d2d0fba

The object loads the first time without any issues (the data is fine; plotting is not included in the MWE). Re-running the script so that the cached result is used produces errors like:


WARNING:root:[MemorizedFunc(func=<function _load_dxf_file at 0x7fbaee2a38b0>, location=./cachedir/joblib)]: Exception while loading results for _load_dxf_file('cube_mesh_2.dxf')
Traceback (most recent call last):
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/joblib/memory.py", line 513, in _cached_call
    out = self.store_backend.load_item(
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/joblib/_store_backends.py", line 170, in load_item
    item = numpy_pickle.load(f)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 575, in load
    obj = _unpickle(fobj)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/pickle.py", line 1210, in load
    dispatch[key[0]](self)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 329, in load_build
    Unpickler.load_build(self)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/pickle.py", line 1701, in load_build
    setstate = getattr(inst, "__setstate__", None)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 126, in __getattr__
    attrib_def: Optional[DXFAttr] = self.dxfattribs.get(key)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 300, in dxfattribs
    return self._entity.DXFATTRIBS
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 126, in __getattr__
    attrib_def: Optional[DXFAttr] = self.dxfattribs.get(key)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 300, in dxfattribs
    return self._entity.DXFATTRIBS
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 126, in __getattr__
    attrib_def: Optional[DXFAttr] = self.dxfattribs.get(key)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 300, in dxfattribs
    return self._entity.DXFATTRIBS
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 126, in __getattr__
    attrib_def: Optional[DXFAttr] = self.dxfattribs.get(key)
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 300, in dxfattribs
    return self._entity.DXFATTRIBS
<snip>
  File "/Users/bertrjr1/opt/anaconda3/lib/python3.8/site-packages/ezdxf/entities/dxfns.py", line 126, in __getattr__
    attrib_def: Optional[DXFAttr] = self.dxfattribs.get(key)
RecursionError: maximum recursion depth exceeded

I suspect there are rules about what joblib can successfully cache and restore, and that the ezdxf library violates them. Or perhaps this is a pickle limitation similar to the one described here:

Cannot pickle object: maximum recursion depth exceeded
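
Reading the traceback, the loop seems to start when pickle's load_build calls getattr(inst, "__setstate__", None) on an object whose attributes have not been restored yet. Below is a minimal sketch of how that kind of failure can arise with plain pickle. Entity, Namespace, GuardedNamespace and round_trip are hypothetical stand-ins made up for illustration; they are not ezdxf's actual code, and I am only assuming the real cause is similar.

```python
import pickle


class Entity:
    """Hypothetical stand-in for an entity class; not ezdxf code."""
    DXFATTRIBS = {"layer": "0"}


class Namespace:
    """Hypothetical wrapper that forwards unknown attributes to its entity."""

    def __init__(self, entity):
        self._entity = entity

    def __getattr__(self, key):
        # Called only when normal lookup fails. While unpickling, the
        # instance exists before its __dict__ is restored, so self._entity
        # is itself missing and this line re-enters __getattr__ endlessly.
        return getattr(self._entity, key)


class GuardedNamespace(Namespace):
    """Same wrapper, but private and dunder names are never forwarded."""

    def __getattr__(self, key):
        if key.startswith("_"):
            raise AttributeError(key)
        return getattr(self._entity, key)


def round_trip(obj):
    """Pickle and unpickle obj, reporting a RecursionError instead of raising."""
    try:
        return pickle.loads(pickle.dumps(obj))
    except RecursionError:
        return "RecursionError"


unguarded = round_trip(Namespace(Entity()))
guarded = round_trip(GuardedNamespace(Entity()))
```

If this is what is happening, the limitation would lie in how the wrapper's __getattr__ behaves on a half-built instance rather than in joblib itself, since the unguarded class fails with plain pickle as well.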

My current workaround is to load the DXF file (the real one is very large and takes a long time to load), perform face / vertex / triangulation processing, save the results to numpy arrays, and then use joblib.Memory to cache those arrays. However, I need to rerun the triangulation and other processing often on the same DXF file, so it would be nice to cache the loaded DXF document itself and avoid the parsing penalty (DXF is a very detailed text format that must be processed in full).

I may just need to walk the loaded DXF document and return a pruned version of it in some other format (lists, dictionaries, etc.), then cache that instead.
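
A rough sketch of that pruning idea, using only the standard library. FakeEntity and prune are hypothetical names invented here, and the fields are assumptions; in the real script the entities would come from the loaded ezdxf document, and the cached function would return the pruned structure instead of the document.

```python
import pickle
from dataclasses import dataclass


@dataclass
class FakeEntity:
    """Hypothetical stand-in for a parsed DXF entity (not an ezdxf class)."""
    kind: str
    vertices: list


def prune(entities):
    """Reduce entities to plain dicts, lists and tuples of floats."""
    return [
        {
            "type": e.kind,
            "vertices": [tuple(float(c) for c in v) for v in e.vertices],
        }
        for e in entities
    ]


entities = [FakeEntity("3DFACE", [(0, 0, 0), (1, 0, 0), (1, 1, 0)])]
plain = prune(entities)

# Plain containers carry no custom __getattr__, so they survive a pickle
# round trip, which is what a disk cache needs.
restored = pickle.loads(pickle.dumps(plain))
```

The cost is that the pruned structure only keeps whatever fields are copied out, so anything the later processing needs has to be extracted up front.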

Any suggestions or ideas welcome.

Thank you, Josh.

jrjbertram