14

I have huge json objects containing 2D lists of coordinates that I need to transform into numpy arrays for processing.

However, using json.loads followed by np.array() is too slow.

Is there a way to speed up the creation of numpy arrays from JSON?

import json
import numpy as np

json_input = '{"rings" : [[[-8081441.0, 5685214.0], [-8081446.0, 5685216.0], [-8081442.0, 5685219.0], [-8081440.0, 5685211.0], [-8081441.0, 5685214.0]]]}'

dict = json.loads(json_input)
numpy_2d_arrays = [np.array(ring) for ring in dict["rings"]]

I would take any solution whatsoever!

Below the Radar

3 Answers

5

The simplest answer would just be:

numpy_2d_arrays = np.array(dict["rings"])

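Because this avoids an explicit Python-level loop over the rings, you would probably see a modest speedup. Note that it only yields a single 3D array if every ring has the same number of points; rings of different lengths cannot be stacked this way. If you have control over the creation of json_input, it would be better to write the data out as a serialized array. A version is here.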
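As a rough illustration of the serialized-array idea, here is a minimal sketch that assumes NumPy's own .npy format (np.save / np.load) is an acceptable replacement for JSON. The in-memory buffer is only a stand-in for a real file, and the variable names are illustrative.

```python
import io
import json

import numpy as np

json_input = '{"rings" : [[[-8081441.0, 5685214.0], [-8081446.0, 5685216.0], [-8081442.0, 5685219.0], [-8081440.0, 5685211.0], [-8081441.0, 5685214.0]]]}'

# Producer side (done once): parse the JSON and convert each ring to an array.
rings = [np.array(ring, dtype=np.float64) for ring in json.loads(json_input)["rings"]]

# Write the first ring in NumPy's binary .npy format.
# io.BytesIO stands in for a file opened in binary mode (assumption for the demo).
buffer = io.BytesIO()
np.save(buffer, rings[0])

# Consumer side: loading reads raw binary data, so no text parsing is involved.
buffer.seek(0)
restored = np.load(buffer)

print(restored.shape)
```

Whether this helps depends on being able to change the producer; if the data must stay JSON, this does not apply.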

Community
Daniel
3

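Since JSON syntax is very close to Python syntax, I suggest you use ast.literal_eval. It may be faster… Note that this only works when the JSON contains no true, false, or null values, which are not valid Python literals.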

import ast
import numpy as np

json_input = """{"rings" : [[[-8081441.0, 5685214.0],
                             [-8081446.0, 5685216.0],
                             [-8081442.0, 5685219.0],
                             [-8081440.0, 5685211.0],
                             [-8081441.0, 5685214.0]]]}"""

rings = ast.literal_eval(json_input)
numpy_2d_arrays = [np.array(ring) for ring in rings["rings"]]

Give it a try and let us know.
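One way to check whether this is actually faster on your data is to time both parsers with timeit. The sketch below uses the small sample string; the function names and the repetition count are arbitrary choices, and results on huge inputs may differ.

```python
import ast
import json
import timeit

json_input = """{"rings" : [[[-8081441.0, 5685214.0],
                             [-8081446.0, 5685216.0],
                             [-8081442.0, 5685219.0],
                             [-8081440.0, 5685211.0],
                             [-8081441.0, 5685214.0]]]}"""


def parse_with_json():
    return json.loads(json_input)


def parse_with_literal_eval():
    return ast.literal_eval(json_input)


# Both parsers should produce the same nested lists for this input.
same_result = parse_with_json() == parse_with_literal_eval()

# Time 1000 calls of each parser; substitute your real payload for a fair test.
json_seconds = timeit.timeit(parse_with_json, number=1000)
literal_eval_seconds = timeit.timeit(parse_with_literal_eval, number=1000)

print(f"json.loads:       {json_seconds:.4f} s")
print(f"ast.literal_eval: {literal_eval_seconds:.4f} s")
```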

Laurent LAPORTE
0

For this specific data, you could try the following (eval executes arbitrary code, so use it only on trusted input):

import numpy as np

json_input = '{"rings" : [[(-8081441.0, 5685214.0), (-8081446.0, 5685216.0), (-8081442.0, 5685219.0), (-8081440.0, 5685211.0), (-8081441.0, 5685214.0)]]}'
i = json_input.find('[')
L = eval(json_input[i+1:-2])
print(np.array(L))
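If the input is not fully trusted, the same slicing trick can be sketched with ast.literal_eval instead of eval. This is a variation under the same assumption that the string has exactly this layout (a single ring, closed by "]}"), not a general parser; the names start, points, and ring are illustrative.

```python
import ast

import numpy as np

json_input = '{"rings" : [[(-8081441.0, 5685214.0), (-8081446.0, 5685216.0), (-8081442.0, 5685219.0), (-8081440.0, 5685211.0), (-8081441.0, 5685214.0)]]}'

# Skip everything up to and including the first '[' and drop the trailing ']}',
# which leaves the text of the single inner list of coordinate pairs.
start = json_input.find('[')
points = ast.literal_eval(json_input[start + 1:-2])

# literal_eval only accepts Python literals, so it cannot run arbitrary code.
ring = np.array(points)
print(ring.shape)
```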
Gribouillis