571

I have looked through the information that the Python docs give, but I'm still a little confused. Could somebody post sample code that would write a new file then use pickle to dump a dictionary into it?

CryptoFool
Chachmu
  • 5
    Read through this: http://www.doughellmann.com/PyMOTW/pickle/ and come back when you need a specific question – pyfunc Jun 27 '12 at 02:16
  • Check here first though http://stackoverflow.com/questions/5145664/storing-unpicklabe-pygame-surface-objects-in-external-files – John La Rooy Jun 27 '12 at 03:00

11 Answers

1112

Try this:

import pickle

a = {'hello': 'world'}

with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)

with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)

print(a == b)

There's nothing about the above solution that is specific to a dict object. This same approach will work for many Python objects, including instances of arbitrary classes and arbitrarily complex nestings of data structures. For example, replacing the second line with these lines:

import datetime
today = datetime.datetime.now()
a = [{'hello': 'world'}, 1, 2.3333, 4, True, "x", 
     ("y", [[["z"], "y"], "x"]), {'today', today}]

will produce a result of True as well.

Some objects can't be pickled due to their very nature. For example, it doesn't make sense to pickle a structure containing a handle to an open file.
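
A minimal sketch of that failure mode, assuming Python 3 (the `record` dict and the scratch file are made up for illustration): pickling a dict that holds an open file object raises `TypeError`, while storing the file's path instead works.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file, created only for this illustration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as handle:
    record = {"name": "log", "file": handle}  # dict holding an open file object
    try:
        pickle.dumps(record)
        failed = False
    except TypeError as exc:
        failed = True
        print("cannot pickle:", exc)

# Storing the path (a plain string) instead of the handle round-trips fine.
safe_record = {"name": "log", "file": path}
restored = pickle.loads(pickle.dumps(safe_record))
print(restored == safe_record)

os.remove(path)
```

The usual way around this is to pickle whatever is needed to reopen the resource (a path, a URL, connection settings) rather than the live handle itself.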

CryptoFool
Blender
  • 7
    @houbysoft: Why did you remove `pickle.HIGHEST_PROTOCOL`? – Blender Dec 24 '16 at 23:01
  • 57
    @Blender: irrelevant and needlessly complicated for this level of question -- the average user will be just fine with the defaults. – houbysoft Dec 25 '16 at 02:52
  • 44
    @houbysoft: True for Python 3 users, but on Python 2, using the default protocol (0) is not only incredibly inefficient on time and space, but it can't actually handle many things that protocol 2+ handles just fine (e.g. new-style classes that use `__slots__`). I'm not saying you should always use `HIGHEST_PROTOCOL`, but ensuring you don't use protocol 0 or 1 is actually rather important. – ShadowRanger Aug 23 '17 at 18:54
  • 30
    What does `pickle.HIGHEST_PROTOCOL` actually do? – BallpointBen May 02 '18 at 23:27
  • 18
    @BallpointBen: It picks the highest protocol version your version of Python supports: https://docs.python.org/3/library/pickle.html#data-stream-format – Blender May 03 '18 at 00:53
  • 3
    To make it more concise you can write `protocol=-1` (similar to -1 indexing in a list). – Matthew D. Scholefield Nov 01 '19 at 04:13
  • 1
    @AnkurS IMHO calling a file object "fd" or "handle" is useful when you want to distinguish it from the actual file on disk, though that's not super relevant here. I agree the variable should be called `file`, ideally. – wjandrea Apr 30 '20 at 15:37
  • I don't know much about file loading and saving in Python, but for good practice shouldn't you end your save and load with a `handle.close()`? – LNiederha Aug 04 '20 at 11:53
  • @LoïcNiederhauser no, the `with` context takes care of that here. – bugmenot123 Aug 22 '20 at 09:24
  • 1
    'wb' is absolutely necessary here. If the option 'wb' is not passed, then you won't be able to dump binary format. Anyway, great answer! – santhisenan Dec 04 '20 at 00:50
  • 2
    Since Python2 is now deprecated and this is the top answer, we might remove `pickle.HIGHEST_PROTOCOL` – Astariul Mar 25 '21 at 06:50
  • @Astariul There is no official support for Python 2 now, but still a lot of codebases might use Python 2. – Kishore Jul 22 '21 at 07:59
  • why `wb` and not `w+`? – Charlie Parker Jul 29 '21 at 22:55
  • @Astariul: You are most incorrect, `HIGHEST_PROTOCOL` is still relevant and should be used in Python 3. – martineau Jan 19 '22 at 17:53
  • @CharlieParker: Because 1.) `w+` doesn't open the file in binary mode, and 2.) the ability to also be able to read from the file isn't needed. – martineau Jan 19 '22 at 17:56
  • I'm getting a 'no such file' error. How do I handle first-time writes? – john ktejik Jan 21 '22 at 18:33
  • Per the comment I added to the question, I think it would be cool to explain that there's nothing at all special about pickling a `dict` and that this same code will work for many other things that `your_data` might point to. – CryptoFool Mar 05 '22 at 20:42
151
import pickle

your_data = {'foo': 'bar'}

# Store data (serialize)
with open('filename.pickle', 'wb') as handle:
    pickle.dump(your_data, handle, protocol=pickle.HIGHEST_PROTOCOL)

# Load data (deserialize)
with open('filename.pickle', 'rb') as handle:
    unserialized_data = pickle.load(handle)

print(your_data == unserialized_data)

The advantage of HIGHEST_PROTOCOL is that files get smaller. This sometimes makes unpickling much faster.
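
To see the size difference, here is a small sketch that pickles the same dict with the oldest and the newest protocol. The sample data is made up, and the exact byte counts depend on your Python version, so none are shown.

```python
import pickle

# Hypothetical sample data, chosen only to make the size difference visible.
data = {"key%d" % i: i for i in range(1000)}

oldest = pickle.dumps(data, protocol=0)  # original text-based protocol
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
shorthand = pickle.dumps(data, protocol=-1)  # a negative value also selects the highest protocol

print(len(oldest), len(newest))

# The protocol is detected automatically when loading.
print(pickle.loads(oldest) == pickle.loads(newest) == data)
```

The trade-off is that a file written with the highest protocol may not be readable by an older Python version that does not know that protocol.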

Important notice: The maximum file size of pickle is about 2 GB with older protocols; protocol 4 (added in Python 3.4) adds support for very large objects.

Alternative way

import mpu
your_data = {'foo': 'bar'}
mpu.io.write('filename.pickle', your_data)
unserialized_data = mpu.io.read('filename.pickle')

Alternative Formats

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats
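
As one illustration of the trade-off, here is a minimal sketch that stores a similar dictionary as JSON using only the standard library. JSON can be read from most other languages, but it only covers basic types (strings, numbers, lists, dicts, booleans, null), so it is not a drop-in replacement for pickle. The file location is made up.

```python
import json
import os
import tempfile

your_data = {"foo": "bar", "numbers": [1, 2, 3]}

# Hypothetical file location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "filename.json")

# Store data (serialize); JSON is text, so the file is opened in text mode.
with open(path, "w", encoding="utf-8") as handle:
    json.dump(your_data, handle)

# Load data (deserialize)
with open(path, "r", encoding="utf-8") as handle:
    loaded = json.load(handle)

print(your_data == loaded)
```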

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

Martin Thoma
42
# Save a dictionary into a pickle file.
import pickle

favorite_color = {"lion": "yellow", "kitty": "red"}  # create a dictionary
pickle.dump(favorite_color, open("save.p", "wb"))  # save it into a file named save.p

# -------------------------------------------------------------
# Load the dictionary back from the pickle file.
import pickle

favorite_color = pickle.load(open("save.p", "rb"))
# favorite_color is now {"lion": "yellow", "kitty": "red"}
nietaki
user3465692
  • 2
    is it necessary to use a close() after the open()? – PlsWork Apr 30 '18 at 00:42
  • 1
    Yes, in general. However, in CPython (the default Python that you probably have) the file is automatically closed whenever the file object expires (when nothing refers to it). In this case, since nothing refers to the file object after it is returned by open(), it will be closed as soon as load returns. This is not considered good practice and will cause problems on other systems – Ankur S Jun 29 '18 at 13:17
  • 1
    why `wb` and not `w+`? – Charlie Parker Jul 29 '21 at 22:55
16

In general, pickling a dict will fail unless everything in it is picklable. Simple objects like strings and integers always are, but modules and lambdas, for example, are not.

Python 2.7.9 (default, Dec 11 2014, 01:21:43) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy import *
>>> type(globals())     
<type 'dict'>
>>> import pickle
>>> pik = pickle.dumps(globals())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle module objects
>>> 

Even a really simple dict will often fail. It just depends on the contents.

>>> d = {'x': lambda x:x}
>>> pik = pickle.dumps(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 748, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <function <lambda> at 0x102178668>: it's not found as __main__.<lambda>

However, if you use a better serializer like dill or cloudpickle, then most dictionaries can be pickled:

>>> import dill
>>> pik = dill.dumps(d)

Or if you want to save your dict to a file...

>>> with open('save.pik', 'wb') as f:
...   dill.dump(globals(), f)
... 

The latter example is identical to any of the other good answers posted here (which aside from neglecting the picklability of the contents of the dict are good).
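
If you want to stay with the standard library, one option is to check which entries of a dict `pickle` can handle before dumping it. The helper below is a hypothetical sketch (the name `split_by_picklability` is made up, and it assumes Python 3); it simply tries to pickle each value and sorts the keys accordingly.

```python
import pickle


def split_by_picklability(mapping):
    """Return (picklable_items, unpicklable_keys) for the values of a dict."""
    picklable = {}
    unpicklable_keys = []
    for key, value in mapping.items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle raises several different exception types
            unpicklable_keys.append(key)
        else:
            picklable[key] = value
    return picklable, unpicklable_keys


d = {"x": lambda x: x, "n": 42, "module": pickle, "words": ["a", "b"]}
good, bad = split_by_picklability(d)
print(sorted(good))
print(sorted(bad))
```

This pickles every value once just to test it, so it is only a diagnostic aid and not something to run on large data.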

Mike McKerns
16

A simple way to dump Python data (e.g. a dictionary) to a pickle file:

import pickle

your_dictionary = {}

pickle.dump(your_dictionary, open('pickle_file_name.p', 'wb'))
ghchoi
11
>>> import pickle
>>> with open("/tmp/picklefile", "wb") as f:
...     pickle.dump({}, f)
... 

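Normally it's preferable to use the cPickle implementation (Python 2 only; in Python 3, `pickle` uses its C implementation automatically when it is available):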

>>> import cPickle as pickle
>>> help(pickle.dump)
Help on built-in function dump in module cPickle:

dump(...)
    dump(obj, file, protocol=0) -- Write an object in pickle format to the given file.

    See the Pickler docstring for the meaning of optional argument proto.
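
For code that has to run on both Python 2 and Python 3, a common idiom is the import fallback sketched below. The sample dict is made up; protocol 2 is used because both major versions can read it.

```python
try:
    import cPickle as pickle  # Python 2: faster C implementation
except ImportError:
    import pickle  # Python 3: the C accelerator is used automatically if present

data = {"hello": "world"}

blob = pickle.dumps(data, protocol=2)  # protocol 2 is understood by Python 2 and 3
restored = pickle.loads(blob)
print(restored == data)
```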
John La Rooy
9

If you just want to store the dict in a single file, use pickle like this:

import pickle

a = {'hello': 'world'}

with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle)

with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)

If you want to save and restore multiple dictionaries in multiple files for caching, or store more complex data, use anycache. It does all the other stuff you need around pickle:

from anycache import anycache

@anycache(cachedir='path/to/files')
def myfunc(hello):
    return {'hello': hello}

Anycache stores the myfunc results in different files in cachedir, depending on the arguments, and reloads them on later calls.

See the documentation for any further details.
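
To make the idea concrete, here is a rough standard-library sketch of a pickle-backed cache decorator. This is not how anycache is implemented; the name `pickle_cache` and the key scheme are assumptions made for illustration only.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def pickle_cache(cachedir):
    """Hypothetical decorator: cache results as one pickle file per argument set."""
    cachedir = Path(cachedir)
    cachedir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumption: repr() of the arguments is stable enough to act as a key.
            key = repr((func.__qualname__, args, sorted(kwargs.items())))
            digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
            path = cachedir / (digest + ".pickle")
            if path.exists():
                with path.open("rb") as handle:
                    return pickle.load(handle)
            result = func(*args, **kwargs)
            with path.open("wb") as handle:
                pickle.dump(result, handle)
            return result
        return wrapper
    return decorator


calls = []  # records real executions so cache hits are visible


@pickle_cache(tempfile.mkdtemp())
def myfunc(hello):
    calls.append(hello)
    return {'hello': hello}


first = myfunc('world')   # computed and written to disk
second = myfunc('world')  # loaded from the pickle file
other = myfunc('there')   # different argument, different file
print(calls)
```

A real caching library also has to deal with things this sketch ignores, such as invalidating stale files, concurrent writers, and arguments without a stable `repr`.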

c0fec0de
3

FYI, Pandas has a method to save pickles now.

I find it easier.

import pandas as pd

pd.to_pickle(object_to_save, '/temp/saved_pkl.pickle')
George Sotiropoulos
2
import pickle

dictobj = {'Jack' : 123, 'John' : 456}

filename = "/foldername/filestore"

# The with block closes the file even if pickle.dump raises an error.
with open(filename, 'wb') as fileobj:
    pickle.dump(dictobj, fileobj)
Nunser
Rahul Nair
-2

If you want to handle writing or reading in one line without file opening:

  import joblib

  my_dict = {'hello': 'world'}

  joblib.dump(my_dict, "my_dict.pickle") # write pickle file
  my_dict_loaded = joblib.load("my_dict.pickle") # read pickle file
gench
-11

I've found pickling confusing (possibly because I'm thick). I found that this works, though:

myDictionaryString=str(myDictionary)

Which you can then write to a text file. I gave up trying to use pickle as I was getting errors telling me to write integers to a .dat file. I apologise for not using pickle.
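
For what it's worth, a string written this way can be read back with `ast.literal_eval`, but only when the dict holds nothing except Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, `None`). A minimal sketch, with made-up data and a made-up file name:

```python
import ast
import os
import tempfile

myDictionary = {"Jack": 123, "John": [4, 5, 6]}

# Hypothetical file location inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "myDictionary.txt")

# Write the string form of the dict to a text file.
with open(path, "w", encoding="utf-8") as handle:
    handle.write(str(myDictionary))

# Parse the text back into a dict without executing arbitrary code.
with open(path, "r", encoding="utf-8") as handle:
    restored = ast.literal_eval(handle.read())

print(restored == myDictionary)
```

Anything beyond literals (class instances, functions, open files) will not survive this round trip, which is the gap pickle is meant to fill.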

Pedro Rhian
  • 1
    -1: Should save it as it is (i.e, a python object) so that we can read it later without hours waiting to run it again. Pickle allows us to store a python object to read later. – Catbuilts Oct 05 '18 at 04:07
  • This is an old answer coming back in the Low Quality Posts queue. It is not a bad solution in that it likely works for very simple dictionaries, but it's very reasonable for a `dict` to contain further depth of objects (which may be printed just by name) and/or objects without any or a complete string representation. – ti7 Feb 17 '20 at 22:40
  • 1
    To add to @ti7 's point, regardless of the technical merit of the answer, this post is not VLQ. If someone feels that this answer is inaccurate, they should downvote and/or comment explaining why, *not* flag it as VLQ. – EJoshuaS - Stand with Ukraine Feb 19 '20 at 02:02