I'm trying to export all data from CKAN instances for a big data project. How can I do this?
-
And how can I get all of its data? – Mootaz Jun 03 '16 at 08:23
-
What do you mean by all of its data? Do you have access to that CKAN instance? Do you want only the files, or do you also want the metadata? – Alex Palcuie Jun 06 '16 at 10:14
-
@AlexPalcuie, the metadata, as well as the CSV and XML files – Mootaz Jun 07 '16 at 14:36
-
All data from every CKAN instance, or all data from one instance? If it's one instance, which one is it? @ojdo's answer is pretty informative. – albert Jun 09 '16 at 18:23
-
Also, some harvesters are available here: https://github.com/opendatamonitor/ – Ulrich Jun 13 '16 at 15:56
1 Answer
Have you had a look at CKAN's extensive API documentation? To get started, see the answer to the related question "How do I get a full list of datasets available on Data.Gov using the CKAN API?" and its example of using package_search.
Once you have a list of dataset IDs, you can get their metadata using the package_show API function.
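As a rough sketch of those two steps, the following pages through package_search and then fetches a single dataset with package_show. It assumes the instance exposes the standard Action API under /api/3/action/ and returns the usual `success`/`result` JSON envelope. The helper names (`call_action`, `iter_datasets`, `get_metadata`) and the optional `fetch` hook are made up for illustration, and the page size of 100 is an arbitrary choice.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def call_action(base_url, action, params, fetch=None):
    """Call a CKAN Action API function and return its 'result' payload.

    `fetch` is a hypothetical hook (url -> response text) so that the
    transport can be swapped out, e.g. for testing without a network.
    """
    url = f"{base_url.rstrip('/')}/api/3/action/{action}?{urlencode(params)}"
    if fetch is None:
        # Default transport: plain HTTP GET with the standard library.
        with urlopen(url) as response:
            body = response.read().decode("utf-8")
    else:
        body = fetch(url)
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action {action} failed: {payload.get('error')}")
    return payload["result"]


def iter_datasets(base_url, page_size=100, fetch=None):
    """Yield every dataset dict by paging through package_search."""
    start = 0
    while True:
        result = call_action(
            base_url, "package_search", {"rows": page_size, "start": start}, fetch
        )
        datasets = result["results"]
        if not datasets:
            break
        yield from datasets
        start += len(datasets)
        if start >= result["count"]:
            break


def get_metadata(base_url, dataset_id, fetch=None):
    """Return the full metadata dict of one dataset via package_show."""
    return call_action(base_url, "package_show", {"id": dataset_id}, fetch)
```

Usage would look something like `for ds in iter_datasets("https://demo.ckan.org"): print(ds["id"])`, where the URL stands in for whichever instance you are exporting from. Large instances may rate-limit or cap `rows`, so a smaller page size and a pause between requests might be needed.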
The package url field sometimes points to directly downloadable files; in other cases it seems to link to data provider pages, which might need some individual handling (i.e. custom code). Still, restricting the batch download to recognised file types (e.g. *.csv) should already get you pretty far.
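A minimal sketch of that file-type filter is below. It assumes each dataset dict (as returned by package_show) carries a `resources` list whose entries have a `url` key, which is how CKAN results are usually structured, but check this against your instance. The names `WANTED_EXTENSIONS`, `downloadable_urls` and `download` are hypothetical, and the extension set simply mirrors the CSV and XML files mentioned in the comments.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

# File types to keep; anything else is skipped as "needs manual handling".
WANTED_EXTENSIONS = {".csv", ".xml"}


def downloadable_urls(dataset, extensions=WANTED_EXTENSIONS):
    """Return the resource URLs of a dataset whose path ends in a wanted extension."""
    urls = []
    for resource in dataset.get("resources", []):
        url = resource.get("url") or ""
        # Look only at the path, so query strings do not hide the extension.
        ext = os.path.splitext(urlparse(url).path)[1].lower()
        if ext in extensions:
            urls.append(url)
    return urls


def download(url, target_dir="."):
    """Save one URL into target_dir, named after the last path segment."""
    filename = os.path.basename(urlparse(url).path)
    target = os.path.join(target_dir, filename)
    urlretrieve(url, target)
    return target
```

Matching on the URL's extension is a deliberately crude heuristic: it will miss files served from extension-less URLs, and two resources with the same file name would overwrite each other in `download`, so prefixing the dataset ID to the file name may be worthwhile.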