6

I want to train a model for a wide variety of classes. So, I would like to get an ImageNet 21k. Where is it possible to get it (besides the official website as they are not very responsive)?

Patrick Hoefler
M. Romanov
  • What difficulties are you having with image-net.org's website downloads? –  Mar 11 '17 at 16:51
  • 1
    The ImageNet dataset cannot be downloaded from the downloads page until you submit a registration application and it is approved. The main trouble is that my colleague submitted one in January and still hasn't received access. – M. Romanov Mar 13 '17 at 09:09
  • Probably, the way to go is to download the image URLs and then fetch the images from those URLs. Keep in mind, though, that many images may have been removed. – M. Romanov Mar 13 '17 at 09:11
  • 1
    It looks like this is no longer an option. Do you know how I can obtain this dataset now? – zlenyk Jan 12 '21 at 21:52

3 Answers

3

It seems that only winter21_whole.tar.gz is now available on the official site. fall11_whole.tar can no longer be obtained there; the only remaining source appears to be the one mentioned by @Lissanro Rayen.

A preprocessing script is given here.

To untar the tarball:

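#!/bin/sh
# Extract each per-class tarball into its own directory under DEST.
SRC='/path/to/winter21_whole'
DEST='/path/to/untar/imagenet21k'
for tarball in "$SRC"/*.tar; do
    # Directory name is the tarball's base name without the .tar suffix.
    tardir=$(basename "$tarball" .tar)
    # Skip classes that were already extracted.
    if [ ! -d "$DEST/$tardir" ]; then
        echo "untarring $tarball into $DEST/$tardir"
        mkdir -p "$DEST/$tardir"
        tar -xf "$tarball" -C "$DEST/$tardir"
    fi
done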
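
Once extraction has finished, a quick sanity check is to count how many directories contain images and how many images there are in total. This is only a minimal sketch: the function name `count_extracted` is made up for illustration, and it assumes the images keep the `.JPEG` extension seen in the file names in this answer.

```sh
#!/bin/sh
# Hypothetical helper: print "<directories containing images> <images>"
# for an extracted tree rooted at $1.
count_extracted() {
    root=$1
    # Total number of .JPEG files anywhere under the root.
    images=$(find "$root" -type f -name '*.JPEG' | wc -l | tr -d ' ')
    # Number of distinct parent directories holding at least one .JPEG.
    classes=$(find "$root" -type f -name '*.JPEG' \
        | sed 's|/[^/]*$||' | sort -u | wc -l | tr -d ' ')
    echo "$classes $images"
}
```

For example, `count_extracted /path/to/untar/imagenet21k` prints two numbers that can be compared against whatever counts the dataset's own documentation states.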

The following JPEG files in winter21_whole.tar.gz were found to be corrupted:

n01678043/n01678043_6448.JPEG 
n01896844/n01896844_997.JPEG 
n02368116/n02368116_318.JPEG 
n02428089/n02428089_710.JPEG 
n02487347/n02487347_1956.JPEG 
n02597972/n02597972_5463.JPEG 
n03957420/n03957420_33553.JPEG 
n03957420/n03957420_30695.JPEG 
n03957420/n03957420_8296.JPEG 
n04135315/n04135315_9318.JPEG 
n04135315/n04135315_8814.JPEG 
n04257684/n04257684_9033.JPEG 
n04427559/n04427559_2974.JPEG 
n06470073/n06470073_47249.JPEG 
n07930062/n07930062_4147.JPEG 
n09224725/n09224725_3995.JPEG 
n09359803/n09359803_8155.JPEG 
n09894445/n09894445_7463.JPEG 
n12353203/n12353203_3849.JPEG 
n12630763/n12630763_8018.JPEG
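
One way to deal with these files is simply to delete them from the extracted tree before training. The sketch below assumes the paths above have been saved, one per line, to a text file (the name `corrupted.txt` and the function name `remove_listed` are both made up for illustration) and that they are relative to the extraction root.

```sh
#!/bin/sh
# Hypothetical helper: delete every file named in a list.
#   $1 = extraction root, $2 = text file with one relative path per line
remove_listed() {
    root=$1
    list=$2
    while IFS= read -r rel || [ -n "$rel" ]; do
        # Strip trailing whitespace (the list may carry stray spaces or CRs).
        rel=$(printf '%s' "$rel" | sed 's/[[:space:]]*$//')
        # Ignore blank lines.
        [ -z "$rel" ] && continue
        if [ -f "$root/$rel" ]; then
            rm -- "$root/$rel"
            echo "removed $rel"
        else
            echo "not found: $rel" >&2
        fi
    done < "$list"
}
```

Usage would look like `remove_listed /path/to/untar/imagenet21k corrupted.txt`. Paths that do not exist are reported on stderr and skipped, so the function can be re-run safely.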
2

The only place I found where it is possible to actually download full ImageNet: https://academictorrents.com/details/564a77c1e1119da199ff32622a1609431b9f1c47. All other places where I looked either provide just a small portion of it or have broken links, even image-net.org.

1

To elaborate on what @MeadowMuffins wrote:

See the article "ImageNet-21K Pretraining for the Masses" for more details on how to pretrain on this dataset. It is more complicated than with regular ImageNet-1K, but the pretraining quality is much (much) better. https://github.com/Alibaba-MIIL/ImageNet21K

mr_t