A more ambitious image classification dataset : CIFAR-100

Author

Jeremy Fix

Keywords

Keras tutorial, CIFAR-100

Objectives

We now turn to the more difficult problem of classifying RGB images belonging to one of 100 classes with the CIFAR-100 dataset. The CIFAR-100 dataset consists of 60000 32x32 colour images in 100 classes, with 600 images per class: 50000 training images and 10000 test images. The 100 classes are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). To give you an idea, the coarse labels include fruits, fish, aquatic mammals, vehicles, … while the fine labels are, for example, seal, whale, orchid, bicycle, bus, … Keras provides functions to automatically download the CIFAR-100 dataset.
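As a minimal sketch, the dataset can be fetched with the `cifar100` module of `tensorflow.keras.datasets` (the standalone `keras` package exposes the same module); the first call downloads the data:

```python
from tensorflow.keras.datasets import cifar100

# label_mode="fine" returns the 100 class labels;
# use label_mode="coarse" for the 20 superclass labels.
(x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode="fine")

print(x_train.shape)  # (50000, 32, 32, 3)
print(x_test.shape)   # (10000, 32, 32, 3)
```

The images come as uint8 arrays in [0, 255] and the labels as column vectors of class indices, so you will typically rescale the inputs and one-hot encode the targets before training.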

Classical dataset augmentation on CIFAR-100 includes:

  • feature wise standardization
  • horizontal flip
  • zero padding of 4 pixels on each side, with random crops of 32x32.

For the last augmentation, you can make use of width_shift_range, height_shift_range, fill_mode="constant" and cval=0.
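A possible ImageDataGenerator configuration combining the three augmentations above (a sketch, assuming tensorflow.keras): a shift range of 0.125 corresponds to shifts of up to 4 pixels on a 32x32 image, and fill_mode="constant" with cval=0 zero-fills the border revealed by the shift.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    featurewise_center=True,              # feature-wise standardization:
    featurewise_std_normalization=True,   # subtract mean, divide by std
    horizontal_flip=True,                 # random horizontal flips
    width_shift_range=0.125,              # 4 / 32 pixels
    height_shift_range=0.125,
    fill_mode="constant",
    cval=0.0,                             # zero padding at the borders
)

# The feature-wise statistics must be computed on the training set first:
# datagen.fit(x_train)
# model.fit(datagen.flow(x_train, y_train, batch_size=64), epochs=...)
```

Note that random shifts with constant zero fill only approximate the "pad 4 pixels then random-crop 32x32" scheme of the papers below, but the two are equivalent up to the position of the zero border.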

Below is a list of recent papers published on arXiv; I propose that you try to reimplement their architectures and training setups:

The following papers are trickier to implement :

If you wish to get an idea of the state of the art in 2015 on CIFAR-100, I invite you to visit the classification scores website.

References

Chollet, F. (2016). Xception: Deep learning with depthwise separable convolutions. CoRR, abs/1610.02357. Retrieved from http://arxiv.org/abs/1610.02357
Gastaldi, X. (2017). Shake-shake regularization. CoRR, abs/1705.07485. Retrieved from http://arxiv.org/abs/1705.07485
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., … Adam, H. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs/1704.04861. Retrieved from http://arxiv.org/abs/1704.04861
Huang, G., Liu, Z., & Weinberger, K. Q. (2016). Densely connected convolutional networks. CoRR, abs/1608.06993. Retrieved from http://arxiv.org/abs/1608.06993
Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., … Murphy, K. (2016). Speed/accuracy trade-offs for modern convolutional object detectors. CoRR, abs/1611.10012. Retrieved from http://arxiv.org/abs/1611.10012
Iandola, F. N., Moskewicz, M. W., Ashraf, K., Han, S., Dally, W. J., & Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. CoRR, abs/1602.07360. Retrieved from http://arxiv.org/abs/1602.07360
Zagoruyko, S., & Komodakis, N. (2016). Wide residual networks. CoRR, abs/1605.07146. Retrieved from http://arxiv.org/abs/1605.07146
Zhang, X., Zhou, X., Lin, M., & Sun, J. (2017). ShuffleNet: An extremely efficient convolutional neural network for mobile devices. CoRR, abs/1707.01083. Retrieved from http://arxiv.org/abs/1707.01083
Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2017). Learning transferable architectures for scalable image recognition. CoRR, abs/1707.07012. Retrieved from http://arxiv.org/abs/1707.07012