
Deep Learning Datasets: MNIST, SVHN, CIFAR, ImageNet

jiheek 2022. 4. 11. 22:02

I keep forgetting the image sizes, so here is a quick summary of the common datasets!

 

1. MNIST

 

MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges

 

yann.lecun.com

train set 60,000, test set 10,000, 10 classes

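Size: 28x28 grayscale, 8-bit pixels (0-255); the 32-bit integers in the IDX files are only the header fields (magic number, image count, rows, columns)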
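To keep the file layout straight, here is a minimal sketch of reading the header of an MNIST image file in IDX format. It assumes the layout described on the MNIST page (four big-endian 32-bit integers followed by one unsigned byte per pixel). The function name `parse_idx3_header` is my own, and the header bytes are built by hand rather than read from a download.

```python
import struct


def parse_idx3_header(buf: bytes) -> dict:
    """Read the 16-byte header of an IDX3 image file (big-endian integers)."""
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:  # 0x00000803: unsigned-byte data with 3 dimensions
        raise ValueError(f"unexpected magic number: {magic}")
    return {
        "count": count,
        "rows": rows,
        "cols": cols,
        # Pixel data follows the header at one byte per pixel.
        "pixel_bytes": count * rows * cols,
    }


# Hand-built header matching the training image file (assumption: no real file is read).
header = struct.pack(">IIII", 2051, 60000, 28, 28)
info = parse_idx3_header(header)
print(info)
```

With a real file, the first 16 bytes of the decompressed training image file would be passed in instead of the hand-built header.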

 

 

 

2. SVHN (The Street View House Numbers)

 

The Street View House Numbers (SVHN) Dataset

SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor to MNIST (e.g., the images are of small cropped digits),

ufldl.stanford.edu

73,257 digits for training, 26,032 digits for testing, and 531,131 additional, somewhat less difficult samples usable as extra training data; 10 classes

Size: 32x32 RGB (cropped-digit format)
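Since the three SVHN splits are easy to mix up, a small sketch of the arithmetic may help. The split sizes are the ones listed above; the dictionary name `svhn_splits` is just for illustration.

```python
# Split sizes as listed on the SVHN page.
svhn_splits = {"train": 73257, "test": 26032, "extra": 531131}

# Total number of labeled digit images across all three splits.
total = sum(svhn_splits.values())

# A common choice is to train on train + extra; whether that suits a given
# experiment is an assumption, not something the dataset page prescribes.
train_plus_extra = svhn_splits["train"] + svhn_splits["extra"]

print(f"total: {total:,}")
print(f"train + extra: {train_plus_extra:,}")
```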

 

 

 

3. CIFAR

 

CIFAR-10 and CIFAR-100 datasets

The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. The CIFAR-10 dataset consists of 60000

www.cs.toronto.edu

- CIFAR-10: train set 50,000, test set 10,000, 10 classes

- CIFAR-100: train set 50,000, test set 10,000, 100 classes (per class: 500 train, 100 test)

Size: 32x32 RGB
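The per-class counts and the raw memory footprint follow directly from the numbers above. This is a rough sketch that assumes perfectly class-balanced splits and one byte (uint8) per channel value; the helper names `per_class` and `raw_bytes` are made up for this post.

```python
def per_class(n_images: int, n_classes: int) -> int:
    """Images per class, assuming the split is perfectly class-balanced."""
    if n_images % n_classes:
        raise ValueError("split is not evenly divisible by the class count")
    return n_images // n_classes


def raw_bytes(n_images: int, height: int = 32, width: int = 32, channels: int = 3) -> int:
    """Uncompressed size, assuming one byte (uint8) per channel value."""
    return n_images * height * width * channels


cifar10_train_per_class = per_class(50000, 10)
cifar100_train_per_class = per_class(50000, 100)
cifar100_test_per_class = per_class(10000, 100)

# Raw pixel data of a 50,000-image training set, in MiB.
train_mib = raw_bytes(50000) / 2**20

print(cifar10_train_per_class, cifar100_train_per_class, cifar100_test_per_class)
print(f"{train_mib:.1f} MiB")
```

So a whole CIFAR training set fits comfortably in memory as a single uint8 array, which is one reason it is so convenient for quick experiments.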

 

 

 

4. ImageNet

 

ImageNet

The most highly-used subset of ImageNet is the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012-2017 image classification and localization dataset. This dataset spans 1000 object classes and contains 1,281,167 training

image-net.org

1,281,167 training images, 50,000 validation images and 100,000 test images, 1000 classes

Size: varies per image; about 469x387 on average, RGB
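Unlike the datasets above, ImageNet images do not share one size, so they are typically resized before batching. Below is a minimal sketch of an aspect-preserving resize computation. The target of 256 for the shorter side is an assumption (a commonly used value, not something the ImageNet page specifies), reading 469x387 as width x height is also an assumption, and `resize_shorter_side` is a hypothetical helper.

```python
def resize_shorter_side(width: int, height: int, target: int = 256) -> tuple[int, int]:
    """Scale (width, height) so the shorter side equals `target`, keeping aspect ratio."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


# Average ImageNet size from above, read as width x height (assumption).
avg_width, avg_height = 469, 387
resized = resize_shorter_side(avg_width, avg_height)

# Pixel count of an average image compared with a 32x32 CIFAR image.
pixel_ratio = (avg_width * avg_height) / (32 * 32)

print(resized)
print(f"{pixel_ratio:.0f}x the pixels of a 32x32 image")
```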