Accelerating AlexNet by Reducing Image Resolution

Running AlexNet in Caffe on 128×128 images instead of 256×256 images, we observed a 5.4x test-time speedup and a drop of less than 2 percentage points in ImageNet accuracy:

Input     Crop      Top-1 accuracy   Top-5 accuracy   Test-time frame rate
256×256   227×227   57.1%            80.2%            624 fps
128×128   99×99     55.9% (-1.2)     78.7% (-1.5)     3368 fps (5.4x speedup)

A smaller input produces smaller feature maps, which reduces the amount of work that every convolutional layer needs to perform.
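
To make the reduction in convolutional work concrete, here is a rough sketch that counts multiply-accumulates (MACs) in the five convolutional layers for a given crop size. The layer hyperparameters (kernel sizes, strides, padding, groups, and the 3×3 stride-2 pooling) are assumptions taken from the standard AlexNet definition, not from our measurements, and the function names are made up for illustration.

```python
# Rough MAC count for the five AlexNet conv layers at a given crop size.
# All layer hyperparameters are assumed from the standard AlexNet definition.

def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a conv or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# (name, in_channels, out_channels, kernel, stride, pad, groups, pool_after)
CONV_LAYERS = [
    ("conv1", 3, 96, 11, 4, 0, 1, True),
    ("conv2", 96, 256, 5, 1, 2, 2, True),
    ("conv3", 256, 384, 3, 1, 1, 1, False),
    ("conv4", 384, 384, 3, 1, 1, 2, False),
    ("conv5", 384, 256, 3, 1, 1, 2, True),
]

def conv_macs(crop):
    """Return ({layer: (output_side, macs)}, side of the final pooled map)."""
    size = crop
    per_layer = {}
    for name, cin, cout, k, stride, pad, groups, pool in CONV_LAYERS:
        size = out_size(size, k, stride, pad)
        # Each output value needs (cin / groups) * k * k multiply-accumulates.
        per_layer[name] = (size, size * size * cout * (cin // groups) * k * k)
        if pool:
            size = out_size(size, 3, 2)  # assumed 3x3 max pooling, stride 2
    return per_layer, size

def total_macs(crop):
    per_layer, _ = conv_macs(crop)
    return sum(macs for _, macs in per_layer.values())

if __name__ == "__main__":
    for crop in (227, 99):
        per_layer, final = conv_macs(crop)
        print(crop, {name: side for name, (side, _) in per_layer.items()}, final)
    print("conv MAC ratio:", total_macs(227) / total_macs(99))
```

Under these assumptions the conv1 output shrinks from 55×55 to 23×23 and the final pooled map from 6×6 to 2×2, so the convolutional MAC count drops by roughly 6x. This counts arithmetic only; measured frame rates also depend on memory traffic and on the fully connected layers.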


  • In Caffe’s default AlexNet configuration, we train and test with 256×256 images, with randomized 227×227 crops for training and central 227×227 crops for testing.
  • In our 128×128 experiment, we train and test with 99×99 crops. This keeps the same 29-pixel margin between image and crop (256-227=29 and 128-99=29). 99×99 crops contain about 5.26x fewer pixels than 227×227 crops. Other than this dimension change, our 128×128 experiments are identical to the default Caffe AlexNet configuration.
  • Speed tests were performed on an NVIDIA K40 with CUDA 6.5, and Caffe compiled with cuDNN version 1.
  • For 256×256 images, Alex Krizhevsky et al. reported slightly higher accuracy (~82% top-5) than we achieve in Caffe. This may be related to data augmentation settings in training and/or testing.
  • All models were trained on ILSVRC2012-train and tested on ILSVRC2012-val.
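
The crop arithmetic in the notes above can be checked with a few lines of Python. This is only a sketch: `crop_stats` is a hypothetical helper, and the central-crop offset of `margin // 2` is an assumption about how a central crop is typically positioned, not something we measured.

```python
def crop_stats(image_side, crop_side):
    """Return (margin, assumed central-crop offset, pixels per crop)."""
    margin = image_side - crop_side
    center_offset = margin // 2  # assumption: central crop starts at margin // 2
    return margin, center_offset, crop_side * crop_side

margin_256, offset_256, pixels_227 = crop_stats(256, 227)
margin_128, offset_128, pixels_99 = crop_stats(128, 99)

print(margin_256, margin_128)             # 29 29
print(round(pixels_227 / pixels_99, 2))   # 5.26
print(round(3368 / 624, 1))               # 5.4
```

The pixel ratio (about 5.26x) and the measured speedup (5.4x) are close, which is consistent with test-time cost scaling roughly with the number of input pixels on this setup.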
