Some fun with Marvin

Marvin is a deep learning framework for working with convolutional neural networks. We at the Princeton Vision Group wrote it to be really simple and easy to modify, while still working with higher-dimensional data on the GPU.*

Anyway, I'm going to train a network to tell dogs from cats using the Oxford Pets database, which contains thousands of pictures of different breeds of cats and dogs.

Step 1: Prep the data
  • Download the data here
  • Convert the data into the Marvin data format (Matlab script here)
Step 2: Train the network
  • Define the network architecture (example here)
  • Start from a network pre-trained on ImageNet (it has already learned good general image features, so it only needs fine-tuning; download weights here)
  • TRAIN
Step 3: Test the network
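
For the data prep in Step 1, the cat/dog label can be read straight off the file names. As far as I know, the Oxford-IIIT Pet images are named so that cat breeds start with a capital letter and dog breeds with a lowercase one; check that against the dataset's README before relying on it. Here is a minimal Python sketch of that labeling step. It is not the Matlab script linked above, and the function names and the 0/1 label encoding are my own assumptions:

```python
from pathlib import Path

# Hypothetical label encoding for this sketch: 0 = cat, 1 = dog.
CAT, DOG = 0, 1

def species_from_filename(filename):
    """Guess cat vs. dog from the case of the first letter of the file name.

    Assumes the dataset convention that cat breeds are capitalized
    (e.g. "Abyssinian_1.jpg") and dog breeds are lowercase (e.g. "beagle_12.jpg").
    """
    stem = Path(filename).stem
    if not stem or not stem[0].isalpha():
        raise ValueError(f"cannot infer species from {filename!r}")
    return CAT if stem[0].isupper() else DOG

def label_images(filenames):
    """Pair each file name with its inferred binary label."""
    return [(name, species_from_filename(name)) for name in filenames]
```

Calling `label_images(["Abyssinian_1.jpg", "beagle_12.jpg"])` pairs the first file with the cat label and the second with the dog label.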

~90% accuracy; not bad! I feel like the bottom-left image is pretty hard to tell until you see the eyes.
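
For reference, the accuracy number is just the fraction of test images whose predicted label matches the true one. A small Python sketch of that calculation, assuming the predictions and ground-truth labels have already been pulled out of the test run into two plain lists (the function name is hypothetical, and this is not how Marvin itself reports results):

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one label to compute accuracy")
    # Count the positions where the prediction equals the true label.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With ten test images and one mistake, this gives 0.9, which matches the ballpark above.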

*Eventually I might make an effort to make this blog more understandable to people outside the field, but for now I'm just logging what I'm up to every day.