2 Pre-Trained Networks

This chapter covers:

Running pre-trained image recognition models on sample data
An introduction to GANs (generative adversarial networks) and CycleGAN
Captioning models that can produce text descriptions of images
Sharing models through Torch Hub

We closed our first chapter promising to unveil amazing things in this chapter, and now it’s time to deliver.

Computer vision is certainly one of the fields that have been most impacted by the advent of deep learning, for a variety of reasons. The need to classify or interpret the content of natural images existed, very large datasets became available, and new constructs such as convolutional layers were invented and could be run quickly on GPUs with unprecedented accuracy. All of this combined with the motivation of the internet giants to understand pictures shot by millions of users through their mobile devices and managed on said giants' platforms. Quite the perfect storm.

2.1 A pre-trained network that recognizes the subject of an image

2.1.1 Obtaining a pre-trained network for image recognition

2.1.2 AlexNet

2.1.3 ResNet

2.1.4 Ready, set, almost run

2.1.5 Run!

2.2 A pre-trained model that fakes it until it makes it

2.2.1 The GAN game

2.2.2 CycleGAN

2.2.3 A network that turns horses into zebras

2.3 A pre-trained network that describes scenes

2.3.1 NeuralTalk2

2.4 Torch Hub

2.5 Conclusion

2.6 Exercises

2.7 Summary