PyTorch vs. TensorFlow: Which Framework Is Best for Your Deep Learning Project?
If you are reading this, you've probably already started your journey into deep learning. If you are new to the field, deep learning is, in simple terms, an approach to building computers that solve real-world problems using brain-inspired architectures called artificial neural networks. To help develop these architectures, tech giants like Google, Facebook and Uber have released various frameworks for the Python deep learning environment, making it easier to learn, build and train diverse neural networks. In this article, we’ll take a brief look at two of the most widely used and relied-upon Python frameworks and compare them: PyTorch vs. TensorFlow.
Table of Contents
- Google’s TensorFlow
- Facebook’s PyTorch
- What Can We Build With TensorFlow and PyTorch?
- Comparing PyTorch and TensorFlow
- Pros and Cons of PyTorch and TensorFlow
- PyTorch and TF Installation, Versions, Updates
- TensorFlow vs. PyTorch: My Recommendation
Google’s TensorFlow
TensorFlow is an open source deep learning framework created by developers at Google and released in 2015. The official research is published in the paper “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.”
TensorFlow is now widely used by companies of all sizes, from startups to large enterprises, to automate tasks and develop new systems. It draws its reputation from its distributed training support, scalable production and deployment options, and support for various devices like Android.
Facebook’s PyTorch
PyTorch is one of the latest deep learning frameworks. It was developed by the team at Facebook and open sourced on GitHub in 2017. You can read more about its development in the research paper "Automatic Differentiation in PyTorch."
PyTorch is gaining popularity for its simplicity, ease of use, dynamic computational graph and efficient memory usage, which we'll discuss in more detail later.
What Can We Build With TensorFlow and PyTorch?
Initially, neural networks were used to solve simple classification problems like handwritten digit recognition or identifying a car’s registration number using cameras. But thanks to the latest frameworks and NVIDIA’s high-performance graphics processing units (GPUs), we can train neural networks on terabytes of data and solve far more complex problems. A few notable achievements include reaching state-of-the-art performance on the ImageNet dataset using convolutional neural networks implemented in both TensorFlow and PyTorch. The trained model can be used in different applications, such as object detection, semantic image segmentation and more.
Although a neural network's architecture can be implemented in either framework, the results will not be identical. The training process has many framework-dependent parameters. For example, if you are training a model in PyTorch, you can speed up training with GPUs through CUDA, NVIDIA’s parallel computing platform. TensorFlow can also use GPUs, but it manages device placement and acceleration through its own runtime, so training time will vary with the framework you choose.
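As a rough illustration of GPU access in PyTorch, the sketch below picks a CUDA device when one is available and falls back to the CPU otherwise. The tensor shapes and variable names are only illustrative assumptions, not code from any particular project.

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device and run a small computation.
x = torch.ones(2, 3, device=device)
y = (x * 2).sum()

print(device.type, y.item())
```

The same script runs unchanged on machines with or without a GPU, which is why this device-selection pattern is so common in PyTorch code.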
Top TensorFlow Projects
Sonnet: Sonnet is a library built on top of TensorFlow for building complex neural networks. (https://sonnet.dev/)
Ludwig: Ludwig is a toolbox to train and test deep learning models without the need to write code. (https://uber.github.io/ludwig/)
Top PyTorch Projects
CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning. (https://stanfordmlgroup.github.io/projects/chexnet/)
Horizon: A platform for applied reinforcement learning (Applied RL) (https://horizonrl.com)
These are a few frameworks and projects that are built on top of TensorFlow and PyTorch. You can find more on GitHub and the official websites of TF and PyTorch.
Comparing PyTorch and TensorFlow
The key difference between PyTorch and TensorFlow is the way they execute code. Both frameworks work on the same fundamental data type, the tensor. You can imagine a tensor as a multi-dimensional array, as shown in the picture below.
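To make the idea of a tensor concrete, here is a minimal sketch using PyTorch (TensorFlow tensors behave similarly). The variable names are hypothetical; each tensor simply has one more dimension than the previous one.

```python
import torch

scalar = torch.tensor(5.0)                        # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])            # 1 dimension
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2 dimensions
cube = torch.zeros(2, 3, 4)                       # 3 dimensions

# Print the number of dimensions and the shape of each tensor.
for t in (scalar, vector, matrix, cube):
    print(t.dim(), tuple(t.shape))
```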
1. Mechanism: Dynamic vs Static graph definition
TensorFlow is a framework composed of two core building blocks:
- A library for defining computational graphs and runtime for executing such graphs on a variety of different hardware.
- A computational graph which has many advantages (but more on that in just a moment).
A computational graph is an abstract way of describing computations as a directed graph. A graph is a data structure consisting of nodes (vertices) and edges. It’s a set of vertices connected pairwise by directed edges.
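The definition above can be sketched in a few lines of plain Python. This is a toy illustration only, not how either framework stores its graphs: the dictionary layout and the `evaluate` helper are assumptions made for the example.

```python
import operator

# Each node maps to (operation, names of its input nodes).
# Input nodes have no operation; their values are fed in at evaluation time.
graph = {
    "a": (None, []),
    "b": (None, []),
    "mul": (operator.mul, ["a", "b"]),    # mul = a * b
    "out": (operator.add, ["mul", "a"]),  # out = mul + a
}

def evaluate(graph, node, feeds, cache=None):
    """Evaluate `node` by first evaluating the nodes its incoming edges come from."""
    cache = {} if cache is None else cache
    if node in cache:
        return cache[node]
    op, inputs = graph[node]
    if op is None:
        value = feeds[node]
    else:
        value = op(*(evaluate(graph, name, feeds, cache) for name in inputs))
    cache[node] = value
    return value

result = evaluate(graph, "out", {"a": 3.0, "b": 4.0})
print(result)  # 3 * 4 + 3 = 15.0
```

Because the graph is just data, a runtime can inspect it before running anything, which is what makes scheduling and parallel execution possible.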
When you run code in TensorFlow, the computation graphs are defined statically. All communication with the outer world is performed via a tf.Session object and tf.placeholder tensors, which are substituted by external data at runtime.
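A minimal sketch of this define-then-run style is shown below. It assumes a TensorFlow 2.x installation, where the session and placeholder APIs live under `tf.compat.v1`; in TensorFlow 1.x the same calls are `tf.placeholder` and `tf.Session`. The node names and input values are illustrative.

```python
import tensorflow as tf

# Step 1: define the graph. Nothing is computed here.
graph = tf.Graph()
with graph.as_default():
    a = tf.compat.v1.placeholder(tf.float32, shape=(), name="a")
    b = tf.compat.v1.placeholder(tf.float32, shape=(), name="b")
    c = a * b + 1.0  # adds multiply and add nodes to the graph

# Step 2: run the graph in a session, feeding data into the placeholders.
with tf.compat.v1.Session(graph=graph) as sess:
    result = sess.run(c, feed_dict={a: 3.0, b: 4.0})

print(result)  # 3 * 4 + 1 = 13.0
```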
This is how TensorFlow generates a computational graph statically, before the code is run. The core advantage of having a computational graph is that it allows parallelism and dependency-driven scheduling, which make training faster and more efficient.
Similar to TensorFlow, PyTorch has two core building blocks:
- Imperative and dynamic building of computational graphs.
- Autograd: Performs automatic differentiation of the dynamic graphs.
As you can see in the animation below, the graphs change and execute nodes as you go, with no special session interfaces or placeholders. Overall, the framework is more tightly integrated with the Python language and feels more native most of the time. Hence, PyTorch feels like a Pythonic framework, whereas TensorFlow feels like a completely new language.
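The same behavior can be sketched in code. In the example below, ordinary Python control flow decides which operations are recorded, and autograd differentiates whichever graph was actually built. The `forward` function and the input values are hypothetical.

```python
import torch

def forward(x):
    # Plain Python branching: the graph depends on the value of x at runtime.
    if x.item() > 0:
        return x * x
    return -x

x = torch.tensor(3.0, requires_grad=True)
y = forward(x)   # the graph is built while this line runs
y.backward()     # autograd walks the recorded graph backwards
print(x.grad)    # dy/dx = 2x = 6

x2 = torch.tensor(-2.0, requires_grad=True)
forward(x2).backward()
print(x2.grad)   # dy/dx = -1 for the other branch
```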
How much these differences matter depends on your use case. TensorFlow can build dynamic graphs through a separate library called TensorFlow Fold, whereas PyTorch has this capability built in.
2. Distributed Training
One main feature that distinguishes PyTorch from TensorFlow is data parallelism. PyTorch optimizes performance by taking advantage of Python's native support for asynchronous execution. In TensorFlow, you have to manually code and fine-tune every operation to run on a specific device to allow distributed training. You can replicate everything from PyTorch in TensorFlow, but it takes more effort. In PyTorch, by contrast, enabling data-parallel training for a model takes only a few lines of code.
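As a minimal sketch of data parallelism in PyTorch, the example below wraps a model in `torch.nn.DataParallel` when more than one GPU is visible. Note that `DataParallel` covers the single-machine, multi-GPU case; training across several machines typically uses `torch.nn.parallel.DistributedDataParallel` instead. The toy model and batch size are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # toy stand-in for a real network

# Replicate the model across GPUs when more than one is available.
# On a CPU-only or single-GPU machine the model is left unchanged.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Each forward pass splits the batch across the available devices.
batch = torch.randn(8, 10, device=device)
output = model(batch)
print(output.shape)
```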
3. Visualization
When it comes to visualization of the training process, TensorFlow takes the lead. Visualization helps the developer track the training process and debug in a more convenient way. TensorFlow’s visualization library is called TensorBoard. PyTorch developers use Visdom, but its features are minimal and limited, so TensorBoard scores a point for visualizing the training process.
Features of TensorBoard
- Tracking and visualizing metrics such as loss and accuracy.
- Visualizing the computational graph (ops and layers).
- Viewing histograms of weights, biases or other tensors as they change over time.
- Displaying images, text and audio data.
- Profiling TensorFlow programs.
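As a small sketch of the first feature, the code below logs a scalar metric with the TensorFlow 2.x `tf.summary` API so that TensorBoard can plot it. The metric name, the fake loss values and the temporary log directory are assumptions made for the example.

```python
import os
import tempfile

import tensorflow as tf

# Write event files to a temporary directory (any directory works).
logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)

with writer.as_default():
    for step in range(5):
        fake_loss = 1.0 / (step + 1)  # placeholder values, not real training
        tf.summary.scalar("loss", fake_loss, step=step)

writer.flush()
print(os.listdir(logdir))
```

Running `tensorboard --logdir` with the directory printed above should then show the logged curve in the browser.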
Features of Visdom
- Handling callbacks.
- Plotting graphs and details.
- Managing environments.
4. Production Deployment
When it comes to deploying trained models to production, TensorFlow is the clear winner. We can deploy TensorFlow models directly using TensorFlow Serving, a serving framework that exposes a REST client API.
In PyTorch, production deployments became easier to handle in its latest 1.0 stable version, but it doesn't provide any framework to deploy models directly to the web. You'll have to use either Flask or Django as the backend server. So, TensorFlow Serving may be a better option if performance is a concern.
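To give a sense of what the Flask route involves, here is a minimal sketch that wraps a PyTorch model in a prediction endpoint. The `/predict` route, the JSON field names and the untrained toy model are all assumptions for illustration; a real service would load trained weights and validate its input.

```python
import torch
import torch.nn as nn
from flask import Flask, jsonify, request

app = Flask(__name__)

# Toy stand-in model; a real service would load trained weights here.
model = nn.Linear(4, 2)
model.eval()

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [[0.1, 0.2, 0.3, 0.4]]}.
    features = request.get_json()["features"]
    with torch.no_grad():
        output = model(torch.tensor(features, dtype=torch.float32))
    return jsonify({"prediction": output.tolist()})

# To serve locally, call app.run(port=5000) or use the `flask run` command.
```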
5. Defining a simple Neural Network in PyTorch and TensorFlow
Let's compare how we declare the neural network in PyTorch and TensorFlow.
In PyTorch, your neural network is a class, and the torch.nn package provides the layers needed to build your architecture. All the layers are first declared in the __init__() method, and then the forward() method defines how the input x passes through the layers of the network. Lastly, we declare a variable model and assign it to the defined architecture (model = NeuralNet()).
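A minimal sketch of such a class might look like the following. The layer sizes (784 inputs, 128 hidden units, 10 outputs) are assumptions chosen for illustration, as for a digit classifier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Declare the layers of the network.
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # Define how the input flows through the layers.
        x = F.relu(self.fc1(x))
        return self.fc2(x)

model = NeuralNet()

# Quick check with a random batch of 4 flattened 28x28 inputs.
logits = model(torch.randn(4, 784))
print(logits.shape)
```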
Keras, a neural network framework that uses TensorFlow as its backend, was recently merged into the TF repository. Since then, the syntax for declaring layers in TensorFlow has been similar to that of Keras. First, we declare a variable and assign it the type of architecture we want, in this case a Sequential() architecture. Next, we add layers one after another using the model.add() method. The layer types can be imported from tf.keras.layers.
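A comparable network in TensorFlow's Keras API can be sketched as follows, assuming TensorFlow 2.x. As in the PyTorch case, the layer sizes are illustrative assumptions rather than values from a specific model.

```python
import tensorflow as tf

# Declare the architecture type, then add layers in order.
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(784,)))
model.add(tf.keras.layers.Dense(128, activation="relu"))
model.add(tf.keras.layers.Dense(10, activation="softmax"))

# Quick check with a random batch of 4 flattened 28x28 inputs.
probs = model(tf.random.normal((4, 784)))
print(probs.shape)
```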
Pros and Cons of PyTorch and TensorFlow
TensorFlow Pros
- Simple built-in high-level API.
- Visualizing training with TensorBoard.
- Production-ready thanks to TensorFlow Serving.
- Easy mobile support.
- Open source.
- Good documentation and community support.
TensorFlow Cons
- Static graph.
- Cumbersome debugging method.
- Hard to make quick changes.
PyTorch Pros
- Python-like coding.
- Dynamic graph.
- Easy and quick editing.
- Good documentation and community support.
- Open source.
- Plenty of projects out there using PyTorch.
PyTorch Cons
- Third-party tool needed for visualization.
- API server needed for production.
PyTorch and TF Installation, Versions, Updates
PyTorch and TensorFlow both recently released new versions: PyTorch 1.0 (the first stable version) and TensorFlow 2.0 (currently in beta). Both versions have major updates and new features that make the training process more efficient, smooth and powerful.
To install the latest version of these frameworks on your machine, you can either build from source or install from pip.
PyTorch Installation
● macOS and Linux
pip3 install torch torchvision
● Windows
pip3 install https://download.pytorch.org/whl/cu90/torch-1.1.0-cp36-cp36m-win_amd64.whl
pip3 install https://download.pytorch.org/whl/cu90/torchvision-0.3.0-cp36-cp36m-win_amd64.whl
TensorFlow Installation
● macOS, Linux, and Windows
# Current stable release for CPU-only
pip install tensorflow
# Install TensorFlow 2.0 Beta
pip install tensorflow==2.0.0-beta1
To check whether your installation was successful, open your command prompt or terminal and confirm that each framework imports without errors.
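One simple way to run this check is to start a Python interpreter and import each framework, as in the sketch below. It assumes both packages were installed into the same environment; if either import fails, the installation for that framework did not succeed.

```python
import torch
import tensorflow as tf

# Printing the versions confirms that both imports succeeded.
print("PyTorch:", torch.__version__)
print("TensorFlow:", tf.__version__)

# Optional: report whether PyTorch can see a CUDA-capable GPU.
print("CUDA available to PyTorch:", torch.cuda.is_available())
```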
TensorFlow vs. PyTorch: My Recommendation
TensorFlow is a very powerful and mature deep learning library with strong visualization capabilities and several options for high-level model development. It has production-ready deployment options and support for mobile platforms. PyTorch, on the other hand, is still a young framework, but it has strong community momentum and is more Python-friendly.
My recommendation: if you want to move quickly and build AI-related products, TensorFlow is a good choice. PyTorch is mostly recommended for research-oriented developers, as it supports fast and dynamic training.