One of the hurdles that new Python developers have to get over is understanding the Python packaging ecosystem. This blog post is based on material covered in our Python for Programmers training course, which attempts to explain pip and virtualenv for new Python users.
Prerequisites
Python for Programmers is aimed at developers who are already familiar with one or more programming languages, and so we assume a certain amount of technical knowledge. It will help if you’re reasonably comfortable with a command line. The examples below use bash, which is the default shell on Macs and most Linux systems. But the commands are simple enough that the concepts should be transferable to any terminal, such as PowerShell for Windows.
pip
Let’s dive in. pip is a tool for installing Python packages from the Python Package Index.
PyPI (which you’ll occasionally see referred to as The Cheeseshop) is a repository for open-source third-party Python packages. It’s similar to RubyGems in the Ruby world, PHP’s Packagist, CPAN for Perl, and NPM for Node.js.
Python actually has another, more primitive, package manager called easy_install, which is installed automatically when you install Python itself. pip is vastly superior to easy_install for lots of reasons, and so should generally be used instead. You can use easy_install to install pip as follows:
$ sudo easy_install pip
You can then install packages with pip as follows (in this example, we’re installing Django):
# DON'T DO THIS
$ sudo pip install django
Here, we’re installing Django globally on the system. But in most cases, you shouldn’t install packages globally. Read on to find out why.
virtualenv
virtualenv solves a very specific problem: it allows multiple Python projects that have different (and often conflicting) requirements, to coexist on the same computer.
What problem does it solve?
To illustrate this, let’s start by pretending virtualenv doesn’t exist. Imagine we’re going to write a Python program that needs to make HTTP requests to a remote web server. We’re going to use the Requests library, which is brilliant for that sort of thing. As we saw above, we can use pip to install Requests.
But where on your computer does pip install the packages to? Here’s what happens if I try to run pip install requests:
$ pip install requests
Downloading/unpacking requests
Downloading requests-1.1.0.tar.gz (337Kb): 337Kb downloaded
Running setup.py egg_info for package requests
Installing collected packages: requests
Running setup.py install for requests
error: could not create '/Library/Python/2.7/site-packages/requests': Permission denied
Oops! It looks like pip is trying to install the package into /Library/Python/2.7/site-packages/requests. This is a special directory that Python knows about. Anything that’s installed in site-packages can be imported by your programs.
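If you want to see which site-packages directory a particular interpreter uses, you can ask Python itself. This is a minimal sketch using the standard library's sysconfig module; the path it prints depends on your operating system and on which python you run it with.

```python
import sysconfig

# "purelib" is the directory where pure-Python third-party packages get installed
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```

Running this with the system python and then with a different python should print different directories, which is the detail the rest of this post builds on.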
We’re seeing the error because /Library/ (on a Mac) is not usually writeable by “ordinary” users. To fix the error, we can run sudo pip install requests (sudo means “run this command as a superuser”). Then everything will work fine:
$ sudo pip install requests
Password:
Downloading/unpacking requests
Running setup.py egg_info for package requests
Installing collected packages: requests
Running setup.py install for requests
Successfully installed requests
Cleaning up...
This time it worked. We can now type python and try importing our new library:
>>> import requests
>>> requests.get('http://dabapps.com')
So, we now know that we can import requests and use it in our program. We go ahead and work feverishly on our new program, using requests (and probably lots of other libraries from PyPI too). The software works brilliantly, we make loads of money, and our clients are so impressed that they ask us to write another program to do something slightly different.
But this time, we find a brand new feature that’s been added to requests since we wrote our first program that we really need to use in our second program. So we decide to upgrade the requests library to get the new feature:
$ sudo pip install --upgrade requests
Everything seems fine, but we’ve unknowingly created a disaster!
The next time we run our original program (the one that made us loads of money), we discover that it has completely stopped working and is raising errors. Why? Because something in the API of the requests library has changed between the previous version and the one we just upgraded to. It might only be a small change, but it means our code no longer uses the library correctly. Everything is broken!
Sure, we could fix the code in our first program to use the new version of the requests API, but that takes time and distracts us from our new project. And, of course, a seasoned Python programmer won’t just have two projects but dozens, and each project might have dozens of dependencies! Keeping them all up-to-date and working with the same versions of every library would be a complete nightmare.
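When a program breaks after an upgrade like this, a useful first check is which version of the library the interpreter can actually see. The sketch below is one way to do that, assuming Python 3.8 or later (where importlib.metadata is part of the standard library); the helper name installed_version is our own, not part of pip or requests.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if requests is installed for this interpreter, else None
print(installed_version("requests"))
```

With a single global site-packages directory there is only ever one answer to this question, and every program on the machine has to live with it.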
How does virtualenv help?
virtualenv solves this problem by creating a completely isolated virtual environment for each of your programs. An environment is simply a directory that contains a complete copy of everything needed to run a Python program, including a copy of the python binary itself, a copy of the entire Python standard library, a copy of the pip installer, and (crucially) a copy of the site-packages directory mentioned above. When you install a package from PyPI using the copy of pip that’s created by the virtualenv tool, it will install the package into the site-packages directory inside the virtualenv directory. You can then use it in your program just as before.
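A program can check whether it is running from one of these isolated environments by comparing the interpreter's prefixes. This is a rough sketch rather than an official API: the function name in_virtual_environment is our own, and the check relies on the sys.real_prefix attribute that older virtualenv releases set and the sys.base_prefix attribute that newer releases rely on.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to be running inside a virtual environment."""
    # Older virtualenv releases record the original installation in sys.real_prefix
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise, sys.base_prefix differs from sys.prefix inside an environment
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(in_virtual_environment(), sys.prefix)
```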
How can I install virtualenv?
If you already have pip, the easiest way is to install it globally with sudo pip install virtualenv. Usually pip and virtualenv are the only two packages you ever need to install globally, because once you’ve got both of these you can do all your work inside virtual environments.
In fact, virtualenv comes with a copy of pip which gets copied into every new environment you create, so virtualenv is really all you need. You can even install it as a separate standalone package (rather than from PyPI). This might be easier for Windows users. See virtualenv.org for instructions.
How do I create a new virtual environment?
You only need the virtualenv tool itself when you want to create a new environment. This is really simple. Start by changing directory into the root of your project directory, and then use the virtualenv command-line tool to create a new environment:
$ cd ~/code/myproject/
$ virtualenv env
New python executable in env/bin/python
Installing setuptools............done.
Installing pip...............done.
Here, env is just the name of the directory you want to create your virtual environment inside. It’s a common convention to call this directory env, and to put it inside your project directory (so, say you keep your code at ~/code/projectname/, the environment will be at ~/code/projectname/env/, and each project gets its own env). But you can call it whatever you like and put it wherever you like!
Note: if you’re using a version control system like git, you shouldn’t commit the env directory. Add it to your .gitignore file (or similar).
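Environments can also be created from Python code. The sketch below does not use virtualenv itself: it uses venv, a separate module in the Python 3 standard library that does a similar job, and it creates the environment in a temporary directory (our assumption, so that running the example leaves nothing behind in a real project).

```python
import os
import tempfile
import venv

# A throwaway stand-in for a project directory such as ~/code/myproject/
project_dir = tempfile.mkdtemp()
env_dir = os.path.join(project_dir, "env")

# with_pip=False keeps this quick; pass with_pip=True to get a pip inside the environment
venv.create(env_dir, with_pip=False)

# List what was created inside the new environment directory
print(sorted(os.listdir(env_dir)))
```

The exact directory names differ between operating systems, but the idea is the same as with the virtualenv command: one self-contained directory per project.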
How do I use my shiny new virtual environment?
If you look inside the env directory you just created, you’ll see a few subdirectories:
$ ls env
bin include lib
The one you care about the most is bin. This is where the local copies of the python binary and the pip installer live. Let’s start by using the copy of pip to install requests into the virtualenv (rather than globally):
$ env/bin/pip install requests
Downloading/unpacking requests
Downloading requests-1.1.0.tar.gz (337kB): 337kB downloaded
Running setup.py egg_info for package requests
Installing collected packages: requests
Running setup.py install for requests
Successfully installed requests
Cleaning up...
It worked! Notice that we didn’t need to use sudo this time, because we’re not installing requests globally, we’re just installing it inside our home directory.
Now, instead of typing python to get a Python shell, we type env/bin/python, and then…
>>> import requests
>>> requests.get('http://dabapps.com')
But that’s a lot of typing!
virtualenv has one more trick up its sleeve. Instead of typing env/bin/python and env/bin/pip every time, we can run a script to activate the environment. This script, which can be executed with source env/bin/activate, simply adjusts a few variables in your shell (temporarily) so that when you type python, you actually get the Python binary inside the virtualenv instead of the global one:
$ which python
/usr/bin/python
$ source env/bin/activate
$ which python
/Users/jamie/code/myproject/env/bin/python
So now we can just run pip install requests (instead of env/bin/pip install requests) and pip will install the library into the environment, instead of globally. The adjustments to your shell only last for as long as the terminal is open, so you’ll need to remember to rerun source env/bin/activate each time you close and open your terminal window. If you switch to work on a different project (with its own environment) you can run deactivate to stop using one environment, and then run source env/bin/activate from the other project’s directory to activate the other.
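The most important of the variables that activation adjusts is PATH: the environment's bin directory is placed at the front, so the shell finds env/bin/python before the global one. The sketch below imitates only that one step to show why which python changes. The function name activated_path is our own, the starting PATH value is made up for illustration, and the real activate script changes more than this.

```python
import os


def activated_path(env_dir, current_path):
    """Return a PATH value with the environment's bin directory placed first."""
    env_bin = os.path.join(env_dir, "bin")
    return env_bin + os.pathsep + current_path


# Illustrative values: the environment from the example above and a made-up PATH
new_path = activated_path("/Users/jamie/code/myproject/env", "/usr/bin:/bin")
print(new_path)
```

Because the shell searches PATH from left to right, putting env/bin first is enough to make python and pip resolve to the copies inside the environment.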
Activating and deactivating environments does save a little typing, but it’s a bit “magical” and can be confusing. Make your own decision about whether you want to use it.
Requirements files
virtualenv and pip make great companions, especially when you use the requirements feature of pip. Each project you work on has its own requirements.txt file, and you can use this to install the dependencies for that project into its virtual environment:
$ env/bin/pip install -r requirements.txt
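A requirements file is plain text, typically with one package per line and often with an exact version pinned using ==. To make that concrete, here is a small sketch that reads the simplest form of such a file. The function parse_pinned_requirements is our own illustration and not how pip parses these files; pip accepts many more forms than name==version. The file contents are also illustrative, reusing the requests version shown earlier.

```python
def parse_pinned_requirements(text):
    """Map package names to versions for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


example = """\
# requirements.txt (illustrative contents)
requests==1.1.0
"""
print(parse_pinned_requirements(example))
```

Pinning exact versions like this is what lets each project keep working with the library versions it was written against.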
Recap
- pip is a tool for installing packages from the Python Package Index.
- virtualenv is a tool for creating isolated Python environments containing their own copy of python, pip, and their own place to keep libraries installed from PyPI.
- It’s designed to allow you to work on multiple projects with different dependencies at the same time on the same machine.
- You can see instructions for installing it at virtualenv.org.
- After installing it, run virtualenv env to create a new environment inside a directory called env.
- You’ll need one of these environments for each of your projects. Make sure you exclude these directories from your version control system.
- To use the versions of python and pip inside the environment, type env/bin/python and env/bin/pip respectively.
- You can “activate” an environment with source env/bin/activate and deactivate one with deactivate. This is entirely optional but might make life a little easier.
pip and virtualenv are indispensable tools if you’re a regular Python user. Both are fairly simple to understand, and we highly recommend getting to grips with them.