23-04-2021



When you’re choosing a base image for your Docker image, Alpine Linux is often recommended. Using Alpine, you’re told, will make your images smaller and speed up your builds. And if you’re using Go that’s reasonable advice.

But if you’re using Python, Alpine Linux will quite often:

  1. Make your builds much slower.
  2. Make your images bigger.
  3. Waste your time.
  4. On occasion, introduce obscure runtime bugs.

Let’s see why Alpine is recommended, and why you probably shouldn’t use it for your Python application.

Why people recommend Alpine

Let’s say we need to install gcc as part of our image build, and we want to see how Alpine Linux compares to Ubuntu 18.04 in terms of build time and image size.

First, I’ll pull both images, and check their size:

As you can see, the base image for Alpine is much smaller.

Next, we’ll try installing gcc in both of them. First, with Ubuntu:

Note: Outside the very specific topic under discussion, the Dockerfiles in this article are not examples of best practices, since the added complexity would obscure the main point of the article.

To ensure you’re writing secure, correct, fast Dockerfiles, consider my Python on Docker Production Handbook, which includes a packaging process and >70 best practices.

We can then build and time that:

Now let’s make the equivalent Alpine Dockerfile:

And again, build the image and check its size:

As promised, Alpine images build faster and are smaller: 15 seconds instead of 30 seconds, and the image is 105MB instead of 150MB. That’s pretty good!

But when we switch to packaging a Python application, things start going wrong.

Let’s build a Python image

We want to package a Python application that uses pandas and matplotlib. So one option is to use the Debian-based official Python image (which I pulled in advance), with the following Dockerfile:

And when we build it:

The resulting image is 363MB.

Can we do better with Alpine? Let’s try:

And now we build it:

What’s going on?

Standard PyPI wheels don’t work on Alpine

If you look at the Debian-based build above, you’ll see it’s downloading matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl. This is a pre-compiled binary wheel. Alpine, in contrast, downloads the source code (matplotlib-3.1.2.tar.gz), because standard Linux wheels don’t work on Alpine Linux.

Why? Most Linux distributions use the GNU version (glibc) of the standard C library that is required by pretty much every C program, including Python. But Alpine Linux uses musl, and those binary wheels are compiled against glibc, so Alpine disabled Linux wheel support.
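
To make the wheel-naming point concrete, here is a minimal sketch that pulls the platform tag out of a wheel filename. It assumes the standard wheel naming scheme, where the last dash-separated field is the platform tag. The helper names are hypothetical, and the musllinux filename is made up purely for illustration (musllinux is a tag for musl-based systems that was standardized after the build described here).

```python
from typing import List


def wheel_platform_tags(filename: str) -> List[str]:
    """Return the platform tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    # Wheel names end with {python tag}-{abi tag}-{platform tag}.whl;
    # the platform field may hold several dot-separated tags.
    stem = filename[: -len(".whl")]
    return stem.split("-")[-1].split(".")


def targets_glibc(filename: str) -> bool:
    """True if any platform tag is a manylinux (glibc-based) tag."""
    return any(tag.startswith("manylinux") for tag in wheel_platform_tags(filename))


# The wheel from the Debian-based build above:
debian_wheel = "matplotlib-3.1.2-cp38-cp38-manylinux1_x86_64.whl"
print(wheel_platform_tags(debian_wheel), targets_glibc(debian_wheel))

# A made-up musl wheel name, for contrast only:
musl_wheel = "example-1.0-cp38-cp38-musllinux_1_1_x86_64.whl"
print(wheel_platform_tags(musl_wheel), targets_glibc(musl_wheel))
```

A manylinux wheel is built against glibc, which is why pip on a musl-based system skips it and falls back to the source archive.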

Most Python packages these days include binary wheels on PyPI, significantly speeding install time. But if you’re using Alpine Linux you need to compile all the C code in every Python package that you use.

Which also means you need to figure out every single system library dependency yourself. In this case, to figure out the dependencies I did some research, and ended up with the following updated Dockerfile:

And then we build it, and it takes…

… 25 minutes, 57 seconds! And the resulting image is 851MB.

Here’s a comparison between the two base images:

Base image         Time to build  Image size  Research required
python:3.8-slim    30 seconds     363MB       No
python:3.8-alpine  1557 seconds   851MB       Yes

Alpine builds are vastly slower, the image is bigger, and I had to do a bunch of research.
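
To put those numbers in proportion, the ratios can be worked out directly. The figures below are the ones from the comparison; the dictionary layout and variable names are only for illustration.

```python
# Measurements from the comparison above.
builds = {
    "python:3.8-slim": {"seconds": 30, "size_mb": 363},
    "python:3.8-alpine": {"seconds": 25 * 60 + 57, "size_mb": 851},  # 25 min 57 s
}

slim = builds["python:3.8-slim"]
alpine = builds["python:3.8-alpine"]

slowdown = alpine["seconds"] / slim["seconds"]
growth = alpine["size_mb"] / slim["size_mb"]

print(f"Build time: {slowdown:.0f}x slower")  # Build time: 52x slower
print(f"Image size: {growth:.1f}x larger")    # Image size: 2.3x larger
```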

Can’t you work around these issues?

Build time

For faster build times, Alpine Edge, which will eventually become the next stable release, does have matplotlib and pandas. And installing system packages is quite fast. As of January 2020, however, the current stable release does not include these popular packages.

Even when they are available, however, system packages almost always lag what’s on PyPI, and it’s unlikely that Alpine will ever package everything that’s on PyPI. In practice most Python teams I know don’t use system packages for Python dependencies; they rely on PyPI or Conda Forge.

Image size

Some readers pointed out that you can remove the originally installed packages, or add an option not to cache package downloads, or use a multi-stage build. One reader’s attempt resulted in a 470MB image.

So yes, you can get an image that’s in the ballpark of the slim-based image, but the whole motivation for Alpine Linux is smaller images and faster builds. With enough work you may be able to get a smaller image, but you’re still suffering from a 1500-second build time when you could get a 30-second build time using the python:3.8-slim image.

But wait, there’s more!

Alpine Linux can cause unexpected runtime bugs

While in theory the musl C library used by Alpine is mostly compatible with the glibc used by other Linux distributions, in practice the differences can cause problems. And when problems do occur, they are going to be strange and unexpected.

Some examples:

  1. Alpine has a smaller default stack size for threads, which can lead to Python crashes.
  2. One Alpine user discovered that their Python application was much slower because of the way musl allocates memory vs. glibc.
  3. I once couldn’t do DNS lookups in Alpine images running on minikube (Kubernetes in a VM) when using the WeWork coworking space’s WiFi. The cause was a combination of a bad DNS setup by WeWork, the way Kubernetes and minikube do DNS, and musl’s handling of this edge case vs. what glibc does. musl wasn’t wrong (it matched the RFC), but I had to waste time figuring out the problem and then switching to a glibc-based image.
  4. Another user discovered issues with time formatting and parsing.
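
As an illustration of the first item, Python lets you request a larger stack for threads it creates, which is one way people work around a small platform default. This is only a sketch: the helper name and the 8 MiB figure are assumptions chosen for the example, not values taken from any of the reports above.

```python
import threading


def run_with_stack_size(func, stack_bytes=8 * 1024 * 1024):
    """Run func in a new thread with the requested stack size; return its result.

    stack_bytes is an illustrative value, not a recommendation.
    """
    result = {}

    def target():
        result["value"] = func()

    previous = threading.stack_size()  # remember the current setting
    threading.stack_size(stack_bytes)  # applies to threads created after this call
    try:
        worker = threading.Thread(target=target)
        worker.start()
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    worker.join()
    return result["value"]


print(run_with_stack_size(lambda: sum(range(10))))
```

The setting only affects threads created after the call, so the sketch restores the previous value once the worker has started. If func raises, no result is stored and the final lookup fails; real code would want to propagate the exception.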

Most or perhaps all of these problems have already been fixed, but no doubt there are more problems to discover. Random breakage of this sort is just one more thing to worry about.

Don’t use Alpine Linux for Python images

Unless you want massively slower build times, larger images, more work, and the potential for obscure bugs, you’ll want to avoid Alpine Linux as a base image. For some recommendations on what you should use, see my article on choosing a good base image.

Tutorial labs

Learn how to develop and ship containerized applications, by walking through a sample that exhibits canonical practices. These labs are from the Docker Labs repository.

Sample                                                 Description
Docker for Beginners                                   A good “Docker 101” course.
Docker Swarm mode                                      Use Docker for natively managing a cluster of Docker Engines called a swarm.
Configuring developer tools and programming languages  How to set up and use common developer tools and programming languages with Docker.
Live Debugging Java with Docker                        Java developers can use Docker to build a development environment where they can run, test, and live debug code running within a container.
Docker for Java Developers                             Offers Java developers an intro-level and self-paced hands-on workshop with Docker.
Live Debugging a Node.js application in Docker         Node developers can use Docker to build a development environment where they can run, test, and live debug code running within a container.
Dockerizing a Node.js application                      This tutorial starts with a simple Node.js application and details the steps needed to Dockerize it and ensure its scalability.
Docker for ASP.NET and Windows containers              Docker supports Windows containers, too! Learn how to run ASP.NET, SQL Server, and more in these tutorials.
Docker Security                                        How to take advantage of Docker security features.
Building a 12-factor application with Docker           Use Docker to create an app that conforms to Heroku’s “12 factors for cloud-native applications.”

Sample applications

Run popular software using Docker.

Sample                              Description
apt-cacher-ng                       Run a Dockerized apt-cacher-ng instance.
.NET Core application               Run a Dockerized ASP.NET Core application.
ASP.NET Core + SQL Server on Linux  Run a Dockerized ASP.NET Core + SQL Server environment.
CouchDB                             Run a Dockerized CouchDB instance.
Django + PostgreSQL                 Run a Dockerized Django + PostgreSQL environment.
PostgreSQL                          Run a Dockerized PostgreSQL instance.
Rails + PostgreSQL                  Run a Dockerized Rails + PostgreSQL environment.
Riak                                Run a Dockerized Riak instance.
SSHd                                Run a Dockerized SSHd instance.
WordPress                           Quickstart: Compose and WordPress.

Library references

The following table provides a list of popular official Docker images. For detailed documentation, select the specific image name.

Image name  Description
Adminer
Adoptopenjdk
Aerospike
Alpine
Alt
Amazoncorretto
Amazonlinux
Arangodb
Backdrop
Bash
Bonita
Buildpack Deps
Busybox
Cassandra
Centos
Chronograf
Cirros
Clearlinux
Clefos
Clojure
Composer
Consul
Convertigo
Couchbase
Couchdb
Crate
Crux
Debian
Docker
Drupal
Eclipse Mosquitto
Eggdrop
Elasticsearch
Elixir
Erlang
Euleros
Express Gateway
Fedora
Flink
Fluentd
Fsharp
Gazebo
Gcc
Geonetwork
Ghost
Golang
Gradle
Groovy
Haproxy
Haskell
Haxe
Hello World
Httpd
Hylang
Ibmjava
Influxdb
Irssi
Jetty
Jobber
Joomla
Jruby
Julia
Kaazing Gateway
Kapacitor
Kibana
Known
Kong
Lightstreamer
Logstash
Mageia
Mariadb
Matomo
Maven
Mediawiki
Memcached
Mongo Express
Mongo
Mono
Mysql
Nats Streaming
Nats
Neo4j
Neurodebian
Nextcloud
Nginx
Node
Notary
Nuxeo
Odoo
Open Liberty
Openjdk
Opensuse
Oraclelinux
Orientdb
Percona
Perl
Photon
Php Zendserver
Php
Plone
Postfixadmin
Postgres
Pypy
Python
R Base
Rabbitmq
Rakudo Star
Rapidoid
Redis
Redmine
Registry
Rethinkdb
Rocket.chat
Ros
Ruby
Rust
Sapmachine
Scratch
Sentry
Silverpeas
Sl
Solr
Sonarqube
Sourcemage
Spiped
Storm
Swarm
Swift
Swipl
Teamspeak
Telegraf
Thrift
Tomcat
Tomee
Traefik
Ubuntu
Varnish
Vault
Websphere Liberty
Wordpress
Xwiki
Yourls
Znc
Zookeeper