Over time, I have worked with a couple of different commercial cloud platforms like AWS, DigitalOcean, GCP, Paperspace or Packet.net. Even though these platforms are rather well documented, there comes a point where you would like to have more insights into the inner workings of a cloud platform. Unfortunately, not too many of use have permission to walk into a Google data center and dive into their setup, but we can install and study one of the most relevant open source cloud platforms – OpenStack.
What is OpenStack?
OpenStack is an open source project (or, more precisely, a collection of projects) aiming at providing a state-of-the-art cloud platform. Essentially, OpenStack contains everything that you need to convert a set of physical servers into a cloud. There are components that interact with a hypervisor like KVM to build and run virtual machines, components to define and operate virtual networks and virtual storage, components to maintain images, a set of APIs to operate your cloud and a web-based graphical user interface.
OpenStack has been launched by Rackspace and NASA in 2010, but is currently supported by a large number of organisations. Some commercially supported OpenStack distributions are available, like RedHat OpenStack, Lenovo Thinkcloud or VMWare Integrated OpenStack. The software is written in Python, which for me was one of the reasons why I have decided to dive into OpenStack instead of one of the other open source cloud platforms like OpenNebula or Apache Cloudstack.
New releases of OpenStack are published every six months. This and the following posts use the Stein release from April 2019 and Ubuntu 18.04 Bionic as the underlying operating system.
OpenStack architecture basics
OpenStack is composed of a large number of components and services which initially can be a bit confusing. However, a minimal OpenStack installation only requires a hand-full of services which are displayed in the diagram below.
At the lowest layer, there are a couple of components that are used by OpenStack but provided by other open source projects. First, as OpenStack is written in Python, it uses the WSGI specification to expose its APIs. Some services come with their own WSGI container, others use Apache2 for that purpose.
Then, of course, OpenStack needs to persist the state of instances, networks and so forth somewhere and therefore relies on a relational database which by default is MariaDB (but could also be PostgreSQL, and in fact, every database that works with SQLAlchemy should do). Next, the different components of an OpenStack service communicate with each other via message queues provided by RabbitMQ and store data temporarily in Memcached. And finally, there is of course the hypervisor which by default is KVM.
On top of these infrastructure components, there are OpenStack services that lay the foundations for the compute, storage and network components. The first of these services is Keystone which provides identity management and a service catalog. All end user and all other services are registered as user with Keystone, and Keystone is handing out tokens so that these users can access the APIs of the various OpenStack services.
Then, there is the Glance image service. Glance allows an administrator to import OS images for use with virtual machines in the cloud, similar to a Docker registry for Docker images. The third of these intermediate services is the placement service which used to be a part of Nova and is providing information on available and used resources so that OpenStack can decide where a virtual machine should be scheduled.
On the upper layer, we have the services that make up the heart of OpenStack. Nova is the compute service, responsible for interacting with the hypervisor to bring up and maintain virtual machines. Neutron is creating virtual networks so that these virtual machines can talk to each other and the outside world. And finally, Cinder (which is not absolutely needed in a minimum installation) is providing block storage.
There are a couple of services that we have not represented in this picture, like the GUI Horizon or the bare-metal service Ironic. We will not discuss Ironic in this series and we will set up Horizon, but mostly use the API.
OpenStack offers quite a bit of flexibility as to how these services are distributed among physical nodes. We can not only distribute these services, but can even split individual services and distribute them across several physical nodes. Neutron, for instance, consists of a server and several agents, and typically these agents are installed on each compute node. Over time, we will look into more complex setups, but for our first steps, we will use a setup where there is a single controller node holding most of the Nova services and one or more compute nodes on which parts of the Nova service and the Neutron service are running.
In a later lab, we will build up an additional network host that runs a part of the Neutron network services, to demonstrate how this works.
Organisation of the upcoming series
In the remainder of this series, I will walk you through the installation of OpenStack in a virtual environment. But the main purpose of this exercise is not to simply have a working installation of OpenStack – if you want this, you could as well use one of the available installation methods like DevStack. Instead, the idea is to understand a bit what is going on behind the scenes – the architecture, the main configuration options, and here and then a little deep-dive into the source code.
To achieve this, we will discuss each service, its overall architecture, some use cases and the configuration steps, starting with the basic setup and ending with the Neutron networking service (on which I will put a certain focus out of interest). To turn this into a hands-on experience, I will guide you through a sequence of labs. In each lab, we will do some exercises and see OpenStack in action. Here is my current plan how the series will be organized.
- Environment and base services
- The Keystone identity service
- A deep-dive into Keystone: tokens, roles and policies
- The Glance image service and the Placement service
- The Nova compute service
- A deep-dive into the provisioning process with Nova
- Neutron basics – components and architecture
- Setting up Neutron and running our first instances
- Provider networks: flat networks and VLAN networks with Neutron
- Project and overlay networks
- Separating the network node
- Cinder: architecture, installation and deep-dives
- Octavia: load balancer as-a-service with OpenStack
- Moving the entire setup into the cloud – running OpenStack on Packet, GCE or DigitalOcean
As always, the code for this series is available on GitHub. Most of the actual setup will be fully automated using Vagrant and Ansible. We will simulate the individual nodes as virtual machines using VirtualBox, but it should not be difficult to adapt this to the hypervisor of your choice. And finally, the setup is flexible enough to work on a sufficiently well equipped desktop PC as well as in the cloud.
After this general overview, let us now get started. In the next post, we will dive right into our first lab and install the base services that OpenStack needs.