Thursday, November 19, 2020

Nesting all the way - Kubernetes on OpenStack on Google Compute Engine

The gist: Can GCP/Azure/AWS be squeezed into one machine?

OpenStack is a cloud provider framework (read GCP/AWS/Azure, except that it is open source). Here we squeeze the whole storage/compute/network virtualization framework into one GCP instance: create a tenant, provision VMs, build a Kubernetes cluster out of those VMs and deploy PODs.


Provisioning the compute engine: 

Not all compute engines support virtualization. We need to create them from disks that are specifically tagged to support virtualization, and we need to provision the compute engines on the N1 (Haswell or later) series of CPUs, which are not available in all GCP regions. At this point, US central and some Europe regions have the N1 series of CPUs. See my earlier post on how to do this.


Create the disks in the GCP cloud console:

We execute the following command in the Google Cloud console to create a disk that supports virtualization:



$gcloud compute disks create disk-tagged-for-virtualization --image-project ubuntu-os-cloud --image-family ubuntu-1804-lts --zone us-central1-a --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"

This creates a 10 GB disk, which can be resized as per our requirements. In this case we resize it to 120 GB to launch our Ubuntu 18.04 Bionic compute engine VM. Inside this VM we would deploy OpenStack using DevStack. So, the whole OpenStack deployment would be contained within one compute engine with 6 vCPUs, 16 GB RAM and a 120 GB hard disk.
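
The resize itself can be done right from the same console session; a one-liner sketch, where the disk name and zone match the create command above:

$ gcloud compute disks resize disk-tagged-for-virtualization --size 120GB --zone us-central1-a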

Within the OpenStack deployment, we would launch two Ubuntu 18.04 VMs: one would be the master node and the other the worker node of the Kubernetes cluster. Both nodes would have 2 vCPUs and a 20 GB hard disk each. We would deploy the Weave CNI network plugin for POD networking and validate that POD connectivity works without any hiccups. We would check POD port forwarding, expose a deployment as a NodePort service and check that the NodePort service is accessible. We would also validate Kubernetes DNS by going inside a POD's container and accessing other containers by their POD DNS names.
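
Once the cluster is up, the validation would go roughly along these lines (a sketch; the deployment name 'nginx' and the port numbers are illustrative, not from the actual run):

$ kubectl create deployment nginx --image=nginx
$ kubectl expose deployment nginx --type=NodePort --port=80
$ kubectl get svc nginx   # note the assigned NodePort, then curl <node-ip>:<node-port>
$ kubectl port-forward deployment/nginx 8080:80   # POD port forwarding
$ kubectl run dnscheck --image=busybox --restart=Never -it -- nslookup nginx.default.svc.cluster.local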

Disclaimer: 

This is not an advisable production setup, and the compute and storage capacities are not based on any benchmark. This could be a POC, an academic exercise or a guide for someone looking to get started with OpenStack and hack its internals from there. We could have taken the nesting and virtualization levels deeper just by making use of Linux LXD containers, because Ubuntu has built-in support for them. But that is a post for another day.

Following are the snaps of the provisioned GCP compute VM based on the disk that we created above.

We have selected the N1 series of CPU and the US central region for the required virtualization support, and have appropriately named the compute engine 'openstack' (pun intended)!
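
For the record, the console clicks translate to a CLI invocation roughly like this (a sketch; 'custom-6-16384' encodes the 6 vCPU / 16 GB custom shape on N1, and the boot disk is the one we created earlier):

$ gcloud compute instances create openstack --zone us-central1-a --machine-type custom-6-16384 --min-cpu-platform "Intel Haswell" --disk name=disk-tagged-for-virtualization,boot=yes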

There are various ways we can deploy OpenStack, but here we are going to use DevStack, which facilitates single- or multi-node deployments of OpenStack. DevStack is a collection of scripts used to bring up a complete OpenStack setup quickly. The whole setup is triggered by a script, 'stack.sh'. It pulls down the various components of OpenStack from GitHub source code repository mirrors, configures the system and makes it ready. Hence, DevStack is a natural choice to get started with OpenStack. Below is a quick rundown of the various OpenStack core infrastructure components:

Dashboard:

The web-based management interface of OpenStack. Also interchangeably referred to as Horizon, it is an intuitive UI to create, configure and manage the various components and services that a cloud framework must provide, such as control, compute, network and storage. More specifically, Horizon is the core UI infrastructure on which component-specific management UIs are built, and the dashboard stitches them together.

Note: APIs and CLIs exist underneath to execute the tasks that are performed on the dashboard. The dashboard piggybacks on the APIs.

Keystone:

The identity management component that handles users, roles and tenants. It also manages the catalog of services and service endpoints. One thing to note: in OpenStack parlance, tenants and projects mean the same thing. Operations are executed in a project or tenant context. Once the OpenStack setup is complete, two users, 'admin' and 'demo', will be present in the system. Admin is the super user over the whole cloud setup.
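
On the CLI, the same identity constructs look roughly like this (the project, user name and password are illustrative, and the role name can vary across releases):

$ openstack project create demo-tenant
$ openstack user create --project demo-tenant --password supersecret demo-user
$ openstack role add --project demo-tenant --user demo-user member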

Glance:

Glance is the OS image registry, the VM image management component. Before we launch a VM, we need to make sure its boot image is registered with Glance. VM boot images get their authorized host key (to access them via SSH) and network device MAC address populated during the cloud initialization phase of VM launch.
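
Registering a boot image with Glance from the CLI looks roughly like this (the file name stands in for a downloaded Ubuntu cloud image):

$ openstack image create --disk-format qcow2 --container-format bare --file bionic-server-cloudimg-amd64.img --public ubuntu-18.04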

Neutron:

Neutron is responsible for virtual networking in OpenStack. It exposes APIs to manage the software-defined tenant networks (SDN) into which VMs are launched. Neutron provides inter-network connectivity via virtual routers. External connectivity to internal VMs can be provided if a router has a gateway that is connected to an external network. VMs can be accessed via floating IPs, which have a direct one-to-one mapping with the VMs' internal IPs.
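
The constructs above map to CLI calls along these lines (network, subnet and router names are illustrative; 'public' is the external network a default DevStack run creates):

$ openstack network create tenant-net
$ openstack subnet create --network tenant-net --subnet-range 10.0.10.0/24 tenant-subnet
$ openstack router create tenant-router
$ openstack router add subnet tenant-router tenant-subnet
$ openstack router set --external-gateway public tenant-router
$ openstack floating ip create public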

Nova:

Nova manages the VM instances. Once a VM image is available in Glance and a virtual network is defined, with the addition of an SSH key pair (which can either be generated or imported) and a security group (which controls the VM's ingress and egress traffic), Nova looks at the hypervisor resources and schedules a VM on a compute node. Once the VM is launched, the public key of the SSH key pair is injected into the authorized_keys file of the launched VM during the cloud initialization phase.
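
A CLI sketch that ties these pieces together (the flavor, image, network and key names are illustrative, and the floating IP comes from the pool created above):

$ openstack keypair create --public-key ~/.ssh/id_rsa.pub mykey
$ openstack security group rule create --proto tcp --dst-port 22 default
$ openstack server create --flavor m1.small --image ubuntu-18.04 --network tenant-net --key-name mykey --security-group default k8s-master
$ openstack server add floating ip k8s-master 172.24.4.10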

Cinder:

Cinder is the component for block storage management. It manages the volumes that get attached to and detached from VMs. Cinder works with GlusterFS, Ceph and many more storage solutions.
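
The volume lifecycle on the CLI (size and names are illustrative):

$ openstack volume create --size 10 data-vol
$ openstack server add volume k8s-master data-vol
$ openstack server remove volume k8s-master data-vol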

Swift:

Swift is the object storage in OpenStack. It is analogous to AWS S3.
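
The S3-like usage pattern through the CLI (container and object names are illustrative):

$ openstack container create backups
$ openstack object create backups etcd-snapshot.db
$ openstack object list backups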

Ceilometer:

Originally designed for tenant resource consumption billing, it is the telemetry component of OpenStack. It provides resource usage statistics and can be configured for alerting and monitoring.

Now that we know what the core components of OpenStack are, let's get started with setting things up.

Setting things up:

We SSH into our provisioned GCP instance called 'openstack'. The first thing we do is create a user called 'stack' with the home directory set to /opt/stack. This is done because the DevStack scripts are required to be executed by the 'stack' user.

We update the system, create the 'stack' user, log in as stack and check out the latest stable release, 'victoria', of the DevStack repo.

$ sudo su -

root@openstack:~# apt update

root@openstack:~# apt upgrade -y
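
From here, the remaining steps described above would go roughly like this (a sketch based on the DevStack quick-start guide; the sudoers entry gives 'stack' the passwordless sudo that the scripts expect):

root@openstack:~# useradd -s /bin/bash -d /opt/stack -m stack

root@openstack:~# echo "stack ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/stack

root@openstack:~# sudo -u stack -i

stack@openstack:~$ git clone https://opendev.org/openstack/devstack -b stable/victoria

stack@openstack:~$ cd devstack

Before triggering ./stack.sh, a minimal local.conf in the repo root is enough to drive the whole run (the passwords and HOST_IP below are placeholders, not values from the actual setup):

[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
HOST_IP=10.128.0.2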