Thursday, November 19, 2020

Nesting all the way - kubernetes on openstack on google compute engine

The gist: Can GCP/Azure/AWS be squeezed into one machine?

Openstack is a cloud provider framework (think GCP/AWS/Azure, except that it is open source). Here we squeeze the whole storage/compute/network virtualization framework into one GCP instance - create a tenant, provision VMs, create a kubernetes cluster out of those VMs and deploy PODs.


Provisioning the compute engine: 

Not all compute engines support virtualization - we need to create them from disks which are specifically tagged to support virtualization, and we need to provision the compute engines on the N1 (Haswell) or later series of CPUs, which are not available in all GCP regions. At this point, US central and some Europe regions have N1 series CPUs. See my earlier post on how to do this.


Create the disks in the GCP cloud console:

We execute the following command in the google cloud console to create a disk that would support virtualization:



$gcloud compute disks create disk-tagged-for-virtualization --image-project ubuntu-os-cloud --image-family ubuntu-1804-lts --zone us-central1-a --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"











This would create a 10 GB disk - which can be resized as per our requirements. In this case we resize it to 120 GB to launch our ubuntu 18.04 bionic compute engine VM. Inside this VM we would deploy Openstack using Devstack. So, the whole Openstack deployment would be contained within one compute engine with 6 vCPUs, 16 GB RAM and a 120 GB hard disk.
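The resize itself is a single command in the cloud console (a sketch - the zone should match the one used when creating the disk above):

$gcloud compute disks resize disk-tagged-for-virtualization --size 120GB --zone us-central1-a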

Within the Openstack deployment, we would launch 2 ubuntu 18.04 VMs - one would be the master node and the other the worker node of the kubernetes cluster. Both nodes would have 2 vCPUs and a 20 GB hard disk each. We would deploy the weave CNI network plugin for POD networking and validate that POD connectivity works without any hiccups. We would check POD port forwarding, expose a deployment as a NodePort and check NodePort service accessibility. We would also validate kubernetes DNS by going inside a POD's container and accessing other containers by their POD DNS names.

Disclaimer: 

This is not an advisable production setup, and the compute and storage capacities are not based on any benchmark. This could be a POC, an academic exercise or a guide for someone looking to get started with Openstack and hack its internals from there. We could have taken the nesting and virtualization levels deeper just by making use of Linux lxd containers, because ubuntu has in-built support for them. But that is a post for another day.

Following are the snaps of the provisioned GCP compute VM based off of the disk that we have created above.

















We have selected the N1 series of CPU and the US central region for the required virtualization support. We have appropriately named the compute engine 'openstack' (pun intended)!
















There are various ways we can deploy Openstack, but here we are going to use 'devstack' - which facilitates single or multi-node deployments of Openstack. Devstack is a collection of scripts used to bring up a complete Openstack setup quickly. The whole setup is triggered by a script 'stack.sh'. It pulls down the various components of Openstack from GitHub source code repository mirrors, configures the system and makes it ready. Hence, Devstack is a natural choice to get started with Openstack. Below is a quick rundown of the various openstack core infrastructure components:

Dashboard:

The web-based management interface of openstack. Also interchangeably referred to as horizon, it is an intuitive UI to create, configure and manage the various components and services that a cloud framework must provide - such as control, compute, network and storage. More specifically, horizon is the core UI infrastructure on which component specific management UIs are built, and the dashboard stitches them together.

Note: API and CLI interfaces exist underneath to execute the tasks that are performed on the dashboard. The dashboard piggybacks on these APIs.

Keystone:

The identity management component that handles users, roles and tenants. It also manages the catalog of services and service endpoints. One thing to notice is that - in Openstack parlance - tenants and projects mean the same thing. Operations are executed in a project or tenant context. Once the openstack setup is complete, two users, 'admin' and 'demo', will be present in the system. Admin is the super user over the whole cloud setup.

Glance:

Glance is the OS image registry - the VM image management component. Before we launch a VM, its boot image needs to be available in Glance. VM boot images get their authorized host key (to access them via SSH) and network device MAC address populated during the cloud initialization phase of VM launch.

Neutron:

Neutron is responsible for virtual networking in Openstack. It exposes APIs to manage the software defined tenant networks (SDN) into which VMs are launched. Neutron provides inter-network connectivity via virtual routers. External connectivity to internal VMs can be provided if a router has a gateway that is connected to an external network. VMs can be accessed via floating IPs, which have a direct one-to-one mapping with the VMs' internal IPs.

Nova:

Nova manages the VM instances. Once a VM image is available in Glance and a virtual network is defined, with the addition of an SSH key pair (which can either be generated or imported) and a security group (which controls the VM's ingress and egress traffic), Nova looks at the hypervisor resources and schedules a VM on a compute node. Once the VM is launched, the public key of the SSH key pair is injected into the authorized_keys file of the launched VM during the cloud initialization phase.
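For illustration, launching a VM with the openstack CLI looks roughly like this (the image, flavor, network, key, server names and the floating IP are placeholders specific to a given setup):

$ openstack keypair create mykey > mykey.pem
$ openstack server create --image ubuntu-18.04 --flavor m1.small --network private --key-name mykey --security-group default demo-vm
$ openstack floating ip create public
$ openstack server add floating ip demo-vm 172.24.4.10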

Cinder:

Cinder is the component for block storage management. It manages the volumes that get attached to and detached from VMs. Cinder works with glusterFS, ceph and many more solutions.

Swift:

Swift is the object storage in Openstack. It is analogous to AWS S3. 

Ceilometer:

Originally designed for tenant resource consumption billing purposes, it is the telemetry component of openstack. It provides resource usage statistics and can be configured for alerting and monitoring.

Now that we know what the core components of openstack are, let's get started with setting things up.

Setting things up:

We SSH into our provisioned GCP instance called 'openstack'. The first thing we do is create a user called 'stack' with its home directory set to /opt/stack. This is done because the devstack scripts are required to be executed by the 'stack' user.

We update the system, create the 'stack' user, log in as stack and check out the latest stable release 'victoria' of the Devstack repo.

$ sudo su -

root@openstack:~# apt update

root@openstack:~# apt upgrade -y
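From there, the remaining bootstrap follows the standard Devstack flow - create the 'stack' user, switch to it, clone the stable/victoria branch, drop a minimal local.conf and run stack.sh (a sketch; the passwords in local.conf are placeholders):

root@openstack:~# useradd -s /bin/bash -d /opt/stack -m stack
root@openstack:~# echo "stack ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/stack
root@openstack:~# su - stack
stack@openstack:~$ git clone https://opendev.org/openstack/devstack -b stable/victoria
stack@openstack:~$ cd devstack
stack@openstack:~/devstack$ cat > local.conf <<'EOF'
[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
EOF
stack@openstack:~/devstack$ ./stack.sh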

Sunday, October 25, 2020

NFS persistent volume and claim in kubernetes

This write up shows the detailed steps of how to set up an nfs server as a storage backend and make use of it in a POD's persistent volume via a persistent volume claim. The steps have been tested to work with kubernetes v1.19.3 and nfs v4.2 on Ubuntu 18.04.5 LTS.

Set up nfs server:

The following commands set up the server as the storage backend:

apt update && apt install nfs-kernel-server -y

Create a directory called /mnt/nfs-share and provide all permissions to clients. We may need to reconsider what permissions are allowed based on requirements.

mkdir -p /mnt/nfs-share/ && chmod -R 777 /mnt/nfs-share

We also change the ownership of the shared directory next.

chown -R nobody:nogroup /mnt/nfs-share/

We need to edit the /etc/exports file to grant access to clients. Here we are specifically granting access to a couple of machines.

vi /etc/exports

/mnt/nfs-share/ 10.148.0.13/32(rw,sync,no_subtree_check)
/mnt/nfs-share/ 10.148.0.12/32(rw,sync,no_subtree_check)

Execute the following commands to complete the server setup process:

exportfs -a

systemctl restart nfs-kernel-server

Note: We might need to allow firewall access, depending on whether the firewall is active or not:

root@master:~/deployments/nfs# ufw status
Status: inactive

If the firewall is active, fire the following commands:

ufw allow from 10.148.0.12 to any port nfs

ufw allow from 10.148.0.13 to any port nfs


Set up a client and check that shared access is working:

Install the nfs client on one of the boxes that would share the kubernetes workloads:

root@worker-1:~# apt install nfs-common

Create a directory and mount the shared nfs server directory as shown below:

mkdir -p /mnt/client_share
mount 10.148.0.10:/mnt/nfs-share /mnt/client_share

Writing something in the client's /mnt/client_share folder should reflect it on the server and other connected clients.
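For example (the file name here is just illustrative):

root@worker-1:~# echo "hello from worker-1" > /mnt/client_share/hello.txt
root@master:~# cat /mnt/nfs-share/hello.txt
hello from worker-1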


Create PV and PVC (Persistent volume & claim):

First we create the PV definition on the kubernetes server:

root@master:~/deployments/nfs# cat pv-nfs.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-vol
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-storage-class
  mountOptions:
    - hard
    - nfsvers=4.2
  nfs:
    path: /mnt/nfs-share
    server: 10.148.0.10
root@master:~/deployments/nfs# kubectl apply -f pv-nfs.yaml
persistentvolume/pv-nfs-vol created


Next, we create the PVC definition followed by the busybox pod definition as shown below:

root@master:~/deployments/nfs# cat pvc-nfs.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nsf
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-storage-class
  resources:
    requests:
      storage: 100Mi
root@master:~/deployments/nfs# kubectl apply -f pvc-nfs.yaml
persistentvolumeclaim/pvc-nsf created


kubectl get pv,pvc

The output of the above command should show the status of the PV and PVC as Bound at this point.

The POD definition:

root@master:~/deployments/nfs# cat pod-busybox.yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - image: busybox
    name: busybox
    command:
    - sh
    - -c
    - 'while true; do date >> /tmp/nfs/index.html; hostname >> /tmp/nfs/index.html; sleep 10; done'
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: nfs-claim
      mountPath: "/tmp/nfs"
  restartPolicy: OnFailure
  volumes:
  - name: nfs-claim
    persistentVolumeClaim:
      claimName: pvc-nsf
  nodeName: master
root@master:~/deployments/nfs# kubectl apply -f pod-busybox.yaml
pod/busybox created
root@master:~/deployments/nfs# kubectl get pod busybox
NAME      READY   STATUS    RESTARTS   AGE
busybox   1/1     Running   0          10s

The changes to index.html can be seen in the server's /mnt/nfs-share, the client's /mnt/client_share and the busybox container's /tmp/nfs directories.


Thursday, September 24, 2020

Nested virtualbox VM inside google compute engine

Though guest VMs inside google compute engine raise concerns about performance, there are situations where they prove useful. In this post I am going to discuss how we can install a ubuntu guest VM on a google compute engine instance.

There are a few caveats though - we can install a KVM compatible hypervisor only on Linux VM instances running on Haswell or newer processors. Also, Haswell based processors are not available in all GCP regions - they are available in certain regions in the US (US central) and Europe.

Windows compute engines do not support nested virtualization.

We need to create compute engines supporting nested virtualization off of disk images tagged with a specific license, namely:

"--licenses https://compute.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx";

With these restrictions in mind, let's proceed with the following steps to launch a compute engine which will host a ubuntu guest VM on top of oracle virtualbox.

1) Log into GCP console and launch the cloud shell:

2) Set the project:

gcloud config set project [PROJECT]

3) We create a disk from the ubuntu image family, tagging it with the above license as shown:

$ gcloud compute disks create virtualization-tagged-disk --image-project ubuntu-os-cloud --image-family ubuntu-1804-lts --zone us-central1-a --licenses "https://www.googleapis.com/compute/v1/projects/vm-options/global/licenses/enable-vmx"

It might ask for authorization - click on "Authorize".



This will create a disk called "virtualization-tagged-disk". We can launch a ubuntu VM based off of this disk, install oracle virtualbox on that compute engine instance and then launch a guest VM inside virtualbox.

Note: We would have to launch the instance in a GCP region where Haswell processors (N1 family) are available.
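Alternatively, the same instance can be created from the cloud shell (a sketch - the instance name and machine type are placeholders):

$ gcloud compute instances create nested-vm-host --zone us-central1-a --machine-type n1-standard-4 --min-cpu-platform "Intel Haswell" --disk name=virtualization-tagged-disk,boot=yes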

4) Select the disk image as shown:



5) Launch an instance selecting the disk and appropriate region and processor:


Once the compute engine instance is started - we can ssh into it and set up a desktop environment so that we can access it via RDP.

Setting up RDP and installing oracle virtualbox:

Setup RDP on the compute engine:

1) Once inside the compute engine instance - we can execute the following series of commands to set up the RDP server environment and change the password for the root user, and then exit and connect back to the box via a remote RDP client:
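A typical sequence (a sketch, assuming xrdp as the RDP server with an xfce4 session) would be:

apt update
apt install -y xrdp xfce4
systemctl enable --now xrdp
echo "xfce4-session" > ~/.xsession
passwd root
exit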


2) Connect to the box via a remote desktop client and install virtualbox:

curl -O https://download.virtualbox.org/virtualbox/6.1.14/virtualbox-6.1_6.1.14-140239~Ubuntu~bionic_amd64.deb
apt install ./virtualbox-6.1_6.1.14-140239~Ubuntu~bionic_amd64.deb




Note: For a better view, install the xfce4 goodies:

apt-get install xfce4 xfce4-goodies


Tuesday, September 15, 2020

Exploring envoyproxy

With the adoption of micro-services, we need to tackle a host of issues associated with remote calls and networking - because what used to be an in-process function call now becomes an RPC call that needs to be handled by a service which needs to be discovered. Service discovery has its own issues - among others, the most important being the ability to discover services that are active. Once services are discovered - we need to handle the uniform spread of requests among the discovered service instances. Traffic encryption becomes another issue that we need to handle once a call goes out over the wire from one micro-service to another.

Another very obvious requirement in a cluster of micro-services is the need to monitor and trace requests. Without this requirement being taken care of, it is difficult to figure out how services are executing on a distributed network.

While there are still many more issues that we need to take care of in a micro-services setup - what should our high-level approach be to tackle them?

Among others - one approach could be to develop client side libraries - which handle the issues of service discovery, load balancing (retry, timeout, circuit breaking etc.) and others. This was the approach the Netflix Eureka/ribbon/zuul stack had proposed. The Eureka client acted as a service proxy - with the eureka server acting as a service registry, ribbon providing client side load balancing and zuul handling request routing.

While the library approach works - there are few things to consider while we embark on the library journey:

- Business code now gets entangled with infrastructure code.

- We are sticking our head out either as a one-language shop, or we are undertaking the quite complicated job of developing/maintaining client libraries in multiple programming languages.

Is there an easier way to tackle the issues associated with micro-services? Yes! That is what out-of-process proxies like envoy and linkerd provide. Already existing proxies like Nginx and HAProxy are adding support for capabilities similar to those provided by envoy proxy. We will discuss linkerd in a later post - here we will talk about envoy proxy.

Envoy's website talks of envoy being an edge and service proxy - this means we can use envoy for north-south as well as east-west traffic. Envoy is written in modern C++ and most request handling runs concurrently in lock-free code. This makes envoy very fast.

Envoy has a lot of goodies to offer - its out-of-process architecture straight away boosts developer productivity. The network and the myriad of associated issues go away instantly - letting the developer focus on the business problem. Envoy being out of process provides a huge advantage - we can develop our services in any language of our choice and necessity.

As mentioned earlier - envoy provides many load balancing features - automatic request retries, circuit breaking, request timeouts, request shadowing, rate limiting etc.

With envoy's traffic routing features, we can easily do rolling upgrades of services, blue/green and canary deployments etc.

Envoy provides wire level observability of request traffic and native support for distributed tracing.

Envoy supports HTTP/1.1, HTTP/2 & gRPC - it transparently switches between HTTP/1.1 and HTTP/2.

One very important aspect of envoy proxy is that - while envoy can be configured statically - it also provides robust APIs for dynamic configuration. What is more - envoy can be patched and upgraded without shutting it down via what is called a "Hot-Restart". We would configure some of envoy's features in upcoming posts - but before that, the following concepts about envoy would help.

Envoy proxy concepts:

Downstream: A downstream host connects to Envoy, sends requests, and receives responses.

Upstream: An upstream host receives connections and requests from Envoy and returns responses.

Listeners: They are the addresses where the envoy process listens for incoming connections - for example, 0.0.0.0:9090. There can be multiple listeners in an envoy process.

Filter chains: Each listener in an envoy process can be configured with filter chains - where each chain consists of one or more filters. Filter chains are selected based on the incoming request and some matching criteria.

Routes: Based on matching criteria - requests are delegated to be handled by back-end clusters.

Clusters: A named collection of back-ends called endpoints. Requests are load balanced among the cluster endpoints.

Endpoints: Endpoints are the delegates which handle requests. They are part of the cluster definition.
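To see how these pieces fit together, here is a minimal, illustrative static configuration sketch (the names, ports and backend address are placeholders) that wires one listener, a filter chain with a single route, and a cluster with one endpoint:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 9090 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: service_backend }
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: service_backend
    connect_timeout: 1s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: service_backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 8080 }

Requests hitting the listener on 0.0.0.0:9090 are matched by the route with prefix "/" and load balanced to the single endpoint in the service_backend cluster.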

In the next post - we would install envoy and look at traffic routing. Stay tuned.

Sunday, August 2, 2020

Algorithms and data structures in rust - mergesort

Bets about rust are turning out to be right. Initially proposed as a systems programming language (read servo - a rewrite of the firefox browser engine) - numerous front end web frameworks are now appearing and becoming ubiquitous. Rust has a steep learning curve - but that is a reasonable investment considering the low-level power it gives to do system-level things and, at the same time, the high-level abstractions that come without a price (zero cost). Borrowing and ownership prevent memory issues, move semantics avoid data races, effortless C binding and the avoidance of a runtime are some of the salient features of rust. Its implementation of async/await is also novel.

It has been quite some time that I have been following the developments in rust - and I am currently utilizing rust in one of my projects. I am going to be publishing some posts on rust, and what better way to start than implementing algorithms and data structures?

Following is an implementation of merge sort algorithm in rust. 

Rust allows defining local functions, which is kind of neat and compact. Below is the merge sort algorithm implemented using local functions.
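A minimal sketch along those lines - a local merge helper plus a local recursive sort, with a lifetime tying the returned slice to the input - could look like the following:

fn merge_sort<'a>(arr: &'a mut [i32]) -> &'a [i32] {
    // local helper: merges the two already-sorted halves arr[..mid] and arr[mid..]
    fn merge(arr: &mut [i32], mid: usize) {
        let left = arr[..mid].to_vec();
        let right = arr[mid..].to_vec();
        let (mut i, mut j, mut k) = (0, 0, 0);
        while i < left.len() && j < right.len() {
            if left[i] <= right[j] {
                arr[k] = left[i];
                i += 1;
            } else {
                arr[k] = right[j];
                j += 1;
            }
            k += 1;
        }
        while i < left.len() { arr[k] = left[i]; i += 1; k += 1; }
        while j < right.len() { arr[k] = right[j]; j += 1; k += 1; }
    }

    // local recursive sort: split, sort each half, then merge
    fn sort(arr: &mut [i32]) {
        if arr.len() > 1 {
            let mid = arr.len() / 2;
            sort(&mut arr[..mid]);
            sort(&mut arr[mid..]);
            merge(arr, mid);
        }
    }

    sort(arr);
    arr
}

fn main() {
    let mut data = [38, 27, 43, 3, 9, 82, 10];
    println!("{:?}", merge_sort(&mut data)); // [3, 9, 10, 27, 38, 43, 82]
}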




We make use of lifetimes to return the sorted array. There does not seem to be an easy way to create arrays at runtime - the reason being arrays are allocated on the stack.