Ansible Role to Configure K8S Multi-Node Cluster over AWS Cloud
What is Kubernetes?
Kubernetes (also known as k8s or “Kube”) is an open-source container orchestration platform that automates many of the manual processes involved in deploying, managing, and scaling containerized applications.
Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.
What is a Kubernetes cluster?
A Kubernetes cluster is a set of nodes that run containerized applications. It allows containers to run across multiple machines and environments: virtual, physical, cloud-based, and on-premises. Unlike virtual machines, containers do not each carry a full guest operating system. Instead, they share the kernel of the host they run on, which keeps them lightweight and portable.
There are two kinds of Nodes:
· Master Node: Hosts the “Control Plane”, i.e. the control center that manages the deployed resources. Some of its components are kube-apiserver, kube-scheduler, kube-controller-manager, and kubelet.
· Worker Nodes: The machines on which the actual containers run. Some of the active processes are the kubelet service, a container runtime (like Docker), and the kube-proxy service.
What is Ansible?
Ansible is a software tool that provides simple but powerful automation for cross-platform computer support. It is primarily intended for IT professionals, who use it for application deployment, updates on workstations and servers, cloud provisioning, configuration management, intra-service orchestration, and nearly anything a systems administrator does on a weekly or daily basis. Ansible doesn't depend on agent software and has no additional security infrastructure, so it's easy to deploy.
Task Description
📌 Ansible Role to Configure K8S Multi Node Cluster over AWS Cloud.
🔅 Create an Ansible Playbook to launch 3 AWS EC2 instances.
🔅 Create an Ansible Playbook to configure Docker on those instances.
🔅 Create a Playbook to configure the K8S Master and K8S Worker Nodes on the above created EC2 instances using kubeadm.
🔅 Convert the Playbooks into roles and upload those roles to your Ansible Galaxy.
🔅 Also upload all the YAML code to your GitHub repository.
Ansible Playbook to launch 3 AWS EC2 instances
Step 1: Ansible Vault to store credentials
Ansible Vault encrypts variables and files so you can protect sensitive content such as passwords or keys rather than leaving it visible as plaintext in playbooks or roles. You supply a vault password to the ansible-vault command-line tool to create and view encrypted variables, create encrypted files, encrypt existing files, or edit, re-key, or decrypt files.
ansible-vault create aws_cred.yml
access_key: <your AWS access key>
secret_key: <your AWS secret key>
Step 2: Update the ansible.cfg file
vim /etc/ansible/ansible.cfg
· Most EC2 instances (for example, those launched from Amazon Linux AMIs) let us log in as the “ec2-user” user, which is why remote_user is set to “ec2-user”.
· EC2 instances use key-based authentication, so we must give the path of the private key.
· The most important part is privilege_escalation. “root” powers are required to configure anything on the instance, but “ec2-user” is a general user with limited powers. Privilege escalation gives that general user “sudo” powers.
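For reference, the same three settings can also be expressed as play-level keywords and a connection variable. This is only a minimal sketch, not the ansible.cfg used in this article: the key path /root/ansiblekey.pem is a hypothetical placeholder, and the “master” group is the host group created later by the launch playbook.

```yaml
# Play-level equivalent of the three ansible.cfg settings described above.
# Assumption: the private key path is a placeholder, adjust it to your own key.
- hosts: master
  remote_user: ec2-user            # log in as the default EC2 user
  become: true                     # escalate privileges for every task
  become_method: sudo
  become_user: root
  vars:
    ansible_ssh_private_key_file: /root/ansiblekey.pem   # hypothetical path
  tasks:
    - name: "Confirm that tasks run as root"
      command: whoami
      register: who
      changed_when: false          # read-only check, never reports a change

    - debug:
        var: who.stdout
```

Keeping these values in ansible.cfg, as the article does, applies them to every playbook, whereas play keywords only affect the play that declares them.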
a) Installing boto python library
- hosts: localhost
  gather_facts: no
  vars_files:
    - aws_cred.yml
  tasks:
    - name: "Installing boto and boto3 libraries"
      pip:
        name:
          - boto     # used by the ec2 module
          - boto3    # used by the ec2_group module
        state: present
Boto is the Amazon Web Services (AWS) SDK for Python. It enables Python developers to create, configure, and manage AWS services, such as EC2 and S3. Boto provides an easy to use, object-oriented API, as well as low-level access to AWS services.
The “aws_cred.yml” file is included in the playbook because we need the aws_access_key and aws_secret_key to provision any resource in AWS.
b) Creating a Security Group
    - name: "Security Group for both Master and Worker Nodes"
      ec2_group:
        name: MultiNodeCluster
        description: "This SG allows all traffic"
        region: ap-south-1
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        rules:
          - proto: all
            cidr_ip:
        rules_egress:
          - proto: all
c) Launching instances
    - name: "Launching Master Node"
      ec2:
        key_name: ansiblekey
        instance_type: t2.micro
        image: ami-0eeb03e72075b9bcc
        wait: true
        group: MultiNodeCluster
        count: 1
        vpc_subnet_id: subnet-0ecad70c1783cb8f6
        assign_public_ip: yes
        region: ap-south-1
        state: present
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        instance_tags:
          Name: MasterNode
      register: master_info

    - name: "Launching Worker Nodes"
      ec2:
        key_name: aws
        instance_type: t2.micro
        image: ami-0eeb03e72075b9bcc
        wait: true
        group: MultiNodeCluster
        count: 2
        vpc_subnet_id: subnet-0ecad70c1783cb8f6
        assign_public_ip: yes
        region: ap-south-1
        state: present
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        instance_tags:
          Name: WorkerNode
      register: worker_info
To launch any instance, we need the following data:
· image
· instance_type
· vpc_subnet_id
· group
· key_name
· count
· region
The instance’s metadata is stored in the variables master_info and worker_info.
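To see what those registered variables contain, a debug task can print a couple of fields for each instance. This is a small sketch that assumes it is placed in the same localhost play, after both launch tasks; it only reads public_ip and public_dns_name, the two fields the later inventory tasks rely on.

```yaml
# Prints one line per launched instance (1 master + 2 workers).
- name: "Show how each launched instance can be reached"
  debug:
    msg: "{{ item.public_ip }} ({{ item.public_dns_name }})"
  loop: "{{ master_info['instances'] + worker_info['instances'] }}"
```

Running this once is a quick way to confirm that all three instances received a public IP before the later plays try to connect to them.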
Update Inventory:
    - name: "Add MasterNode to host group"
      add_host:
        hostname: "{{ item.public_ip }}"
        groupname: master
      loop: "{{ master_info['instances'] }}"

    - name: "Add WorkerNodes to host group"
      add_host:
        hostname: "{{ item.public_ip }}"
        groupname: workers
      loop: "{{ worker_info['instances'] }}"

    - name: "Waiting for SSH on the MasterNode"
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        state: started
      loop: "{{ master_info['instances'] }}"

    - name: "Waiting for SSH on the WorkerNodes"
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        state: started
      loop: "{{ worker_info['instances'] }}"
“add_host” is an Ansible module that adds hosts to Ansible's in-memory inventory while the playbook is running. Here, “hostname” holds the public IP of each instance.
Host groups are useful when we want to apply the same configuration to multiple hosts. In the above case, the master instance is placed in a host group called “master”, and the two worker instances are placed in a host group called “workers”.
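As an illustration, the in-memory inventory built by the add_host tasks corresponds roughly to the static YAML inventory below. The IP addresses are hypothetical placeholders from a documentation range; the real ones are only known after the instances are launched, which is why the playbook adds them dynamically.

```yaml
# Static-inventory picture of what add_host builds at run time.
# The addresses are made up for illustration only.
all:
  children:
    master:
      hosts:
        203.0.113.10:
    workers:
      hosts:
        203.0.113.11:
        203.0.113.12:
```

A later play can then target `hosts: master` or `hosts: workers` exactly as it would with an inventory file.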
“wait_for” is another Ansible module; here it checks whether the instances are ready. It uses the public DNS name of each instance to check whether the SSH service has started on port 22. Once the instances accept SSH connections, the next play is executed.
Ansible Role to configure the MasterNode of K8s
1) Creating a MasterNode role
ansible-galaxy init MasterNode
2) Steps to configure the MasterNode
• Both kinds of node (master and worker) require Docker. The master node needs containers to run the control plane services, while the worker nodes need a container runtime to launch the pods.
• The next step is to install the software for the K8s setup, which is kubeadm. By default, kubeadm is not provided by the repos configured in yum, so we need to configure a yum repository for it first.
• Install kubelet, kubectl, kubeadm, and iproute-tc.
• Start the kubelet service.
• All the services of the MasterNode are available as containers, and “kubeadm” can be used to pull the required images.
• Docker by default uses a cgroup driver called “cgroupfs”, whereas kubeadm recommends the “systemd” driver, and the kubelet and the container runtime must use the same one. Hence, we must change Docker's driver to “systemd”.
• Restart the Docker service.
• Copying k8s.conf file
• Initialize the control plane with “kubeadm init”, providing all the required parameters.
• We usually have a separate client that runs the kubectl command against the master, but just for testing, we can make the master the client as well. If you run the “kubectl” command at this point, it will fail even though the kubectl software is already installed. The reason is that the client doesn’t know where the master is: it needs the address and port of the API server and the credentials for the cluster. To use the cluster as a normal user, we copy the admin kubeconfig file, which contains all of this, into the .kube directory under HOME.
• Install add-ons.
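The kubeconfig step in the list above can also be written with Ansible's file and copy modules instead of shell commands. This is a sketch of an alternative, not the tasks used in the role in this article; it assumes fact gathering is enabled for the play so that ansible_env.HOME is available, and the file modes are illustrative choices.

```yaml
- name: "Creating .kube directory"
  file:
    path: "{{ ansible_env.HOME }}/.kube"
    state: directory
    mode: "0755"

- name: "Copying admin.conf to the kubeconfig location"
  copy:
    src: /etc/kubernetes/admin.conf
    dest: "{{ ansible_env.HOME }}/.kube/config"
    remote_src: yes      # the source file is already on the master, not on the controller
    mode: "0600"         # kubeconfig holds cluster credentials
```

Unlike a shell copy, these tasks report “changed” only when the file or directory actually differs, which makes repeated runs easier to read.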
3) Configure tasks in MasterNode role
vim MasterNode/tasks/main.yml
---
# tasks file for MasterNode
- name: "Installing Docker"
  package:
    name: docker
    state: present

- name: "Starting Docker Service"
  service:
    name: "docker"
    state: started
    enabled: yes

- name: "Configuring yum for kubectl,kubeadm,kubelet programs"
  yum_repository:
    name: kubernetes
    description: Kubernetes
    baseurl: ""
    enabled: true
    gpgcheck: true
    repo_gpgcheck: true
    gpgkey: >-
    file: kubernetes

- name: "Installing kubelet,kubectl,kubeadm and iproute-tc"
  package:
    name: "{{ item }}"
    state: present
  loop:
    - kubelet
    - kubeadm
    - kubectl
    - iproute-tc

- name: "Starting kubelet program"
  service:
    name: "kubelet"
    state: started
    enabled: yes

- name: "Pulling all the required images"
  command: "kubeadm config images pull"

- name: "Changing the cgroup driver of docker to 'systemd'"
  copy:
    dest: "/etc/docker/daemon.json"
    src: "daemon.json"

- name: "Restarting Docker"
  service:
    name: "docker"
    state: restarted
    enabled: yes

- name: "Enabling iptables for bridged traffic"
  shell: "echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables"

- name: "Clearing Caches"
  shell: "echo '3' > /proc/sys/vm/drop_caches"

- name: "Copying k8s.conf file"
  copy:
    dest: "/etc/sysctl.d/k8s.conf"
    src: "k8s.conf"

- name: "Refreshing sysctl"
  shell: "sysctl --system"

- name: "Checking kubectl service"
  shell: "kubectl get nodes"
  register: status
  ignore_errors: true

- name: "Starting kubeadm service"
  shell: "kubeadm init --pod-network-cidr= --ignore-preflight-errors=NumCPU --ignore-preflight-errors=Mem --ignore-preflight-errors=Swap"
  when: status.rc == 1

- name: "Creating .kube Directory"
  file:
    path: "$HOME/.kube"
    state: directory

- name: "Copying config file"
  shell: "echo Y | cp -i /etc/kubernetes/admin.conf $HOME/.kube/config"

- name: "Setting ownership of the config file"
  shell: "chown $(id -u):$(id -g) $HOME/.kube/config"

- name: "Installing Addons"
  shell: "kubectl apply -f"
Go inside the “MasterNode” folder and then inside its “files” folder. In the files folder, create daemon.json and k8s.conf.
daemon.json:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
k8s.conf:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
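As written, the role restarts Docker on every run. A common Ansible pattern is to restart a service only when its configuration file actually changes, using a handler. The following is a sketch of that alternative, not the role's actual code; it assumes the layout created by ansible-galaxy init (tasks/main.yml and handlers/main.yml), and the handler name “Restart Docker” is hypothetical.

```yaml
# MasterNode/tasks/main.yml (excerpt)
- name: "Changing the cgroup driver of docker to 'systemd'"
  copy:
    dest: "/etc/docker/daemon.json"
    src: "daemon.json"
  notify: "Restart Docker"       # queued only if daemon.json changed

# Run queued handlers now, so Docker is restarted before kubeadm init runs.
- meta: flush_handlers
---
# MasterNode/handlers/main.yml
- name: "Restart Docker"
  service:
    name: "docker"
    state: restarted
```

Without the flush_handlers step, handlers would only run at the end of the play, which would be after the later kubeadm tasks in this role.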
Ansible Role to configure the WorkerNode
1) Creating a WorkerNode role
ansible-galaxy init WorkerNode
2) Configure the main.yml in tasks
---
# tasks file for WorkerNode
- name: "Installing Docker"
  package:
    name: docker
    state: present

- name: "Starting Docker Service"
  service:
    name: "docker"
    state: started
    enabled: yes

- name: "Configuring yum for kubectl,kubeadm,kubelet programs"
  yum_repository:
    name: kubernetes
    description: Kubernetes
    baseurl: ""
    enabled: true
    gpgcheck: true
    repo_gpgcheck: true
    gpgkey: >-
    file: kubernetes

- name: "Installing kubelet,kubectl,kubeadm and iproute-tc"
  package:
    name: "{{ item }}"
    state: present
  loop:
    - kubelet
    - kubeadm
    - kubectl
    - iproute-tc

- name: "Starting kubelet program"
  service:
    name: "kubelet"
    state: started
    enabled: yes

- name: "Pulling all the required images"
  command: "kubeadm config images pull"

- name: "Changing the cgroup driver of docker to 'systemd'"
  copy:
    dest: "/etc/docker/daemon.json"
    src: "daemon.json"

- name: "Restarting Docker"
  service:
    name: "docker"
    state: restarted
    enabled: yes

- name: "Enabling iptables for bridged traffic"
  shell: "echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables"

- name: "Clearing Caches"
  shell: "echo '3' > /proc/sys/vm/drop_caches"

- name: "Copying k8s.conf file"
  copy:
    dest: "/etc/sysctl.d/k8s.conf"
    src: "k8s.conf"

- name: "Refreshing sysctl"
  shell: "sysctl --system"
Go inside the “WorkerNode” folder and then inside its “files” folder. In the files folder, create daemon.json and k8s.conf.
daemon.json:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
k8s.conf:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
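The two kernel settings in k8s.conf can also be managed with Ansible's sysctl module rather than by copying a file and running sysctl through the shell. This is a sketch of that alternative, not part of the role above. The module names are the short forms; in newer Ansible releases they live in collections as community.general.modprobe and ansible.posix.sysctl. The modprobe task is an added assumption: these keys only exist once the br_netfilter kernel module is loaded.

```yaml
- name: "Loading the br_netfilter kernel module"
  modprobe:
    name: br_netfilter
    state: present

- name: "Letting iptables see bridged traffic"
  sysctl:
    name: "{{ item }}"
    value: "1"
    sysctl_file: /etc/sysctl.d/k8s.conf   # same file the role copies
    state: present
    reload: yes                           # apply the value immediately
  loop:
    - net.bridge.bridge-nf-call-ip6tables
    - net.bridge.bridge-nf-call-iptables
```

With this approach the values are written and applied in one idempotent task, so the separate “Refreshing sysctl” step would not be needed.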
Executing the Roles on the respective hosts
Create service.yml
- hosts: localhost
  gather_facts: no
  vars_files:
    - aws_cred.yml
  tasks:
    - name: "Installing boto and boto3 libraries"
      pip:
        name:
          - boto     # used by the ec2 module
          - boto3    # used by the ec2_group module
        state: present

    - name: "Security Group for both Master and Worker Nodes"
      ec2_group:
        name: MultiNodeCluster
        description: "This SG allows all traffic"
        region: us-east-1
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        rules:
          - proto: all
            cidr_ip:
        rules_egress:
          - proto: all
            cidr_ip:

    - name: "Launching Master Node"
      ec2:
        key_name: aws
        instance_type: t2.micro
        image: ami-047a51fa27710816e
        wait: true
        group: MultiNodeCluster
        count: 1
        vpc_subnet_id: subnet-e533d4d4
        assign_public_ip: yes
        region: us-east-1
        state: present
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        instance_tags:
          Name: MasterNode
      register: master_info

    - name: "Launching Worker Nodes"
      ec2:
        key_name: aws
        instance_type: t2.micro
        image: ami-047a51fa27710816e
        wait: true
        group: MultiNodeCluster
        count: 2
        vpc_subnet_id: subnet-e533d4d4
        assign_public_ip: yes
        region: us-east-1
        state: present
        aws_access_key: "{{ access_key }}"
        aws_secret_key: "{{ secret_key }}"
        instance_tags:
          Name: WorkerNode
      register: worker_info

    - name: "Add MasterNode to host group"
      add_host:
        hostname: "{{ item.public_ip }}"
        groupname: master
      loop: "{{ master_info['instances'] }}"

    - name: "Add WorkerNodes to host group"
      add_host:
        hostname: "{{ item.public_ip }}"
        groupname: workers
      loop: "{{ worker_info['instances'] }}"

    - name: "Waiting for SSH on the MasterNode"
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        state: started
      loop: "{{ master_info['instances'] }}"

    - name: "Waiting for SSH on the WorkerNodes"
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        state: started
      loop: "{{ worker_info['instances'] }}"

- hosts: master
  roles:
    - MasterNode
  tasks:
    - name: "Getting the join command"
      shell: "kubeadm token create --print-join-command"
      register: joincommand

    - debug:
        var: joincommand["stdout"]

    # Store the join command on a dummy host so the workers play can read it
    - add_host:
        name: "linktojoin"
        link: "{{ joincommand['stdout'] }}"

- hosts: workers
  roles:
    - WorkerNode
  tasks:
    - name: "Joining the master"
      shell: "{{ hostvars['linktojoin']['link'] }}"
Check out the video
GitHub Link:
Thank you