October 26, 2020

EKS & K8S best practices

Cluster in AWS console

  1. By itself, a cluster costs $0.10/h ($72 per 30 days)
  2. At creation it requires choosing an IAM role, a VPC, and its subnets; these can't be changed later (also check the Fargate and ELB subnet requirements)
  3. Each pod (including kube-system pods) allocates an IP in the selected VPC
  4. It creates an AWS Security Group that is automatically attached to each cluster node

NodeGroups

  • These are regular EC2 instances that are added to the cluster and maintained by an automatically created EC2 Auto Scaling group (unhealthy node replacement, multiple replicas)
  • Only the IAM role can't be changed after creation (to change EC2 instance hardware, use EC2 launch templates)
  • By default, NodeGroups use Docker 19.3.6
  • Created instances are displayed in the EC2 Instances table; they can be detached from the Auto Scaling group via its console and used for debugging

Fargate

  • By default, AWS limits each region to 500 concurrent Fargate tasks (pods)
  • Fargate runs on containerd://1.3.2
  • Port restrictions are configured in the cluster's AWS Security Group
  • Fargate pods can run only in private (NAT) subnets
  • Fargate pods don't support:
    • ports:
      - hostPort:
      # the pod gets its own IP whose open ports correspond to the running services
    • volumes:
      - hostPath:
    • kind: DaemonSet
    • securityContext:
      privileged: true
  • Limits range from a minimum of 0.25 vCPU + 512 MB to a maximum of 4 vCPU + 30 GB
  • Sharing resources between services can be achieved by creating a pod with multiple containers (all of the services will restart during a deployment rollout); a minimal sketch follows this list
  • Monitoring is available by installing metrics-server, which fetches metrics directly from the kubelet
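
A minimal sketch of the multi-container approach mentioned above; the pod name, images, and resource requests are placeholders, and Fargate picks a task size based on the summed requests:

apiVersion: v1
kind: Pod
metadata:
  name: shared-services              # placeholder name
spec:
  containers:
  - name: api                        # hypothetical first service
    image: example/api:latest        # placeholder image
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
  - name: worker                     # hypothetical second service sharing the same Fargate task
    image: example/worker:latest     # placeholder image
    resources:
      requests:
        cpu: 250m
        memory: 256Mi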

K8S on EKS

  • Persistence for pods can be achieved using EFS (native filesystem support, doesn't require any changes to running services); see the PersistentVolume sketch after this list
  • Access from the internet is mainly achieved via ALB (custom load balancers are probably available, needs research)
  • Using the default role from the tutorial automatically allows pulling images from ECR (Elastic Container Registry)
  • Secrets can be managed easily with AWS Secrets Manager paired with External Secrets
  • Kubeconfig requires aws-cli and can be built with aws eks update-kubeconfig or from the following template:
apiVersion: v1
clusters:
- name: example-cluster-name
  cluster:
    certificate-authority-data: {from AWS console}
    server: {from AWS console}
contexts:
- name: example-context-name
  context:
    cluster: example-cluster-name
    namespace: default
    user: example-user-name
kind: Config
preferences: {}
users:
- name: example-user-name
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      # optionally you can specify aws-cli profile
      # - --profile
      # - profile-name
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - example-cluster-name
      command: aws
current-context: example-context-name
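
As mentioned in the EFS bullet above, persistence can be wired up through a PersistentVolume. A minimal sketch assuming the aws-efs-csi-driver is installed in the cluster; the name, storage class, and filesystem ID are placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv                        # placeholder name
spec:
  capacity:
    storage: 5Gi                      # required field, though EFS itself is elastic
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc            # assumed storage class name for the EFS CSI driver
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678         # placeholder EFS filesystem ID from the AWS console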

ALB and networking

  • Setup
  • Annotations
  • EKS ALB supports host-based routing:
    an Ingress can route requests with different Host headers to different k8s services; this creates an AWS Target Group for each rule but still uses only a single ALB (an example Ingress is sketched after this list)
  • Trusted certificates with auto-renewal can be created in AWS Certificate Manager and attached to an EKS Ingress via a dedicated annotation
  • Additional features(details in annotations):
    • IP range access restrictions
    • HTTP -> HTTPS redirect
    • Healthcheck: path, success status codes, timeouts, thresholds, etc.
  • G Suite authorization can be implemented using Pomerium
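
A minimal Ingress sketch combining the host routing and annotation features above; the hosts, service names, CIDR, and certificate ARN are placeholders, and the annotation keys follow the ALB ingress controller documentation:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip                   # required for Fargate pods
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111111111111:certificate/placeholder
    alb.ingress.kubernetes.io/inbound-cidrs: 10.0.0.0/8         # IP range access restriction
    alb.ingress.kubernetes.io/healthcheck-path: /healthz        # placeholder healthcheck path
    alb.ingress.kubernetes.io/success-codes: '200'
spec:
  rules:
  - host: api.example.com            # each host rule gets its own Target Group behind the same ALB
    http:
      paths:
      - path: /*
        backend:
          serviceName: api-service
          servicePort: 80
  - host: admin.example.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: admin-service
          servicePort: 80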

AutoScaling

  • For Fargate, autoscaling is achieved with the standard HPA (HorizontalPodAutoscaler):
    it creates pods according to the configured conditions, and the Fargate scheduler then creates nodes for them (a minimal HPA example follows below)
  • For NodeGroups, autoscaling can be customized on the EC2 Auto Scaling page in the AWS console
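
A minimal HPA sketch; the target Deployment name and thresholds are placeholders, and CPU-based scaling assumes metrics-server is installed (see the Fargate section):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment          # placeholder, must match the Deployment being scaled
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70  # scale out when average CPU usage exceeds 70% of requests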

HELM Resources

Jenkins

To use docker-ce, Jenkins must run on a node that supports the hostPath option (like a NodeGroup node)
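
A sketch of the hostPath usage implied above, assuming the docker-ce requirement means mounting the node's Docker socket into the Jenkins pod; everything except /var/run/docker.sock is a placeholder, and in practice this would live in the Jenkins chart's pod template:

apiVersion: v1
kind: Pod
metadata:
  name: jenkins-docker-example        # placeholder name
spec:
  containers:
  - name: jenkins
    image: jenkins/jenkins:lts
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock      # works only on nodes that allow hostPath (NodeGroups, not Fargate)
      type: Socket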

Failed attempts

  • Tried configuring jenkins-agent with different setups: couldn't find a way to share scripts from the master to the agent
  • Tried to use Goldfish with HashiCorp Vault: Goldfish isn't stateful (doesn't support auto-bootstrapping)

In Plans

  • Autotests on k8s (k8s Job as the primary process + 1 Aerokube pod)
  • Migrate the backup crontab script to a k8s CronJob (a minimal sketch follows)
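
A minimal sketch of the planned crontab-to-CronJob migration; the schedule, image, and script path are placeholders:

apiVersion: batch/v1beta1              # CronJob API version current for 2020-era clusters
kind: CronJob
metadata:
  name: backup                         # placeholder name
spec:
  schedule: "0 3 * * *"                # placeholder: run daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: example/backup:latest                         # placeholder image containing the existing script
            command: ["/bin/sh", "-c", "/scripts/backup.sh"]     # placeholder script path
          restartPolicy: OnFailure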