Categories
web development

The Anatomy of a URL

To myself… You cannot truly call yourself a web guru if you can’t even explain the parts of a URL.

A URL is one of the fundamentals to learn in web development. It stands for uniform resource locator. The browser uses a URL to make a request to a server for some resource. Without URLs, it would be messy, if not impossible, to locate a resource in the vast ocean of the internet.

Components

Basically, a URL is composed of five parts:

scheme   identifies the protocol to be used to access the resource
host     the host name that holds the resource
port     the port on which the resource is served (https is 443, http is 80)
path     the specific resource on the host (a host can have multiple resources)
query    filter(s) to retrieve the desired resource

scheme://host:port/path?query
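
To make these parts concrete, here is a quick sketch using Python’s urllib.parse (the example URL is made up):

from urllib.parse import urlparse

url = "https://example.com:443/products/shoes?color=blue&size=9"
parts = urlparse(url)

print(parts.scheme)    # https
print(parts.hostname)  # example.com
print(parts.port)      # 443
print(parts.path)      # /products/shoes
print(parts.query)     # color=blue&size=9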

Illustration

I have included other nuances of a URL for my own reference 😃

Categories
DevOps networking

What Is a Proxy Server?

To myself…

A proxy server is typically a gateway that sits between two entities involved in a request-response model.

Forward proxy

Before getting to specifics, there are two types of proxies used on the web. The first type is the forward proxy server. It sits between a client (usually a browser controlled by a user) and an external network, in this case the internet.

Usually, a forward proxy is used to limit the client from accessing certain websites, or even to block malicious ones. Let’s take the example of a university setting with students using the school wifi. If they try to hit certain websites such as Facebook, the proxy server has the ability to block the requests. (sidetrack: thus making students more focused on their studies rather than watching cute puppies and cats in their feed – just kidding 😆)

A forward proxy is also used to reach servers that are only accessible from certain geographic locations. For example, some TV series are only available in the UK. If you’re in another region, you can use a VPN server sitting in the UK to make the request on your behalf. It tricks the media servers into thinking you are located in the UK even though you’re not. Cool, right? 🙂

Reverse Proxy

Now that you have a big-picture understanding of the forward proxy, let us discuss the second type, the reverse proxy. This type of proxy sits between numerous private resources and an outside network full of clients.

Again, to be more specific, let’s say the outside network is the internet, with users using their browsers to access your servers. If your servers contain private data that is only accessible by users who have an account on your website, you will not expose these servers directly to the internet. Typically, you will place a proxy in front of your servers to receive requests from all these public clients, validate them, and return responses based on their credentials.

Validating credentials isn’t the only thing a reverse proxy is meant to do, though. A well-known function of a reverse proxy is load balancing. Let’s take for example an e-commerce website. If your website is popular and millions of people interact with it daily, having one server to serve all the requests will likely lead to slow response times. In most cases, that’s a bad user experience which may lead to users finding better alternatives – you don’t want to lose your customers, right? 😵‍💫 So what you do is spin up multiple servers and balance the incoming requests across them (I won’t discuss the algorithms here).
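
To make the idea concrete, here is a minimal, illustrative round-robin reverse proxy in Python. The backend addresses are made up, and a real deployment would use a battle-tested proxy such as nginx or HAProxy instead:

import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical backends; in practice these are your application servers.
BACKENDS = itertools.cycle(["http://localhost:8001", "http://localhost:8002"])

class Proxy(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(BACKENDS)  # round robin: pick the next server in rotation
        with urlopen(backend + self.path) as resp:  # forward the request
            body = resp.read()
            status = resp.status
        self.send_response(status)  # relay the backend's response to the client
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("", 8080), Proxy).serve_forever()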

Wrap up

In this short article, I’ve discussed very briefly what a proxy server is and its two types. The forward proxy server works to protect or limit the client from accessing the external network (usually the internet). The reverse proxy server, on the other hand, works to protect your servers and balance their workloads as public clients make requests to retrieve information.

Categories
DevOps software architecture

Running PostgreSQL in Kubernetes

To myself…

Gone are the days when we wrote long scripts to provision and run our databases on an on-premise server – that is, for most cases that don’t need to comply with a lot of regulatory policies. It’s worth looking at how we can deploy a database in the cloud in just a few lines of code.

Steps

Install a Kubernetes operator in a cloud-based VM

Command

helm repo add postgres-operator https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator
helm install postgres-operator postgres-operator/postgres-operator

Output

/workspace $ helm repo add postgres-operator https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator
"postgres-operator" has been added to your repositories

/workspace $ helm install postgres-operator postgres-operator/postgres-operator
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
NAME: postgres-operator
LAST DEPLOYED: Wed Sep 15 02:44:28 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that postgres-operator has started, run:

  kubectl --namespace=default get pods -l "app.kubernetes.io/name=postgres-operator"

Great. The PostgreSQL operator is now deployed in the Kubernetes default namespace.

To check if it’s running,

/workspace $ kubectl get pods -l "app.kubernetes.io/name=postgres-operator"
NAME                                READY   STATUS    RESTARTS   AGE
postgres-operator-978857b4d-z6g88   1/1     Running   0          10m

Install the admin dashboard (optional)

I won’t discuss how to get into the dashboard. I’ve added this section so I can refer back to this article if I ever need to configure a dashboard for PostgreSQL.

Command

helm repo add postgres-operator-ui https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator-ui
helm install postgres-operator-ui postgres-operator-ui/postgres-operator-ui --set service.type="NodePort" --set service.nodePort=31255

Output

/workspace $ helm repo add postgres-operator-ui https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator-ui
"postgres-operator-ui" has been added to your repositories

/workspace $ helm install postgres-operator-ui postgres-operator-ui/postgres-operator-ui --set service.type="NodePort" --set service.nodePort=31255
NAME: postgres-operator-ui
LAST DEPLOYED: Wed Sep 15 02:57:51 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that postgres-operator has started, run:

  kubectl --namespace=default get pods -l "app.kubernetes.io/name=postgres-operator-ui"

Check if it’s running

/workspace $ kubectl get pods -l "app.kubernetes.io/name=postgres-operator-ui"
NAME                                    READY   STATUS    RESTARTS   AGE
postgres-operator-ui-6b4dd8cfbb-gvkqb   1/1     Running   0          90s

Verify that we have installed the PostgreSQL operator and UI

Command

kubectl get pods,services,deployments,replicasets

Output

/workspace $ kubectl get pods,services,deployments,replicasets
NAME                                        READY   STATUS              RESTARTS   AGE
pod/postgres-operator-ui-6b4dd8cfbb-6lrd2   0/1     ContainerCreating   0          19s
pod/postgres-operator-978857b4d-qflvs       1/1     Running             0          32s

NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/kubernetes             ClusterIP   10.43.0.1       <none>        443/TCP        46s
service/postgres-operator      ClusterIP   10.43.145.110   <none>        8080/TCP       35s
service/postgres-operator-ui   NodePort    10.43.7.84      <none>        80:31255/TCP   19s

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/postgres-operator-ui   0/1     1            0           19s
deployment.apps/postgres-operator      1/1     1            1           35s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/postgres-operator-ui-6b4dd8cfbb   1         1         0       19s
replicaset.apps/postgres-operator-978857b4d       1         1         1       32s

Describe how the database server should be created

Add the following configuration to the /workspace/db.yaml file

/workspace $ ls
db.yaml  metallb-config

/workspace $ cat db.yaml
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: dataops-bootcamp-cluster
  namespace: default
spec:
  teamId: "dataops-bootcamp"
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    dataops:  # database owner
    - superuser
    - createdb
    learner_user: []  # role for application foo
  databases:
    dataops: learner  # dbname: owner
  postgresql:
    version: "12"

Apply the configuration to the cluster

/workspace $ kubectl apply -f /workspace/db.yaml 
postgresql.acid.zalan.do/dataops-bootcamp-cluster created

Get the status of the PostgreSQL resource. It should show Running.

/workspace $ kubectl get postgresql --watch
NAME                       TEAM               VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE   STATUS
dataops-bootcamp-cluster   dataops-bootcamp   12        2      1Gi                                     53s   Running

Notice that it has 2 pods because we declared numberOfInstances: 2 in the configuration file.

Let’s view the details of the cluster

/workspace $ kubectl describe postgresql
Name:         dataops-bootcamp-cluster
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  acid.zalan.do/v1
Kind:         postgresql
Metadata:
  Creation Timestamp:  2021-09-15T03:09:13Z
  Generation:          1
  Managed Fields:
    API Version:  acid.zalan.do/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:databases:
          .:
          f:dataops:
        f:numberOfInstances:
        f:postgresql:
          .:
          f:version:
        f:teamId:
        f:users:
          .:
          f:dataops:
          f:learner_user:
        f:volume:
          .:
          f:size:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2021-09-15T03:09:13Z
    API Version:  acid.zalan.do/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:PostgresClusterStatus:
    Manager:         postgres-operator
    Operation:       Update
    Time:            2021-09-15T03:09:13Z
  Resource Version:  1045
  Self Link:         /apis/acid.zalan.do/v1/namespaces/default/postgresqls/dataops-bootcamp-cluster
  UID:               96b5b159-bf07-413f-8ff8-5a2204998a07
Spec:
  Databases:
    Dataops:            learner
  Number Of Instances:  2
  Postgresql:
    Version:  12
  Team Id:    dataops-bootcamp
  Users:
    Dataops:
      superuser
      createdb
    learner_user:
  Volume:
    Size:  1Gi
Status:
  Postgres Cluster Status:  Running
Events:
  Type    Reason       Age    From               Message
  ----    ------       ----   ----               -------
  Normal  Create       3m24s  postgres-operator  Started creation of new cluster resources
  Normal  Endpoints    3m24s  postgres-operator  Endpoint "default/dataops-bootcamp-cluster" has been successfully created
  Normal  Services     3m24s  postgres-operator  The service "default/dataops-bootcamp-cluster" for role master has been successfully created
  Normal  Services     3m24s  postgres-operator  The service "default/dataops-bootcamp-cluster-repl" for role replica has been successfully created
  Normal  Secrets      3m23s  postgres-operator  The secrets have been successfully created
  Normal  StatefulSet  3m23s  postgres-operator  Statefulset "default/dataops-bootcamp-cluster" has been successfully created
  Normal  StatefulSet  2m32s  postgres-operator  Pods are ready
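
Once the cluster reports Running, you can connect to it like any other PostgreSQL server. Here is a hedged sketch using psycopg2; it assumes you have port-forwarded the cluster’s master service to localhost (kubectl port-forward svc/dataops-bootcamp-cluster 5432:5432) and fetched the dataops user’s password from the secret the operator created (run kubectl get secrets to find its exact name):

import psycopg2

# The password below is a placeholder: read it from the operator-created secret.
conn = psycopg2.connect(
    host="localhost",
    port=5432,
    user="dataops",
    password="<password from the operator-created secret>",
    dbname="dataops",
)
with conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone())
conn.close()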

Wrap up

In this scenario, we looked into how we can install PostgreSQL in the cloud with ease. We used Helm to install the Kubernetes PostgreSQL operator and UI. Then we created a k8s config file to declare 2 instances of the database. Finally, we instructed k8s with a simple apply command to spin up the 2 stateful pods based on the config file.

Categories
DevOps

Running Kubernetes Cluster in Local Machine

To myself… May you utilise minikube to the fullest so you can master Kubernetes in no time.

Kubernetes clusters in production often don’t run on a single machine. The cluster is often distributed across different machines to handle the high workloads of microservices in a horizontally scaling fashion. However, before even deploying such distributed apps and services, it is always a good idea to run them in a testing environment. Kubernetes provides a tool called minikube that allows testing a Kubernetes cluster on a local machine.

What is minikube?

minikube is a piece of software that creates a single-node cluster housing both the master processes and the worker processes. It makes it possible to run a Kubernetes cluster on one machine.

Installing minikube

minikube requires two things to operate. First, a hypervisor, to be able to run the cluster in a virtual environment on a physical machine. Second, an interface to configure the cluster; in Kubernetes lingo, this tool is called kubectl.

To install minikube on macOS, run the following commands

brew update
brew install hyperkit # this is the hypervisor
brew install minikube # it will also install kubectl

Starting minikube

minikube start --vm-driver=hyperkit

The command should log

😄  minikube v1.22.0 on Darwin 12.0
✨  Using the hyperkit driver based on user configuration
💾  Downloading driver docker-machine-driver-hyperkit:
    > docker-machine-driver-hyper...: 65 B / 65 B [----------] 100.00% ? p/s 0s
    > docker-machine-driver-hyper...: 10.52 MiB / 10.52 MiB  100.00% 25.21 MiB
🔑  The 'hyperkit' driver requires elevated permissions. The following commands will be executed:

    $ sudo chown root:wheel /Users/stella/.minikube/bin/docker-machine-driver-hyperkit
    $ sudo chmod u+s /Users/stella/.minikube/bin/docker-machine-driver-hyperkit


Password:
💿  Downloading VM boot image ...
    > minikube-v1.22.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
    > minikube-v1.22.0.iso: 242.95 MiB / 242.95 MiB [ 100.00% 20.09 MiB p/s 12s
👍  Starting control plane node minikube in cluster minikube
💾  Downloading Kubernetes v1.21.2 preload ...
    > preloaded-images-k8s-v11-v1...: 502.14 MiB / 502.14 MiB  100.00% 20.41 Mi
🔥  Creating hyperkit VM (CPUs=2, Memory=4000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.21.2 on Docker 20.10.6 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Verify that it works by issuing some commands

kubectl get nodes
kubectl get pods # by default, there are no pods yet
kubectl get services

Creating a pod

In Kubernetes, even though pods are the smallest unit of a cluster, they are usually not created directly using kubectl. Instead, an abstraction layer manages them: a Deployment manages a ReplicaSet, which in turn manages the pods directly.

To create a deployment, create a yaml file like the one below.

apiVersion: apps/v1 # api group is apps
kind: Deployment    # `kind`, not `type`
metadata:
  name: nginx-deployment  # example name; pick your own
spec:
  replicas: 2             # how many pod replicas to run
  selector:
    matchLabels:
      app: nginx          # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:                 # the pod spec is nested under `template`
      containers:
      - name: nginx
        image: nginx:1.21 # example image
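
Assuming the file is saved as deployment.yaml, apply it with kubectl apply -f deployment.yaml and watch the pods come up with kubectl get pods.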

Categories
software design

Why Create a Design Document?

To myself… May you stop the itch of writing a single line of code without thinking through the intricacies of your feature.

As software engineers, we build features with varying complexities. A submit button to send a plain-text message is a feature. A call button to connect with someone over the internet is a more complex feature.

In order to build features that are adaptable to changing requirements, whether functional or non-functional, they need to be designed properly.

Software design is a social process

Design, more specifically successful software design, is a very collaborative process. The image of a guy in a hoodie sitting in one corner and churning out code like a pro is a false representation of actual software design.

One proven way to achieve successful software design is through a design document. Google has been an advocate of this artefact – often referred to as a design doc.

The purpose of design doc

The purpose of a design document is mainly threefold:

  1. To list the contentious parts of the design
  2. To highlight alternative solutions to the design
  3. To open the discussions on how the feature should be implemented

Purpose #1: List contentious components

Contentious design components are those that are likely to be debated by other engineers. One example: how should the UI component fetch its data from the backend? Is it by a single fetch for one id holding all the desired data? Is it by a single fetch for an array of ids, subsequently fetching per id and collating the responses later? Or something else?

Without getting into an agreement on how to implement portions of the design, it’s easy to be bombarded with negative feedback after all the hard work of making a piece of a feature work. A design doc helps alleviate these types of circumstances by getting buy-in before you even start coding.

Purpose #2: Highlight better solutions

Two heads are likely better than one. It’s always good to have someone review your work and check if you have overlooked something. With a design doc, you allow other engineers to take a peek at your strategy and create threaded discussions on parts of your design.

You can easily gather suggestions and get insights from the most senior engineers. Through this, you can eliminate inefficient and ineffective patterns (a.k.a. antipatterns) just by leveraging the experience they have gained over the years.

Purpose #3: Discuss feature implementation

Let’s take an example of API endpoints. Too often, inexperienced developers or engineers (as they call themselves) will start implementing immediately, only to find out that the logic is wrong or end up with spaghetti code.

Now, there’s a lot of time wasted in rewriting or throwing away code – problems which should have been caught earlier, in the design phase, when reviewed by other equally, if not more, talented engineers.

To be continued…

Categories
programming

Watch Out When Your Code Seems Scattered

To myself… May you be extra careful when using helpers in your code.

Developers without a lot of experience building large systems tend to use utils or helpers functions and directories.

Be wary when your helper code seems to be arbitrarily pushed out everywhere. It’s usually a sign of weak cohesion, or of no sense of purpose when the system was designed.
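
A purely hypothetical sketch of the smell (all module and function names are made up): a catch-all helpers module versus code grouped by purpose.

# helpers.py – the smell: unrelated functions dumped together with no shared purpose
def format_price(cents: int) -> str: ...
def retry_request(fn): ...
def parse_csv_row(row: str) -> list: ...

# Better: group code by purpose so each module has one reason to change, e.g.
#   pricing.py     -> format_price
#   http_client.py -> retry_request
#   csv_import.py  -> parse_csv_row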

Categories
self development

Clarifying Your Values

To myself… May I illuminate what’s important to me so it can serve as my guide in my daily activities – aligning what I say with what I do.

Definition of Values

Values are beliefs. They are stable and enduring beliefs that have been formed during one’s formative years. Values are what one believes to be good, right and worthwhile; what one believes to be the behaviour that is desirable to achieve what is purposeful.

Guiding Questions

What is important to me?

How would I like to be remembered in my eulogy?

How do I want my family and friends to remember me?

What do I want my colleagues to say when they think of me?

How would I like to be remembered in the communities I’m part of?

Examples of Values

Tell the truth.

Come up with at least two ways to solve a problem.

Notice that values are NOT nouns? Yes. They are NOT nouns. They are verbs, because they have to be doable. You can’t walk into a room and say, everyone, be more of a problem solver than a complainer. Instead, what’s more actionable is: generate more than one solution to a problem you encounter each day.

Categories
Azure Cloud

Provisioning is Not the Same As Configuration

To myself… May you use this term in the right context so you don’t confuse people.

What is Provisioning?

Provisioning is the process of setting up infrastructure. In generic terms, it is bringing resources to life to make them available to users and systems.

Example: Provisioning a load balancer means setting up a load balancer within a certain cluster of services.

What is Configuration?

Configuration is the process of bringing resources into a desired state for building and maintenance.

Example: Configuring a load balancer means telling it how it should route requests to the right services within a certain cluster.

Provision Then Configure

You can’t configure something that doesn’t exist, right? 😛 As the heading says, once a resource is provisioned, the next step is to tell it how to behave – that is, configuration.
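
A minimal, purely illustrative Python sketch of that ordering (the LoadBalancer class and its routes are made up; real provisioning goes through your cloud provider’s tooling):

from dataclasses import dataclass, field

@dataclass
class LoadBalancer:
    name: str
    routes: dict = field(default_factory=dict)

    def configure(self, routes: dict) -> None:
        # Configuration: put an existing resource into a desired state.
        self.routes = routes

# Provisioning: bring the resource to life first...
lb = LoadBalancer(name="web-lb")

# ...then configuration: tell it how to route requests.
lb.configure({"/api": "api-service", "/": "web-service"})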

A good thing to note is that both provisioning and configuration are part of the deployment process of the software development workflow 🙂

Categories
networking programming

What is Serialisation?

To myself…

Definition

Serialisation, in the computing domain, is the process of converting a data structure or object state into a format for storing (as in a file) or transmitting (as over a network), often for later use. Often, some time after serialisation, the information is reconstructed for other purposes.

Example in the context of networking

When we want to transmit data over the internet, the source representation of the data will usually not be compatible with the destination, for various reasons: programming languages store data in different locations in the memory of the host computer, and execution environments use object representations that aren’t suitable for network transport. Hence, what we do as brilliant engineers is serialise the data so that, at the other end of the wire, a predefined system can reconstruct the data (deserialisation) based on its own set of rules and utilise it for some purpose.
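
A minimal sketch in Python using JSON as the serialisation format (the data is made up):

import json

# Serialise: convert an in-memory object into a format fit for storage or transport.
profile = {"name": "Ada", "languages": ["python", "go"]}
payload = json.dumps(profile)  # a plain string, ready to write to a file or socket

# Deserialise: reconstruct the object at the other end.
restored = json.loads(payload)
assert restored == profile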

Categories
Azure Cloud Internet of Things

Getting to Know Azure IoT Device Twin

To myself…

A Brief History of Time

Before the idea of the device twin, it was historically difficult to query the information of devices in the field. A lot of network hops, or even database synchronisation, could be involved in retrieving this information. But looking closer at the problem, most of the use cases are to show the information in a dashboard for monitoring or controlling certain assets of an organisation – there’s no hard real-time requirement. Hence, with this clarity about the problem, the device twin was born in Azure.

With device twins, the backend services can interact with Azure IoT Hub to query the metadata and data of each device. This capability enables scenarios like device reporting on dashboards or monitoring long-running jobs across many devices.

It is important to note that because device twin updates are asynchronous in nature, it is not guaranteed that the values you get from a query are real-time. What you will actually read are the last reported values. Most organisations will accept this latency for the majority of their use cases.

What is Device Twin?

A device twin is a JSON document that contains device-specific information. The document has a size limit of 8KB and a maximum depth of 5 levels. The format is:

"identity": {
  ...
},
"tags": {
   ...
},
"properties": {
   "desired": {
      "status": <>,
      "$version": <>,
   },
   "reported": {
      "status": <>,
      "$version": <>,
   },
},

Example of Device Twin JSON File Content

{
   "deviceId":"device-2222",
   "etag":"AAAAAAAAAAc=",
   "status":"enabled",
   "statusReason":"provisioned",
   "statusUpdateTime":"0001-01-01T00:00:00",
   "connectionState":"connected",
   "lastActivityTime":"2015-02-30T16:24:48.789Z",
   "cloudToDeviceMessageCount":0,
   "authenticationType":"sas",
   "x509Thumbprint":{
      "primaryThumbprint":null,
      "secondaryThumbprint":null
   },
   "version":2,
   "tags":{
      "$etag":"123",
      "deploymentLocation":{
         "building":"43",
         "floor":"1"
      }
   },
   "properties":{
      "desired":{
         "telemetryConfig":{
            "sendFrequency":"5m"
         },
         "$metadata":{
            "..."
         },
         "$version":1
      },
      "reported":{
         "telemetryConfig":{
            "sendFrequency":"5m",
            "status":"success"
         }"batteryLevel":55,
         "$metadata":{
            "..."
         },
         "$version":4
      }
   }
}

How is Device Twin Created?

<later>

How Does Device Twin Work?

<insert diagram here… later>

Field Permissions During Device Twin Interactions

field                 backend service   device
identity              r                 none
tags                  rw                none
desired properties    rw                r
reported properties   r                 rw
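
To illustrate the device side of this table, here is a hedged sketch using the azure-iot-device Python SDK (the connection string is a placeholder, and the twin fields follow the example document above); the device reads desired properties and writes reported ones:

from azure.iot.device import IoTHubDeviceClient

client = IoTHubDeviceClient.create_from_connection_string("<device connection string>")
client.connect()

# The device can READ desired properties (assuming telemetryConfig has been set)...
twin = client.get_twin()
send_frequency = twin["desired"]["telemetryConfig"]["sendFrequency"]

# ...and WRITE reported properties, but it cannot touch tags or identity.
client.patch_twin_reported_properties({
    "telemetryConfig": {"sendFrequency": send_frequency, "status": "success"},
    "batteryLevel": 55,
})

client.shutdown()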

to be continued…