To myself… You cannot truly call yourself a web guru if you can’t even explain the parts of a URL.
The URL is one of the fundamentals to learn in web development. It stands for Uniform Resource Locator. The browser uses a URL to make a request to a server for some resource. Without URLs, it would be messy, if not impossible, to locate a resource in the vast ocean of the internet.
Components
Basically, a URL is composed of five parts:
scheme
identifies the protocol to be used to access the resource
host
the host name that holds the resource
port
the port on which the resource is served (443 for https, 80 for http)
path
the specific resource in the host (a host can have multiple resources)
query
filters to narrow down the desired resource
scheme://host:port/path?query
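To make this concrete, here is a quick sketch (my own, not part of any spec) using Python’s standard `urllib.parse` to split an example URL – the host and query are made up:

```python
from urllib.parse import urlsplit

# A hypothetical URL with all five parts present.
url = "https://shop.example.com:443/products/shoes?color=red&size=9"

parts = urlsplit(url)
print(parts.scheme)    # scheme -> "https"
print(parts.hostname)  # host   -> "shop.example.com"
print(parts.port)      # port   -> 443
print(parts.path)      # path   -> "/products/shoes"
print(parts.query)     # query  -> "color=red&size=9"
```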
Illustration
I have included other nuances of a URL for my own reference 😃
A proxy server is typically a gateway that sits between two entities involved in a request-response model.
Forward proxy
Before getting into specifics, there are two types of proxies used on the World Wide Web. The first type is the forward proxy server. It sits between a client (usually a browser controlled by a user) and an external network, in this case the internet.
Usually, a forward proxy is used to limit the client from accessing certain websites, or even to block malicious ones. Let’s take the example of a university setting with students using the school wifi. If they try to hit certain websites such as Facebook, the proxy server has the ability to block the requests. (sidetrack: thus making students more focused on their studies rather than watching cute puppies and cats in their feed – just kidding 😆)
A forward proxy is also used to reach servers that are only accessible from certain geographic locations. For example, some TV series are only available in the UK. If you’re in another region, you can use a VPN server located in the UK to make the request on your behalf, tricking the media servers into thinking you are located in the UK even though you’re not. Cool, right? 🙂
Reverse Proxy
Now that you have a big-picture understanding of the forward proxy, let us discuss the second type, the reverse proxy. This type of proxy sits between numerous private resources and an outside network full of clients.
Again, to be more specific, let’s say the outside network is the internet, with users using their browsers to access your servers. If your servers contain private data that is accessible only by users who have an account on your website, you will not expose these servers directly to the internet. Typically, you will place a proxy in front of them to receive and validate requests from these public clients and return responses based on their credentials.
Also, validating credentials isn’t the only thing a reverse proxy is meant to do. A well-known function of a reverse proxy is load balancing. Let’s take for example an e-commerce website. If your website is popular and millions of people are interacting with it daily, having one server to serve all the requests will likely lead to slow response times. In most cases, that’s a bad user experience which may lead to users finding better alternatives – you don’t want to lose your customers, right? 😵💫 So what you do is spin up multiple servers and balance (I won’t discuss the algorithms here) the requests coming in to your website.
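As a toy illustration (not how production load balancers are implemented), round robin – one of the simplest balancing strategies – can be sketched in a few lines of Python; the server names are made up:

```python
from itertools import cycle

# Hypothetical pool of identical backend servers.
servers = ["server-1", "server-2", "server-3"]

# Round robin: hand each incoming request to the next server in the cycle.
next_server = cycle(servers)

for request_id in range(6):
    print(f"request {request_id} -> {next(next_server)}")
```

After exhausting the pool, the cycle wraps around, so the load spreads evenly across the servers.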
Wrap up
In this short article, I’ve discussed very briefly what a proxy server is and its two types. The forward proxy server works to protect or limit the client from accessing the external network (usually the internet). The reverse proxy server, on the other hand, works to protect the servers in your network from the public clients making requests and to balance the workload across them.
Gone are the days when we wrote long scripts to provision and run our databases on an on-premises server – at least for most cases that don’t need to comply with a lot of regulatory policies. It’s worth looking at how we can deploy a database in the cloud in just a few lines of code.
/workspace $ helm repo add postgres-operator https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator
"postgres-operator" has been added to your repositories
/workspace $ helm install postgres-operator postgres-operator/postgres-operator
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
manifest_sorter.go:192: info: skipping unknown hook: "crd-install"
NAME: postgres-operator
LAST DEPLOYED: Wed Sep 15 02:44:28 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that postgres-operator has started, run:
kubectl --namespace=default get pods -l "app.kubernetes.io/name=postgres-operator"
Great. The PostgreSQL operator is now deployed in the Kubernetes default namespace.
To check if it’s running:
/workspace $ kubectl get pods -l "app.kubernetes.io/name=postgres-operator"
NAME READY STATUS RESTARTS AGE
postgres-operator-978857b4d-z6g88 1/1 Running 0 10m
Install the admin dashboard (optional)
I won’t discuss how to get into the dashboard. I’ve added this so I can refer to this article in the future if I need to configure a dashboard for PostgreSQL.
/workspace $ helm repo add postgres-operator-ui https://raw.githubusercontent.com/zalando/postgres-operator/master/charts/postgres-operator-ui
"postgres-operator-ui" has been added to your repositories
/workspace $ helm install postgres-operator-ui postgres-operator-ui/postgres-operator-ui --set service.type="NodePort" --set service.nodePort=31255
NAME: postgres-operator-ui
LAST DEPLOYED: Wed Sep 15 02:57:51 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
To verify that postgres-operator has started, run:
kubectl --namespace=default get pods -l "app.kubernetes.io/name=postgres-operator-ui"
Check if it’s running
/workspace $ kubectl get pods -l "app.kubernetes.io/name=postgres-operator-ui"
NAME READY STATUS RESTARTS AGE
postgres-operator-ui-6b4dd8cfbb-gvkqb 1/1 Running 0 90s
Verify that we have installed the PostgreSQL operator and UI
Command
kubectl get pods,services,deployments,replicasets
Output
/workspace $ kubectl get pods,services,deployments,replicasets
NAME READY STATUS RESTARTS AGE
pod/postgres-operator-ui-6b4dd8cfbb-6lrd2 0/1 ContainerCreating 0 19s
pod/postgres-operator-978857b4d-qflvs 1/1 Running 0 32s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 46s
service/postgres-operator ClusterIP 10.43.145.110 <none> 8080/TCP 35s
service/postgres-operator-ui NodePort 10.43.7.84 <none> 80:31255/TCP 19s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/postgres-operator-ui 0/1 1 0 19s
deployment.apps/postgres-operator 1/1 1 1 35s
NAME DESIRED CURRENT READY AGE
replicaset.apps/postgres-operator-ui-6b4dd8cfbb 1 1 0 19s
replicaset.apps/postgres-operator-978857b4d 1 1 1 32s
Describe how the database server should be created
Add the following configuration in the /workspace/db.yaml file
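The original manifest isn’t shown here; the sketch below is reconstructed from the `kubectl describe postgresql` output further down, following the Zalando operator’s manifest format, so treat the exact values as illustrative:

```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: dataops-bootcamp-cluster
spec:
  teamId: dataops-bootcamp
  numberOfInstances: 2       # two pods: one master, one replica
  postgresql:
    version: "12"
  volume:
    size: 1Gi
  users:
    dataops:                 # role with elevated privileges
      - superuser
      - createdb
    learner_user: []         # plain login role
  databases:
    dataops: learner         # database name: owner role
```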
/workspace $ kubectl apply -f /workspace/db.yaml
postgresql.acid.zalan.do/dataops-bootcamp-cluster created
Get the status of the PostgreSQL resource. It should show Running.
/workspace $ kubectl get postgresql --watch
NAME TEAM VERSION PODS VOLUME CPU-REQUEST MEMORY-REQUEST AGE STATUS
dataops-bootcamp-cluster dataops-bootcamp 12 2 1Gi 53s Running
Notice that it has 2 pods because we declared numberOfInstances: 2 in the configuration file.
Let’s view the details of the cluster
/workspace $ kubectl describe postgresql
Name: dataops-bootcamp-cluster
Namespace: default
Labels: <none>
Annotations: <none>
API Version: acid.zalan.do/v1
Kind: postgresql
Metadata:
Creation Timestamp: 2021-09-15T03:09:13Z
Generation: 1
Managed Fields:
API Version: acid.zalan.do/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:spec:
.:
f:databases:
.:
f:dataops:
f:numberOfInstances:
f:postgresql:
.:
f:version:
f:teamId:
f:users:
.:
f:dataops:
f:learner_user:
f:volume:
.:
f:size:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2021-09-15T03:09:13Z
API Version: acid.zalan.do/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:PostgresClusterStatus:
Manager: postgres-operator
Operation: Update
Time: 2021-09-15T03:09:13Z
Resource Version: 1045
Self Link: /apis/acid.zalan.do/v1/namespaces/default/postgresqls/dataops-bootcamp-cluster
UID: 96b5b159-bf07-413f-8ff8-5a2204998a07
Spec:
Databases:
Dataops: learner
Number Of Instances: 2
Postgresql:
Version: 12
Team Id: dataops-bootcamp
Users:
Dataops:
superuser
createdb
learner_user:
Volume:
Size: 1Gi
Status:
Postgres Cluster Status: Running
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Create 3m24s postgres-operator Started creation of new cluster resources
Normal Endpoints 3m24s postgres-operator Endpoint "default/dataops-bootcamp-cluster" has been successfully created
Normal Services 3m24s postgres-operator The service "default/dataops-bootcamp-cluster" for role master has been successfully created
Normal Services 3m24s postgres-operator The service "default/dataops-bootcamp-cluster-repl" for role replica has been successfully created
Normal Secrets 3m23s postgres-operator The secrets have been successfully created
Normal StatefulSet 3m23s postgres-operator Statefulset "default/dataops-bootcamp-cluster" has been successfully created
Normal StatefulSet 2m32s postgres-operator Pods are ready
Wrap up
In this scenario, we looked into how we can install PostgreSQL in the cloud with ease. We used helm to install a Kubernetes PostgreSQL Operator and UI. Then we created a k8s config file to declare 2 instances of the database. Finally, we instructed k8s with a simple apply command to spin up the 2 stateful pods based on the config file.
To myself… May you utilise minikube to the fullest so you can master Kubernetes in no time.
Kubernetes clusters in production often don’t run on a single machine. The cluster is usually distributed across different machines to handle the high workloads of microservices in a horizontally scaling fashion. However, before deploying such distributed apps and services, it is always a good idea to run them in a testing environment. Kubernetes provides a tool called minikube that allows testing a Kubernetes cluster on a local machine.
What is minikube?
minikube is a piece of software that creates a single-node cluster to house both the master processes and the worker processes. It makes it possible to run a Kubernetes cluster on one machine.
Installing minikube
minikube requires two things to operate. First, a hypervisor, to be able to run the cluster in a virtual environment on a physical machine. Second, an interface to configure the cluster; in Kubernetes lingo, this tool is called kubectl.
To install minikube on macOS, run the following commands:
brew update
brew install hyperkit # this is the hypervisor
brew install minikube # it will also install kubectl
Starting minikube
minikube start --vm-driver=hyperkit
The command should log
😄 minikube v1.22.0 on Darwin 12.0
✨ Using the hyperkit driver based on user configuration
💾 Downloading driver docker-machine-driver-hyperkit:
> docker-machine-driver-hyper...: 65 B / 65 B [----------] 100.00% ? p/s 0s
> docker-machine-driver-hyper...: 10.52 MiB / 10.52 MiB 100.00% 25.21 MiB
🔑 The 'hyperkit' driver requires elevated permissions. The following commands will be executed:
$ sudo chown root:wheel /Users/stella/.minikube/bin/docker-machine-driver-hyperkit
$ sudo chmod u+s /Users/stella/.minikube/bin/docker-machine-driver-hyperkit
Password:
💿 Downloading VM boot image ...
> minikube-v1.22.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
> minikube-v1.22.0.iso: 242.95 MiB / 242.95 MiB [ 100.00% 20.09 MiB p/s 12s
👍 Starting control plane node minikube in cluster minikube
💾 Downloading Kubernetes v1.21.2 preload ...
> preloaded-images-k8s-v11-v1...: 502.14 MiB / 502.14 MiB 100.00% 20.41 Mi
🔥 Creating hyperkit VM (CPUs=2, Memory=4000MB, Disk=20000MB) ...
🐳 Preparing Kubernetes v1.21.2 on Docker 20.10.6 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
Verify that it works by issuing some commands:
kubectl get nodes
kubectl get pods # by default, there are no pods yet
kubectl get services
Creating a pod
In Kubernetes, even though pods are the smallest unit of a cluster, they are typically not created directly using kubectl. There’s an abstraction layer in place that avoids the direct creation of bare pods. This layer is called a Deployment, and it manages a ReplicaSet, which in turn manages the pods directly.
To create a deployment, create a YAML file; a full example is referenced here.
apiVersion: apps/v1 # api group is apps
kind: Deployment
metadata:
  name: nginx-deployment # illustrative name; nginx used as an example app
spec:
  replicas: 2 # how many pod replicas the ReplicaSet should keep running
  selector:
    matchLabels:
      app: nginx # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
To myself… May you stop the itch of writing a single line of code without thinking through the intricacies of your feature.
As software engineers, we build features with varying complexities. A submit button to send a plain-text message is a feature. A call button to connect with someone over the internet is a more complex feature.
To build features that are adaptable to changing requirements, whether functional or non-functional, they need to be designed properly.
Software design is a social process
Design – more specifically, successful software design – is a very collaborative process. The image of a guy in a hoodie sitting in a corner and churning out code like a pro is a false representation of actual software design.
One proven way to achieve successful software design is through a design document. Google has been an advocate of this artefact – often referred to as a design doc.
The purpose of design doc
The purpose of a design document is mainly threefold:
To list the contentious parts of the design
To highlight alternative solutions to the design
To open the discussions on how the feature should be implemented
Purpose #1: List contentious components
Contentious design components are those that are likely to be argued over by other engineers. One example: how should the UI component fetch its data from the backend? Is it by a single fetch for one id holding all the desired data? Is it by a single fetch for an array of ids, subsequently fetching per id and collating the responses later? Or something else?
Without getting agreement on how to implement portions of the design, it’s easy to be bombarded with negative feedback after all the hard work of making a piece of a feature work. A design doc helps alleviate these circumstances by getting buy-in before you even start coding.
Purpose #2: Highlight better solutions
Two heads are likely better than one. It’s always good to have someone review your work and check if you have overlooked something. With a design doc, you allow other engineers to take a peek at your strategy and create threaded discussions on the parts of your design.
You can easily gather suggestions and get insights from the more senior engineers. Through this, you can eliminate inefficient and ineffective patterns (a.k.a. antipatterns) just by leveraging their years of experience.
Purpose #3: Discuss feature implementation
Let’s take the example of API endpoints. Too often, inexperienced developers, or engineers (as they call themselves), will start implementing immediately only to find out that the logic is wrong, or end up with spaghetti code.
Then a lot of time is wasted rewriting or throwing away code – something that could have been caught earlier, in the design phase, when reviewed by other equally, if not more, talented engineers.
To myself… May you be extra careful when using helpers in your code.
Developers without a lot of experience building large systems tend to use utils or helpers functions and directories.
Be wary when your helper code seems to be arbitrarily scattered everywhere. It’s usually a sign of weak cohesion, or that there was no sense of purpose when the system was designed.
To myself… May I illuminate what’s important to me so it can serve as my guide in my daily activities – align with what I say with what I do.
Definition of Values
Values are beliefs. They are stable and enduring beliefs that have been formed during one’s formative years. Values are what one believes to be good, right and worthwhile – the behaviour one believes is desirable to achieve what is purposeful.
Guiding Questions
What is important to me?
How would I like to be remembered in my eulogy?
How do I want my family and friends to remember me?
What do I want my colleagues to say when they think of me?
How would I like to be remembered in the communities I’m part of?
Examples of Values
Tell the truth.
Come up with at least two ways to solve a problem.
Notice that values are NOT nouns? Yes. They are NOT nouns. They are verbs, because they have to be doable. You can’t walk into the room and say, “everyone, be more of a problem solver than a complainer”. Instead, what’s more actionable is: “generate more than one solution to a problem you encounter each day”.
To myself… May you use this term in the right context so you don’t confuse people.
What is Provisioning?
Provisioning is the process of setting up infrastructure. In generic terms, it is bringing resources to life to make them available to users and systems.
Example: Provisioning a load balancer means setting up a load balancer within a certain cluster of services.
What is Configuration?
Configuration is the process of bringing resources into a desired state for building and maintenance.
Example: Configuring a load balancer means telling it how it should route requests to the right services within a certain cluster.
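As a sketch of what that configuration might look like in practice, here is a minimal nginx-style config (the upstream server names are hypothetical) that tells a load balancer how to route requests to two backend services:

```nginx
# Distribute incoming requests across two backend services.
upstream app_backend {
    server app-1.internal:8080;
    server app-2.internal:8080;
}

server {
    listen 80;
    location / {
        # Forward every request to one of the upstream servers.
        proxy_pass http://app_backend;
    }
}
```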
Provision Then Configure
You can’t configure something that doesn’t exist, right? 😛 As the heading says, once a resource is provisioned, the next step is to tell it how to behave – that is, configuration.
A good thing to note is that both provisioning and configuration are part of the deployment process of the software development workflow 🙂
Serialisation, in the computing domain, is the process of converting a data structure or object state into a format suitable for storing (as in a file) or transmitting (as over a network), often for later use. Often, sometime after serialisation, the information is reconstructed for other purposes.
Example in the context of networking
When we want to transmit data over the internet, the source representation of the data will usually not be compatible with the destination for various reasons: programming languages store data at different locations in the memory of the host computer, and execution environments use object representations that aren’t suitable for network transport. Hence, what we do as brilliant engineers is serialise the data so that at the other end, a predefined system can reconstruct it (deserialisation) based on its own set of rules and utilise it for some purpose.
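A minimal sketch of this round trip using Python’s built-in `json` module – the payload is made up:

```python
import json

# A hypothetical in-memory object we want to send over the network.
payload = {"device": "sensor-42", "temperature": 21.5, "active": True}

# Serialise: convert the object into a portable text format.
wire_format = json.dumps(payload)

# Deserialise: the receiving end reconstructs an equivalent object
# based on its own set of rules (here, the JSON grammar).
reconstructed = json.loads(wire_format)

assert reconstructed == payload  # the round trip preserves the data
```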
Before the idea of the device twin, it was historically difficult to query device information in the field. Many network hops, or even database synchronisation, could be needed to retrieve this information. But looking closer at the problem, most of the use cases are to show the information in a dashboard for monitoring or controlling certain assets of an organisation – there’s no hard real-time requirement. Hence, with this clarity of the problem, the device twin was born in Azure.
With device twins, backend services can interact with Azure IoT Hub to query the metadata and data of each device. This capability enables scenarios like device reporting on dashboards or monitoring long-running jobs across many devices.
It is important to note that because device twin updates are asynchronous in nature, it is not guaranteed that the values you get from a query are real-time. What you actually read are the last reported values. Most organisations will accept this latency for the majority of their use cases.
What is Device Twin?
A device twin is a JSON document that contains device-specific information. The document has a size limit of 8KB and a maximum depth of 5 levels. The format is:
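The article cuts off before showing the format, so the sketch below is my own abbreviated example of the general shape of a device twin – identity fields, tags, and desired/reported properties – with hypothetical values:

```json
{
  "deviceId": "thermostat-01",
  "etag": "AAAAAAAAAAc=",
  "status": "enabled",
  "tags": {
    "location": { "building": "43", "floor": "1" }
  },
  "properties": {
    "desired": {
      "targetTemperature": 21,
      "$version": 4
    },
    "reported": {
      "currentTemperature": 20.5,
      "$version": 7
    }
  }
}
```

Backend services typically write to `tags` and `properties.desired`, while the device itself writes to `properties.reported`.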