kubernetes LoadBalancer appProtocol

Typically, you set the protocol for the Service to TCP and add an annotation (usually specific to your cloud provider) that configures the load balancer to handle traffic at the HTTP level. This configuration might also include serving HTTPS (HTTP over TLS) and reverse-proxying plain HTTP to your workload.

You might additionally want to specify that the application protocol of the connection is http or https. Use http if the session from the load balancer to your workload is HTTP without TLS, and use https if the session from the load balancer to your workload uses TLS encryption.

https://kubernetes.io/docs/reference/networking/service-protocols/

The appProtocol field provides a way to specify an application protocol for each Service port. This is used as a hint for implementations to offer richer behavior for protocols that they understand.

https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol
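Putting the two excerpts together, a minimal sketch of such a Service (the name is hypothetical, and the annotation is a placeholder — the real key depends on your cloud provider):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                       # hypothetical name
  annotations:
    # cloud-provider-specific load balancer annotation goes here,
    # e.g. to terminate TLS at the LB and proxy plain HTTP to the pods
    example.com/lb-config: "placeholder"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - name: web
    protocol: TCP        # L4 protocol of the Service port
    port: 443
    targetPort: 8080
    appProtocol: http    # hint: the LB-to-workload session is HTTP without TLS
```

If the load balancer re-encrypts and the session to the workload uses TLS, the hint would be `appProtocol: https` instead.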

Java eBPF profiling

When the kernel samples a stack trace, it is a list of instruction pointers in the virtual address space of the relevant process. Symbolization is the task of translating those virtual addresses into human readable symbols, e.g. translating address 0x1234abcd into symbol foo().

Symbolizers for compiled languages that are not JITed (e.g. C++, Golang) work by finding the debug symbol section in natively compiled binaries and libraries. However, this is not available for Java byte code, since the code is not statically mapped into the application’s virtual address space. Thus, our original symbolizer could not make sense of Java stack traces (except for parts that are explicitly in the JVM, but these are not usually of interest to application developers).

In brief, we use the Java Virtual Machine Tool Interface — the “JVMTI” — to interact with the JVM running the target Java application. Based on the open source Java “perf map agent”, we wrote our own JVMTI agent that listens to the JVMTI callbacks for CompiledMethodLoad and DynamicCodeGenerated. Thus, our JVMTI agent writes each Java symbol and its corresponding address range into a symbol file, and by reading this file, the Pixie data collection process (the Pixie Edge Module or “pem”) symbolizes Java stack traces.

https://blog.px.dev/cpu-profiling-java/

User inside docker

I built a Docker image that has a user named “appuser”, and this user has a defined uid of 1001. On my test server, the account I’m using is named “marc”, and it also has the uid of 1001. When I start the container, the sleep command executes as appuser, because the Dockerfile contains the line “USER appuser”. But this really doesn’t make it run as appuser; it makes it run as the uid that the Docker image knows as appuser.

https://medium.com/@mccode/understanding-how-uid-and-gid-work-in-docker-containers-c37a01d01cf
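A sketch of a Dockerfile reproducing the article's setup (image contents are my assumption, not from the article):

```dockerfile
FROM alpine
# bake uid 1001 into the image under the name "appuser"
RUN adduser -D -u 1001 appuser
# USER only resolves the name to a uid at build time;
# at runtime the kernel sees nothing but the number 1001
USER appuser
CMD ["sleep", "infinity"]
```

On the host, `ps` then shows the sleep process as owned by whichever host account happens to have uid 1001 (here “marc”), because the kernel tracks only the numeric uid, not the name in the container’s /etc/passwd.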

Helm template context

Inside a range loop in a Helm template, the context is rebound to the current item of the iteration: when you use . inside the loop, you’re accessing that item, not the top-level values.

$. refers to the top-level context that was passed into the template, and you can use it to access properties that are outside of the current range loop.

If you want to access the .Values variable inside a named template included inside a range loop, you can pass it as an argument to the template.

{{- define "my-template" }}
{{- $myVar := $.Values.myVar }}
{{- printf "myVar: %s" $myVar }}
{{- end }}

{{- range .items }}
{{- include "my-template" (dict "item" . "Values" $.Values) }}
{{- end }}

From ChatGPT, validated/edited by me

istio tracing decision

[randomSamplingPercentage tracing policy] Controls the rate at which traffic will be selected for tracing if no prior sampling decision has been made. If a prior sampling decision has been made, that decision will be respected. However, if no sampling decision has been made (example: no x-b3-sampled tracing header was present in the requests), the traffic will be selected for telemetry generation at the percentage specified.

https://istio.io/latest/docs/reference/config/telemetry/

[sampling global setting sets] The percentage of requests (0.0 – 100.0) that will be randomly selected for trace generation, if not requested by the client or not forced. Default is 1.0.

https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/
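A sketch of the Telemetry API form of this setting, applied mesh-wide by placing it in the root namespace (resource name and percentage are arbitrary examples):

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system      # root namespace => applies mesh-wide
spec:
  tracing:
  - randomSamplingPercentage: 10.00   # sample 10% of requests that carry
                                      # no prior sampling decision
```

Requests that arrive with an explicit sampling decision (e.g. an x-b3-sampled header) keep that decision regardless of the percentage.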

azure disk cache

Host caching works by bringing storage closer to the VM, where it can be written to and read from quickly. The amount of storage available to the VM for host caching is listed in the documentation.

A virtual machine’s cached limits are separate from its uncached limits. This means you can enable host caching on disks attached to a VM while not enabling host caching on other disks.

This configuration allows your virtual machines to get a total storage IO of the cached limit plus the uncached limit.

https://learn.microsoft.com/en-us/azure/virtual-machines/disks-performance

Ephemeral OS disks are created on the local virtual machine (VM) storage and not saved to the remote Azure Storage.

Ephemeral OS disk can be stored either on VM’s OS cache disk or VM’s temp/resource disk.

https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks

Caching uses specialized (and sometimes expensive) temporary storage that has faster read and write performance than permanent storage. Because cache storage is often limited, you need to make decisions as to what data operations will benefit most from caching. But even where the cache can be made widely available, such as in Azure, it’s still important to know the workload patterns of each disk before deciding which caching type to use.

https://learn.microsoft.com/en-us/training/modules/caching-and-performance-azure-storage-and-disks/3-enable-and-configure-azure-vm-disk-cache-by-using-the-azure-portal

Caching in Azure storage is a feature used to improve disk performance and reduce latency when accessing data. Azure premium storage uses BlobCache technology to create a pool of RAM and SSD storage for cached data. Data is stored in this combination of RAM and local SSDs the first time it is accessed; on subsequent accesses it is served directly from the cache rather than from the storage device. Writes are made to the cache first and the change is then replicated to the storage system.

We can use the disk caching feature on OS and Data disks to increase their performance.

Disk caching for any Azure storage disks is configured or managed by choosing the appropriate option for Host Caching setting for disk storage.

We can use any disk caching option (Read, Read/Write, None) for OS and data disks, whereas the temporary disk does not use any type of disk caching.

https://www.sqlshack.com/importance-of-disk-caching-in-azure-storage-disks-for-sql-server-workloads/
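Tying this back to Kubernetes: with the Azure Disk CSI driver, the host caching mode of dynamically provisioned data disks is chosen via the StorageClass. A sketch (class name is hypothetical):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-readonly   # hypothetical name
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
  cachingMode: ReadOnly   # host caching setting: None, ReadOnly or ReadWrite
```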

kubernetes operator custom resource creation

Yesterday I had a nagging doubt about kubernetes operator deployments. Briefly, I wanted to be sure when it is 100% safe to create a custom resource, and what possible race conditions should be taken into account between operator controller instances and their corresponding ‘watched’ custom resources.

I dug through Google posts and specs, but this ChatGPT session pretty much summarises my conclusions. I asked the chat the same questions I had asked myself when googling and used it to validate them (and to play a bit with the bot); hope it is helpful for you too:

me>> what happens if a kubernetes custom resource is created before the corresponding operator controller is running?

If a Kubernetes custom resource is created before the corresponding operator controller is running, the controller will not be able to process the resource until it is started. Depending on the implementation of the operator and the custom resource, the resource may not be fully functional or may not be monitored for updates until the controller is running. It is recommended to start the operator controller before creating any custom resources to ensure proper functionality.

me>> why is it recommended to start the operator controller before creating any custom resources ?

It is recommended to start the operator controller before creating any custom resources because the controller is responsible for managing the lifecycle of the custom resources. This includes creating, updating, and deleting resources based on the desired state specified in the custom resource.

If the controller is not running when a custom resource is created, the controller will not be able to process the resource and ensure that it is in the desired state. Additionally, the controller may not be able to monitor the resource for updates, which can lead to the resource becoming out of sync with the desired state.

By starting the operator controller before creating any custom resources, you can ensure that the controller is able to immediately process and manage the resources as they are created, which helps to maintain a consistent, desired state of resources, which is the main objective of using a Kubernetes operator.

me>> does kubernetes operator controller use webhooks?

Yes, Kubernetes operator controllers can use webhooks as a way to perform certain actions on custom resources.

Webhooks are a way to trigger an external service to perform an action when certain events occur in the Kubernetes cluster. For example, a webhook could be configured to validate a custom resource before it is created, or to mutate a custom resource before it is updated.

Operators can use webhooks to perform specific actions on custom resources, such as validating the resource’s configuration, mutating the resource’s specification, or performing a pre- or post-hook.

Webhooks are typically implemented using a Kubernetes admission controller and a separate service that handles the webhook requests. The admission controller intercepts the request to create or update a resource, and sends it to the webhook service, which performs the necessary action before allowing the request to continue.

me>> which alternative to webhooks can be used in kubernetes operator controller?

An alternative to webhooks that can be used in a Kubernetes operator controller is to use a Kubernetes controller pattern.

A controller pattern is a way to implement a specific behavior in a Kubernetes operator. It is a specific type of controller that runs in the same process as the operator, and watches for changes to custom resources. It then takes the appropriate actions to bring the custom resource to the desired state.

Controller patterns are typically implemented using the Go client library for Kubernetes (client-go) and the controller-runtime library. This allows the operator to watch for changes to custom resources and take appropriate actions using the Kubernetes API.

Controller pattern is an alternative to webhooks, as it allows the operator to perform actions on custom resources directly, rather than relying on an external service to handle the webhook request. This can be more efficient and easier to implement, but also might require more knowledge and expertise of kubernetes API.

me>> if an operator controller uses controller pattern, can it process existing resources already present when operator starts running?

If an operator controller uses the controller pattern, it can process existing resources that are already present in the cluster when the operator starts running.

When the operator starts, it will typically register a watch on the custom resources it manages. The watch will trigger a callback function for each existing resource in the cluster, allowing the operator to process these resources.

Depending on the implementation of the operator, it may take different actions when processing existing resources. The operator could, for example, check the current state of the resource and ensure it matches the desired state, or it could update the resource to the desired state.

It is important to note that if the resources were created before the operator and not managed by it, they may not match the desired state that the operator is expecting and may require additional logic or steps to reconcile them.

Also, if the resources were created with some properties which are not recognized by the operator, it may not be able to reconcile them and may require manual intervention or updates to the operator’s code to handle those cases.

me>> but what happens if webhook is used instead of controller pattern?

If a webhook is used instead of a controller pattern in an operator, the behavior when the operator starts and processes existing resources will be different.

When the operator starts, it will register a webhook with the Kubernetes API server. The webhook will be triggered for any new resources that are created or updated, but it will not be triggered for existing resources that were already present in the cluster when the operator starts.

This means that the operator will not be able to process existing resources that were already present in the cluster when it starts. The operator will only be able to process resources that are created or updated after it starts.

To reconcile the existing resources and bring them to the desired state, the operator would have to have additional logic that periodically scans the cluster for resources that it is supposed to manage, and if it finds any, it would need to invoke the webhook manually.

It is worth noting that using a webhook can be useful in situations where you want to perform a specific action on a resource before it is created or updated, such as validating its configuration or mutating its specification. But it is less efficient and might require more complexity to reconcile the existing resources, when compared to using a controller pattern.

from https://chat.openai.com/chat

This explains why the results I got changed from one operator to another. Some of them can handle ‘existing’ custom resources created after the CRDs are deployed but before the controller is up and running, while others require you to check that the controller is running before creating any custom resource.