Talks
k8gb - geoip demo
at KCD Bratislava 25
If you’ve been globally distributing digital content for a while, you’ll understand that merely having numerous datacenters with advanced caching patterns isn’t sufficient. When your users need to retrieve an object that’s available in different locations worldwide, they should ideally be directed automatically to the location that’s nearest and fastest for the best experience. Cloud service providers typically offer services to handle this for you within their own clouds, but what if you are running a multi-cloud or hybrid environment? K8GB is a cloud-native solution that handles GeoDNS across heterogeneous environments and enables you to reach the same level of multiregion service resilience offered by cloud providers.
(recording, slides)
k8gb oss kubernetes 2025
Optimizing Metrics Collection & Serving When Autoscaling LLM Workloads
at Kubecon 25 @ London
Balancing resource provisioning for LLM workloads is critical for maintaining both cost efficiency and service quality. Kubernetes' horizontal autoscaling offers a cloud-native capability to address these challenges, relying on metrics to make autoscaling decisions. However, the efficiency of metrics collection affects how quickly and accurately the autoscaler responds to LLM workload demands. This session explores strategies to enhance metrics collection for autoscaling LLM workloads, covering:
1. The fundamentals of how horizontal autoscaling works in Kubernetes
2. The unique challenges of autoscaling LLM workloads
3. A comparison of existing Kubernetes autoscaling solutions for custom metrics, with their pros and cons
4. How optimizing metrics collection through push-based approaches can improve scaling responsiveness
It will demonstrate an integrated solution using KServe, the OpenTelemetry Collector, and KEDA to show how they can be combined to optimize LLM workload autoscaling.
(recording, slides)
k8gb oss kubernetes 2025
Autoscaling Generative AI Workloads
at KCD Praha 24
A short lightning talk about using KEDA as an autoscaler for AI/ML workloads. A Stable Diffusion model, which generates images from text input, served as the example. The demo application scaled its worker pods based on the length of a message queue. I also briefly talk about the pitfalls of GPU-intensive workloads on K8s.
(recording)
KEDA AI/ML KCD kubernetes 2024
Multi-Cloud Global Content Distribution at Cloud Native Speeds
at OpenSourceSummit EU 24 @ Vienna
If you’ve been globally distributing digital content for a while, you’ll understand that merely having numerous datacenters with advanced caching patterns isn’t sufficient. When your users need to retrieve an object that’s available in different locations worldwide, they should ideally be directed automatically to the location that’s nearest and fastest for the best experience. Cloud service providers typically offer services to handle this for you within their own clouds, but what if you are running a multi-cloud or hybrid environment? K8GB is a cloud-native solution that handles GeoDNS across heterogeneous environments and enables you to reach the same level of multiregion service resilience offered by cloud providers.
(recording, slides)
k8gb oss kubernetes 2024
k8gb meets Cluster API
at FOSDEM 24