Cloud Computing

Scalable AI starts with storage: Guide to model artifact strategies

Managing large model artifacts is a common bottleneck in MLOps. Baking models into container images leads to slow, monolithic deployments, and downloading them at startup introduces significant delays. This guide explores a better way: decoupling your models from your code by hosting them in Cloud Storage and accessing them efficiently from GKE and Cloud Run.
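As a minimal sketch of the decoupling idea (the bucket name, object layout, and cache path below are illustrative assumptions, not details from the article), a GKE or Cloud Run service might pull versioned model weights from Cloud Storage at startup instead of baking them into the container image:

```python
from pathlib import Path

# Hypothetical layout: gs://<bucket>/models/<name>/<version>/weights.bin
def model_uri(bucket: str, name: str, version: str) -> str:
    """Build the Cloud Storage URI for a versioned model artifact."""
    return f"gs://{bucket}/models/{name}/{version}/weights.bin"

def local_model_path(name: str, version: str, cache_dir: str = "/tmp/models") -> Path:
    """Where the artifact lands on local disk. With a Cloud Storage FUSE
    mount, this could instead point directly into the mounted bucket."""
    return Path(cache_dir) / name / version / "weights.bin"

def ensure_model(bucket: str, name: str, version: str) -> Path:
    """Download the artifact once; later cold starts on the same node
    reuse the cached copy instead of re-pulling it."""
    dest = local_model_path(name, version)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Deferred import so the sketch stays importable without the SDK.
        from google.cloud import storage  # pip install google-cloud-storage
        client = storage.Client()
        blob = client.bucket(bucket).blob(f"models/{name}/{version}/weights.bin")
        blob.download_to_filename(dest)
    return dest
```

Keeping the image free of weights means code and model can be released on independent cadences, which is the core of the strategy the guide describes.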

From legacy to cloud: How Deutsche Telekom went from PySpark to BigQuery DataFrames

In today’s hyper-competitive telecommunications landscape, understanding and maximizing the Customer Lifetime Value (CLV) metric isn’t just a nice-to-have; it’s a strategic imperative. For Deutsche Telekom, accurate CLV calculations are the bedrock of informed decisions, driving crucial initiatives in customer acquisition, retention, and targeted marketing campaigns. The ability to predict and influence long-term customer relationships directly …

Announcing multi-subnet support for more scalable GKE clusters

We are pleased to announce the preview of multi-subnet support for Google Kubernetes Engine (GKE) clusters. This enhancement removes single-subnet limitations, increasing scalability, optimizing resource utilization, and enhancing the flexibility of your GKE clusters. Multi-subnet support lets you add subnets to an existing GKE cluster, which can then be utilized by …

Cloud CISO Perspectives: New Threat Horizons details evolving risks — and defenses

Welcome to the first Cloud CISO Perspectives for August 2025. Today, our Office of the CISO’s Bob Mechler and Anton Chuvakin dive into the key trends and evolving threats that we tracked in our just-published Cloud Threat Horizons report. As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google …

How Keeta processes 11 million financial transactions per second with Spanner

Keeta Network is a layer‑1 blockchain that unifies transactions across different blockchains and payment systems, eliminating the need for costly intermediaries, reducing fees, and enabling near‑instant settlements. By facilitating cross‑chain transactions and interoperability with existing payment systems, Keeta bridges the gap between cryptocurrencies and fiat, enabling a secure, efficient, and compliant global financial ecosystem. Founded …

How Karrot built a feature platform on AWS, Part 1: Motivation and feature serving

This post is co-written with Hyeonho Kim, Jinhyeong Seo, and Minjae Kwon from Karrot. Karrot is Korea’s leading local community service, centered on all possible connections in the neighborhood. Beyond a simple flea market, it strengthens connections between neighbors, local stores, and public institutions, creating a warm and active neighborhood as its core …

How Karrot built a feature platform on AWS, Part 2: Feature ingestion

This post is co-written with Hyeonho Kim, Jinhyeong Seo, and Minjae Kwon from Karrot. In Part 1 of this series, we discussed how Karrot developed a new feature platform consisting of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. We discussed their requirements, the solution architecture, and feature …

Deploy LLMs on Amazon EKS using vLLM Deep Learning Containers

Organizations face significant challenges when deploying large language models (LLMs) efficiently at scale. Key challenges include optimizing GPU resource utilization, managing network infrastructure, and providing efficient access to model weights. When running distributed inference workloads, organizations often encounter complexity in orchestrating model operations across multiple nodes. Common challenges include effectively distributing model components across available GPUs, …
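As a minimal sketch of single-node serving with vLLM (the model name, GPU count, and prompt are illustrative assumptions; a GPU node and `pip install vllm` are required to actually run `serve`), the tensor-parallelism idea mentioned above can be expressed as:

```python
def parallelism_config(num_gpus: int) -> dict:
    """Shard the model across all visible GPUs via tensor parallelism;
    fall back to a single shard when no GPU count is reported."""
    return {"tensor_parallel_size": max(1, num_gpus)}

def serve(num_gpus: int = 1) -> None:
    """Load a model with vLLM and answer one prompt. Not invoked here,
    since it needs a CUDA-capable node with the model weights available."""
    from vllm import LLM, SamplingParams  # deferred: heavy, GPU-only import
    llm = LLM(
        model="meta-llama/Llama-3.1-8B-Instruct",  # hypothetical model choice
        **parallelism_config(num_gpus),
    )
    params = SamplingParams(temperature=0.7, max_tokens=128)
    for output in llm.generate(["Why serve LLMs on Kubernetes?"], params):
        print(output.outputs[0].text)
```

On EKS, a Deployment would typically request `nvidia.com/gpu` resources and run a script like this inside the vLLM Deep Learning Container; multi-node distributed inference adds the orchestration complexity the excerpt describes.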
