Cloud Computing

A developer’s guide to architecting reliable GPU infrastructure at scale

Editor’s note: This blog post outlines Google Cloud’s GPU AI/ML infrastructure reliability strategy, and will be updated with links to new community articles as they appear. As we enter the era of multi-trillion parameter models, computational power has transitioned from a utility to a mission-critical strategic asset. To meet relentless training demand, organizations are no […]

A developer’s guide to architecting reliable GPU infrastructure at scale Read More »

Guardrails at the gateway: Securing AI inference on GKE with Model Armor

Enterprises are rapidly moving AI workloads from experimentation to production on Google Kubernetes Engine (GKE), using its scalability to serve powerful inference endpoints. However, as these models handle increasingly sensitive data, they introduce unique AI-driven attack vectors — from prompt injection to sensitive data leakage — that traditional firewalls aren’t designed to catch. Prompt injection

Guardrails at the gateway: Securing AI inference on GKE with Model Armor Read More »

How Estée Lauder Companies uses Cloud Run worker pools for its pull-based agentic workloads

Cloud Run has long provided developers with a straightforward, opinionated platform for running code. You can easily deploy request-driven web applications using Cloud Run services, or execute run-to-completion batch processing with Cloud Run jobs. However, as developers build more complex applications, like pipelines that process continuous streams of data or distributed AI workloads, they need

How Estée Lauder Companies uses Cloud Run worker pools for its pull-based agentic workloads Read More »

Build a multi-tenant configuration system with tagged storage patterns

In modern microservices architectures, configuration management remains one of the most challenging operational concerns. Two gaps emerge as organizations scale: handling tenant metadata that changes faster than cache TTL allows, and scaling the metadata service itself without creating a performance bottleneck. Traditional caching strategies force an uncomfortable trade-off: either accept stale tenant context (risking incorrect

Build a multi-tenant configuration system with tagged storage patterns Read More »

Google Cloud named a Leader in The Forrester Wave™: Sovereign Cloud Platforms, Q2 2026

In today’s global economy, data is a strategic asset. For many organizations — particularly those in highly regulated industries and the public sector — the ability to innovate with AI is often balanced against the rigorous requirements of data sovereignty, residency, and operational autonomy. We are proud to announce that Google Cloud has been named

Google Cloud named a Leader in The Forrester Wave™: Sovereign Cloud Platforms, Q2 2026 Read More »

New GKE Cloud Storage FUSE Profiles take the guesswork out of configuring AI storage

In the world of AI/ML, data is the fuel that drives training and inference workloads. For Google Kubernetes Engine (GKE) users, Cloud Storage FUSE provides high-performance, scalable access to data stored in Google Cloud Storage. However, we learned from customers that getting the maximum performance out of Cloud Storage FUSE can be complex. Today, we

New GKE Cloud Storage FUSE Profiles take the guesswork out of configuring AI storage Read More »

Openness without compromises for your Apache Iceberg lakehouse

Today, at the Apache Iceberg Summit in San Francisco, we are announcing the preview of read and write interoperability between BigQuery and Iceberg-compatible engines, including Trino, Spark, and others in Apache Iceberg tables in Google-managed Iceberg REST Catalog. With this new capability, you get the benefits of enterprise-grade native storage for your lakehouse without sacrificing

Openness without compromises for your Apache Iceberg lakehouse Read More »

Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI Deployment

Building and serving models on infrastructure is a strong use case for businesses. In Google Cloud, you have the ability to design your AI infrastructure to suit your workloads. Recently, I experimented with Google Kubernetes Engine (GKE) managed DRANET while deploying a model for inference with NVIDIA B200 GPUs on GKE. In this blog, we

Experimenting with GPUs: GKE managed DRANET and Inference Gateway AI Deployment Read More »

Claude Mythos Preview: Available in private preview on Vertex AI

Claude Mythos Preview, Anthropic’s newest and most powerful model, is now available in Private Preview to a select group of Google Cloud customers, as part of Project Glasswing.  The availability of Claude Mythos Preview on Vertex AI underscores our commitment to offer our customers access to models from frontier AI labs. Combined with the enterprise-grade

Claude Mythos Preview: Available in private preview on Vertex AI Read More »