Cloud Computing

Behind the Analysis with Google Cloud and Team USA: Architecting AI infrastructure for U.S. Winter Olympians

In freeskiing and snowboarding, traditional video replay shows you what happened during a complex aerial maneuver, but it fails to explain the physics of how it was possible. At the speed of the sport, it’s incredibly difficult to translate high-speed motion into actionable data—joint angles, rotational velocities, body compression. This requires tracking and analyzing a […]

Migrating to Google Cloud’s Application Load Balancer: A practical guide

Migrating your existing application load balancer infrastructure from an on-premises hardware solution to Cloud Load Balancing offers substantial advantages in scalability, cost-efficiency, and tight integration within the Google Cloud ecosystem. Yet, a fundamental question often arises: “What about our current load balancer configurations?” Existing on-premises load balancer configurations often contain years of business-critical logic for

Near-100% Accurate Data for your Agent with Comprehensive Context Engineering

Agentic workflows are already being used to initiate actions. To succeed, agents typically need to chain multiple steps and execute business logic that reflects real-life decisions. But as developers rush to deploy these autonomous agents, they are slamming into a wall: compounding errors, where small per-step inaccuracies multiply across a workflow. To understand why agentic workflows require near-100% accuracy

Create Expert Content: Local Testing of a Multi-Agent System with Memory

In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation. In part 1 and part 2 of this series, we established the essential groundwork by standardizing the

A developer’s guide to architecting reliable GPU infrastructure at scale

Editor’s note: This blog post outlines Google Cloud’s GPU AI/ML infrastructure reliability strategy, and will be updated with links to new community articles as they appear. As we enter the era of multi-trillion parameter models, computational power has transitioned from a utility to a mission-critical strategic asset. To meet relentless training demand, organizations are no

Guardrails at the gateway: Securing AI inference on GKE with Model Armor

Enterprises are rapidly moving AI workloads from experimentation to production on Google Kubernetes Engine (GKE), using its scalability to serve powerful inference endpoints. However, as these models handle increasingly sensitive data, they introduce unique AI-driven attack vectors — from prompt injection to sensitive data leakage — that traditional firewalls aren’t designed to catch. Prompt injection
