Cloud Computing

Implementing High-Performance LLM Serving on GKE: An Inference Gateway Walkthrough

The excitement around open Large Language Models like Gemma, Llama, Mistral, and Qwen is evident, but developers quickly hit a wall. How do you deploy them effectively at scale? Traditional load balancing algorithms fall short, as they fail to account for GPU/TPU load status, leading to inefficient routing for computationally intensive AI inference with its […]

How to enable real time semantic search and RAG applications with Dataflow ML

Embeddings are a cornerstone of modern semantic search and Retrieval Augmented Generation (RAG) applications. In short, they enable applications to understand and interact with information on a deeper, conceptual level. In this post, we’ll show you how to create and retrieve embeddings with a few lines of Dataflow ML code to enable both of these […]

Engineering Deutsche Telekom’s sovereign data platform

Imagine transforming a sprawling, 20-year-old telecommunications data ecosystem, laden with sensitive customer information and bound by stringent European regulations, into a nimble, cloud-native powerhouse. That’s precisely the challenge Deutsche Telekom tackled head-on, explains Ashutosh Mishra. By using Google Cloud’s Sovereign Cloud offerings, they’ve built a groundbreaking “One Data Ecosystem.” When we decided to modernize our […]

Unlock AlloyDB performance secrets with new performance snapshot report

In the world of database management, understanding performance bottlenecks is critical to smooth operations and an optimal user experience. Using managed database services can help alleviate mundane management tasks and let you focus on value-added, strategic tasks, while also offering tools for monitoring database and resource performance. But when performance issues arise, gathering and analyzing […]

Google Public Sector awarded $200 million contract to accelerate AI and cloud capabilities across Department of Defense’s Chief Digital and Artificial Intelligence Office (CDAO)

At Google Public Sector, we’re committed to advancing the deployment of innovative technology across the defense ecosystem. Today, we’re announcing that Google Public Sector has been awarded a $200 million-ceiling contract to support the U.S. Department of Defense’s (DoD) Chief Digital and Artificial Intelligence Office (CDAO). This builds on Google Public Sector’s long-standing collaboration with […]

Manipal Hospitals and Google Cloud partner to transform nurse handoffs with GenAI

As one of India’s largest healthcare providers, Manipal Hospitals serves nearly 7 million patients annually across 37 hospitals. To deliver clinical excellence and patient-centric care at a high standard, we are continually embracing technology. One of the most significant operational challenges we consistently face is the nurse handover process, a critical but time-consuming task. To make […]

How Jina AI built its 100-billion-token web grounding system with Cloud Run GPUs

Editor’s note: The Jina AI Reader is a specialized tool that transforms raw web content from URLs or local files into a clean, structured, and LLM-friendly format. In this post, Han Xiao details how Cloud Run empowers Jina AI to build a secure, reliable, and massively scalable web scraping system that remains economically viable. This […]
