Cloud Computing

How Karrot built a feature platform on AWS, Part 2: Feature ingestion

This post is co-written with Hyeonho Kim, Jinhyeong Seo, and Minjae Kwon from Karrot. In Part 1 of this series, we discussed how Karrot developed a new feature platform, which consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. We discussed their requirements, the solution architecture, and feature […]


Deploy LLMs on Amazon EKS using vLLM Deep Learning Containers

Organizations face significant challenges when deploying large language models (LLMs) efficiently at scale. Key challenges include optimizing GPU resource utilization, managing network infrastructure, and providing efficient access to model weights. When running distributed inference workloads, organizations often encounter complexity in orchestrating model operations across multiple nodes. Common challenges include effectively distributing model components across available GPUs,

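The excerpt above mentions distributing model components across available GPUs. As a purely illustrative sketch of the layer-partitioning idea behind pipeline parallelism (not vLLM's actual API or scheduling algorithm; the function and device numbering here are hypothetical), the core assignment step looks like this:

```python
# Toy sketch: assigning contiguous blocks of transformer layers to GPUs,
# the basic idea behind pipeline parallelism that engines like vLLM
# automate. Illustrative only; this is not vLLM's API.

def partition_layers(num_layers: int, num_gpus: int) -> dict[int, list[int]]:
    """Split layers into contiguous, near-equal blocks, one block per GPU."""
    base, extra = divmod(num_layers, num_gpus)
    assignment: dict[int, list[int]] = {}
    start = 0
    for gpu in range(num_gpus):
        count = base + (1 if gpu < extra else 0)  # spread the remainder
        assignment[gpu] = list(range(start, start + count))
        start += count
    return assignment

# A 32-layer model split across 4 GPUs: 8 contiguous layers per device.
print(partition_layers(32, 4))
```

In a real deployment, vLLM handles this placement (along with tensor parallelism and KV-cache management) internally; the sketch only shows why uneven layer counts must be balanced explicitly.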

Uber’s modern edge: A new approach to network performance and efficiency

Picture this: You’re ordering an Uber in Lisbon, but your request takes a scenic tour through Madrid, London, and Virginia before confirming your ride. That was the case for millions of users until Uber and Google Cloud set out on an even bigger journey: redesigning how global edge networks should work. Operating across six continents,


Designing a multi-tenant GKE platform for Yahoo Mail’s migration journey

Yahoo is in the midst of a multi-year journey to migrate its renowned Yahoo Mail application onto Google Cloud. With more than 100 services and middleware components in the application, Yahoo Mail is primarily taking a lift-and-shift approach for its on-premises infrastructure, and strategically transforming and replatforming key components and middleware to leverage cloud-native capabilities.


Start and scale your apps faster with improved container image streaming in GKE

In today’s fast-paced cloud-native world, the speed at which your applications can start and scale is paramount. Faster pod startup times mean quicker responses to user demand, more efficient resource utilization, and a more agile development and deployment lifecycle overall. We’re continuously working to enhance the performance of Google Kubernetes Engine (GKE) to help you


How Google does it: Your guide to platform engineering

What guides your approach to software development? In our roles at Google, we’re constantly working to build better software, faster. Within Google, our Developer Platform team and Google Cloud have a strategic partnership and a shared strategy: together, we take our internal capabilities and engineering tools and package them up for Google Cloud customers. At


Smarter Authoring, Better Code: How AI is Reshaping Google Cloud’s Developer Experience

The mission of the Google Cloud Developer Experience team is simple: to help developers get from learning to launching as quickly and effectively as possible. Two of our primary tools for this are the robust hands-on documentation and the ready-to-use code samples embedded directly within it, which developers rely on every day. As Google Cloud’s


The University of Hawaii is helping the state retain top talent with Google AI

As the Hawaiian Islands’ primary higher education resource, the University of Hawaii (UH) System faces a unique challenge among U.S. universities: many graduates, including those with deep family roots in Hawaii, are soon confronted with a competitive job market and struggle to land a fulfilling entry-level role and starting salary commensurate with Hawaii’s higher-than-average cost


Tutorial: How to use the Gemini Multimodal Live API for QA

The Gemini Multimodal Live API is a powerful tool that allows developers to stream data, such as video and audio, to a generative AI model and receive responses in real time. Unlike traditional APIs that require a complete data upload before processing can begin, this “live” or “streaming” capability enables a continuous, two-way conversation with the

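The streaming pattern the excerpt describes, in which chunks are processed as they arrive rather than after a complete upload, can be sketched with plain asyncio. This is a conceptual mock, not the Gemini SDK (the real client lives in the `google-genai` package); the chunk source and the per-chunk replies are stand-ins:

```python
# Conceptual sketch of a "live" bidirectional session: the client streams
# chunks and receives an incremental response per chunk, instead of
# uploading everything first. Mock only; not the Gemini Live API itself.
import asyncio
from typing import AsyncIterator

async def audio_chunks() -> AsyncIterator[bytes]:
    """Stand-in for a microphone or camera feed producing chunks over time."""
    for i in range(3):
        await asyncio.sleep(0)          # yield control, as real I/O would
        yield f"chunk-{i}".encode()

async def live_session(stream: AsyncIterator[bytes]) -> list[str]:
    """Consume chunks as they arrive and answer each one immediately,
    rather than waiting for the full upload to finish."""
    responses = []
    async for chunk in stream:
        responses.append(f"ack:{chunk.decode()}")  # model reply placeholder
    return responses

print(asyncio.run(live_session(audio_chunks())))
```

The key property is that `live_session` produces a response per chunk while the stream is still open, which is what distinguishes the live API from upload-then-process endpoints.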