Boosting LLM Performance with Tiered KV Cache on Google Kubernetes Engine
Large Language Models (LLMs) are powerful, but their performance can be bottlenecked by the substantial footprint of the Key-Value (KV) cache in NVIDIA GPU memory. This cache, crucial for speeding up LLM inference by storing Key (K) and Value (V) matrices, directly impacts context length, concurrency, and overall system throughput. Our primary goal is to maximize […]
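To see why the KV cache footprint grows so quickly, here is a minimal back-of-the-envelope sketch. The model dimensions used below (32 layers, 32 KV heads, head dimension 128, fp16) are assumptions chosen to resemble a typical 7B-class model, not figures from this article; substitute your model's actual configuration.

```python
# Rough estimate of per-request KV cache size.
# Formula: 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes per element.
# All dimensions here are illustrative assumptions, not values from the article.

def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Return the approximate KV cache size in bytes for one request."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    # Assumed 7B-class model: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes).
    per_request = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                                 head_dim=128, seq_len=4096)
    print(f"KV cache for one 4096-token request: {per_request / 2**30:.2f} GiB")
    # ~2 GiB per request at this size; 32 concurrent requests would consume
    # ~64 GiB of GPU memory for KV cache alone, which is the pressure that
    # tiered caching (offloading cold KV blocks to host memory or disk) relieves.
```

Under these assumptions, a single long-context request already claims on the order of gigabytes, which is why the cache, rather than the model weights, often becomes the limit on concurrency.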