AI

Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChecks API

Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agentic AI applications without creating guardrail resources. The new InvokeGuardrailChecks API gives you the flexibility to invoke supported safeguards at any turn in the agentic loop

MIT’s Initiative for New Manufacturing builds momentum

In May, the Initiative for New Manufacturing (INM) marked its first anniversary with MIT Manufacturing Week, four days of events that attracted more than 800 registrants including students, faculty, industry leaders, investors, entrepreneurs, and government officials to explore topics ranging from how companies are using AI on factory floors to the role of startups in

Introducing container caching in Amazon SageMaker AI for faster model scaling

Today, we’re excited to announce container image caching for Amazon SageMaker AI inference, the next major advancement in our faster scaling optimization journey. This improves end-to-end scaling latency by up to 2x for generative AI models during scale-out events. Over the years, Amazon SageMaker AI has continued to reduce latency across these scaling stages: detecting

Parallelize speculative decoding with P-EAGLE on Amazon SageMaker AI

As large language models (LLMs) grow in size and complexity, maximizing inference throughput while minimizing latency remains a critical challenge for enterprise production deployments. Speculative decoding is one effective strategy to address this: a lightweight draft model guesses future tokens, which the target LLM then verifies in a single forward pass. While state-of-the-art frameworks like Extrapolation Algorithm for Greater
