AI

Researchers discover a shortcoming that makes LLMs less reliable

Large language models (LLMs) sometimes learn the wrong lessons, according to an MIT study. Rather than answering a query based on domain knowledge, an LLM could respond by leveraging grammatical patterns it learned during training. This can cause a model to fail unexpectedly when deployed on new tasks. The researchers found that models can mistakenly


Amazon SageMaker AI introduces EAGLE based adaptive speculative decoding to accelerate generative AI inference

Generative AI models continue to expand in scale and capability, increasing the demand for faster and more efficient inference. Applications need low latency and consistent performance without compromising output quality. Amazon SageMaker AI introduces new enhancements to its inference optimization toolkit that bring EAGLE-based adaptive speculative decoding to more model architectures. These updates make


Train custom computer vision defect detection model using Amazon SageMaker

On October 10, 2024, Amazon announced the discontinuation of the Amazon Lookout for Vision service, with a scheduled shutdown date of October 31, 2025 (see the blog post "Exploring alternatives and seamlessly migrating data from Amazon Lookout for Vision"). As part of our transition guidance for customers, we recommend the use of Amazon SageMaker AI tools


MIT scientists debut a generative AI model that could create molecules addressing hard-to-treat diseases

More than 300 people across academia and industry spilled into an auditorium to attend a BoltzGen seminar on Thursday, Oct. 30, hosted by the Abdul Latif Jameel Clinic for Machine Learning in Health (MIT Jameel Clinic). Headlining the event was MIT PhD student and BoltzGen’s first author Hannes Stärk, who had announced BoltzGen just a few days


Introducing bidirectional streaming for real-time inference on Amazon SageMaker AI

In 2025, generative AI has evolved from text generation to multi-modal use cases ranging from audio transcription and translation to voice agents that require real-time data streaming. Today’s applications demand something more: continuous, real-time dialogue between users and models—the ability for data to flow both ways, simultaneously, over a single persistent connection. Imagine a speech


Warner Bros. Discovery achieves 60% cost savings and faster ML inference with AWS Graviton

This post is written by Nukul Sharma, Machine Learning Engineering Manager, and Karthik Dasani, Staff Machine Learning Engineer, at Warner Bros. Discovery. Warner Bros. Discovery (WBD) is a leading global media and entertainment company that creates and distributes the world’s most differentiated and complete portfolio of content and brands across television, film and streaming. With iconic
