AI

Optimizing costs of generative AI applications on AWS

The report The economic potential of generative AI: The next productivity frontier, published by McKinsey & Company, estimates that generative AI could add an equivalent of $2.6 trillion to $4.4 trillion in value to the global economy. The largest value will be added across four areas: customer operations, marketing and sales, software engineering, and R&D. […]

Optimizing costs of generative AI applications on AWS Read More »

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium

Training large language models (LLMs) models has become a significant expense for businesses. For many use cases, companies are looking to use LLM foundation models (FM) with their domain-specific data. However, companies are discovering that performing full fine tuning for these models with their data isn’t cost effective. To reduce costs while continuing to use the

PEFT fine tuning of Llama 3 on SageMaker HyperPod with AWS Trainium Read More »

Using transcription confidence scores to improve slot filling in Amazon Lex

When building voice-enabled chatbots with Amazon Lex, one of the biggest challenges is accurately capturing user speech input for slot values. For example, when a user needs to provide their account number or confirmation code, speech recognition accuracy becomes crucial. This is where transcription confidence scores come in to help ensure reliable slot filling. What

Using transcription confidence scores to improve slot filling in Amazon Lex Read More »

Improving Retrieval Augmented Generation accuracy with GraphRAG

Customers need better accuracy to take generative AI applications into production. In a world where decisions are increasingly data-driven, the integrity and reliability of information are paramount. To address this, customers often begin by enhancing generative AI accuracy through vector-based retrieval systems and the Retrieval Augmented Generation (RAG) architectural pattern, which integrates dense embeddings to

Improving Retrieval Augmented Generation accuracy with GraphRAG Read More »