AI

Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality

Deploying large language models (LLMs) at scale on Amazon SageMaker AI Inference makes observability a critical pillar of any production machine learning (ML) strategy. Unlike conventional software that returns deterministic outputs, LLMs generate variable, free-form responses that are difficult to validate with standard metrics. LLM output quality can change over time as input distributions shift,

Training Azerbaijani language models on Amazon SageMaker AI

This solution builds on open source tools including PyTorch, Hugging Face Transformers, and Liger Kernels. The authors would also like to thank Aiham Taleb, Arefeh Ghahvechi, Manav Choudhary, Rohit Thekkanal, Daz Akbarov, Jamila Jamilova, Ross Povelikin, Almas Moldakanov, Christelle Xu, and Ivan Khvostishkov for their contributions in making this project possible. Azercell Telecom LLC, Azerbaijan’s

Build a custom portal with embedded Amazon SageMaker AI MLflow Apps

As ML teams grow, embedding Amazon SageMaker AI MLflow Apps into a custom portal requires a scalable approach to access management. Distributing presigned URLs doesn’t scale for teams with dozens of data scientists, and granting individual AWS Management Console access adds operational overhead for administrators managing access controls. Teams who rely on SSO-integrated internal portals

Streamline external access to Amazon SageMaker MLflow using a REST API proxy

Machine learning (ML) teams use MLflow to manage their ML lifecycle effectively. Amazon SageMaker MLflow provides comprehensive ML experiment tracking and model management capabilities. However, many enterprises have existing infrastructure requirements that need HTTPS-based integrations rather than direct SDK usage. Many organizations need to integrate Amazon SageMaker MLflow with their established systems while maintaining their
