Note: This post is co-authored by Simon Aronsson, Senior Engineering Manager for Canonical Observability Stack.
AI/ML is moving beyond the experimentation phase. This shift changes how organisations operate, because productising AI involves many sophisticated processes. Machine learning operations (MLOps) is an emerging practice for automating ML workflows in a scalable and efficient way. But how do you make MLOps observable? How can you better understand how your production-grade AI initiative and its infrastructure are performing?
This is where observability comes in. With open source solutions, observable MLOps is no longer just a nice-to-have, but a business-critical capability. In this post, we will explore this topic together, focusing on how open source helps us level up the reliability, quality and value of our MLOps platform.
An overview of observability
What is observability?
Observability is a measure of how well the behaviour and state of a system can be inferred from its external outputs.
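To make this concrete, here is a minimal sketch of what "external outputs" can look like in practice. It assumes the Python prometheus_client library and uses hypothetical metric names and a stand-in predict function that are not part of this post; it simply shows a workload exposing internal state (request counts, latencies) for an observability stack to scrape.

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical metrics: counters and histograms expose internal state as outputs
REQUESTS = Counter("inference_requests_total", "Total inference requests served")
LATENCY = Histogram("inference_latency_seconds", "Time spent per inference")

def predict(x):
    # Stand-in for real model work; LATENCY.time() records how long it took
    with LATENCY.time():
        time.sleep(random.uniform(0.01, 0.05))
        return x * 2

if __name__ == "__main__":
    # Metrics become available at http://localhost:8000/metrics
    start_http_server(8000)
    while True:
        predict(random.random())
        REQUESTS.inc()
```

From signals like these, an observability platform can reconstruct how the system is behaving without inspecting its internals directly.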