

When I reviewed Amazon SageMaker in 2018, I thought it was quite good and that it had “significantly improved the utility of AWS for data scientists.” Little did I know then how much traction it would get and how much it would expand in scope. When I looked at SageMaker again in April 2020, it was in a preview phase with seven major improvements and expansions, and I said that it was “good enough to use for end-to-end machine learning and deep learning: data preparation, model training, model deployment, and model monitoring.” I also said that the user experience still needed a little work. Amazon SageMaker now has twelve parts: Studio, Autopilot, Ground Truth, JumpStart, Data Wrangler, Feature Store, Clarify, Debugger, Model Monitor, Distributed Training, Pipelines, and Edge Manager. SageMaker appeared often during talks at AWS re:Invent. This diagram summarizes the AWS Machine Learning stack as of December 2020.


Amazon Web Services claims to have the broadest and most complete set of machine learning capabilities. I honestly don’t know how the company can claim those superlatives with a straight face: Yes, the AWS machine learning offerings are broad, fairly complete, and rather impressive, but so are those of Google Cloud and Microsoft Azure.

Historically, AWS has presented its services as cloud-only. That is starting to change, at least for big enterprises that can afford to buy racks of proprietary appliances such as AWS Outposts. It’s also changing in AWS’s industrial offerings, such as Amazon Monitron and AWS Panorama, which include some edge devices.

Amazon SageMaker Clarify is the new add-on to the Amazon SageMaker machine learning ecosystem for Responsible AI. SageMaker Clarify integrates with SageMaker at three points: in the new Data Wrangler, to detect data biases at import time, such as imbalanced classes in the training set; in the Experiments tab of SageMaker Studio, to detect biases in the model after training and to explain the importance of features; and in SageMaker Model Monitor, to detect bias shifts in a deployed model over time.
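
To make “imbalanced classes in the training set” concrete, here is a minimal Python sketch of the kind of check a data-bias detector might run at import time. It is not SageMaker Clarify’s API: the function name `class_imbalance` and the sample labels are hypothetical, and the score is just a normalized difference between the size of one class and everything else.

```python
from collections import Counter


def class_imbalance(labels, positive):
    """Normalized difference between the `positive` class and all other labels.

    Returns a value in [-1, 1]. Zero means the two groups are the same size;
    values near +1 or -1 mean one group dominates the training set.
    (Illustrative helper only, not part of any AWS SDK.)
    """
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n_pos = counts[positive]          # Counter returns 0 for a missing label
    n_other = len(labels) - n_pos
    return (n_pos - n_other) / (n_pos + n_other)


# Hypothetical training labels: 8 approvals and 2 denials.
labels = ["approve"] * 8 + ["deny"] * 2
print(round(class_imbalance(labels, positive="approve"), 2))  # prints 0.6
```

A score of 0.6 here says the “approve” class outnumbers “deny” four to one, which is the sort of skew you would want flagged before training rather than discovered afterward in a biased model.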
