00:00
Hi, I'm Miroslav Klasky, global practice lead for analytics and AI with Pure Storage. Today I want to talk about AI and machine learning with Pure. Artificial intelligence (AI) and machine learning (ML) are top of mind for enterprises all over the world that are doing research into that area.
00:27
Let's discuss the top considerations when it comes to AI and ML. Well, AI is a top area of investment. Why is that? The reason is that it's a great way to differentiate and bring value to customers.
00:53
OK, but AI has a number of challenges that go along with doing it at enterprise scale. AI operations happen in data pipelines, and those data pipelines start with raw input data and move toward final deployed models that are doing inference on whatever the business problem is.
01:29
We can think of AI pipelines as having five major phases. We'll start at the end and then work backwards to the start of the pipeline. The last stage of that pipeline is a deployed model. The challenges in production with deployed models are being able to scale that deployment and deliver timely results with high service levels.
02:07
So before we can deploy a model, we need to train a model. Training models can run for days, and it involves a number of very complicated steps. The challenge in training is to be able to reliably complete these complex jobs multiple times while increasing the model accuracy.
02:42
Those trained models come from promising model candidates that are developed by data scientists, and those data scientists spend a bunch of time exploring possible model options. OK, the challenge when you're exploring and trying to find a good model is having the responsiveness and the room to work, having the performance in your system to be able to
03:15
train it multiple times with large data sets, and the ability to share your work with your collaborators and colleagues. So the challenges in this space come down to performance, space to work, and collaboration. OK, so then how do those data scientists get the
03:45
data to explore? Well, the previous step is to be able to clean and transform raw data. To do that, you need to work with lots and lots of raw data, and the challenges are again responsiveness and the room to work: lots of space to transform that data, and being able to keep track of complex
04:21
transformation steps. So these challenges end up being, again, performance, space, and complexity. OK, and then finally, we get to where we initially start: the raw data sources. And to get that data, we need to ingest it.
04:52
Ingesting data must be done reliably and in a timely manner. So, you know, you're trying to bring that data in through fairly complex pipelines and processes. So the challenges end up being, again, reliability, low latency to make sure you don't drop any streaming data, and dealing with complexity.
05:29
This is sort of what an AI/ML pipeline looks like. So again, starting with the ingestion phase, cleaning and transforming that data, exploring the data to find promising models, training those models up, and then finally deploying them in production so that they're integrated into your products, services, or data-driven decisions.
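The five phases just listed can be pictured as a chain of functions, each consuming the previous phase's output. The following is only a minimal sketch under stated assumptions: the function names, the toy numeric data, and the trivial "mean" model are all hypothetical illustrations and are not taken from the talk or from any Pure Storage product.

```python
# Hypothetical, toy-sized illustration of the five pipeline phases.
# Every name and the "mean" model below are assumptions for illustration only.

def ingest(raw_source):
    """Phase 1: bring raw records in without dropping any."""
    return list(raw_source)

def clean_transform(records):
    """Phase 2: discard missing values and normalize the rest to floats."""
    return [float(r) for r in records if r is not None]

def explore(records):
    """Phase 3: pick a promising model candidate (here, a trivial mean predictor)."""
    return {"kind": "mean", "data": records}

def train(candidate):
    """Phase 4: fit the candidate; for the toy model this is just the average."""
    data = candidate["data"]
    return {"kind": candidate["kind"], "mean": sum(data) / len(data)}

def deploy(trained):
    """Phase 5: expose the trained model as a callable that serves inference."""
    return lambda _features: trained["mean"]

def run_pipeline(raw_source):
    """Run the phases in order, from raw input data to a deployed model."""
    result = raw_source
    for phase in (ingest, clean_transform, explore, train, deploy):
        result = phase(result)
    return result

predict = run_pipeline([1, None, 2, 3])
print(predict(None))
```

In a real deployment each phase would read from and write to shared storage rather than passing in-memory values, which is where the performance, space, and reliability challenges described above come in.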
05:52
OK, so now let's switch and take a look at how Pure uncomplicates AI. The first way is that Pure Storage is simple to deploy and easy to manage. Next, Pure makes it simple to share data among different applications, phases of the pipeline, or collaborators.
06:32
Performance with Pure is strong for all workloads due to our unique architecture. Pure also allows AI clusters to consolidate all of their data storage on a small number of platforms, scale those platforms on the fly, and deliver different service levels.
07:09
Finally, Pure delivers the reliability that's needed to run training jobs for days at a time and just be able to work with your applications without having to worry about storage. So to wrap up, we learned how storage plays an important role in AI/ML pipelines and how Pure uncomplicates AI. So how do we do that, exactly? Well, we provide storage that is purpose-built
07:42
for AI. FlashBlade//S has been designed to be a high-throughput, shared, scale-out architecture that delivers the amazing throughput that large-scale AI/ML pipelines need. It'll scale with your needs, and it makes it very easy to share data among different researchers and collaborators.
08:05
In addition, we offer AIRI, which is our AI-Ready Infrastructure that helps you get started very quickly with a proven design and no need for a DIY learning curve to figure out how this all works. To learn more, check out the resources and links in the description below, and thank you for watching.