
AI at the Edge With Nothing But NATS

Distributed AI Made Simple: Decouple training and inference, move workloads seamlessly between cloud and edge with the built-in location transparency of NATS.

Synadia's Scott Daniels demonstrated distributed anomaly detection at the edge using nothing but NATS. Using an Isolation Forest (iForest) model for lightweight anomaly detection, Scott showed how easily AI and machine learning workflows integrate with NATS messaging and JetStream for real-time data streams and model updates.

"Should we need more CPU from a model generation perspective, we could move this model generator off to a machine that has more CPU. A simple little Raspberry Pi might not be enough to do our model generation in the real world. But moving it away doesn't really mean that we have to make any changes because the communication still happens via NATS.

“If we need to export models to some kind of external repository, say a MongoDB or something, we could introduce a NATS connector that would listen on the JetStream key-value store. And anytime there was a change to the models, it could push one off and put it into a different environment.”

— Scott Daniels, Synadia

Go Deeper

Full Transcript

Scott Daniels:

Hi. I'm Scott Daniels. I'm a senior software developer with Synadia. And today, I'd like to talk a little bit about distributed anomaly detection at the edge. What we're going to do is to show how a machine learning use case is very easily supported by NATS.

And we're going to do this by having just a very quick overview of what machine learning is as it applies to our demo. We'll take a look at the use case architecture that we've got in place for this, and then we'll show it off with a little bit of a demo at the end. So what is machine learning? We start with data collection. We have one or more processes that generate some data for us to use, and we collect that into a dataset, typically called a training dataset.

This data could also come into our trainer as a live or real-time feed. It doesn't really matter. The model trainer's goal is to take this training data and to learn how to classify it and eventually generate a model. And the model that it generates can then be used when we see unknown data, data that we've never seen before. We can use an inference process, which knows how to read the model and knows how to apply the unknown data to that model to classify each of the samples of unknown data.

Now, anomaly detection: an anomaly is a sample that is significantly different from the normal. There are several algorithms that we could have chosen to use, and we're using iForest. It and its derivatives are lightweight. They're very easy to use. They can be quick to train.
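
For illustration, here is a minimal sketch of that train-then-classify cycle using scikit-learn's IsolationForest. The 32-bin feature layout, the sample count, and the random data are our own assumptions for the example, not code from the demo.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Stand-in training set: one row per frame, here a 32-bin intensity histogram.
# The bin count and the ~25,000-sample window are illustrative only.
rng = np.random.default_rng(0)
training_data = rng.random((25_000, 32))

# Train the lightweight iForest model on "normal" data.
model = IsolationForest(n_estimators=100, random_state=0)
model.fit(training_data)

# Classify unknown samples: predict() returns 1 for normal, -1 for anomalous.
unknown = rng.random((5, 32))
print(model.predict(unknown))
```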

For some instances, we only need 25,000 or so samples, and so this is very easy for us to do at the edge. So how are we going to do this today for a demo? We're going to start by having a data source, and our data source today is a camera, and it is feeding us a live stream of data into a collection process. For each frame of data that the camera provides, we're going to convert that into a histogram of pixel intensities. And that is going to be sent via NATS pub/sub to a process that knows how to convert histograms into samples that can be used by the iForest model generator.
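
For illustration, here is a minimal sketch of that collection step in Python, using the OpenCV and nats-py client libraries: each frame is reduced to a histogram of pixel intensities and published over core NATS pub/sub. The subject name edge.histograms and the 32-bin histogram are assumptions made for the example, not details from Scott's implementation.

```python
import asyncio
import json

import cv2    # pip install opencv-python
import nats   # pip install nats-py


async def main():
    nc = await nats.connect("nats://localhost:4222")
    cap = cv2.VideoCapture(0)   # the data source in the demo is a camera

    try:
        while True:
            ok, frame = cap.read()          # paced by the camera's frame rate
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # Reduce the frame to a 32-bin histogram of pixel intensities.
            hist = cv2.calcHist([gray], [0], None, [32], [0, 256]).flatten()
            # Publish it over plain NATS pub/sub for the histogram-to-sample converter.
            await nc.publish("edge.histograms", json.dumps(hist.tolist()).encode())
    finally:
        cap.release()
        await nc.drain()


asyncio.run(main())
```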

As it converts those samples, they are also passed by a NATS pub/sub mechanism to our model generation process. And like I said before, its goal is to learn how to classify these samples and to generate a model. Eventually, it does generate a model, and it will store that into a NATS JetStream key-value store or NATS JetStream object store, and that makes it easy for our inference engine to pick up a model either when it starts or to get a new one as each is generated, because as we're running through this pipeline, the model generation is continuous. So the inference engine is notified when the next one's available, and it can start to use that next model. So to demonstrate that today, we're going to run a small little process in a browser, and that process receives histograms.
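
A minimal sketch of that hand-off, assuming nats-py's JetStream key-value API and a pickled scikit-learn model: the generator puts each newly trained model into a bucket, and the inference engine picks up the current one when it starts (and can watch the same key for new revisions). The bucket and key names are illustrative assumptions.

```python
import asyncio
import pickle

import nats
import numpy as np
from sklearn.ensemble import IsolationForest


async def main():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()

    # Model generator side: train on the latest window of samples and store the model.
    kv = await js.create_key_value(bucket="models")
    samples = np.random.rand(25_000, 32)    # stand-in for real histogram samples
    model = IsolationForest(n_estimators=100).fit(samples)
    revision = await kv.put("iforest.latest", pickle.dumps(model))
    print("stored model revision", revision)

    # Inference engine side: pick up the current model at startup.
    # (It could also watch this key to be notified of each new revision.)
    entry = await kv.get("iforest.latest")
    current = pickle.loads(entry.value)
    print("loaded model revision", entry.revision, type(current).__name__)

    await nc.drain()


asyncio.run(main())
```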

So our data collection, in addition to sending out histograms based on frames for model generation, subsamples that at a rate of about a tenth, because 30 frames a second would just overwhelm us visually. And for every 10 samples that it sends to model generation, it sends the same histogram down to the monitor, which presents it visually as something that we call hotbars. And for each hotbar that the monitor receives, it sends a request to the inference engine asking the inference engine to classify that particular hotbar, and it gets a reply back which indicates whether that hotbar is a normal-looking histogram or whether it looks to be anomalous. And if it's anomalous, we'll see that flagged as the histograms stream by. So let's have a look at our monitor environment.
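
A minimal sketch of that classification round trip using NATS request/reply: the inference service answers on a subject, and the monitor asks it to score one histogram. The subject name, the JSON payload, and the stand-in model are assumptions for the example.

```python
import asyncio
import json

import nats
import numpy as np
from sklearn.ensemble import IsolationForest


async def main():
    nc = await nats.connect("nats://localhost:4222")

    # Stand-in for the continuously trained model (illustration only).
    model = IsolationForest(n_estimators=50).fit(np.random.rand(1000, 32))

    # Inference side: classify each histogram it is asked about.
    async def classify(msg):
        hist = np.array(json.loads(msg.data)).reshape(1, -1)
        label = model.predict(hist)[0]        # 1 = normal, -1 = anomaly
        await msg.respond(b"normal" if label == 1 else b"anomaly")

    await nc.subscribe("edge.inference", cb=classify)

    # Monitor side: ask about one hotbar and show the classification that comes back.
    hotbar = np.random.rand(32).tolist()
    reply = await nc.request("edge.inference", json.dumps(hotbar).encode(), timeout=2)
    print("classification:", reply.data.decode())

    await nc.drain()


asyncio.run(main())
```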

Up at the top, we have our hotbars, and it's kind of hard to see that they're moving because we're in a very steady state, but they are moving across from right to left. Each one of these vertical bars represents a frame of data, or histogram, that we've received from the camera. Of the message types that we're tracking, Histograms is the number of histograms that we've sent from the collection process to the histogram-to-sample converter, and that's roughly 30 per second, which is our camera frame rate. Samples received is the number of samples that have been received by model generation. And, again, that should match very closely to our camera frame rate.

Inference requests is the number of requests that we've made from the monitor to the inference process, and we said that was about a tenth, so that's roughly three per second. And currently, we're generating about two models an hour, and that's what this model notification rate is. So we're in a steady state now, which means that we don't see a whole lot going on in this set of hotbars, and we don't see any of these tagged. There would be triangles at the top if we had any anomalies.

So what we're going to do is we're going to introduce some anomalies. And to show you that I'm not cheating on this, I'm going to bring up a picture. This is a live picture from the camera that is generating the data. And when I wave my hand in front of this camera, we should see some change to the histograms, because we've changed the steady state of input to this camera. And those should not match our model, and so those should be flagged as anomalies.

So when I move my hand over and rub my hand in front of this camera, we do: we start to see some perturbation to the histograms. And each one of those histograms captured while my hand was in front of the camera is tagged with a little triangle on the leader line, indicating to us that it was not classified as a normal event from the model's perspective. And I'll do that again. I'll wave my hand in front of the camera here.

And when I do that, I get another set of anomalous events. And when I remove my hand, we're back to a steady state, back to normal. So just to wrap up, I want to go back to the slides, and I've got one last slide beyond this demo. This is a demo that proves the concept. But what could we do to extend this?

Should we need more CPU from a model generation perspective, we could move this model generator off to a machine that has more CPU. For the demo here, all of these processes are running on a simple little Raspberry Pi, which might not be enough to do our model generation in the real world. But moving it away doesn't really mean that we have to make any changes, because the communication still happens via NATS. If we need to export models to some kind of external repository, say a MongoDB or something, we could introduce a NATS connector that would listen on the JetStream key-value store. And anytime there was a change to the models, it could push one off and put it into a different environment.
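
A minimal sketch of what such an export bridge could look like, assuming nats-py's key-value watch API and pymongo: it watches the model bucket and copies each new revision into MongoDB. The bucket name, Mongo URI, and collection are illustrative assumptions, and this hand-rolled watcher merely stands in for a real NATS connector.

```python
import asyncio

import nats
from nats.errors import TimeoutError as NatsTimeoutError
from pymongo import MongoClient   # pip install pymongo


async def main():
    nc = await nats.connect("nats://localhost:4222")
    kv = await nc.jetstream().key_value("models")   # bucket created by the model generator

    mongo = MongoClient("mongodb://localhost:27017")
    exported = mongo["edge"]["models"]

    watcher = await kv.watchall()
    while True:
        try:
            entry = await watcher.updates(timeout=10)
        except NatsTimeoutError:
            continue                      # no change yet, keep waiting
        if entry is None:                 # marker: initial replay finished
            continue
        exported.insert_one({
            "key": entry.key,
            "revision": entry.revision,
            "model": entry.value,         # serialized model bytes
        })
        print("exported", entry.key, "revision", entry.revision)


asyncio.run(main())
```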

We might need to have more than one source of data. These could be distributed around, but all of them still publishing through a subject via NATS, all still going into this converter here and into model generation; there's a sketch of that below. So there's a lot of flexibility to be gained from a distributed perspective in addition to what we've just shown. So I'd like to thank you for your time, and that ends our demo today. Okay.
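
Here is a minimal sketch of that multi-source fan-in: each source publishes on its own subject, and one wildcard subscription on the converter side covers them all, so adding sources requires no changes downstream. The edge.<source>.histograms subject layout is an assumption for illustration.

```python
import asyncio
import json

import nats


async def main():
    nc = await nats.connect("nats://localhost:4222")

    # Converter / model-generation side: one wildcard subscription covers every source.
    async def on_histogram(msg):
        source = msg.subject.split(".")[1]
        hist = json.loads(msg.data)
        print(f"histogram from {source}: {len(hist)} bins")

    await nc.subscribe("edge.*.histograms", cb=on_histogram)

    # Data-source side: each camera or sensor publishes on its own subject.
    for source in ("camera1", "camera2", "thermal1"):
        await nc.publish(f"edge.{source}.histograms", json.dumps([0.0] * 32).encode())

    await nc.flush()
    await asyncio.sleep(0.1)   # give the subscription callbacks a moment to run
    await nc.drain()


asyncio.run(main())
```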

So the question that you just sent me says: LLMs are all the rage now, but they are very much cloud-centric. What are the opportunities for AI at the edge? You know, this is certainly not an LLM, and there's a lot more to AI/ML than LLMs. The advantages here are that we can put the models and the inference very, very close to where the data source is, and in a remote environment we can be disconnected, without having to depend on anything that's, well, remote. So there are a lot of advantages to that from our perspective.

And with NATS, because it's seamless communication for us underneath the covers, it doesn't really matter where we put things. So it allows us to put things where it makes sense. That's the second question: why is NATS a great fit for AI/ML cases? To be honest, it is a great fit because it makes sense to write some of these AI workloads as distributed processes, where we may have multiple data sources feeding into the generation of a single model, and NATS allows us to collect that data from those remote places.

We don't have to figure out how to get everything to one spot. In this particular case, with the single camera, we only have one data feed coming in, because more wouldn't make sense. But if we're running in an environment where we're worried about latency across maybe several servers or several different components within the system, you would want to have an average latency computed from all of that information. So all of that comes in. But again, it all feeds in through NATS, and it all feeds in on subjects that we can listen to.

One of the things we didn't call out in this particular demo is that the model trainer can actually listen to multiple subjects and build multiple models concurrently. And so those might be done with different feature counts or different data rates, things like that. And so we could possibly have multiple models coming out of the same data set at the same time, to allow some flexibility there. So just being able to divide that up on a subject basis is fantastic.
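
A minimal sketch of that idea: one trainer process subscribes to several sample subjects and keeps a separate training buffer and model per subject. The subject names and retraining window are illustrative assumptions.

```python
import asyncio
import json
from collections import defaultdict

import nats
import numpy as np
from sklearn.ensemble import IsolationForest

WINDOW = 25_000               # retrain after this many samples per subject (illustrative)
buffers = defaultdict(list)   # per-subject training buffers
models = {}                   # per-subject models


async def main():
    nc = await nats.connect("nats://localhost:4222")

    async def on_sample(msg):
        buffers[msg.subject].append(json.loads(msg.data))
        if len(buffers[msg.subject]) >= WINDOW:
            data = np.array(buffers[msg.subject])
            models[msg.subject] = IsolationForest(n_estimators=100).fit(data)
            buffers[msg.subject].clear()
            print("retrained model for", msg.subject)

    # One subscription per feed; each subject gets its own buffer and model.
    for subject in ("samples.hist32", "samples.hist64"):
        await nc.subscribe(subject, cb=on_sample)

    # Keep consuming; in the demo this trainer runs continuously.
    await asyncio.Event().wait()


asyncio.run(main())
```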

Nate Emerson:

Awesome. Well, thank you so much, Scott, and thank you for being here. And, again, Justina dropped the link in the chat there that serves as a bit of a summary of that presentation as well. Thank you so much for the time, Scott.