
Workloads - Distributed cloud-to-edge compute with NATS Execution Engine

Deploy anywhere with the edge-native tech stack that bridges regions, runtimes, clouds—and all the way out to any edge

The new NATS Execution Engine (Nex) and Workloads from Synadia unlock seamless cloud-to-edge computing, enabling deployments across any region, runtime, or environment.

Unlike traditional runtimes, Nex bridges and unifies existing infrastructure, offering secure, lightweight compute capabilities built entirely on NATS primitives like KV, object stores, and streams. During the demo, workloads effortlessly moved across clouds with minimal downtime, illustrating how Nexlets empower developers to extend compute flexibility, from multi-cloud setups down to local edge devices such as Raspberry Pis.

With Synadia-managed Workloads, users can rapidly deploy, migrate, and manage distributed applications globally via the familiar Synadia Cloud interface.

“Well, turns out since we built our connectivity layer first and then our data layer on top of that, Nex inherited security. Nex inherited replication of data. It inherited KV. It inherited object store.

So we got to use all that for free.”

— Jordan Rash, Synadia

Go Deeper

Full Transcript

Nate Emerson:

So next up, we are going to have a live presentation from Jordan Rash. He is going to be presenting on a new Synadia product called Workloads, and this is particularly exciting as it enables a lot of use cases for doing processing at the edge. I think there's some really exciting stuff that can be done with distributed inference too, speaking to that AI topic. You know, we heard earlier, I think it was PowerFlex... no, MachineMetrics, that talked a little bit about being able to distribute some of the processing and take advantage of edge devices to do little bits and pieces of processing that would otherwise need to happen on cloud servers somewhere.

And I think that this product is going to lead to a lot more capability for doing that, really empowering this kind of nothing-but-NATS stack. I won't speak too much more to it, but I believe Jordan should be joining us momentarily here.

Jordan Rash:

There we go. Am I here?

I'm here. Thanks so much, Nate. I'm excited to be able to talk about this today. Let me see if I can share some stuff. Cool.

Well, thanks, everyone. This has been a great RethinkConn. I hope I can keep the vibes going. My name's Jordan. I'm one of the members at Synadia trying to bring the NATS Execution Engine and Workloads to the community.

Today, I'm hoping to talk a little bit about the state of the NATS Execution Engine. We've been working on it for about a year. We've gotten amazing community feedback, and I kinda hope it shows when I bring up the demos here in a little bit. We're gonna discuss some of the motivations behind the changes we made, and then hopefully I have some exciting announcements. So first off, we can't talk about Workloads without talking about Nex, the NATS Execution Engine.

So for those of you that aren't aware, Nex was built to try and bring your business logic to the NATS ecosystem. We do that by trying to unify various runtimes while utilizing all the NATS primitives that have made it such a powerful platform: KV, object store, streams. We give you all that for free, bundled, when you run your workloads. One of the things I wanna point out, and Derek talked about it early on, was the order things were built in. When I started working on Nex, I was like, okay.

We've made these engines before. We know all the things we have to hit: networking, firewalls, security.

We gotta get all these things. Well, turns out since we built our connectivity layer first and then our data layer on top of that, Nex inherited security. Nex inherited replication of data. It inherited KV. It inherited object store.

So we got to use all that for free. And by, you know, starting workloads and associating them with things like scoped credentials, we can make the connection between the two pretty seamless. Alright. So, reintroducing Nex. It's been an exciting year.

Early on, we built the platform. We were really happy with it, and we were like, let's take it somewhere other than the cloud. Let's take it somewhere other than the server. And we realized that we had packed everything into a single binary, and it was starting to get really bloated. So we're like, let's take a step back.

Let's consider the architecture. Let's consider whether we need all the dependencies we've brought in, and whether we can build this completely on NATS primitives. Spoiler alert: we could, and that's what I'll share with you next. One of the biggest ways we did this is we decoupled the core, essentially the control plane layer, from the agent layer. So you see on your screen this word, Nexlets.

Well, Nexlets is the term we coined for what are essentially agents in Nexland. They've been built completely on top of the NATS micro constructs. That means, at this point, we can build Nexlets to extend Nex on any NATS client that supports micro, which is pretty much all of them. So I kinda went over this, but the build had ballooned. Build tags were getting kind of hard and confusing, and debugging was taking a very long time.
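To make the micro point concrete, here's a minimal sketch of a micro-based agent in Go. The service and endpoint names are invented for illustration; the talk doesn't show the actual Nexlet SDK contract:

```go
// Sketch of a Nexlet-style agent built on NATS micro. The service and
// endpoint names are hypothetical.
package main

import (
	"github.com/nats-io/nats.go"
	"github.com/nats-io/nats.go/micro"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// micro gives you discovery, stats, and error reporting for free
	// on any client that supports it.
	svc, _ := micro.AddService(nc, micro.Config{
		Name:    "my-nexlet",
		Version: "0.0.1",
	})

	// A hypothetical endpoint the node could call to start a workload.
	svc.AddEndpoint("start", micro.HandlerFunc(func(req micro.Request) {
		req.Respond([]byte(`{"status":"started"}`))
	}))

	select {} // keep serving
}
```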

Those were a lot of the motivators that kinda brought us to the rearchitecture. And then we said, let's embrace nothing but NATS. We have Orbit, which we're using, and it took out a lot of the custom logic around how we parsed messages over NATS. We have micro, which brought us pretty much everything we needed from a lightweight agent framework.

And we're really happy to say, as of today, Nex is a nothing-but-NATS application. The only external import it's bringing in is something we use to validate JSON schema. But we're hoping, now that we have Orbit looking at schema validation, we can pull that in. And then our dependency list will be completely Synadia and NATS based. The other big thing was the edge.

We kept talking about the edge; we wanted Nex to bring workloads to the edge. And then we sat down to define the edge, like, what does that mean? Turns out it means something different to everyone. My edge could be a mainframe, someone else's could be the cloud, and someone else's could be an ESP32 doing soil samples on a farm. You know?

The edge is not concrete. But one of the key values of Nex was that these workloads need to be able to move to whatever edge, and we need to be able to meet people where they're at. So that was another big motivator. And lastly, the community. We had so much positive engagement with Nex after RethinkConn last year.

People were sending in, like, what they wanted to see, where the limitations were. We're like, okay, we're not gonna be able to fill all these things, but what we can do is make Nex pluggable and flexible so that the community can build, and then we can create kind of an ecosystem around Nex as well. So that's essentially what we did. So Nex Core is very lightweight, and it's essentially what used to be Nex.

You can start it. It doesn't have agent or Nexlet dependencies. But something that's very exciting, and we'll talk about here in a minute, is the types of Nexlets we can plug in. We've built some pluggable pieces such as credentialing and state. So when you start agents and workloads, the nodes are actually minting credentials in real time that are scoped to the subjects those things need to talk to.
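As a rough sketch of that idea, here's how a scoped user credential can be minted with the nats-io jwt and nkeys libraries. The subject names and key handling are illustrative assumptions, not Nex's actual internals:

```go
// Sketch: mint a user credential limited to specific subjects.
package main

import (
	"fmt"

	"github.com/nats-io/jwt/v2"
	"github.com/nats-io/nkeys"
)

func main() {
	accountKP, _ := nkeys.CreateAccount() // in practice, the account's signing key
	userKP, _ := nkeys.CreateUser()
	userPub, _ := userKP.PublicKey()

	uc := jwt.NewUserClaims(userPub)
	// Only allow the subjects this workload actually needs to talk on.
	uc.Permissions.Pub.Allow.Add("workloads.acme.orders.>")
	uc.Permissions.Sub.Allow.Add("workloads.acme.orders.>")

	token, _ := uc.Encode(accountKP) // sign the claims with the account key
	fmt.Println(token)
}
```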

So if you start a workload in account A, it's not gonna be able to just listen and talk in account B unless you intend it to, which is another flexibility thing that I think is gonna be really valuable. And then we come to the types of Nexlets. So Nexlets are one of the things I'm most excited about, because they let us extend to wherever we want. The node itself has the ability to run three types of Nexlets. They are all built on the same SDKs, but they're just started differently.

An embedded Nexlet is essentially something that uses the Go SDK, because Nex is Go. You can compile it in, and it makes a single deployable unit. So that means you can take that node with that Nexlet as a single binary and put it wherever you want. That really helps with ease, and that's how I deploy in my home lab, just to make my life easier. However, that limits you to Go, and that, you know, does not embrace flexibility.

So local-based Nexlets are something Nex will run as a child process. These can be written in any language you want: Rust, anything whose NATS client supports micro. And when you look at things like Firecracker and all the other micro VMs, they're typically written natively in Rust. So instead of doing Go SDK wrappers to interact with Firecracker, we'll be able to talk to them directly. And then the last one's the most exciting: remote.

So we thought, how do we get Nex as far out to the edge as we can when it doesn't make sense for the control layer to be running out there as well? So we created a way that you can compile a Nexlet and make it aware that it will be connecting remotely. You ship it to wherever you want, and then you maintain its process state. So you put it in systemd. You keep it running.

And then from there, it extends Nex all the way out to wherever you put it. I have a, I think, fun demonstration here in a second to show just that. Just a few more things about Nexlet life cycles. So, the types of workloads that we decided we should be able to support.

Services, jobs, and functions. They're all kind of self-explanatory, and they just mean different things for how we monitor them. So if someone were to submit a service, that's them telling you, like, hey, if this dies, please restart it. Please try to keep it up. Functions are something a little different.

They're kinda like a service in that we keep them warm. However, they get put on the node, and the code is invoked via a trigger. So you would send a message over a NATS topic or subject, and the function would be kept warm, listening on the subject. As soon as it gets a valid request, it spins up your workload, runs it, and shuts it down.
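From the caller's side, triggering a warm function is just a request on a subject. Here's a minimal sketch; the trigger subject is invented for illustration, since each Nexlet defines its own:

```go
// Sketch: invoke a warm function over its trigger subject.
// "nex.trigger.sentiment" is a hypothetical subject name.
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// A valid request spins the workload up, runs it, and shuts it down.
	resp, err := nc.Request("nex.trigger.sentiment", []byte("RethinkConn is great"), 2*time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Printf("function result: %s\n", resp.Data)
}
```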

This is a way to minimize downtime; it's kinda like functions as a service. The other big flexibility thing is that Nex doesn't define what your start workload request has to look like. When you're building your Nexlet, you get to tell us that, and upon registering with the node upstream, you say, this is what my start workload request looks like. The node's like, got it.
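As a purely hypothetical example of the kind of shape a Nexlet might declare for its start request (the fields here are invented):

```go
// Hypothetical start-workload request shape. Each Nexlet declares its
// own schema at registration, and the node validates requests against it.
type StartWorkloadRequest struct {
	ObjectStoreRef string            `json:"object_store_ref"` // where the workload binary lives
	Env            map[string]string `json:"env,omitempty"`
	Argv           []string          `json:"argv,omitempty"`
}
```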

And anytime a workload request comes in, we actually validate it so that it doesn't get sent all the way back to you if it's bad. And if it's good, we send it along, and from there the Nexlet takes it on. And then we make DevX things like the Nexfile. It's essentially a more succinct way to start a workload without having to type out a bunch of commands.

So I'm going to try a demo. I will explain it first. Locally on my computer, I'm going to start Nex. I'm gonna start it with no agents. Then I have a Raspberry Pi in the other room that I've SSHed into.

I'm gonna start the agent, or the Nexlet. You're gonna see the registration, and then we're gonna start a workload on that Nexlet. Now, this is real fun because of what I've tried to do. Even though it's a little silly, it's fun. We will send a message to the workload. The workload is aware of GPIO on the Raspberry Pi.

It will then send the message over 915 megahertz to this Raspberry Pi that I have sitting behind me that is air-gapped, not on the Internet, nothing. It's just listening on 915 megahertz. That Raspberry Pi will then do sentiment inferencing. It'll look at the message, decide if it's neutral, negative, or positive, and then it'll send it back, and the agent will send it back to our node. So if it all goes well, we should be able to see sentiment analysis over the air.

Alright. Now let me change some screens around. A quick rundown of the screen: over here, what we're gonna do is start Nex. And we're going to look at the node, and we're gonna see that we have no agents running.

So there's really nothing that this node could do at the moment. So if we start the external agent over here on the Pi, you'll see all of a sudden an agent's registered. We'll look again and see we now have this remote agent running with no workloads. Now I'm gonna show you what that agent's unique start workload request looks like real quick.

So the only thing that agent's expecting is a NATS object store reference. And so what that means is we're going to send it with a... sorry, let me find it. There we go. We're gonna send it with a reference to this binary right here that's sitting in the object store on the NATS server on my computer. The agent's then gonna go pull the binary, save it locally, and then start it.
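What the agent is doing here can be sketched with the nats.go JetStream object store API; the bucket, object, and file names are assumptions for illustration:

```go
// Sketch: pull a workload binary out of a NATS object store and save
// it to local disk, the way the agent does in the demo.
package main

import (
	"context"
	"os"

	"github.com/nats-io/nats.go"
	"github.com/nats-io/nats.go/jetstream"
)

func main() {
	ctx := context.Background()
	nc, _ := nats.Connect(nats.DefaultURL)
	defer nc.Drain()

	js, _ := jetstream.New(nc)
	obs, _ := js.ObjectStore(ctx, "workloads") // assumed bucket name

	// Stream the object straight to a local file, then mark it executable.
	if err := obs.GetFile(ctx, "sentiment-agent", "/tmp/sentiment-agent"); err != nil {
		panic(err)
	}
	os.Chmod("/tmp/sentiment-agent", 0o755)
}
```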

Jordan Rash:

Alright. So now we have one workload running on this agent over here, and you see that it saved the binary from the object store to its local disk. Now if everything works, we're going to send it “RethinkConn is great.” You can see the messages coming in over the air. They come in chunks.

So the agent puts them back together and then submits them. And then what we see back in, you know, under a second is that “RethinkConn is great” is a positive comment, and it was done via air-gapped sentiment analysis. It's a fun example that kind of shows you inference on the edge, how flexible we can get, and how far we can push workloads. So this brings me to a really fun announcement: Synadia-managed Workloads.

So much like we have the NATS global supercluster, we have deployed a Nex cluster, which we call a Nexus, in all the same regions as NGS. So, a multi-node, multi-geo, high-availability Nexus that you will be able to run your workloads on in any cloud, in any region that we support. It's fully integrated into the UI of Synadia Cloud, which I'll show you in a second. Placement feels the same as JetStream: when you wanna move your JetStream resources from geo US to geo Asia, you just retag them.
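For reference, retagging a JetStream stream's placement looks roughly like this sketch with the nats.go client (the stream name and geo tags are illustrative):

```go
// Sketch: move a stream between geos by updating its placement tags.
package main

import "github.com/nats-io/nats.go"

func main() {
	nc, _ := nats.Connect(nats.DefaultURL)
	defer nc.Drain()
	js, _ := nc.JetStream()

	info, _ := js.StreamInfo("ORDERS") // hypothetical stream
	cfg := info.Config
	cfg.Placement = &nats.Placement{Tags: []string{"geo:asia"}} // retag to relocate
	js.UpdateStream(&cfg)
}
```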

Same with a workload. If you want to move a workload from AWS to Azure, you just retag it and it moves for you. And we're gonna try and demonstrate that now. Alright. So now we're looking at Synadia Cloud.

It looks very familiar to everything Seth showed earlier, except we have this Workloads tab. And if you look at what I've got deployed here, you'll see that it mirrors the demo Scott just did for us. In fact, it is Scott's demo. The only difference is I don't have a camera, so I have a load generator. And if we look here, I think someone will drop a link in the chat.

We're actually running Scott's demo across three clouds, running on top of Nex, in Synadia Cloud. The bummer is there's no data, but that's because I haven't turned it on, and we're gonna do that together. So let's both do it. Alright. Let me get these screens situated.

Alright. So I'm gonna go to this tab first. What we're going to do is start it, and since Synadia Cloud is multitenant, it is namespaced to only my account. So you'll see now I am using a namespace reference to my account and user credentials that are essentially the exact same credentials a user of NGS or Control Plane would use. So we'll send it, and you'll see... I'm not sharing my screen there, am I?

Sorry. Alright. There we go. So you saw we sent the workload, and almost immediately, we should start to see data get generated. There we are.

And, obviously, a lot of you have found the button and you're clicking it. So what the button's doing is kinda like where Scott put his hand in front of the camera to create the anomalies; you're all doing it with fake data. So that's a lot. I'm glad you all found it.

Now what's fun is if I go over here, I find my... nope. Come on. I got logged out. Oh, you know what that is? There are so many of y'all on this.

You took up all my connections. Connections. Hold on. I can't even log into my own thing. Connections.

Nope. I need, like, 500. Sorry. Save.

Alright. Workloads.

Alright. So now we've seen that our data source is living in GCP. And I don't like that. I wish I would have deployed it in AWS. So what we're gonna do is while you're all playing with that, I'm going to take that workload.

We're going to move it in real time from GCP to AWS. And you should see almost no downtime. So there you go. If we go back to our bars, you can see it's speeding up a little bit. That's because, for a brief second, both load generators are running.

And as we turn it back down, you'll see it mellow back out. So essentially what we've just done, and I kinda wanna show this, is we now have our data source generator running in AWS. We have our inference engine, I believe, running in Azure. And I believe the third piece, the HTTP website y'all are all interfacing with, is running in AWS. So it's a multi-cloud, multi-geo application, and all of this is running over NATS using NATS primitives.

The state of this app pulls from a KV store. I believe that is... yep, right there. So it's all sharing centralized state. That's how we can start and stop things without missing a beat.
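That centralized-state pattern can be sketched with the nats.go KV API; the bucket and key names here are invented for illustration:

```go
// Sketch: every instance, in any cloud, shares one replicated KV bucket.
package main

import (
	"context"
	"fmt"

	"github.com/nats-io/nats.go"
	"github.com/nats-io/nats.go/jetstream"
)

func main() {
	ctx := context.Background()
	nc, _ := nats.Connect(nats.DefaultURL)
	defer nc.Drain()

	js, _ := jetstream.New(nc)
	kv, _ := js.KeyValue(ctx, "app-state") // assumed bucket name

	kv.Put(ctx, "load-generator.rate", []byte("100"))
	entry, _ := kv.Get(ctx, "load-generator.rate")
	fmt.Printf("shared state: %s\n", entry.Value())
}
```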

So I believe, yeah, my only other slide said thank you. Really, thank you. The community's been great around the NATS Execution Engine. I hope that we can continue bringing y'all great stuff.

And, yeah, any feedback is really welcome. Thanks, all.

Nate Emerson:

That is one of the coolest demos, I think, seeing the workload move across. It's just an awesome way to put that together, and also, you know, live debugging as well, getting those connection numbers sorted out real quick. That was perfect.

Jordan Rash:

I knew as soon as I saw those purple markers, I'm like, oh, I wasn't expecting that many.

Nate Emerson:

We got one asking, do you use JetStream under the hood, or only core NATS? I assume this is for Nex.

Jordan Rash:

Yeah. So Nex is pretty much core NATS under the hood. It's all publish, you know, request-reply. Nex does go one step further: it is emitting events that we are saving to a stream so that we can do out-of-band event sourcing eventually.

I do know that the only other piece is, by default, Nex is stateless. So if you were to, like, kill a node and turn it back on, you would have lost your workloads. However, to remedy that, there's an interface there that lets you implement state in different ways, and the official binaries will come with a KV-backed state. So every time you start a workload, it'll store it. You stop it, it'll take it out.

If you kill the node and turn it back on, and this is how the cloud nodes work, it'll go read all the workloads that pertain to it, and it'll auto-start them all. So that's kind of a disaster recovery mechanism.
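That recovery flow might look roughly like this sketch; the bucket name and startWorkload helper are assumptions, not the actual implementation:

```go
// Sketch: on restart, replay every stored workload record and auto-start it.
package main

import (
	"context"

	"github.com/nats-io/nats.go"
	"github.com/nats-io/nats.go/jetstream"
)

// startWorkload is a hypothetical helper that relaunches a workload
// from its stored start request.
func startWorkload(spec []byte) { /* ... */ }

func main() {
	ctx := context.Background()
	nc, _ := nats.Connect(nats.DefaultURL)
	defer nc.Drain()

	js, _ := jetstream.New(nc)
	kv, _ := js.KeyValue(ctx, "nex-node-state") // assumed bucket name

	keys, _ := kv.ListKeys(ctx)
	for key := range keys.Keys() {
		entry, _ := kv.Get(ctx, key)
		startWorkload(entry.Value())
	}
}
```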

Nate Emerson:

Very cool. We got another... oh, it just went away. I think Don was typing an answer. There was talk of being able to deploy based on type, so the agent pulls the appropriate binary type for the operating system or architecture. Is this still on the roadmap for Nex?

Jordan Rash:

Yep. So a lot of the responsibility we took from core Nex and moved to the agent. So if you have a Nexlet that is sensitive to OS and arch, one of the ways you'd do it is you'd either detect that once you got the binary, or you'd stick it in your start workload request. Right now, we bundle, just like we have since the beginning, a native Nexlet, which essentially takes a binary and runs it as a child process. We do bundle that with the actual node by default.

You can turn it off. It will do a little bit of validating to make sure that if you send it an AMD64 binary on an ARM system, it doesn't run it. But the good thing is we have pushed it down to where the author of the Nexlet can do what they please with that.

Nate Emerson:

Right. And then we had Howard in chat just asking, where does Firecracker fit into Nex now?

Jordan Rash:

That's a great question. So as it sits now, the biggest problem with Firecracker was, it was our main way of, you know, jailing all our workloads, but it didn't run on Windows. It was hard to run on OS X. And then people started trying to put it in the cloud, and we found out the only cloud it ran in without problems was GCP, because it was the only one that allowed nested virtualization. You could move it to a bare-metal VM instance in AWS and Azure, but those got expensive very quickly.

There are other ones, like the ARM instances, I think, that are cheaper, but it kind of became a non-happy path. That said, Firecracker is still on the short list for, like, a Nexlet that Synadia might make first class. That list is still being defined. However, a community Firecracker Nexlet would be awesome.

Nate Emerson:

And I saw somebody, I don't know if you touched on this much, but somebody was asking about a Deno Nexlet. And I was wondering if you could touch on the isolates.

Jordan Rash:

So the way it worked originally, we used a V8 implementation called v8go in the first version of Nex. It didn't feel like that was being maintained. So we started looking around, and we landed on Deno isolates. In our cloud offering, if you submit a JavaScript function, as a function-as-a-service type thing, which I did not demonstrate today, that runs as a Deno isolate. So we take your function and we wrap it, you know, to protect ourselves.

We expose some global functions, so, like, interacting with KV, interacting with pub/sub, all that's gonna be free. You won't have to code that. You'll just have to call the global functions, which will all be documented for use. And then...

Nate Emerson:

Felix in chat is asking if a Nexlet is possible on Windows.

Jordan Rash:

Yeah. So that's why we broke it apart. If the NATS client of your language will run on Windows, a Nexlet will run on Windows. We build completely on top of that, so much so that once you have your Nexus deployed, if you do a nats micro ls, you can actually view all the free stuff we give you there. We give you, like, error codes.

We give you, like, metrics, all that stuff you'll be able to leverage to kind of see the health of your Nexus.

Nate Emerson:

This definitely got a lot of chatter. People are very excited about Workloads on NATS. And, again, I think your demo is just so cool.

Jordan Rash:

Awesome. Well, great work. I have bad news for everyone: that was exciting, but wait till the next talk, because we dogfooded Nex, and it's really exciting.