See how NATS solves Intermittent Connectivity at the Edge
Robert Hughes shares how PowerFlex leveraged NATS in their tech stack modernization, and how NATS's ease of use and functionality organically fueled the expansion of NATS within PowerFlex's energy management solutions.
“I will say the one really appealing feature of NATS JetStream is the automatic sends, or the guarantees of sends, even when the Internet connection goes out. That's become extremely useful time and time again. And again we have hundreds and hundreds of leaf nodes. And the Internet is bad at most of them. So this is a really useful thing that we get out-of-the-box with NATS JetStream.”
— Robert Hughes, Engineering Manager, PowerFlex
Full Transcript
Nate Emerson:
Here we're speaking with Robbie Hughes from PowerFlex, which provides clean tech solutions for carbon-free energy, things like solar, storage, and EV charging. Robbie, thanks so much for joining us today. Can you tell me about yourself and what PowerFlex does?
Robert Hughes:
I'm an Engineering Manager at PowerFlex. I've been at PowerFlex for almost half a decade. I've worked kind of all over the company, from mobile app to back end, and my biggest, newest venture was setting up a massive JetStream for PowerFlex. PowerFlex, as you mentioned, is in the EV space. We do solar, storage, EV charging, obviously, and we're really an energy management solution. So if you find yourself wanting to set up EV chargers at your apartment complex or company, let us know.
Andrew Connolly:
Awesome. Well, why don't we start then, with a little background on the tech stack and sort of what your infrastructure looked like when you got to that NATS and NATS JetStream project?
Robert Hughes:
Yeah. So when I started at PowerFlex a long time ago, the company looked completely different. We were pretty new. We had a couple of EC2 instances per customer, and we were manually deploying. Everything was HTTP, and we went through this big initiative to modernize everything. So we went to Kubernetes with automatic deployments. And we actually moved cloud providers to GCP. And during this modernization phase an engineer introduced NATS, and nobody knew what NATS was; it was just kind of like the hot new broker. And little by little, teams just kind of started using NATS more and more, especially from our edge to cloud. You know, every customer of ours has their own edge Kubernetes cluster. In order to talk to our cloud Kubernetes cluster, all the traffic sort of started gradually going towards NATS, and it was great. And, you know, NATS is super fast. But what we started doing was like really important transactional stuff. And, as you can imagine, in a parking garage, maybe not the best Internet connection all the time, right? And so that's when I was like, well, we may need to start using this thing called JetStream that I've read about here and there.
Andrew Connolly:
Nice. So it sounds like a very common story, maybe the most common story across our users. NATS starts small. Gets a foothold somewhere. Then one or two folks evangelize it, and then it spreads organically and starts sucking up use cases. So you guys were probably using core NATS in some kind of fire-and-forget type configuration, and then you realized you needed delivery guarantees and that persistence layer. Is that right?
Robert Hughes:
Yeah, exactly. We were using fire-and-forget for things that shouldn't be fire-and-forgot about.
Andrew Connolly:
Was there some pain associated with lost data or lost transactions?
Robert Hughes:
Yeah, here and there, but not as much as you would think. I feel like when you read, or you start learning about NATS versus JetStream, you kind of start thinking like, “oh, no, NATS will drop messages”, which wasn't really that common. But it definitely happened. And so that's when we started building out our JetStream architecture. And as a company, we want to move more towards event-driven architecture anyway. So JetStream has kind of been perfect for that.
Andrew Connolly:
Very nice. Can you tell us a little bit more about that edge-to-cloud architecture that you mentioned? Sort of how that factors into the services you guys provide and where NATS plays in that architecture?
Robert Hughes:
Yeah, sure. So we have kind of broken out a certain number of streams and we call them domains. And we have these in our cloud. And every single edge site has these same streams, and they're sourcing it from one another. Right? So let's choose a domain orders. So at edge site A, if all of these order events are happening in this one stream, it's getting sourced into our cloud; what's kind of unique is we found instead of sourcing one-by-one-to-everything, we actually found some efficiencies if we bucketed all the edge sourcing streams into one upload stream. And then uploaded all of those with one sourcing to the cloud.
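A rough sketch of the layout described here, written as JetStream stream configurations: per-domain streams at each edge site, one upload stream that sources from all of them, and a cloud stream with a single source per site. This is an illustration under assumptions, not PowerFlex's actual configuration. The stream names (`ORDERS`, `UPLOAD`, `EDGE_UPLOADS`), the `telemetry` domain, and the site names are hypothetical, and the sketch assumes each edge site runs in its own JetStream domain so the cloud can reach it through the `$JS.<domain>.API` prefix.

```python
import json

# Hypothetical business domains; the interview only names "orders".
BUSINESS_DOMAINS = ["orders", "telemetry"]


def edge_streams(domains):
    """Streams at one edge site: one per business domain, plus one upload stream."""
    domain_streams = [
        {"name": d.upper(), "subjects": [f"{d}.>"], "storage": "file"}
        for d in domains
    ]
    # The upload stream has no subjects of its own; it only sources
    # from the local domain streams.
    upload = {
        "name": "UPLOAD",
        "storage": "file",
        "sources": [{"name": s["name"]} for s in domain_streams],
    }
    return domain_streams + [upload]


def cloud_stream(site_js_domains):
    """Cloud stream with exactly one source per edge site (assumed layout)."""
    return {
        "name": "EDGE_UPLOADS",
        "storage": "file",
        "sources": [
            # Assumption: each site is its own JetStream domain.
            {"name": "UPLOAD", "external": {"api": f"$JS.{site}.API"}}
            for site in site_js_domains
        ],
    }


if __name__ == "__main__":
    print(json.dumps(edge_streams(BUSINESS_DOMAINS), indent=2))
    print(json.dumps(cloud_stream(["site-a", "site-b"]), indent=2))
```

With this shape, adding a business domain changes only the edge-side upload stream, while the cloud keeps one source per site rather than one per site per domain.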
Andrew Connolly:
Okay, was that an efficiency that you guys stumbled on or worked with anyone?
Robert Hughes:
Oh, no. We worked very closely with Synadia during this whole process.
Andrew Connolly:
It sounds like, at least for this part of the architecture, it was sort of a greenfield solution. You guys weren't replacing anything or moving away from a legacy tech, is that right?
Robert Hughes:
Yeah, correct. We had that luxury.
Andrew Connolly:
Very nice. So when it came time to add that persistence layer or to get higher delivery guarantees for certain situations, was JetStream the obvious choice? How did you kind of make that selection?
Robert Hughes:
Well, we had core NATS, and we were so tightly coupled to core NATS. I mentioned before that our edge-to-cloud traffic was all through NATS, but really all of the traffic at our edge sites is through NATS services. We were pretty tightly coupled to NATS to start, and so we knew that we really wanted JetStream to work. And JetStream has worked well, but it wasn't like we really had the option to go and look for other solutions. I will say the one really appealing feature of JetStream is the automatic sends, or the guarantees of sends, even when the Internet connection goes out. That's become extremely useful time and time again. And again we have hundreds and hundreds of leaf nodes. And the Internet is bad at most of them. So this is a really useful thing we just get out of the box with JetStream.
Andrew Connolly:
You actually took the next question right out of my mouth. I was going to ask about intermittent connectivity. And it sounds like it's a thing that plagues or plagued you guys on a regular basis.
Robert Hughes:
Well, not anymore. Yeah, just the power of JetStream.
Andrew Connolly:
So that architecture is basically leaf nodes running at the edge. I think sometimes we call it store and forward, right? Like it's gonna collect data locally and then when the connection is back…
Robert Hughes:
Yeah, and that’s done through the stream sourcing, yes. And then the broader architecture is we have N leaf nodes, one per site, and each leaf node is a three node cluster of servers. And then our cloud cluster is obviously way bigger. And every server is a part of the same cluster with just a unique server name.
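The store-and-forward behavior discussed here can be illustrated with a toy model. This is not NATS code and makes no claim about how JetStream implements sourcing internally; the class names are made up for illustration. It only shows the idea: the edge keeps appending to a local log while offline, and the cloud side resumes from the last sequence it copied once the link returns.

```python
class EdgeStream:
    """Append-only local log; stands in for a stream at an edge site."""

    def __init__(self):
        self.messages = []

    def publish(self, payload):
        self.messages.append(payload)
        return len(self.messages)  # 1-based sequence number


class CloudSource:
    """Copies from an edge stream, remembering the last sequence it copied."""

    def __init__(self, edge):
        self.edge = edge
        self.last_seq = 0
        self.received = []

    def sync(self, connected):
        """Copy anything new if the link is up; return how many were copied."""
        if not connected:
            return 0
        pending = self.edge.messages[self.last_seq:]
        self.received.extend(pending)
        self.last_seq += len(pending)
        return len(pending)


if __name__ == "__main__":
    edge = EdgeStream()
    cloud = CloudSource(edge)

    edge.publish("order-1")
    print(cloud.sync(connected=True))   # link up: copies the one message

    edge.publish("order-2")
    edge.publish("order-3")
    print(cloud.sync(connected=False))  # link down: nothing copied, nothing lost

    print(cloud.sync(connected=True))   # link back: catches up in order
    print(cloud.received)
```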
Andrew Connolly:
Gotcha. Can you give us a sense, at the business or functional level, of what sorts of data you guys are collecting from the customer installations? Are we talking like telemetry? Or what sorts of things are going on with that data?
Robert Hughes:
Yeah, telemetry is a big one for sure, especially with our storage and solar. But outside of telemetry we have, as I mentioned before, adaptive load management. So there's a lot of core business logic to EV charging happening at the edge that needs to get sent up to our cloud.
Yeah, so not just telemetry, which is pretty fire-and-forgetty.
Andrew Connolly:
Yeah, the higher value data tends to have the persistent delivery guarantees around it.
Robert Hughes:
Yeah, we definitely need that.
Andrew Connolly:
How has NATS and/or Synadia impacted the developer or operator experience at PowerFlex? Has it made a difference in the way you guys operate internally?
Robert Hughes:
Yeah, NATS and JetStream by themselves are easy to work with. We've created some things on our side, like a specification repo. It's sometimes hard to keep track of all the events in the system. And so we've gone above and beyond in kind of librarying out what we have, which has helped with documentation. But I mean, NATS and JetStream, it's been really easy to use. JetStream requires a stream setup. So every time we get a new customer, we have to commission a new site, we have to add new streams. That's maybe the biggest hiccup. But all in all, we're pretty happy with the technology. And Synadia, I mean the support we've gotten through Synadia building this system out has been unbelievable. They've been so helpful and so nice. I've gotten to know a lot of the support engineers over there.
Andrew Connolly:
Very cool. How about a little more forward looking? Anything on the roadmap that NATS will factor into? Either because it's an obvious fit, or because NATS sort of makes it uniquely possible?
Robert Hughes:
I don't wanna get too deep into some competitive advantages we're going towards. But I mean, we are totally shifting to more of an event-driven architecture. And we couldn't do that without NATS JetStream, especially with the intermittent connectivity. It would otherwise take years to set something else up, right?
Andrew Connolly:
Put yourself, if you will, in the shoes of somebody in the audience. Maybe they're new to NATS or just digging in a little bit. Any advice or lessons learned for folks watching, things that you picked up early in your NATS journey that you would pass on?
Robert Hughes:
Yeah, when I first started learning about JetStream the documentation really confused me, and I found YouTube tutorials that were way better. So I would recommend going down that route.
Andrew Connolly:
We have definitely heard a lot of praise over the years for the YouTube content. That's good advice.
Robert Hughes:
And server events, those are really useful. If you start admining NATS systems, listening to server events is wildly useful.
Andrew Connolly:
Can you say a little bit more about what you mean by server events, and how you guys are taking advantage of or capitalizing on those?
Robert Hughes:
Well, it's really just when something's going wrong, I'll sub on the NATS Server prefix. I don't remember what it is. I'd have to look. I have a book of handy things I listen to, but there's all sorts of internal subjects that you can just sub on when you're debugging that are really useful. You know, if you want to see if a consumer is hacky or something you can... I think it's $JS.API and things like that. There's just these events that are wildly useful, and it's in the documentation. So it's definitely useful to learn those.
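For reference, the documented prefixes are `$SYS.>` for server events (published in the system account), `$JS.EVENT.ADVISORY.>` for JetStream advisories, and `$JS.API.>` for JetStream API requests. The sketch below is a small stand-alone matcher for NATS subject wildcards (`*` matches one token, `>` matches one or more trailing tokens), used to show which kinds of subjects a debugging subscription would see. The matcher is an illustration, not the server's implementation, and the stream name `ORDERS`, consumer name `worker`, and server ID `SRV1` are hypothetical.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style wildcard pattern matches a subject."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, tok in enumerate(p_tokens):
        if tok == ">":
            # '>' is only valid as the last token and needs at least one token.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if tok != "*" and tok != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


if __name__ == "__main__":
    subscriptions = ["$SYS.>", "$JS.EVENT.ADVISORY.>"]
    samples = [
        "$SYS.SERVER.SRV1.STATSZ",
        "$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.ORDERS.worker",
        "$JS.API.STREAM.INFO.ORDERS",
    ]
    for subject in samples:
        hits = [p for p in subscriptions if subject_matches(p, subject)]
        print(subject, "->", hits)
```

Subscribing to the broad prefix while debugging and narrowing it afterwards keeps the volume manageable on a busy system.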
Andrew Connolly:
How about interoperability? So it sounds like you guys are all in on NATS. A lot of, if not all, of your sort of event-driven messaging and communication is NATS based? Are there any integration points or handoffs or situations where NATS needs to work with other databases or work with other protocols that are worth talking about?
Robert Hughes:
It's actually funny you mention that. We have a customer who wants us to start using our NATS system with MQTT. And so we're in the early stages of getting that pushed out. But yeah, it looks like we're gonna have NATS function as an MQTT broker pretty soon.
Andrew Connolly:
Is that because this customer wants to do some of their own analysis or reporting?
Robert Hughes:
Yeah.
Andrew Connolly:
It's interesting. Almost that exact use case has come up recently in another chat, and people often don't even know that NATS is an MQTT broker. So it's a nice little add-on; it's just kind of there waiting to be leveraged when needed.
Robert Hughes:
Yeah, I didn't know it either. So I looked, and it turns out, yeah.
Andrew Connolly:
Cool. Robbie, thanks so much for chatting. I really enjoyed learning about how PowerFlex is leveraging NATS, and I'm looking forward to when I encounter a PowerFlex EV charger out in the wild. Thanks.
Nate Emerson:
Alright. We've got Robbie joining us live. Thank you so much for the time, Robbie. Thank you for sharing your story with PowerFlex, and we've got two questions from the audience.
First, what does the scale look like in terms of how many events you are seeing sent from the edge? And how many leaf servers and hubs do you have? What kind of auth do you use?
Robert Hughes:
Yeah, we have about 500 leaf nodes. Each leaf node is a 3 server cluster. We have one hub, I think it's 13 servers and they're all connecting as leaf nodes. In terms of scale, we send on average a few gigs per leaf node a day. It equates to like a few million, so pretty small messaging.
Pretty similar to MachineMetrics, we use the decentralized JWT and NKeys auth. We have a single account from our hub to our leaf nodes. And each customer, through signing keys, gets a unique NKey for authentication, but managed with one signing key, and permissions and wildcards.
Nate Emerson:
Can you tell us a bit more about the specification repo you mentioned? So what was the motivation behind that? And how do you interact with it?
Robert Hughes:
Kind of similar to MachineMetrics. It's a thing we wrote ourselves at PowerFlex. We have a GitHub repo of the cloud events. It's following the CloudEvents specification and AsyncAPI. But we take it a step further: we actually have a CI/CD pipeline that turns it into libraries. So we have Python, we have TypeScript for Node, we have Rust. So it takes the spec, and it generates event payloads and even publishers into libraries that developers can use by just updating a package.
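As a loose illustration of what a spec-driven event library might look like, the sketch below turns a spec entry into a builder that validates a payload and wraps it in a CloudEvents 1.0 envelope (`specversion`, `id`, `source`, and `type` are the attributes the specification requires). Everything else is an assumption made for the example: the spec layout, the event type, the field names, and the NATS subject are hypothetical, and PowerFlex's actual generator and output are not described in the interview.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical spec entry; a real repo would hold many of these.
SPEC = {
    "type": "com.example.orders.created",
    "nats_subject": "orders.created",
    "fields": {"order_id": str, "kwh": float},
}


def make_event_builder(spec):
    """Return a function that builds (subject, payload bytes) for one event type."""

    def build(source, **data):
        expected = set(spec["fields"])
        if set(data) != expected:
            raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
        for name, typ in spec["fields"].items():
            if not isinstance(data[name], typ):
                raise TypeError(f"{name} must be {typ.__name__}")
        envelope = {
            "specversion": "1.0",
            "id": str(uuid.uuid4()),
            "source": source,
            "type": spec["type"],
            "time": datetime.now(timezone.utc).isoformat(),
            "datacontenttype": "application/json",
            "data": data,
        }
        return spec["nats_subject"], json.dumps(envelope).encode()

    return build


if __name__ == "__main__":
    build_order_created = make_event_builder(SPEC)
    subject, payload = build_order_created("edge/site-a", order_id="o-1", kwh=7.5)
    print(subject)
    print(payload.decode())
```

A publisher in such a library would pass the returned subject and bytes to whatever NATS client the service already uses.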
Nate Emerson:
Awesome, and then one kind of fun one, just out of curiosity. Do you have any major wish list items or features that you're looking forward to in new NATS Server developments, especially particular to edge and leaf node deployments?
Robert Hughes:
I'm pretty stoked about the tracing personally. It's something that I wish NATS had had. Yeah, I couldn't be more excited about it.
Nate Emerson:
Awesome. Well, hey, thank you so much for the time today, Robbie. Thank you for joining us, and I'm looking forward to charging my EV sometime at a PowerFlex station.
Robert Hughes:
Oh, yeah. Yeah, please do. Take care.