NATS Weekly #17

Week of March 7 - 13, 2022

🗞 Announcements, writings, and projects

A short list of announcements, blog posts, project updates, and other news.

🎤 Events

⚡ Releases

Official releases from NATS repos and others in the ecosystem.

📖 Articles

💬 Discussions

GitHub Discussions from various NATS repositories.

💡 Recently asked questions

Questions sourced from Slack, Twitter, or individuals. Responses and examples are in my own words, unless otherwise noted.

What is the behavior of a pull consumer subscription Fetch?

In issue #15, I answered the question "What is the purpose of a pull consumer timeout?" and described the (tricky) balance of choosing a proper batch size and timeout. This behavior of "waiting for the full batch or timing out" changed in recent months (possibly as far back as January). Most client libraries are still in the process of being updated to the new behavior, but the Go client already supports it. (I defer to the NATS core team for details on when this change occurred and when client libraries will be updated 😉.)

With that bit of history/context out of the way, what is the new behavior? The primary difference is that rather than waiting for the full batch of messages before the call to Fetch returns, it now returns as soon as at least one message is available. The timeout only applies when there are zero messages available.

Here is a code snippet with inline comments to demonstrate the behavior. Assume there is a stream with subject foo, no messages in that stream yet, and a pull consumer on that stream.

// No messages in the stream, so this will wait for a second
// and then return with a timeout error.
msgs, err := sub.Fetch(5, nats.MaxWait(time.Second))

// Publish one message.
js.Publish("foo", []byte("..."))

// Try again... this will now return right away with a single
// message in the msgs slice.
msgs, err = sub.Fetch(5, nats.MaxWait(time.Second))
// len(msgs) == 1

// If another call to fetch is made here, it would block again
// since there are no messages.

// Publish 10 messages.
for i := 0; i < 10; i++ {
    js.Publish("foo", []byte("..."))
}

// Another fetch call will return as soon as the messages are
// received with five in the batch.
msgs, err = sub.Fetch(5, nats.MaxWait(time.Second))
// len(msgs) == 5

// And another, since five messages still remain in the stream.
msgs, err = sub.Fetch(5, nats.MaxWait(time.Second))

Given that the full batch is no longer required before returning, trying to balance the batch size and timeout is no longer necessary. The batch size should be considered "the maximum number that can be handled at one time" (since there may be fewer per call) and the timeout is now "how long to wait to get any messages before doing something else."

What are valid characters for subjects, streams, server names etc.?

This is a pretty common question and there is a great architecture decision record that defines the specification. I don't have much more to add since it's a spec, but if you run into errors indicating some kind of name is invalid, ensure the names are compliant.

Is NATS synchronous or asynchronous?

I intentionally phrased this question ambiguously, but it is a common enough question (also often with ambiguous intention) that it is worth exploring in more detail.

There are two general themes when talking about [a]synchrony: interaction and behavior. Interaction is fairly easy to observe. If a function or method call blocks while waiting for the expected response, it is a synchronous interaction. Core NATS request-reply and the (default) JetStream publish API, which uses request-reply internally, both block and wait for a reply. In the case of core NATS, this is a reply from a subscriber on the subject; with JetStream, it is an ack that the message was written to a stream.

There is also an async JetStream publish (PublishAsync in the Go client) which returns immediately with a PubAckFuture that will eventually be populated with the acknowledgment or an error. This may be the preferred API when publishing in a tight loop where you only want to wait at the end and check that all outstanding writes completed via PublishAsyncComplete().
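The "publish in a tight loop, wait once at the end" pattern can be sketched without a server. Everything below is a toy in-memory model under my own assumptions: memStream, pubAckFuture, and publishAll are hypothetical names loosely shaped after the Go client's PublishAsync, PubAckFuture, and completion channel, and the "reject empty payloads" rule exists only to produce a failure for the demo.

```go
package main

import (
	"errors"
	"sync"
)

// pubAckFuture is a hypothetical stand-in for nats.PubAckFuture.
type pubAckFuture struct {
	ok  chan uint64 // receives the stream sequence on success
	err chan error  // receives the failure otherwise
}

func (f *pubAckFuture) Ok() <-chan uint64 { return f.ok }
func (f *pubAckFuture) Err() <-chan error { return f.err }

// memStream is a toy in-memory "stream" that acks writes in the background.
type memStream struct {
	mu      sync.Mutex
	msgs    [][]byte
	pending sync.WaitGroup
}

// PublishAsync returns immediately; the ack lands on the future later.
func (s *memStream) PublishAsync(data []byte) *pubAckFuture {
	f := &pubAckFuture{ok: make(chan uint64, 1), err: make(chan error, 1)}
	s.pending.Add(1)
	go func() {
		defer s.pending.Done()
		if len(data) == 0 {
			// Arbitrary failure rule for the demo only.
			f.err <- errors.New("empty payload rejected")
			return
		}
		s.mu.Lock()
		s.msgs = append(s.msgs, data)
		seq := uint64(len(s.msgs))
		s.mu.Unlock()
		f.ok <- seq
	}()
	return f
}

// PublishAsyncComplete returns a channel that is closed once no
// publishes are pending.
func (s *memStream) PublishAsyncComplete() <-chan struct{} {
	done := make(chan struct{})
	go func() {
		s.pending.Wait()
		close(done)
	}()
	return done
}

// publishAll fires off every payload, waits once at the end, and then
// inspects each future to count acks and failures.
func publishAll(s *memStream, payloads [][]byte) (acked, failed int) {
	futures := make([]*pubAckFuture, 0, len(payloads))
	for _, p := range payloads {
		futures = append(futures, s.PublishAsync(p))
	}
	<-s.PublishAsyncComplete()
	for _, f := range futures {
		select {
		case <-f.Ok():
			acked++
		case <-f.Err():
			failed++
		}
	}
	return acked, failed
}
```

Note that publishAll still has to look at every future after the wait. Completion only says that nothing is outstanding, not that every write succeeded.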

Synchronous code is often easier to understand. However, this asynchronous model of a future (there are other programming models) is fairly straightforward, since you can choose to get the ack/error info at a later point in the program. The asynchrony lets other work make progress while the network call to NATS is in flight.

That said, if a successful acknowledgement of the publish is required and other operations depend on this success, then order does in fact matter, unless you have a local compensating action for when a publish error occurs. Choosing the right interaction model depends heavily on local dependencies in your code and your comfort with handling errors, i.e. non-happy-path coding.

The second kind of asynchrony is the behavior of the service the client is interacting with. With NATS, a service is ultimately a subscriber that receives a message and does something with it. If a client is expecting a reply, the service could either do the work (synchronously) and then reply, or queue the work and reply only that it received the message (implying the work will be handled later). Obviously the semantics here are different, since in the latter case the result/output/effect has not happened yet.

These two synchrony models are distinguished because the client could still make a synchronous request to a service that handles the message asynchronously. The client would still block on the network interaction, but not for the time it takes to do the work. For those familiar with HTTP, this is similar to the 202 Accepted status code. In both cases, any kind of follow-up interaction by the client needs to be defined. For NATS, this could be a unique subject the client subscribes to, on which the service will publish a message when the work is done. Alternatively, it may be assumed the effect will be observed in some future query interaction.
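A synchronous request against an asynchronous service can be sketched like this. It is an in-memory model rather than NATS code: asyncService, accepted, Request, and ProcessNext are hypothetical names, the Result channel stands in for the unique result subject, and upper-casing a string stands in for the real work.

```go
package main

import "strings"

// accepted is what the client gets back right away, similar in spirit to
// an HTTP 202: the request was queued, the work has not happened yet.
type accepted struct {
	JobID  string
	Result <-chan string // stand-in for a unique result subject
}

type job struct {
	id      string
	payload string
	result  chan string
}

// asyncService is a hypothetical in-memory service that queues requests.
type asyncService struct {
	queue chan job
}

func newAsyncService(capacity int) *asyncService {
	return &asyncService{queue: make(chan job, capacity)}
}

// Request blocks only until the job is queued, not until it is done.
func (s *asyncService) Request(id, payload string) accepted {
	j := job{id: id, payload: payload, result: make(chan string, 1)}
	s.queue <- j
	return accepted{JobID: id, Result: j.result}
}

// ProcessNext does the deferred work for one queued job and publishes
// the outcome. It reports false when the queue is empty.
func (s *asyncService) ProcessNext() bool {
	select {
	case j := <-s.queue:
		j.result <- strings.ToUpper(j.payload) // the "work"
		return true
	default:
		return false
	}
}
```

A client calling Request gets the accepted value back immediately, while a receive on Result only succeeds after the service has run ProcessNext for that job. In a real service, ProcessNext would typically run in a background loop instead of being called by hand.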

The combination of an async interaction by the client and async behavior by the service would result in a future being returned that, when resolved, indicates the message was accepted and the work will be done later.

Knowing which interaction type to use, and whether your service should be implemented with async behavior/handling, depends on a variety of factors. Generally, however, it is easier to start with a synchronous model to get things working and refactor to an asynchronous model where necessary to satisfy latency or scale requirements.