AWS PubSub to a WebHook

In my last post I showed how to send a Simple Notification Service (SNS) message to an email endpoint. Now I show how to easily add a WebHook endpoint. WebHooks are a design pattern using an HTTP POST to send a notification to a URL which the “subscriber” has registered with the service. WebHooks are being used in an increasing number of web APIs and there is an interesting interview with Jeff Lindsay on this topic at IT Conversations.

A useful test platform for WebHooks is PostBin.org. Simply click on the “Make a PostBin” button and you will be presented with a new URL for your notification messages – something like “http://www.postbin.org/1hf0jlo“. This is the URL you register with SNS.

Turning to the SNS dashboard, add a new subscription to a topic that you’ve already configured in SNS. Specify protocol “HTTP” and enter the PostBin URL as the endpoint. SNS will post a confirmation message to this URL before you can send through messages.

Go back to your PostBin URL and you should see the confirmation message.

Buried in the message is the SubscribeURL which you need to hit in order to confirm the subscription. I pasted it into notepad and edited “cleaned up” the URL before pasting it into a browser. This confirms the subscription with SNS.

Now back in the SNS Dashboard you can send a new message. In my case, since I still have my email endpoint, the same message is sent to both the email and the WebHook endpoints…thus:

…and thus:

AWS PubSub

AWS is the latest to the join the cloud pub/sub hubbub with the (beta) launch of its Simple Notification Service. I took it for a spin this morning while waiting for my flight.

Once I signed up for the service in my AWS account, it was a simple matter of opening up the AWS Console and selecting the SNS tab.

You create a topic (I called mine “MyPersonalTweeter”).

Then create a subscription.

Subscribers can be HTTP/HTTPS WebHooks, or an email address or an SQS queue. I selected email.

AWS sends a confirmation to the nominated email address (damn, there go the SPAM opportunities).

Then just publish a message to the topic consisting of a subject and a message body (for some reason my first message never arrived, so I’m showing message number two sent later that day).

Digging a bit deeper, topics have policies which allow reasonably fine-grained control over who can publish and subscribe to your topic.

Beautiful Data Polling

O’Reilly have released a new book in their “Beautiful…” series called “Beautiful Data.”  There’s a very comprehensive review on Slashdot which I highly recommend. The description of chapter eight caught my eye:

Chapter Eight is about social data APIs and pushes gnip heavily as the de facto social endpoint aggregator for programmers. The chapter mentions WebHooks as an up and coming HTTP Post event transmission project but doesn’t offer much more than a wake up call for programmers. The traditional polling has dominated web APIs and has lead to fragile points of failure. This chapter is a much needed call for sanity in the insane world of HTTP transactional polling. Unfortunately, the community seems to be so in love with the simplicity of polling that they use it for everything, even when a slightly more complicated eventing model would save them a large percentage of transactions.

The link “fragile points of failure” is worth following as it leads to a robust slashdot discussion on Twitter APIs and polling versus push for the web.

I think for a long time, the “web” as we know it has suffered from the lack of the Event/Listener paradigm. This is a pretty simple design concept that I’m going to refer to as the Observer [wikipedia.org]. Let’s say I want to know what Stephen Hawking is tweeting about and I want to know 24/7. Now if you have to make more than one call, something is wrong. That one call should be a notification to Twitter who I am, where you can contact me and what I want to keep tabs on–be it a keyword or user. So all I should ever have to do is tell Twitter I want to know everything from Stephen Hawking and everything with #stephenhawking or whatever and from that point on, it will try to submit that message to me via any number of technologies. Simple pub/sub [wikipedia.org] message queues could be implemented here to alleviate my need to continually go to Twitter and say: “Has Stephen Hawking said anything new yet? *millisecond pause* Has Stephen Hawking said anything new yet? *millisecond pause* …” ad infinitum.

And yet…

That’s not easy to do on a large scale. A persistent connection has to be in place between publisher and subscriber. Twitter would have to have a huge number of low-traffic connections open. (Hopefully only one per subscriber, not one per publisher/subscriber combination.) Then, on the server side, they’d have to have a routing system to track who’s following what, invert that information, and blast out a message to all followers whenever there was an update. This is all quite feasible, but it’s quite different from the classic HTTP model.

It’s been done before, though. Remember Push technology [wikipedia.org]? That’s what this is. PointCast sent their final news/stock push message [cnet.com] in February 2000. There’s more support for “push” in HTML5, incidentally.

Ahhh yes, I remember PointCast well. One of the early darlings of the dot-com era. This reply points at some new hope:

For messaging architectures (like, say, the internet), the pattern is usually described as “Publish/Subscribe”. All serious messaging protocols support it (XMPP, AMQP, etc.) and some are dedicated to it (PubSubHubbub). The basic problem with using it the whole way to the client is that many clients are run in environments where it is impractical to run a server which makes recieving inbound connections difficult.

There are fairly good solutions to that, mostly involving using a proxy for the client somewhere that can run a server which holds messages, and then having the client call the proxy (rather than the message sources) to get all the pending messages together.

Keep watching.

Push versus Pull

From OSCON via O’Reilly Radar here’s a good case study of an architectural decision driven by the system requirements rather than the usual religious considerations that pollute the bloggosphere.

FriendFeed needed update info from Flikr but a REST-based “pull” approach is highly inefficient in this case. Instead the solution architects opted for a “push” approach using xmpp as the message transport. This is a really good presentation because it goes into the architectural choices and implications of “push” versus “pull”.

I characterize this as “pull vs push” rather than “REST vs xmpp” (or “REST vs *” or “why REST is crap”) because fundamentally it comes down to the best choice of how to synchronize changes between systems. You make this choice based on the usage characteristics of the different systems, the likely traffic volumes this will result in and the consequential resource impacts. Having made the choice between push or pull you then choose the appropriate message transport.

The web doesn’t do a lot of “push” and consequently there is not a lot of discussion about push and REST. Dare Obasanjo characterises it nicely:

Polling is a good idea for RSS/Atom for a few reasons

  • there are a thousands to hundreds of thousands clients that might be interested in a resource so the server keeping track of subscriptions is prohibitively expensive
  • a lot of these end points aren’t persistently connected (i.e. your desktop RSS reader isn’t always running)
  • RSS/Atom publishing is as simple as plopping a file in the right directory and letting IIS or Apache work its magic

The situation between FriendFeed and Flickr is almost the exact opposite. Instead of thousands of clients interested in document, we have one subscriber interested in thousands of documents. Both end points are always on or are at least expected to be. The cost of developing a publish-subscribe model is one that both sides can afford.

Inside the firewall, the situation is often more akin to that between FriendFeed and Flikr. This is why messaging is more common inside the firewall than outside – not because of any universal superiority between REST versus messaging, but because the system requirements are different and often favour a push approach rather than pull.

While your over at Dare’s excellent Blog, be sure to also check out his discussion of push versus pull in the context of scaling Twitter and MS Exchange.  These are important considerations for designers of federated systems such as federated databases or federated messaging systems. The example of FriendFeed to Flikr could be considered as the first incremental step toward a federation.