Beautiful Data Polling

O’Reilly have released a new book in their “Beautiful…” series called “Beautiful Data.”  There’s a very comprehensive review on Slashdot which I highly recommend. The description of chapter eight caught my eye:

Chapter Eight is about social data APIs and pushes gnip heavily as the de facto social endpoint aggregator for programmers. The chapter mentions WebHooks as an up and coming HTTP Post event transmission project but doesn’t offer much more than a wake up call for programmers. The traditional polling has dominated web APIs and has led to fragile points of failure. This chapter is a much needed call for sanity in the insane world of HTTP transactional polling. Unfortunately, the community seems to be so in love with the simplicity of polling that they use it for everything, even when a slightly more complicated eventing model would save them a large percentage of transactions.

The link “fragile points of failure” is worth following as it leads to a robust Slashdot discussion on Twitter APIs and polling versus push for the web.

I think for a long time, the “web” as we know it has suffered from the lack of the Event/Listener paradigm. This is a pretty simple design concept that I’m going to refer to as the Observer [wikipedia.org]. Let’s say I want to know what Stephen Hawking is tweeting about and I want to know 24/7. Now if you have to make more than one call, something is wrong. That one call should be a notification to Twitter who I am, where you can contact me and what I want to keep tabs on–be it a keyword or user. So all I should ever have to do is tell Twitter I want to know everything from Stephen Hawking and everything with #stephenhawking or whatever and from that point on, it will try to submit that message to me via any number of technologies. Simple pub/sub [wikipedia.org] message queues could be implemented here to alleviate my need to continually go to Twitter and say: “Has Stephen Hawking said anything new yet? *millisecond pause* Has Stephen Hawking said anything new yet? *millisecond pause* …” ad infinitum.

And yet…

That’s not easy to do on a large scale. A persistent connection has to be in place between publisher and subscriber. Twitter would have to have a huge number of low-traffic connections open. (Hopefully only one per subscriber, not one per publisher/subscriber combination.) Then, on the server side, they’d have to have a routing system to track who’s following what, invert that information, and blast out a message to all followers whenever there was an update. This is all quite feasible, but it’s quite different from the classic HTTP model.
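The routing step described above (track who follows what, invert it, then blast out updates) can be sketched in a few lines. This is only an illustration under stated assumptions: the subscriber names, the topic strings and the `deliver` callback are all made up, and nothing here reflects how Twitter actually does it.

```python
from collections import defaultdict

# Hypothetical subscription data: each subscriber lists what they follow.
follows = {
    "alice": {"@stephenhawking", "#physics"},
    "bob": {"@stephenhawking"},
    "carol": {"#physics"},
}


def invert(follows):
    """Turn subscriber -> topics into topic -> subscribers."""
    routes = defaultdict(set)
    for subscriber, topics in follows.items():
        for topic in topics:
            routes[topic].add(subscriber)
    return routes


def fan_out(routes, topic, message, deliver):
    """Push one update to every follower of a topic; return who was notified."""
    recipients = sorted(routes.get(topic, ()))
    for subscriber in recipients:
        deliver(subscriber, message)
    return recipients


routes = invert(follows)
inbox = []  # stands in for the open connections back to subscribers
notified = fan_out(
    routes, "@stephenhawking", "new tweet",
    lambda who, msg: inbox.append((who, msg)),
)
```

The inverted table is what makes a single update cheap to route: the publisher looks up one key instead of scanning every subscriber's list.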

It’s been done before, though. Remember Push technology [wikipedia.org]? That’s what this is. PointCast sent their final news/stock push message [cnet.com] in February 2000. There’s more support for “push” in HTML5, incidentally.

Ahhh yes, I remember PointCast well. One of the early darlings of the dot-com era. This reply points at some new hope:

For messaging architectures (like, say, the internet), the pattern is usually described as “Publish/Subscribe”. All serious messaging protocols support it (XMPP, AMQP, etc.) and some are dedicated to it (PubSubHubbub). The basic problem with using it the whole way to the client is that many clients are run in environments where it is impractical to run a server, which makes receiving inbound connections difficult.

There are fairly good solutions to that, mostly involving using a proxy for the client somewhere that can run a server which holds messages, and then having the client call the proxy (rather than the message sources) to get all the pending messages together.
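A rough sketch of that proxy idea, assuming an in-memory mailbox: sources push to the proxy, and the client makes one outbound call to collect everything pending. The class name `MailboxProxy` and its methods are hypothetical, invented here for illustration rather than taken from any real product.

```python
from collections import defaultdict, deque


class MailboxProxy:
    """Hypothetical proxy: accepts pushes for clients and holds them until collected."""

    def __init__(self):
        self._pending = defaultdict(deque)

    def push(self, client_id, message):
        # Called by message sources; the inbound connection lands on the proxy,
        # not on the client.
        self._pending[client_id].append(message)

    def collect(self, client_id):
        # Called by the client over an outbound connection.
        # Returns all pending messages in arrival order and empties the mailbox.
        return list(self._pending.pop(client_id, deque()))


proxy = MailboxProxy()
proxy.push("laptop", "twitter: Hawking tweeted")
proxy.push("laptop", "blog: new post")
first = proxy.collect("laptop")   # both messages, in order
second = proxy.collect("laptop")  # nothing new since the last call
```

The client still polls, but it polls one place for everything instead of polling every source separately.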

Keep watching.


Short Film Finalist

My daughter Ruby and her friends are finalists in Shoot It ’10. The short film is called “The Human Disposition”. Check it out and please vote!

Update: Ruby and Georgina won the “People’s Choice” category! Thanks to everyone who voted.


Privacy, Censorship and the new Oligarchs

I’m crotchety enough to have used the world wide web in the early nineties and before that, Usenet in the eighties. Back then the internet was a wild west where anarchy was viewed as a benefit. Openness and “freedom” – however you defined it – were ruthlessly defended, often to lunatic proportions. Back then the press blamed everything bad on the internet and privacy was a big issue. We seem to have come a long way from that world, but still the press blames everything bad on the internet and privacy is still a big issue.

Even bigger in the last few weeks with the furore over Facebook’s antics and Google’s drive-by privacy violations. A couple of good commentaries on this have been the Background Briefing report on the “Privacy Paradox” and Nicholas Carr’s observations on Facebook’s identity lock-in.

It is not surprising that the anarchy of the early web led to the building of “walled gardens” as a form of protection (this was AOL’s first web business model). It’s perhaps also not surprising that some of those walled gardens have become fortresses where the new Oligarchs exploit their netizens as “bonded labour”. Meet the new Oligarchs:

  • Steve Jobs rules fortress AppStore. He is chief censor and code reviewer and wants to protect our iP* user experience. All for our own good.
  • Sergey and Larry rule fortress Google. They have the largest correlation engine on the planet. They “[W]on’t be Evil” but we must trust that they know where lies the boundary. Google only wants to improve our search experience. All for our own good.
  • Mark Zuckerberg rules fortress Facebook. He generally only opens his mouth to change feet, but lately says he wants to unburden us of all this privacy nonsense. Privacy is just so 20th century. All for our own good.
  • Stephen Conroy wants to rule fortress Australia. Protecting us all from internet nasties by throwing a big censorship net around the country – just like China and Pakistan. All for our own good.

The problem with these new Oligarchs is that they purport to have our best interests at heart, but there is no openness or recourse, no rationale as to how they will separate our interests from their own. Google is too protective of their information assets and conveniently forgets to tell us about much of what they gather. Facebook views our private data too much as their own property to be onsold to others without our knowledge. AppStore acceptance or rejection appears arbitrary and fickle. As for the Net Filter, Conroy claims that a democratically elected government is more trustworthy than Facebook & Google – but there is nothing so undemocratic as a secret blacklist.

The new Oligarchs have built their fortresses on the architecture of the internet, capitalising on Metcalfe’s law to build unbelievably valuable networks. But Metcalfe’s law also applies to our personal information. The value of any one piece of data about us is proportional to the square of the number of other pieces of information they can correlate it with.

Is it all bad? Not really. The lesson of the web is that networks can provide powerful advantages. The Google search engine is testament to the power of massive collaborative filtering. Social networks such as Facebook have opened up wonderful social landscapes. The iP* AppStore has revolutionised the way we go mobile. To some extent this is also what we bought into when we smothered all internet business models that involved payment. Web users want everything free but data centres, unlike clouds, don’t build themselves. The only currency we have allowed on the Web is that which can be obtained covertly. The real danger arises when power becomes so concentrated and subject to the whims of a few individuals. This is the lesson in the architecture of the underlying packet-switched internet.

From the old anarchic internet to the new oligarchic internet – everything and nothing has changed. Perhaps we should feel a lot less safe now when such people have our own interests so much at heart.


Service Providers and One-Way MEPs

Service oriented architecture centres heavily on the concepts of service providers and consumers. It’s easy with request/reply web services to fall into the lazy habit of thinking of the provider as being the “server” side of the request/reply interaction. The consumer requests information from the provider, which the provider – naturally – provides! But this is wrong.

What happens in an N-tier architecture where there may be many “servers” in the stack? What happens with JMS-based services using a one-way message exchange pattern (MEP)? If one application is using SOAP/JMS to send a message to another application, which is the consumer and which is the provider?

On the face of it, you might say the “sender” is the “provider” and the “receiver” is the “consumer”, but that ignores the fact that there are two types of one-way MEP – “one-way out” and “one-way in.” (Actually there are many types of MEP and they differ slightly depending on the version of WSDL you use; see the WSDL standard for more confusing details.)

We really need to look beyond the technology to find the answer and the Web Services Glossary gives a clue. It splits the model into an “agent” (software or process) that operates on behalf of an “entity” (person or organization). Specifically, a Provider Agent and a Requester Agent operate on behalf of a Provider Entity and a Requester Entity respectively.

So the “provider” of a Web service is basically the person or organization responsible for that service. It is the person or organization that you contact to get permission to use the service, or obtain the WSDL, or give your credit card details for charging.

An example will help to clarify the relationship between provider and consumer in one-way MEPs. Suppose a service provides alert notifications. Multiple consumers subscribe to this service to receive alerts on subjects that are important to them. At the messaging level, the provider puts a message onto a JMS Topic and multiple consumers receive the message. This is a “one-way out” MEP.

Another service might be a central audit service where multiple agents send messages via a JMS Queue representing steps in a distributed process. This is commonly used for “track and trace” in distributed workflows. In this case, the message senders are not responsible for the audit system, they are “users” or “consumers” of the service. This is a “one-way in” MEP.
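The two examples above can be sketched side by side. This is a minimal illustration, assuming in-memory stand-ins for a JMS Topic and a JMS Queue; the `Topic` and `Queue` classes below are invented for this sketch and are not the JMS API. What it tries to show is that the provider is whoever owns the service, regardless of which way the message travels.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a JMS Topic: every subscriber gets each message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


class Queue:
    """In-memory stand-in for a JMS Queue: many senders, one receiving owner."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def drain(self):
        items = list(self._messages)
        self._messages.clear()
        return items


# One-way out: the provider owns the alert service and is the SENDER.
alerts = Topic()
consumer_a, consumer_b = [], []
alerts.subscribe(consumer_a.append)
alerts.subscribe(consumer_b.append)
alerts.publish("storm warning")

# One-way in: the provider owns the audit service and is the RECEIVER.
audit = Queue()
audit.send({"agent": "billing", "step": "invoice created"})    # a consumer
audit.send({"agent": "shipping", "step": "parcel dispatched"})  # another consumer
audit_log = audit.drain()  # the provider reads what the consumers sent
```

In both halves the consumers are the parties who need permission to use the service; only the direction of the message differs.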

In summary, service providers and consumers can be confusing in an N-tier architecture or with one-way MEPs. The fundamental consideration is more “business” than “technical”. Who is the organization or person responsible for the service? Then the way consumers interact with them determines the MEP that is being used.


A Crap Customer Experience

Customer self service is usually just secret code for pushing cost and effort onto the customer, but sometimes it can be “win-win” where the provider saves money and the customer avoids dealing with the dreaded call centre. But I recently had an experience where poor integration led to a “lose-lose” situation.

Qantas Airways offers codeshare flights with its budget subsidiary Jetstar. The booking is with Qantas, the flight number is a Qantas flight number, but the plane is operated by Jetstar. Sometimes these can’t be avoided and I recently had to take this option.

Overall I like online web check-in. It seems to save me time and gives some autonomy over seat selection. When I try this with my codeshare flight, I naturally go to the Qantas website to check in. The “manage my booking” page offers me the usual facilities, except the “check-in now” button is very hard to find. There is no indication that I’m in the wrong place; perhaps I’m just not looking hard enough, or I’m just dumb.

A call to the Qantas helpdesk confirmed that indeed I need to check-in on the Jetstar website. Thanks for telling me. But – get this – I have to use a special booking reference number that the operator gives me over the phone. The reference number supplied on my Qantas ticket won’t work.

Ok, so over to the Jetstar website and nothing works! I can’t even get in, but apart from blinking at me when I hit “enter” there is no indication of the problem. I call the Jetstar helpdesk (yes, I’m stubborn) and am told to enter my surname in uppercase. Apparently when Qantas makes the Jetstar booking my surname is entered in uppercase and the surname field is case sensitive (why?).

The integration fiasco is now clearer. It seems that when I make a Qantas booking on a codeshare flight, Qantas makes a “proxy” booking in their system and a real booking in the Jetstar system. But all the details given to me refer to the fake booking, not to the real booking. I’m not even aware there are two bookings until I persist with online check-in (which most wouldn’t).

When faced with an integration problem most people take one of two approaches: either make the user experience seamless or allow the stitches to show, but give users the tools to navigate the business process. Qantas/Jetstar just ignores the whole problem and leaves its customers dangling. The result is wasted time and frustration for anyone wanting to check in early, plus additional cost to Qantas/Jetstar in helpdesk calls.

The solution to this is mind-bogglingly trivial. The “manage my booking” web page could offer me the usual “check-in now” button hyperlinked to the Jetstar site and containing the Jetstar booking reference as a parameter. Hell, they could even uppercase my surname on the way!
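To show just how trivial, here is a sketch of building such a link. The endpoint URL and the parameter names `ref` and `surname` are pure assumptions for illustration; I have no idea what the real Jetstar check-in URL expects.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; the real check-in URL is not known here.
CHECKIN_URL = "https://checkin.example.com/start"


def checkin_link(partner_reference, surname):
    """Build a check-in link carrying the partner's booking reference.

    The surname is uppercased on the way, since the partner site
    reportedly compares it case-sensitively.
    """
    params = {"ref": partner_reference, "surname": surname.upper()}
    return f"{CHECKIN_URL}?{urlencode(params)}"


link = checkin_link("ABC123", "Smith")
```

The customer never needs to see the second booking reference or know about the uppercase rule; the link carries both.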
