Web 2.0 Discovers Integration 2.0

My last entry was about Gnip, and I take the subject up again following Dare Obasanjo’s blog entry on the same topic, in which Dare grapples with the problem of how to share social activity streams across the web. The fundamental problem is not new to anyone who has dealt with integration inside the firewall: a plethora of different APIs and protocols leads to spaghetti architecture. What we seem to have now in the social web is JBORS – Just a Bunch of RESTful Services.

This is only a problem if people want these services to collaborate, but there certainly seems to be sufficient desire and opportunity to make this happen. I also think that the social web – chock full of personal and dynamic information, collaboratively maintained by an army of “ordinary people” – is a good testbed for techniques which may be applicable in other applications such as Electronic Health Records.

The Gnip approach to the challenge is to provide a type of ESB in the cloud. The basic functions of an ESB – transport and mediation – are applied to the various protocols which are exported or imported by popular social applications.
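To make "mediation" concrete, here is a minimal sketch of what mapping two different service payloads onto one common activity record might look like. The field names, the two source services and the Activity shape are all hypothetical, invented for illustration; none of this is Gnip's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Common record that every source format is mapped onto (hypothetical shape)."""
    actor: str
    verb: str
    target: str


def from_photo_service(payload: dict) -> Activity:
    # Hypothetical photo-site payload: {"owner": ..., "photo_url": ...}
    return Activity(actor=payload["owner"], verb="posted", target=payload["photo_url"])


def from_bookmark_service(payload: dict) -> Activity:
    # Hypothetical bookmarking payload: {"user": ..., "link": ...}
    return Activity(actor=payload["user"], verb="bookmarked", target=payload["link"])


# One mapping function per source format; this table is the "central logical component".
MEDIATORS = {
    "photos": from_photo_service,
    "bookmarks": from_bookmark_service,
}


def mediate(source: str, payload: dict) -> Activity:
    """Translate a source-specific payload into the common Activity record."""
    return MEDIATORS[source](payload)
```

Every new source format means one more entry in the mapping table, which is exactly why the mappings tend to pile up in the middle.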

Dare worries that this approach puts Gnip in the middle of the social network as a “single point of failure”. This is the classic problem with the ESB and the source of much derision about “ESB in a box”. You’re not solving the problem of protocol/API/schema proliferation, you’re simply pushing the mappings into a central logical component which – depending on its architecture – may be a single point of failure. So what are the alternatives?

One approach would be to standardize the interfaces so as to minimise or completely eliminate the need for mediation. One might refer to this as a service-oriented architecture, but I hesitate to use such an unfashionable term. But there’s the rub. The social web is a dynamic, anarchic frontier on the bleeding edge of information technology. What effect would standardization have on that eco-system? Would it slow down or block off avenues of innovation? Would new business models be choked off, trapped in a proprietary cul de sac? Maybe! These are the dangers of premature standardization; standardization is best applied to mature technologies and processes.

Even if you could standardize to the required level of completeness, how do you coordinate dozens of different companies to support the standards? It’s pretty much impossible. We’ve struggled with this inside the firewall for decades. For example, how do you get Siebel and Metasolv and Retek and Primavera all standardizing around the same set of services? Well, in this case, they all get acquired by Oracle and it becomes Oracle’s problem.

The first step in the standardization of the social web has been taken by Google with OpenSocial. Even if standardization of the social web does happen, I think it will be a long time before the requirement for mediation disappears. In the meantime we have Gnip and FriendFeed and… watch this space.

Thylacines and Wolves

If you look closely – right now – as we speak – a new ecological niche is opening up in the web.

It is just over eight years since we first saw the read-write web and a little more than three years since we heard about web 2.0. In that time, everyone has been gradually building up more and more valuable assets in the web: emails, photos, blogs, collaborations, videos, social networks. For some of us, a large chunk of our lives now has an independent existence in the web.

But something about this has started to become problematic. Our social assets are splattered across dozens of different sites and platforms. Multiple social networking sites vie for our attention. The result is increasing fragmentation of information and its associated problems – duplication and inconsistency. The world-wide-web has rediscovered that old enterprise bogey-man – integration! (or the lack thereof).

Some recent examples include Jon Udell asking where SOA is when you really need it, and Loic Le Meur lamenting the fragmentation of his social map.

At the same time a raft of new applications is attempting to address these issues:

  • Google OpenSocial aims for social network interoperability.
  • OpenID has now achieved broad support (if not success) as a way of managing distributed identity and authentication.
  • Ping.fm unifies message posting while FriendFeed aggregates the receiving side.

The key thing about these initiatives is that they all start at the edge of the integration problem. They attempt to support interoperability by unifying the interfaces to these web 2.0 platforms.

The new player that caught my attention recently represents the genesis of “web middleware” in the form of Gnip, which bridges the “air gap” between the Producers and Consumers of the social web. And in a beautiful example of parallel evolution, Gnip makes use of web-native protocols such as XMPP and Atom to provide the functions which are familiar inside the enterprise as JMS and SOAP. Gnip provides connectivity, message delivery and mediation between different data formats. Pinch me if that doesn’t sound just a little like an ESB. But it lives in and has evolved entirely from the web! This is the IT equivalent of discovering the Thylacine in the new world as an evolutionary parallel to the Wolf in the old world.
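For readers who have not lived inside an ESB, the "connectivity and message delivery" part boils down to something like the following toy, in-memory hub. It is only a sketch of the general publish-subscribe idea: the Hub class and its method names are my own invention, it is not Gnip's API, and a real service would do this over network protocols such as XMPP or HTTP rather than Python callbacks.

```python
from collections import defaultdict
from typing import Callable


class Hub:
    """Toy middleman: producers publish events, consumers register callbacks."""

    def __init__(self) -> None:
        # Maps a publisher name to the list of consumer callbacks interested in it.
        self._subscribers = defaultdict(list)

    def subscribe(self, publisher: str, callback: Callable[[dict], None]) -> None:
        """Register a consumer callback for one publisher's events."""
        self._subscribers[publisher].append(callback)

    def publish(self, publisher: str, event: dict) -> int:
        """Deliver an event to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(publisher, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

A consumer would call something like hub.subscribe("photos", handler) once, and every later hub.publish("photos", event) reaches it without the producer knowing who is listening.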

The funny thing is that while some middleware vendors are trying to figure out how to colonise the cloud (e.g. here and here), the natives are already evolving into that niche.

Push versus Pull

From OSCON via O’Reilly Radar, here’s a good case study of an architectural decision driven by the system requirements rather than the usual religious considerations that pollute the blogosphere.

FriendFeed needed update info from Flickr, but a REST-based “pull” approach is highly inefficient in this case. Instead the solution architects opted for a “push” approach using XMPP as the message transport. This is a really good presentation because it goes into the architectural choices and implications of “push” versus “pull”.

I characterize this as “pull vs push” rather than “REST vs XMPP” (or “REST vs *” or “why REST is crap”) because fundamentally it comes down to the best choice of how to synchronize changes between systems. You make this choice based on the usage characteristics of the different systems, the likely traffic volumes this will result in and the consequential resource impacts. Having made the choice between push and pull, you then choose the appropriate message transport.
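The traffic-volume argument is easy to put into numbers. The sketch below is back-of-envelope arithmetic only: the function names are mine, and the figures in the usage note are made up for illustration, not taken from the FriendFeed presentation.

```python
def pull_requests(resources: int, poll_interval_s: int, period_s: int) -> int:
    """Requests made when every resource is polled once per interval over a period."""
    return resources * (period_s // poll_interval_s)


def push_messages(updates: int) -> int:
    """Messages sent when the producer notifies the subscriber once per change."""
    return updates


def min_wasted_polls(resources: int, poll_interval_s: int, period_s: int, updates: int) -> int:
    """Lower bound on polls that found nothing new.

    At most one poll per update can bring back fresh data, so at least
    (polls - updates) of them came back empty.
    """
    polls = pull_requests(resources, poll_interval_s, period_s)
    return max(polls - updates, 0)
```

With, say, 10,000 watched resources polled every 60 seconds, pull_requests(10_000, 60, 3600) comes to 600,000 requests an hour. If only 500 of those resources actually changed in that hour, push needs 500 messages and at least 599,500 of the polls were wasted.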

The web doesn’t do a lot of “push” and consequently there is not a lot of discussion about push and REST. Dare Obasanjo characterises it nicely:

Polling is a good idea for RSS/Atom for a few reasons

  • there are a thousands to hundreds of thousands clients that might be interested in a resource so the server keeping track of subscriptions is prohibitively expensive
  • a lot of these end points aren’t persistently connected (i.e. your desktop RSS reader isn’t always running)
  • RSS/Atom publishing is as simple as plopping a file in the right directory and letting IIS or Apache work its magic

The situation between FriendFeed and Flickr is almost the exact opposite. Instead of thousands of clients interested in document, we have one subscriber interested in thousands of documents. Both end points are always on or are at least expected to be. The cost of developing a publish-subscribe model is one that both sides can afford.

Inside the firewall, the situation is often more akin to that between FriendFeed and Flickr. This is why messaging is more common inside the firewall than outside – not because either REST or messaging is universally superior, but because the system requirements are different and often favour a push approach rather than pull.

While you’re over at Dare’s excellent blog, be sure to also check out his discussion of push versus pull in the context of scaling Twitter and MS Exchange. These are important considerations for designers of federated systems such as federated databases or federated messaging systems. The example of FriendFeed to Flickr could be considered the first incremental step toward a federation.