Saturday, December 31, 2005
Being a defensive Web 2.0 data consumer
From a data consumer's standpoint, properly handling the inevitable evil of outages and downtime is critical in a Web 2.0 world. In an evolving environment where access to information is opening up and entire sites are often built exclusively on the data of others, programmers have to plan for situations in which those assets are not available.
On a marketing level we can scheme, dream and strategize all we want about the commercial potential of (re)distributing, using and sharing data in new and creative ways, but the tenets we have to solidify from a standpoint of responsible engineering are:
- Effective scalability - managing the additional ways our stuff is accessed, like RSS, web services, remoting, and mash-ups/remixing; in addition to more crude traditional methods like screen scraping
- Contingency planning - making sure internal misfortune doesn't damage the quality of life of those using our stuff beyond simple frustration
Unfortunately, as developers we normally can't directly control how a client consumes our data; we can only hope it is done in a way that proactively accounts for possible losses of service. As an example of how to code defensively when bringing in remote data, consider the following C# block, used in a .NET 1.x web client obtaining data from a remote web service:
if(Cache["RemoteData"] == null)
{
    // if the Cache object doesn't contain data, call the web service
    GetData();
}
// apply the data from the Cache to some page-level DIV element
someDIV.Text = (string)Cache["RemoteData"];

private void GetData()
{
    string theData = string.Empty;
    try
    {
        // 1. do a data fetch operation by calling a web service (could also be I/O on a remote file, reading in an RSS feed, etc.)
        SomeWebService ws = new SomeWebService();
        theData = ws.FetchData();
        // 2. place the data in a caching layer relative to the executing application and expire it in a reasonable amount of time
        // (five-argument overload: the priority parameter of the longer overload is an enum and cannot be null)
        Cache.Insert("RemoteData", theData, null, DateTime.Now.AddHours(3), Cache.NoSlidingExpiration);
    }
    catch(Exception)
    {
        // in the event of a server error, cache a temporary message that expires quicker, assuming the data source will come back up soon
        theData = "The remote source is unavailable. Please try again later!";
        Cache.Insert("RemoteData", theData, null, DateTime.Now.AddMinutes(20), Cache.NoSlidingExpiration);
    }
}
But examples such as this are utopian from a consumer standpoint, and I'd dare guess they represent a small minority. Most people are going to hack away and get at your data without accounting for a possible stoppage of service, and then be very unhappy when outages prevent them from getting at it or negatively impact their site.
So how do we encourage those who consume our data to program defensively so that performance degradations, including total absence of service, won't bring their entire site down? We establish patterns. We publish formal documentation containing examples from as many languages and platforms - vendor and open source - as we can, to appease the masses and at least recommend a uniform mechanism of access. We release downloadable libraries that perform operations in a best-practices way (Ruby on Rails does this nicely with the Prototype JavaScript library for AJAX functionality). We conduct smart programming overall and evangelize the same for those who grab our stuff. The result is a base level of control.
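As a rough illustration of the kind of helper such a downloadable library might ship, here is a minimal sketch in plain C# with no dependency on System.Web. The class name DefensiveCache, its Get method and its parameters are all hypothetical, invented for this example and not taken from any real library. Unlike a block that caches an error message on failure, this variation keeps serving the last good copy of the data when a refresh fails, and only falls back to a message when it has never had good data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a real library API): caches a remote value and
// keeps serving the last good copy when a refresh attempt fails.
public sealed class DefensiveCache
{
    private sealed class Entry
    {
        public string Value;
        public DateTime ExpiresUtc;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly object gate = new object();
    private readonly TimeSpan freshFor;    // how long a successful fetch is trusted
    private readonly TimeSpan retryAfter;  // how long to wait after a failed fetch
    private readonly Func<DateTime> clock; // injectable so the class can be tested

    public DefensiveCache(TimeSpan freshFor, TimeSpan retryAfter, Func<DateTime> clock = null)
    {
        this.freshFor = freshFor;
        this.retryAfter = retryAfter;
        this.clock = clock ?? (() => DateTime.UtcNow);
    }

    // Returns the cached value while it is fresh; otherwise calls fetch.
    // If fetch throws, returns the previous value (or fallback if there is
    // none) and does not call the remote source again until retryAfter passes.
    public string Get(string key, Func<string> fetch, string fallback)
    {
        // Simplification: the lock is held during the fetch, so concurrent
        // callers wait instead of all hitting the remote source at once.
        lock (gate)
        {
            DateTime now = clock();
            Entry entry;
            bool found = entries.TryGetValue(key, out entry);
            if (found && now < entry.ExpiresUtc)
            {
                return entry.Value;
            }

            try
            {
                string fresh = fetch();
                entries[key] = new Entry { Value = fresh, ExpiresUtc = now + freshFor };
                return fresh;
            }
            catch (Exception)
            {
                string stale = found ? entry.Value : fallback;
                entries[key] = new Entry { Value = stale, ExpiresUtc = now + retryAfter };
                return stale;
            }
        }
    }
}
```

A consumer would create one shared instance, say with a three-hour freshness window and a twenty-minute retry window, and pass Get a delegate that wraps the actual remote call. Whether stale data is preferable to an "unavailable" message depends on the data; for something like a news feed it usually is, for something like stock prices it may not be.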
And we not only educate; we also learn from those who are smarter than us, adopting their practices and their innovative ways to not only work with, but work around, our data, to ensure quality control and quality of service.