Friday, February 26, 2010

Well this is unfortunate...

From Nick Johnson's excellent blog: "Fortunately, the PubSubHubbub protocol provides for the possibility of hubs doing polling on feeds that do not support PubSubHubbub themselves. The public hub on http://pubsubhubbub.appspot.com doesn't currently have this enabled, but there are plenty of alternatives."

Bummer.

Posted via email from jasonsalas's posterous


Monday, February 15, 2010

7 Questions for Superfeedr

Superfeedr is the brainchild of French developer Julien Genestoux, a leading authority on the emerging crop of applications and systems based on delivering data with extremely low latency.  The project is the evolution of Notifixious (which I've previously profiled), which sought to help content creators by making the distribution of RSS- and Atom-based feeds currently in use more expedient.

Julien benefited from some very generous and impressive external investment, allowing him to expand his platform to be a service using a variety of real-time technologies including XMPP, PubSubHubbub and SUP to harvest distant-end feeds.

Here are 7 Questions for Julien about Superfeedr:

1. Superfeedr was the driving force (your custom feed-fetching/parsing technology) behind Notifixious, and it's now become the brand of your service.  The service's major notable form of growth is notifications via PubSubHubbub, alongside pure XMPP.  What's the reaction been to expanding your supported real-time protocols?

We love XMPP and we think it's the most powerful technology for the real-time web. Yet, with a little pragmatism, it's obvious that this protocol scares a lot of people. HTTP is something quite well-known by many people: how to scale it, cache, etc.; it's pretty much a required computer science skill. Adopting PubSubHubbub brought us a lot of attention from people who would have never otherwise looked in our direction.

On top of that, PubSubHubbub is a very well-thought-out protocol and covers the publish/subscribe mechanism in a much clearer and more precise way (albeit not as rich in terms of features as XMPP PubSub): they both belong to the real-time web world.

 

2. You're perhaps only the second major instance of a deployed PubSubHubbub hub, aside from the initial instance the Google engineers who developed it put out there.  Have you seen a rise in interest in using such a technology stack for enabling web-based pubsub systems?

Yes, definitely. The pubsub pattern has been known for years. Yet as I said previously, this implementation of it is very elegant and the massive use of WebHooks makes it very attractive. The emergence of interest in the real-time web at the same time was also a major traction factor.

 

3. Superfeedr is based on a very clever business plan.  Describe it.

For subscribers: stop polling, we'll do that for you. Tell us how much it costs you, we'll match that anyway. The assumption is that we fetch the same data for several people. We're selling the same thing several times, when the cost of the n entries is the same as one entry. 

For publishers: we'll make your feeds real-time for free (and we benefit from that for the subscribers, too). If you want cool features (analytics, customization, etc.) then we'll charge you, based on the volume of data which transits through us on your behalf.

 

4. You've received investment from some pretty well-known sources - Mark Cuban among them.  What has the capital infusion allowed you to do in terms of your technology and supporting infrastructure?  And explain the composition of your backend - specific languages, servers and technologies.

This was a small seed round. I'm no "Internet superstar" and I don't host parties. Yet, this investment was made for us to invest in servers (we basically tripled our number of servers since then), as well as - more importantly - key hires. Speed is our only competitive advantage, staying small is important to stay fast, but this money also helped us get a few great engineers who are helping us with new features to be announced!

Our architecture is quite simple: almost like a distributed BotNet. We have many parsers that receive feed URLs from dispatchers and then return the parsed content to them. The dispatchers send the feeds based on a pre-determined 'next-fetch' time. And that's pretty much it.  Everybody is connected via XMPP, which brings stuff like presence, querying and XML - convenient when dealing with feeds!
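To make the 'next-fetch' idea concrete, here is a minimal sketch of how a dispatcher might decide which feeds to hand to parsers. It is not Superfeedr's code: the `Dispatcher` and `ScheduledFeed` names, the fixed per-feed interval and the in-memory queue are all assumptions made for illustration, and the XMPP transport Julien mentions is left out entirely.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledFeed:
    # Only next_fetch takes part in ordering, so the heap sorts by due time.
    next_fetch: float
    url: str = field(compare=False)
    interval: float = field(compare=False)  # seconds between fetches (assumed fixed)


class Dispatcher:
    """Hypothetical dispatcher: hands out feed URLs whose next-fetch time has arrived."""

    def __init__(self):
        self._queue = []

    def add(self, url, interval, now):
        """Register a feed; its first fetch is due one interval from `now`."""
        heapq.heappush(self._queue, ScheduledFeed(now + interval, url, interval))

    def due(self, now):
        """Return the URLs due at `now`, then reschedule each one."""
        ready = []
        while self._queue and self._queue[0].next_fetch <= now:
            ready.append(heapq.heappop(self._queue))
        # Reschedule after the loop so a feed is never handed out twice per call.
        for feed in ready:
            heapq.heappush(
                self._queue,
                ScheduledFeed(now + feed.interval, feed.url, feed.interval),
            )
        return [feed.url for feed in ready]


dispatcher = Dispatcher()
dispatcher.add("http://example.com/a.xml", interval=60, now=0)
dispatcher.add("http://example.com/b.xml", interval=300, now=0)
```

In a real system the URLs returned by `due()` would be sent to whichever parser is available, and the interval would presumably be tuned per feed rather than fixed.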

 

5. I see that some of your more notable clients include Posterous, Tumblr and twitterfeed.  In what ways have they leveraged low-latency push notifications and with which technologies?  And you've also got FriendFeed as a client, which uses SUP (its own protocol).  How is Superfeedr being used for that platform?

We host PubSubHubbub hubs for them. We believe publishers should focus on publishing, and we can help them improve their deliverability, by providing them with a hosted solution where the only things they have to do are (1) add some discovery inside their XML feeds, and (2) ping us whenever they update these. We will deal with the subscription and notification processes. Most of them have some kind of ping mechanism in place; we just make sure we translate these pings into a universal protocol, which is PubSubHubbub.

As for FriendFeed, we also use their Simple Update Protocol (SUP), on the subscriber side to know when a feed has been updated. It's not a ping protocol per se, but we consider it as similar - if a publisher already uses SUP, they can very easily turn on a hub at Superfeedr.
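For readers wondering what the two publisher steps look like in practice, here is a rough sketch based on the PubSubHubbub 0.3 draft, in which a publisher advertises its hub with a `rel="hub"` link and pings it with a form-encoded POST carrying `hub.mode=publish` and `hub.url`. The hub and feed URLs below are placeholders, and the helper names `build_ping` and `ping_hub` are made up for this example; none of it is taken from Superfeedr's own documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

HUB_URL = "http://hub.example.com/"            # placeholder hub address
FEED_URL = "http://blog.example.com/atom.xml"  # placeholder feed address

# Step 1: discovery. This element goes inside the Atom <feed> so that
# subscribers can find the hub that serves the feed.
DISCOVERY_LINK = '<link rel="hub" href="%s" />' % HUB_URL


def build_ping(hub_url, feed_url):
    """Step 2: build the 'feed has changed' notification for the hub."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def ping_hub(hub_url, feed_url):
    """Send the ping and return the HTTP status (a 2xx means it was accepted)."""
    with urlopen(build_ping(hub_url, feed_url), timeout=10) as response:
        return response.status


request = build_ping(HUB_URL, FEED_URL)
```

Calling `ping_hub(HUB_URL, FEED_URL)` after each new post is all the publisher has to do; fetching the feed and notifying subscribers is then the hub's job.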

 

6. The service works great for relatively low-volume RSS/Atom feeds.  But as we move away from that format towards update-intensive stream-based data, how does your system scale under periods of heavy duress with feeds like Digg's 'Popular Stories' or firehoses? 

Well, there's heavy and then there's heavy. I think we could pretty much handle Digg's "Popular Stories" feed, but we couldn't handle Twitter's firehose.  PubSubHubbub, given the fact that it's built on HTTP, can't go seriously and reliably under a few seconds of latency. So feeds like the Twitter firehose couldn't really work (at more than 1 entry/sec, it's not reasonable to expect anything just yet). However, all feeds can be seen as an aggregate of many other feeds. The Twitter feed is nothing more than the sum of all the user feeds, and, except maybe for Robert Scoble (kidding!), a user wouldn't update his Twitter feed more often than once every few seconds.  And then, we're good to go!

 

7. Superfeedr now promises to provide notifications to feeds "within 15 minutes, or it's free".  What continue to be some of the challenges of trying to deliver instantaneous alerts about content updates?

This 15-minute guarantee comes from the fact that we still have to do some polling in the worst-case scenario, and if we have to do polling, going much less than that is hard to do. So our approach is to decrease the average detection time rather than the maximum. If for 95% of our feeds we can guarantee 1 minute, then who cares about the 5% that are guaranteed to be below 15 minutes?

As long as there is still content that isn't pushed anywhere, we will have no way to get it without polling.
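The "average rather than maximum" argument is easy to check with the numbers Julien gives. The short sketch below treats 1 minute and 15 minutes as upper bounds for the two groups of feeds, which is my reading of his answer; the function name is invented for the example.

```python
def mean_detection_bound(tiers):
    """Upper bound on mean detection time.

    `tiers` is a list of (share_of_feeds, max_minutes) pairs whose shares sum to 1.
    """
    total_share = sum(share for share, _ in tiers)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * minutes for share, minutes in tiers)


# 95% of feeds detected within 1 minute, the remaining 5% within 15 minutes.
bound = mean_detection_bound([(0.95, 1), (0.05, 15)])
print(round(bound, 2))  # 1.7
```

So even with a 15-minute worst case, the average update would be detected in under two minutes, which is the figure that matters to most subscribers.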

Thanks Julien, and congratulations, again! Good luck in all your work!  :-)



Sunday, February 14, 2010

Breaking into The Worldwide Leader - my ESPN interview

You need only to spend five minutes with me to pick up on my affinity for sports, and a few more moments will inevitably lead to my fancy with ESPN.  As a lifelong sports fan, athlete, broadcaster, writer and trivia buff, it's literally the ultimate gig for people in my line of work.  Recently, the TV sports network considered me for a position; while the process didn't work out in my favor, I'm motivated to document my courting (pun most certainly intended).

While I won't divulge the particularities of the interviewing process, I will use this space to give aspiring and prospective staffers of the self-proclaimed "Worldwide Leader in Sports" an abstract view of what to expect when interviewing.  Hey, I'm not working there and I signed no non-disclosure agreements, so all's fair in love and human resources, right?  :-)

So last week I get an e-mail from an HR staffer letting me know I'm being considered for the organization's vaunted Stats & Information Group.  The process, I was quickly told, involved me securing a timeslot in a published schedule for a quick oral exam over the phone on general sports knowledge.  I booked such an appointment, which wound up being the next morning.  The exam was administered by the HR staffer himself, taking 7 minutes and consisting of 5 questions.  You really can't prepare for the grilling - by the company's own admission, it's trivia. You either know the material or you don't.

I assume the questions change from time to time and candidate to candidate, so I'll say only that the inquiries probed my acumen of baseball stats (dissecting the mathematical formulas for statistics), college and pro football (significant events, all-time leaders), golf and the NBA.  While I flubbed the baseball section, which in my case was the first question, I rebounded well and apparently did well enough to allow me to advance to the second phase - the written assessment.

The next stage was administered the very next day, and was a "speed drill exam" - a series of categorical questions on sports math, assuring accuracy in data from wire reports, general knowledge of players' collegiate associations, player names, terminology and esoteric rules.  The exam is long, and you only get 45 minutes to complete as much of it as you can and then return it to your proctor within the time limit.  It's open book/web, but the more time you spend looking up stuff the less time you're working on the exam.  

I drew from my SAT days and skipped an entire section I was spending a little too much time on to get to stuff I could breeze through faster.  It was tons of fun, and very draining.

I was informed two days later that the conclusion based on my scores was that I wouldn't be proceeding to the next phase, which is actual phone time with a hiring manager.  I was down, but still grateful for the opportunity, and hopeful another might pop up in the near future.

To his credit, the gentleman helping me was incredibly gracious and patient; he answered all my questions honestly and responded quickly.  When it was determined that I didn't have what it took to proceed any further along the interview trail, I was informed right away.  This was actually the second time I'd interviewed for a spot - the first, for a JavaScript developer position in 2005, wasn't as enjoyable, being run by a junior staffer who wasn't into sports at all.  Needless to say, this time I had much more fun and encourage anyone interested in a career in sports at that scale and scope to give it a shot.

If you're up for a challenge, the testing alone is worth the price of admission.  So to speak.



Book review: Professional XMPP

Book review: Professional XMPP Programming with JavaScript and jQuery

Since the Jabber project officially became XMPP, this book is the second official tome of knowledge on the subject, and the first to specifically concentrate on XMPP development within the vein of leveraging the protocol to power low-latency web-based applications.  Author Jack Moffitt does a tremendous job of introducing, working with, and building systems based on the often-confusing but critically important topic of creating online experiences based on real-time. 

The book focuses on using Bidirectional-streams Over Synchronous HTTP (BOSH) for empowering real-time communications over the web.  The basic layout of an infrastructure to support such systems over current web technologies is dissected, making this one of the better discussions on the topic.  This is helpful given the pushback many web devs have typically expressed about embracing a new technology stack.

After well-written overviews of XMPP, its lifecycle and requirements, the book is all about using BOSH to build practical, real-world demos.  The examples are based on Strophe, a JavaScript library written by Moffitt, using a surprisingly simple and consistent pattern even beginning developers can pick up and be productive with in their own projects.  It's code you can easily understand and use for your own work.

The book is divided into 14 chapters that won't take you all day to read and follow along with.  Each chapter is about 20-30 pages, intelligently written, logically organized and appropriately enhanced with URLs, illustrations, screengrabs and syntactical explanations to support the subject.  Moffitt's voice is very friendly, and the chapters are long enough to give attention to the topic at hand, but not so drawn out as to be boring.  You can tackle each demo in a single sitting, run the code, and then expand upon it.

The appendices are also extremely helpful, focusing on an introduction to the jQuery framework and on working with BOSH connection managers.  Both are very concise (although I would have appreciated an additional appendix that gets more in-depth on working with Strophe).

As far as the book's physical qualities, the Wrox binding is sturdy with thick paper, so it'll survive the process of violently flipping back and forth and forcing the book to lie flat as you work through the examples.

In short, Professional XMPP explores just a corner of the full range of services you can create with XMPP within the browser - not merely IM for the web.  That is the underlying theme and key message of this book: you can create great, powerful, push-based apps with the universally familiar toolset of HTML, CSS and JavaScript.

It's a great read for anyone wanting to get up and running with XMPP for the web, and will make a very welcome addition to any developmental library.


