February 9, 2008

The comprehensive guide to upgrading – or replacing – Twitter

Twitter is a rather new communications service, wildly popular in the technology blogging and podcasting communities. There are close to a million registered accounts, but I’d guess the active users number in the low to mid five figures. Even at that low usage, Twitter is overloaded, plagued with outages and data loss.

Scaling Twitter is a huge challenge. Doing so will involve changing just about every aspect of what Twitter is. A number of commentators have suggested lesser fixes, but none that I’ve seen is apt to work. (Generally, they forget that UI options will need to change as usage grows.) However, I think I’ve come up with an approach that would indeed work, for:

The sections below cover:

If you’re not familiar with Twitter, you probably should be. Crunchbase gives a decent overview, and the link above is a live look.

Twitter posts need more metadata

Twitter’s limit of 140 characters per message is cute, and maybe even sustainable for actual text. But that doesn’t allow for much metadata, @ replies and # tags notwithstanding. And the reliance on TinyURL is a kludge. The minimum metadata Twitter posts (aka tweets) need going forward is:

Twitter needs many more tweet filters

Even today, Twitter writers and readers would benefit from more ability to filter tweets. If the number of users went up 10X or 100X, better filtering would become an absolute need. Even absent such growth, if users join who are less technosocial than the early adopters – or if current users tire of the distraction Twitter now causes – filtering will be a need for them too.

Examples of filters that I think Twitter should develop or support include:

User-group filters are crucial, because the current model of listening to a whole “stream” doesn’t scale. Right now, Twitterers only fit into two groups – those you listen to and those you don’t. But as usage grows, we’ll need to be able to deploy filters such as:

The need goes even further than that. Already today, some people tweet publicly that they want to read Dave Winer’s views on technology but not on politics, or Robert Scoble’s actual tweets but not his automated notifications of podcasts. What’s more, we may prefer different filter sets for real-time streams on our phones, real-time streams on our PCs, and occasional archival lookups.
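To make the per-author, per-subject, per-device idea concrete, here is a minimal sketch in Python. It assumes tweets carry a subject label and an "automated" flag, which Twitter does not provide today; the `Tweet` and `FilterSet` names, the field names, and the sample authors' handles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tweet:
    author: str
    subject: str      # assumed metadata: a subject label on every tweet
    automated: bool   # assumed metadata: True for machine-generated notices
    text: str


@dataclass
class FilterSet:
    # author -> subjects the reader does not want from that author
    blocked_subjects: dict = field(default_factory=dict)
    # authors whose automated notifications should be dropped
    drop_automated_from: set = field(default_factory=set)

    def accepts(self, tweet: Tweet) -> bool:
        if tweet.subject in self.blocked_subjects.get(tweet.author, set()):
            return False
        if tweet.automated and tweet.author in self.drop_automated_from:
            return False
        return True


# One reader, different filter sets per device (hypothetical profiles).
profiles = {
    "phone": FilterSet(
        blocked_subjects={"davewiner": {"politics"}},
        drop_automated_from={"scobleizer"},
    ),
    "pc": FilterSet(),  # the PC stream lets everything through
}

incoming = [
    Tweet("davewiner", "technology", False, "Thoughts on RSS"),
    Tweet("davewiner", "politics", False, "Thoughts on the primaries"),
    Tweet("scobleizer", "technology", True, "New podcast posted"),
    Tweet("scobleizer", "technology", False, "At a startup demo"),
]

phone_stream = [t for t in incoming if profiles["phone"].accepts(t)]
pc_stream = [t for t in incoming if profiles["pc"].accepts(t)]
```

The point of the sketch is only that a filter is a rule evaluated per tweet against metadata, and that one reader can hold several such rule sets at once.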

Twitter needs to expand its use cases

Right now, there are two main ways to use Twitter – like high-tech CB radio, broadcasting to all who listen, or in “private update” mode, communicating only with your friends. As I’ve suggested above, there needs to be a lot more variety than that, with user groups and subjects freely filtered in and out. If that functionality is added, Twitter could have a number of major uses, including:

In addition, Twitter should be integrated with instant messaging. Right now, many people use Twitter through AIM or GoogleTalk. The tighter that integration gets, the better. Seamless switching between mass Twitter and reciprocal IM would be a nice improvement. (Just remember not to broadcast intimate love notes to your entire Twitter following.)

And it’s not just IM integration. For example, a group of Twitterites tweeting just at each other would be a whole lot like an IRC or AOL chat room, if filtering functionality worked that way. However – and this is a big advantage – it would be easy to be “in” multiple rooms at once.

Twitter needs a different architecture (CEP/database)

The essence of Twitter is accepting and distributing messages in real time. As I’ve already pointed out, this should be done via complex event/stream processing (CEP), not by writing everything first to a database. The need for much more complex filters just makes the case for CEP overwhelming. Of course, there also has to be a persistent message store, but database writes should happen only after real-time needs have been met.
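The ordering argued for here, deliver first and persist afterwards, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a CEP engine: the `Router` class is hypothetical, inboxes are plain in-memory lists, and a Python list stands in for the persistent message store.

```python
import queue
import threading


class Router:
    """Delivers tweets to followers immediately; persistence happens later."""

    def __init__(self, store):
        self.followers = {}   # author -> set of follower names
        self.inboxes = {}     # user -> list of delivered tweets
        self._pending = queue.Queue()   # tweets waiting to be persisted
        self._store = store             # stand-in for the message store
        writer = threading.Thread(target=self._persist_loop, daemon=True)
        writer.start()

    def follow(self, follower, author):
        self.followers.setdefault(author, set()).add(follower)

    def post(self, author, text):
        tweet = (author, text)
        # Real-time path: fan out in memory before touching the store.
        for user in self.followers.get(author, ()):
            self.inboxes.setdefault(user, []).append(tweet)
        # Persistence path: hand off to the background writer and return.
        self._pending.put(tweet)

    def _persist_loop(self):
        while True:
            tweet = self._pending.get()
            self._store.append(tweet)
            self._pending.task_done()

    def flush(self):
        """Block until every queued tweet has been persisted."""
        self._pending.join()


store = []
router = Router(store)
router.follow("bob", "alice")
router.post("alice", "hello")
delivered = list(router.inboxes["bob"])  # available before any write is awaited
router.flush()
```

A slow store in this arrangement lengthens the queue but never delays `post`, which is the property that matters.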

This could scale nicely. Suppose there were 1,000,000 users online in any given hour. Suppose for each of those users the system maintained a cache of 500 16-byte message IDs. We’d only be talking about 8 gigabytes of RAM for that portion, no matter how many followers the most popular Twitterers each have.
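The arithmetic is easy to check. The short Python calculation below uses only the figures from the paragraph above; the constant and function names are mine.

```python
ONLINE_USERS = 1_000_000      # users online in a given hour
CACHED_IDS_PER_USER = 500     # message IDs cached per user
BYTES_PER_MESSAGE_ID = 16     # size of one message ID


def cache_bytes(users: int, ids_per_user: int, id_bytes: int) -> int:
    """Total RAM needed for the per-user message-ID caches, in bytes."""
    return users * ids_per_user * id_bytes


total = cache_bytes(ONLINE_USERS, CACHED_IDS_PER_USER, BYTES_PER_MESSAGE_ID)
total_gb = total / 10**9      # decimal gigabytes
print(f"{total:,} bytes = {total_gb:g} GB")
```

The total depends only on the number of online readers and the cache depth, not on how many followers any one author has, because each reader's cache holds IDs rather than copies of the messages.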

So far, I’ve dodged the question of whether

  1. Each user would get a personal representation of her full Twitter stream on disk, or
  2. Her Twitter stream would be recreated by a full database query each time she logged on or drilled back in her archives.

What I suggest is a hybrid. When a user is online, whichever tweets she sees should eventually be persisted out to disk, in batches (at least their message IDs). When she first signs on (assuming she’s a frequent user), there should be a cache of tweets waiting for her in memory. But if she ever wants to do an archival search beyond those two groups of tweets, a slowish database lookup will have to do. That said, if it turns out to be a useful performance speedup hack to persist complete Twitter streams for the most active users, I won’t be at all astonished.
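Here is a rough Python sketch of that hybrid, assuming a three-tier lookup: an in-memory cache of recent tweet IDs, a per-user batch-persisted list, and a slow archive query as the fallback. The `UserStream` class, its parameters, and the `archive_query` callback are hypothetical, and a list stands in for disk.

```python
from collections import deque


class UserStream:
    def __init__(self, cache_size=500, batch_size=100):
        self.recent = deque(maxlen=cache_size)  # in-memory cache of tweet IDs
        self.unsaved = []                       # seen, not yet persisted
        self.on_disk = []                       # stand-in for persisted IDs
        self.batch_size = batch_size

    def saw(self, tweet_id):
        """Record a tweet the user has seen; persist in batches."""
        self.recent.append(tweet_id)
        self.unsaved.append(tweet_id)
        if len(self.unsaved) >= self.batch_size:
            self.flush()

    def flush(self):
        self.on_disk.extend(self.unsaved)
        self.unsaved.clear()

    def lookup(self, tweet_id, archive_query):
        """Return which tier answered, or None if the tweet is unknown."""
        if tweet_id in self.recent:
            return "memory"
        if tweet_id in self.on_disk:
            return "disk"
        # Slow path: full database lookup, only for deep archival searches.
        return "archive" if archive_query(tweet_id) else None


stream = UserStream(cache_size=2, batch_size=2)
for tweet_id in (1, 2, 3):
    stream.saw(tweet_id)
```

With these tiny sizes, ID 3 is still in memory, ID 1 has aged out of the cache but was persisted in the first batch, and anything else falls through to the archive query.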

Sometimes there would indeed be a complex query to fetch all or part of somebody’s Twitter stream. It would start with a set of rules that generated a list of tweet authors, perhaps executed against a persistent list of all the authors that user ever follows (or against some other kind of cache). Then it would look for all messages, in an appropriate time period (key point for performance optimizations), on the desired subjects. And last it would apply any negative filters (e.g., strong language). But if this were done against a real data warehouse DBMS, I don’t see why it would be a terribly big deal at all.
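The three stages of that query can be sketched in Python over an in-memory list. In practice this would be a query against a DBMS; the `Tweet` fields, the `fetch_stream` function, and its parameters are hypothetical, and the negative filter is a naive substring match chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    author: str
    timestamp: int   # assumed: seconds since some epoch
    subject: str     # assumed metadata: a subject label
    text: str


def fetch_stream(tweets, followed, author_rule, start, end, subjects, blocked_words):
    # Stage 1: rules generate the author list from everyone the user follows.
    authors = {a for a in followed if author_rule(a)}
    # Stage 2: messages by those authors, in the time window, on the subjects.
    candidates = [
        t for t in tweets
        if t.author in authors
        and start <= t.timestamp < end
        and t.subject in subjects
    ]
    # Stage 3: negative filters, e.g. strong language.
    return [
        t for t in candidates
        if not any(word in t.text.lower() for word in blocked_words)
    ]


tweets = [
    Tweet("dave", 10, "technology", "RSS is great"),
    Tweet("dave", 11, "politics", "Go vote"),
    Tweet("robert", 12, "technology", "Damn good demo"),
    Tweet("stranger", 12, "technology", "Hello"),
    Tweet("dave", 99, "technology", "Too late"),
]

result = fetch_stream(
    tweets,
    followed={"dave", "robert", "muted"},
    author_rule=lambda a: a != "muted",
    start=0,
    end=50,
    subjects={"technology"},
    blocked_words={"damn"},
)
```

The time window does most of the pruning, which is why it is the natural place to focus performance work.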

Twitter needs an enterprise version

I think Twitter could be a valuable enterprise tool. In particular, much of what email is used for would work better on a sufficiently spruced-up Twitter – namely quick notifications, often with an associated URL. (There should be fewer emails with file attachments in the world anyway, as those should be replaced by URLs. This is especially true at enterprises where good downloading connectivity can be assumed.)

Obviously, enterprise Twitter would need better archiving and integrations than the public version. I think it would actually need better filtering too. On the other hand, scalability would be much less of a challenge.

Voila! We have a monetization model for Twitter. However, we also have a huge reason for Microsoft to competitively blow Twitter out of the water. Make that “another huge reason” — the first one lies in the potential for Twitter to be a major enhancement to IM.

Twitter is very vulnerable to competitors

As popular as Twitter is, it doesn’t have a lot of built-in loyalty. Tweets are ephemeral; walking away from one’s archive of them would not be a terrible loss. Rebuilding the network of people one follows is a bit of a pain, but we’ve all done that multiple times before. And a new improved version could build a user base quickly by being more proactive about invites than Twitter is.

Above all, there’s rampant dissatisfaction with Twitter’s system robustness. As I’ve noted above, there’s also a lot of room for feature improvement.

Twitter is very vulnerable to being blown away.

Comments

14 Responses to “The comprehensive guide to upgrading – or replacing – Twitter”

  1. Sergey on February 9th, 2008 11:38 pm

    Well, I think it would be a good idea to build on XMPP and its pubsub extensions, but I don’t think that will happen — there’s not enough wheel reinventing to be done there.

  2. Curt Monash on February 10th, 2008 7:20 am

    Sergey,

    I think scaling discussions have to start with functionality and data query infrastructure.

    I don’t think those problems are apt to be solved by any kind of global message bus in which every message whizzes by and you only pick out the ones you want. Rather, I think there has to be a central server (suitably replicated, etc., but logically central) that only sends you the messages you actually asked for, or at most a SMALL superset of those.

    CAM

  3. UJ on February 10th, 2008 2:21 pm

    I think you’ve turned Twitter into a whole other application, into a blog almost. 140 characters isn’t supposed to be “cute,” it’s the whole freakin paradigm!

    My suggestion is that if you want those features, try WordPress. Twitter is something else. Applications can’t be everything to everyone. Ask Microsoft.

  4. Curt Monash on February 10th, 2008 3:13 pm

    Community-oriented software often doesn’t scale in pleasant usability as usage grows. I’m trying to figure out how Twitter can be an exception.

    This is related to the questions about how to make it scale technically, of course.

  5. Text Technologies»Blog Archive » Enterprise Twitter on February 11th, 2008 7:43 am

    [...] long discussion Saturday of how to evolve (or replace) Twitter included a short discussion of what might be called Enterprise Twitter. Dennis Howlett just alerted [...]

  6. Sergey on February 11th, 2008 2:47 pm

    Curt, maybe it’s my misunderstanding, but XMPP is more about routing messages than a global bus. With Publish-Subscribe Extensions it already has the core functionality of Twitter. It’s also proven to be scalable. One can pretty much reimplement Twitter on top of existing infrastructure without any problem. I have to admit that I don’t use Twitter myself, so maybe I’m missing some feature that doesn’t project itself too well onto existing Jabber network?

  7. Curt Monash on February 11th, 2008 7:04 pm

    Sergey,

    If you send things to a group of 30 people, or receive them from a group of 50, and you change who those groups are from hour to hour or message to message, where is that filtering going to be enforced? The UI can work on your Blackberry or iPhone, but will the logic work there too? I don’t think so; you have to go to a server. Could that server be your personal choice of “parent” server for your clients? I guess so. I also must confess that there’s an inevitable element of distribution once enterprises start messaging behind their firewalls, yet wanting to connect to the outside world, and I haven’t really thought that aspect through.

    But I still think the idea of writing messages to disk before sending them onward is just braindead. Send them first, THEN persist them quickly, but without allowing that persistence to be a bottleneck. XMPP doesn’t obviate the need for CEP, any more than CEP would replace XMPP.

    CAM

  8. More Twitter weirdness : e-Spot.se on March 14th, 2008 4:26 am

    [...] mars 2008 Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. [...]

  9. DBMS2 — DataBase Management System Services » Blog Archive » More Twitter weirdness on April 25th, 2008 12:11 am

    [...] Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. After a little while, the dupe disappears, but if you delete the dupe manually, the original is gone too. [...]

  10. Text Technologies » Blog Archive on April 25th, 2008 11:49 am

    [...] Twitter’s case, a mass-successful form will necessarily look utterly different from what exists today.  Techie early-adopters are not going to recruit a critical mass of users into a system that [...]

  11. Communication, culture, and short text messages | Text Technologies on July 7th, 2008 12:37 am

    [...] advocated recently for increased use both of simple instant messaging and filtered microblogging. The main reason I like short text messages so much is that, at least in theory, they improve on [...]

  12. The New Twitter : Social Media Mafia on September 20th, 2008 6:31 am

    [...] Text Technologies has an even more comprehensive guide of changes that they would like to see from Twitter which include, [...]

  13. Thoughts on the rumored Google/Twitter deal | Text Technologies on April 3rd, 2009 2:57 am

    [...] I’ve been suggesting all along that Twitter needs radical user experience enhancements. But when has Google ever made user experience enhancements to a service? Its core search [...]

  14. Google Wave — finally a Microsoft killer? | Text Technologies on May 29th, 2009 5:49 am

    [...] needs to be integrated with other forms of communication. What’s more, Twitter’s functionality needs to be drastically extended. Google Wave is the best hope I know of to meet those needs.  Enterprise Twitter is just a special [...]
