Web 2.0 and the limits of owning your data
I’m hesitant to enter into the lively discussion surrounding exactly what “Web 2.0” means, but I’ll venture this: one important part of Web 2.0 is the separation of user data from the applications that use it, and the idea that users should own and control this data. In this vein, Dorrian Porter recently pointed me to a post he made a couple of months ago following up on a post made by Robert Young as a guest blogger for Om Malik.
Robert was pointing out that the switching costs imposed by Web 1.0 companies to get a competitive advantage are being replaced by different switching costs created by the *users* of Web 2.0 companies. As an example he cited the switching costs created by the value of a social network at MySpace or a reputation on eBay, as opposed to the switching cost created by the email address and “walled garden” at AOL.
Dorrian then points out that users increasingly want to own their data, from their blog post comments to their purchase history at Amazon to their social network at LinkedIn. Ownership of this data implies portability, which lowers switching costs — even the above costs created by the user. But hey, that’s part of the evolution to Web 2.0, and the removal of such barriers should force companies to compete on the basis of better functionality and service.
On a theoretical level, I completely agree with Dorrian; but more practically, I think that not all data is created equal in terms of how easy it is to separate from applications. For example, ownership of blog comments would be pretty simple to make happen, but it just hasn’t been important enough for long enough to really happen yet (beyond using trackbacks). Ownership of something like purchase history (or attention) is a bit more difficult: even if a standardized format exists for the data, it is harder for users to review it for accuracy or for applications to use it effectively. Moreover, the increased competition based on features will mean that many applications are likely to find any standard lacking when trying to differentiate their service. Thus it will still be necessary in many cases for the application to own user data, perhaps used in conjunction with the data owned by the user.
For sites like MySpace that build a multi-faceted environment for users to communicate and share interests, it’s hard to see how the data could be practically separated out. After all, MySpace is essentially a music-centered conglomeration of home pages, blogs, social networks, bulletin boards, etc. All of these components are available elsewhere separately, but users value the community architecture built by MySpace, and so let the service manage their data. Even if an easy way existed to export a specific user’s posts, links, forum comments, etc., it seems doubtful that many users would take advantage of it.
All that being said, it seems clear that things are evolving towards more user control, and that the specific path this evolution takes will be driven by user demand. One can imagine a world where everyone owns their links and writings and social networks, and then leases them to applications; a MySpace then could build value by providing an easily navigable workspace and links to extra services like music streaming. But this will only happen if users want it. For user data that has little utility outside of MySpace, there will be little demand for inclusion in such an architecture; but users will want to own data that does have external utility, and will increasingly gravitate towards applications that respect this desire.
October 19th, 2005 at 3:07 pm
Over at Shelly’s site you said “if you make your data public, you necessarily lose a great deal of control over it” … which struck me as oh so true. Information is not like physical property. It is fundamentally different. The value of information lies in its ability to be repeated and still remain useful. If it can be used in only one spot, in only one context, then its value is substantially diminished.
Now I certainly subscribe to the idea that people should be able to export/import their data between web services. But the ability of an author to restrain the flow and interaction of the information that they generate when they write on the web is not, in general, something that is going to improve our culture. In fact, we would be better off if that ability were simply forgotten.
October 20th, 2005 at 5:30 pm
That’s my first reaction as well; but then I think, what about authors who are depending upon income based on their writing on the web? Even if the feed is only excerpts, the entire article can always be screen scraped — that’s why I think the ongoing debate around *public* data is more about monetization and recognition for the content creator.
In fact, I’d say it’s really not a technology or business issue at all; it’s a legal issue: if society wants authors (or musicians, artists, etc.) to be able to somehow profit from their public works in a digital world, legal restrictions have to be placed on what is permissible. DRM, robots.txt mods, and similar tech doodads are just enforcement techniques; they only make repurposing more of a hassle, and they’ll never really prevent it.
In the end, enforcement has to center around a clear idea of what everyone has decided is right and wrong, and right now, with so many new ways of repurposing data being created, no one is really sure.
October 21st, 2005 at 3:13 pm
For me the debate has shifted a little from the point I was trying to make (which was less about data). I was mostly concerned with elements of identity and social networks, and the impact on switching costs. I’m most torn over the problem that a website or service is the holder of reputational, preference and identity data, and not the Internet itself. Of course it is in the interest of any website or service to want to guard that data and create switching costs, but it just seems to be so damn inefficient. I’m tired of hearing about any company that thinks it can get me to create a social network within its site. Yet the problem is even bigger when you think of the major players, who believe, based on size, breadth and scale, that they are a natural fit for holding all of this information across all categories. It’s scary. That’s when I think the solution must lie in giving the individual more control.
The content creation/management side is a separate albeit equally interesting and challenging issue. Lessig made probably the most compelling point to me for the reason CC exists in a recent post. “If copyright regulates “copies,” then while a tiny portion of the uses of culture off the net involves making “copies,” every use of culture on the net begins by making a copy.” That sure made me stop and think just how troubling old laws applied to the new world might be. Laws definitely need to change.
October 26th, 2005 at 9:36 am
Dorrian, I couldn’t agree more — at least in copyright law, it seems to me that the root issue is that the nature of a “copy” has changed, and therefore so must the laws (although I’m not sure exactly in what way). I also agree with your first point, which to me says that yes, personal data such as identity and social networks have value to the user and so are subject to a demand by users for decoupling from apps; but even for data that has no such demand associated with it (e.g. for users that do everything within Yahoo or Google), decoupling is still important to avoid a potentially dangerous concentration of power. Thanks for the comment.
October 30th, 2005 at 4:36 am
Independent of Web 2.0 developments, frameworks and specs for Web services have also been addressing the problem of isolating the (meta)data from the implementation.
http://www-128.ibm.com/developerworks/library/specification/ws-resource/
Best practices don’t seem to be saying that Web 2.0 implementations should be Web Services based. However, there is something to a framework that allows you to build things using a sort of template like:
1. tell me what you want to do
2. tell me where the WS-Resource compliant data is
3. provide an implementation that can use any source for #2 but that in this particular instance will use the data at #2
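
The three steps above can be sketched roughly as follows. This is only an illustration in Python, not anything taken from the WS-Resource specs: `DataSource`, `InMemorySource`, `run`, and `count_comments` are hypothetical names, and the in-memory list stands in for wherever the compliant data would actually live.

```python
from typing import Callable, Iterable, Protocol


class DataSource(Protocol):
    """Step 2: anything that can hand back a user's records (hypothetical interface)."""

    def records(self) -> Iterable[dict]: ...


class InMemorySource:
    """Hypothetical stand-in for a remote, spec-compliant data store."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def records(self) -> Iterable[dict]:
        return iter(self._rows)


def run(task: Callable[[Iterable[dict]], object], source: DataSource) -> object:
    """Step 3: works with any source, but in this call uses the one passed in."""
    return task(source.records())


def count_comments(rows: Iterable[dict]) -> int:
    """Step 1: what we want to do (here, count the user's comments)."""
    return sum(1 for row in rows if row.get("kind") == "comment")


source = InMemorySource([
    {"kind": "comment", "text": "Nice post"},
    {"kind": "link", "url": "http://example.com"},
    {"kind": "comment", "text": "Agreed"},
])
result = run(count_comments, source)
```

The point of the split is that `count_comments` never learns where the rows came from, so the same task could be pointed at a different source without being rewritten.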
November 14th, 2007 at 2:51 pm
[…] I’d recently been reading this post on EconoMeta, in which Adam talks about our changing relationship with our personal data: one […]