How Digg Gets Everything Backwards.. And How to Fix It

1. Digg is a wonderful idea.. but it's horribly broken.

Of course many people have been raising concerns about the manipulation and irrationality of Digg front page items (for example here, here, here, here, here, and here).

Recently the problem of "cabals" of Digg story promoters is getting more and more attention.  To their credit, the Digg administrators have made it possible to track who is submitting and promoting which stories, and the results are dramatic.  A tiny portion of Digg members are submitting stories, and tiny networks of friends are promoting each other's stories, resulting in a very tiny elite group of people determining an overwhelming amount of content that gets attention on the Digg front pages.

Kevin Rose, one of the Digg founders, has recently announced new efforts to try to outsmart these organized groups of co-promoters, in an effort to "catch" them and downgrade their influence on voting.  The idea is to identify non-diverse voting patterns and flag those as less important.  The effort is misdirected.

Digg suffers from a fundamental flaw in design. In fact its entire approach to leveraging the crowd's wisdom is completely backward.

2. First Things First - What Constitutes a "Good" Story?

Before we can talk about fixing the Digg-style model, we need to have some agreement about what constitutes a "good" story.  I define a "good" story as one which is considered good by those who actually take the time to read the stories and have some interest in the subject area, as opposed to stories which simply sound appealing based on their title.  The objective of this discussion is to identify why the Digg model is bad at finding such stories, and propose a model which would do a better job.

3. Crowds Don't Do A Good Job of Voting on Stories

The idea of leveraging the "wisdom of crowds" is an enticing one.  In his famous book of the same title, James Surowiecki suggests that large populations of (average intelligence) people can often perform better than a presumed-superior elite.  The basic idea is that instead of employing experts to make decisions, we can leverage the power of large groups of people voting at once, and average their votes.

The problem is that crowds aren't equally good at making all decisions.  Common sense questions, and questions where the population has a reasonable chance of possessing the background knowledge available to tackle a problem, are well solved by a "crowd vote."  But some questions, like estimating the predictive power of astrology, or estimating the gravity of Pluto, are not handled well by popular crowd vote, either because the crowd members don't possess the background domain knowledge necessary to make an informed opinion, or because they are highly biased for some reason.
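To make the distinction concrete, here is a minimal sketch in Python.  The numbers are invented purely for illustration and are not measurements of any real crowd: when individual errors are independent they cancel out in the average, but when everyone shares the same bias, averaging does nothing to remove it.

```python
from statistics import mean

def crowd_estimate(guesses):
    """Combine individual guesses the way a simple crowd vote would: by averaging."""
    return mean(guesses)

# Hypothetical example: the true answer is 100.
true_value = 100

# Independent errors: some guess high, some guess low, and the errors cancel.
independent_guesses = [90, 110, 95, 105, 80, 120]

# Shared bias: the same crowd, but everyone is pushed 30 too high
# (for instance by a misleading headline).
biased_guesses = [g + 30 for g in independent_guesses]

print(crowd_estimate(independent_guesses))  # 100, matches the true value
print(crowd_estimate(biased_guesses))       # 130, the shared bias survives averaging
```

Adding more voters to the second crowd would not help; only removing the shared bias would.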

We have plenty of anecdotal evidence already that a large portion of stories that make their way to the front page of Digg are:

* Either driven there by small groups collaborating in order to artificially inflate a story for personal gain (financial or otherwise)
* Or elevated to front page (prominent) status because of a cascade of mass crowd action, based not on the actual "value" of the content of the story to these people, but on the "catchiness" and sensationalism of the title of the story.

Furthermore, sites like Digg are highly susceptible to irrational trends and epidemics of attraction to keywords and slogans.  If an important event happens on a busy news day where Britney Spears gets married, it can be lost forever.  Because the rapid accumulation of votes in tight temporal proximity is so important to the ratings on sites like Digg, there is an undue emphasis on stories that have titles which appeal immediately to mass audiences.  Digg turns out to be very efficient at identifying catchy headlines, and very good at weeding out all stories that don't have catchy headlines.

4. Too Big an Incentive to Game The System

Part of the problem with services like Digg and Google is that in such a big marketplace, where attention is so financially valuable, the monetary benefits of prominent placement are so huge that they serve as an irresistible incentive to figure out ways to game the system.  An arms race is in place between the groups trying to exploit the ratings algorithms and the services, which are only mildly interested in curbing the behavior, and usually only when their own financial interests are at stake.

The current approach by Digg, to try to outwit these manipulators and reduce their corruptive influence, is pure folly.  First, because it's not practical to beat such manipulations - in the end such behavior is impossible to distinguish from genuine voting.  And second, because it doesn't address the other core problem: Crowds are not good at identifying good stories.

5. Crowds are not good at identifying good stories - they are only good at identifying sensational and catchy headlines.

Whether it's because the title has a dominant biasing effect or because they don't actually read the page content before they vote, the result is the same: A clear pattern of predictably shallow, duplicative content pages being promoted which have little value to the readers.  And the ease in capturing the attention of voters makes it all the more trivial for the small groups of manipulators to structure story titles to secure crowd votes.

6. Digg Gets Everything Backwards.

Digg is using crowds in the wrong way, for the wrong role.  They have a very small group of elite people submitting stories, a shadowy network of collaborators who work together to artificially promote stories of their own choosing, and a crowd that is led around by the nose like sheep.

If we ask instead what is the crowd good at, and when do we need domain experts, we end up with a completely different model.

7. A New Model: Crowd Suggestions  and Public Expert Filtering

There is too much information on the internet for a small group of experts to find all of the interesting stories each day.  For finding potentially good stories, we need to leverage the power of a large group of distributed people.  One wants a way of making it as easy as possible for people to submit new potential stories, putting as few obstacles in their way as possible.  For a Digg-like site this would mean removing the need to describe the story, title it, register, etc.  It would also mean welcoming people to submit their own sites and authored articles, rather than treating such things as spam.  After all, the objective here is like the objective of brainstorming - we want to welcome a wide variety of suggestions from any source.

Where Digg ends up with a small elite group that submits stories and a larger population of crowd voters, we instead want to shift the emphasis to large numbers of crowd submitters, by making it as easy as possible to submit, and perhaps by limiting the number of story submissions per person per day (which would be complete anathema to the current Digg model, in which elite submitters do most of the story submissions).

8. What About Crowd Voting? Eliminate it Completely.

That's right, you heard me - eliminate the ability of normal users to vote on stories.  They may enjoy it, and they may end up with a certified 100% user content created "web 2.0" site, but the bottom line is that the content sucks.  If the crowd is not good at identifying good stories then they should be removed from the loop.

9. What's the Alternative to Crowd Voting?

The alternative to having the masses vote based on the headlines of stories they don't read should be obvious:  Let voting/filtering be done by domain experts with some background and context for evaluating the value and interest of stories.

Just as you don't ask a crowd to perform dental surgery, you shouldn't be asking a crowd to evaluate the worth of a story on quantum physics.

Let's compare the Digg model with the proposed model along two dimensions:

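|                   | Discoverers/Submitters           | Editors/Selectors                       |
|-------------------|----------------------------------|-----------------------------------------|
| Old Media         | elite domain experts             | elite domain(?) experts                 |
| Digg Model        | small group of hyperactive elite | crowd + underground manipulation groups |
| Recommended Model | crowd                            | elite domain experts                    |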

10. How Do We Choose Experts?

An inevitable question that arises with this model is how to choose experts.  The answer of course is that you choose experts for a content site the same way that you choose experts for any task: in a wide variety of ways designed to ensure diversity, quality, judgment, and integrity.

For example you might have a central body that interviewed qualified candidates in different fields and assigned them to specific domains of submitted stories.  Experts who are voting and filtering stories should be publicly identified so that watchdog organizations and the public could investigate the possibility of bias or corruption.

11. A Representative Elected Body of Expert Voters

One particularly interesting possibility is the idea of publicly electing representatives who would run for office as domain-specific experts.  Here normal people would vote for candidates in specific fields of expertise, based on their past performance (which would be a matter of public record), and their background experience.  This would be a true representative system, where users are selecting domain-knowledgeable proxies for their votes, entrusting these representatives to make informed decisions about the veracity, value, and novelty of new stories.

This small group of experts would be much more easily monitored by users in terms of tracking their recommendations about stories, which should be public.  Readers will be able to see exactly which editors selected or rejected which stories.

Various hierarchies of experts are possible.  In one extreme you could employ a single domain expert in each domain area, with an expert making the single decision each day about the ranking of stories, with total transparency to readers.  At the other extreme one might create a hierarchy of voting experts with votes weighted based on domain knowledge and experience.  Open elections could be used to let readers identify and weight different experts differently based on past performance.  Regardless of the arrangement, decisions by experts should be transparent and available to anyone.
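As a rough illustration of the weighted end of that spectrum, here is a minimal Python sketch.  Everything in it is an assumption made up for the example: the expert names, the weights, the +1/-1 vote scale, and the function name `rank_stories` do not come from any existing site.  Each expert's vote is multiplied by a weight (which could come from elections or past performance), and stories are ranked by their weighted totals.

```python
from collections import defaultdict

def rank_stories(votes, weights):
    """Rank stories by the weighted sum of expert votes.

    votes   -- list of (expert, story, score) tuples, score +1 (select) or -1 (reject)
    weights -- dict mapping each expert to a weight; unknown experts count as 0
    """
    totals = defaultdict(float)
    for expert, story, score in votes:
        totals[story] += weights.get(expert, 0.0) * score
    # Highest weighted total first; ties broken alphabetically for a stable order.
    return sorted(totals, key=lambda story: (-totals[story], story))

# Hypothetical experts: alice has been given three times the weight of bob.
weights = {"alice": 3.0, "bob": 1.0}

# The full list of votes is kept as plain data, so it can be published as-is.
votes = [
    ("alice", "story_a", +1),
    ("bob",   "story_a", -1),
    ("alice", "story_b", -1),
    ("bob",   "story_b", +1),
    ("bob",   "story_c", +1),
]

print(rank_stories(votes, weights))  # ['story_a', 'story_c', 'story_b']
```

Because the input is just a list of who voted for what, publishing that list alongside the weights would give readers the transparency argued for above.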

There is one real practical impediment to using experts to do the final level of filtering and selecting - the cost in time and money involved in supporting this small group.  Given the money generated through ad revenue by sites like Digg, this really shouldn't be a serious problem, and sites like Netscape have begun moving in this direction.  Netscape may or may not suck, but the idea to use expert editors to help filter stories and add background context is an improvement on the Digg model.  Alternatively one might imagine that volunteers would be willing to fill these jobs and would appreciate the added public recognition that would be due a small number of domain experts in the new model.

12. Summary

We are currently in a period where "user-generated" content is king.  In the rush to produce sites built from user-generated content, we've seen a mass removal of the role of domain experts and proxy representatives.  Whether it's Digg or Wikipedia, there has been a move to treat everyone as if they had exactly the same background level of expertise on every subject.  This is surely a temporary aberration. Not everyone is qualified to take part in every decision.  Eventually we are going to have to return to a more balanced solution where users influence content in a way that makes sense according to their interests, background knowledge, available time, and abilities, and where domain experts provide a necessary element of context, continuity, consistency, and informed judgement.

Since I mentioned Netscape, I thought maybe I should comment a little on the things going on at Netscape and how they fit in.

One of the principal ideas here on this site is the idea of trying to find a fair way to reward those who create content on the site.

The Netscape idea of paying their top users makes some sense to me, in that it is in keeping with the idea of returning some of the money being made on the site from advertisements back to the users responsible for the content which is generating this revenue.

One of the difficulties that sites like Netscape have is in getting the "incentive" issue right.  If top users are determined and paid based on the traffic they generate, then the *incentive* for them to create traffic, whether by hype or manipulation, is substantial.  This is part of the danger of paying people based on some metric - there is a strong incentive to play to that metric.

(As an aside, this is one of the reasons we have tried to avoid any system on this site to reward people based on traffic, downloads, etc., but instead try to facilitate and encourage DIRECT user-to-user donations.  The hope is that the incentive will therefore be on creating content that users actually benefit from and appreciate.  This approach is, like the others, not without its own dangers, such as the possibility that content creators will be motivated to focus only on content which generates donations.)

