I am currently pursuing an idea for better presentation of comments on blogs. I would like some suggestions on this:
The Problem:
On popular blog posts, there can be hundreds or even thousands of comments on a single page. Over time, these comments split into distinct threads about different topics, and some posters address several aspects of the discussion in a single reply. For blog authors, it can be quite difficult to follow all the comments, especially across several blog posts. I am trying to find a solution that helps blog authors quickly prioritize the comments and reorganize the entire thread into a more presentable format, ordered by importance.
Example 1: "A Blog post supporting the use of national level internet censoring"
Comments on such a post can easily be grouped into three categories:
- In favour of the author
- Against the author
- Uncategorized (unable to categorize)
In this case, the author may wish to reply to the posts that oppose his own viewpoint first to further the discussion.
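To make the idea concrete, here is a rough Python sketch of the reordering I have in mind. It assumes each comment already carries a category label (assigned by hand or by some classifier). The `Comment` class, the `PRIORITY` table and the sample comments are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical category labels; a lower number is shown to the author first.
PRIORITY = {"against": 0, "uncategorized": 1, "in_favour": 2}


@dataclass
class Comment:
    author: str
    text: str
    category: str  # assumed to be assigned already, by hand or by a classifier


def order_for_author(comments):
    """Return the comments with opposing ones first.

    sorted() is stable, so comments within one category keep their
    original order. Unknown labels are treated as 'uncategorized'.
    """
    fallback = PRIORITY["uncategorized"]
    return sorted(comments, key=lambda c: PRIORITY.get(c.category, fallback))


# Made-up sample comments for the censorship post.
comments = [
    Comment("ann", "Completely agree, filtering is overdue.", "in_favour"),
    Comment("bob", "This would be a disaster for free speech.", "against"),
    Comment("cat", "First!", "uncategorized"),
    Comment("dan", "Censorship never works in practice.", "against"),
]

ordered = order_for_author(comments)
print([c.author for c in ordered])
```

The hard part is of course assigning the category automatically; the sorting itself is trivial once the labels exist.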
Example 2:
Blog post about: "Small independent hardware company announcing a new low voltage processor for netbooks"
Comments on this post may fall into more varied categories:
- Posts pointing out other existing low-voltage processors on the market (nature: informative)
- Posts appreciating the product
- Posts criticizing the product for various reasons
- Other
I am wondering whether there is any existing natural language processing (NLP) algorithm or software that can analyse complete sentences or comments and determine their nature according to categories like the ones I have stated above.
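As an illustration of the kind of thing I am asking about, below is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing, written with only the Python standard library. I am not claiming this is the right algorithm for the job; it is just the simplest supervised approach I could write down. The class name `NaiveBayes`, the category labels and the six training comments are invented for this example, and a real system would need far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> training comments
        self.vocab = set()

    def train(self, text, category):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for cat in self.doc_counts:
            # log prior + sum of smoothed log likelihoods of each word
            score = math.log(self.doc_counts[cat] / total_docs)
            total_words = sum(self.word_counts[cat].values())
            for w in words:
                count = self.word_counts[cat][w] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = cat, score
        return best


# Made-up training comments for the processor announcement example.
TRAINING = [
    ("There are already low voltage chips from other vendors on the market", "informative"),
    ("Another company released a similar processor last year", "informative"),
    ("Great product, I love it and can't wait to buy one", "appreciative"),
    ("Awesome work, this looks great", "appreciative"),
    ("Too expensive and too slow, a disappointing product", "critical"),
    ("Terrible battery life, this is a bad design", "critical"),
]

clf = NaiveBayes()
for text, category in TRAINING:
    clf.train(text, category)

print(clf.classify("This looks awesome, great work"))
print(clf.classify("Way too slow and expensive"))
```

Note that this sketch always picks one of the trained categories. It has no notion of "other" or "unable to categorize", which would need something like a confidence threshold on top.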
From my research, the closest example I have found is Slashdot, where each post is categorized as informative, funny, insightful, etc. and given a score from -1 to 5. But this is done by other visitors, not by algorithms. It works on a popular site like Slashdot, but I am not sure other posters would bother doing this on a small blog.
Other solutions I have seen sort comments by location (IP address), time since the blog post was published, length of the post, etc.
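For comparison, metadata-based sorting of this kind needs no language analysis at all. A small Python sketch, with made-up timestamps and comments and a hypothetical dictionary layout for each comment:

```python
from datetime import datetime

# Made-up publication time and comments, for illustration only.
post_published = datetime(2010, 1, 1, 12, 0)
comments = [
    {"author": "ann", "posted": datetime(2010, 1, 1, 12, 5), "text": "Nice."},
    {"author": "bob", "posted": datetime(2010, 1, 1, 15, 0),
     "text": "I disagree, and here is a longer explanation of why."},
    {"author": "cat", "posted": datetime(2010, 1, 1, 12, 30),
     "text": "Some more detail on this point."},
]


def minutes_after_post(comment):
    """Minutes elapsed between the blog post and the comment."""
    return (comment["posted"] - post_published).total_seconds() / 60


# Earliest comments first.
by_time = sorted(comments, key=minutes_after_post)
# Longest comments first.
by_length = sorted(comments, key=lambda c: len(c["text"]), reverse=True)

print([c["author"] for c in by_time])
print([c["author"] for c in by_length])
```

This is easy to implement, but it says nothing about what a comment actually argues, which is why I am looking for something content-based.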
Any suggestion is appreciated!