Author Topic: Article: Please don't steal this Web content  (Read 8823 times)

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,900
Article: Please don't steal this Web content
« on: August 02, 2007, 11:29 PM »
Nice short article about the practice of automated website "scraping", where sites are created by scripts to steal content from other sites, in order to show up in web searches and get advertising money.

VanFossen isn't referring to the kind of plagiarism in which a lazy college student copies sections of a book or another paper. This is automated digital plagiarism in which software bots can copy thousands of blog posts per hour and publish them verbatim onto Web sites on which contextual ads next to them can generate money for the site owner.

Such Web sites are known among Web publishers as "scraper sites" because they effectively scrape the content off blogs, usually through RSS (Really Simple Syndication) and other feeds on which those blogs are sent.
...
"It wasn't the issue of money," Leder added. "When other people's business model is based on stealing content, that's a significant problem."


The article mentions an interesting search engine for finding copies of your site content on the web: CopyScape.
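For anyone curious how a copy check of this kind might work in principle, here is a minimal sketch. It is not CopyScape's actual method, which the article does not describe; it simply compares two texts by their overlapping runs of words ("shingles") and reports the Jaccard similarity. The function names and the shingle length of 5 are assumptions chosen for illustration.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(original, candidate, k=5):
    """Jaccard similarity of the two shingle sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = shingles(original, k), shingles(candidate, k)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    post = "Such Web sites are known among Web publishers as scraper sites"
    scraped = "Such web sites are known among web publishers as scraper sites!"
    print(similarity(post, scraped))
```

A verbatim copy scores 1.0 even if the case or punctuation changes, while unrelated text scores 0.0. A real service would also have to find candidate pages in the first place, which this sketch does not attempt.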


from http://www.sutor.com
« Last Edit: August 02, 2007, 11:34 PM by mouser »

iphigenie

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,170
Re: Article: Please don't steal this Web content
« Reply #1 on: August 03, 2007, 04:12 AM »
I can understand.
Where I work, we have writers, editors and researchers employed all year round to produce and keep up to date information.
You wouldn't believe how many people and, even worse, companies just nick it and don't even bother to change the wording  >:(

app103

  • That scary taskbar girl
  • Global Moderator
  • Joined in 2006
  • *****
  • Posts: 5,884
Re: Article: Please don't steal this Web content
« Reply #2 on: August 04, 2007, 03:40 AM »
What is worse is when these sites have a higher search engine ranking than the sites they steal content from.

It's even worse when your site's entire content ends up as a single blog post and that ends up on the front page of digg, with absolutely no credit given to your site. (I have had this happen to me)

It can be a pretty discouraging situation to be in when someone steals your content like this. It really doesn't do much for encouraging the victim to keep working hard and producing new content.

And I have some mixed feelings about software download sites that do this. On the one hand, they are promoting my software.

On the other, they are copying partial content from my web pages, leaving out important information or presenting it in a way that makes no sense. This is unfair and can make me and my applications look bad.

Most of the time it is for applications I don't have PAD files for. If I had wanted these applications to be included on software sites, I would have created PAD files for them and submitted them myself.

The biggest problem with software sites doing things this way, without asking, is they never update the pages with correct or current info when you ask them to, and even if you have PAD files for your applications, changing the info in the PAD doesn't result in these sites getting updated. And they don't care if the info they give is inaccurate or outdated.

And it's not just the smaller, unknown sites that do this. Softpedia, a major site, seems to do this quite often.

I don't know if I should look at it as content and bandwidth theft or not. (many are hotlinking my screenshots and offering direct download links)
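As a rough illustration of how hotlinked screenshots could be spotted on the server side, here is a minimal sketch that inspects the HTTP `Referer` header of an image request. The function name `is_hotlink` and the host `example.com` are hypothetical, and the `Referer` header can be missing or spoofed, so this is a heuristic rather than real protection.

```python
from urllib.parse import urlparse


def is_hotlink(referer, own_hosts):
    """Return True if the Referer header names a host outside own_hosts.

    referer   -- raw Referer header value, or None/"" if the client sent none
    own_hosts -- iterable of lowercase hostnames you control, e.g. {"example.com"}
    """
    if not referer:
        # No Referer: direct visits and privacy tools look like this, so allow it.
        return False
    host = urlparse(referer).hostname
    if host is None:
        # Unparseable value: do not guess, treat it as allowed.
        return False
    # Accept the exact host or any subdomain of a host we own.
    return not any(host == h or host.endswith("." + h) for h in own_hosts)


if __name__ == "__main__":
    mine = {"example.com"}  # hypothetical site
    print(is_hotlink("https://www.example.com/apps/", mine))
    print(is_hotlink("http://downloads.example.net/app.html", mine))
```

A web server could use a check like this to log the offending pages or serve a placeholder image instead of the screenshot.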


iphigenie

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,170
Re: Article: Please don't steal this Web content
« Reply #3 on: August 06, 2007, 05:14 AM »
I never thought about that, the software sites.
I can see why a software site would go and copy the product description from the developer's site without thinking... but now that you mention it, I can see how, if done badly, it can be harmful. And as you say, it can also hurt you with search engines, as those sites have the higher PageRank and your site might get penalised.

And some of those sites don't even post a link to the developer's website, and as someone who always checks the developer's site prior to downloading, I find that really infuriating. I can imagine that, for the developer of a product, having them nick your copy and not even link back to you would be the most harmful thing (search-engine-wise, but also from a marketing point of view).

Is there a thread somewhere on this site on the good, average and less good software sites, based on how they treat the authors, link back (or not), nick copy without asking (or not)? I'd be curious.

nudone

  • Cody's Creator
  • Columnist
  • Joined in 2005
  • ***
  • Posts: 4,119
Re: Article: Please don't steal this Web content
« Reply #4 on: August 06, 2007, 01:28 PM »
i didn't appreciate how bad things were. seems like you need to provide crippled freeware until the person using it directly contacts you to obtain a key to enable it.

i know that isn't what many developers will want to do but after hearing the argument put forward by app that's the only solution - except for abandoning the freeware way completely so these despicable leeching websites can't get away with it.

whatever, you need to implement something to get the user to return to your website. and i'd go as far as saying you need to force them, don't give them the choice to avoid doing it - if they don't like the idea, fine let the ungrateful sods go elsewhere and use something else.

i think protecting your software is paramount - lazy ungrateful users that refuse to check out your site can go to hell. of course, you can provide a nice friendly message saying why you've limited the function of the software, blah, blah.

anyway, i'm not the right person to give this advice as i'm not a developer. but i know i'd be mighty pissed off if stuff i'd made was being abused online - and i'd try to implement ways to stop it - at the expense of annoying all those ungrateful b*stards that think everything should be free.

housetier

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 1,321
Re: Article: Please don't steal this Web content
« Reply #5 on: August 10, 2007, 04:02 AM »
I am very worried about this too: people stealing my content. I am (slowly) writing a book; everyone can read it online, but I don't want people to just copy its content and sell it as their own.

It is my work! Well, in fact it is a collaborative work, but none of the other writers want their work stolen either. The whole book is under a CC license which allows copying for non-commercial use and requires full attribution. But I don't really have a way of enforcing this license.

I shall take a look at copyscape.

housetier

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 1,321
Re: Article: Please don't steal this Web content
« Reply #6 on: January 17, 2008, 08:57 AM »
There is a web-monitoring service that scans the internets for illegitimate copies of your content. I have not tried their services, because they are not free. But when you live off writing it might be interesting...

[Image: Attributor logo]