
www.archive.org copied your site???



One of my sites was copied by a bot running for http://www.archive.org. Go out there and search for your domain name and see if you get a hit.

 

I read through some of their "FAQs" and basically they want me to do some work on MY site to keep THEM from copying my work (i.e., add a robots.txt). What the heck is that all about? From a legal standpoint, since when do I have to make changes to my site to keep someone from stealing my logos, pictures, etc.? Now, I understand that in reality I should do things to keep people from stealing my work because it happens, but I don't understand how they can so blatantly steal parts of my site from a legal standpoint.

 

I want to send them an email asking them to explain themselves, but before I do, I thought I'd see what you guys think about this.

 

TIA,

 

Kirk


1) archive.org is a hugely valuable internet service that has no parallel. When you want to look up some content that was on a website that has long since disappeared, what are you going to do?

 

2) They are no more 'stealing' your site than Google is. They are indexing it, and they are not making or attempting to make a profit from that. If a user wants to view your site as it is now, they'll just visit you. If they want to view it how it was a few years ago, they'll visit archive.org. If you don't want them to do that, you block the spider.

 

3) archive.org are up front about what they do, and a robots.txt file is a very _very_ simple thing to add. If you add the few lines they ask for, they won't index you. It's simple. While you're at it, you can add rules that stop other well-behaved robots from sucking up your bandwidth.

 

4) Worry more about people who are going to steal your stuff and pretend it's theirs, and who won't be deterred by a simple robots.txt.
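
To make point 3 concrete, here is a minimal sketch of what such a robots.txt could look like, checked with Python's standard urllib.robotparser. It assumes the Archive's crawler identifies itself as ia_archiver (confirm the current name in their FAQ before relying on it); www.example.com and the is_allowed helper are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block only the Archive's crawler.
# "ia_archiver" is an assumption about the crawler's user-agent name.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    site = "http://www.example.com/"  # placeholder domain
    print("ia_archiver allowed:", is_allowed(ROBOTS_TXT, "ia_archiver", site))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", site))
```

Because the rule names a single user agent, robots that are not listed are unaffected. The file only works as a request, though: it relies on the robot choosing to honor it, which is the concern in point 4.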


Adding entries to robots.txt is a standard, documented procedure for anyone running a website who does not wish one or more pages to be indexed by search engines. Since when? Since pretty much as long as search engine services like Google have existed. Remember that if you block your home page for all robots, you'll drop off Google's radar (and that of every other search engine, too). Typically, you might add entries just for specific folders. Anyway, as implied above, these guys aren't actually stealing anything from you.
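
As a rough illustration of the folder-level approach, the sketch below compares two hypothetical robots.txt files using Python's standard urllib.robotparser: one that hides only certain folders, and one that blocks the whole site for every robot. The folder names /images/ and /private/, the domain www.example.com, and the is_allowed helper are all assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every robot out of two folders only.
FOLDERS_ONLY = """\
User-agent: *
Disallow: /images/
Disallow: /private/
"""

# Hypothetical rules: keep every robot out of the entire site.
WHOLE_SITE = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    home = "http://www.example.com/"  # placeholder domain
    logo = "http://www.example.com/images/logo.png"
    print("folders only, home page:", is_allowed(FOLDERS_ONLY, "Googlebot", home))
    print("folders only, logo:", is_allowed(FOLDERS_ONLY, "Googlebot", logo))
    print("whole site, home page:", is_allowed(WHOLE_SITE, "Googlebot", home))
```

With the folder-only rules the home page stays crawlable while the listed folders do not, which is why per-folder entries are the usual choice if you still want to show up in search results.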

This is the web. Various sites index the web. Archive.org is one of the nicer ones in that they actually honor the preferences you have expressed (via robots.txt) for indexing and archiving your site.

 

There's no reason to get upset. They are not doing anything wrong.
