[squeak-dev] SqueakSource indexability (aka should we just ask crawlers to desist?)

Wed Apr 28 19:59:18 UTC 2010

On 28.04.2010, at 21:07, Ken Causey wrote:
> 
> At times access to source.squeak.org becomes slower, as has been the
> case today.  I can see in the logs that various web-crawlers are the
> likely culprit.  Having the information there accessible via search
> engines is a wonderful thing but I have to suspect that the Seaside
> session IDs eliminate this option.  (Of course when URLs like
> http://source.squeak.org/trunk.html are found on other sites they then
> become indexed.)

Which URLs are the bots accessing?

> Unless I'm mistaken about this, and I would appreciate any guidance, it
> seems like we need to add a robots.txt to the site which guides or
> simply asks crawlers to stay away.  Thoughts?  I'm no SEO expert.

We do have a robots.txt:
http://source.squeak.org/robots.txt
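
For anyone who wants to check how a well-behaved crawler would read a
set of robots.txt rules, here is a minimal sketch using Python's
standard urllib.robotparser. The rules in SAMPLE_RULES are hypothetical
and are not the actual contents of the file above; the helper name
allowed() and the second test path are likewise made up for
illustration. It only shows what the rules ask for; a crawler that
ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real
# source.squeak.org/robots.txt. They let crawlers fetch the static
# trunk page and ask them to stay away from everything else.
SAMPLE_RULES = [
    "User-agent: *",
    "Allow: /trunk.html",
    "Disallow: /",
]


def allowed(url, agent="*", rules=SAMPLE_RULES):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The second path is a made-up example of a non-static page.
    for url in (
        "http://source.squeak.org/trunk.html",
        "http://source.squeak.org/some/other/page",
    ):
        print(url, "->", "allowed" if allowed(url) else "disallowed")
```

To test the live file instead, one could call set_url() and read() on
the parser in place of parse(), which fetches the file over the
network.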

- Bert -