Spiders, spiders everywhere



Hlafordlaes
2013-Aug-20, 06:28 PM
Being of the oblivious type, I noticed last night for the first time (had never looked before) that we had some 30 users online and over 700 visitors. Asked a mod about it, and he thought the visitors were probably mostly web-crawling spiders doing their stuff. I was initially worried that hundreds of innocent schoolchildren were all watching our every move, and it made me think I ought to watch myself a bit more. That's still the case, of course, but the numbers seemed, well, too astronomical.

So I just checked over on Anandtech, and the ratio right now is 1,000 signed on to roughly 12,000 visitors. Unlike CQ, it is hard to imagine many people reading there for pleasure, though I could see some 500 or so checking out answers posted for tech issues at any given moment. So the spiders and web-crawling AIs are the vast majority of visitors, it would seem.

Anyone know how to determine the number of humans visiting sites like CQ? Not that it's urgent; I'm just sort of curious. I should think forum site owners would be keen to know the mix, and gain insight into how to attract some of the passive readers onto a site.

caveman1917
2013-Aug-25, 01:21 PM
Spiders should obey the robots.txt file (it's a file on the server where you can put information for web spiders), so one way to count the percentage of spiders versus normal people is checking how many guests have accessed the robots.txt file. At least that is assuming that all spiders correctly implement this.

ETA: however, this is not something you can do yourself; it requires server access.
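
For anyone who does have server access, here is a rough sketch of the counting idea in Python. It assumes the web server writes its access log in Common Log Format (client address first, request line in quotes); the function name and the sample log lines are made up for illustration. It only catches spiders that actually ask for robots.txt, so treat the result as a lower bound.

```python
import re

# Captures the client address and the requested path from a
# Common Log Format line (an assumption about the server's log layout).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)')


def split_visitors(log_lines):
    """Return (spiders, others): addresses that did / did not fetch /robots.txt."""
    seen = set()
    spiders = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines in an unexpected format
        address, path = match.groups()
        seen.add(address)
        # Ignore any query string when comparing the path.
        if path.split("?", 1)[0] == "/robots.txt":
            spiders.add(address)
    return spiders, seen - spiders


# Hypothetical sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.5 - - [25/Aug/2013:13:21:00 +0000] "GET /robots.txt HTTP/1.1" 200 26',
    '203.0.113.5 - - [25/Aug/2013:13:21:02 +0000] "GET /showthread.php?t=1 HTTP/1.1" 200 5120',
    '198.51.100.7 - - [25/Aug/2013:13:22:10 +0000] "GET /index.php HTTP/1.1" 200 4096',
]
spiders, others = split_visitors(sample)
print(f"{len(spiders)} of {len(spiders) + len(others)} addresses fetched robots.txt")
```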

HenrikOlsen
2013-Aug-25, 02:53 PM
The server isn't set up with a robots.txt file.

caveman1917
2013-Aug-25, 03:17 PM
If one is going to implement a script to count accesses to robots.txt, it's trivial to also create a dummy robots.txt file in the first place.
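
As an illustration of what such a dummy file could contain, the sketch below holds a two-line robots.txt that permits everything and checks it with Python's standard urllib.robotparser module. The bot name and the probe path are placeholders, not anything taken from the actual server.

```python
from urllib.robotparser import RobotFileParser

# A permissive placeholder: an empty Disallow means nothing is off limits.
DUMMY_ROBOTS_TXT = "User-agent: *\nDisallow:\n"


def allows_everything(robots_txt, probe_url="/showthread.php"):
    """Check whether a hypothetical crawler may fetch probe_url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("AnyBot", probe_url)


print(allows_everything(DUMMY_ROBOTS_TXT))
```

A file like this changes nothing about what spiders may crawl; it just gives well-behaved ones something to request, which is what makes the counting possible.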

Hlafordlaes
2013-Aug-25, 05:28 PM
Never programmed web sites or servers and have very little knowledge regarding the ins and outs.

Apart from my curiosity, one good reason to think about who or what is visiting is to then start using cookies to count visits by a given IP, and, say, use a pop-up on the nth visit to ask maybe two questions about what they were looking for, if they found it, and what might attract them as members to the site. I know the NYT dutifully blocks my access every month after the 10th visit, and asks me to pay for a subscription with a pop-up, so it can be done.
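
A rough sketch of the "nth visit" idea, written in Python with the standard http.cookies module. Everything here is hypothetical: the cookie name "visits", the threshold of 10, and the function itself are stand-ins for whatever the forum software would actually provide. It counts visits per browser through a cookie, which is not quite the same as counting per IP.

```python
from http.cookies import SimpleCookie

SURVEY_VISIT = 10  # hypothetical threshold for showing the questions


def handle_visit(cookie_header):
    """Return (set_cookie_value, show_survey) for one page request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    try:
        visits = int(cookie["visits"].value) + 1
    except (KeyError, ValueError):
        visits = 1  # first visit, or a cookie we cannot read
    reply = SimpleCookie()
    reply["visits"] = str(visits)
    reply["visits"]["max-age"] = 30 * 24 * 3600  # keep the count for 30 days
    return reply["visits"].OutputString(), visits == SURVEY_VISIT


# Simulate one browser returning ten times, echoing the cookie back each time.
header = None
for _ in range(10):
    set_cookie, show_survey = handle_visit(header)
    header = set_cookie.split(";")[0]
print(header, show_survey)
```

Spiders generally don't send cookies back, so under this scheme each of their requests would look like a first visit and they would never reach the threshold.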

Sometimes, when real life is too hairy, I only come read as a visitor, without signing in, for up to weeks on end. I see no pop-ups over those periods, so I don't think much is being done with visitors currently.

Of course, an open question is what profile one wants to attract. More experts? More curious? More potential participants in CQ activities (which I don't really have a handle on yet, admittedly)? At any rate, if, as it seems, board finance is an issue, a broader user base may be one part of the answer.

I say all this while my inner fuddy-duddy rebels. Quite happy at the moment with the pool of experts around, and not very keen on the chance of a heavy influx of woo-woo.

HenrikOlsen
2013-Aug-26, 06:42 AM
Cookies are used to store information temporarily on the user's machine, which can then be requested by the website.
They're normally used for such things as recognizing which user a request comes from, for checking usage patterns, and for storing a key that shows the user has already logged in.

For tracking IP addresses, it makes more sense to look at the IP address the request comes from rather than mess with cookies. :)
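
To make that concrete, here is a small Python sketch that tallies requests by source address, with no cookies involved. The function name, the threshold, and the sample addresses are invented for the example; on a real server the addresses would come from the request itself or from the access log.

```python
from collections import Counter


def due_for_survey(addresses, nth=10):
    """Return the addresses whose request count reaches nth, in the order they get there."""
    counts = Counter()
    due = []
    for address in addresses:
        counts[address] += 1
        if counts[address] == nth:
            due.append(address)
    return due


# Hypothetical request stream: one address visits three times, another once.
requests = ["203.0.113.5", "198.51.100.7", "203.0.113.5", "203.0.113.5"]
print(due_for_survey(requests, nth=3))
```

One caveat: several people behind the same router share one address, so a count per address is only an approximation of a count per person.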