What is it with search engines? and web-traffic rankers?
This blog has done enough whining about Technorati’s randomness. It’s well overdue to say that it’s probably working far more consistently and reliably than most of the facilities that claim to find Internet resources. (On a note that shows how shamelessly susceptible to flattery we are at whydontyou.org.uk – others please take note – it puts this blog at under 60,000 in the blogosphere which is almost beyond its wildest dreams.)
As an experiment, look up your blog in a few search engines. See if you can find any points in common between them.
Here’s one of my favourites in that I suspect they actually must a randomiser to generate web traffic numbers and links. Pick a blog, look at it in technorati’s blog directory.
Go to the traffic rank bit and click on it. You will find yourself in the realm of Alexa. This will probably show you that the traffic isnt really counted because the blog isn’t in the top 100,000. The daily page views are shown as a percent of people using the whole Internet, i.e., if the site isnt in the top 100,000 sites in the world, you wont get any figures. (If you come in at a newbie 5,195,452 – as this blog does – you may wonder if you are even reading the blog yourself)
100,000 sounds like a lot of sites. However, if you consider, global players (like Google or Microsoft), then big online retailers (like Tescos and Dell), then news sites (CNN, BBC) and national government information sites, you can see it must be pretty difficult to get into the club.
Beneath this blank chart, you will see “Percent of Internet users who visit this site” with a fraction of a percent if it’s anything like this one. (Maybe you’re Microsoft, in which case i guess it will be higher. Will check shortly.)
Then “average number of pages visited” and “3 months average traffic rank” (risibly low) and average page views per visitor (1) (1 Do you suspect that’s hard-coded?)
But the next bit is what creases me up for its randomness. People who visit this site come from (in order of most visits):
United States 40.0% (fair enough, the blog’s in English. Most English-speaking Internet users are in the USA)
Costa Rica 10.0%
United Kingdom 10.0%
Whydontyou.org.uk traffic rank in other countries: (These seem to be the same countries to me)
Costa Rica 46,349
United States 658,841
United Kingdom 703,872
Come on…. To what do we owe this unprecedented popularity in Costa Rica? India? France? This is a UK-based blog. Most of the stuff we witter on about, apart from atheism and technology, relates to the UK.
It’s not that I don’t want to believe it. A central American flavour to its posts would make this blog much more interesting. I just think the figures have been made up.
OK, let’s look at the sites that link here, according to Alexa. These are so out of date, that it’s obviously not been updated since the blog was a couple of months old. In fact, until I submitted a more recent image, Alexa had a screen shot of the blog that was well over a year old. (Yes, I know, that’s like saying “We don’t get enough spam here, please deluge us with as much as you can possibly manage”.) Maybe because of their age, the sites listed in some of these links are unrecognisable. In fact none of the blog links would be counted by Technorati, being over a year old, but then, it shows no links that Technorati counts (under 90 days.)
Let’s search for this blog on Google. Here, it’s wierder. There are few points of comparison between different Google results, if you repeat the search over a day or so. Maybe it’s just how Google treats blogs, but the post that comes up first is always the same one from a few months ago. Other posts can only be seen by asking for similar results, excluded the first time for being the same. Well, guess what Google, every post is different. It’s a blog. Lots of the other Google results for the blog are bits of the RSS feed. I’d like to think that lots of people are devouring the RSS feed, but, unfortunately, these tend to be link farms. In fact, lots of obscure references to the blog linkfarm sites turn up on Google, most being complete news to us. Real human-created references to the blog don’t turn up as often as they actually happen.
I could go on to the point where I was boring even myself.
None of this would matter if getting seen and indexed correctly wasn’t crucial to getting any visitors. I know that indexing engines and search engines are bomabarded with spammers trying every trick there is to get high on the first results page. The search engines have algorithms that are supposed to penalise sites and blogs that don’t match their definition of legitimate – density of keywords, number of inbound links, and so on. I believe that not only are these not working, they are often acting in exact reverse to their intentions.
Content from blogs get scraped and put into blag sites that exist just to spew out other people’s content. Google then decides the original source site has “duplicate” content and downranks it. How do you stop this without stopping legitimate blogs from commenting on your posts?
Keywords in the metatags don’t match teh keywords in the text? Well, duh, normal human beings aren’t thinking only of page rank. So they put keywords in their metatags then write content, without remembering to keep changing the metatags. Only people obsessed with search engine rankings do that and ,of course, a fair percentage of them aren’t just bloggers or normal website owners.
It’s not just a question of getting visitors. Anyone who wants to bring in revenue from their site or blog by displaying adverts gets judged by these bizarre standards. Some schemes base what they send you on your Alexa rating, which is itself derived from Google’s well-nigh arbitrary page rank . If you’ve ever tried to have GoogleAds on a site, you’ll see how abstract the GoogleAds process is. In fact, visitors who think they’re helping you pay for the site, so click a few times on your ads every time they visit will get you disqualified. Ditto, your rivals……. (It seems as if you get automatically disqualified anyway, at the very point that you might actually receive any revenue.)
I know it must be well nigh impossible to filter the enormous volume of material in the Internet, especially in the face of the number of spammers there are. However, there must be better ways of doing it. I am always amazed when people find things here and comment or email us about them. How do they manage to find it?
So here, is an unaccustomed prop for Technorati (unaccustomed for this blog, anyway, whioch has done its fair share of ranting about it). For all the irritating Technorati monster error messages and totally inconsistent service, Technorato remains the best performing indexing service that I’ve come across yet. The tags are really helpful when they work. You can still find an interesting read on someone’s first post. And Technorati isn’t yet totally under the sway of the giant players. The fabled Web 2.0 stuff really does still have something going for it.