bots – A variety of (dis)information

Are you using bit.ly stats to measure interest in the links you post on twitter?

I’ve been hearing for a while about people claiming to get the majority of their traffic originating from twitter these days.

Now, I’ve been playing with the twitter ruby gem recently, doing various experiments which I’ll not go into detail here because they could be regarded as spamming… if I’d conduct them on a large scale, that is.
It’s scary to see people actually engaging with @replies crafted with some regular expressions and eliza-like trickery on status updates found using the twitter api. I’m wondering how Twitter is going to contain the coming spam-flood.

When posting links I used bit.ly as url shortener, since this one seems to be the de-facto standard on twitter. A nice thing about bit.ly is that it shows some basic stats about the redirects it performs for your shortened links.

To my surprise, most links posted almost immediately resulted in several visitors. Now, seeing that I was posting the links together with some information concerning what the link is about, I concluded that the people who were actually clicking the links should be very targeted visitors.
This felt a bit like free adwords, and I suddenly started to understand why everyone was raving about getting traffic from twitter.

How wrong I was! (and I think several 1000 online marketers with me)

On the destination site I used a traffic logging solution that works by including a little javascript snippet in your pages. It seemed that somehow all visitors disappeared after the bit.ly redirect and before getting to the site, because I was hardly seeing any visitors there. So I started investigating what was happening: by looking at the logfiles of the destination site, and by making my own ‘shortened’ urls by doing redirects using a very short domain name I own. This way, I could check the apache access_log before the redirects.

Most user agents turned out to be bots without a doubt. Here’s an excerpt of user-agents awk’ed from apache’s access_log for a time period of about one hour, right after posting some links:

AideRSS 2.0 (postrank.com) Java/1.6.0_13 Java/1.6.0_14 libwww-perl/5.816 MLBot (www.metadatalabs.com/mlbot) Mozilla/4.0 (compatible;MSIE 5.01; Windows -NT 5.0 - real-url.org) Mozilla/5.0 (compatible; Twitturls; +http://twitturls.com) Mozilla/5.0 (compatible; Viralheat Bot/1.0; +http://www.viralheat.com/) Mozilla/5.0 (Danger hiptop 4.6; U; rv:1.7.12) Gecko/20050920 Mozilla/5.0 (X11; U; Linux i686; en-us; rv:1.9.0.2) Gecko/2008092313 Ubuntu/9.04 (jaunty) Firefox/3.5 OpenCalaisSemanticProxy PycURL/7.18.2 PycURL/7.19.3 Python-urllib/1.17 Twingly Recon twitmatic Twitturly / v0.6 Wget/1.10.2 (Red Hat modified) Wget/1.11.1 (Red Hat modified)

Of the few user-agents that seem ‘real’ at first, half are originating from an ip-address used by Amazon EC2. And I doubt people are setting op proxies on there.

Oh yeah, Googlebot (the real deal, from a legit google owned address) is sucking up posted links like fresh oysters.
I guess google is trying to make sure in advance to never be beaten by twitter in the ‘realtime search’ department. Actually, I think it’d be almost stupid NOT to post any new pages/posts/websites on Twitter, it must be one of the fastest ways to get a Googlebot visit.

Same experiment with a real, established twitter account

Now, because I was posting the url’s either as ‘status’ messages or directed @people, on a test-account with hardly any (human) followers, I checked again using the twitter accounts from a commercial site I’m involved with. These accounts all have between 500 and 1000 targeted (I think) followers. I checked the destination access_logs and also added ‘my’ redirect after the bit.ly redirect: same results, although seemingly a bit higher real visitor/bot ratio.

Btw: one of these account was ‘punished’ with a 1 week lock recently because the same (1 one!) status update was sent that was sent right before using another account. They got an email explaining the lock because the account didn’t act according to their TOS. I can’t find anything in their TOS about it, can you?
I don’t think Twitter is on the right track punishing a legit account, knowing the trickery I had been doing with it’s api went totally unpunished. I might be wrong though, I often am.

On the other hand: this commercial site reported targeted traffic and actual signups from visitors coming from Twitter. The ones that are really real visitors are also very targeted. I’m just not sure if the amount of work involved could hold up against an adwords campaign.

Reposting the same link over and over again helps

On thing I noticed: It helps to keep on reposting the same links with regular intervals.
I guess most people only look at their first page when checking out recent posts of the ones they’re following, or don’t look too far back when performing a search.

Now, this probably isn’t according to the twitter TOS. Actually, it might be spamming but no-one is obligated to follow anyone else of course.

This way, I was getting more real visitors and less bots. To my surprise (when my programmer’s hat is on) there were still repeated visits from the same bots coming from the same ip-addresses. Did they expect to find something else when visiting for a 2nd or 3rd time? (actually,this gave me an idea: you can’t change a link once it’s posted, but you can change where it redirects to)
Most bots were smart enough not to follow the same link again though.

Are you successful in getting real visitors from Twitter?
Are you only relying on bit.ly to provide traffic stats?

Recent Posts

Categories

Archives

Tag: bots

Twitter traffic might not be what it seems

Same experiment with a real, established twitter account

Reposting the same link over and over again helps