On Attribution and Links

May 12th, 2008

As a blogger or site owner, you are keenly aware of the activities of other sites on the internet, especially in your field. In particular, you might notice when other sites properly credit your site for either original news or even finding a particularly unique link. This business, however, can get messy, with sites accusing other sites of wrongdoing or simply poor sportsmanship. As a result, proper attribution can become a big issue. It comes to my mind now as TheInquisitr wrote an excellent summary of their linking and attribution policy.

In some cases the offense is clear: exclusive content or screenshots are simply “lifted” and republished with no attribution given to you. Of course, this is the most offensive of actions, and fortunately, it is a relatively rare occurrence. Ironically, this used to be a bigger problem with mainstream news sites. As blogs were just emerging as a news source, I’d frequently see “legitimate” news sites reference emerging news with a wave-of-the-hands “word on the internet is…” and refuse to link to the original sources. Fortunately, this trend has dwindled as the line between blogs and media has blurred.

An equally deceptive, yet growing trend I’ve noticed is one where sites will rewrite original content and even include a link back to the original source, but hide the link in a way that makes it entirely un-obvious that the original site even exists.

Instead of writing “BlogXYZ.com writes…”, they might just report the news “We’ve heard that ABC is going to be great…” and then maybe link one obscure word later in the text back to the original source. While they’ve technically linked back to the source, the end result is the same as the first scenario, taking credit for the information.

I can’t really offer a great solution on this issue. I do know what the end result is: you are far less likely to link to them, which, in turn, over time, will result in them being far less likely to link to you. This circle tends to feed on itself.

So, in the end, play nice and attribute properly.

On Twitter…

May 11th, 2008

About a month ago, I joined Twitter.

My account: http://twitter.com/arnoldkim

I’d heard all the commotion about it for some time, but finally decided to join in on the fun. I’m still a relatively sporadic poster of Twitters, but find the conversation interesting and distracting. I think for all the hype about Twitter, the descriptions of it never really “sold” it very well.

For my part, the best way to describe it is as group instant messaging. For the old-timers, it feels most like sitting in an IRC channel all day long.

If you have any interest in topics I post about in this blog, feel free to follow me on Twitter.

Application Icons and Domain Names

May 9th, 2008

Years ago, Apple published a developer magazine. I don’t even remember the name of it, but it covered various topics on programming for the Mac or Apple II, and it would also occasionally have humor articles. One in particular stuck with me.

The author said that when you are getting ready to start developing your application, the single most important thing to do is develop a killer icon. The desktop icon could make or break your application and it really should be your first priority.

As humorous a suggestion as it was, I think what I found most amusing was that there was a slight bit of truth to it… or at least it didn’t come from that ridiculous a place in the mind of the developer.

I’ve long thought the web-developer equivalent to that is the domain name. Like most web-folk, I have a lot of ideas for websites on a regular basis. Some are just passing thoughts, others, I might sit down and consider developing at some point in the future. Once I feel serious enough about an idea, my first priority is to find the perfect domain name. Without the perfect domain name, it’s very hard to make any further progress on the idea.

Like the icon, it’s the centerpiece of the idea, and can make or break the entire project. I don’t believe everyone necessarily thinks like this, though. I’ve heard stories of people just picking a domain and planning on changing it later, or maybe even growing into it… but that sounds crazy. The domain is the brand, and if I don’t feel like it represents the site or idea perfectly (or as perfectly as I can afford), then I have a hard time proceeding.

Anyhow, it’s this line of thinking and my unnatural love of domains that has caused me to become a domain name hoarder, with numerous names in my portfolio waiting to be developed.

Compete.com’s Margin of Error

March 15th, 2008

When we try to figure out the growth of a site, or the relative traffic between sites, we often turn to the only free public tools available for the job.

That would be Compete.com and Alexa.com. Both are known to be inaccurate, though Alexa perhaps more notoriously so.

The reason for their inaccuracy is the method of their tracking. Alexa relies on traffic stats from their toolbar that users must install. So, it’s not a representative cross section of the internet. Compete incorporates data from ISPs, toolbar users, and opt-in panels. Despite their added efforts, I’ve found their numbers to also be way off.

Finally, there’s Quantcast, another traffic estimator service. I’ve personally found their estimates to be equally inaccurate. However, unique to their service is the ability for site owners to place actual Quantcast tracking tags on their site so their traffic is directly measured. Once they collect enough data, they will display (if you choose to opt in) the actual traffic stats for all to see.

What this means is we can compare directly measured traffic stats to Compete’s estimated traffic stats (which are also reported in uniques/month) to see what Compete’s margin of error can be across different sites.

Based on a small sample of websites (n=25), Compete’s estimates predict from 18.6% to 137% of a site’s actually measured traffic stats.

See this graph:

traffic.jpg

Explanation of Columns

Worldwide Uniques
  Actually measured Worldwide traffic via Quantcast

U.S. Uniques
  Actually measured U.S. traffic via Quantcast

Compete U.S. Uniques
  Compete’s estimate based on their aggregate data.

% of Actual world
  Compete’s U.S. estimate as a percentage of actually measured Worldwide traffic

% of Actual US
  Compete’s U.S. estimate as a percentage of actually measured U.S. traffic

Now, to be fair, Compete only claims to offer estimates of U.S. monthly uniques, but I’ve included Worldwide uniques to point out how deceptive this can be for sites with a large international audience, such as Hi5.com. Compete’s U.S. estimate only counts 3.13% of Hi5’s actual worldwide traffic. Even when comparing U.S. numbers, Compete still underestimates Hi5’s U.S. traffic by over 50%.
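The two percentage columns are simple ratios, and a minimal sketch in Python may make that concrete. The function name `percent_of_actual` and the traffic figures below are hypothetical, made up purely for illustration; they are not taken from the graph or from any site in the sample.

```python
def percent_of_actual(estimate: float, actual: float) -> float:
    """Return an estimate expressed as a percentage of the measured value."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return estimate / actual * 100.0


# Hypothetical monthly uniques for one site (illustration only,
# NOT real Quantcast or Compete numbers).
site = {
    "worldwide_uniques": 2_000_000,   # measured worldwide traffic
    "us_uniques": 1_000_000,          # measured U.S. traffic
    "compete_us_uniques": 250_000,    # estimated U.S. traffic
}

pct_world = percent_of_actual(site["compete_us_uniques"], site["worldwide_uniques"])
pct_us = percent_of_actual(site["compete_us_uniques"], site["us_uniques"])

print(f"% of Actual world: {pct_world:.2f}%")
print(f"% of Actual US: {pct_us:.2f}%")
```

A value under 100% means the estimate is lower than the measured traffic, and a value over 100% means it is higher. For a site with a large international audience, the worldwide percentage will always be smaller than the U.S. one, since the same U.S. estimate is divided by a bigger number.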

Worst Estimates (U.S.)

In our small sample, the most underestimated sites were:

MacRumors.com - 18% of Actual U.S. traffic
Wonkette.com - 22% of Actual U.S. traffic
FunnyorDie.com - 24.8% of Actual U.S. traffic
icanhascheezburger.com - 24.94% of Actual U.S. traffic
Gizmodo.com - 27.97% of Actual U.S. traffic
BoingBoing.net - 32.08% of Actual U.S. traffic
Propeller.com - 35.16% of Actual U.S. traffic

Best Estimates (U.S.)

These sites’ estimates were not that far off, and in some cases overestimated actual numbers.

Whateverlife.com - 76.2% of Actual U.S. traffic
TechCrunch.com - 76.2% of Actual U.S. traffic
Hotornot.com - 81.67% of Actual U.S. traffic
Gigaom.com - 93.70% of Actual U.S. traffic
Slide.com - 111.18% of Actual U.S. traffic
Wikia.com - 112.79% of Actual U.S. traffic
Digg.com - 137.01% of Actual U.S. traffic

Conclusions

I’m not sure if many conclusions can be drawn from this data alone, but it just shows that traffic stats estimates can be very deceptive. I was somewhat surprised that some sites’ estimates were actually greater than their actual stats.