Content Scraping or Aggregation?

First, some definitions, in my own words:

Content Scraping

A largely illegal and often insidious attempt to steal, harvest or duplicate information from another website without permission, in order to avoid the effort of creating original content by hand and to generate income, usually from ads.

Scraper sites typically do not acknowledge the author or location of the content they scrape.

Content Aggregation

A gathering and organising of relevant information from various websites into one central place, usually by importing RSS feeds. Such sites are helpful to visitors because they bring all the required information together without the need to trawl the net. Aggregation sites tend not to be heavy on ads or commercial content.

Aggregation sites should acknowledge the author and location of the content they import.

Google has recently said that it will downgrade ‘content farm’ type websites where most of the content is not original and is automated in some way. I worry that they may not make a distinction between the above two types.


SEO Heading Structure for your Site/Blog

I’ve just recently re-coded my main site and blog to take heading tags into account. I thought they were set up reasonably well to begin with, but an article from WordPress guru Joost de Valk (Yoast) made me have another look. I’d like to try to paraphrase his article here and simplify it so it’s a bit easier to digest, both for me and for you.

Basically, you can endear yourself to Google and the other search engines a little more if you write your markup semantically. That means being tidy, adding code hints and, perhaps most importantly, applying the correct formatting and heading tags to the content you want emphasised most and least. The idea is to put the most important keywords on the page in your H1 heading, the next most important in your H2s, and so on. When the Google bot visits your page it can then see at a glance, so to speak, which areas matter most and hopefully index them accordingly.

There should be only one H1 tag on a page, and it should be your page title, blog title, business name, etc. Your H2s might be the titles of the individual sections on the page, or your article titles if you have a blog. H3s would be sub-headings, H4s might be sidebar headings, and so on.

It’s important to style your headings so that people too (not just Google bots!) can easily scan through longer pages of text and pick out the important parts. It’s equally important that the heading tags actually contain valuable keywords. There’s no point having headings if you don’t follow both rules. If you do it correctly, your page will be nicely ‘outlined’ for both search engines and real people.
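As a rough illustration of the outline described above, here is a minimal sketch of how the headings on a blog page might be marked up. The titles, the `content` and `sidebar` ids and the overall layout are placeholders I’ve made up for the example, not the markup from my own sites.

```html
<!-- Hypothetical blog page: all titles and ids are placeholders -->
<body>
  <!-- Only one H1 per page: the page, blog or business name -->
  <h1>Example Web Design Blog</h1>

  <div id="content">
    <!-- H2: a section title, or an article title on a blog -->
    <h2>Speed Up Your Website for Google</h2>
    <p>Intro text...</p>

    <!-- H3: a sub-heading inside the article -->
    <h3>Compressing via Gzip</h3>
    <p>Details...</p>
  </div>

  <div id="sidebar">
    <!-- H4: a sidebar heading -->
    <h4>Recent Posts</h4>
  </div>
</body>
```

Read from top to bottom, the headings alone should give a sensible outline of the page, which is a handy way to check your own structure.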

Here are a couple of screenshots from my main site and blog to explain things better and show how I’ve personally set things up.

Main Site:

[Screenshot: Main Site Headings]

Blog Headings:

[Screenshot: Blog Headings]


Speed Up Your Website for Google

Since Google and other search engines take the size and download speed of your site into account (among many other factors) when deciding where to rank you in results, it makes sense to make sure it’s fast! It will also give your impatient visitors a much better experience. Here are a few things you should consider doing:

  • Make sure your hosting server is decent/fast.
  • Build your site with CSS and DIVs rather than with tables.
  • Be efficient, tidy and semantic with your HTML and CSS code.
  • Validate your code.
  • Compress all images as much as possible.
  • Avoid Flash/video/audio files embedded on the home page.
  • Don’t use too many unnecessary fancy scripts or widgets just to show off, e.g. Facebook, Live Chat, Google Gadgets, etc.
  • Merge your external CSS files into one (the fewer external file requests the better).
  • Merge your script files into one (as above).
  • Place script calls at the footer of your page so they load last (there is a WordPress plugin for this).
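To show what the last three points in the list above look like in practice, here is a minimal sketch of a page skeleton. The file names `site.min.css` and `site.min.js` are hypothetical; they stand in for whatever merged files you produce yourself.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example page</title>
  <!-- One merged stylesheet instead of several separate CSS files (hypothetical file name) -->
  <link rel="stylesheet" href="/css/site.min.css">
</head>
<body>
  <h1>Example page</h1>
  <p>The page content can render before any script is fetched.</p>

  <!-- One merged script, placed just before the closing body tag so it loads last (hypothetical file name) -->
  <script src="/js/site.min.js"></script>
</body>
</html>
```

Each separate CSS or script file costs the browser an extra request, so merging them and moving scripts to the bottom lets the visible content appear sooner.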

Compressing via Gzip:

I’ve just done this belatedly for both my main static site and my WordPress blog. To turn on Gzip for a static PHP page for example, I’ve used this code:

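<?php
// Use gzip output compression if the browser says it accepts it; otherwise buffer output normally.
if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) && substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start('ob_gzhandler'); else ob_start();
?>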

Your server and the visitor’s browser both need to support compression for the above code to work. Test whether your site is compressed or not here:

For my WordPress blog I’ve used the WP HTTP Compression plugin. There are a few excellent caching plugins for WordPress, such as WP Super Cache and W3 Total Cache, which both handle compression too, but I found them a bit over the top and they played havoc with my auto-publishing to Twitter and Facebook.

Try this website page speed tester to see where you’re at now!


Twitter Poll: About People or Search Engines?

This post was prompted by something I read on an email list and on LinkedIn the other day, where the author seemed to take the view that Twitter is more about search engines than people. His argument was that Twitter is far too ‘noisy’ for people to actually read and interact, but that Google could find its way around Twitter a lot more easily and index content from it. In the author’s eyes it was therefore good practice for businesses to auto-feed to Twitter in the hope of getting indexed, and to neglect the social aspect altogether. Here is the author’s summation at the end of his article: Continue reading Twitter Poll: About People or Search Engines?