Mark Waterfield: It’s scary how many ways SEO can go wrong

We’ve all had those moments of absolute terror where we just want to crawl into the fetal position, cry and pretend the problem doesn’t exist. Unfortunately, as SEOs, we can’t stay this way for long. Instead, we have to suck it up and quickly resolve whatever went terribly wrong.

There are moments you know you messed up, and there are times a problem can linger for far too long without your knowledge. Either way, the situation is scary — and you have to work hard and fast to fix whatever happened.

Things Google tells you not to do

Google warns against many practices in its Webmaster Guidelines:

Automatically generated content
Participating in link schemes
Creating pages with little or no original content
Cloaking
Sneaky redirects
Hidden text or links
Doorway pages
Scraped content
Participating in affiliate programs without adding sufficient value
Loading pages with irrelevant keywords
Creating pages with malicious behavior, such as phishing or installing viruses, trojans or other badware
Abusing rich snippets markup
Sending automated queries to Google

Unfortunately, people can convince themselves that many of these things are okay. They think spinning text to avoid a duplicate content penalty that doesn’t exist is the best option. They hear that “links are good,” and suddenly they’re trying to trade links with others. They see review stars in the SERPs and fake their own with markup so that their listings stand out too.

None of the above are good ideas, but that won’t stop people from trying to get away with something or simply misunderstanding what others have said.

Crawl and indexation issues

User-agent: *
Disallow: /

That’s all it takes — two simple lines in the robots.txt file to completely block crawlers from your website. Usually, it’s a mistake from a dev environment, but when you see it, you’ll feel the horror in the pit of your stomach. Along with this, if your website was already indexed, you’ll typically see in the SERPs:

A description for this result is not available because of this site's robots.txt
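A check like this is easy to automate before a launch. The sketch below is a minimal illustration, assuming Python's standard-library robots.txt parser is a close enough stand-in for how a crawler reads the file. The function name blocks_everything and the example.com host are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def blocks_everything(robots_txt: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt text disallows the site root for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path is matched against the rules.
    return not parser.can_fetch(user_agent, "https://example.com/")


# The two lines from a forgotten dev environment.
staging_rules = "User-agent: *\nDisallow: /\n"
# An empty Disallow value blocks nothing.
open_rules = "User-agent: *\nDisallow:\n"

print(blocks_everything(staging_rules))  # True
print(blocks_everything(open_rules))     # False
```

Running a test like this against the production robots.txt after every deployment would catch the mistake before it shows up in the SERPs.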

Then there’s the noindex meta tag, which keeps any page that carries it out of the index. Unfortunately, it can often be enabled for your entire website by ticking a single box. It’s an easy enough mistake to make and painful to overlook.
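One way to avoid overlooking it is to scan page HTML for the tag. The following is a rough sketch using only Python's standard library. The names RobotsMetaFinder and has_noindex are hypothetical, and the check only covers meta tags, not an X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a robots (or googlebot) meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() not in ("robots", "googlebot"):
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))  # True
print(has_noindex('<head><meta name="robots" content="index, follow"></head>'))    # False
```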

Even more fun is a UTF-8 BOM. Glenn Gabe had a great article on this where he explained it this way:

BOM stands for byte order mark and it’s used to indicate the byte order for a text stream. It’s an invisible character that’s located at the start of a file (and it’s essentially meaningless from an SEO perspective). Some programs will add the BOM to a text file, which … can remain invisible to the person creating the text file. And the BOM can cause serious problems when Google tries to read the file. …

[W]hen your robots.txt file contains the UTF-8 BOM, Google can choke on the file. And that means the first line (often user-agent), will be ignored. And when there’s no user-agent, all the other lines will return as errors (all of your directives). And when they are seen as errors, Google will ignore them. And if you’re trying to disallow key areas of your site, then that could end up as a huge SEO problem.
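Because the BOM is invisible in most editors, the reliable way to find it is to look at the raw bytes. Here is a minimal sketch in Python, assuming you have the file's bytes in hand. The helper names and the sample rules are illustrative only.

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """True if the byte string starts with the UTF-8 byte order mark (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return raw[len(codecs.BOM_UTF8):] if has_utf8_bom(raw) else raw


# Sample robots.txt bytes as an editor might save them, BOM included.
sample = codecs.BOM_UTF8 + b"User-agent: *\nDisallow: /private/\n"

print(has_utf8_bom(sample))                   # True
print(has_utf8_bom(strip_utf8_bom(sample)))   # False
```

Read the file in binary mode when checking it, since some text-mode decoders (such as Python's utf-8-sig codec) silently drop the BOM.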

Also of note: Just because a large portion of your traffic comes from the same IP addresses doesn’t mean it’s a bad thing. A friend of mine found this out the hard way after he ended up blocking some of the IP addresses Googlebot uses while being convinced those IPs were up to no good.
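Before blocking a busy IP address, it is worth checking whether it really belongs to Googlebot. The approach Google documents is a reverse DNS lookup followed by a forward lookup that must return the same IP. The sketch below follows that idea under some assumptions: the domain suffixes are the ones I understand Google to list for Googlebot, the function name is made up, and the lookups are injectable only so the logic can be exercised without network access.

```python
import socket

# Assumed hostname suffixes for Googlebot; check Google's documentation for the current list.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the domain, then confirm the hostname resolves back to ip."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    # gethostbyname_ex only returns IPv4 addresses; IPv6 would need getaddrinfo.
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_CRAWLER_DOMAINS):
            return False
        # The forward lookup guards against spoofed reverse DNS records.
        return ip in forward_lookup(host)
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror).
        return False
```

Calling is_verified_googlebot with just an IP address from your logs performs real DNS lookups. An address that fails the check is not necessarily malicious, but one that passes should not be blocked.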

Another horrific situation I’ve run into was when someone had the bright idea to block crawlers to get pages out of the index after a subdomain migration. This is never a good idea, as crawlers need to be able to access the old versions and follow the redirects to the new versions. It was made worse by the fact that the robots.txt file was actually shared by both subdomains, and crawlers couldn’t see either the old or the new pages because of this block.

Manual penalties

Just hearing the word “penalty” is scary. It means you or someone associated with the website did something wrong — very wrong! Google maintains a list of common manual actions:

Hacked site
User-generated spam
Spammy freehosts
Spammy structured markup
Unnatural links to your site
Thin content with little or no added value
Cloaking and/or sneaky redirects
Cloaking: First Click Free violation
Unnatural links from your site
Pure spam
Cloaked images
Hidden text and/or keyword stuffing

Many of these penalties are well-deserved, where someone tried to take a shortcut to benefit themselves. With Penguin now operating in real time, I expect a wave of manual penalties very soon.

A recent scary situation was a new one to me. A company had decided to rebrand and migrate to a new website, but it turned out the new website had a pure spam penalty.

Unfortunately, because Google Search Console wasn’t set up in advance of the move, the penalty was only discovered after the migration had happened.

Oops, I broke the website!

One character is all it takes to break a website. One bad piece of code, one bad setting in the configuration, one bad redirect or plugin.

I know I’ve broken many websites over the years, which is why it’s important to have a backup before you make any changes. Or better yet, set up a staging environment for testing and deployment.

Rebuilding a website

With any new website, there are many ways for things to go horribly wrong. I’m always scared when someone tells me they just got a new website, especially when they tell me after it’s already launched. I get this feeling in the pit of my stomach that something terrible just happened, and usually I’m right.

The most common issue is redirects not being done at all, or developers arguing that redirects aren’t necessary or that too many redirects will slow down the website. Another common mistake I see is killing off good content. Sometimes these are city pages or pages about the company’s services; sometimes an entire domain, with all of its information, is redirected to a single page.
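Both problems can be caught before launch by comparing the old site's URL list against the planned redirect map. The following is a small sketch of that comparison, assuming the map is available as a simple old-path to new-path dictionary. The function names, the threshold and the sample paths are all invented for illustration.

```python
from collections import Counter


def missing_redirects(old_urls, redirect_map):
    """Old URLs that have no entry in the redirect map."""
    return [url for url in old_urls if url not in redirect_map]


def crowded_targets(redirect_map, threshold=3):
    """Targets that receive `threshold` or more redirects, with their counts."""
    counts = Counter(redirect_map.values())
    return {target: n for target, n in counts.items() if n >= threshold}


# Made-up example: three city pages collapsed into the home page, one page forgotten.
old_urls = ["/raleigh/", "/durham/", "/cary/", "/services/seo/"]
redirect_map = {"/raleigh/": "/", "/durham/": "/", "/cary/": "/"}

print(missing_redirects(old_urls, redirect_map))  # ['/services/seo/']
print(crowded_targets(redirect_map))              # {'/': 3}
```

A crowded target is not always wrong, but it is a good prompt to ask whether useful content is being thrown away.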

Problems range from very old ones that still exist, like putting all text in images, to more modern ones like “We just rebuilt our website in Angular” when there was no reason to ever use Angular.

Overwrote the file

Overwritten files scare me the most when it’s a disavow file, especially when no copy is made and the default action happens to be overwrite, or an .htaccess file, where redirects can easily be lost. I’ve even had shared hosts overwrite .htaccess files, and of course, no email is ever sent about the changes.
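If you script changes to files like these, a timestamped copy costs almost nothing. The sketch below is one possible way to do it in Python. backup_then_write is a hypothetical helper, the redirect rule is sample data, and the demo runs in a temporary directory rather than against a real site.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_then_write(path, new_text):
    """Copy the existing file to a timestamped .bak file, then overwrite it."""
    target = Path(path)
    backup = None
    if target.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        # Two writes within the same second would reuse the same backup name.
        backup = target.with_name(f"{target.name}.{stamp}.bak")
        shutil.copy2(target, backup)
    target.write_text(new_text, encoding="utf-8")
    return backup


# Demo in a throwaway directory with a sample redirect rule.
with tempfile.TemporaryDirectory() as tmp:
    htaccess = Path(tmp) / ".htaccess"
    htaccess.write_text("Redirect 301 /old /new\n", encoding="utf-8")
    backup = backup_then_write(htaccess, "# rules replaced by a plugin\n")
    saved_rules = backup.read_text(encoding="utf-8")
    current_rules = htaccess.read_text(encoding="utf-8")

print(saved_rules)  # the original redirect survives in the backup
```

This does nothing about a host overwriting the file behind your back, so keeping a copy outside the server is still a good idea.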

I don’t even know

Over the years, I’ve seen some really random and terrible things happen.

I’ve seen people lose their domain because it expired or because they unknowingly signed a contract that said they didn’t own the domain. I’ve seen second and even third websites created by other marketing companies.

There are times when canonical tags are used incorrectly or just changed randomly. I’ve seen all pages canonicalized to the home page or pages with a canonical set to a different website.

I’ve seen simple instructions that sounded like a good idea, like “make all links relative path,” end in disaster when the canonical URLs were made relative too, along with the links to alternate versions of the website, such as m. and hreflang alternate tags.
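The canonical mistakes described above are simple enough to flag automatically. Here is a rough sketch that checks a single page URL against its canonical href. The function name, the problem labels and the example.com URLs are assumptions for illustration, and a flagged canonical is a prompt for review rather than proof of an error, since a cross-host canonical can be intentional.

```python
from urllib.parse import urlsplit


def canonical_problems(page_url, canonical_href):
    """Return a list of labels describing suspicious canonical values."""
    page = urlsplit(page_url)
    canon = urlsplit(canonical_href)
    # Without a scheme and host, the canonical resolves against whatever host served the page.
    if not canon.scheme or not canon.netloc:
        return ["relative canonical"]
    problems = []
    if canon.netloc.lower() != page.netloc.lower():
        problems.append("canonical points to a different host")
    elif canon.path in ("", "/") and page.path not in ("", "/"):
        problems.append("canonical points to the home page")
    return problems


print(canonical_problems("https://m.example.com/services/", "/services/"))
# ['relative canonical']
print(canonical_problems("https://www.example.com/services/", "https://www.example.com/"))
# ['canonical points to the home page']
```

The first case shows why relative canonicals hurt on an m. subdomain: the path resolves to the mobile host itself instead of pointing at the main version.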

SEO is scary

It’s amazing how one little thing or one bad decision can be so costly and scary. Remember to follow the rules, plan, execute and QA your work to prevent nightmares. Share your own tales of horror with me on Twitter @patrickstox.

Some opinions expressed in this article may be those of a guest author and not necessarily Search Engine Land.

About The Author

Patrick Stox is an SEO Specialist for IBM and an organizer for the Raleigh SEO Meetup, the most successful SEO Meetup in the US.

Mark Waterfield

Thursday, 27 October 2016
