The Broken Link Building Bible

Posted by russvirante

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

Broken link building may be the most effective white-hat link building strategy in years. In particular, broken link building is appealing because the success of the campaign is directly proportional to how much good you do for the web. You profit only if you create good content to replace lost or abandoned content that webmasters still want to link to. This is the type of strategy that marries so many of the competing interests of our industry: content vs. links, link earning vs. link building, inbound vs. outbound, etc.

Below, I attempt to organize as much as I know about broken link building tactics. Throughout the piece I mention tools that will help you make the broken link building process scalable and less monotonous. Let’s begin.

Table of Contents

Overview
Prospecting
  Resource Page Targeting w/ Keywords
    Selecting Keywords
    Prospecting Phrases
    Scraping Search Results
    Extracting URLs
    Header Checks
    Opportunity Qualification
    Prospecting Tools
  Resource Page Targeting w/ Model URL
    Site Selection
    Backlink Acquisition
    Extracting URLs
    Header Checks
    Opportunity Qualification
    Prospecting Shortcuts
  Direct URL Targeting
    Site Crawling
    Opportunity Selection
Content Creation
  Rebuilding Tools
  Raised Expectations
Outreach
  Contact Finding
  Email Templates
Conclusions & Community
Credits

Overview

Broken link building is a link building tactic where a marketer contacts a webmaster who has a broken link on his/her site and recommends one or more alternatives that include his/her target site. For the purposes of this piece, we will use a pediatrician in Raleigh, NC as an example client.

Prospecting

The first step in any broken link building campaign is to find relevant dead pages.
However, there are different methods of prospecting depending upon the broken link building strategy you are employing. There are essentially three types of broken link building strategies:

Resource Page Targeting with Keywords
Resource Page Targeting with URLs
Direct URL Targeting

We will cover each of these in the prospecting section. I will mention multiple tools throughout this post and will give descriptions of all of them at the end.

Keyword Based

The keyword-based method is the most common and, in my opinion, the most straightforward method of broken link building. The method involves searching Google for keywords relevant to your site’s interests, finding resource pages that link to content related to your keywords, extracting all the links from those resource pages, finding missing pages among those links, and finally qualifying those opportunities.

Select Prospecting Keywords: Like so many things in SEO, we begin with keyword selection. A successful broken link building campaign lives and dies by the keywords used. There are a couple of characteristics we want to look for in an ideal keyword.

Categorically relevant: This characteristic seems obvious. The prospecting keywords need to be relevant. However, they don’t necessarily have to be relevant to your product, like the key phrase “health resources.” The keywords could be relevant to your audience (“resources for kids”) or your geography (“Raleigh resources”). Remember, you are finding resource pages with these keywords; you are not finding the final targets. You want to cast a wide net, which leads to…

Generally broad: This is where most campaigns fail. Our mock client is unlikely to find any resource pages for the keyword “raleigh nc pediatrician resources,” much less any with good link opportunities. You should choose key phrases that describe categories your company might fall into, rather than the specific term.
Prospecting Phrases: Once you have identified your keywords, you will want to pair them with prospecting phrases. These are searches to use in Google or Bing to find relevant resource and links pages, like “intitle:resources” or “inurl:links.” Below is a list of prospecting phrases you can use to help find relevant linking pages.

site:.gov, links, resources
intitle:links, intitle:resources, intitle:sites, intitle:websites
inurl:links, inurl:resources, inurl:sites, inurl:websites
"useful links", "useful resources", "useful sites", "useful websites"
"recommended links", "recommended resources", "recommended sites", "recommended websites"
"suggested links", "suggested resources", "suggested sites", "suggested websites"
"more links", "more resources", "more sites", "more websites"
"favorite links", "favorite resources", "favorite sites", "favorite websites"
"related links", "related resources", "related sites", "related websites"
intitle:"useful links", intitle:"useful resources", intitle:"useful sites", intitle:"useful websites"
intitle:"recommended links", intitle:"recommended resources", intitle:"recommended sites", intitle:"recommended websites"
intitle:"suggested links", intitle:"suggested resources", intitle:"suggested sites", intitle:"suggested websites"
intitle:"more links", intitle:"more resources", intitle:"more sites", intitle:"more websites"
intitle:"favorite links", intitle:"favorite resources", intitle:"favorite sites", intitle:"favorite websites"
intitle:"related links", intitle:"related resources", intitle:"related sites", intitle:"related websites"
inurl:"useful links", inurl:"useful resources", inurl:"useful sites", inurl:"useful websites"
inurl:"recommended links", inurl:"recommended resources", inurl:"recommended sites", inurl:"recommended websites"
inurl:"suggested links", inurl:"suggested resources", inurl:"suggested sites", inurl:"suggested websites"
inurl:"more links", inurl:"more resources", inurl:"more sites", inurl:"more websites"
inurl:"favorite links", inurl:"favorite resources", inurl:"favorite sites", inurl:"favorite websites"
inurl:"related links", inurl:"related resources", inurl:"related sites", inurl:"related websites"
list of links, list of resources, list of sites, list of websites, list of blogs, list of forums

Search Results Scraping: You now have the arduous task of finding all the results for all these prospecting phrases. Google is not fond of automated requests, so you have a couple of choices. You can complete the task by hand and use the MozBar to extract results, you can use a SERP scraping tool and risk Google’s ire, or you could look into using the Bing API, which would necessitate changing many of the search operators in the above list of prospecting phrases. Ultimately, you will want to pull down the top 100 results for each of the prospecting phrases you use. You will have quite a bit of crossover, so you will want to de-dupe those lists. You can use Virante’s free “Duplicate Deleter” tool to accomplish this, or you can simply use Excel’s remove duplicates function.

Link Extraction: Once you have a culled list of potential “linking pages,” you need to extract every external link from these pages and begin the process of finding all the 404s. You can also combine this step with the 404 header check using a tool like Domain Hunter Plus or Check My Links.

Link extraction: webmaster-toolkit.com, iwebtool.com, code.google.com
Link extraction and 404 header check: Domain Hunter Plus, Check My Links

404 / Error Checking: Once you have extracted all the links, you will have to check the headers on each link to determine whether or not they are 404s, our ultimate target. If you used Domain Hunter Plus or Check My Links, you can skip this process. The easiest way to do this is with a simple HTTP status code checker. There is a free bulk tool for this. Just copy and paste all your URLs into it, without the http://, and it will find all the 404s for you.
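For readers who would rather script the link extraction and 404 check than paste URLs into a web tool, the two steps can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not any of the tools named above. The function names (`external_links`, `head_status`, `find_404s`) are made up for this example, and it assumes the target servers answer HEAD requests, which not all of them do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def external_links(page_url, html):
    """Return de-duplicated absolute links that point off the page's own host."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc.lower()
    seen, result = set(), []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parts.netloc.lower() == page_host:
            continue  # internal link, not a prospect
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


def head_status(url, timeout=10):
    """Return the HTTP status code of a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 410, 500, ...
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def find_404s(urls, status_of=head_status):
    """Keep only the URLs whose status code is exactly 404."""
    return [url for url in urls if status_of(url) == 404]
```

Passing the status function in as a parameter makes it easy to add throttling or to swap in a GET-based check for servers that reject HEAD. Note that this sketch treats `www.example.com` and `example.com` as different hosts, and that URLs returning `None` (dead domains) may be worth a separate look.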
Opportunity Qualification: There are two things you will want to determine about each potential opportunity to vet them for quality: relevance and backlinks.

Backlink acquisition: Once you have found a set of 404 pages, you now have to filter them to determine which are actually strong targets. The more backlinks pointing to a 404 page, the more opportunities you have for link replacement. These linking domains will be the sites you contact to replace the broken link with your own. There are several ways to do this, but the easiest at the moment is likely Majestic SEO’s bulk backlink checker. Remember, at this point you are just trying to get an idea of which pages have the most links so you can ignore those with very few. This will limit the amount of time you spend checking relevance.

Relevance analysis: You have now filtered your list of 404 opportunities to those with a good number of unique linking domains. Let’s say that number is 50 or more. You now have to determine the relevancy of that content. You can do that a few ways:

Visit the Wayback Machine to find archived copies of the URL. If the page is well linked and did not block web crawlers, you should be able to find the content here.
If this is not available, you can look at the anchor text of the links pointing to the page. You can use SEOmoz Open Site Explorer to get an export of the anchor text.
You can look at the URL itself for hints as to how relevant the content would be.
You can visit the linking pages to see if those links have descriptions of what the previous content was.

Prospecting Shortcuts: There are two tools that you could use to jump over a lot of these steps.

Broken Link Index (brokenlinkindex.com): This tool by iAcquire allows you to find tons of potential 404 pages from their gigantic database of opportunities. Unfortunately, all of the link qualifications have to be done one at a time, although you could export the list and automate the process if you are savvy.

Broken Link Builder (brokenlinkbuilding.com): This tool by CitationLabs is not free, but allows you to perform all of the actions above in an automated fashion. Just type in your keywords and it performs all of the steps above, from finding opportunities to qualifying them based on links and relevance. This is by far the most robust broken link building tool currently available and a huge time saver.

Resource Page Targeting w/ Model URL

Unlike using keywords, this method starts with a known site and mines its backlinks for relevant resource pages that, in turn, produce broken link building opportunities.

Site / URL Selection: This is by far the most important part of the process. Choosing the right site will make or break this strategy. I do want to give a nod to Garrett French for pointing this method out to me a few months ago. There are a couple of factors you want to use in identifying the perfect site or URL.

Non-commercial: In most cases, you want a non-commercial source. If the site has a direct incentive to acquire links, chances are there will be too much manipulated link noise in its backlink profile to properly mine it for broken link building opportunities.

Authoritative: If the site is not authoritative, it likely has attracted few links from resources that aggregate important links on the web. These are the resource pages from which we will find 404 opportunities. If they aren’t linking to your selected URL, you are wasting your time.

Relevant: Obviously, the site needs to be relevant to your industry. You can use this technique to find great opportunities based on nasa.gov, but unless you are SpaceX, you probably have no business doing so.

Backlink Acquisition: Switching our example client to a Raleigh, NC dentist, let’s assume that we selected the American Dental Association (ADA.org).
Using Open Site Explorer, Majestic SEO, or Ahrefs, export all of the links pointing back to this site. This list of URLs should be treated in the same way as the list of URLs in the keyword method that were pulled from searching Google with prospecting phrases. You can now skip to the Link Extraction section in the previous description and follow from there. The steps are identical, so there is no need to repeat them.

Direct URL Targeting

This is the least scalable of the strategies and is used specifically to target a single link prospect. In the previous two methods, you are trying to find broken content to replace, and your link prospects are those who link to that broken content. In this method, you have already chosen your link prospect and you simply want to find broken links on his/her site as an excuse to start a conversation. I hesitate to include this strategy because it is weak and unscalable, but it is a part of the grouping of strategies known as “broken link building,” so I will include it.

Let’s assume that you are the Raleigh, NC dentist and you have decided that all you really want is a link from ADA.org. You feel that you have some great content they would link to, if only you had a reason to open up a conversation that didn’t sound completely like begging. Well, the first step is to try to find a broken link on their site so you have a reason to reach out to their webmaster.

Site Crawling: Site crawling can be problematic because you must balance your need for relatively quick responses against a general respect for the site owner’s bandwidth and uptime. Do not turn on a crawler unless you are certain it follows polite crawling policies and obeys robots.txt. Your best bet would be one of the following:

Xenu Link Sleuth: A classic SEO tool, Xenu Link Sleuth makes it easy to spider a site and find broken links among other problems.
Screaming Frog SEO: Quickly becoming the spider of choice for many SEOs, Screaming Frog can quickly spider your site to diagnose everything from duplicate content to 404s.
Deep Trawl: Often overlooked, Deep Trawl is a worthy contender for solving on-site issues.

Opportunity Selection: You now have a list of broken links on your ideal linking website. Identifying the best opportunity will greatly increase the likelihood of succeeding with this strategy. Here are a couple of pointers.

Choose a broken link opportunity where the link is external. This does two things: it makes the webmaster feel like it is not his/her fault (unlike an internal link), and it creates a 1:1 ratio of removing an external link and hopefully adding your external link. A webmaster is far more likely to replace a broken external link with another external link than to replace an internal link with an external one.
Try to choose a broken link on the same page as the one where your link would best fit. This is most likely to occur if your ideal linking site has a resources section.

Content Creation

The next step in the broken link building process is creating content that matches or improves upon the broken page. The first step you will need to take is actually determining what the broken page was. We assume that you have already vetted this page for relevance, so you should have a general idea, but getting as specific as possible will help you create content that meets the expectations of all of those who previously linked to the now defunct resource. There are two tools that can help with this right off the bat…

Rebuilding Tools:

Wayback Machine: The Wayback Machine at Archive.org allows you to see much of the web as it existed in history. This is your first and best bet for finding the content. Pro-tip: Use Majestic SEO’s historical index to find when the links were acquired, and then choose the date in Archive.org that corresponds with this date.
This will help you know the mindset of the linkers if the content changed over time.

Warrick: Warrick is a little-known tool by the Comp Sci department at Old Dominion that helps you rebuild an entire website by searching through public proxies/mirror caches to find copies of lost content. This is particularly good for rebuilding content that was blocked by robots.txt. Unfortunately, Warrick is a Perl program that may be difficult to operate.

Raised Expectations: Chances are the site for which you are replacing content has greater authority in the industry than yours does. Chances are it is less commercial, more informative, and more trustworthy in general. If you want to acquire a decent return on investment, you need to focus intently on content quality. Expect to improve upon the content that was created. Update relevant statistics. Add new citations and sections. Consider reaching out to the original author for more information to add credibility.

Outreach

So, you have found your opportunity, created your list of link opportunities, and you are ready to start outreach. Here is how to make the most out of that link list you have.

Contact Finding: There are a growing number of resources for automating the process of contact discovery, although each comes with its own set of issues.

CitationLabs Contact Finder
Link Research Tools Contact Finder
SEOGadget’s Contact Finder
Raven Tools Contact Finder
BuzzStream
Virante’s Contact Finder: In Beta

Email Templates: There are many strategies you can employ in the outreach. Here are a few of them, depending on how transparent you want to be. We find, in general, that if you write good enough content you can be very transparent.

Act as a user who happened upon the broken link
Mix your link in with other valuable, related links
Offer the replacement in a follow up email

Below is an example of a broken link building outreach email.
The most important part of the outreach process is that you should tailor your outreach at least to the specific campaign and industry, if not to each target specifically. If you can add even a sentence of plausible, relevant customization to each email you send out, you will greatly increase your conversion. I promise you, if you copy and paste this template you will waste a lot of your opportunities, no matter how good it is.

Subject line: quick note – dead resource on your site

Hello,

I’m a licensed (industry specialist) and a health writer – I recently visited your site while researching for an article I’m working on…

This is a note for your webmaster, as I found a dead resource on your site that visitors like me surely miss. It’s on this page:

http://www.theirsite.gov/linksandresources

I got an error message when I tried to click on this site:

http://DeadURL.org/index.jsp

It looks like they made a change to their home page but didn’t update it… anyhow, the correct link is here:

http://www.FixedURL.org/

And while you’re updating your page, I wondered if you’d be open to including some further resources that could help people struggling with similar issues.

Compelling Content Title
http://www.clientsite.org/compellingcontent

Compelling Content Title 2
http://www.clientsothersite.com/compellingcontent

Thanks for your help and for providing great resources!

Best,
First Name Last Name
Industry Credentials
clientsite.org

Anthony Nelson has some fantastic templates in his excellent piece “Broken Link Building Guide from Noob to Novice”.

Conclusions & Community

Like nearly any link building technique, sweat equity is ultimately going to make the difference between a successful campaign and a failure. The devil is always in the details. With that, I would like to see this become a living document. Broken link building, while not a new technique, is becoming more and more scalable.
As more agencies, consultants and business owners jump on the bandwagon, their voices need to be heard as well. So if you know any tips or tricks, please feel free to include them in the comments here. Thanks, and happy broken link building!

Credit Where Due

While I would like to pretend that most of my knowledge came from divine inspiration or on-the-job learning, the truth is that many thought leaders have chimed in on broken link building. This posting can be attributed in part to conversations with or content provided by the following great SEOs:

Jon Cooper
Garrett French
Anthony Nelson
Matt Zaffina
Paddy Moogan

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!


To Catch a Spammer: Uncovering Negative SEO

Posted by russvirante

Google recently updated its claims regarding the ability of other webmasters to affect your rankings via negative SEO. While questions about the efficacy of negative SEO continue to exist, that does not seem to be slowing down the growth of what is arguably the most contemptible part of the search industry.

On July 9th, a good friend of mine reached out to me with a problem. As a very risk-averse webmaster, he constantly plunges into the numbers, especially anchor text diversity, in order to make sure his site is as penalty-proof as possible. The latest updated data in SEOmoz’s MozScape revealed a massive shift towards anchor text over-optimization for several primary terms. It took only a few minutes to identify the culprit.

Diagnosing the Damage

The first step was to dig down into all the link data to identify just how deep the damage was. We downloaded all the links available on SEOmoz, MajesticSEO and Ahrefs to make sure that we had every possible outlet covered. It didn’t look good. On a primary keyword, the number of unique linking domains with exact anchor text went up 20x in a matter of two days. Below is an example of one of the spam posts.

Now the legwork began of identifying as many negative links as possible. But this is when it got interesting. We were able to quickly identify that there were several sites involved in the attack.

Sunlight Bingo
Online Bingo Finder
Public Liability Insurance
Distilled
Which Bingo
Ladbrokes Bingo

Wait, what? Did you just read what I read? Distilled, the venerable white-hat SEO company, was being attacked alongside several bingo sites and a liability insurance website. This was too interesting to give up. At that point, I knew my day was shot.
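The check described under Diagnosing the Damage, counting unique linking domains per exact anchor text and comparing two link exports, can be sketched in a few lines. This is only an illustration, not the analysis actually run for this post. It assumes each export has already been reduced to `(source_url, anchor_text)` pairs, and the function names are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domains_per_anchor(links):
    """Map each normalized anchor text to the set of domains that use it."""
    domains = defaultdict(set)
    for source_url, anchor_text in links:
        host = urlparse(source_url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.com and example.com alike
        domains[anchor_text.strip().lower()].add(host)
    return domains


def anchor_growth(before, after, anchor):
    """Ratio of unique linking domains for one anchor between two exports."""
    key = anchor.strip().lower()
    old = len(domains_per_anchor(before).get(key, ()))
    new = len(domains_per_anchor(after).get(key, ()))
    if old == 0:
        # No earlier links with this anchor: any new ones are unbounded growth.
        return float("inf") if new else 1.0
    return new / old
```

A sudden jump in this ratio for a money keyword, such as the 20x shift mentioned above, is the kind of signal worth investigating link by link.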
Footprints, Footprints, Footprints Let me go ahead and get this out – if you are thinking about doing negative SEO and are not a regular practitioner of black hat SEO, you are going to get caught. Sorry, but you just haven’t thought it through enough to cover your tracks. What follows is a perfect example of that. After digging through several of the XRumer spammed backlinks, most hitting up old .cgi guestbooks and bulletin boards, I noticed a handful of sitewide links coming from poor quality blogs. My first instinct was that these were from hacked sites. But something was different about these. Normally hackers hide their links in the posts with display:none tags so that the webmasters never actually see the bad links. It is a very effective strategy, but in this case they were fully exposed. So I checked another site that seemed to follow the same pattern. In this example, the links were included in a post. It is very strange for a “hack” to follow such different patterns, sometimes dropping links sitewide and other times just in posts. So, it was time to investigate these anomalies. Off to one of my favorite sites, DomainTools . For some reason, people still think that private registration is enough to cover all your tracks. Sure, it helps if you register a new domain and establish private registration at the point of acquiring the domain, but if at any point in your history you had accurate domain registration data, we can get to it. Anyone can. Using the DomainTools Registration History, we were able to track down the original registrant email address to info@——-.com A Quick Note on Outing As you have no doubt noticed so far in this post, I am not going to out the perps. We know the motive, and we know the likely perpetrator, but I can’t prove that the parent company knew of the actions, nor even that the SEOs responsible for their accounts were aware of the actions taken on their behalf. 
I will not allow myself to be responsible for the downfall of a company that may have merely been ignorant rather than malicious, and I certainly won’t open myself up to false flag attacks. That being said, the likely culprits are members of this community, and I believe they have much to lose if they continue in their ways. I can’t prevent you all from connecting the dots, but I won’t paint the picture myself.

So, back to the investigation. Now that we had a domain, we had a strong position from which to launch our investigation. We quickly traced the domain to a Twitter account, and the Twitter account to a link building company out of India.

Aside from Distilled, a seemingly random business liability website was lumped into the attack. We were able to determine that the likely culprit owned a site which competes directly with this business liability insurance site.

But we were stuck, until my good friend came through and did a quick analysis of the perpetrator’s follow list on Twitter. After a cursory look, he was able to identify a stinging indictment. Of the 41 individuals the likely culprit was following on Twitter, two worked for a direct competitor of the targeted bingo sites: one was the CEO of the company and the other the head of Web Marketing. He also followed Distilled, perhaps waiting to see how they responded when the attack was revealed.

This isn’t quite the smoking gun yet, though, because the connection is not reciprocal. It is a strong indication, but not a nail in the coffin, so to speak.

But, alas, Twitter is only one social media site. After digging deeper and deeper, we were able to find direct conversations of a personal, non-business nature between the head of Web Marketing for the competitor sites and the likely culprit on Google+. Of course, this still only shows a link.
But, as if the icing on the cake couldn’t get any thicker, here is a nice comment the Director of Web Marketing left on a post about negative SEO just a few weeks ago. As you notice, he is contemplating Google’s updated statement that negative SEO is possible. Seriously, could you make this any easier?

So, what exactly does the evidence tell us…

A negative SEO attack was launched between May 20th and May 22nd of 2012 against several bingo sites, Distilled, and a business liability insurance site.
The attack was likely created by an individual from India who owns a link building company.
We know that whoever performed the attack had direct access to websites owned by the individual from India.
That individual has direct connections with the CEO and Director of Web Marketing for a bingo website company.
The Director of Web Marketing has reciprocated communication on social media sites with the individual likely responsible for the attack.
The Director of Web Marketing responded with curiosity to Google’s updated notation on negative SEO.

What do we not know?

We don’t know, for certain, that either the CEO or Director of Web Marketing requested these actions be taken.
We don’t know, for certain, that the individual who owns the link building company was directly responsible.
Why did they target Distilled in the campaign? Did they assume Distilled was an SEO of record for one of their competitors?

The Aftermath

If you are a victim of negative SEO, there are a handful of steps you simply have to take to prevent potential damage to your site.

Download a complete list of links pointing to your site from Open Site Explorer.
Mark any links in this list that came from the negative SEO attack.
Submit these as a preemptive reconsideration request or via the feedback channel in Google Webmaster Tools.
Use the Bing Webmaster Tools Disavow Tool immediately.
Finally, if necessary, begin removing the bad links wherever possible.
There are several tools to help out with this, including Virante’s Remove ’Em, rMoov, or Richard Baxter’s Excel Tool.

The Good News

At least at the moment, it appears that the negative SEO attack has been as effective as their ability to cover it up. For the time being, none of the sites appear to have been dramatically impacted by the campaign. However, with looming updates to Penguin, there is no telling. The best bet for any SEO is to stay on top of their backlinks, watching closely to make sure nothing nefarious makes its way into their profile.

Editor’s Note

After the author wrote this post, Google announced a way to download your most recent links in Google Webmaster Tools that could prove very useful in this situation.


Set It and Forget It SEO: Chasing the Elusive Passive SEO Dream

Posted by russvirante

Howdy, Mozzers. This is Russ Jones (@rjonesx) from Virante, Inc. I recently spoke at the Search Exchange conference in Charlotte, NC on the topic of programmatic, automated SEO solutions and realized that it could probably be more valuable in front of a larger audience. Of course, the attendees have a head start, so you better get to work.

I have a confession to make. I love infomercials. In fact, I would probably call myself an infomercial elitist / hipster. I liked infomercials before they were cool; before Billy Mays and the Slap Chop Guy made their way into internet memes. I pledge my allegiance to the godfather of infomercials, Ron Popeil, while guys like Anthony Sullivan weep at his altar, asking forgiveness for their sub-par jobs as pitchmen. OK, maybe I take it a little too seriously – I do happen to have a DVR full of Gator Grip, Ginsu Knives, and Flowbees – but I believe there is something extremely motivating about this type of advertising. And Ron Popeil hit it on the head over and over again: Set It and Forget It.

This was the tag line for the Ronco Showtime Rotisserie, an amazing success for infomercials. You see, there is an innate desire for us to find solutions to common, everyday problems that do not require our attention. These nagging, annoying problems like making dinner, cleaning up, and in our industry – SEO tedium – tend to suck up our time and attention while bringing only marginal improvements.

Unfortunately, there is this perception, almost a bias, against automation in our space: a misbelief that there is nothing that we can set and forget in SEO. Well, I am here today to free you from the reins of some of your daily miseries of SEO, all for the incredible price of free.
Strategy 1: Real Time Referrer Indexing

We often joke that “Google knows everything.” While we can lament the loss of privacy and liberty, there is one thing that I do want Google to know about – my links. I want them to know about as many links pointing to my site as possible. Unfortunately, Google misses out on a good portion of the web. Well, what if you could find links that Google hasn’t necessarily found, and then make sure that Google does index them and count them?

Introducing Real Time Referrer Indexing: If you were to go into your Google Analytics right now and export all of the pages that have sent visitors to your site since your website’s inception, what percentage of them do you think will have been indexed by Google? 90%, 95%, 99%? Sure, it will probably vary from site to site, especially given how many different sites out there have sent traffic to you, but there are likely to be a handful that Google never got around to crawling. Our goal with this first set-it and forget-it tactic is to find the pages that refer traffic to your site on-the-fly and make sure that, if they have a link, Google knows about it.

Ideally, our automated solution would work like this…

The script would record every referrer from other sites.
The script would spider that site to see if it actually has a real, followed link.
The script would check to see if Google had cached that referring page with the followed link.
The script would coax Google to reindex that page if it had not yet found the link.
The script would continue to check to see if Google had cached the referring page.

This is actually quite easy to accomplish programmatically. The first three steps are done every day by tools regularly used by SEOs. The only difficult part is finding a way to encourage Google to visit the referring pages it has not yet indexed.
We can solve the indexing problem by simply placing a widget on the page that displays those referrers: essentially an “As Seen On” bulleted list of pages that have linked to your site but have not yet been indexed. Well, I have a treat for those of you who have, or know someone with, some halfway decent programming skills. Here is sample code that does just this on your typical LAMP (Linux, Apache, MySQL, PHP) installation. A word of warning: it is highly likely that this code is buggy. Make sure that you check it and make modifications before running it in production. All you need to do is install the script on any pages of your site for which you would like to perform real time referrer indexing.

This is exactly the type of set-it-and-forget-it SEO that I love. Simple techniques, simple solutions, long-term results. So let’s move on to another set-it-and-forget-it technique.

Strategy 2: On-the-Fly PageRank Recovery

Alright, so if you haven’t heard of PageRank Recovery before, you are going to need a quick little lesson. Whenever someone links to your site but screws up the URL, the PageRank that flows through that link essentially evaporates. I am pretty sure that it ends up in Matt Cutts’s personal PageRank stash, which he has learned to convert into a powerful foodstuff that he consumes prior to mountain climbing and running marathons. But I digress. If you can find where those broken links point on your site and then 301 those URLs to a real page, you can “recover” that PageRank. Virante created a tool to do just that based on SEOmoz’s Site Intelligence API, which Rand highlighted a little while ago, but it still requires you to spend time running the tool regularly. I want to be lazy and have my site recover PageRank for me while I watch The Facts of Life dressed in a Snuggie and downing 5-hour Energy shots.

So here is how it would work. Ideally, our program would do the following…

The script sits in your CMS right before a 404 is fired.
If you don’t have a CMS, you would direct your .htaccess file to pass all 404 traffic through it first.
The script captures the URL that the visitor or Googlebot tried to visit.
The script somehow magically knows what URL you MEANT to visit.
The script 301 redirects you there.

What’s that you say? “But Russ, our programmers don’t know magic. They are all muggles. And even if they did know magic, I can’t find a USB-powered wand anywhere these days.” Well, I am bringing you good news from some friends: Mr. XML Sitemap and Ms. Levenshtein.

If you have been paying attention to the countless blog posts in the SEO world, you should have an XML Sitemap which keeps a record of all the URLs on your site. This is a good start to the magic that is On-the-Fly PageRank Recovery, because now we know all the possible URLs your visitor or Googlebot may have been trying to reach. Now, we simply have to find the URL most similar to the one the visitor requested. How do we accomplish this? Levenshtein Distance.

Levenshtein Distance, also known as the Edit Distance, is a measurement of the minimum number of changes necessary to convert one piece of text into another by adding a letter, removing a letter, or substituting a letter. For example, the Levenshtein Distance between the words “Rock” and “Russ” is 3, because we have to substitute the O, C, and K with U, S, and S.

[Figure in the original post: an example of how Levenshtein Distance could be used to find two similar URLs.]

So, the way On-the-Fly PageRank Recovery works is by reading all the URLs in your sitemap and then computing the Edit Distance between each of those URLs and the URL your visitor requested. If the server finds a close match, we 301 redirect to it rather than show a 404 error. Subsequently, when Googlebot tries to visit those previously 404 pages, it will instead find that 301 redirect and appropriately pass the PageRank through to the intended page.
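Here is a small sketch of that matching step, again in Python rather than PHP, and not the code behind the plugins mentioned in this post. It assumes you have already pulled the list of paths out of your XML Sitemap. The function names, the `max_distance` cutoff of 3, and the `/seo-tools` path are all assumptions made for the example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # remove char_a
                current[j - 1] + 1,      # add char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def closest_url(requested_path, sitemap_paths, max_distance=3):
    """Return the sitemap path nearest to requested_path,
    or None when nothing is within max_distance edits."""
    best_path, best_distance = None, max_distance + 1
    for path in sitemap_paths:
        distance = levenshtein(requested_path, path)
        if distance < best_distance:
            best_path, best_distance = path, distance
    return best_path


# Hypothetical sitemap paths; a real script would read them from sitemap.xml.
sitemap = ["/seo-tools", "/about", "/contact"]
print(levenshtein("Rock", "Russ"))         # 3, matching the example in the text
print(closest_url("/se9-toolz", sitemap))  # /seo-tools (two substitutions away)
```

The cutoff matters: without one, every mistyped URL would be redirected somewhere, even when nothing on the site is remotely similar. When `closest_url` returns `None`, the sensible thing is to fall through to the normal 404 page.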
Plus, On-the-Fly PageRank Recovery is a huge usability win for visitors, who no longer have to search your site to find the correct page. Want to give it a test drive? Try any one of these broken links back to Virante and my blog, TheGoogleCache:

Virante’s Tool Page: http://www.virante.com/se9-toolz
Second Page Poaching: notice the dollar sign in the URL

Now, it would be hypocritical of me to talk about setting it and forgetting it, and then make you go out and do all the work yourself to get it up and running. So, in the spirit of laziness, I have included a couple of options for you to use as well. Of course, double-check everything before you go into production with any code you ever get on the internet, regardless of whether or not it is on a trusted site like SEOmoz.

WordPress Plugin
Drupal Module
Generic PHP Code

Final Thoughts

There are incredible opportunities in the world of Search Engine Optimization that we have only begun to address. So much more can be done in terms of describing, detecting, and repairing SEO issues, all in a programmatic, automated fashion. These are just two of them. Good luck, and keep inventing!

Excerpt from:
Set It and Forget It SEO: Chasing the Elusive Passive SEO Dream