The Guide to US Census Data for Local SEO

Posted by OptimizePrime

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

As tax time nears in the United States, it’s hard not to wonder what exactly all that money is being spent on. Instead of getting into politics, I’d rather describe something our taxes do pay for and how it can help you plan an effective local SEO strategy.

During the daily grind, we can become accustomed to exclusive data available to us only through analytics platforms, Webmaster tools accounts, and other resources requiring a username, a password, and a mother’s maiden name. This private-access mentality makes it easy to overlook what is freely available to everyone, including our own Census. I’m Harris Schachter (you might know me better as OptimizePrime) and I’d like to show you not what you can do for your country, but what your country can do for you.

All of the information and images presented from the US Census are free to reproduce unless otherwise noted.

Using Census Data When Planning Local Strategy

During the planning phase of a local strategy, you need to identify which specific localities will serve you best, whether for local content, social media, community engagement (geographic community and company community), on-site optimization, off-site citation building, link building, or anything else that goes into local SEO. By using census data, you can identify these viable hyper-local markets before you even publish a single tweet. You can plan micro-campaigns designed to match each of the various cities, counties, towns, or even city blocks in your selected location. This type of analysis is particularly important when considering where to open a new brick-and-mortar establishment.
Demographic data can guide everything from the language and reading level of your content to the methods by which it should be distributed. Distinct personas for each of the geographic components can be made to help you visualize the potential customers within them. Once armed with this information, local strategies (including everything you are going to learn from GetListed+SEOmoz) can be applied with laser precision.

You can spend hours on Census.gov exploring the myriad databases and tables. It can be overwhelming, so I’ll just demonstrate three of the most useful resources. If you’re an international reader, let this guide serve to motivate you to seek out what is available through your government.

Since it’s been cold lately, and Richmond has enough plaid and square glasses to rival Seattle, I’ll use hipster snow boots as my example of a locally marketable product, targeting the 20-24 age group. I’ll look for viable hyper-local markets in the Richmond area, since that is where I live and do most of my local SEO here at Dynamic Web Solutions. I’ll go through each of the three Census tools using this scenario.

1. Interactive Population Map

First is the Interactive Population Map. With this interactive map, you’re able to utilize population data at the most granular views. Currently the data is for 2010, but if you suspect a large population shift since the last data collection you can use proportions instead of volumes to make your observations.

The image below shows counties, but you can view data at the following levels (from widest to most specific): national, Indian reservations, congressional districts, counties/municipalities, subdivisions, places, census tracts, census block groups, and census blocks (basically city blocks). You can segment population data by age, race, ethnicity, and housing status, and compare these features to those of nearby locations.

How to use the Interactive Population Map:
Head over to the map.
Enter your location into the Find field, place your area of interest within the crosshairs, and use the on-screen controls to adjust the view and detail level.
Choose any of the segmentation tabs, select a location, and click Compare. You can compare up to 4 locations to examine their demographics side by side.
Once in the compare screen, you can flip between the tabs to view populations by age, race, ethnicity, or housing status for each of your chosen locations.

In my example, I chose Richmond City and the nearby counties of Henrico, Chesterfield, and Hanover. Since my hipster snow boots business isn’t concerned with any specific ethnicity, race, or housing status, I’ll flip over to age since I am primarily focused on the 20-24 age group. From the table, I can see the city of Richmond has more people in my target demographic (20-24) than the three neighboring counties. Interesting.

2. County and Business Demographics Interactive Map

Next up is the County and Business Demographics Interactive Map (or CBD Map). This is similar to the interactive population map, but provides more robust information in addition to population, race, ethnicity, age/sex, and housing status. This map layers in three business demographics: industries, business patterns, and raw counts of establishments per industry. Industries are the general market classifications, such as Accommodation and Food Services, Construction, Manufacturing, Health Care, Real Estate, etc. Business patterns contain data on annual payroll and employee counts (within a location or industry).

The CBD Map is limited to the county level, but the additional information makes it an essential tool for deciding where to focus your marketing efforts. This map can display the number of establishments in each industry, in each location. The capability for local competitive analysis is priceless.

How to use the CBD Map:
Head over to the map.
Enter your city, state or zip code into the Find field.
It should automatically switch to the County view on the left (under Geographic Levels).
Choose any of the top demographic tabs; any one will do for now.
Select a location and click “Compare” at the bottom of the window.
In the new window that appears, click “Add Topic” to choose your areas of interest.
Once you have your topic areas chosen, go back to the map and select up to 4 more locations.

Going back to our cooler-than-snow snow boots business, I chose Retail Trade from Industries, 20-24 from Age/Sex, and Total Establishments from business patterns. In addition to Richmond City, I again picked the neighboring counties of Henrico, Chesterfield, and Hanover.

The goal while using the CBD map is to identify areas with large shares of your target demographic, but low business counts for your industry. This is a good indicator of areas with many potential customers, but low competition for them. Using the table, I can do some quick math to rank the four locations along these criteria. The comparison metric to use in this instance is the number of people aged 20-24 per retail trade establishment. Richmond has 33, Chesterfield has 19, Henrico has 14, and Hanover has 15. Richmond has the greatest number of potential customers per establishment, suggesting comparatively low competition for retail store customers. Interesting.

3. US Economic Census

The final data table is the Economic Census within the American FactFinder collection. This is the most powerful database of the three, and also the most complicated. Data contained here includes everything the interactive maps do, but at a much more granular level. Specifically, industries can be further broken down by individual product or service, and by how many establishments offer them in any given area. This resource also contains a search bar, a familiar face in an unfamiliar environment. The FactFinder database is rather complex, so I’ll dive right into how to use it.
Because this one is so detailed, the accessibility and recency of data are highly variable, so you may find more or less than I did.

How to use the Economic Census:

Step 1: Visit the American FactFinder database. Don’t be tempted to use the search bar just yet.

Step 2: Program
Choose the Topics tab. Select “Economic Census.”

Step 3: Location
Choose your location by selecting the Geographics tab. Use the “Geographic type” dropdown, and pick your level of detail. Select State from the next dropdown. I’ve selected County and Virginia respectively. Pick your actual locations from the next dropdown. Use the “Add to your selections” button to select your criteria. You’ll see the chosen options in the left sidebar under “Your Selections.”

Step 4: Industry
Select your industry by finding the North American Industry Classification System (NAICS) number under Industry Codes. Do this by using the search bar to find your business. This one is much more detailed than the industry selections of the Interactive Map, so try a few queries until you get a solid match. For my example, I first searched for “boots” with no luck. I then tried “shoes” and found code 4482 for “shoe stores.” Check off the applicable industry code and click “Add.” Close this window to reveal your search results.

Step 5: Database Results
First, review your selections in the left sidebar. Check off the source most applicable to you, and make sure it is the most recent version. Select View.

Step 6: Data!
Finally, we’ve got the goods. First of all, I should warn you not to use your browser’s back button: all of your selections will be lost and the process starts over again. Instead, take note of the “Return to Advanced Search” button. Use this if you want to go back to the search options. Check out the data columns. Specifically, the most important are: geographic area, number of establishments, and sales.
Data Collected From the Economic Census

Due to the sheer number of search options, every research endeavor will be different. My results had three of the four locations in a data table, and the most recent data is from 2007. Immediately, we can see Chesterfield had 31 shoe stores, Henrico had 47, and the city of Richmond had 32. This is another good indicator for the city of Richmond, since it shows a relatively low number of shoe stores, and we already know it has the greatest volume of our target demographic.

Now let’s look at the sales column for the total sales each location’s shoe stores generated. Using our county population data from earlier, we can calculate how much the average person spends on shoes (customer value) in each location. Keep in mind the population numbers are from 2010 while the sales figures here are from 2007, but hey, we’re just making estimations.

Sales, Population, and Businesses

Divide the sales figures by the total county population. I found the average person in Richmond to be worth $101, in Henrico $174, and in Chesterfield $85. Of the three locations, the average Henrico resident spends the most on shoes.

But what about our target demographic of 20-24 year olds? For this calculation, we’ll take the target group’s share of the total population and apply it to sales for each location. Although this assumes the different age groups purchase shoes at the same rate, it will give us an estimate of the sales contributed by our target demographic. From the first analysis, we found the 20-24 year old group made up 13% of Richmond’s population, 6% of Henrico’s population, and 6% of Chesterfield’s population. After applying these percentages to total shoe sales, we find our target demographic spending $2.7M in Richmond, $3.2M in Henrico, and $1.6M in Chesterfield.

At this point, it might seem wiser to go after Henrico County, since the target demographic spends the most on shoes there, in total.
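A minimal sketch of the two calculations above, in Python. The function names and the example county figures are hypothetical placeholders rather than Census values, and the second function carries the same assumption as the text: every age group buys shoes at the same rate.

```python
def per_capita_sales(total_sales, population):
    """Average spend per resident: total sales divided by total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total_sales / population


def target_demo_sales(total_sales, target_share):
    """Sales attributed to the target group.

    Assumes every age group buys at the same rate, so the group's share of
    sales equals its share of the population (a fraction between 0 and 1).
    """
    if not 0 <= target_share <= 1:
        raise ValueError("target_share must be a fraction between 0 and 1")
    return total_sales * target_share


# Hypothetical county, NOT real Census figures.
example_sales = 20_000_000      # total shoe store sales, in dollars
example_population = 200_000    # total county population
example_share = 0.125           # 12.5% of residents are aged 20-24

spend_per_person = per_capita_sales(example_sales, example_population)
demo_sales = target_demo_sales(example_sales, example_share)

print(f"Per-capita spend: ${spend_per_person:,.0f}")
print(f"Target demographic sales: ${demo_sales:,.0f}")
```

Swapping in the sales, population, and age-share figures from your own Census tables gives the same kind of estimates quoted in the text.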
Given the sheer amount of money spent on shoes in Henrico, I might consider a separate strategy to attract that county’s business. However, keep in mind Henrico has 47 shoe stores, while Richmond only has 32, and Chesterfield has 31. Taking this competitive information into account, we can compute the sales generated by the target demographic for each store in each location. The data translates into $84k in sales per Richmond shoe store, $68k per Henrico store, and $52k per Chesterfield store. This suggests individual shoe stores in Richmond generate more sales from our target demographic than they do in the other two nearby counties. In-ter-est-ing.

Analysis Results

After three rounds of analysis, Richmond looks like the ideal place to set up a shoe store (especially one that sells supa-fly snow boots to young adults). So, what have we learned from all this? From the data available, I’ve found:

Richmond has a greater volume of people in the target demographic than neighboring counties.
Richmond has more potential customers within the target demographic per retail store than neighboring counties.
A shoe store in Richmond generates more sales from the target demographic than a shoe store in a neighboring county.

Apply the Insights

Now that you’ve identified the most viable business locations, it’s time to incorporate these findings into local strategies. Go after these promising localities by gaining relevancy and ranking through a variety of methods, including:

Hosting events in the chosen location to establish an audience.
Building inbound links from sites which rank well in/for the target area (and be sure to diversify these links).
Doing competitive analysis for the most visible websites in the locations uncovered by the analysis. Go through their backlink profiles for relevant links and try to attain them too.
Encouraging customers to leave reviews, with specific attention to people in the targeted areas.
Include the reviewer’s location in the review itself to gain more trust and influence among the potential customers.
Engaging with prospects in the identified locations through social media. Find them through various tools like Followerwonk’s Twitter bio search and get the conversation going.
Creating content specific to the viable locations. Dedicate a section of your blog to things to do and see in the area and why you like doing business there, and interview citizens, government officials, or well-known residents. Publishing content about the area can gain you exposure well before your visitors are even looking for your products or services. Once aware of your business, they’ll likely keep you in mind at some point down the road.
Optimizing your content with traditional on-site methodologies for the locations uncovered in the analysis (but don’t overdo it).
Developing press releases specifically for the target locations and distributing them to online sources like chambers of commerce, colleges and universities, local newspapers, free publications, etc.
Considering mobile users and making sure your site delivers a satisfactory experience for people in the targeted areas. Local and mobile go hand in hand.

Finally (and possibly most effective in the long run), consider opening a physical store within the location:

Claim all profiles and listings from data aggregators using consistent NAP citations.
Use consistent NAP citations on the website itself.
Consider including the name of the location in a brand name.
Utilize rich snippets to take full advantage of your new location in the SERP.
Complete your Google+ Local page with proper categorization, and mentions of the location within the business description.
Modify social media profiles to include this new location.

I encourage you to explore Census.gov, and subscribe to the Census RSS feed to make sure you don’t miss any of their interesting publications.
They recently released mobile apps for the true geeks out there (I recommend the iPad app “America’s Economy”). Also be sure to check out the data visualization gallery to learn something new or just to get some data vis inspiration. So the next time a Census taker knocks on your door, answer it! You never know what type of product or business you’ll be working with in the future, but chances are good that you’ll have data for it. Sign up for The Moz Top 10 , a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Why Remarketing? – Whiteboard Friday

Posted by addthree

No matter what type of product you’re offering, how your sales cycle flows, or what the industry you’re in looks like, there are many different ways that you can leverage remarketing to target your audience. In today’s Whiteboard Friday, Brian Rauschenbach and Nora Park share their tips and tactics for remarketing success so that you can turn those visits into conversions!

Have you had remarketing success? Leave your thoughts and experiences in the comments below!

Video Transcription

” Brian : Hello, I’m Brian Rauschenbach and this is Nora Park. We’re with Add3. We’re a search and display network and agency located here in Seattle. We’re here today to talk about remarketing and Google AdWords. We’ve got a couple of examples of some brands that are probably using remarketing and how they’re going after sort of the same user and some of the advanced tactics, and some ideas and suggestions that we have that have worked with some of our clients and to share them with you. So, why remarketing, Nora? Why is it so important for brands to be remarketing today? Nora : So there are a lot of reasons why all brands should really be doing a lot of remarketing. Depending on what type of product you’re offering, your sales cycle, the type of industry you’re in, there are a few different ways that you can really leverage remarketing to target your audience. Kind of the first one, really, the core, basic reason to do it is to get back in front of customers who visited your site and didn’t take the desired action. They didn’t sign up for your free trial or make a purchase on your site. So that sort of also links into, if you have more of a type of ecommerce site, the really great way to do it is to reengage those customers who actually spent a lot of time on your site, put things in their shopping cart, maybe even got to the payment page, and didn’t hit the Submit button and actually make a completed purchase.
You can get back in front of those users with remarketing, and even use some dynamic product feed remarketing and show them specific products that they looked at. Brian : Yeah. So I’ve seen that with some sites like Levi’s, where I might put a pair of jeans in a shopping cart, and then I abandon the shopping cart and don’t do the purchase, and then come back, like the next day, and I’m just surfing the web, and then I’ll see that pair of jeans still in there inside of a banner. Nora : Exactly. Brian : So that’s a dynamic product feed. But it’s a remarketing of that piece. Nora : Yeah, exactly. It’s going to be really effective. Another good scenario is to target your existing customers and upsell or cross sell them. So for example, if you’re a software company and you have people who you know have purchased a certain product, based on the way you’ve cookied them and set up your lists, you can show them ads that promote other similar products that somebody who purchased the other product will be likely to buy in tandem, or might also need down the road. Brian : Okay. These remarketing lists, how is the time piece sensitive? If you have a remarketing list, and you’re like, “I know this person is coming to purchase a product,” and what’s the learning that you can gather from setting up your custom lists with time segments in them? Nora : Yeah, absolutely. That’s a really great question. A good thing that you should do some testing around is to kind of find out when it’s most effective after that initial purchase, whether it’s 10 days, 20 days, 30 days later, that you can effectively reach that customer. Right away they might say, “You know, I already just gave you some money. I don’t need to make another big software purchase.” But in 30 days, “Well, great, I really like this product. I like this company.” They might be more likely to do that. Brian : Oh, so it might be like a brand, like a Brenthaven, like I really like their bags. They have a lifetime warranty. 
I might have just purchased a backpack, but I might be back next month buying an iPad case or whatever. Nora : Yeah, exactly. So it’s like, great, that kind of leads us into our last one, which is that when you have a really strong brand with really loyal customers, is knowing who those existing customers are, who have made purchases in the past, and being able to reach out to them with other products that you have they might be interested in. Brian : Okay. So for any of you that might not be using remarketing yet today with your product or brand that you represent, let’s talk a little bit about just setting up campaigns. Where do you find it in the Google AdWords interface, and then what’s your best practices for setting these campaigns up from scratch? Nora : Yeah, absolutely. It’s pretty simple. Kind of the core is setting up your custom combination lists. So you can go in the AdWords interface to the Audiences section, and that’s where you’ll be able to find the pixels you need to place on your site and then be able to create these lists to segment people based on what pages they’ve visited. So you can add lists based on different products, so if they’ve visited any page related to this certain product, and then you can show them an ad that is aligned with that. Brian : So the page could be just a URL that’s like the shopping cart URL or the success confirmation page or the thank you confirmation page, if it’s just a sign-up that someone’s looking for. Nora : Exactly. That’s where you can get really kind of creative and advanced in terms of how you set up the combinations of the list, is to be able to include and exclude people based on how far they got in the cycle. If they did put something in their shopping cart and didn’t reach the confirmation page, you might want to target them separately than somebody who didn’t even put anything in their shopping cart yet. 
Brian : So if you have like a subscription-based model for your company and the person has already upgraded, like they’ve upgraded to a Moz Professional account, you don’t want to be following them around and remarketing back to them. So you put them in an exclusion list? Nora : Exactly. Brian : Okay. Nora : That’s another great example. When you have a subscription service, to be able to use those exclusion lists to take out people from the remarketing pool that are already subscribed, based on a visit to, for example, a login page using that URL. Brian : Okay. Great. Then talk to us a little about user segmentation and the duration thing again, why that’s so important. Nora : Yeah. That one’s important too. You may have some insight already into the sales cycle for your product. So basically, if somebody visits your site, it might take a consideration time of one week up to a month, depending on what it is, before they are actually ready to make a purchase. So you can kind of start and use that as how long you want to set the duration of your cookie pool. Brian : So these would be good for clients or brands that have, basically, a free trial maybe, and then to upgrade the free trial to a paid trial. Nora : Exactly. Brian : Okay. Nora : At the end of that 30 days, or whatever it is. But another great way to do it is just to set up a test and kind of do increments of 10 days, where you give those people, you treat them differently, so you can just see how they act if you target them within 10 days after they first visited your site, within 20 days, and within 30 days. Brian : Okay. So these are the actual user list pools that you’re doing these time segments? Nora : Exactly. Brian : Your total cookie pool might be 30,000 users. So after 10 days, you’re cutting off remarketing to those people, and then you go into a 10 to 20-day window and then a 20 to 30. Nora : Exactly. 
Brian : Then you’re looking at those as three different lists and their effective CPA that they might be achieving. Nora : Yeah. Brian : Okay. Nora : Exactly. So you kind of get those learnings, and then you can start to use some custom messaging. Instead of just saying, the people after 10 days didn’t convert as well, we’ll give them a different message and see if you can get them to convert as well, whether you’re using a promotion code with an expiration date that you put directly in the ad, or offering a higher discount. Or a third example would be . . . Brian : Well, we’ve got a couple of examples up here. So the discounted example is if you’re booking a flight. This example that we’ve drawn out here is some guys that are planning a mancation to Alaska. So they come in. Someone’s been to Alaska Airlines, and they’re going to pick up a cookie there. Then, a day later, they might be getting a leader board banner that’s targeted to them for a cheaper flight up to Alaska. Then that person’s also looking to get some outdoor gear for that trip, and REI might hit them a couple of days later with a marketing message around free shipping. So it’s basically a promo, one that’s a little bit more time delayed. Then Airbnb might have a call to action that’s like, “Are you still looking for a cabin to rent?” I think a lot of those, if you make those messages custom, and don’t repurpose what you’re running in your existing AdWords campaigns, but understand the audience that you’re actually remarketing back to these people. They’ve been to your website. So you don’t need to really talk about the brand too much. But give them a promo or a time-sensitive call to action or something that’s like a question. Nora : Exactly. Brian : Going back to the user segmentation duration thing. 
I found that, when you ask this to a client a lot of times, like, “What’s sort of your sweet spot of when your person converts,” this is also a way that, if your brand doesn’t really know what that is, you could get the learnings from this. Nora : Yeah, exactly. It will definitely give you a good idea of where that sweet spot is. Another thing, too, is how many times those people see those ads. So you can set frequency caps, as well as set up the duration settings to see how effective it is to show them 10 ads a day versus 10 ads a month. Brian : Oh, so there’s a good segue there. After you’ve had your remarketing campaign up and everything is just chugging away, what are some tactics that you’ve sort of used to enhance the remarketing strategy with all this learning that you’re gaining, from setting up custom combination lists to time-delayed market segmentation? What have you been doing to sort of keep the meter going? Because it seems like the remarketing comes out really strong after you’re learning, and then it sort of has a little tail. Nora : Yeah. With any AdWords campaign, it’s always important to kind of keep up with the marketplace. So optimizing your bids is sort of standard. But something else, the really great thing Google provides, is looking at the managed placement, so the actual list of the sites that your ad showed up on and the performance by each of those sites, so you can find that maybe there are 20 sites . . . Brian : Maybe some pockets. Nora : Yeah. Either a category of sites or just specific sites that you can bid higher on that will allow your ad to show in more prominent positions, potentially more above the fold, and just more frequently. Brian : Then, on sort of the bid management side of things, I’ve seen some different market or duration list segments where I see if you’ve run 10, 20, 30, 40 day segments, sometimes they’ll pause out, like the 30 or 40, and then really focus in on the ones that are very optimal. 
Then you mentioned frequency caps. What’s a good generic setting for frequency caps, given that some of these ads might appear below the fold, and so even if you’re winning in this auction against three different brands, what should you have your frequency cap set to? Nora : Generally, let some of the initial data kind of show you where that drop-off is. You actually can see in Google, after how many impressions in a given week, where your click-through rate starts to drop off or your conversion rate starts to drop off. I’ve typically seen that it’s around eight a week. Brian : Eight a week. Okay. That’s good to know. So we talked a little bit about some Google beta programs that are out there. There are a couple other ones that we’re testing with different clients that are in different verticals, so it makes sense for them. Can you talk about any of those? Nora : Yeah. The one I think I’m most excited about that we’ve started to test and see some great success with is the search companion beta. What that does is it enables you to remarket to people who haven’t necessarily been to your site. So you choose keywords that you want to retarget. So anybody who’s searched for those keywords on Google, then when they are on sites that are part of the Google Display Network and accept AdSense ads, then you can get in front of them that way. Brian : So if you were a brand like REI and someone did a search for hiking shoes, and then they visited the REI website, can then one of their competitors, like an outdoor emporium or something like that, go after that user even though they didn’t even visit the site? Nora : Yeah, absolutely. Brian : Okay. Nora : So they would just say anybody who searches for hiking shoes, we want to be able to remarket to them. Brian : Okay. So that’s a pretty powerful beta that’s out. How about anything in YouTube? Have you done any work with their network? Nora : Yeah.
That’s another great opportunity, that Google allows you to kind of repurpose your remarketing list and show YouTube ads, in-stream ads. It’s within the same log-in account, and they kind of talk to each other. You can set up a campaign and use that same list of people. Brian : So this is the same custom combination list, but just in YouTube. Nora : Exactly. Exactly. Brian : So you don’t have to just throw impressions away, basically. Nora : Yeah. So it makes it really targeted. Brian : Well, cool. Well, we’ve been doing a lot of discovery with remarketing here this last year and paying close attention to it, because all these new beta programs are coming out. Do you think that there’s going to be an end to this? Nora : Probably not. Brian : This is our industry crack we have right now. What do you think is going to be on the horizon with Google? Nora : I’m sure more like this. With traditional remarketing, you’re sort of capped in terms of how much you can grow just based on the visits you’re getting to the site in a given month. Something like the search companion beta really opens that up to a much larger population of available impressions. It just makes the marketplace that much bigger. So I’m sure that they’ll come up with more things along the same lines. Brian : We forgot to mention that, in order to sort of participate in this universe, you do need to have content running, right? Nora : Yeah, absolutely. That’s what the campaign setting is. Brian : So this used to be one of those check boxes that you used to leave unchecked, but now it’s like the Google Content Network or the Google Display network, it’s pretty big now, right? Quality’s really gone up on it. Nora : Yeah, absolutely. There are so many different ways you can target the Google Network. Remarketing is just one of them. But it’s sort of part of the same thing, where you can target on the Google Display Network by keyword content, categories, and interests as well. 
Brian : Then if you didn’t have the resources to get banner creative, this stuff can also just be contextual only, right? Nora : Yeah. You can use text ads. Actually, Google has a really cool thing called the Google Display Ad Builder, and they will just kind of take images from your site and put banners together themselves. I’ve actually used that, and they look really great. It’s a free and effective way for some clients that might not have the resources to get their ads out there. Brian : I was going to say that kind of sounds a little scary, if they’re just grabbing images from your site. Nora : Well, you get to see them. You have a lot of choices in terms of the layout and the language, and they actually look great. I don’t know how they pick the right images, but from what I’ve seen, they do a really good job. Brian : Okay, cool. Well, I think that sort of wraps up our segment on remarketing. We’ll be online listening and replying back to any commentary or any questions that you might have. Thanks. Nora : Thank you.”

Video transcription by Speechpad.com
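The 10-day duration test described in the transcript could be tallied with a short script once the per-list numbers are exported. This is only a sketch: the window boundaries (day 10 falls into the 10 to 20 window), the function names, and the cost and conversion figures are all hypothetical, and nothing here calls AdWords itself.

```python
def duration_bucket(days_since_visit, width=10, max_days=30):
    """Return the (start, end) day window a visitor falls into.

    Windows are half-open: 0-9 days -> (0, 10), 10-19 -> (10, 20), and so on.
    Visitors outside the 0..max_days range get None (no longer remarketed to).
    """
    if days_since_visit < 0 or days_since_visit >= max_days:
        return None
    start = (days_since_visit // width) * width
    return (start, start + width)


def cpa(cost, conversions):
    """Cost per acquisition; None when a list has no conversions yet."""
    if conversions == 0:
        return None
    return cost / conversions


# Hypothetical spend and conversions for the three duration lists.
lists = {
    (0, 10): {"cost": 500.0, "conversions": 25},
    (10, 20): {"cost": 400.0, "conversions": 10},
    (20, 30): {"cost": 300.0, "conversions": 5},
}

cpa_by_list = {
    window: cpa(stats["cost"], stats["conversions"])
    for window, stats in lists.items()
}

for (start, end), value in cpa_by_list.items():
    print(f"Days {start}-{end}: CPA = {value}")
```

Comparing the three values side by side is one way to spot the "sweet spot" window the speakers mention, and to decide which lists to pause or give a different message.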


Excel Statistics for SEO and Data Analysis

Posted by Virgil

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of SEOmoz, Inc.

Everybody has probably already realized that there is almost no data that we cannot get. We can get data about our website by using free tools, but we also spend tons of money on paid tools to get even more. Analyzing the competition is just as easy: competitive intelligence tools are everywhere, and we often use Compete or Hitwise. Open Site Explorer is great for getting more data about our own and our competitors’ backlink profiles. No matter what information we are trying to get, we can get it, either by spending a fortune or by spending nothing. My favorite part is that almost every tool has one common feature, and that is the “Export” button. This is the most powerful feature of all these tools, because by exporting the data into Excel we can sort it, filter it and model it in any way we want. Most of us use Excel on a regular basis and are familiar with the basic functions, but Excel can do way more than that. In the following article I will try to present the most common statistical techniques, and the best part is that we don’t have to memorize complicated statistical equations: everything is built into Excel!

Statistics is all about collecting, analyzing and interpreting data. It comes in very handy when decision making faces uncertainty. By using statistics, we can overcome these situations and generate actionable analysis. Statistics is divided into two major branches, descriptive and inferential.

Descriptive statistics are used when you know all the values in the dataset. For example, you take a survey of 1000 people asking if they like oranges, with two choices (Yes and No). You collect the results and you find out that 900 answered Yes and 100 answered No. The proportions are 90% Yes and 10% No. Pretty simple, right?
But what happens when we cannot observe all the data? When you know only part of your data, then you have to use inferential statistics. Inferential statistics is used when you know only a sample (a small part) of your data and you make guesses about the entire population (data). Let’s say you want to calculate the email open rate for the last 24 months, but you have data only from the last six months. Assume that out of 1000 emails, 200 were opened and 800 were not. This equates to a 20% open rate, with 80% not opened. This is true for the last six months, but it might not be true for all 24 months. Inferential statistics helps us understand how close we are to the entire population and how confident we are in this assumption.

The open rate for the sample may be 20%, but it may vary a little. Therefore, let’s consider a margin of +-3%; in this case the range is from 17% to 23%. This sounds pretty good, but how confident are we in these data? Put another way, what percentage of random samples taken from the entire population (data set) will fall in the range of 17%-23%? In statistics, the 95% confidence level is considered to be reliable. This means 95% of the samples we take from the entire population will produce an open rate of 17-23%; the other 5% will be either above 23% or below 17%. So we are 95% sure that the open rate is 20% +-3%.

The term data stands for any value that describes an object or an event, such as visitors, surveys or emails. A data set has two components: the observation unit, which is for example visitors, and the variables, which can represent the demographic characteristics of your visitors such as age, salary or education level. Population refers to every member of your group, or in web analytics all the visitors. Let’s assume 10,000 visitors. A sample is only a part of your population, based on a date range, visitors who converted, etc., but in statistics the most valuable sample is considered to be a random sample.

The data distribution is given by the frequency with which the values in the data set occur. By plotting the frequencies on a chart, with the range of the values on the horizontal axis and the frequencies on the vertical axis, we obtain the distribution curve. The most commonly used distribution is the normal distribution, or bell-shaped curve. An easy way to understand this is by considering the number of visitors a website has. For example, the number of visits is 2000/day on average, but on some days it is higher, such as 3000, and on others lower, such as 1000.

Here, probability theory comes in handy. Probability stands for the likelihood of an event happening, such as having 3,000 visitors/day, and is expressed in percentages. The most common example of probability, one that probably everybody knows, is the coin flip. A coin has two faces, heads and tails. What is the probability of getting heads when flipping a coin? Well, there are two possibilities, so 100%/2=50%.

Enough with theory; let’s get a little bit more practical. Excel is an amazing tool that can help us with statistics. It’s not the best, but we all know how to use it, so let’s dive right into it. First, install the Analysis ToolPak. Open Excel, go to Options -> Add-ins; at the bottom we will find the Manage box, hit Go -> select Analysis ToolPak -> and click OK. Now under the Data tab we will find Data Analysis. The Data Analysis tool can give you super fancy statistical information, but first let’s start with something easier.

Mean, Median, and Mode

Mean is the statistical term for average; for example, the mean or average of 4, 5, 6 is 5. How do we calculate the mean in Excel? =AVERAGE(number1,number2,etc) Mean=AVERAGE(AC16:AC21) By calculating the mean we know how much we sold on average. This information is valuable when there are no extreme values (or outliers). Why?
It looks like we sold on average $3000 worth of products, but actually we were lucky that somebody spent more on September 6. In fact we did pretty poorly during the previous six days, with an average of only $618. Excluding the extreme values from the mean can reflect a more relevant performance rate.

The median is the observation situated in the middle of the data set. For example, the median of 224, 298, 304 is 298. In order to calculate the median for a large set of data we can use the following formula: =MEDIAN(224,298,304) When is the median useful? Well, the median is useful when you have a skewed distribution. For example, you are selling candies for $3 up to $15 a bag, but you have some very expensive candies for $100 a bag that nobody really purchases on a regular basis. At the end of the month you have to make a report, and you will see that you sold mostly cheap candies and only a couple of the $100 bags. In this case calculating the median is more beneficial. The easiest way to determine when to use the median vs. the mean is by creating a histogram. If your histogram is skewed by an extreme value, then you know that the best way to go is to calculate the median.

The mode is the most common value; for example, the mode of 4,6,7,7,7,7,9,10 is 7. In Excel you can calculate the mode by using the =MODE(4,6,7,7,7,7,9,10) formula. Although this looks nice, keep in mind that Excel returns only one mode, the first one it encounters in the data. In other words, if you have to calculate the mode for the data set 2,2,2,4,5,6,7,7,7,8,9, you can see that you have two modes, 2 and 7, but Excel will show you only one of them: 2.

When can we use the mode function? Calculating the mode is beneficial only for whole numbers such as 1, 2 and 3. It is not useful for fractional numbers such as 1.744, 2.443 or 3.323, as the chance of having duplicated numbers, and therefore a mode, is very small. A great example of calculating the mode, or the most frequent number, would be a survey.
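For readers who prefer to check these numbers outside Excel, here is a minimal Python sketch of the same three measures using the standard `statistics` module. The small lists come from the examples above; `daily_sales` is a hypothetical week of sales, made up so that it reproduces the $3000 and $618 averages mentioned earlier, and is not data from the original spreadsheet.

```python
from statistics import mean, median, multimode

# Values taken from the examples in the text.
mean_example = mean([4, 5, 6])
median_example = median([224, 298, 304])
mode_example = multimode([4, 6, 7, 7, 7, 7, 9, 10])

# multimode returns every mode, so a two-mode data set shows both values.
two_modes = multimode([2, 2, 2, 4, 5, 6, 7, 7, 7, 8, 9])

# Hypothetical daily sales (assumption): six ordinary days and one big spike.
daily_sales = [600, 650, 580, 640, 610, 628, 17292]
sales_mean = mean(daily_sales)                # pulled up by the outlier
sales_median = median(daily_sales)            # barely affected by the outlier
mean_without_spike = mean(daily_sales[:-1])   # average of the six ordinary days

print("mean:", mean_example, "median:", median_example, "mode:", mode_example)
print("both modes:", two_modes)
print("sales mean:", sales_mean, "sales median:", sales_median)
print("mean without the spike:", mean_without_spike)
```

The point of the sales example is the gap between the mean and the median: one extreme day moves the mean a long way, while the median stays close to a typical day.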
Histograms

Let’s say your blog recently received hundreds of guest posts; some of them are very good, but some of them are just not that good. Maybe you want to see how many of your blog posts received 10 backlinks, 20, 30 and so on, or maybe you are interested in social shares such as tweets or likes, or simply in visits. Here we will categorize them into groups by using a visual representation called a histogram. In this example I will use visits per article as an easy example.

The way I set up my Google Analytics account is as follows. I have a profile that tracks only my blog, nothing else. If you don’t have such a profile set up yet, then you can create a segment on the fly. How do you do this? Pretty simple: Now go to Export -> CSV. Open the Excel spreadsheet and delete all the columns except for Landing Page and Visits. Now create the ranges (also called bins) that you want the articles to be categorized into. Let’s say we want to see how many articles generated 100 visits, 300, 500 and so on.

Go to Data -> Data Analysis -> Histogram -> OK
Input Range will be the visits column
Bin Range will be the groups
Output Range: click on the cell where you want your histogram to show up
Check Chart Output
Click OK

Now you have a nice histogram that shows you the number of articles categorized by visits. To make this histogram easier to understand, click on any cell in the Bin and Frequency table and sort the frequency from low to high. Analyzing the data is now even easier. Now go back and filter all the articles with 100 visits or fewer in the last month (Visits drop-down -> Number Filters -> Between… 0-100 -> OK) and update them, or promote them.

Visits by source

How valuable is this report for you? It’s pretty good but not amazing. We can see ups and downs, but how much did YouTube contribute to the total visits in February? You can drill down, but that is extra work, and it is very uncomfortable when the question arrives on a phone call with a client.
To get the most out of your graphs, create valuable self-descriptive reports. The report above is so much easier to understand. It takes more time to create, but it’s more actionable. What we can see is that in May, Facebook contributed a larger share of the total visits than usual. How come? Probably the May marketing campaign was more effective than in other months, resulting in a lot of traffic. Go back and do it again! If it was a working solution, then repeat it.

If you suspect that May is bigger than the rest of the months just by chance, then you should run a Chi-Square test to check that the increase in visits is not due to chance and that the effectiveness of your campaign is statistically supported. The actual column is the number of visits; the expected column is the mean (average) of the “actual” column. The formula of the Chi-Square test is =1-CHITEST(N10:N16,O10:O16), where N10:N16 are the values from Actual and O10:O16 the values from Expected. The result of 100% is the confidence level that you can have when considering that the work invested in each month’s campaign impacts the number of visitors coming from Facebook.

When creating metrics, make them as easy as possible to understand, and relevant to the business model. Everybody should understand your reports. The video below explains another example of the Chi-Square function pretty well: http://www.youtube.com/watch?v=UPawNLQOv-8

Moving average and linear regression for forecasting

We often see graphs like the one above. It can represent sales or visits, it doesn’t really matter; it is constantly going up and down. There is a lot of noise in the data that we probably want to eliminate to get a better understanding. The solution: moving average! This technique is sometimes used by traders for forecasting; stock prices are booming one day, but the next day they are hitting the floor. Let’s see how we can use their basic techniques to make it work for us.
Step 1: Export to Excel the number of visits/sales for a long time period, such as one or two years.

Step 2: Go to Data -> Data Analysis -> Moving Average -> OK
Input Range will be the column with the number of visits
Interval will be the number of days over which the average is calculated. Here you should create one moving average with a higher number such as 30 and another one with a smaller number such as 7.
Output Range will be the column right next to the visits column.
Repeat the steps for the interval of 7 days.

Personal preference: I left the Chart Output and Standard Errors boxes unchecked on purpose; I will create a graph later on. Your data now probably looks similar to this: Now if you select all the columns and create a line chart it will look like this: This representation has less noise, it is easier to read and it shows some trends. The green line cleans up the chart a little bit, but it reacts to almost every major event. The red line, by contrast, is more stable and shows a real trend.

At the end of the line chart you can see that it says Forecast. That is forecasted data based on previous trends. In Excel there are two ways of creating a linear regression. The first is the formula =FORECAST(x, known_y's, known_x's), where “x” stands for the date you want to forecast, “known_y's” are the visits column and “known_x's” are the date column. This technique is not that complicated, but there is an easier way to do this. By selecting the entire visits column and dragging down the fill handle, Excel will automatically forecast the following dates. Note: make sure to select the entire data set in order to generate an accurate forecast.

There is a theory about comparing a 7-day moving average and a 30-day one. As said above, the 7-day line reacts to almost every major change, while the 30-day one requires more time to change its direction.
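The two calculations described above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: `moving_average` and `linear_forecast` are hypothetical helper names, the `visits` numbers are made up, and `linear_forecast` is an ordinary least-squares line that borrows the argument order of Excel's FORECAST for readability rather than claiming to reproduce Excel's internals.

```python
def moving_average(values, window):
    """Trailing moving average; None until a full window of data is available."""
    result = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running_total -= values[i - window]
        result.append(running_total / window if i >= window - 1 else None)
    return result


def linear_forecast(x, known_ys, known_xs):
    """Predict y at x from a least-squares straight line through the known points."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    slope = (
        sum((kx - mean_x) * (ky - mean_y) for kx, ky in zip(known_xs, known_ys))
        / sum((kx - mean_x) ** 2 for kx in known_xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * x


# Hypothetical daily visits (assumption); a window of 3 keeps the output short.
# With real data you would use windows of 7 and 30 as described in the text.
visits = [100, 120, 90, 110, 130, 150, 140]
short_ma = moving_average(visits, 3)
print(short_ma)

# Points lying exactly on y = 2x + 1, so the forecast for x = 5 is easy to check.
forecast = linear_forecast(5, [3, 5, 7, 9], [1, 2, 3, 4])
print(forecast)
```

Calling `moving_average` twice on the same column, once with 7 and once with 30, gives the two smoothed lines that the chart compares.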
As a rule of thumb, when the 7-day moving average crosses the 30-day moving average, you can expect a major change that will last longer than a day or two. As you can see above, around April 6th the 7-day moving average crosses the 30-day one and the number of visits goes down; around June 6th the lines cross again and the trend turns upward. This technique is useful when you are losing traffic and you are not yet sure if it is a real trend or just a daily fluctuation.

Trendline

The same results can be achieved by using the trendline feature of Excel: right click on the wiggling line -> select Add Trendline. Now you can select the Regression Type, and you can use the Forecast feature as well. Trendlines are probably most useful for finding out if your traffic/sales are going upward, downward or are simply flat. Without the linear function we cannot confidently tell if we are doing better or not. By adding a linear trendline we can see that the slope is positive; the trendline equation explains how our trend is moving.

y=0.5516x-9541.2

X represents the number of days. The coefficient of x, 0.5516, is a positive number. This means that the trendline is going upward. In other words, with every day that passes we increase the number of visitors by about 0.5 as a trend. R^2 represents the level of accuracy of the model. Our R^2 number is 0.26, which implies that our model explains 26% of the variation. Simply said: the trend of roughly one new visitor every other day accounts for only 26% of the ups and downs in our visits; the rest is variation that the line does not explain.

Seasonal Forecasting

Christmas is coming soon, and forecasting the winter season can be beneficial, especially when your expectations are high. If you didn’t get hit by Panda or Penguin and your sales/visitors are following a seasonal trend, then you can forecast a pattern for sales or visitors. Seasonal forecasting is a technique that enables us to estimate future values of a data set that follows a recurring variation.
Seasonal datasets are everywhere: an ice cream store will be very profitable during the summer season, and a gift store can reach its maximum sales during the winter holidays. Forecasting data for the near future can be very beneficial, especially when we are planning to invest money in marketing for those seasons. The following example is a basic model, but it can be expanded into a more complex one to fit your business model.

Download the Excel forecasting example

I will break up the process into steps to make it easier to follow. The best way to implement it for your business is by downloading the Excel spreadsheet and following the steps:

Export your data (the more data you have, the better the forecast you can make!) and place the dates into column A and the sales into column B.

Calculate the index for each month and add the data in column C. In order to calculate the index, scroll down to the bottom right of the spreadsheet and you will find a table called Index. The index for Jan-2009 is calculated by dividing the sales from Jan-2009 by the average sales of the entire year 2009. Repeat the index calculation for every month of every year. In cells S38 to S51 we calculated the average index for every month. Because our seasonality repeats every 12 months, we copied the index means into column C over and over again, matching up every month.
As you can see, January of 2009 has the same index data as January 2010 and 2011.

In column D calculate the Adjusted data by dividing the monthly sales by the index: =B10/C10

Select the values from columns A, B and D and create a line chart.

Select the adjusted line (in my case the red line) and add a linear trendline; check the “Display Equation on Chart” box.

Calculate the backcasted non-seasonal data by multiplying the monthly sales by the coefficient from the trendline equation and adding the constant from the equation (column E). Once the trendline is created and the equation is displayed on the chart, the coefficient is the number that is multiplied by X, and the constant is the number that usually has a negative sign. We place the coefficient into cell E2 and the constant into cell F2.

Calculate the Backcasted Seasonal data by multiplying the index (column C) by the previously calculated data (column E).

Calculate MPE (mean percentage error) by dividing sales by Backcasted Seasonal and subtracting 1 (=B10/F10-1).

Calculate MAPE (mean adjusted percentage error) by squaring the MPE column (=G10^2).

In my case cells F50 and F51 represent the forecasted data for Nov-2012 and Dec-2012. Cell H52 represents the error margin. By using this technique we can say that in December 2012 we are going to make $22,022 +- 3.11%. Now go to your boss and show him how you can predict the future.

Standard deviation

Standard deviation tells us how much we deviate from the mean; in other words, we can interpret it as a confidence level. For example, if you have monthly sales, your daily sales will be different every day. Then you can use the standard deviation to calculate how much you deviate from the monthly average. There are two standard deviation formulas in Excel that you can use:

=STDEV when you have sample data. Avinash Kaushik explains in more detail how sampling works: http://www.kaushik.net/avinash/web-analytics-data-sampling-411/
=STDEVP when you have the entire population, in other words when you are analyzing every visitor.

My personal preference is =STDEV, just because there are cases when the JS tracking code is not executed.

Let’s see how we can apply standard deviation in our daily life. You probably see the wiggling graph in analytics daily, but it is not very intuitive. By using standard deviation in Excel you can easily visualize and better understand what is happening with your data. As you can see above, average daily visits were 501 with a standard deviation of 53. Most importantly, you can see where you exceeded the normal range, so you can go back and check which of your marketing efforts caused that spike. For the Excel document use the following link: http://blog.instantcognition.com/wp-content/uploads/2007/01/controllimits_final.xls

Correlation

Correlation is the tendency of a change in one variable to be related to a change in another variable. A common example in web analytics is the number of visitors and the number of sales: the more qualified visitors you have, the more sales you have. Dr Pete has a nice infographic explaining correlation vs. causation: http://www.seomoz.org/blog/correlation-vs-causation-mathographic

In Excel we use the following formula to determine the correlation: =CORREL(x,y)

As you can see above, we have a correlation between Visits and Sales of 0.1. What does this mean?

between 0 and 0.3 is considered weak
between 0.3 and 0.7 is normal
above 0.7 is strong

The conclusion in our case is that daily visits don’t affect daily sales, which also means that the visitors you are attracting are not qualified for conversion. You also have to consider your business sense when making a decision. Still, a correlation of 0.1 should not simply be overlooked.
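To make the correlation figure less of a black box, here is a minimal Python sketch of the Pearson correlation coefficient, which is the statistic Excel's CORREL reports. The helper names `pearson` and `strength` are hypothetical, the visit and sales numbers are made up for illustration, and the weak/normal/strong cut-offs are simply the rule of thumb listed above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def strength(r):
    """Label a correlation using the rule-of-thumb ranges from the text."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size <= 0.7:
        return "normal"
    return "strong"


# Hypothetical data (assumption): sales rise in lockstep with visits.
visits = [1, 2, 3, 4]
sales = [2, 4, 6, 8]
r = pearson(visits, sales)
print(r, strength(r))

# The 0.1 from the example above falls in the weak band.
print(strength(0.1))
```

A perfectly proportional pair like the one above gives a coefficient of 1, which is the opposite extreme from the 0.1 found for Visits and Sales.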
If you want to correlate three or more datasets you can use the correlation function from the Data Analysis tool: Data -> Data Analysis -> Correlation. Your result will look similar to this one: What we can see here is that none of the elements correlate with each other:

Sales and Visitors = correlation of 0.1
Sales and Social Shares = correlation of 0.23

Descriptive Statistics for quick analysis

Now you have a pretty good understanding of the mean, standard deviation, etc., but calculating each statistical element can take a long time. The Data Analysis tool provides a quick summary of the most common elements.

Go to Data -> Data Analysis -> Descriptive Statistics
Input Range: select the data you want to analyze
Output Range: select the cell where you want your table to be displayed
Check Summary Statistics

The result is pretty nice: You already know most of the elements, but what is new here is Kurtosis and Skewness.

Kurtosis describes how peaked the curve is around the mean and how heavy its tails are: the higher the kurtosis value, the sharper the peak and the heavier the tails. In our case the kurtosis is a very low number, which means the values are spread out fairly evenly.

Skewness explains if your data is negatively or positively skewed from a normal distribution. Now let me show you more visually what I mean:

Skewness: -0.28 (the distribution leans towards the higher values, 2500 and 3000)
Kurtosis: -0.47 (we have a very small peak deviation from the center)

These are some of the techniques that you can use when analyzing data. The biggest challenge behind statistics and Excel is applying these techniques in various situations and not being limited to visits or sales. A great example of multiple statistical approaches implemented together is Tom Anthony’s post about his Link Profile Tool. The examples above are just a small fraction of what can be done with statistics and Excel.
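Since skewness and kurtosis are the two new numbers in the Descriptive Statistics summary, here is a short Python sketch of how they can be computed. It assumes the sample-adjusted formulas that, to my understanding, Excel's SKEW and KURT functions use; the function names and the two small data sets are hypothetical and chosen only to make the behaviour easy to see.

```python
from statistics import mean, stdev


def skewness(values):
    """Sample skewness: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    m, s = mean(values), stdev(values)  # stdev is the sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


def kurtosis(values):
    """Sample excess kurtosis: negative for flat data, positive for peaked data."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least 4 values")
    m, s = mean(values), stdev(values)
    fourth_moment = sum(((x - m) / s) ** 4 for x in values)
    scale = (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_moment - correction


# Hypothetical data sets (assumption), not taken from the article's spreadsheet.
symmetric = [1, 2, 3, 4, 5]   # evenly spread, no tail on either side
right_tail = [1, 1, 1, 10]    # one large value stretches the right tail

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tail))
```

Evenly spread values give a skewness of zero and a negative kurtosis, which matches the reading above that a low kurtosis means the values are spread out rather than bunched into a sharp peak.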
If you are using other techniques that help you make faster and better decisions, I would love to hear about them in the comment section.
