
Saturday, 7 September 2013

Sep 4th, 2013 - Sudden Drop In Traffic - A Thin Or Lack Of Original Content Ratio Issue?

Many people have reported a sudden drop in traffic to their websites since September 4th, 2013.

The Google Webmaster forum is full of related posts. A Webmaster World thread has been started. Search Engine Roundtable mentions 'early signs of a possible major Google update'. A group spreadsheet has been created. No one seems able to make sense of what is happening. There is a lot of confusion and speculation, without any definitive conclusion.

Since then, I have seen a major drop in traffic on a new website I am working on. However, the traffic for this blog has remained the same. No impact.

I am going to post relevant information and facts here as I find them. If you have any relevant or conclusive information to contribute, please do so in the comments and I will include them here. Let's try to understand what has happened.

Symptoms

Facts & Observations

  • Many owners claim they use no black hat techniques, no keyword stuffing, only original content, and legitimate incoming links.
  • Many owners say they have not made any (or any significant) modifications to their website.
  • All keywords and niches are impacted.
  • Both old and new websites are impacted.
  • Both high and low traffic websites are impacted.
  • Some blogs are also impacted.
  • It is an international issue, not specific to a country, region or language.
  • Sites with few backlinks are also penalized, not only those with many backlinks.
  • Nothing changed from a Yahoo or Bing ranking perspective.
  • One person mentions a site with thin content still ranking well.
  • At least one website with valuable content has been penalized.
  • Several sites acting as download repositories have been impacted.
  • Some brands have been impacted too.
  • So far, Google has nothing to announce.
Also:
  • In May 2013, Matt Cutts announced that the Penguin 2.0 update was aimed at getting better at fighting black hat techniques. He also announced that upcoming changes would include better identification of websites with higher quality content, higher authority and higher trust. Google wants to know if you are 'an authority in a specific space'. Link analysis will also become more sophisticated.

Example Of Impacted Websites

  1. http://www.keyguru1.com/
  2. http://allmyapps.com/
  3. http://www.techerhut.com
  4. http://domainsigma.com/
  5. http://www.knowledgebase-script.com/
  6. http://alternativeto.net/
  7. http://www.pornwebgames.com/ (adult content)
  8. http://www.fresherventure.net/
  9. http://www.casaplantelor.ro/
  10. http://www.dominicancooking.com/
  11. http://www.weedsthatplease.com/
  12. http://www.medicaltourinfo.com/
  13. http://www.safetricks.net/
  14. http://www.qosmodro.me/
  15. http://botsforgames.com/
  16. http://blog.about-esl.com/
  17. http://last-minute.bg/
  18. http://www.shine.com/
  19. http://www.rcmodelscout.com/
  20. http://itech-hubz.blogspot.nl/
  21. http://gpuboss.com/
  22. http://www.taxandlawdirectory.com/
  23. http://www.newkannada.com/
  24. http://places.udanax.org/review.php?id=33
  25. http://www.seniorlivingguide.com/
  26. http://listdose.com/
  27. http://indianquizleague.blogspot.nl/
  28. http://beginalucrativeonlinebusiness.com/
  29. http://www.teenschoolgirlsfucking.com/ (adult content)
  30. http://quickmoney365.com/
  31. http://www.orthodoxmonasteryicons.com/
  32. http://codecpack.co/
  33. http://www.filecluster.com/
  34. http://pk.ilookstyle.com/
  35. http://www.ryojeo.com/2013/08/ciptojunaedy.html

Hypotheses

  • Websites with content duplicated on other pirate websites are penalized.
  • Websites with little or no original content, or with badly written content, are penalized (thin vs. plain content ratio).
  • Websites with aggregated content have been penalized.
  • Sites having a bad backlink profile have been penalized.
  • Sites having outbound links to murky sites or link farms have been penalized.
  • Ad density is an issue.
  • Google has decided to promote brand websites.
  • This is a follow-up update to the August 21st/22nd update, at a broader scale or deeper level.
  • An update has been rolled out and contains a bug (or has a complex, unanticipated and undesirable side effect).

Analysis

Using collected information and data gathered in the group spreadsheet:
  • Average drop in traffic is around 72%
  • No one reports use of black hat techniques
  • 12.8% report use of grey hat techniques
  • 23.1% report impact before 3rd/4th September
  • 7.7% have an exact match domain (EMD)
  • 17.9% had a couple of 404 or server errors
  • 17.9% are not using AdSense
  • 30.8% admit thin content
  • 38.5% admit duplicate content
  • 25.6% admit aggregate content
  • 15.4% admit automatically generated content
  • 64.1% admit thin, duplicate, aggregate or automatically generated content
  • The range of backlinks is 10 to 5.9 million
  • The range of indexed pages is 45 to 12 million
The spreadsheet sample contains only 39 entries, which is small.
  1. The broad range for the number of backlinks seems to rule out a pure backlink (quality or amount) issue.
  2. The broad range of indexed pages points at a quality issue, rather than a quantity issue.
  3. More than 92% do not have an EMD, so this rules out a pure exact match domain issue.
  4. More than 82% did not have server or 404 issues, so this rules them out as the main cause of a quality issue.
  5. 17.9% are not using AdSense, meaning this cannot be only a 'thin content above the fold' or 'too many ads above the fold' issue.
  6. Some brand websites have been impacted. Therefore, it does not seem like Google tries to promote them over non-brand websites.
  7. Domain age, country or language are not discriminating factors.

    Best Guess

    By taking a look at the list of impacted websites and the information gathered so far, it seems like we are dealing with a Panda update where sites are delisted or very severely penalized in search rankings because of quality issues.

    These are likely due to thin content, lack of original content, duplicate content, aggregate content or automatically generated content, or a combination of these. It seems like a threshold may have been reached for these sites, triggering the penalty or demotion.

    Regarding duplicate content, there is no firm evidence that penalties have been triggered because a third-party website stole one's content. More than 60% do not report duplicate content issues.

    To summarize, the September 4th culprit seems to be a high ratio of thin or unoriginal content, leading to an overall lack of high quality content and, in turn, to a lack of trust and authority in one's specific space.

    Unfortunately, Google has a long history of applying harsh mechanical decisions to websites without providing any specific explanation. This leaves people guessing what is wrong with their websites. Obviously, many of the impacted websites are not the products of hackers or ill-willed people looking for an 'I win - Google loses' relationship.

    Some notifications could be sent in advance to webmasters who have registered with Google Webmaster Tools. If webmasters register, it can only mean they are interested in being informed (and not after the fact). This would also give them an opportunity to solve their website issues and work hand-in-hand with Google. So far, there is no opportunity or reward system to do so.

    Possible Solutions

    Someone from the Network Empire claims that Panda is purely algorithmic and that it is run from time to time. If this is true, then this might explain why no one received any notifications or manual penalty in Google Webmaster Tools, and why no one will.

    Google might just be waiting for people to correct issues on their websites and will 'restore' these sites when they pass the Panda filter again. The upside is that this update may not be as fatal as it seems.

    Assuming the best guess is correct, the following would help solve or mitigate the impact of this September 4th update:
    • Re-read Dr. Meyers' post about Fat Panda & Thin content.
    • Thin content pages should be marked as noindex (or removed from one's website) or merged into plain/useful/high quality content pages for users (see the markup sketch after this list).
    • Low quality content (lots of useless text) pages should preferably be removed from the website, or at least be marked as noindex.
    • Internal duplicate content should be eliminated by removing duplicate pages or by using rel="canonical" (canonical pages).
    • Content aggregated from other websites is not original content. Hence, removing these pages can only help (or at least, these pages should be marked as noindex).
    • A lack of valuable content above the fold should be addressed by removing excessive ads, if any.
    • Old pages not generating traffic should be marked as noindex (or removed).
    • Outbound links to bad pages should be removed (or at least marked as nofollow), especially if they do not contribute to good user experience. This helps restore credibility and authority.
    • Disavow incoming links from dodgy or bad quality websites (if any). One will lose all PageRank benefit from those links, but it will improve one's reputation.
    • Regarding Panda, it is known (and I'll post the link when I find it again) that one bad quality page can impact a whole website. So being diligent is a requirement.
    • Lorel says that he has seen improvement on his client websites after de-optimizing and removing excessive $$$ keywords.
    Something to remember:
    • Matt Cutts has confirmed that noindex pages can accumulate and pass PageRank. Therefore, using noindex may be more interesting than removing a page, especially if it has accumulated PageRank and if it has links to other internal pages.
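    To illustrate the noindex, canonical and nofollow suggestions above, here is a minimal markup sketch. The URLs are hypothetical placeholders, not references to any site listed earlier; the first two tags go in the <head> section of the relevant pages.

      <!-- Thin or low-quality page kept for users, but excluded from the index -->
      <meta name="robots" content="noindex, follow">

      <!-- Internal duplicate page pointing search engines to the preferred version -->
      <link rel="canonical" href="http://www.example.com/preferred-page.html">

      <!-- Outbound link one does not want to endorse -->
      <a href="http://www.example.com/murky-page.html" rel="nofollow">reference</a>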

    Friday, 8 March 2013

    Exiting From Google's Sandbox In A Week

    Sharing the experience acquired while trying to get a keyword-stuffed site out of Google's sandbox: I explain how I succeeded faster than with the methods described on the net so far. Not a silver bullet, but surely a boost. In less than a week, I got my site to search result position number 3 for its title keywords:

    Out Of Google Sandbox - Your World Clocks

    As a reminder, the Google Sandbox effect is observed when a site is indexed by Google, but it does not rank well or at all for keywords it should rank for. The site: command returns pages of your website, proving it is indexed and not banned, but its pages do not appear in search results.

    Google denies there is a sandbox where it would park some sites, but it acknowledges there is something in its algorithms which produces a sandbox effect for sites considered as spam. My site would qualify as spam since it was keyword stuffed.

    I had registered my site URL in Google Webmaster Tools (GWT) and noticed little to no activity. No indexing of pages and keywords. Fetch as Google would not help. I saw a spike in Crawl Stats for a couple of days, then it fell flat. The site would get no queries, yet the site: command returned its main page.

    So, I cleared my site of everything considered spam. I used the following online tools to find my mistakes:

    I used Fetch as Google in GWT again, but it did not help get it out of the Sandbox effect. I read all the posts I could find on the net about this topic. Basically, everyone recommends using white hat SEO techniques (more quality content, quality backlinks, etc...) in order to increase the likelihood Google bots will crawl your site again: "It could take months before you get out of the sandbox...!!!"

    Not true. I have found a way which showed some results in less than a week. My site is now ranked at the 3rd search result position for its niche keyword. I can see my 'carefully selected' keywords in GWT's Content Keywords page.

    So, here is the procedure:
    1. The first step is indeed to clear any spam and bad SEO practices from your site. It is a prerequisite. The following does not work if you don't perform this step with integrity.
    2. Next, make sure your site has a sitemap.xml and a robots.txt file. Make sure the sitemap is complete enough (i.e., it lists all your site's pages or at least the most important ones).
    3. Then, register your sitemap.xml in your robots.txt (see the sketch after this list). You can submit your sitemap.xml to GWT, but it is not mandatory.
    4. Use Fetch as Google in GWT to pull your robots.txt and submit the URL. This makes sure your robots.txt is reachable by Google. It avoids losing time.
    5. Make sure there is a <lastmod> tag for each page in your sitemap, and make sure you update it to a recent date when you have updated a page. This is especially important if your page contained spam! Keep updating this tag each time you modify a page.
    6. I have noticed that Google responds well to the <lastmod> tag, as long as you don't cheat with it.
    7. Wait for about a week to see unfolding results.
    8. It is as simple as that. No need for expensive SEO consulting!
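    To make steps 2, 3 and 5 concrete, here is a minimal sketch of the two files, assuming a hypothetical www.example.com domain:

      # robots.txt - served at http://www.example.com/robots.txt
      User-agent: *
      Disallow:
      Sitemap: http://www.example.com/sitemap.xml

      <!-- sitemap.xml - one <url> entry per page, with an up-to-date <lastmod> -->
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
          <loc>http://www.example.com/index.html</loc>
          <lastmod>2013-03-05</lastmod>
        </url>
      </urlset>

    The empty Disallow line blocks nothing, and the Sitemap line lets Google rediscover the sitemap every time it re-reads robots.txt.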
    My site was new and did not have quality/high PR backlinks. So the probability it would be revisited after clearing the spam was low. It is a chicken and egg problem. Nothing happened for two weeks after I had removed the spam. It is only after I applied the above technique that I obtained tangible results!

    I suspect this method works better than everything suggested so far because Google bots crawl robots.txt frequently. The sitemap is revisited more often and therefore, Google knows faster that pages have been updated. Hence, it re-crawls them faster, which increases the likelihood of proper re-indexing. No need to wait months for bots to come back. It eliminates the chicken and egg issue.

    I don't think this method would work for sites which got penalized because of bought fake backlinks or traffic. I think shaking those bad links and traffic away would be a prerequisite too. If this can't be achieved, some have suggested using a new URL. I have never tried this, because I never bought links or traffic, but I reckon this would be part of the solution.

    Why did I keyword-stuff my site in the first place? Because I was frustrated by GWT, which would not show relevant indexing data fast enough for new sites, even when clean SEO was applied. Moreover, GWT does not tell you when a site falls into the sandbox effect. Google gives you the silent treatment. This is a recipe for disaster when it comes to creating a trusting and educated relationship with publishers.

    Not everyone is an evil hacker!

    Thursday, 14 February 2013

    Dealing With Google Webmaster Tools Frustrations

    If you don't understand the mechanics behind Google Webmaster Tools (GWT, not to be confused with Google's Web Toolkit framework) and of page indexing, trying to obtain valid information about your website can be a very frustrating experience, especially if it is a new website. This has even led me to take counter-productive actions in order to work around some of GWT's flaws. This post is about sharing some experience and tips.

    First, you need to know that GWT is a very slow tool. It will take days, if not weeks, to produce correct results and information, unless your website is very popular and already well indexed. Secondly, GWT is obviously aggregating information from multiple Google systems. Each system produces its own information, and when you compare it all, it is not always coherent. Some of it is outdated or plainly out of sync.

    Understanding The Indexing Process

    1. Crawling - The first step is having Google's bots crawl your page. It is a required step before indexation. Once a page is crawled, the snapshot is stored in Google's cache. It is analyzed later for indexing by another process.
    2. Indexing - Once a page has been crawled, Google may decide to index it or not. You have no direct influence on this process. The delay can vary according to websites. Once indexed, a page is automatically available in search results (says w3d).
    3. Ranking - An indexed page always has a ranking, unless the corresponding website is penalized. In this case, it can be removed from the index.
    4. Caching - It is a service where Google stores copies of your pages. Google confirms it is the cached version of your page which is used for indexing.
    There are several reasons why a page may not be indexed, or will have a very low ranking:
    • The page falls under bad SEO practices, which includes keyword stuffing, keyword dilution, duplicate content, or low quality content.
    • The page is made unreachable in your robots.txt.
    • There is no URL link to your page and it does not appear in any sitemap known to Google.
    For the sake of simplicity, let's call a clean page "a page which does not fall under bad SEO practices, which is not blocked by your robots.txt and whose URL is known to Google bots via internal or external links or a sitemap (i.e., it is reachable)".

    Is My Page Indexed?

    Here is a little procedure to follow:
    • The site: command 
      1. Start by running the site: command against the URL of your page (with and without the www. prefix). If it returns your page, then it is indexed for sure. If not, it does not mean your page has not been indexed or that it won't be indexed soon. The site: command provides an estimation of indexed pages.
      2. You can use the site: command against the URL of your website to get an estimation of the pages Google has indexed for your site (example queries are shown after this list).
    • The cache: command
      1. If the site: command has returned your page, then the cache: command will tell you which version (i.e., snapshot) it has used (or will soon use) for indexing (or reindexing). Remember there is a delay between crawling/caching and indexing.
      2. Otherwise, if it returned nothing and the cache: command returned a snapshot of your page, it means Google bots have managed to crawl your page. Indexing may or may not happen soon, depending on Google's decision.
      3. If the cache: command still does not return your page after a couple of days or weeks, then it may indicate that you don't have a clean page.
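    For instance, for a hypothetical page at http://www.example.com/page.html, the corresponding queries typed into Google's search box would look like this:

      site:www.example.com/page.html
      site:example.com
      cache:www.example.com/page.html

    The first query checks a single page, the second estimates how many pages of the whole site are indexed, and the third returns the cached snapshot, if any.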

    What Can I Do About It?

    Here is another procedure:
    • No confirmation that your page has been crawled
      1. The first step is to make sure your page's URL is part of a sitemap submitted to Google (possibly using GWT for submission). Don't assume that Google will naturally and quickly find your page for crawling, even if it is backlinked.
      2. Double-check that your page's URL is not blocked by your robots.txt (an accidental block is sketched after these checklists).
      3. Add a link to your sitemap in your robots.txt.
      4. Avoid overusing GWT's Fetch As Google feature, as Google will penalize excessive use with less frequent visits to your site. It does not accelerate the indexing process; it just notifies Google that it should check for new or updated content. Google can be a pasha taking its time.
      5. Always prefer submitting a complete and updated sitemap over using GWT's Fetch As Google feature. You don't need to resubmit a sitemap if its URL is defined in your robots.txt. Search engines revisit robots.txt from time to time.
      6. Take a look at GWT's crawl stats. It will tell you (with a 2-3 days delay) whether Google bots are processing your site.
      7. Double-check that your page is not suffering from bad SEO practices. Such pages can be excluded from the indexing process.
      8. Be patient, it can take days, and sometimes weeks before Google reacts to your page.
      9. Check GWT's index status page, but never forget it reacts very very slowly to changes. If you are in a hurry, you may obtain faster information by running the site: and cache: commands from time to time.
    • Your page is in the cache, but no confirmation of indexation
      1. Double-check that your page is not suffering from bad SEO practices. Such pages can be excluded from Google's index.
      2. If your site contains thousands of pages, Google will often start by indexing only a subset. Typically, it will be those it thinks have a better chance of matching users' search requests. If your page is not part of them, check whether other pages of your site are indexed using your website URL in the site: command.
      3. If, after being patient, your clean page is still not being indexed, then it probably means Google does not find it interesting enough. You need to improve its content first. Next, try to apply more white hat SEO recommendations. Layout design, readability and navigability are often the culprits when content isn't.
    • Your page is in the index, but does not rank well
      1. Double-check that your page is not suffering from bad SEO practices. Such pages can be included in Google Index with a low ranking.
      2. Make sure you are using proper keywords on your page, in its title and in its meta description. Perform the traditional white hat SEO optimization tricks. If you got everything right and still don't get traffic, it means users don't find your content interesting or there is too much competition for what you offer.
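    As an illustration of the robots.txt check in the first list above, this is what an accidental block can look like, using a hypothetical /drafts/ directory:

      User-agent: *
      Disallow: /drafts/
      Sitemap: http://www.example.com/sitemap.xml

    Any page whose URL starts with /drafts/ will not be crawled and therefore cannot be properly indexed, while an empty Disallow: line blocks nothing.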

    About New Websites & Under Construction

    Because of the slowness of GWT and a lack of understanding of its mechanics, I once tried to accelerate the indexing of new websites by first submitting 'under construction' versions, stuffed with relevant keywords. It did not help at all! Not only did Google not index my sites (or it indexed them with a very bad ranking), but once I uploaded the final version a couple of weeks later, Google took weeks to (re)index them properly. Google's cache was soooo out of sync...

    I have noticed that Google gives extra premature exposure to new websites to test their success, before letting them float naturally. It also tries to find out how often your pages are updated. With a new website under construction, not only will you fail the premature exposure because there is no valuable content for users, but if weeks pass before you put the first final version of your site online, Google may decide not to come back to your site for weeks too, even if new content is uploaded in the meantime (pretty frustrating). Of course, you can use GWT's Fetch as Google feature, but there is no guarantee it will accelerate the process (at least this is what I observed).

    Nowadays, I don't register my websites in GWT prematurely. I wait until a first final version is available for production. Next, I apply all the relevant white hat SEO tricks. Then, I create a proper sitemap and robots.txt. Finally, after having uploaded everything to production, I register and submit everything to GWT and monitor the indexation process with GWT's crawl stats, together with the site: and cache: commands, until GWT starts to display coherent data. It has eliminated a lot of frustration and teeth grinding!

    Thursday, 7 February 2013

    Online Content: Quality, Ranking And Promotion

    This post is a summary of rules of thumb collected here and there, and of personal experience with content posted online. It is a living document that I will update from time to time:
    • Love the content or have a genuine passion for it! - People notice when you are being authentic and passionate, or not.
    • Content is king! - The quality of the delivered content influences your ranking much more than anything else, especially in the long term.
    • Know what the users want! - Which is different from believing you know what the customers/readers/users want. Is it valuable to them? Can they find the same or better quality on other websites?
    • Give them what they want! (*) - No matter how good your content is, if they don't want it, they won't take it!
    • Tell them where it is! (*) - If they have to dig a mine to get the gold, they will never start digging in the first place, especially if they don't know where the gold is! The famous Field of Dreams quote "If you build it, they will come!" very seldom works on the Internet.
    • Keyword selection is more complex than you think! - I have been seriously humbled by a tip given by a friend. He recommended using Market Samurai (I have no affiliation with this company). I came to understand I only had half of the keyword game rules right. I have increased this blog's traffic by 8-9% by adjusting 3 keywords. Amazing! Even if you won't use this product, DO watch the tutorial videos. It is an incredibly valuable and free SEO education!
    (*) Advice found on Black Hat World. I don't recommend implementing the dirty SEO tricks available there, since sooner or later, Google and others always implement remedies. Sites using those tricks are penalized and lose a lot of ranking. Lack of integrity never pays in the long run...