Come On Google … :: Google PageRank Update
More Search Optimization News….
Tags: blogs blogger google pagerank sem
Please be sure to visit our new Social Blog Network, as well as our Social Bookmark Site. Both offer services for free!
Get your latest SEO news fix at AutoPrimeMedia.com!
Search Engine Optimization is not an exact science. It takes lots of work and research. Trial and error. Read through our posts here and try to learn from our experience. We offer some of our insight… news and comment. Please feel free to share your thoughts and ideas. The site does allow "follows" so post your links.
IIS Case Folding, Robots and Results :: Social Marketing News
Posted by Jeremy Chatfield
For the last few years, I’ve been doing some SEO on Apache sites. Suddenly, this year, I’ve had a clutch of IIS sites to handle and I’m seeing some puzzling and worrying things which appear to be caused by the way that Microsoft defaults to "caseless" file systems. Worrying things as in "damaging to search engine results." I can’t find any guidance from Microsoft’s knowledge base or Live Search, nor from Yahoo! and Google Webmaster guidelines. Have I missed something?
What is Case Folding?
If I have a file called "default.asp", I can call it "DEFAULT.ASP" and "dEfAuLt.AsP" and still open it. Upper and lower case letters are treated as one.
This is required in handling domain names. The original domain name service specifications ensure that "MERJIS.COM" and "Merjis.com" and "merjis.com" all map to the same machine on the Internet. So there are no problems with inbound links, or in-site links, that refer to the web site with different cases in the domain name part of a URL.
However, the original World Wide Web Consortium spec is clearly based on Unix and Linux usage in the scientific community. Unix and Linux have case-respecting file systems. That is, "default.aspx" and "Default.aspx" are two different files.
The result is that "http://merjis.com/contact" and "http://merjis.com/Contact" are two distinct and different URLs. On a Linux system, you could have two different files to deliver the contents. But on an IIS system, although you can make the request for two different files, you are delivered the contents of a single file.
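The distinction above can be sketched in a few lines of Python. This is only an illustration, not how any particular server or crawler is implemented; the function names are made up for the example. The host part is compared without regard to case, while the path is compared either exactly (as a case-respecting system would) or folded (as a caseless file system effectively does).

```python
from urllib.parse import urlsplit

def same_url_case_respecting(a: str, b: str) -> bool:
    """Host is folded to lower case, but the path must match exactly."""
    ua, ub = urlsplit(a), urlsplit(b)
    # urlsplit(...).hostname is already lower-cased by the standard library.
    return (ua.hostname, ua.path) == (ub.hostname, ub.path)

def same_url_case_folded(a: str, b: str) -> bool:
    """Both host and path are folded, mimicking a caseless file system."""
    ua, ub = urlsplit(a), urlsplit(b)
    return (ua.hostname, ua.path.lower()) == (ub.hostname, ub.path.lower())

# Domain case never matters.
domain_only = same_url_case_respecting("http://MERJIS.COM/contact",
                                       "http://merjis.com/contact")
# Path case matters to a case-respecting comparison...
strict = same_url_case_respecting("http://merjis.com/contact",
                                  "http://merjis.com/Contact")
# ...but not once the path is folded.
folded = same_url_case_folded("http://merjis.com/contact",
                              "http://merjis.com/Contact")
```

A crawler that behaves like the first function will treat the two "contact" URLs as separate pages, even though a case-folding server returns the same content for both.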
Robot Exclusion Protocol
If you don’t want part of your web site crawled (for example, a private, members-only area), you can tell web robots to steer clear. You drop a "robots.txt" file in the root of the site with a couple of lines like:
User-Agent: *
Disallow: /private
This will tell search engine spiders like GoogleBot that you do not want Google to crawl these pages.
The problem, of course, is that like the original W3C specification for a URL, the Robot Exclusion Protocol appears to respect the case of a file name. So if you have accidentally referred to the private members area as "/Private" or "/PRIVATE", then the robots are allowed to crawl that URL. And IIS will fold the case and let the robots look at content that shouldn’t be allowed.
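To see the case sensitivity in action, here is a small check using the robots.txt parser from Python's standard library. It is a sketch under assumptions: the parser is Python's, not the one any search engine runs, and "ExampleBot" and the ".asp" file names are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The same two rules shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private
"""

def check_paths(robots_txt, paths, agent="ExampleBot"):
    """Return {path: allowed?} according to the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}

result = check_paths(ROBOTS_TXT, [
    "/private/members.asp",   # matches the Disallow rule
    "/Private/members.asp",   # different case: rule does not match
    "/PRIVATE/members.asp",   # different case: rule does not match
])
for path, allowed in result.items():
    print(path, "allowed" if allowed else "blocked")
```

Only the lower-case path is blocked. On a case-folding server all three paths return the same members page, so two of them are open to a robot that follows the rules to the letter.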
Search Rank and Results
As SEOs know, rank depends on inbound links and the link copy. So if spiders see a few references to an uppercased version of an IIS file and to a lowercased version of the same file, then there can be two different page ranks for the same page. This would tend to decrease the PageRank for both files - it’d have more PageRank if all the links went to a single URL, not two or more.
Obviously, this only becomes a problem when the search engines have both case variations of the file. So the risk becomes real if there is any evidence that the search engines return search results for two or more case variations.
So is This a Real Fear?
I am looking at web server log files for August and September 2008. I can see Yahoo! and Google spiders crawling the same file, under two different case variations. Clearly the spiders aren’t smart to these web servers being IIS and using case folding. If the spiders were smart, they wouldn’t crawl the same file under two case variations. The spiders are also clearly crawling case-respecting variations - that is, if the reserved area for members is called "/Members", then Google crawls "/Members", even if "/members" would also take it to the same place.
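A check like this is easy to automate. The sketch below assumes the requested paths have already been pulled out of the log for the spider you care about (log formats vary, so that step is left out), and the function name is hypothetical. It groups paths that differ only by case.

```python
from collections import defaultdict

def find_case_variants(paths):
    """Group crawled paths by their lower-cased form and keep only
    the groups that were requested under more than one spelling."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return {folded: sorted(variants)
            for folded, variants in groups.items()
            if len(variants) > 1}

# Paths as they might appear in a spider's requests (made-up sample).
crawled = ["/Members", "/members", "/contact", "/Members"]
variants = find_case_variants(crawled)
```

Any entry in the result is a page that the spider fetched under two or more case variations, which is the pattern described above.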
This means that, so long as all references to private areas use a consistent case, you can rely on the robot exclusion protocol to deflect the robots. The exception is when someone not under your control, such as a third party site, links to "/MEMBERS" or another case variation, which the robots are allowed to crawl.
Search Engine Results
Even worse, the log files show several instances, for different files, where search engine results have led to different pages. For example, imagine that I have a page called "uppercase", triggered by the search query "upper case." I have instances where the search engine query is the same ("upper case") but some search results lead to "/uppercase" and some lead to "/UPPERCASE".
That suggests that the search engines, as well as the spiders, do not understand that IIS folds case. The consequence appears to be that using IIS risks reducing your page rank, for reasons outside your control.
Defenses
You can defend entire private areas of the site. There is a "robots" meta tag that allows you to mark each page as being indexed or not. So by marking each page in the private area with "NOINDEX", you can keep those pages out of the search results. They may be crawled, but they shouldn’t be indexed. That will work whatever the case of the filename that was used.
However, I can’t see any simple defense, using IIS, to protect against the multiplicity of search results and the apparent weakening of rank that might follow. There are some tools similar to the Apache mod_rewrite that will rewrite URLs for IIS - allowing you to enforce a mapping to all lower case, for example.
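The logic behind a lower-case mapping is simple, whatever tool ends up enforcing it. The following is a sketch of that decision only, not configuration for any real IIS rewrite product; the function name is hypothetical, and it assumes a permanent (301) redirect is the desired response.

```python
def lowercase_redirect(path):
    """Return (status, location) when the path contains upper-case
    letters, or None when the path is already canonical.

    Only the path is handled here; a query string may carry
    case-sensitive values and should be passed through untouched.
    """
    canonical = path.lower()
    if canonical == path:
        return None           # already lower case: serve the page
    return (301, canonical)   # permanent redirect to the single spelling

mixed = lowercase_redirect("/Members/Default.asp")
clean = lowercase_redirect("/members/default.asp")
```

With a rule like this in place, every case variation of a URL answers with a redirect to one spelling, so links and crawls should converge on a single URL instead of several.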
Duplicate Penalties?
So, if the same content is served in multiple case variations and spiders don’t seem to recognise case folding, and search engines appear to multiply-index case variations… are these pages treated as duplicates?
I don’t know.
Is Page Rank affected?
I don’t know. Yet. I’m setting up some experimental sites to see if I can manipulate rank by tweaking capitalisation of links.
Do any SEOmoz readers have any experience of this problem? Am I overreacting to seeing crawling and ranking of multiple variations of the file name?
Tags: blogger wordpress themes sphinn rand fishkin pagerank
Google Clears “Abortion” As An AdWord :: Myspace News Bulletin
Google settled a suit in the UK around the issue of whether or not religious groups can buy the keyword “abortion.” Long story short: They now can (via NYT). Expect a lot more of this kind of thing going forward. Google has the responsibility of being an arbiter of…
Tags: wordpress seomoz google pagerank seo
Post Comments :: Search Engine Optimization Weekly Report
Tags: blogs wordpress myspace free hosting rand fishkin
Paul Griffiths: Search Blogger of the Day :: MSN/Live Search News
Meet Paul Griffiths, the Search Blogger of the Day. Today I’d like to highlight a post entitled Realistic Salary Expectations? Average Salaries in SEO. Paul talks about salary surveys and how they relate to real world SEO job hunting. Because the SEOVacancies blog is SEO-job focused (which is a really cool niche blog idea, btw), I trust Paul’s opinions.
Tags: blogger myspace wordpress themes digg rand fishkin
Reading, Writing, & Helping :: Social Marketing on FaceBook
Brian Clark highlights various levels of reading, and how many bloggers fail to go beyond scratching the surface.
Improved data visualization technologies are making it easier to transfer knowledge.
A few posts back I mentioned that Amazon’s Mechanical Turk would be good for some SEO processes. It looks like people are already using it to spam social media. As sock puppets rise up, many of the broad/general social media sites will get polluted by increasing amounts of spam.

My latest column on SEL was about how you have to give your SEO as much information as possible if you want them to solve your problems in an efficient manner, equating a broken website to a sick patient.
Tags: blogging myspace free blogs cylon page rank
Chrome: This Is Web OS, Make No Mistake :: StumbleUpon Update
Why launch Chrome (Google’s new “browser”) when Firefox, Google’s favored son, is doing so well? Because Google needs its own. Using a comic book to introduce it is fun, and certainly, there’s always room for new approaches to platform and interface, and Chrome looks to have a lot of…
Tags: web 2.0 free blogs cylon blogosphere reddit