URL Rewriting and SEO

November 3, 2010

After reading Scott Guthrie’s article – Fix Common SEO Problems Using the URL Rewrite Extension – here is my attempt at implementing rewriting rules for an Umbraco site. It’s essentially the same as Scott’s article, with a few tweaks for Umbraco.

  • Canonical home page – I’ve added a hard-coded canonical rewrite of the root content node (as the root node in Umbraco can be accessed via /nodename.aspx as well as /default.aspx).
  • Directory URLs – I’m using DirectoryUrls for my Umbraco site, which allows the URL with or without the “.aspx” extension, so I’ve added a rule to trim the “.aspx” extension (except in the /umbraco folder).
  • Canonical host name – This example redirects any request whose host name is not exactly yourdomain.com (typically the “www” version) to the equivalent URL without the www.
  • Trailing slash – I’m removing trailing slashes from all requests.
  • Lower case – I’m forcing all URLs to lower-case, but I’ve added exclusions for static assets (css, images, js) as well as .axd resource files. Again, I’ve excluded the /umbraco folder.

Here’s the code from my web.config:

<rewrite>
			<rules>

				<!-- SEO: Canonical home page - redirect from home page /nodename to home page with no path -->
				<!-- assumes root node is called "Home" -->
				<rule name="Canonical home page" stopProcessing="true">
					<match url="^(home|home\.aspx|default\.aspx)$" />
					<action type="Redirect" redirectType="Permanent" url="/" />
				</rule>

				<!-- SEO: Using Directory URLs, so force trim all .aspx -->
				<!-- exclude umbraco folder -->
				<rule name="Trim aspx for directory URLs" stopProcessing="true">
					<match url="(.*)\.aspx$" />
					<conditions>
						<add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
					</conditions>
					<action type="Redirect" redirectType="Permanent" url="{R:1}" />
				</rule>

				<!-- SEO: Canonical host name - consistent use/absence of www. -->
				<!-- in this case, we are redirecting to host name without www -->
				<rule name="Canonical host name" stopProcessing="true">
					<match url="(.*)" />
					<conditions>
						<add input="{HTTP_HOST}" pattern="^yourdomain\.com$" negate="true" />
					</conditions>
					<action type="Redirect" redirectType="Permanent" url="http://yourdomain.com/{R:1}" />
				</rule>

				<!-- SEO: Remove trailing slash from URLs -->
				<rule name="Remove trailing slash" stopProcessing="true">
					<match url="(.*)/$" />
					<conditions>
						<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
						<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
					</conditions>
					<action type="Redirect" redirectType="Permanent" url="{R:1}" />
				</rule>

				<!-- SEO: Force lower-case for URLs -->
				<!-- exclude umbraco folder, and all static requests to images, css, js and axd resource files -->
				<rule name="LowerCaseRule1" stopProcessing="true">
					<match url=".*[A-Z].*" ignoreCase="false" />
					<conditions>
						<add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
						<add input="{URL}" pattern="^.*\.(axd|css|js|jpg|jpeg|png|gif)$" negate="true" ignoreCase="true" />
					</conditions>
					<action type="Redirect" redirectType="Permanent" url="{ToLower:{URL}}" />
				</rule>

			</rules>

		</rewrite>

These are all 301 permanent redirects.

Feel free to suggest improvements or amendments – I’m trying to get to a “best-practice” approach, so all advice is welcome!

Cheers,

Mike

33 Comments
  1. Trailing slashes – I would INCLUDE them, not remove them. IE will automatically add it, and in Chrome, if you copy/paste the URL, it will include the trailing slash. If you request a URL without the trailing slash, the request is actually:

    HTTP GET /

    The correct syntax is to include the trailing slash. Great article by the way – nice work :)

    • Interesting point – my understanding was that it didn’t matter too much whether you used the trailing slash or not, as long as all your links are consistent.

      There’s also a school of thought that the trailing slash denotes a folder (i.e. another level of depth) and that deeper content can be detrimental.

      Anyone got a definitive position on this?

      Mike

  2. Great approach to get a “best-practice-guide” for Umbraco SEO!

    One small comment about the “Canonical host name” rule: I’ve heard some SEO gurus recommend redirecting to the www version instead.

    The reason is that you lose some link power (even though it’s not much) on each redirect, and most people linking to your page will spontaneously use the www version.

    /Fredrik

    • Agreed – I had originally set up domain.local on my dev machine, but for a live site I agree the www would be preferable.

  3. Roland

    @Mike: The trailing slash is needed, as Google’s crawler treats any URL as if it contained a trailing slash.
    @Fredrik Sewén: That’s right. Google Web Master Tool explains it:
    http://www.google.com/support/webmasters/bin/answer.py?answer=55281&hl=en

    I have just used URL rewriting for a website I am working on. I started from the Gu’s recommendations. However, using DirectoryUrls and trailing slashes didn’t work at first.

    Moreover, I ran into some issues in the back-office :o It is funny you didn’t have any.

    I will post the relevant part of my web.config later.

  4. Roland

    Here is mine:

    <rewrite>
                <rules>
                    <clear />
                    <rule name="CanonicalHostNameRule1">
                        <match url="(.*)" />
                        <conditions>
                            <add input="{HTTP_HOST}" pattern="^www\.mydomain\.com$" negate="true" />
                        </conditions>
                        <action type="Redirect" url="http://www.mydomain.com/{R:1}" />
                    </rule>
                    <rule name="Default Document URL Rewrite" enabled="true" stopProcessing="true">
                        <match url="(.*?)/?Default\.aspx$" />
                        <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
                        <action type="Redirect" url="{R:1}/" />
                    </rule>
                    <rule name="LowerCaseRule1" enabled="true" stopProcessing="true">
                        <match url="[A-Z]" ignoreCase="false" />
                        <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                            <add input="{URL}" pattern="\.axd" negate="true" />
                            <add input="{URL}" pattern="\.asmx" negate="true" />
                            <add input="{URL}" pattern="\.jpg" negate="true" />
                            <add input="{URL}" pattern="\.png" negate="true" />
                            <add input="{URL}" pattern="\.gif" negate="true" />
                            <add input="{URL}" pattern="\.js" negate="true" />
                            <add input="{URL}" pattern="/Base" negate="true" />
                            <add input="{URL}" pattern="cdv=1" negate="true" />
                        </conditions>
                        <action type="Redirect" url="{ToLower:{URL}}" />
                    </rule>
                    <rule name="AddTrailingSlashRule1" enabled="true" stopProcessing="true">
                        <match url="(.*[^/])$" ignoreCase="true" />
                        <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                            <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
                            <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                            <add input="{URL}" pattern="\.axd" negate="true" />
                            <add input="{URL}" pattern="\.asmx" negate="true" />
                            <add input="{URL}" pattern="\.aspx" negate="true" />
                            <add input="{URL}" pattern="\.mp3" negate="true" />
                        </conditions>
                        <action type="Redirect" url="{R:1}/" />
                    </rule>
                </rules>
                <outboundRules>
                    <preConditions>
                        <preCondition name="ResponseIsHtml1">
                            <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
                        </preCondition>
                    </preConditions>
                </outboundRules>
            </rewrite>
    

    Mine differs slightly from yours in the trailing-slash and lower-case rules.

  5. I’ve tweaked my code based on some of the above feedback – here is what I am currently using…

    <rewrite>
    			<rules>
    
    				<!-- SEO: Canonical host name - consistent use of www. -->
    				<!-- better to use the www version -->
    				<rule name="Canonical host name" stopProcessing="true">
    					<match url="(.*)" />
    					<conditions>
    						<add input="{HTTP_HOST}" pattern="^www\.flag10\.local$" negate="true" />
    					</conditions>
    					<action type="Redirect" redirectType="Permanent" url="http://www.flag10.local/{R:1}" />
    				</rule>
    
    				
    				<!-- SEO: Canonical home page - redirect from home page /nodename to home page with no path -->
    				<!-- assumes root node is called "Home" -->
    				<rule name="Canonical home page" stopProcessing="true">
    					<match url="^(home|home\.aspx|default\.aspx)$" />
    					<action type="Redirect" redirectType="Permanent" url="/" />
    				</rule>
    
    
    				<!-- SEO: Using Directory URLs, so force trim all .aspx -->
    				<!-- exclude umbraco folder -->
    				<rule name="Trim aspx for directory URLs" stopProcessing="true">
    					<match url="(.*)\.aspx$" />
    					<conditions>
    						<add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
    					</conditions>
    					<action type="Redirect" redirectType="Permanent" url="{R:1}/" />
    				</rule>
    
    
    				<!-- SEO: Add trailing slash to URLs -->
    				<!-- better for SEO to *have* the trailing slash -->
    				<rule name="Add trailing slash" stopProcessing="true">
    					<match url="(.*[^/])$" ignoreCase="true" />
    					<conditions>
    						<add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    						<add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
    						<add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
    						<add input="{URL}" pattern="^.*\.(asp|aspx|axd|asmx|css|js|jpg|jpeg|png|gif|mp3)$" negate="true" ignoreCase="true" />
    						<add input="{URL}" pattern="/Base" negate="true" />
    						<add input="{URL}" pattern="cdv=1" negate="true" />
    					</conditions>
    					<action type="Redirect" redirectType="Permanent" url="{R:1}/" />
    				</rule>
    
    				
    				<!-- SEO: Force lower-case for URLs -->
    				<!-- exclude umbraco folder, and all static requests to images, css, js and axd resource files -->
    				<rule name="LowerCaseRule1" stopProcessing="true">
    					<match url=".*[A-Z].*" ignoreCase="false" />
    					<conditions>
    						<add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
    						<add input="{URL}" pattern="^.*\.(axd|asmx|css|js|jpg|jpeg|png|gif|mp3)$" negate="true" ignoreCase="true" />
    						<add input="{URL}" pattern="/Base" negate="true" />
    						<add input="{URL}" pattern="cdv=1" negate="true" />
    					</conditions>
    					<action type="Redirect" redirectType="Permanent" url="{ToLower:{URL}}" />
    				</rule>
    
    			</rules>
    			
    		</rewrite>
    
  6. Excellent work, I will use this as inspiration on my Umbraco installation for sure.

  7. Phil Jackson

    Hi, This is exactly what I’m looking for to implement SEO requirements for our website. But where do I add this in? In the web.config? And if so, in what section? Thanks.

    • Hi Phil,

      Yes, it goes in the web.config here:

      <configuration>
           <system.webServer>
                <rewrite>
                     <rules>
                          <!-- in here -->
                     </rules>
                </rewrite>
           </system.webServer>
      </configuration>
      
  8. Phil

    Thanks for your reply, Mike. I put the rules in there but had no luck. They just aren’t getting hit for some reason, and as a result my URLs do not get the trailing slash and are not forced to lowercase.

    I am testing this by accessing my site through the Cassini web server via Visual Studio. Could there be something else I may be missing in the setup/configuration?

    • I think you need IIS7 with the URL Rewrite module installed for this to work. As far as I know, you can’t use Rewrite with the Cassini web server.

  9. Why not use the built-in UrlRewriting.config?

    • I haven’t tried that (I used the URL Rewrite module in IIS, which automatically writes into the web.config). Would be interested to hear if anyone can replicate these rules in UrlRewriting.config…

  10. For “Canonical host name” I’d use this regex with negate=”true”, so you don’t have to alter ‘yourdomain.com’ with every project:

    ^www\..+
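
    As a rough sketch of how that suggestion might look as a complete rule (the rule name is illustrative, and building the redirect target from {HTTP_HOST} is an assumption, not something stated above):

    <rule name="Canonical host name (generic)" stopProcessing="true">
        <match url="(.*)" />
        <conditions>
            <!-- negate: fire only when the host does NOT already start with www. -->
            <add input="{HTTP_HOST}" pattern="^www\..+" negate="true" />
        </conditions>
        <!-- assumption: prefix whatever host was requested with www. -->
        <action type="Redirect" redirectType="Permanent" url="http://www.{HTTP_HOST}/{R:1}" />
    </rule>

    Note that a rule like this would also send localhost, bare IP addresses and other subdomains to a www-prefixed host, so it would probably need extra conditions on a dev machine.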

  11. Yes, with the IIS UrlRewrite plug-in it is ;-)

  12. Bleh, that went well…
    Attempt #2:

    http://snipt.org/okppn/

  13. Well, about my last comment, I got it working by using UrlRewritingNet:

  14. And once again my 2nd attempt:
    http://snipt.org/pZnn/

  15. I came across another issue where preview mode breaks – it’s because /umbraco/dialogs/preview.aspx redirects to (e.g.) /1082.aspx (where 1082 is the node ID), and the rewrite rule then tries to strip the .aspx off and add a trailing slash…

    The solution is to add the following into the two affected rules (trim .aspx and add trailing slash):

    <add input="{REQUEST_URI}" pattern="^/([0-9]+)\.aspx" negate="true" />
    

    Current set up can be seen here:
    http://snipt.net/m1ketayl0r/umbraco-seo-url-rewriting
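
    For illustration, the trim .aspx rule from comment 5 with the extra condition in place would look something like this (a sketch, not necessarily identical to the linked snippet; the dot is escaped here so that it only matches a literal .aspx):

    <rule name="Trim aspx for directory URLs" stopProcessing="true">
        <match url="(.*)\.aspx$" />
        <conditions>
            <add input="{REQUEST_URI}" pattern="^/umbraco/" negate="true" />
            <!-- skip preview requests such as /1082.aspx (numeric node ID) -->
            <add input="{REQUEST_URI}" pattern="^/([0-9]+)\.aspx" negate="true" />
        </conditions>
        <action type="Redirect" redirectType="Permanent" url="{R:1}/" />
    </rule>

    The same condition would go into the conditions block of the add trailing slash rule.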

  16. Thanks for the heads up Mike :-)
    Here’s the edited UrlRewriting.NET rule:
    http://snipt.org/tnolj/

  17. Chris Tarba

    Thanks, this seems to be the only reference that I can find that refers to excluding directories within rules. I am trying to force my site to load under HTTP, except for a couple of sub-directories with forms that must be HTTPS. Links to files in those sub-directories use HTTPS and SSL is working fine. I apologise if I am pushing my luck here, and if I am, please ignore me. But if you are prepared to look at this, my code is below. I have used your “excluded directory” syntax but it seems to be doing the opposite, i.e. enforcing the rule only on the directory which I am trying to exclude. Should my match URL be defined differently? Thanks and here’s hoping xxx

    • Unfortunately (and irritatingly) pasted code doesn’t work in comments – could you email me the code at mike (at) miketaylor (dot) eu, or maybe post it on snipt.net?

      Thanks,

      Mike
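
    Without the original code it is hard to say what went wrong, but an exclusion of this kind generally hinges on negate="true" on the folder condition; leaving it off makes the rule apply only to those folders. A minimal sketch, in which the rule name and the folder names (secure-forms, checkout) are hypothetical placeholders:

    <rule name="Force HTTP outside secure folders" stopProcessing="true">
        <match url="(.*)" />
        <conditions>
            <!-- only act on requests that arrived over HTTPS -->
            <add input="{HTTPS}" pattern="^ON$" />
            <!-- hypothetical folder names; negate="true" is what makes this an exclusion -->
            <add input="{REQUEST_URI}" pattern="^/(secure-forms|checkout)/" negate="true" />
        </conditions>
        <!-- temporary redirect chosen here so browsers do not cache it while testing -->
        <action type="Redirect" redirectType="Found" url="http://{HTTP_HOST}/{R:1}" />
    </rule>

    Static assets requested by the HTTPS pages from outside those folders would also be redirected to HTTP, so they may need their own exclusion.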

  18. Dave Stringer

    Hello, I’m just wondering, what version of Umbraco are you doing this for? In my UrlRewriting.config I see

    <!--
    <rule….

    And the syntax above throws errors.

    Please help!

    • I don’t think your code has pasted correctly – but I think I can answer your question anyway. The code shown in this post uses the URLRewrite module in IIS7 to do the rewriting, not Umbraco’s built-in UrlRewriting.config engine, which has a different syntax. A similar approach may be possible using UrlRewriting.config but I used IIS7 just because it was available and easier for me.

  19. Hi, I added this code:

    It works, but postbacks do not. I am working on an ASP.NET application with a button that displays results when the user presses it. What can I do about this problem? Thanks.

  20. This was extremely helpful. Seems to work like a charm with Umbraco 4.7 also. Thanks a lot :-)

  21. Thanks for your response. What I am trying to do is take the old URL http://www.mydomain.com/Villa.aspx?propertyid=2071 and redirect it to http://www.mydomain.com/Villa/2071.

    With the current code I am using it re-directs to http://www.mydomain.com/villa/?propertyid=2071.

    I cannot remove the question mark.
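
    One approach that might work here, sketched under assumptions (the rule name is illustrative, and the query string is assumed to contain only a numeric propertyid): capture the ID with a condition on {QUERY_STRING} and switch off appendQueryString, which is what re-attaches the ?propertyid=… part by default.

    <rule name="Villa property redirect" stopProcessing="true">
        <match url="^villa\.aspx$" />
        <conditions>
            <!-- capture the numeric id from the query string as {C:1} -->
            <add input="{QUERY_STRING}" pattern="^propertyid=([0-9]+)$" />
        </conditions>
        <!-- appendQueryString="false" stops the original query string being re-attached -->
        <action type="Redirect" redirectType="Permanent" url="/Villa/{C:1}" appendQueryString="false" />
    </rule>

    A rule like this would need to sit above the generic trim .aspx rule so that it is evaluated first. If the lower-case and trailing slash rules are also active, they would redirect the result again to /villa/2071/. Something on the site still has to serve the new URL.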

Trackbacks & Pingbacks

  1. Teknisk SEO af Umbraco
