morzel.net

.net, js, html, arduino, java... no rants or clickbaits.

Easy way to fix outdated links (URL rewrite rule in Web.config)

INTRO

I’ve recently moved my site from BlogEngine.NET 2.0 to 3.3. Thanks to the great work done by the BlogEngine.NET team the migration was easy... The only serious problem I noticed was with post links ending with .aspx. For example, when Google or CodeProject had a link to a URL like this:

http://en.morzel.net/post/2014/09/24/OoB-Sonar-with-Arduino-C-JavaScript-and-HTML5-(Part-2).aspx

the post was not found. If the .aspx suffix was removed from the address:

http://en.morzel.net/post/2014/09/24/OoB-Sonar-with-Arduino-C-JavaScript-and-HTML5-(Part-2) 

everything was working fine! Fortunately fixing it didn’t require any BlogEngine code changes. It was all thanks to URL Rewrite Module 2.0 (available for IIS 7 and later) and the system.webServer/rewrite section of Web.config.

URL Rewrite is a big topic. Check out the docs at http://www.iis.net/learn/extensions/url-rewrite-module if you want to know all the details. You can even do things like setting HTTP headers or server variables! In this post I will focus on how to fix the .aspx link problem and I will also note some issues you might face while trying to set up your own URL rewrite rules…

 

SETTING THE RULES

I’ve added the following rewrite section inside the system.webServer node of the Web.config file:

<rewrite>
    <rules>
        <rule name="FixOldAspxLinks" stopProcessing="true">
            <match url="^(.*post/.+)\.aspx$" />
            <action type="Redirect" url="{R:1}" redirectType="Permanent" />
        </rule>
    </rules>
</rewrite>

It has a single rule that matches all addresses that contain post/ and end with .aspx, and triggers a redirect to the same address with the .aspx part dropped.

The rule

The rule has a name (ideally one that describes its purpose) and the stopProcessing="true" setting, which instructs IIS to skip any further rules once this one matches (yes, there's only one rule, but having stopProcessing="true" makes our intention clear).

The match

If you are familiar with regular expressions, the url="^(.*post/.+)\.aspx$" attribute should be obvious. If not, don’t worry, it’s simpler than it looks:

  • ^ – means beginning of URL
  • $ – means the end of URL
  • .* – means any character zero or more times
  • .+ – means any character at least once
  • \. – means a literal dot (in regexes . stands for any character so if we literally want to look for a dot we need to escape the special meaning by preceding it with backslash)
  • () – parentheses mark the text (a capturing group) that we are going to reference in the action element by using {R:1}

The matching expression could be written in many ways but the one I’ve used solves the problem without going overboard with URL pattern recognition…
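If you'd like to see the pattern at work outside IIS, here is a minimal Python sketch (not part of my actual setup). It assumes that Python's re module treats this simple pattern the same way the URL Rewrite regex engine does, and that IIS matches the pattern against the URL path without the leading slash and ignores case by default. The helper name strip_aspx is made up for illustration.

```python
import re

# Same pattern as in the <match url="..."> attribute.
# IGNORECASE mirrors what is assumed to be the URL Rewrite default.
PATTERN = re.compile(r"^(.*post/.+)\.aspx$", re.IGNORECASE)


def strip_aspx(path):
    """Return the text captured by group 1 (the {R:1} value),
    or None when the rule would not match the given path."""
    m = PATTERN.match(path)
    return m.group(1) if m else None


old = "post/2014/09/24/OoB-Sonar-with-Arduino-C-JavaScript-and-HTML5-(Part-2).aspx"
print(strip_aspx(old))             # post/2014/09/24/OoB-Sonar-with-Arduino-C-JavaScript-and-HTML5-(Part-2)
print(strip_aspx("archive.aspx"))  # None (no "post/" in the path)
```

Paths that don't contain post/ or don't end with .aspx are left alone, which is exactly what we want from the rule.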

The action

We want the browser to look for a new address, hence type="Redirect" is set.
The new address is specified with url="{R:1}". The {R:1} is a reference to the group captured by the matching expression; its value is the text found between the parentheses. In our case it’s everything that preceded the .aspx suffix. redirectType="Permanent" instructs the server to issue a 301 Moved Permanently response to the browser. When an HTTP client receives a permanent redirect it can remember it and use the new URL each time it sees a link to the old URL…

OK, so the above rewrite section should be all that’s needed to make the .aspx problem disappear! Doesn’t work on your machine? Read on!

 

POSSIBLE ISSUES

No URL Rewrite module installed

Before pushing any changes to the remote server I wanted to check the rewrite settings on my local IIS 7.5 on Windows 7 x64. Instead of a redirect I got HTTP Error 500.19 – Internal Server Error. The error page was useless as it didn’t give any hint about what was wrong with the config... If you face the same issue you probably don’t have the URL Rewrite module installed (it is not added by default). A quick way to find out if you have the module is to check if this file exists:

%windir%\System32\inetsrv\config\schema\rewrite_schema.xml
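If you prefer to script that check, a small Python sketch could look like the one below. The function names are hypothetical, and the C:\Windows fallback is my assumption for when the windir environment variable is not set.

```python
import ntpath
import os

# Location of the schema file relative to the Windows directory.
SCHEMA_PARTS = ("System32", "inetsrv", "config", "schema", "rewrite_schema.xml")


def rewrite_schema_path(windir):
    """Build the full path to rewrite_schema.xml using Windows path rules."""
    return ntpath.join(windir, *SCHEMA_PARTS)


def has_url_rewrite(windir=None):
    """Return True when the schema file exists under the Windows directory."""
    # Fall back to the windir environment variable; C:\Windows is an assumed default.
    windir = windir or os.environ.get("windir", r"C:\Windows")
    return os.path.isfile(rewrite_schema_path(windir))


if __name__ == "__main__":
    print("URL Rewrite schema found:", has_url_rewrite())
```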

I got the installer from here: https://www.microsoft.com/en-us/download/details.aspx?id=7435. After the module was added to IIS, the rewrite rule started to work :)

Redirect caching

The rewrite rule uses redirectType="Permanent" because we want to teach HTTP clients that the resource has moved for good, right? That's all fine unless you are in the middle of development and make changes to the rule: if the browser has already received a 301 response for a particular URL, your modified rule will not get a chance to work! To solve this problem you can clear the cache, but I prefer to have Chrome's dev tools open (with caching disabled) or to open the page in a fresh incognito window…
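Another way to sidestep the browser cache is to ask the server directly. Here's a minimal Python sketch (an assumption on my side, not something I used for this migration) based on the standard http.client module, which neither follows redirects nor remembers them. fetch_redirect is a hypothetical helper name; the host, path and port are whatever your test site uses.

```python
import http.client


def fetch_redirect(host, path, port=80):
    """Send a single GET request and return (status, Location header).

    http.client does not follow redirects and keeps no cache,
    so every call shows what the server answers right now.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

For example, assuming the site runs on localhost port 80, calling fetch_redirect("localhost", "/post/some-post.aspx") should come back with status 301 and a Location header without the .aspx suffix once the rule is active.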

Pattern testing

Regular expressions are a powerful tool but it's very easy to make a mistake while working with them. Fortunately the URL Rewrite module has its own panel (snap-in) in IIS Manager:

(Screenshot: URL Rewrite module in IIS Manager)

that lists rewrite rules used for the site:

(Screenshot: Rewrite rule in IIS Manager)

If you double-click a rule, you will see a window that lets you change the rule's properties without manually editing Web.config. Pressing the "Test pattern..." button opens a window in which you can quickly test your regular expression:

(Screenshot: Pattern test in IIS Manager)