Redirects Using 404 Error Handling in ASP.NET 2.0 on Shared Hosting (Part 4)

This is the fourth and last in a series of blog entries on using 404 error handling to redirect obsolete HTML pages to newly created ASPX pages.

In the third post in this series, I added ASP.NET custom error page handling alongside the IIS custom error page handling for HTTP 404 errors. To understand why I needed both, read through the earlier blog entries in this series.

Redirects from IIS and ASP.NET Custom Error Pages

Now that I’ve got all 404 errors for all file types redirected to my custom error page, it’s time to set up redirection for those obsolete pages that have been replaced. To do this, I’ll need to obtain the originally requested page from the query string passed to my custom error page.

IIS custom error page redirection passes a query string that looks like this:

404;http://www.yourserver.com:80/SomeBogusPage.html

ASP.NET custom error page redirection works a bit differently, and passes a query string that looks like this:

aspxerrorpath=/SomeBogusPage.aspx

Since the one error page is used by both the IIS and the ASP.NET custom error redirection, the logic for determining whether an old-to-new page URL mapping exists has to accommodate both methods of passing information about the originally requested resource.

Here’s some code that does just that:

   private string GetRequestedPath()
   {
      string path = "unknown";   // returned when no error info is found
      string qstr = HttpUtility.UrlDecode(Request.QueryString.ToString());
      if (!string.IsNullOrEmpty(qstr))
      {
         string aspxPath = Request.QueryString["aspxerrorpath"]; // try to get asp.net error info
         if (!string.IsNullOrEmpty(aspxPath))
         {
            path = aspxPath;
         }
         else if (qstr.StartsWith("404"))                        // if none, must be IIS error
         {
            // IIS passes 404;http://host:80/path, so keep what follows the port.
            // Note this relies on the site being served on port 80.
            int start = qstr.IndexOf(":80");
            if (start != -1)
            {
               path = qstr.Substring(start + 3);
            }
         }
      }
      return path;
   }

The first thing the method does is decode the query string to make it easier to work with. Next, we see if the request is from ASP.NET, in which case there will be an aspxerrorpath query string parameter. If that doesn’t exist, the redirect came from IIS so we do a little work to separate out the requested path.

An incoming URL like http://www.myserver.com/SomeBogusPage.html results in a return value of “/SomeBogusPage.html”. A URL pointing to a page in a sub-folder, like http://www.myserver.com/prds/oldPrd.html, returns “/prds/oldPrd.html”.
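
The method above leans on the literal ":80", so it only works when IIS reports port 80. Here is a minimal sketch of a port-independent variant, assuming the query string has already been URL-decoded and has no leading "?". ErrorQueryParser and its GetRequestedPath are hypothetical names, not part of ASP.NET. The sketch hands the IIS-style value to System.Uri and takes its AbsolutePath, which drops any query string that was on the original request. It also assumes aspxerrorpath is the only parameter in the ASP.NET case.

```csharp
using System;

public static class ErrorQueryParser
{
    // Hypothetical helper: takes the decoded query string and returns
    // the originally requested path, or "unknown" if none can be found.
    public static string GetRequestedPath(string decodedQuery)
    {
        const string aspxKey = "aspxerrorpath=";
        if (string.IsNullOrEmpty(decodedQuery))
        {
            return "unknown";
        }

        // ASP.NET form: aspxerrorpath=/SomeBogusPage.aspx
        if (decodedQuery.StartsWith(aspxKey, StringComparison.OrdinalIgnoreCase))
        {
            string aspxPath = decodedQuery.Substring(aspxKey.Length);
            return aspxPath.Length > 0 ? aspxPath : "unknown";
        }

        // IIS form: 404;http://www.yourserver.com:80/SomeBogusPage.html
        if (decodedQuery.StartsWith("404;", StringComparison.Ordinal))
        {
            Uri original;
            if (Uri.TryCreate(decodedQuery.Substring(4), UriKind.Absolute, out original))
            {
                return original.AbsolutePath;   // path only, whatever the port
            }
        }
        return "unknown";
    }
}
```

Because the helper takes a plain string rather than reading the Request object, it can be exercised outside of a page, which makes it easier to check against sample query strings from both IIS and ASP.NET.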

Now that we have the visitor’s requested path (typically a web page), we can see if it’s a page for which we have a replacement. There are any number of ways to represent mappings from old paths to new URLs. You can chain if-then-else string comparisons, set up a static array or load the mappings from the database into a DataSet. Do whatever works best for you. For my purposes, since I had only a few pages to redirect, I just used a static array.
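
As a rough illustration of the static array approach, here is a minimal sketch. The RedirectMap class, the FindReplacement method and the replacement URLs are all made up for the example, so substitute your own pages. The comparison ignores case because IIS on Windows treats paths as case-insensitive.

```csharp
using System;

public static class RedirectMap
{
    // Hypothetical old-path / new-URL pairs; replace with your own.
    private static readonly string[,] Mappings =
    {
        { "/SomeBogusPage.html", "/SomeNewPage.aspx" },
        { "/prds/oldPrd.html",   "/products/newPrd.aspx" }
    };

    // Returns the replacement URL for an old path, or null if none is mapped.
    public static string FindReplacement(string requestedPath)
    {
        if (string.IsNullOrEmpty(requestedPath))
        {
            return null;
        }
        for (int i = 0; i < Mappings.GetLength(0); i++)
        {
            if (string.Equals(Mappings[i, 0], requestedPath, StringComparison.OrdinalIgnoreCase))
            {
                return Mappings[i, 1];
            }
        }
        return null;
    }
}
```

A null result means there is no replacement page, which is the case where the error page should answer with a 404 rather than a redirect.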

 

Making the Redirect SEO Friendly

ASP.NET provides the Response.Redirect method for easily redirecting to a specified URL on your domain. There’s a problem with that Redirect method, however, in that it sets an HTTP response code of 302 (found) which basically says that the resource has moved temporarily and that a substitute page was found. In general, I prefer to either return an HTTP response code of 404 indicating that the resource simply doesn’t exist or return 301 indicating that the resource has moved permanently.

Human visitors to your web site don’t care what you return. But search engine spiders do. 301 and 404 give pretty clear instructions to a search engine spider, essentially telling it “use this new page in place of that old page” or “this page no longer exists, remove it from your index” respectively. 302 sends a kind of mixed message, sort of “hey, the page isn’t here right now; it might be back later; it might not, or whatever”. Search engines like Google do not seem to like this indecision.

When an incoming URL path is an old page that has a newer replacement page, I want to return 301. I do that by calling the following utility method instead of Response.Redirect:

      public static void Redirect301(string url, bool endResponse)
      {
         System.Web.HttpContext context = System.Web.HttpContext.Current;
         if (context != null)
         {
            context.Response.Status = "301 Moved Permanently";
            context.Response.AddHeader("Location", url);
            if (endResponse)
            {
               context.Response.End();
            }
         }
      }

If I don’t find a mapping for the requested URL, I just manually set the response status to 404 by doing this:

      System.Web.HttpContext context = System.Web.HttpContext.Current;
      if (context != null)
      {
         context.Response.Status = "404 Not Found";
      }

Conclusion

I hope this series of blog entries helps some of you who are wending your way through the ins and outs of redirecting old pages to new pages, or just providing custom error pages on ASP.NET web sites. The solution I presented is far from perfect, but it seems to be working.

The one major flaw for which I'm seeking a fix is that incoming requests for ASPX resources with no permanent redirect still return a 302 response. Somehow, my manipulation of the current context response status, inserting a 404 or 301 response, does not find its way out of the page processing pipeline and back to the visitor. It works okay for non-ASPX resources. But something ASP.NET is doing is gumming up the works.

Once I figure that out (and suggestions are very welcome), I’ll post the solution.  If someone has an idea, please post a comment.

Comments

4/21/2008 11:30:49 AM #

Very informative and helpful.  Thanks very much for posting this.

Al United States

4/21/2008 11:46:41 AM #

You're welcome.  I will be posting some updates on this topic shortly.  IIS7 changes things a bit.

Andy United States

4/29/2008 8:09:51 PM #

Great stuff, saved me the headaches of wondering WTF is going in my web server.  Thanks a million!

kenn United States

4/29/2008 9:04:00 PM #

Glad it was helpful.

Andy United States

7/3/2008 4:28:08 AM #

Great article, but under "Making the Redirect SEO Friendly" where you discuss returning a 404 for missing content, shouldn't:

      if (context != null)
      {
         ...

be:
      if (context == null)
      {
         ...

?

Thanks again,

tim United Kingdom

7/13/2008 1:01:26 AM #

@tim

Nope, the code's correct.  In both cases, 301 or 404, I'm just checking to ensure that the context is not null before trying to do something with it.  It's probably overkill as I don't know of any normal situations whereby the context would be null - maybe inside an HttpHandler or HttpModule.

I'm also being a little inconsistent in that I'm not also checking that the Response object is non-null. But if there's an HttpContext, I figure there's going to be a Response object.  

Andy United States

8/5/2008 11:19:11 AM #

Thanks for the information. Please change the colors of your website; it's hard to read.

Cobus South Africa

8/6/2008 2:18:09 AM #

Thanks for the information. It's really good. The only problem I am facing is a 320-200 OK response instead of 404. I checked through Fiddler. For a bad or non-existent URL with an aspx page request, it shows my page (404.aspx) but does not return a 404 status code. Do you have a solution or workaround for this? I use IIS6.

Thanks again.

Utkarsha India

8/6/2008 11:53:53 PM #

@Utkarsha:

I have the same problem (see the last couple paragraphs of the blog entry above).  For non-existing resources, ASP.NET is overriding my override of the status code.  

I'll post a fix as soon as I find one.

Andy United States

8/12/2008 5:14:40 PM #

Have you posted info regarding IIS7 yet? I'm working on a new site and am trying to get the custom 404 working.

Thanks, I appreciate the information here.

Jeremy United States

8/12/2008 6:11:17 PM #

@Jeremy:

I haven't posted anything specific to IIS7.  But I think the general concepts here should work okay under IIS7.  They don't take advantage of any IIS7-specific features, but they should still work.  In most shared-hosting situations, which is one of the conditions the article addresses, users don't have direct access to IIS anyway - there's some form of host-specific control panel.

Andy United States

9/11/2008 3:34:42 AM #

Hello,

Thank you indeed,  quite a useful post here, saved me loads of headache...

Ike United States

9/11/2008 6:09:05 PM #

@grayfox:

You're welcome.  I'm glad it helped out some.

Andy United States