Redirects Using 404 Error Handling in ASP.NET 2.0 on Shared Hosting (Part 4)

This is the fourth and last in a series of blog entries on using 404 error handling to redirect obsolete HTML pages to newly created ASPX pages.  You can read the earlier posts in the series at the links below:

In the third post in this series, I added ASP.NET custom error page handling to the IIS custom error page handling for HTTP 404 errors.  To understand why I needed both IIS and ASP.NET custom error page handling, read through the earlier blog entries in this series.   

Redirects from IIS and ASP.NET Custom Error Pages

Now that I’ve got all 404 errors for all file-types redirected to my custom error page, it’s time to set up redirection for those obsolete pages that have been replaced. To do this, I’ll need to obtain the originally requested page from the query string passed to my custom error page.

IIS custom error page redirection passes a query string that looks like this:

404;http://www.yourserver.com:80/SomeBogusPage.html

ASP.NET custom error page redirection works a bit differently, and passes a query string that looks like this:

aspxerrorpath=/SomeBogusPage.aspx

Since the one error page is used by both the IIS custom error redirection and the ASP.NET custom error redirection, the logic for determining whether an old-to-new page URL mapping exists has to accommodate both methods of passing information about the originally requested resource.

Here’s some code that does just that:

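   private string GetRequestedPath()
   {
      const string unknown = "unknown";
      string qstr = HttpUtility.UrlDecode(Request.QueryString.ToString());
      if (string.IsNullOrEmpty(qstr))
      {
         return unknown;
      }

      // ASP.NET custom errors pass the original path in aspxerrorpath.
      string path = Request.QueryString["aspxerrorpath"];
      if (!string.IsNullOrEmpty(path))
      {
         return path;
      }

      // Otherwise expect the IIS format: 404;http://host:80/path
      // (this assumes the site is served on port 80).
      if (qstr.StartsWith("404"))
      {
         int start = qstr.IndexOf(":80");
         if (start != -1)
         {
            return qstr.Substring(start + 3);
         }
      }

      // Neither format matched, so fall back to the default instead of null.
      return unknown;
   }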

The first thing the method does is decode the query string to make it easier to work with. Next, we see if the request is from ASP.NET, in which case there will be an aspxerrorpath query string parameter. If that doesn’t exist, the redirect came from IIS so we do a little work to separate out the requested path.

An incoming URL like http://www.myserver.com/SomeBogusPage.html results in a return value of “/SomeBogusPage.html”. A URL pointing to a page in a sub-folder, like http://www.myserver.com/prds/oldPrd.html, returns “/prds/oldPrd.html”.
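To make the two query string formats concrete, here is a minimal sketch of the same parsing written as a pure function that takes the decoded query string directly, so it can be tried outside of a page. The class and method names (RequestedPathParser, Extract) are hypothetical. Unlike the method above, it uses System.Uri to pull out the path instead of searching for ":80", on the assumption that IIS always passes an absolute URL after the "404;" prefix.

```csharp
using System;

public static class RequestedPathParser
{
    // Hypothetical helper: takes the already URL-decoded query string
    // (without the leading '?') and returns the originally requested path,
    // or "unknown" when neither format is recognized.
    public static string Extract(string decodedQuery)
    {
        const string unknown = "unknown";
        if (string.IsNullOrEmpty(decodedQuery))
        {
            return unknown;
        }

        // ASP.NET format: aspxerrorpath=/SomeBogusPage.aspx
        const string aspNetKey = "aspxerrorpath=";
        if (decodedQuery.StartsWith(aspNetKey, StringComparison.OrdinalIgnoreCase))
        {
            string path = decodedQuery.Substring(aspNetKey.Length);
            return path.Length > 0 ? path : unknown;
        }

        // IIS format: 404;http://www.yourserver.com:80/SomeBogusPage.html
        const string iisPrefix = "404;";
        if (decodedQuery.StartsWith(iisPrefix, StringComparison.Ordinal))
        {
            Uri original;
            string url = decodedQuery.Substring(iisPrefix.Length);
            if (Uri.TryCreate(url, UriKind.Absolute, out original))
            {
                // AbsolutePath leaves out the scheme, host, port and any query string.
                return original.AbsolutePath;
            }
        }

        return unknown;
    }
}
```

Calling RequestedPathParser.Extract with the IIS example above should give "/SomeBogusPage.html", and with the ASP.NET example "/SomeBogusPage.aspx". Parsing the URL this way avoids depending on the port number, at the cost of assuming the "404;" prefix never changes.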

Now that we have the visitor’s requested path (typically a web page), we can see if it’s a page for which we have a replacement. There are any number of ways to represent mappings from old paths to new URLs. You can chain if-then-else string comparisons, set up a static array, or load the mappings from a database into a DataSet. Do whatever works best for you. For my purposes, since I had only a few pages to redirect, I just used a static array.
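A static array of old-to-new pairs might look something like the sketch below. This is not the exact code from my site: the class and method names (PageRedirects, FindReplacement) are hypothetical, the first mapping assumes the clubteams.html and teams.aspx pages from the start of this series live at the site root, and the second mapping is made up purely for illustration.

```csharp
using System;

public static class PageRedirects
{
    // Each row is { old path, new URL }.
    private static readonly string[,] Mappings =
    {
        { "/clubteams.html", "/teams.aspx" },      // assumes both pages sit at the site root
        { "/prds/oldPrd.html", "/products.aspx" }  // hypothetical example mapping
    };

    // Returns the replacement URL for an obsolete path,
    // or null when no mapping exists.
    public static string FindReplacement(string requestedPath)
    {
        if (string.IsNullOrEmpty(requestedPath))
        {
            return null;
        }

        for (int i = 0; i < Mappings.GetLength(0); i++)
        {
            // Compare without regard to case, so /ClubTeams.html also matches.
            if (string.Equals(Mappings[i, 0], requestedPath, StringComparison.OrdinalIgnoreCase))
            {
                return Mappings[i, 1];
            }
        }

        return null;
    }
}
```

A null return means there is no replacement page, which is the case where the error page should report a plain 404. A linear scan is fine for a handful of entries; with a long list, a Dictionary would be the more natural choice.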

 

Making the Redirect SEO Friendly

ASP.NET provides the Response.Redirect method for easily redirecting to a specified URL on your domain. There’s a problem with Response.Redirect, however: it sets an HTTP response code of 302 (Found), which basically says that the resource has moved temporarily and that a substitute page was found. In general, I prefer to return either 404, indicating that the resource simply doesn’t exist, or 301, indicating that the resource has moved permanently.

Human visitors to your web site don’t care what you return. But search engine spiders do. 301 and 404 give pretty clear instructions to a search engine spider, essentially telling it “use this new page in place of that old page” or “this page no longer exists, remove it from your index” respectively. 302 sends a kind of mixed message, sort of “hey, the page isn’t here right now; it might be back later; it might not, or whatever”. Search engines like Google do not seem to like this indecision.

When an incoming URL path is an old page that has a newer replacement page, I want to return 301. I do that by calling the following utility method instead of Response.Redirect:

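      public static void Redirect301(string url, bool endResponse)
      {
         System.Web.HttpContext context = System.Web.HttpContext.Current;
         if (context != null)
         {
            // Send a permanent redirect instead of the 302 that Response.Redirect issues.
            context.Response.Status = "301 Moved Permanently";
            // The Location header tells the client where the resource now lives.
            context.Response.AddHeader("Location", url);
            if (endResponse)
            {
               // Stop processing the page and send the response now.
               context.Response.End();
            }
         }
      }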

If I don’t find a mapping for the requested URL, I just manually set the response status to 404 by doing this:

      System.Web.HttpContext context = System.Web.HttpContext.Current;
      if (context != null)
      {
         context.Response.Status = "404 Not Found";
      }

Conclusion

I hope this series of blog entries helps some of you who are wending your way through the ins and outs of redirecting old pages to new pages, or just providing custom error pages on ASP.NET web sites. The solution I presented is far from perfect, but it seems to be working.

The one major flaw for which I'm seeking a fix is that incoming requests for ASPX resources with no permanent redirect still return 302 responses. Somehow, my manipulation of the current context response status, inserting a 404 or 301 response, does not find its way out of the page processing pipeline and back to the visitor. It works okay for non-ASPX resources. But something ASP.NET is doing is gumming up the works.

Once I figure that out (and suggestions are very welcome), I’ll post the solution.  If someone has an idea, please post a comment.

Redirects Using 404 Error Handling in ASP.NET 2.0 on Shared Hosting (Part 1)

Introduction

I recently made some changes to an ASP.NET web site on a shared hosting plan. There were some old web pages with HTML extensions that were replaced by newer pages with ASPX extensions. For example, I had an old page called clubteams.html and a new page called teams.aspx. I wanted to make sure that visitors with links to the old pages would arrive at the new pages rather than seeing some obscure “not found” error.

This is a common enough issue and there are a number of ways to approach it on an ASP.NET website. But the devil is in the details, so they say, and it turns out that there are a lot of details involved in getting this right. In this series of blog entries, I’ll walk through the steps I took, share the lessons I learned and hopefully provide some answers for others seeking to do similar redirects in shared hosting environments.

 

Choosing an Approach

There are a number of methods for redirecting visitors from outdated web pages to their replacement pages. Here’s a list of some of the methods, grouped by required access level:

IIS Methods – Requires IIS Access

  • IIS Page Specific Redirects
  • IIS Custom Error Message Pages

ASP.NET Methods – No IIS Access Required

  • Custom HTTP Handler
  • URL Rewriting
  • ASP.NET URL Mapping
  • ASP.NET Custom Error Pages

The ASP.NET methods only work for file-types that IIS passes to ASP.NET for processing. A typical IIS configuration does not hand off HTML pages to ASP.NET. Since the incoming URLs I wanted to redirect were going to be HTML pages, that pretty much seemed to rule out using ASP.NET processing.

I next looked at the IIS methods. The web host for the web site in question did not provide a direct way to configure IIS page-specific redirects, so that was out. The host did, however, provide a nice UI for configuring IIS custom error pages for a web site.

How would using a custom error page help me redirect old pages to new pages? IIS helps us out here by handing the custom error pages a query string that looks something like this:

404;http://www.yourserver.com:80/SomeBogusPage.html

That query string provides enough information to detect the visitor’s intended destination. From that, I can match against a list of obsolete pages and, where a mapping exists, redirect the visitor to the appropriate new page.

The added benefit of using error handling to do the redirects is that my site would also gain a much more user friendly error page than the default generic error pages provided by IIS or ASP.NET.

So I now had my approach mapped out. In my next blog entry in this series, I’ll describe how I implemented the IIS custom error page.