Dynamically generated robots.txt and sitemap.xml files in ASP.NET

Wednesday, September 17, 2014

A robots.txt file and a sitemap are important additions to a site from an SEO point of view. You can, of course, just serve static files, but it's easy to imagine cases in which you'd want their content to be generated dynamically:

1. You are regularly adding/changing site content. Certainly on hirespace.com, we’re adding new venue pages all the time, so our sitemap changes frequently.

2. You have some kind of test/pre-production environment set up for your site that you don't want Google crawling and indexing. In this case, your robots.txt file needs to change depending on which environment you're in.

It's easy to do this in ASP.NET. In principle you just set up a controller (and, if you need one, a view) for each file and create the content dynamically inside the Index action on the controller, then set up routes to point yoursite.com/sitemap.xml and yoursite.com/robots.txt at the right actions. There's one further issue: by default, IIS will try to handle URLs containing a dot with the static file handler, so you also need to add custom handlers to your web.config, as I'll show below.

For example, I created a controller called RobotsController with the following action. It returns a plain-text response in the expected robots.txt format, whose content varies depending on the deployment environment.

public virtual ActionResult Index()
{
    string robotsResult;

    // Config.DeploymentContext is our own helper that tells us which
    // environment the app is currently running in.
    switch (Config.DeploymentContext)
    {
        case DeploymentContextType.Test:
            // Block all crawlers on the test environment.
            robotsResult = "User-agent: *\nDisallow: /";
            break;
        case DeploymentContextType.Live:
            // On the live site, only keep crawlers out of the account pages.
            robotsResult = "User-agent: *\nDisallow: /Account";
            break;
        default:
            // If in doubt, block everything.
            robotsResult = "User-agent: *\nDisallow: /";
            break;
    }

    return Content(robotsResult, "text/plain");
}
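So, for example, on the live environment the response body served at yoursite.com/robots.txt would simply be:

User-agent: *
Disallow: /Account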

Now we add the route to our RouteConfig class:

routes.MapRoute(
    name: "Robots.txt",
    url: "Robots.txt",
    defaults: new { controller = "Robots", action = "Index" });
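One thing to watch: routes are matched in the order they're registered, so this mapping needs to go in before the catch-all default route (otherwise "robots.txt" would be treated as a controller name and you'd get a 404). As a rough sketch, assuming the standard MVC project template's RouteConfig, it would sit something like this:

using System.Web.Mvc;
using System.Web.Routing;

public class RouteConfig
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

        // File-style routes go first so the catch-all default route
        // doesn't swallow them.
        routes.MapRoute(
            name: "Robots.txt",
            url: "Robots.txt",
            defaults: new { controller = "Robots", action = "Index" });

        // The standard default route from the MVC project template.
        routes.MapRoute(
            name: "Default",
            url: "{controller}/{action}/{id}",
            defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional });
    }
}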

So far so simple. The only problem now is that when you navigate to yoursite.com/robots.txt, IIS sees the dot in the URL and expects a static file. Since there is no such file, the request fails with a file not found (404) error.

I found the solution to this issue on StackOverflow (of course!) and it's dead simple. All you need to do is add a handler to your web.config (original StackOverflow answer here):

<system.webServer>
  <handlers>
    <add name="Robots-ISAPI-Integrated-4.0" path="/robots.txt" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    ...
  </handlers>
</system.webServer>

Similarly, if you’re generating a sitemap dynamically, add a route:

routes.MapRoute(
    name: "Sitemap.xml",
    url: "Sitemap.xml",
    defaults: new { controller = "Sitemap", action = "Index" });

and a handler:

<system.webServer>
  <handlers>
    <add name="Sitemap-ISAPI-Integrated-4.0" path="/sitemap.xml" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    ...
  </handlers>
</system.webServer>
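For the sitemap itself I won't reproduce our real code, but here's a rough sketch of what a SitemapController might look like. The GetVenueUrls helper and the example URLs below are hypothetical placeholders; the point is just that you build the sitemap XML inside the action and return it with an application/xml content type:

using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Web.Mvc;
using System.Xml.Linq;

public class SitemapController : Controller
{
    public virtual ActionResult Index()
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // Build a <urlset> element with one <url> entry per page we want indexed.
        var sitemap = new XElement(ns + "urlset",
            GetVenueUrls().Select(u =>
                new XElement(ns + "url",
                    new XElement(ns + "loc", u),
                    new XElement(ns + "changefreq", "weekly"))));

        // XElement.ToString() omits the XML declaration, so prepend it by hand.
        var xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + sitemap;

        return Content(xml, "application/xml", Encoding.UTF8);
    }

    private static IEnumerable<string> GetVenueUrls()
    {
        // Placeholder data; in reality these would come from your database,
        // e.g. one URL per venue page.
        yield return "https://www.example.com/";
        yield return "https://www.example.com/venues/some-venue";
    }
}

With the Sitemap.xml route and handler above in place, this action is what gets served at yoursite.com/sitemap.xml.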
And that’s it! Google and other search engine bots will pick up your dynamically generated sitemap and robots files, and you’ll reap the SEO benefits 🙂

