Samples that suck: IfModifiedSince – Tales from the Crypto

Samples that suck: IfModifiedSince

I’ve been trying to improve my IFetch application’s overall performance, and it’s clear that the best thing that could be done to improve it immediately is to cache the information being returned from the BBC Radio web site, so that next time around, the application doesn’t have to reload all the information from the web, slow as it often is.

My first thought was quite simple – store the RDF (Resource Description Format) XML files in a cache directory – purge them if they haven’t been accessed in, say, a month, and only fetch them again from the web if the web page has been updated since the file was last modified.

Sadly, I wrote the code before I discovered that the BBC Radio web site acts as if all these RDF files were modified this last second.

Happily, I wrote the code before I went and looked at the documentation for the HttpWebRequest.IfModifiedSince Property.

Here’s the sample code (with source colouring and line numbers):

01    // Create a new ‘Uri’ object with the mentioned string.
02    Uri myUri =new Uri("http://www.contoso.com");           
03    // Create a new ‘HttpWebRequest’ object with the above ‘Uri’ object.
04    HttpWebRequest myHttpWebRequest= (HttpWebRequest)WebRequest.Create(myUri);
05    // Create a new ‘DateTime’ object.
06    DateTime today= DateTime.Now;
07    if (DateTime.Compare(today,myHttpWebRequest.IfModifiedSince)==0)
08    {
09        // Assign the response object of ‘HttpWebRequest’ to a ‘HttpWebResponse’ variable.
10        HttpWebResponse myHttpWebResponse=(HttpWebResponse)myHttpWebRequest.GetResponse();
11        Console.WriteLine("Response headers \n{0}\n",myHttpWebResponse.Headers);
12        Stream streamResponse=myHttpWebResponse.GetResponseStream();
13        StreamReader streamRead = new StreamReader( streamResponse );
14        Char[] readBuff = new Char[256];
15        int count = streamRead.Read( readBuff, 0, 256 );
16        Console.WriteLine("\nThe contents of Html Page are :  \n");   
17        while (count > 0)
18        {
19            String outputData = new String(readBuff, 0, count);
20            Console.Write(outputData);
21            count = streamRead.Read(readBuff, 0, 256);
22        }
23        // Close the Stream object.
24        streamResponse.Close();
25        streamRead.Close();
26        // Release the HttpWebResponse Resource.
27        myHttpWebResponse.Close();
28        Console.WriteLine("\nPress ‘Enter’ key to continue……………..");   
29        Console.Read();
30    }
31    else
32    {
33        Console.WriteLine("\nThe page has been modified since "+today);
34    }

So, what’s the problem?

First, there’s the one that should be fairly obvious – by comparing (in line 7) for equality to “DateTime.Now” (assigned in line 6), the programmer has essentially said that this sample is designed to differentiate pages modified after the run from pages modified before the run. Now this will have one of two effects – on a site where the If-Modified-Since request header works properly, all results will demonstrate that the page has not been modified; on a site where If-Modified-Since always returns that the page has been modified, it will of course always state that the page has been modified. That alone makes this not a very useful sample, even if the rest of the code was correct.

But the greater error is that the IfModifiedSince value is a request header, and yet it is being compared against a target date, as if it already contains the value of the page’s last modification. How would it get that value (at line 7), when the web site isn’t actually contacted until the call to GetResponse() in line 10?

Also irritating, in that the .NET Framework makes far too much use of exceptions, is that instead of the NotModified response type being a simple category of response, it’s an exception.

How should this code be changed?

My suggestion is as follows – let me know if I’ve screwed anything else up:

01 // Create a new ‘Uri’ object with the mentioned string.
02 Uri myUri = new Uri("http://www.google.com/intl/en/privacy.html");
03 // Create a new ‘HttpWebRequest‘ object with the above ‘Uri’ object.
04 HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
05 // Create a new ‘DateTime‘ object.
06 DateTime today = DateTime.Now;
07 today=today.AddDays(-21.0); // Test for pages modified in the last three weeks.
08 myHttpWebRequest.IfModifiedSince = today;
09 try
10 {
11     // Assign the response object of ‘HttpWebRequest‘ to aHttpWebResponse‘ variable.
12     HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
13     Console.WriteLine("Page modified recently\nResponse headers \n{0}\n", myHttpWebResponse.Headers);
14     Stream streamResponse = myHttpWebResponse.GetResponseStream();
15     StreamReader streamRead = new StreamReader(streamResponse);
16     Char[] readBuff = new Char[256];
17     int count = streamRead.Read(readBuff, 0, 256);
18     Console.WriteLine("\nThe contents of Html Page are :  \n");
19     while (false && count > 0)
20     {
21         String outputData = new String(readBuff, 0, count);
22         Console.Write(outputData);
23         count = streamRead.Read(readBuff, 0, 256);
24     }
25     // Close the Stream object.
26     streamResponse.Close();
27     streamRead.Close();
28     Console.WriteLine("\nPress ‘Enter’ key to continue……………..");
29     Console.Read();
30     // Release the HttpWebResponse Resource.
31     myHttpWebResponse.Close();
32 }
33 catch (System.Net.WebException e)
34 {
35     if (e.Response != null)
36     {
37         if (((HttpWebResponse)e.Response).StatusCode == HttpStatusCode.NotModified)
38             Console.WriteLine("\nThe page has not been modified since " + today);
39         else
40             Console.WriteLine("\nUnexpected status code " + ((HttpWebResponse)e.Response).StatusCode);
41     }
42     else
43     {
44         Console.WriteLine("\nUnexpected Web Exception " + e.Message);
45     }
46 } 

So, why are sucky samples relevant to a blog about security?

The original sample doesn’t cause any obvious security problems, except for the obvious lack of exception handling.

But other samples I have seen do contain significant security flaws, or at least behaviours that tend to lead to security flaws – not checking the size of buffers, concatenating strings and passing them to SQL, etc. It’s worth pointing out that while writing this blog post, I couldn’t find any such samples at Microsoft’s MSDN site, so they’ve obviously done at least a moderately good job at fixing up samples.

This sample simply doesn’t work, though, and that implies that the tech writer responsible didn’t actually test it in the two most obvious cases (page modified, page not modified) to see that it works.

One Response to Samples that suck: IfModifiedSince

Leave a Reply

Your email address will not be published. Required fields are marked *