Saturday, February 24, 2007

Fixing IE Content-Type Detection Issues: Output Filtering Instead Of Input Validation

[EDIT] (25/02/07): It seems that this method doesn't completely work, so please read the comments for more information; otherwise this post isn't going to do you much good.

There's been a bit of discussion over at sla.ckers.org about injecting Javascript into uploaded image files, and having IE detect the content type as text/html rather than the content-type sent by the server. For anyone who isn't familiar with the issue, I recommend you read the following post: http://www.splitbrain.org/blog/2007-02/12-internet_explorer_facilitates_cross_site_scripting Not because it's the first mention of it, but because it's the best and most technical description I've seen.

Anyway, to take a leaf out of Sylvan von Stuppe's book, I'd like to recommend a way to do (the equivalent of) output filtering, rather than input validation, to stop this issue.

First of all, let's take a look at why we would ever do input validation to stop XSS attacks. The only reason we have ever had to do input validation is to stop people inputting Javascript while still allowing them to input HTML.

In all other situations, where we don't need to allow any HTML, we can simply encode all output in the appropriate char set, and we're safe.

And there is no reason we would ever need to allow users to upload images which get interpreted as HTML files, and are therefore served as such.

So, having established (at least in my view) that output filtering is the way to go, how would we go about doing this without altering the image?

Well, in this case it's easy enough: all we need to do is use a header that IE does respect, the Content-Disposition header. We may also want to send a Content-Type header of application/octet-stream, depending on how paranoid we are and how much we are willing to (possibly) break things.

There are several ways to do this.

On Apache, the best solution is to use mod_headers to send the header for all files in a particular directory, and move all your uploads there.
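
As a rough sketch of the Apache approach, something like the following could go in the server configuration. It assumes mod_headers is loaded, and the directory path is a made-up placeholder for wherever your uploads live. The ForceType line is the optional, more paranoid part and is left commented out.

```apache
# Sketch only: requires mod_headers; the path below is hypothetical.
<Directory "/var/www/example/uploads">
    # Ask the browser to treat every file in this directory as a download.
    Header set Content-Disposition "attachment"

    # Optional: also replace the real content type with a generic one.
    # ForceType application/octet-stream
</Directory>
```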

Microsoft provides an explanation of how you can achieve the same on IIS here: http://support.microsoft.com/kb/q260519/

You can, of course, also set PHP or any other server side language as the handler for all the files in a directory, and then use the header() (or similar) function to send the Content-Disposition header to the browser.
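
To make the PHP route a little more concrete, here is a minimal sketch of a helper that builds the two headers discussed above. The function name attachment_headers and the upload path in the usage comment are hypothetical, and the filename sanitising is an assumption added for the sketch rather than part of the original method.

```php
<?php
// Sketch only: attachment_headers() and the paths below are hypothetical names.

/**
 * Build the header lines that ask the browser to download a file
 * instead of rendering it inline.
 */
function attachment_headers(string $filename): array
{
    // Drop any directory part, then replace characters that could break
    // out of the quoted filename value.
    $safe = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));

    return [
        'Content-Disposition: attachment; filename="' . $safe . '"',
        // Optional, for the more paranoid: hide the real type as well.
        'Content-Type: application/octet-stream',
    ];
}

// Possible use in a handler script (the directory is an assumption):
// $path = '/var/www/example/uploads/' . basename($_GET['file']);
// foreach (attachment_headers($path) as $line) {
//     header($line);
// }
// readfile($path);
```

Keeping the header construction in its own function means it can be checked without actually sending a response.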

Of course, this might be annoying if a user does something like right-click on an image and choose "view image", but this is a minor inconvenience IMO.

3 comments:

Anonymous said...

Note that IE will still ignore the mime type when sending an image with Content-Disposition: attachment. It will display a download dialog which asks you what you want to do with the HTML(!) file: download or view it. When the user clicks view (which he probably would do if he expects an image), the script code is executed.

kuza55 said...

Hey Andreas,

I missed this when I was doing my initial testing, but I've been able to reproduce something similar to what you've said on IE6 (I haven't been able to reproduce it in IE7).

If I have a PHP file called test.php which looks like this (test.png is the file from your article):

<?php
header("Content-Disposition: attachment; filename=\"test.png\"");
readfile("test.png");
?>

Then the first time I visit it I am presented with [Open][Save][Cancel]. If I choose cancel, then the next time I load the image the Javascript executes; this is obviously bad because someone can load the image into an iframe, and the most likely action for a user is to click cancel, at which point they just load it again.

After some investigation I've found what I think is a fix to this issue.

At the risk of absolutely killing performance you can provide an Expires header set in the past, which forces IE to not cache the image. With this implemented I have not been able to reproduce any of the effects above.

The reason I think this works is because while IE6 caches the contents of a page, it does not cache response headers, and so the second time a request is made IE simply pulls it out of its cache and displays it.

If I've found the wrong issue, please tell me how I could reproduce it, and I'll have a look at that....

kuza55 said...

I should probably note that, while I was seeing what I described when I wrote it, when someone from MS got in touch with me about something else we were unable to confirm this issue. So I'll put it down to my incompetence in testing; the solution I described in the post seems to work just fine.