5 signs your ASP.NET application may be vulnerable to HTML injection

If you don’t encode data when using any of the following methods to output HTML, your application could be compromised by unexpected HTML turning up in the page. The consequences range from broken formatting through to remote scripts capturing and interfering with form data (XSS). Such vulnerabilities are incredibly dangerous.

Using MonoRail or Microsoft’s MVC does not make you automatically immune. Use !{ } in MonoRail’s Brail engine and the HtmlHelpers in Microsoft’s MVC to ensure correct encoding.

Just imagine post.Author contains "><script src="http://abadsite.com"></script> after an unscrupulous user entered that into a field your application uses and it got into the database. The following typical ASP.NET techniques would leave you open.

1. You use <%= %> or <%# %> tags to output data

Example showing outputting literals with <%= %> :

// Vulnerable
<p>Posted by <%= post.Author %></p>
// Secure
<p>Posted by <%= HttpUtility.HtmlEncode(post.Author) %></p>

2. You use Response.Write

Example showing writing out attributes with Response.Write and String.Format, again post.Author could contain <script>:

// Vulnerable
Response.Write(String.Format("<input type=\"text\" value=\"{0}\" />", post.Author));
// Secure
Response.Write(String.Format("<input type=\"text\" value=\"{0}\" />", HttpUtility.HtmlAttributeEncode(post.Author)));
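To see why the unencoded call is a problem, it can help to look at the strings the two variants produce. The following is a minimal console sketch, not part of the original samples: the class and method names (AttributeInjectionDemo, RenderVulnerable, RenderSecure) are hypothetical, and it assumes HttpUtility from System.Web is available to the project.

```csharp
using System;
using System.Web;

// Hypothetical demo class; the names are illustrative only.
public static class AttributeInjectionDemo
{
    // Same template as the Response.Write example above.
    private const string Template = "<input type=\"text\" value=\"{0}\" />";

    // Vulnerable: the value is dropped into the attribute as-is, so a
    // leading quote closes the attribute and the rest becomes live markup.
    public static string RenderVulnerable(string author)
    {
        return String.Format(Template, author);
    }

    // Secure: quotes and angle-bracket openers are encoded, so the value
    // stays inside the attribute.
    public static string RenderSecure(string author)
    {
        return String.Format(Template, HttpUtility.HtmlAttributeEncode(author));
    }

    public static void Main()
    {
        // The payload described at the top of the article.
        string author = "\"><script src=\"http://abadsite.com\"></script>";

        Console.WriteLine(RenderVulnerable(author));
        Console.WriteLine(RenderSecure(author));
    }
}
```

In the first line of output the script tag sits outside the input element. In the second the whole payload should remain inside the value attribute as inert text.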

3. You set HRef or Src on HtmlAnchor, HtmlImage or HtmlInputImage controls

In general the controls in the HtmlControls namespace are very well behaved with encoding, but there is a bug in the code that attempts to adjust relative URLs for the href and src attributes, which causes those properties to bypass encoding (I’ve reported this to Microsoft).

Example showing anchor HRef attribute abuse:

// Vulnerable
outputDiv.Controls.Add(new HtmlAnchor() { InnerText = "Test", HRef = post.Author } );
// Secure
outputDiv.Controls.Add(new HtmlAnchor() { InnerText = "Test", HRef = HttpUtility.HtmlAttributeEncode(post.Author) } );

4. You set the Text property of WebControls/WebForms

You would imagine the high-level WebForms controls would take care of encoding, and you’d be wrong.

Example showing the Label control being so easily taken advantage of:

// Vulnerable
outputDiv.Controls.Add(new Label() { Text = post.Author } );
// Secure
outputDiv.Controls.Add(new Label() { Text = HttpUtility.HtmlEncode(post.Author) } );

The one exception is the Text property of input controls: they put the value into an attribute and therefore call HttpUtility.HtmlAttributeEncode for you.

5. You use the LiteralControl

LiteralControl is a useful control for adding text to the output stream that doesn’t require its own tag. It also helpfully, and uncharacteristically, provides a useful constructor. Unfortunately it fails to encode the output.

Example showing the poor LiteralControl left wide open:

// Vulnerable
outputDiv.Controls.Add(new LiteralControl(post.Author));
// Secure
outputDiv.Controls.Add(new LiteralControl(HttpUtility.HtmlEncode(post.Author)));

Warning! Do not:

  1. Encode data in the database — your contaminated data will be difficult to use elsewhere and will end up double-encoded
  2. Look for script on submit — you won’t catch every combination and it might prevent valid data
  3. Trap entry with client-side code — it is trivially bypassed

Just encode the output.
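
The double-encoding problem from the first warning is easy to reproduce. The sketch below is illustrative only: the class and method names are hypothetical, the sample title is an assumed value, and it assumes HttpUtility from System.Web is available to the project.

```csharp
using System;
using System.Web;

// Hypothetical demo class; the names are illustrative only.
public static class DoubleEncodingDemo
{
    // Anti-pattern: encoding before the value is written to the database.
    public static string EncodeForStorage(string raw)
    {
        return HttpUtility.HtmlEncode(raw);
    }

    // Recommended: encoding at the point the value is written to the page.
    public static string EncodeForOutput(string stored)
    {
        return HttpUtility.HtmlEncode(stored);
    }

    public static void Main()
    {
        // Assumed sample value that legitimately contains angle brackets.
        string title = "Using List<T> for collections";

        // Raw data stored, encoded once on output.
        // Prints: Using List&lt;T&gt; for collections
        Console.WriteLine(EncodeForOutput(title));

        // Encoded data stored, then encoded again on output.
        // Prints: Using List&amp;lt;T&amp;gt; for collections
        Console.WriteLine(EncodeForOutput(EncodeForStorage(title)));
    }
}
```

A browser shows the first result as the original title. The second shows up as literal &lt;T&gt; text, which is the contamination the warning describes.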

[)amien

PS: The samples use .NET 3.5 object initializer syntax for brevity as many affected controls do not have useful constructors

12 responses to 5 signs your ASP.NET application may be vulnerable to HTML injection

  2. Avatar for Amit

    Damien, when you say “You would imagine the high-level WebForms controls would take care of encoding and you’d be wrong”, is there a list of controls that are vulnerable and a list of controls that are not?

  3. Avatar for Damien Guard

    It should be encoded regardless of where it came from or whether it is believed to be safe. The only exception is data you expect to contain HTML and are going to sanitize so that it only contains the HTML you allow.

    One example I’ve seen is blogging software that fails to encode blog post titles because they are ‘safe’ in that they are only entered by the blogging author.

    You create a blog post called “Using List<T> for collections” and it shows up as “Using List for collections”, which causes the page to fail validation and breaks the RSS feed.

  4. Avatar for James Curran

    But that’s not really 5 signs. It’s really just one sign, because <%= %>, Response.Write, HtmlAnchor et al. aren’t the problem. The problem is post.Author. If it comes from a trusted source (i.e., you typed it in yourself), it’s safe to output directly. If it comes from anywhere else (notably user input), then it must be encoded.

  5. Avatar for Damien Guard

    Lol, I think WordPress stripped part of your comment because it decided it might be dangerous. Dave was, I think, mentioning the <pages validateRequest="true" /> option in web.config that tries to prevent your application accepting script from the user.

    This is all well and good but if there are other ways to get data into your database (links with other company systems, web services, message pumps or even internal WinForms apps) then those too can be avenues for attack.

    Many attacks happen from the inside and anyone with access to the SQL box or a WinForms app could be the one putting the payload there ready for your application to deliver up to unsuspecting users.

  6. Avatar for Dave

    So…

    By saying ASP.NET does nothing for you, are you implying that putting a “” section into your web.config isn’t working anymore for some reason? It certainly isn’t fool-proof, but it gives a huge head-start getting around the issues you are describing.

  7. Avatar for ScottB

    Great article. I have always encoded entries into the db but never encoded the output from the DB. After reading this I was like “duh!! why didn’t I think of that”. I wonder if someone could make an Ajax extender for a textbox that automatically encodes the output. Thanks again, Scott

  8. Avatar for Damien Guard

    Yeah, I agree with you here and have to concede that JSF/JSP did the right thing.

    Microsoft almost got it right with the HtmlControls but for that bug I found.

  9. Avatar for steve

    And interestingly, I think shortcuts like !{ have a downside. Sure it’s quicker to type, but it does nothing to remind you that you should be using it. And it’s arguably harder to spot a missing ! when you’re reviewing code than it is to spot a missing explicit encoding/escaping method call. I don’t think shortcuts are the answer at all; making safety the default is much more robust.

  10. Avatar for steve

    I wonder why escaping (or encoding if you prefer) the HTML isn’t the default on the higher-level constructs. I can understand why the low-level ASP output methods don’t do it, but IMO higher-level frameworks should always default to safe behaviour. Having to remember to explicitly escape is a huge pain in the ass.

    I know everyone except me reading your blog hates Java, but again JSF demonstrates good practice here, by defaulting to escaping HTML. Since JSF is a common building block for most serious Java web software, the result is that most people building using it will have at least that minimum requirement covered without having to keep reminding themselves. I think the MVC framework, or some intermediate view component for ASP (as JSF is to JSP) should do the same.

  11. Avatar for Damien Guard

    !{ } is good in MonoRail. I think Ayende added it in response to my ticket about encoding being off by default and causing a problem through to components such as the SmartGrid.

    Having something similar in MVC would be great although personally I’d love it to be mapped to a method on a new HtmlViewPage class which inherits from ViewPage so that you can switch out the encoding method for different output types.

    There is some discussion about the whole issue so we’ll see what happens.

  12. Avatar for Adam Tybor

    I would love to see some type of shortcut in MVC like Brail has in MonoRail. In Brail I can write !{post.Author} and it will be HTML encoded. It would be nice if MS MVC, and ASP.NET for that matter, added something like it so encoding did not have to take a function call. Not having this will be so much more annoying in MS MVC, as you will have many more code blocks.