How dangerous is HTML injection?

A few years ago I believed that HTML and SQL injection vulnerabilities were headed for extinction. Thanks to object-relational mapping tools SQL injection continues to die but HTML and script injection vulnerabilities are as popular as ever.

Part of the problem stems from the “back-to-basics” approach to rendering web pages, throwing out classes and controls for string-based libraries (primitive obsession) and helpers which do not encode HTML or even offer a concise simple syntax to do so.

MonoRail was one such project but they took feedback on board and addressed the issue although I was surprised it had got as far as release candidate 2 with such a serious oversight.

Other projects have been less reactive when advised of the problem and I can’t help but wonder if I am not getting the severity of the issue across. This isn’t just an annoyance but a real security problem.

If you are not familiar with:
  • HttpUtility.HtmlEncode (.NET)
  • Server.HtmlEncode (ASP)
  • htmlentities/htmlspecialchars (PHP)
  • html_escape (Rails)
  • {! } (MonoRail Brail)

and your web apps output data then they are likely open to HTML & script injection vulnerabilities.

Vulnerable code often looks like this:

myLabel.Text = Request.Form["Something"];
Response.Write(Request.Cookies["AddedProduct"]);
<%= myDataReader[0] %>
<? php echo get_the_title() ?>

For more ASP.NET examples check out 5 signs your ASP.NET application may be vulnerable to HTML injection.

Let’s start by considering the actors involved:

Visitor to visitor

If your site stores input from an external user (visitor) and displays it to another then you could be exposed to this scenario. Many sites do this although it is not always immediately recognized – an internet banking site does not seem an obvious candidate until you consider that you may put a textual reference on payments made to another person. If you know they use a vulnerable internet banking solution…

A worst-case scenario here would be that one visitor could steal another’s login credentials and exploit whatever rights that might give him – anything from posting messages to stealing funds.

Visitor to staff

Not all sites exchange data between users but if your site collects information from visitors chances are it presents this information to staff. Internal systems used to examine it are often considered less vulnerable which is a mistake. Remember all data provided from a user should be considered to be a potential avenue for a dangerous payload, e.g. even the language-accepts or user-agent strings.

When exploited internal systems can reveal information in bulk about the users, the system and the administration accounts used to manage it. Gaining access to these details brings all the privileges those accounts have to offer which can be catastrophic.

Staff to visitor

It is easy to forget that many frauds are perpetuated by people on the inside. A staff member given the ability to present text to the user via a website has the ability to modify any page that the content is presented on which if it includes a login page (perhaps for system status messages) then capturing login details to a server of their own choice is easy.

Security operators with access to reset (but not view) passwords would find this attack particularly enticing given that they do not need to reset the users account and therefore raise any awareness. An insider can perpetuated the fraud and may be in a position to further conceal it within the organization.

Next steps?

I can envisage a sequence of steps that start with discovery of injectable systems through detection of script-enabled into form capture-and-forward and async logging of passwords through XmlHttp.

Detailing those steps would certainly raise awareness and help developers appreciate the severity of the issue but how do I make sure that information isn’t abused?

Disclosure is a double-edged sword but then you can’t have security through obscurity… I wonder how many crackers/black hackers already utilize these techniques for nefarious means.

.NET developers might like to check out the [slides from the Web Application Security talk](/blog/2007/08/16/gsdf-august-2007-postmortem) I gave at the Guernsey Software Developer Forum which demonstrates exploitable, exploits and safe alternatives for preventing HTML and SQL injection.

[)amien

4 responses

  1. Avatar for bhaarat

    so people working with java/jsp's wont have these issues?

    bhaarat 13 December 2007
  2. Avatar for Damien Guard

    No, JSP is just as susceptible to this problem as other web languages. The reason I didn't list what to use in JSP is that it appears there is no standard built-in library function to achieve this in JSP.

    Damien Guard 14 December 2007
  3. Avatar for steve

    JSF escapes HTML by default so you don't have to remember ;) You can turn it off per 'outputText' tag but it defaults, sensibly to 'true'.

    steve 14 December 2007
  4. Avatar for steve

    Oh, and if you must do it manually you can use Apache's StringEscapeUtils, which can escape many different purposes including HTML, Javascript and SQL. Apache Commons is one of the most 'common' pieces of util software Java coders should always be using.

    steve 14 December 2007