Cross-Site Scripting (XSS) is one of those vulnerabilities that are easy to find but can have moderate to high impact, depending on where and how they are exploited.

The nature of this vulnerability is that it lets an attacker execute arbitrary JavaScript in an application, hence the name cross-site scripting: executing scripts on other websites.

How does XSS occur?

XSS occurs when unsanitised data is echoed into the HTML. The payload may work as-is, or the attacker may first have to escape the current tag.

Here’s an example of a payload echoed as-is:

$data = $row['comment']; // hypothetical: a value returned by some database query
echo '<p>' . $data . '</p>';

The above example is pretty straightforward. Here is an example where an attacker needs to escape the current HTML tag.

$userInput = $_POST['input']; // hypothetical: any user-supplied value
echo '<textarea attribute="' . $userInput . '"></textarea>';

Now let’s see what happens when a user enters

'"></textarea><script>alert(0)</script>

The page output becomes

<textarea attribute="'"></textarea><script>alert(0)</script>"></textarea>

The browser will read it as

<textarea attribute="'"></textarea>
<script>alert(0)</script>
"></textarea> // leftover; does not affect the injected script

We have just successfully closed the current textarea tag, and we can choose what to inject after it.
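
To see why encoding closes this hole, the same payload can be run through PHP's htmlspecialchars before it is placed in the attribute. This is only a minimal sketch: the function name renderTextarea is hypothetical, and the attribute name is copied from the example above.

```php
<?php
// Hypothetical helper: builds the textarea from the example above,
// but encodes the user input before placing it inside the attribute.
function renderTextarea(string $userInput): string
{
    // ENT_QUOTES encodes both " and ', so the value cannot close the attribute.
    $safe = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');
    return '<textarea attribute="' . $safe . '"></textarea>';
}

// The same payload as above.
$payload = '\'"></textarea><script>alert(0)</script>';
echo renderTextarea($payload), PHP_EOL;
```

With the quotes and angle brackets encoded, the payload stays inside the attribute value and the browser never sees a script tag.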

Types of XSS

Let’s start with the definitions.

There are three types of XSS.

  1. Persistent or Stored XSS
  2. Reflected XSS
  3. DOM-based XSS

Persistent XSS

Persistent XSS occurs when input is stored in a database and the payload is triggered when the application retrieves and uses this input.

Imagine signing up on Facebook using <script>alert(0)</script> as your name. This data is then stored in Facebook’s database.

When you log into Facebook, your name is requested from the database together with other resources, and your browser renders all of this information to give you the Facebook interface.

It is at this moment that the payload is executed.

What is usually stored and retrieved?

name, bio, product names, group name, country, education history, file names
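
The store-then-render flow described above could be sketched roughly as follows. A plain PHP array stands in for the database, and the variable names are made up for illustration; a real application would use actual database queries.

```php
<?php
// Stand-in for a database table (assumption: a real app would query a database).
$users = [];

// Sign-up: the name is stored exactly as submitted, so the payload persists.
$users[] = ['name' => '<script>alert(0)</script>'];

// Later request: the record is read back and rendered into the page.
foreach ($users as $user) {
    // Vulnerable: the stored payload would be echoed as live markup.
    $unsafeHtml = '<h1>Welcome, ' . $user['name'] . '</h1>';

    // Safer: encode at output time so the browser treats the name as text.
    $safeHtml = '<h1>Welcome, '
        . htmlspecialchars($user['name'], ENT_QUOTES, 'UTF-8')
        . '</h1>';

    echo $safeHtml, PHP_EOL;
}
```

Only the encoded version is echoed here; the vulnerable string is built purely for comparison.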

There is a subset of persistent XSS called blind XSS. This occurs when the payload is stored in one place and triggered somewhere else, in a part of the application the attacker cannot see.

Its name, blind XSS, indicates that the attacker cannot see the payload fire, hence a callback mechanism such as XSS Hunter is needed to inform the attacker that the payload was triggered.

Blind XSS usually occurs in applications that are meant for internal use. Logs are a good example of where it usually triggers.

Reflected XSS

Reflected XSS occurs when user input is echoed straight back in the application’s response.

Let’s take a search bar as an example. If you head over to any WordPress website, try searching for <script>alert(0)</script>. You should see the query displayed as plain text on the results page rather than executed.

Luckily, WordPress sanitises such malicious input by default; otherwise your browser would execute the payload.
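
As a rough illustration of what such sanitising looks like for a reflected value, here is a small sketch. It is not WordPress's actual code; renderSearchHeading is a hypothetical function, and the heading text is invented.

```php
<?php
// Hypothetical search handler: builds the "results for ..." heading.
function renderSearchHeading(string $query): string
{
    // Encode the reflected query so markup in it is displayed, not executed.
    return '<h2>Search results for: '
        . htmlspecialchars($query, ENT_QUOTES, 'UTF-8')
        . '</h2>';
}

// In a real page the query would come from the request
// (for example, a query-string parameter).
$query = '<script>alert(0)</script>';
echo renderSearchHeading($query), PHP_EOL;
```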

DOM-based XSS

DOM-based XSS occurs when user input is echoed into the HTML using JavaScript. DOM simply stands for Document Object Model.

DOM-based XSS is very similar to reflected XSS in that user input is reflected, but there is a slight difference.

The difference is that while reflected XSS is echoed into the HTML by a server-side language (e.g. PHP), DOM-based XSS is echoed by JavaScript in the browser (e.g. document.write).

This is significant during audits, as a DOM-based XSS payload may never appear in server logs, for example when it is carried in the URL fragment, which the browser does not send to the server.

Fixing Cross-Site Scripting

There are two main ways to fix an XSS vulnerability, both of which sanitise user input:

  • Encoding
  • Stripping

By encoding special characters, we are telling the browser “hey, this is data from a user, and it should not be executed”.

By stripping special characters, we do not give the browser the opportunity to execute anything, since special characters such as <>'" are generally required to write HTML tags and to escape the current tag.

It is important to note that while we can sanitise user input using JavaScript, sanitisation should ultimately be performed on the server side.

This is because an attacker could use an HTTP proxy such as Burp Suite to bypass the JavaScript sanitisation.

Encode or Strip

This is entirely up to your application design. However, here are some considerations you should take into account.

By encoding, you are able to keep the original version of the user input, since you can decode it to get back the original. However, if you store the encoded form in a database, extra bytes are required for each entry.

On the other hand, by stripping, you save bytes, but you trade off the integrity of your user’s input. In addition, you might affect the SEO of your website.
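
The trade-off can be checked with a small experiment. This is a sketch under assumptions: the sample string and the set of stripped characters are chosen purely for illustration.

```php
<?php
// Sample input containing several special characters (made up for illustration).
$original = 'Tom & "Jerry" <3';

// Encoding: reversible, but the encoded form takes more bytes.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

// Stripping: fewer bytes, but the removed characters cannot be recovered.
$stripped = preg_replace('/[<>\'"&]/', '', $original);

printf("original: %d bytes\n", strlen($original));
printf("encoded:  %d bytes\n", strlen($encoded));
printf("stripped: %d bytes\n", strlen($stripped));

// True when decoding the encoded form gives back the exact original input.
var_dump($decoded === $original);
```

The encoded string is longer than the original but decodes back to it exactly, while the stripped string is shorter and has permanently lost part of what the user typed.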

When to Sanitise

Here are some examples of where and when you should sanitise.

Scenario 1:

  • User input is stored in the database (sanitise before inserting into the database)
  • Pages that retrieve data from the database (sanitise before echoing to the page)

Scenario 2:

  • User input is reflected directly onto webpage (sanitise before echoing)

Scenario 3:

  • User input is passed to other pages (sanitise before passing to other pages)
  • Data is retrieved from other pages (sanitise before echoing to the page)

As a rule of thumb, you should be paranoid and sanitise the input regardless of its origin. Sanitise even if the data is from internal sources.
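
One way to apply this rule of thumb on the output side is to route every echoed value through a single helper, whatever its origin. The sketch below assumes a hypothetical helper named e() and made-up sample values; it covers only encoding at output time, not sanitising before storage.

```php
<?php
// Hypothetical output helper: every value passes through it before being echoed,
// whether it came from a form, the database, or another internal page.
function e($value): string
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Example values from different origins (all invented for illustration).
$fromDatabase = '<img src=x onerror=alert(0)>';
$fromQuery    = '"><script>alert(0)</script>';
$fromInternal = 'Report #42 <draft>';

foreach ([$fromDatabase, $fromQuery, $fromInternal] as $value) {
    echo '<p>' . e($value) . '</p>', PHP_EOL;
}
```

Because the helper treats internal data the same way as user input, a payload that slipped into an internal source is still rendered as text.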

Fixing XSS with PHP

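$untrustedData = "<script>alert(0)</script>";

// Option 1: encode special characters (quotes included) as HTML entities
$encodedData = htmlentities($untrustedData, ENT_QUOTES, 'UTF-8');

// Option 2: strip everything except letters, digits and hyphens
$strippedData = preg_replace('/[^A-Za-z0-9\-]/', '', $untrustedData);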

Fixing XSS with ASP

var untrustedData = "<script>alert(0)</script>";

// Server-side code: encode explicitly
var cleanData = Server.HtmlEncode(untrustedData);

<%-- Page markup (ASP.NET 4.0 and later): this syntax HTML-encodes its output --%>
<%: untrustedData %>