Cross-Site Scripting (XSS) is one of those vulnerabilities that are easy to find yet can have a moderate to high impact, depending on where and how they are exploited.
How does XSS occur?
XSS occurs when unsanitised data is echoed into the HTML of a page. The payload may work exactly as it is echoed, or the attacker may first have to break out of the current tag.
Here’s an example of a payload echoed as is:
$data = "<script>alert(0)</script>"; // stand-in for the result of some database query
echo '<p>' . $data . '</p>';
The above example is pretty straightforward. Here is an example where the attacker needs to break out of the current HTML tag.
$userInput = $_GET['input']; // user input (the parameter name is illustrative)
echo '<textarea attribute="' . $userInput . '"</textarea>';
Now let’s see what happens when a user enters
'"></textarea><script>alert(0)</script>
The application will read it as
<textarea attribute="'"></textarea>
<script>alert(0)</script>
"</textarea> // ignored
We have just successfully closed the current textarea tag and we can choose what to type after that.
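The breakout described here can be reproduced in a few lines. This is a minimal sketch, not a definitive fix: `renderTextarea()` is a hypothetical helper introduced only for illustration, and it assumes the value ends up inside a double-quoted HTML attribute.

```php
<?php
// Hypothetical helper (not from the original post): builds the textarea markup.
function renderTextarea(string $userInput, bool $encode): string
{
    if ($encode) {
        // ENT_QUOTES encodes both " and ', so the value cannot close the attribute.
        $userInput = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');
    }
    return '<textarea attribute="' . $userInput . '"></textarea>';
}

// The payload first closes the attribute and the tag, then opens a script tag.
$payload = '\'"></textarea><script>alert(0)</script>';

$unsafe = renderTextarea($payload, false); // still contains a live <script> tag
$safe   = renderTextarea($payload, true);  // payload stays inside the attribute value

echo $unsafe, PHP_EOL;
echo $safe, PHP_EOL;
```

In the encoded version the quotes and angle brackets reach the browser as entities, so the attribute is never closed.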
Types of XSS
Let’s start with the definitions.
There are three types of XSS:
- Persistent or Stored XSS
- Reflected XSS
- DOM-based XSS
Persistent XSS occurs when input is stored in a database and the payload is triggered when the application later retrieves and uses that input.
Imagine signing up on Facebook using <script>alert(0)</script> as your name. This data is then stored in Facebook’s database.
When you log into Facebook, your name, together with other resources, is requested from the database, and your browser renders all of this information to give you the Facebook interface.
It is at this moment that the payload is executed.
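The store-then-render flow can be sketched as follows. The `$users` array is a hypothetical stand-in for a real database table, and the "Welcome" heading is an assumed piece of markup, not anything Facebook actually renders.

```php
<?php
// Hypothetical in-memory "database"; a real application would use SQL.
$users = [];

// Sign-up: the name is stored exactly as submitted.
$users[1] = ['name' => '<script>alert(0)</script>'];

// Later page load: the stored name is read back and written into the HTML.
$storedName = $users[1]['name'];

// Vulnerable: the stored payload is echoed as is.
$vulnerableHtml = '<h1>Welcome, ' . $storedName . '</h1>';

// Safer: the stored value is encoded at the point of output.
$escapedHtml = '<h1>Welcome, ' . htmlspecialchars($storedName, ENT_QUOTES, 'UTF-8') . '</h1>';

echo $vulnerableHtml, PHP_EOL;
echo $escapedHtml, PHP_EOL;
```

Every visitor who loads a page containing the vulnerable markup receives the payload, which is what makes the stored variant more damaging than a one-off reflection.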
What is usually stored and retrieved?
name, bio, product names, group name, country, education history, file names
There is a subset of persistent XSS called blind XSS. It occurs when the payload is stored in one place and triggered somewhere else, where the attacker cannot see it.
Its name, blind XSS, indicates that the attacker cannot observe the payload firing, hence a callback mechanism is needed to notify the attacker when the payload is triggered. XSS Hunter is one such tool.
Blind XSS usually occurs in applications built for internal use. Logs are a good example of where it usually triggers.
Reflected XSS occurs when user input is echoed straight back in the application’s response.
Let’s take a search bar as an example. If you head over to any WordPress website, try searching for <script>alert(0)</script>. You should see your search term echoed back on the results page as plain text.
Luckily, WordPress sanitises such malicious input by default; otherwise your browser would execute the payload.
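A reflected value arrives with the request itself rather than from a database. The sketch below assumes a hypothetical URL and a hypothetical `q` parameter; it is not how WordPress implements its search, only an illustration of encoding a reflected value before it is echoed.

```php
<?php
// Hypothetical request URL; the "q" parameter name is an assumption.
$url = 'https://example.com/search?q=' . urlencode('<script>alert(0)</script>');

// Extract and decode the query string, much as PHP does when it fills $_GET.
parse_str(parse_url($url, PHP_URL_QUERY), $params);
$query = $params['q'];

// Encode the reflected value before it goes into the response.
$heading = '<h2>Results for ' . htmlspecialchars($query, ENT_QUOTES, 'UTF-8') . '</h2>';

echo $heading, PHP_EOL;
```

Because the payload travels in the URL, an attacker has to trick the victim into opening a crafted link for a reflected attack to work.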
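DOM-based XSS is very similar to reflected XSS in that user input is reflected, but there is a slight difference: the payload is written into the page by client-side JavaScript rather than by the server’s response.
This is significant during audits, as a DOM-based XSS payload may never reach the server (for example, when it sits in the URL fragment) and so will not appear in server logs.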
Fixing Cross-Site Scripting
There are two main ways of fixing an XSS vulnerability by sanitising user input:
- Encoding special characters
- Stripping special characters
By encoding special characters, we are telling the browser “hey, this is data from a user, and it should not be executed”.
By stripping special characters, we do not give the browser the opportunity to execute anything, since special characters such as <>'" are generally required to write HTML tags and to break out of the current tag.
Encode or Strip
This is entirely up to your application design. However, here are some considerations you should take into account.
By encoding, you keep the original version of the user input, since you can decode it to get the original back. However, if you store the encoded form in a database, each entry takes up extra bytes.
On the other hand, by stripping, you save bytes, but you trade off the integrity of your user’s input. In addition, you might affect the SEO of your website.
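The trade-off can be made concrete with a small comparison. This is only a sketch: the sample string is made up, and the stripping pattern removes just the four characters mentioned earlier, which is an assumption about what "stripping" means for your application.

```php
<?php
// Made-up sample input containing characters that matter in HTML.
$input = '<b>Tom & "Jerry"</b>';

// Encoding: reversible, but the encoded form is longer than the original.
$encoded = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

// Stripping: shorter, but the removed characters cannot be recovered.
$stripped = preg_replace('/[<>\'"]/', '', $input);

printf("original: %d bytes\n", strlen($input));
printf("encoded:  %d bytes (%s)\n", strlen($encoded), $encoded);
printf("stripped: %d bytes (%s)\n", strlen($stripped), $stripped);

// true: decoding the encoded form gives back the original input.
var_dump($decoded === $input);
```

The encoded string round-trips back to the original, while the stripped string has permanently lost the quotes and angle brackets.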
When to Sanitise
Here are some examples of where and when you should sanitise.
- User input is stored in a database (sanitise before inserting into the database)
- Pages retrieve data from a database (sanitise before echoing to the page)
- User input is reflected directly onto a webpage (sanitise before echoing)
- User input is passed to other pages (sanitise before passing it on)
- Data is retrieved from other pages (sanitise before echoing to the page)
As a rule of thumb, you should be paranoid and sanitise the input regardless of its origin. Sanitise even if the data is from internal sources.
Fixing XSS with PHP
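$untrustedData = "<script>alert(0)</script>";

// Option 1: encode special characters (ENT_QUOTES also encodes single quotes)
$cleanData = htmlentities($untrustedData, ENT_QUOTES, 'UTF-8');

// Option 2: strip everything except letters, digits and hyphens
$cleanData = preg_replace('/[^A-Za-z0-9\-]/', '', $untrustedData);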
Fixing XSS with ASP
var untrustedData = "<script>alert(0)</script>";
var cleanData = Server.HtmlEncode(untrustedData); // server-side
<%: untrustedData %> // ASP.NET 4.0