7. Security

Security is especially important in web development because of the sheer number of potential attackers having access to some level of your application. But this is by no means an exhaustive guide to security. Actually, I don't think there could ever be any exhaustive guides to security, as security is not a state of an application, but rather a process. A constant cat-and-mouse game.

Still, there are some security concerns you simply must be aware of before ever pushing any web code into production.

Web application security

According to Cenzic's 2014 security report, 96% of web applications have vulnerabilities. The report breaks the vulnerabilities down as follows:

Share	Vulnerability
25%	Cross-Site Scripting (XSS)
23%	Information Leakage
15%	Authentication and Authorization
13%	Session Management
11%	Other
7%	SQL Injection
6%	Cross-Site Request Forgery (CSRF)

Common vulnerabilities

User input

Never trust your users. All user input should be treated as guilty until proven innocent. Many common vulnerabilities rely on user input not being validated. The next vulnerability in particular is largely avoidable with sensible input validation and output escaping.

Always validate user input.
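
As a minimal sketch of what validation can look like, assume a hypothetical sign-up form whose username may only contain 3 to 20 letters, digits, or underscores. The rule and the name isValidUsername are illustrative assumptions, not part of any library:

```javascript
// Hypothetical rule: 3-20 characters, only letters, digits and underscores.
const USERNAME_PATTERN = /^[A-Za-z0-9_]{3,20}$/;

// Returns true only when the input is a string that matches the allow-list.
function isValidUsername(input) {
    return typeof input === "string" && USERNAME_PATTERN.test(input);
}

console.log(isValidUsername("james_42"));                        // true
console.log(isValidUsername("James <script>alert(1)</script>")); // false
```

Describing what is allowed (an allow-list) tends to be more robust than trying to enumerate everything that is forbidden.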

Cross-Site Scripting (XSS)

XSS is a very common and easy attack that will not necessarily affect your system in its most basic form, but rather all of the users of your system. All the attacker needs to do is format input in such a way that JavaScript gets executed when the page is shown to other users.

<p class="comment">This is my first comment.</p>
<p class="author">James <script>alert("Proof of concept");</script></p>

If you can do this, you can also call the global eval() function, which will take, evaluate, and execute any string as JavaScript on the users' machine.

A cross-site script runs in the same origin as the page. This means that it can read, for example, the users' cookies, and can alter the page in any way. The basic defense against this is to validate and sanitize user input. If it is enough for your application to accept just plain text, the task is easier: removing or escaping all left angle brackets (<) is at least a start. But if you need your users to be able to insert HTML, CSS, or JavaScript, the task is much more complicated. It is a good idea to rely on experts and use some service or application that does this for you, for example Google's Closure Tools.

"The Closure Compiler, Library, Templates, and Stylesheets help developers create powerful and efficient JavaScript."

So, Cross-Site Scripting can be used for injecting client-side scripts into web pages.
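
For the plain-text case described above, a minimal sketch of output escaping could look like this. The function name escapeHtml is an assumption made for illustration; in a real application a well-tested templating or sanitizing library is usually the better choice:

```javascript
// Replace the characters that have a special meaning in HTML with entities.
// The ampersand must be handled first so that later entities are not re-escaped.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

const author = 'James <script>alert("Proof of concept");</script>';
console.log(`<p class="author">${escapeHtml(author)}</p>`);
```

With the input escaped, the browser displays the script tag as text instead of executing it.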

Persistent cross-site scripting could go as follows:

For example, a comment is submitted to a news article and it is stored.
The comment contains JavaScript code.
When a user accesses the page that displays the comment, the code is executed.
If you happen to be logged in, for example, the script could result in data theft.

Non-persistent cross-site scripting could go as follows:

For example, the user searches using a search engine.
The search term is displayed on the results page.
The search term is actually JavaScript code.
The code is executed.

This is called non-persistent cross-site scripting because it is not persisted in any data store and hence only affects those who click the link. So, in practice, you could be sent a link like this: https://www.power.fi/haku/q=<script>alert("Hello");</script>. What if, instead of the alert, there was some malicious code that got shown and then executed? If you were logged in, for example, the script could result in data theft.

Session hijacking

Remember in the second chapter when we discussed how HTTP transactions are stateless and how session cookies are required by most systems to create a state between transactions? A session cookie includes a session id, an identifier that helps the server know that it is you making the request.

Session hijacking refers to an attacker gaining access to a user's session which will then allow the attacker to use the system as the hijacked user. The main methods for session hijacking are:

Session Fixation
Session Sidejacking
Cross-Site Scripting (XSS)

Session fixation means that an attacker will first log into a site and acquire a session id. Then this id is fed to the victim somehow (e.g., email, XSS). Then, the victim logs in and the attacker has full access to the victim's resources. Wikipedia has nice descriptive examples of session fixation.

| Extra: Wikipedia - Session fixation
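
A common countermeasure against session fixation is to issue a fresh session id at login, so that an id planted before login never becomes an authenticated session. The following is only a sketch under that assumption: the in-memory Map and the function names createSession and logIn are hypothetical, and a real application would normally rely on its session library for this.

```javascript
const crypto = require("node:crypto");

// In-memory session store for illustration only: session id -> session data.
const sessions = new Map();

// Create an anonymous session with an unpredictable id.
function createSession() {
    const id = crypto.randomBytes(32).toString("hex");
    sessions.set(id, { user: null });
    return id;
}

// On successful login, discard the old id and issue a fresh one, so an id
// known to an attacker before login is worthless afterwards.
function logIn(oldSessionId, user) {
    sessions.delete(oldSessionId);
    const newId = crypto.randomBytes(32).toString("hex");
    sessions.set(newId, { user });
    return newId;
}

const fixedId = createSession();         // id the attacker fed to the victim
const freshId = logIn(fixedId, "alice"); // victim logs in
console.log(sessions.has(fixedId));      // false
console.log(sessions.get(freshId).user); // alice
```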

Session sidejacking works by packet sniffing, that is, observing network traffic, for example in unencrypted Wi-Fi networks, and in this way acquiring the victim's session id.

Note that even if you are using HTTPS for login and then HTTP for the rest of the communication, users are still vulnerable to this attack. The user's password is transmitted encrypted, but if the session id is transmitted unencrypted, an attacker can get access to it. So, HTTPS all the way, and stay out of shady Wi-Fi networks.
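
On the server side, cookie attributes can help keep the session id off unencrypted connections and away from scripts. The sketch below builds a Set-Cookie header value; the cookie name sid and the function name sessionCookie are assumptions made for illustration.

```javascript
// Build a Set-Cookie header value for a session id.
// Secure:   the browser sends the cookie only over HTTPS.
// HttpOnly: the cookie cannot be read from JavaScript (limits XSS theft).
// SameSite: Lax keeps the cookie off most cross-site subrequests.
function sessionCookie(sessionId) {
    return `sid=${encodeURIComponent(sessionId)}; Path=/; Secure; HttpOnly; SameSite=Lax`;
}

console.log(sessionCookie("abc123"));
// sid=abc123; Path=/; Secure; HttpOnly; SameSite=Lax
```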

Cross-Site Request Forgery

The idea behind Cross-Site Request Forgery (CSRF) is that a victim has already authenticated to site A and after that visits a malicious site B. Site B contains a malicious link to site A which executes some unwanted action. The link might be, for example, in an img element's src attribute, in which case it will be automatically followed by the browser without the user knowing:

Mallory: Hello Alice! Look here:
<img src="http://bank.example.com/withdraw?account=Alice&amount=1000000&for=Mallory">

Because Alice is already logged in to bank.example.com, the action gets executed. Well, most likely not this exact action, as there are usually some safeguards in place. But you get the picture. The bank transfer example used a GET request that caused side effects. This is a bad idea. Generally speaking, you should never have any GET routes that alter your data. However, it is only a tiny bit more difficult to mount a CSRF attack with POST requests.

GET and logout

So, you should not have any GET routes that alter your data or application state. How about logout? Oftentimes sites have a logout link, and the default action of a link is a GET request. Should you use JavaScript to prevent the default action of that link and instead submit a form to make a POST request just for logging out of the system? Logging out alters the state, and it would be quite easy to guess the endpoint for that action (I mean, it's likely /logout on most systems, right?). Hence, it would be easy to log people out of their sessions with the img src hack discussed above, if for no real gain other than to piss them off.

<a href="/logout" id="logout">Logout</a>
<form method="POST" action="/logout" id="logoutForm"></form>

<script>
document.getElementById("logout").addEventListener("click", (e) => {
    e.preventDefault();
    document.getElementById("logoutForm").submit();
});
</script>

Creating POST requests for Cross-Site Request Forgery is not too difficult. In practice, it might use AJAX to perform the request, but for simplicity's sake let's use a basic HTML form here:

<form action="https://my.site.com/me/deactivate-account" method="POST">
    <button type="submit" class="looksLikeALink">Click here to read the rest of the article...</button>
</form>

With AJAX, the attacker could also try other methods like DELETE (not supported by HTML forms) and, if the target server's CORS configuration is too permissive, even read the result. This is particularly important when the user has some sort of session and the session data includes very personal details, such as credit card and social security information.

| Extra: https://github.com/pillarjs/understanding-csrf

Cross-Site Request Forgeries can be mitigated by requiring more identifiers than just the session id for important actions. When the website uses forms to change the state of its resources, a CSRF token can be added to these forms. An outline of an example HTTP session could be as follows:

Client requests a page that has a form on it
Server responds with a form where it has added a unique CSRF token
Client submits a form with the CSRF token
The server rejects the request if the token is invalid

So, the HTML might look something like this:

<form method="POST" action="/comments">
    <input type="hidden" name="csrf_token" value="kljsdf897ds98f7o9h8fd">
    <input type="text" name="comment" placeholder="What's on your mind?">
    <button type="submit">Send</button>
</form>

| Extra: https://www.npmjs.com/package/csurf
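
The token exchange outlined above can be sketched with Node's built-in crypto module. The function names createCsrfToken and isValidCsrfToken are assumptions made for illustration, and storing the token in the user's session is left out:

```javascript
const crypto = require("node:crypto");

// Generate an unpredictable token to store in the session and embed in the form.
function createCsrfToken() {
    return crypto.randomBytes(32).toString("hex");
}

// Compare the submitted token with the one stored in the session.
// timingSafeEqual requires equal-length buffers, so check the length first.
function isValidCsrfToken(sessionToken, submittedToken) {
    if (typeof sessionToken !== "string" || typeof submittedToken !== "string") {
        return false;
    }
    const expected = Buffer.from(sessionToken);
    const received = Buffer.from(submittedToken);
    return expected.length === received.length &&
        crypto.timingSafeEqual(expected, received);
}

const token = createCsrfToken();
console.log(isValidCsrfToken(token, token));          // true
console.log(isValidCsrfToken(token, "forged-value")); // false
```

The malicious site B cannot read the token from site A's form, so a forged request arrives without a valid token and is rejected.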

SQL injection

Injection attacks rely on not sanitizing your user input. That input is then used as executable code. Using unsanitized user input is always a bad idea! Consider the following:

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

What if the userName variable is set to ' or '1'='1?

statement = "SELECT * FROM users WHERE name = '' or '1'='1';"

So, select everything from the users table on the rows where the name is '' OR '1'='1', which is always true. In other words, select every row from the users table instead of just the user's own rows. This common example is from Wikipedia; you might want to check out the rest of the examples as well.

| Extra: Wikipedia - SQL injection

Never ever construct SQL queries by concatenating user input! And never show database error messages to the user. If you are constructing SQL, then use parameterized queries where parameters are not interpreted as SQL code. Another option, perhaps even preferable, is to use a well-tested library that handles the protection.

The root cause of the SQL injection problem is the mixing of the code and the data. In fact, our SQL query is a program. A fully legitimate program. And so it happens that we are creating this program dynamically, adding some data on the fly. Thus, this data may interfere with our program code and even alter it, and such alteration would be the very injection itself. Instead, use constant parts that are hardcoded, and placeholders which will be substituted with actual formatted data.
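
The difference between the two approaches can be sketched without a database. The `{ text, values }` shape and the `$1` placeholder follow the convention of the node-postgres driver, which is an assumption here; other drivers use different placeholder syntax (for example `?`). The function names are hypothetical.

```javascript
// Vulnerable: user input becomes part of the SQL program text.
function unsafeQuery(userName) {
    return "SELECT * FROM users WHERE name = '" + userName + "';";
}

// Safer: the SQL text is a constant and the data travels separately.
// A driver that supports parameterized queries treats `values` as data only.
function parameterizedQuery(userName) {
    return { text: "SELECT * FROM users WHERE name = $1;", values: [userName] };
}

const input = "' or '1'='1";
console.log(unsafeQuery(input));
// SELECT * FROM users WHERE name = '' or '1'='1';
console.log(parameterizedQuery(input).text);
// SELECT * FROM users WHERE name = $1;
```

Whatever the user types, the query text in the second version stays the same, so the input cannot change the program.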

How about NoSQL? There is no SQL in NoSQL, so we should be safe? Not so. The same thing applies also to NoSQL databases. For example: https://resources.infosecinstitute.com/what-is-nosql-injection/.

These NoSQL injections can be successful against MongoDB which we will use in the course group project and exercise round 9 exercises. So, beware!

Problems can occur as the JSON documents used to interact with MongoDB can have any internal structure, as by default MongoDB is schemaless.
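
One simple guard is to check the type of every value before it is placed into a query. The sketch below does not talk to MongoDB at all; the function name requireString is an assumption made for illustration, and `$ne` is MongoDB's "not equal" query operator.

```javascript
// Accept a query value only if it is a plain string. A JSON body such as
// {"username": {"$ne": null}} would otherwise reach the database as an
// operator expression instead of a value to compare against.
function requireString(value, fieldName) {
    if (typeof value !== "string") {
        throw new TypeError(`${fieldName} must be a string`);
    }
    return value;
}

const honest = JSON.parse('{"username": "alice"}');
const hostile = JSON.parse('{"username": {"$ne": null}}');

console.log({ username: requireString(honest.username, "username") });
try {
    requireString(hostile.username, "username");
} catch (err) {
    console.log(err.message); // username must be a string
}
```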

Directory traversal

Directory traversal "... consists in exploiting insufficient security validation/sanitization of user-supplied input file names, so that characters representing "traverse to parent directory" are passed through to the file APIs." Wikipedia

Some browsers (but not all!) will handle ../ in a URL differently to protect against this. The attacker gains access to the file system based on malicious user input.

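The cure for these attacks is to never pass user-supplied strings to file APIs as they are: resolve the final path and verify that it stays inside the intended directory, or accept only names from a fixed list of allowed files.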

Consider this example from Wikipedia:

<?php
$template = 'red.php';

if(isset($_COOKIE['TEMPLATE'])) {
    $template = $_COOKIE['TEMPLATE'];
}

include("/home/users/phpguru/templates/" . $template);

?>

Sending the following HTTP request:

GET /vulnerable.php HTTP/1.0
Cookie: TEMPLATE=../../../../../../../../../etc/passwd

... would give a response like this:

HTTP/1.0 200 OK
Content-Type: text/html
Server: Apache

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh
daemon:*:1:1::/tmp:
phpguru:f8fk3j1OIf31.:182:100:Developer:/home/users/phpguru/:/bin/csh

So, the PHP file vulnerable.php houses a simple templating system. By default, the code will return the red.php file. But if the request contains a cookie with the key TEMPLATE and a name for a file, the code will return that file instead. The developer has intended these templates to be returned from a specific /templates folder but is simply concatenating the intended path with the assumed file name. There is no protection in place preventing the user from requesting a template file with ../, which will instruct the code to look in a parent folder. So, the path resolves to filesystem root, then /etc/passwd, which contains sensitive information. The simple templating system can now be used to access any file on the filesystem.
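
A minimal sketch of a guard against this, written in JavaScript rather than PHP: resolve the requested name against the template directory and reject anything that ends up outside it. The function name resolveTemplate is an assumption made for illustration, and the POSIX flavor of Node's path module is used so that the example behaves the same on every platform.

```javascript
const path = require("node:path").posix;

// Template directory from the PHP example above.
const TEMPLATE_DIR = "/home/users/phpguru/templates";

// Resolve the requested name against the template directory and make sure
// the result is still inside it. Returns null for anything that escapes.
function resolveTemplate(name) {
    const resolved = path.resolve(TEMPLATE_DIR, name);
    if (!resolved.startsWith(TEMPLATE_DIR + path.sep)) {
        return null;
    }
    return resolved;
}

console.log(resolveTemplate("red.php"));
// /home/users/phpguru/templates/red.php
console.log(resolveTemplate("../../../../../../../../../etc/passwd"));
// null
```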

Protection

Same-origin policy

Same-origin policy is a mechanism that enables restricting interaction between documents and scripts from different origins.

| Extra: Mozilla Developer Network - same-origin policy

For resources to be from the same origin, they must have the same:

protocol (e.g., HTTP or HTTPS)
host
port

If the requests are to different origins, these are cross-origin requests.
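
The three-part comparison can be tried out with the standard URL class, whose origin property combines protocol, host, and port. The helper name isSameOrigin is an assumption made for illustration:

```javascript
// Two URLs share an origin when protocol, host and port all match.
function isSameOrigin(a, b) {
    return new URL(a).origin === new URL(b).origin;
}

console.log(isSameOrigin("https://example.com/a", "https://example.com/b"));    // true
console.log(isSameOrigin("https://example.com/", "http://example.com/"));       // false (protocol)
console.log(isSameOrigin("https://example.com/", "https://api.example.com/"));  // false (host)
console.log(isSameOrigin("https://example.com/", "https://example.com:8443/")); // false (port)
```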

Same-origin policy does usually allow some resources to be used cross-origin or from other servers. Usually embedding a cross-origin resource (for example images and scripts) is allowed, while reading a cross-origin resource is blocked. Cross-origin writes are typically allowed. Examples are links, redirects, and form submissions.

Writing AJAX requests and testing them in your own browser against some remote API can result in Cross-Origin Resource Sharing violations if the remote server is not configured to allow this.

For configuring your server's cross-origin policies, you should use Cross-Origin Resource Sharing (CORS) and Content Security Policy (CSP).

"Cross-Origin Resource Sharing (CORS) is a mechanism that uses additional HTTP headers to tell browsers to give a web application running at one origin, access to selected resources from a different origin." Mozilla Developer Network - CORS

CORS is a part of the Fetch standard. CORS enables cross-origin requests when using AJAX, for example with fetch() and XMLHttpRequest.

The browser sends the Origin header with CORS requests. The server's response to the CORS request can include headers like:

Access-Control-Allow-Origin
Access-Control-Allow-Headers
Access-Control-Allow-Methods

A good starting point for enabling CORS with different servers is Enable CORS.
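
How a server might choose these headers can be sketched as a plain function. The allow-list, the origin https://app.example.com, and the function name corsHeaders are assumptions made for illustration; a real server would attach the returned headers to its response.

```javascript
// Hypothetical allow-list of origins that may read responses from this API.
const ALLOWED_ORIGINS = new Set(["https://app.example.com"]);

// Return the CORS response headers for a request's Origin header value.
// Unknown origins get no CORS headers, so browsers block the cross-origin read.
function corsHeaders(requestOrigin) {
    if (!ALLOWED_ORIGINS.has(requestOrigin)) {
        return {};
    }
    return {
        "Access-Control-Allow-Origin": requestOrigin,
        "Access-Control-Allow-Methods": "GET, POST",
        "Access-Control-Allow-Headers": "Content-Type",
        // Tell caches that the response depends on the Origin request header.
        "Vary": "Origin",
    };
}

console.log(corsHeaders("https://app.example.com"));
console.log(corsHeaders("https://evil.example.net")); // {}
```

Echoing back only origins from an explicit list is more restrictive than answering every request with Access-Control-Allow-Origin: *.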

Content Security Policy

Content Security Policy, or CSP for short, helps mitigate attacks like XSS and data injections. It is based on the HTTP header Content-Security-Policy. To use CSP, your server must send this header in its responses. If the server does not send the header, or the browser does not support CSP, the browser simply applies its default same-origin policy.

CSP policies are defined in the value of the Content-Security-Policy header where allowed origins are listed for each resource type. default-src sets the default allowed origins for all resources. For example:

Content-Security-Policy: default-src 'self' https://tuni.fi yle.fi

Similarly, script-src and img-src can be used to set the allowed origins for these resource types. The Mozilla Developer Network website has more examples.

| Extra: Mozilla Developer Network - Content Security Policy
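
Since the header value is just a list of directives separated by semicolons, it can be assembled from a small configuration object. This is only a sketch: the function name buildCsp and the example policy are assumptions made for illustration.

```javascript
// Turn an object of directives into a Content-Security-Policy header value.
function buildCsp(directives) {
    return Object.entries(directives)
        .map(([name, sources]) => `${name} ${sources.join(" ")}`)
        .join("; ");
}

const policy = buildCsp({
    "default-src": ["'self'"],
    "script-src": ["'self'", "https://tuni.fi"],
    "img-src": ["*"],
});
console.log(`Content-Security-Policy: ${policy}`);
// Content-Security-Policy: default-src 'self'; script-src 'self' https://tuni.fi; img-src *
```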

Video: Build a Todo app

Part 4 - front-end and CORS

Summary

The web is a tricky platform security-wise. There is no shortage of possible attackers, all of whom could have 24/7 access to try out all the tricks in the book. And the most common vulnerabilities are so well-known and easy to exploit that it doesn't take a particularly skilled individual to wreak a lot of havoc.

Some security features make it more difficult to develop code. CORS is an important safeguard, but it can also prevent legitimate programmers from creating new web applications.