Preg_match vs. mysql_real_escape_string to prevent sqli
Replies: 2   Views: 1997  Subscribers: 0

Posted by Justin · 27-05-2012 - 09:25

Edited by reece · 12-11-2012 - 04:53
Recently I was in a debate with Litespeed about validating user input being queried to an SQL database to prevent SQLi(SQL Injections).
The resulting methods me and LiteSpeed came up with are the following. 

The mysql_real_escape_string function is specifically for validating query input. It is most effective due to its simplicity. How it works is once you connect to the database succesfully, you then call it and it validates the query strings. It accepts one argument(the input string being queried to the database) and automatically checks to make sure it is query safe. The following code snippet demonstrates it's use:
mysql_connect("localhost", "root", "");
­preg_match­ preg_match uses regex(regular expressions) patterns which can be made to limit user input. The advantage of using preg_match is it doesn't revolve around SQL and can be used for a number of things. In effect, using preg_match you can check user input before connecting to a database. The downside is that preg_match is much more complex to use(considering you need to know regex). Here is an example: ­
$input = "thiswillwork";
$match = null;
preg_match("/^[0-9a-zA-Z]{1,1024}$/", $input, $match);
if ($match != null)
	//make connection
­Compare and Contrast­ Lets go ahead now and take a look at both methods side by side. Rating each winner as preg_match, neutral, or mysql_real_escape_string, here is the rundown. ­Effectiveness:­ While mysql_real_escape_string is specifically meant for blocking out any type of invalid input, it is limited to only validating after a connection has been made. To the contrary, preg_match can test a string anytime anywhere, however this doesn't seem to matter much. mysql_real_escape_string winner: neutral ­Simplicity:­ When it comes to simplicity, it is much easier hands down to write 'mysql_real_escape_string($input);' then to implement the same functionality using preg_match. On top of the extra code, you will also need to have a decent understanding of regex. Finally, preg_match would leave more room for error since your regex pattern might not be perfect as apposed to using the premade mysql_real_escape_string function which will always make sure that nothing gets passed that shouldn't. winner: mysql_real_escape_string With the above stated, when it comes to the validating user input for SQL queries, mysql_real_escape_string seems to be the most efficient approach.­

Posted by reece · 27-05-2012 - 16:00

Edited by reece · 12-11-2012 - 04:54

As of PHP 5.4.0, ­magic_quotes_gpc­ is deprecated. On some (x)AMP setups, and even some webhosts they make the mistake of enabling this by default. It should however be disabled by default. If it is turned on, you will need to incorporate php's built in function ­stripslashes()­ into your code before your query.

If you are have access to your setups php.ini on your deployment server, turn it off:

; usually /etc/php.ini or something similar
magic_quotes_gpc = off­
Otherwise if you do not have access to php.ini, then you are not able to turn it off by ­ini_set()­ which means before using ­mysql_real_escape_string()­ you must make use of ­stripslashes()­ to sanitise your data first. Though the mysql_real_escape_string() docs do already mention this, i wanted to emphasise the importance of it as i have personally encountered issues with this myself. ­preg_match vs mysql_real_escape_string­ In my own personal opinion, it is often safer to go with the simpler solution than the more complex one, though of course there are them times when simple does not suffice. When it comes to security, i would much rather offload this task to the very capable hands of the hundreds/thousands of developers who engineer such security functions as mysql_real_escape_string as opposed to attempting to create my own. Using preg_match is indeed powerful, however you would be reinventing the wheel where circumstances do not call for it. The number of possible combinations you would need to account for would be exhaustive, and yet the numerous engineers who develop php would have already invested vast amounts of time into their own sanitising functions. And when it fails they fixed it as a community, so in all likely hood it would account for a great many more possible bad inputs that a custom solution written by one person. A vast team of open source developers are more likely to notice a security hole and fix it quickly then a single developer would notice and fix one. Which is why i personally would rather use the built in functions. This all said, there are occasions when such built in functions can garble your input when you really don't want it to. So if your expecting some special input, such as binary or a case of input somewhat resembling the traits you would find in a MySQL query then there are other approaches you can take. You could for example, serialise the input to uphold the integrity of the input, then unserialise it on retrieval. ­Conclusion:­ Go with php's built in functions, and when the occasion calls for it, use something more specialised when php's built in functions don't suffice. If you like, you can still use your custom solutions implementing preg_match and pass the result to mysql_real_escape_string for further sanitisation if you wanted to preserve some aspect of the output by pre-emptively escaping it. ­[EDIT]­ Out of curiosity, i would be curious to know what LiteSpeeds views on this were.­

Posted by LiteSpeed · 28-05-2012 - 06:47

I totally 100% agree with you Reece.

Go with mysql_real_escape_string, its made by the people who made MySQL and i think they would know a thing or two about protecting their software ­[:SMUG:]­