I am currently working on a building community website in PHP. This contains forms that a user can fill right from registration to lot of other functionality. I am not an Object-oriented guy, so I am using functions most of the time to handle my application. I know I have to learn OOPS, but currently need to develop this website and get it running soon.
Anyway, here's a sample of what I let my app. do: Consider a page (register.php) that has a form where a user has 3 fields to fill up, say: First Name, Last Name and Email. Upon submission of this form, I want to validate the form and show the corresponding errors to the users:
<form id="form1" name="form1" method="post" action="<?php echo $_SERVER['PHP_SELF']; ?>">
<label for="name">Name:</label>
<input type="text" name="name" id="name" /><br />
<label for="lname">Last Name:</label>
<input type="text" name="lname" id="lname" /><br />
<label for="email">Email:</label>
<input type="text" name="email" id="email" /><br />
<input type="submit" name="submit" id="submit" value="Submit" />
</form>
This form will POST the info to the same page. So here's the code that will process the POST'ed info:
<?php
require("functions.php");
if( isset($_POST['submit']) )
{
$errors = fn_register();
if( count($errors) )
{
//Show error messages
}
else
{
//Send welcome mail to the user or do database stuff...
}
}
?>
<?php
//functions.php page:
function sql_quote( $value )
{
if( get_magic_quotes_gpc() )
{
$value = stripslashes( $value );
}
else
{
$value = addslashes( $value );
}
if( function_exists( "mysql_real_escape_string" ) )
{
$value = mysql_real_escape_string( $value );
}
return $value;
}
function clean($str) {
$str = strip_tags($str, '<br>,<br />');
$str = trim($str);
$str = sql_quote($str);
return $str;
}
foreach ($_POST as &$value)
{
if (!is_array($value))
{
$value = clean($value);
}
else
{
clean($value);
}
}
foreach ($_GET as &$value)
{
if (!is_array($value))
{
$value = clean($value);
}
else
{
clean($value);
}
}
function validate_name( $fld, $min, $max, $rule, $label ) {
if( $rule == 'required' )
{
if ( trim($fld) == '' )
{
$str = "$label: Cannot be left blank.";
return $str;
}
}
if ( isset($fld) && trim($fld) != '' )
{
if ( isset($fld) && $fld != '' && !preg_match("/^[a-zA-Z\ ]+$/", $fld))
{
$str = "$label: Invalid characters used! Only Lowercase, Uppercase alphabets and Spaces are allowed";
}
else if ( strlen($fld) < $min or strlen($fld) > $max )
{
$curr_char = strlen($fld);
$str = "$label: Must be atleast $min character & less than $max char. Entered characters: $curr_char";
}
else
{
$str = 0;
}
}
else
{
$str = 0;
}
return $str;
}
function validate_email( $fld, $min, $max, $rule, $label ) {
if( $rule == 'required' )
{
if ( trim($fld) == '' )
{
$str = "$label: Cannot be left blank.";
return $str;
}
}
if ( isset($fld) && trim($fld) != '' )
{
if ( !eregi('^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})$', $fld) )
{
$str = "$label: Invalid format. Please check.";
}
else if ( strlen($fld) < $min or strlen($fld) > $max )
{
$curr_char = strlen($fld);
$str = "$label: Must be atleast $min character & less than $max char. Entered characters: $curr_char";
}
else
{
$str = 0;
}
}
else
{
$str = 0;
}
return $str;
}
function val_rules( $str, $val_type, $rule='required' ){
switch ($val_type)
{
case 'name':
$val = validate_name( $str, 3, 20, $rule, 'First Name');
break;
case 'lname':
$val = validate_name( $str, 10, 20, $rule, 'Last Name');
break;
case 'email':
$val = validate_email( $str, 10, 60, $rule, 'Email');
break;
}
return $val;
}
function fn_register() {
$errors = array();
$val_name = val_rules( $_POST['name'], 'name' );
$val_lname = val_rules( $_POST['lname'], 'lname', 'optional' );
$val_email = val_rules( $_POST['email'], 'email' );
if ( $val_name != '0' ) { $errors['name'] = $val_name; }
if ( $val_lname != '0' ) { $errors['lname'] = $val_lname; }
if ( $val_email != '0' ) { $errors['email'] = $val_email; }
return $errors;
}
//END of functions.php page
?>
OK, now it might look like there's a lot, but lemme break it down target wise:
So here are my questions:
This pretty much makes me feel secure as I am forcing the user to correct malicious data and won't process the final data unless the errors are corrected. Am I correct?
Does the method that I follow guarantee the speed (as I am using lots of functions and their corresponding calls)? The fields of a form differ and the minimum number of fields I may have at any given point of time in any form may be 3 and can go upto as high as 100 (or even more, I am not sure as the website is still being developed). Will having 100's of fields and their validation in the above way, reduce the speed of application (say upto half a million users are accessing the website at the same time?). What can I do to improve the speed and reduce function calls (if possible)?
Can I do something to improve the current ways of validation?
I am holding off object oriented approach and using FILTERS in PHP for the later. So please, I request you all to suggest me way to improve/tweak the current ways and suggest me if the script is vulnerable or safe enough to be used in a Live production environment. If not, what I can do to be able to use it live?
To answer your questions:
$_POST
and $_GET
values as needed rather than all at once. In other words, I would call clean()
on $str
in val_rules()
. Also, I find it really confusing when using include()
or require()
to pull in a script and then realizing that that script executes and performs some function "behind the scenes". I think you'll find that using include()
or require()
works best when the files that are being pulled in are simply used to declare functions, constants, and classes. It is important to be able to clearly follow all execution "out in the open", that is, on the page that is actually doing some work. Similarly, I would not allow the fn_register()
function to directly access the $_POST
array without passing this in as an argument instead. It's important to be able to look at a function when reading some code and see immediately what data it is acting on without having to look at the function definition. In other words, all data that a function is acting on should ideally be passed in as a value. Finally, and this is a major issue, your name validation is too strict. It will not tolerate hyphens, apostrophes, accented characters, etc. I think you'll find that being liberal in name validation is the best policy. Some culture's naming conventions don't even have what we consider to be first and last names, for example. Another issue: instead of using 'required' and 'optional', use a boolean value for required. If it is set to TRUE
, then the field is required.Your script is vulnerable to cross-site scripting attacks in the first line. The value of $_SERVER['PHP_SELF'] can not be trusted. You must filter it. Here's a simple fix:
<form id="form1" name="form1" method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'utf-8'); ?>">
replace UTF-8 with other encoding if needed.
You can try input like
http://site.com/page.php?a=%22onload%3D%22javascript%3Aalert('XSS')%22
against your site to see if it pop ups an alert.
Update: it seems that at least in PHP 5.3, your code snippet is not vulnerable to XSS. However, I can not guarantee (neither can you) what PHP_SELF will produce in older versions of PHP, or in the future. The value of PHP_SELF must be encoded.