I have this form in an index.html file.

<form method="post" action="index.php" accept-charset="UTF-8">
    <input id="a" name="a" type="text">
    <input type="submit" name="run_query" value="Add User" size="30">
</form>

And I am trying to pass the text input to a Python script as an argument by embedding the following PHP code in the index.html file:

<?
    session_start();
    ob_start();
    if (isset($_REQUEST['run_query'])) {
        $add_user = $_REQUEST['a'];
        $command = "add_author.py $add_user";
        exec($command);
    }
?>

I have put the add_author.py file in the same folder as index.html. It works fine with plain strings, but it does not work if the string contains characters such as ä, ö or é.

The Python file looks like this:

import sys
import codecs
if __name__ == '__main__':
    wFile = codecs.open("test.txt", "w", "utf8")
    wFile.write(" ".join(sys.argv[1:]))
    wFile.close()

By the way, index.html has this line in it:

<meta charset="utf-8" />

I would love to hear about a better approach to my task, or a correction to my approach. Thank you!

You can use Python directly with CGI. It should be faster than calling Python from PHP. It should be easier to configure too.

Simple example.

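#!/usr/bin/python

import cgi
import codecs

# Read the submitted form; "a" is the name of the text input.
form = cgi.FieldStorage()
my_a = form.getvalue("a", "")

# Write the value to the file as UTF-8.
wFile = codecs.open("test.txt", "w", "utf8")
wFile.write(my_a)
wFile.close()

# HTTP headers: redirect to another page; no body follows.
print("Content-Type: text/plain")
print("Location: ../plain.html")
print()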

You have to put this Python file into the directory for CGI scripts. The most common one is /cgi-bin/. The server may need some configuration too.

The last three lines are simple HTTP headers. In my example they just redirect to another page; there is no content to display. getvalue("a", "") returns the value of the field "a", or the empty string (the second argument) if the field is missing. Apart from that, it is an almost regular Python file.
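Note that the cgi module was deprecated in Python 3.11 and removed in 3.13. On a newer Python, the same field lookup can be sketched with urllib.parse instead. This is only a sketch: extract_field is a made-up helper name, and it assumes the form is posted as application/x-www-form-urlencoded, so that the request body (read from standard input in a CGI script) looks like the string in the usage lines below.

```python
from urllib.parse import parse_qs


def extract_field(body, name, default=""):
    """Return one field from a urlencoded form body, decoded as UTF-8.

    `body` is the raw request body as a str, e.g. "a=%C3%A4&run_query=Add+User".
    """
    # parse_qs percent-decodes every value using the given encoding.
    fields = parse_qs(body, encoding="utf-8", keep_blank_values=True)
    values = fields.get(name)
    # Like getvalue("a", ""), fall back to the default if the field is missing.
    return values[0] if values else default


# Usage: "äöé" arrives percent-encoded as UTF-8 bytes.
value = extract_field("a=%C3%A4%C3%B6%C3%A9&run_query=Add+User", "a")
```

The returned value is an ordinary str, so it can be written with open("test.txt", "w", encoding="utf-8") in the same way as in the example above.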

# -*- coding: utf-8 -*-

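This line at the top of your file declares the source file as UTF-8 encoded. It only affects how the source itself is read, so the output encoding still has to be given explicitly when writing: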

# -*- coding: utf-8 -*-
import sys

if __name__ == '__main__':
    # Python 3: open the file in text mode with an explicit encoding
    with open('test.txt', 'w', encoding='utf-8') as out:
        out.write(' '.join(sys.argv[1:]))

This should work fine.
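One caveat, depending on the Python version and on the locale the web server runs under: Python 3 decodes sys.argv with the filesystem encoding, and bytes it cannot decode are kept as surrogate escapes, which then fail when written as UTF-8. A minimal sketch of a workaround, assuming the caller really passed UTF-8 bytes (recover_utf8 is a hypothetical helper name):

```python
import os


def recover_utf8(arg):
    """Re-decode a command-line argument as UTF-8.

    os.fsencode turns the argument back into the original bytes
    (undoing any surrogate escapes), which are then decoded as UTF-8.
    """
    return os.fsencode(arg).decode("utf-8")
```

In the script you would apply it to each element of sys.argv[1:] before joining and writing them.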

Why not have PHP write to the file instead of calling a separate Python script?

$filename = "test.txt";

if (!$handle = fopen($filename, 'a')) {
    echo "Cannot open file ($filename)";
    exit;
}

if (fwrite($handle, $_REQUEST['a']) === FALSE) {
    echo "Cannot write to file ($filename)";
    exit;
}

fclose($handle);

If you insist on using the Python script, you may need to decode the argument first. The different approaches are described here: http://docs.python.org/howto/unicode.html. My guess is that (on Python 2) you just need to call unicode() on the joined string, passing the encoding:

wFile.write(unicode(" ".join(sys.argv[1:]), "utf-8"))

The actual problem with the PHP code seems to be that the argument $add_user added to the command is not escaped or sanitized in any way. This makes it possible to send anything into exec, leaving the system vulnerable to command injection. The webcomic XKCD has a "funny" example of this problem: http://xkcd.com/327/
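To make the risk concrete, here is a minimal Python sketch of the same idea: quoting an untrusted value before it goes onto a shell command line. It uses shlex.quote from the standard library, and build_command is a hypothetical helper. In the PHP code, the analogous function would be escapeshellarg().

```python
import shlex


def build_command(user_input):
    """Build the shell command with the untrusted value safely quoted."""
    # shlex.quote wraps the value so the shell sees it as a single argument.
    return "add_author.py " + shlex.quote(user_input)


# A hostile value stays one harmless argument instead of a second command.
command = build_command("x; rm -rf /")
print(command)  # add_author.py 'x; rm -rf /'
```

Quoting also keeps names with spaces or non-ASCII characters together as one argument, so it is worth doing even apart from security.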

The cause of what you see is that UTF-8 encoded "åäö" begins with a byte outside the ASCII range (0xC3), which causes problems in many older shells, depending on system configuration.
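You can see the bytes involved with a couple of lines of Python 3. This is just an illustration of the encoding, not of any particular shell's behaviour:

```python
text = "åäö"
encoded = text.encode("utf-8")

# Each of these characters takes two bytes in UTF-8,
# and every one of those bytes is outside the ASCII range.
print(encoded)                          # b'\xc3\xa5\xc3\xa4\xc3\xb6'
print(hex(encoded[0]))                  # 0xc3
print(all(b >= 0x80 for b in encoded))  # True
```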