CGI pitfalls

Oct. 5th, 2012 | 10:45 pm

Just for fun, try to make sense of the following situation. You've uploaded a CGI script written in Perl to your webserver; however, instead of doing anything useful, it only throws a 500 Internal Server Error at you, and your error_log only says "Premature end of script headers".

Script permissions are fine, so you try to work out where the problem is, and eventually end up with the following test script:

#!/usr/bin/perl

use CGI qw(:standard);
print header(), "Hello world!";

...which still gives the above result. So you decide to ssh into the webserver and see if that one's configuration is weird or different from yours somehow in a way that would cause an otherwise unremarkable and obviously correct script to fail:

$ perl test.pl
Content-Type: text/html; charset=ISO-8859-1

Hello world!
$ 

...huh. It's working just fine, and it does output headers and everything, so why isn't it working as a CGI script? Your web host's Apache installation can be assumed to be in working order, BTW, given that they're a large commercial host.

The solution: after some head-scratching and hair-pulling, you wonder if it's something about permissions after all, so you try the following:

$ ./test.pl
zsh: ./test.pl: bad interpreter: /usr/bin/perl^M: no such file or directory
$ 

AHA! You happened to edit this script on a Windows box, which saved it with CRLF line endings — so the shebang line ends in a stray carriage return and names a nonexistent interpreter, /usr/bin/perl^M. While Perl itself ignores the hashbang line when interpreting the script, the shell obviously does not. Apparently, neither does Apache, and your (S)FTP client was not in the mood to perform line-end conversion on a text file it transferred, either.

After you fix this, everything works like a charm, but you wonder if there couldn't have been an informative error message SOMEWHERE. (Just what are error logs for, anyway?)

Comments (8)

tzisorey

from: tzisorey
date: Oct. 5th, 2012 10:30 pm (UTC)

I have often wished that error 500 were a little more specific than just "Internal Server Error".

But from a security perspective, I can understand why it isn't.

Schneelocke

from: schnee
date: Oct. 5th, 2012 10:39 pm (UTC)

*nods* Well, there are things like use CGI::Carp qw(fatalsToBrowser);, but that doesn't help when the script doesn't even get a chance to run. :) Other than that, it's useful for debugging.

And yes, I can see why it isn't the default. OTOH, the error log should be a little more specific really.



Edited at 2012-10-05 10:39 pm (UTC)

from: tamino
date: Oct. 6th, 2012 04:35 am (UTC)

You were lucky that zsh gave you a caret, and an M, rather than a literal ^M character -- usually what you get when that happens is:

$ ./test.pl
: no such file or directory
$

because it prints the string "/usr/bin/perl", and then a ^M, which makes it go back to the beginning of the line, and then it prints ": no such file or directory" which overwrites the /usr/bin/perl :-P
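The overwrite effect is easy to reproduce in any terminal; a quick sketch:

```shell
# \r moves the cursor back to column 0, so on a terminal the error
# text paints over the interpreter path that was printed before it:
printf '/usr/bin/perl\r: no such file or directory\n'

# Piping through cat -v makes the hidden carriage return visible again:
printf '/usr/bin/perl\r: no such file or directory\n' | cat -v
```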

Schneelocke

from: schnee
date: Oct. 6th, 2012 09:16 am (UTC)

*nods* Even that would've been a hint that there's something wrong with the script rather than the webserver, at least. But yes, you're right, it would've been more confusing.

Jaffa

from: jaffa_tamarin
date: Oct. 6th, 2012 01:04 pm (UTC)

So somebody back in the 60s decided, "oh, we're going to have a different line separator than these other guys, because it's a little bit more convenient for what we want to do right now," and since then 1000s of programmer hours have been wasted dealing with the bugs and side-effects from having two different standards for text files.

Schneelocke

from: schnee
date: Oct. 6th, 2012 06:38 pm (UTC)

Pretty much — and we've been stuck with it ever since. :P (Hmm, I wonder if modern x86 CPUs still have an A20 gate...)

Was it actually in the 60s? I'm sort of wondering now just why there are different standards there, anyway. I'd imagine the Windows convention probably goes back to CP/M, but why did they deviate from what must already have been a standard in the Unix world then?

*checks* Ah, looks like Wikipedia has a bit of history there. Hmm, as far as I can tell, the salient point is basically "Multics (and later Unix) used device drivers to abstract hardware details, while CP/M did not".

One wonders when Windows will finally make the switch to the only sensible standard. :)

from: xviith_et_seq
date: Oct. 7th, 2012 08:49 am (UTC)

FWIW, CRLF is already widely recognized in many of the core text-based Internet protocols as the only acceptable newline form (when delimiting protocol elements, not necessarily for data being carried), though you probably knew that.
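For instance, a minimal HTTP/1.1 request assembled by hand makes the mandated CRLF delimiters explicit (the host name is just an example):

```shell
# HTTP/1.1 separates the request line and each header field with CRLF,
# and a lone CRLF marks the end of the header block:
printf 'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n'
```

Piped into something like `nc example.com 80`, a request built this way is exactly what a server expects on the wire.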

One might playfully argue that CR and LF are both throwbacks to the teletype era, and that with neither of them truly representing the desired effect and the combination being unwieldy, the “correct” thing to do is migrate to U+2028 LINE SEPARATOR, with the side effect of smashing everyone over the head with UTF-8 and extinguishing the old mostly-8-bit code-page-y national encodings…

Schneelocke

from: schnee
date: Oct. 7th, 2012 09:42 am (UTC)

Yeah — I wonder why CR/LF is mandated so often in the first place, though; what led to that decision?

And I certainly agree about Unicode, but that carries its own set of problems. (Which I shouldn't go into here, since I don't actually know THAT much about it.)
