26 – Perl and CGI
December 5, 2009 – 10:17 pm
Please note… This information no longer exists at the referenced locations. This is only a copy of what was available in 2003.
Basic Linux Training™
Perl and CGI
Kay Schenk
Table of Contents
- What is CGI?
- What is Perl?
- How Are Perl and CGI Related?
- What Else Can I Do With Perl?
- Further on Down the Perly Road
- Assignments
What Is CGI?
The acronym CGI stands for common gateway interface. It is a mechanism, actually a specification, for allowing web servers “to gather information from a user request and respond to it” (definition by Peter Wainwright, Professional Apache). In other words, CGI provides a means for web servers to run server-side programs and return results to the client browser. These server-side, or backend, programs are often run in response to a client-initiated action like filling out a form, though this isn’t a requirement. The important thing to remember from this simple little introduction is that a CGI program is run on the web server, unlike client scripting mechanisms like JavaScript which run in the client browser. Client scripts generally cannot access server-side files or entities without some explicit setup or mechanisms in place to allow for this.
CGI, unlike many other modern “server side” programming mechanisms such as PHP, does not typically require additional software to be installed on the web server. It is a specification, separate from HTTP itself, that describes how a web server hands a request to an external program and collects that program’s output. On nearly all web servers this capability, along with many others, can simply be enabled. In the early days of the Web, CGI was one of the few mechanisms that could be used for generating dynamic content. It is still widely used today.
In the very early days of the Web, CGI programming was accomplished by writing server programs with well-known and available programming tools like C and shell scripts since the oldest web servers were developed on UNIX systems. Now, of course, CGI programming can be done with virtually any tool or language available for your particular server, among these: shell scripts, C programs, Perl, Java, Python, etc. CGI programs are initiated or caused to run by an appropriate link in an HTML page that typically is of the form:
<a href="http://www.webserver.com/cgi-bin/myscript"> run me </a>
In this case, the script myscript is run from a directory on the server called cgi-bin once the user clicks the “run me” phrase.
myscript must be constructed so that it can be executed correctly by your web server. Normally this means all standard output goes back to the client’s browser. Input comes only from what the client browser sends through the CGI parameter-passing mechanisms: query parameters on the URL, which the server places in environment variables, and form fields, which for a POST request arrive on standard input. The traditional interpretation of standard input as the keyboard is therefore meaningless in this case. In other words, a CGI program cannot suddenly intercept keyboard input, for example. All input to a CGI program is basically “pre-given” to it.
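As a rough sketch of that “pre-given” input, assuming the usual CGI conventions: for a GET request the parameters arrive in the QUERY_STRING environment variable, and for a POST request they arrive on standard input, with CONTENT_LENGTH giving the number of bytes to read. The helper names parse_pairs and read_cgi_input are made up for this illustration, and the decoding is deliberately simplified.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn "name=Kay&os=Linux+box" into a hash of decoded names and values.
# (Hypothetical helper; a repeated name simply overwrites the earlier one.)
sub parse_pairs {
    my ($raw) = @_;
    my %param;
    foreach my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
        }
        $param{$name} = $value;
    }
    return %param;
}

# Fetch the raw parameter string the way the web server hands it over.
sub read_cgi_input {
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    if ($method eq 'POST') {
        my $body = '';
        read(STDIN, $body, $ENV{CONTENT_LENGTH} || 0);
        return $body;
    }
    return defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
}

my %param = parse_pairs(read_cgi_input());
print "Content-type: text/plain\n\n";
print "$_ = $param{$_}\n" foreach sort keys %param;
```

Because everything comes from the environment, you can try this at the command line by setting QUERY_STRING yourself before running the script, with no web server involved.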
It’s important to note from the above sample CGI call line that the CGI script/program is run from the web server directory called “cgi-bin”. This is just a convention, not a necessity. Sometimes you’ll read documentation that says to place the program or script in the server’s “cgi-bin” directory. While most web servers do, in fact, have a main “cgi-bin” directory that typically can only be accessed by the server administrator, it is not the only place CGI programs can be run from. CGIs can be placed nearly anywhere on a web server, but this is very site dependent. Their location and other information about them, specific file naming conventions, etc., are matters of your internal server setup. Most web servers, including Apache, allow specific directories to be used to “serve” CGI programs, and also allow specific naming conventions, such as “myscript.cgi”, to mark files that are to be treated as CGI programs/scripts.
A good source for more information on CGI can be found at CGI101.com
What Is Perl?
The greatest thing since sliced bread! OK, seriously, Perl stands for “Practical Extraction and Report Language”. As Dave Barry would say, “I am not making this up”. According to the most recent version of Programming Perl, 3rd Edition (the camel book, an absolute “must have”),
“Perl is a language for getting your job done”.
A rather boastful and encompassing statement, but true nonetheless!
Perl is an interpreted procedural (well, mostly) programming language. It can be used for thousands of applications on your system, from simple one-time utilities to full-bore applications for systems management. (See, for a reference, O’Reilly’s Perl for System Administration.) Its capabilities run the gamut from very simple, easy-to-understand code to incredibly complicated and powerful programming constructs. Learn Perl and become a happy person!
Perl does contain object oriented constructs but if you’ve used any other object oriented languages like Java, you’re sure not to love Perl’s OO features. Perl’s current production release is 5.8, with the highly revised and anticipated 6.0 still in the wings. (The current nasty OO syntax is supposedly undergoing complete revision with version 6.0–stay tuned.)
A nice online course covering Perl can be found at Introduction to Perl.
How Are Perl and CGI Related?
Perl and CGI have recently become culturally close in the web world, because, frankly, most CGI programs now are written in Perl. This has given Perl a stepchild reputation as a “scripting language”, which was never its original intention. It became popular for this and its many other uses because of its ease of use and incredible versatility.
There are many sites which address CGI programming with Perl. The Beginner’s Guide to CGI Scripting with Perl looks to be a good place to start.
But, just to give you a glimpse of what CGI programming with Perl is like, let’s assume you’ve got Apache set up on your Linux box with a cgi-bin area defined for your use. Typically, this is /usr/local/httpd/cgi-bin but you’ll need to check your specific installation. Use your favorite text editor and type in this little Perl program (or just cut-and-paste this into a file). Then save it to something like “env.pl” (you might need to change the first line to where perl is really located on your system):
#!/usr/local/bin/perl
# CGI to dump client environment variables back
#
use strict;
use warnings;

print "Content-type: text/html\n\n";
print <<"TOP";
<html>
<head><title>cgi test</title></head>
<body>
<h2>Env variables as seen by httpd</h2>
TOP
foreach my $key (sort keys %ENV) {
    print "$key: $ENV{$key}<br>\n";
}
print "</body> </html>\n";
Before we do more with this, here’s what’s going on. The first line, as in a shell script, just declares that this is a Perl script and what will be invoked when it’s run. The print "Content-type: text/html\n\n"; is a CGI necessity to feed the output back to the client’s browser, declaring the output to follow as being of type “text/html”; the blank line produced by the second \n marks the end of the headers. The next bunch of lines up to the foreach loop are what’s known as a “here document” in Perl and shell scripting terms, which basically says, “take the stuff from here to my end tag (in this case TOP) and dump it out just as you see it”. Use of a here document, as opposed to individual print statements, can save loads of irritation with escaping characters that need to be escaped. The foreach loop will just take the environment variables the web server hands to the script, sort them by variable name, and dump out each variable name and value in a nice little list (all back to the user client, the browser).
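To see what the here document saves you, the following sketch builds the same HTML fragment twice: once as an ordinary double-quoted string, where every embedded quote has to be escaped, and once as a here document. The variable names and the HTML fragment are invented for the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $title = 'cgi test';

# Ordinary double-quoted string: each embedded " needs a backslash.
my $escaped = "<a href=\"/cgi-bin/env.pl\" title=\"$title\">run me</a>\n";

# Here document: everything up to the line containing only LINK is taken
# as written, and $title is still interpolated because the tag is in
# double quotes.
my $heredoc = <<"LINK";
<a href="/cgi-bin/env.pl" title="$title">run me</a>
LINK

print $heredoc;
print "Same text both ways\n" if $escaped eq $heredoc;
```

Putting the end tag in single quotes instead (<<'LINK') turns interpolation off, which is handy when the text itself contains $ or @ characters.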
Yes, this is designed as a CGI script, but, and this is the fun part, you can just run it at the command line and have it produce something intelligible: your current system environment variables! You’ll get the HTML tags in the output as well of course, but run it to see what you get. The values you get should be identical to what’s produced by the env command. Naturally, not all CGI Perl scripts will lend themselves to this.
To run this as a CGI script on your own or another system, place it in a web server directory that is “cgi accessible”, then just access it directly from the location bar in your browser–www.mywebserver.com/cgi-bin/env.pl.
You may need to configure your web server to accept scripts with “.pl” as cgi scripts or rename the script to env.cgi to get it to work.
The output back to you should be a listing of the HTTP environment variables and their values supported by the server/browser combination you’re using.
On Apache 1.3 with Mozilla 0.9.7 you’ll get something like:
Env variables as seen by httpd
DOCUMENT_ROOT: /usr/local/httpd/htdocs
GATEWAY_INTERFACE: CGI/1.1
HTTP_ACCEPT: text/xml, application/xml, application/xhtml+xml,
text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2,
text/plain;q=0.8, text/css, */*;q=0.1
HTTP_ACCEPT_CHARSET: ISO-8859-1, utf-8;q=0.66, *;q=0.66
HTTP_ACCEPT_ENCODING: gzip, deflate, compress;q=0.9
HTTP_ACCEPT_LANGUAGE: en-us
HTTP_CONNECTION: keep-alive
HTTP_HOST: kayslinux:80
HTTP_KEEP_ALIVE: 300
HTTP_USER_AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.7)
Gecko/20011221
PATH: usr/sbin:/usr/bin
QUERY_STRING:
REMOTE_ADDR: 132.241.87.10
REMOTE_PORT: 32824
REQUEST_METHOD: GET
REQUEST_URI: /cgi-bin/env.pl
SCRIPT_FILENAME: /usr/local/httpd/cgi-bin/env.pl
SCRIPT_NAME: /cgi-bin/env.pl
SERVER_ADDR: 132.241.87.10
SERVER_ADMIN: root@kayslinux.com
SERVER_NAME: kayslinux.com
SERVER_PORT: 80
SERVER_PROTOCOL: HTTP/1.1
SERVER_SIGNATURE:
Apache/1.3.20 Server at kayslinux.com Port 80
A huge compendium of CGI scripts, most of which are written in Perl, can be found at cgi-resources.com. And don’t be afraid to make changes to suit your needs!
One final note on CGI programming with Perl. At some point, Lincoln Stein decided there must be a better way to do some of this, so the CGI.pm module was created. It might seem a bit daunting at first, but it is well worth the investment in time for more complicated CGI programming with Perl.
What Else Can I Do With Perl?
General Programming
Two of Perl’s greatest strengths are its string processing capabilities and its nifty programming structure called an associative or “hash” array. Learning more about how to use Perl’s regular expression capabilities coupled with hash arrays can take you pretty far down the Perl programming road in no time. Because of these capabilities, Perl is great at processing input files which aren’t quite “regular”.
Consider, for example, the following bit of an ldif dump from an LDAP directory:
ssn: 664450033
accttype: 1
initstatus: N
uid: iluvperl

ssn: 445265286
accttype: 1
initstatus: N
uid: nmneil2

ssn: 047671519
accttype: 2
initstatus: N
uid: betsieb

ssn: 459009749
accttype: 1
initstatus: N
uid: jlsteinke

uid: mpoppins
ssn: 088127722
accttype: 7
initstatus: W
Suppose you wanted to produce an output file with one line per entry, in the following order: ssn, accttype, initstatus, uid. You could use paste, specify the line end as two line-end characters and route the output to a new file. But that last item is out of order for some reason, so paste won’t work. OK, this is a small file and you could just go fix it first, but what if the file were thousands of lines long and you didn’t know how many entries were out of order? Time to look into various options with sort perhaps? Or… use Perl! The little program below solves this problem.
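#!/usr/local/bin/perl
use strict;
use warnings;

# Records are separated by a blank line, i.e. two line-end characters
$/ = "\n\n";
open(INFILE,  '<', 'ldif_file') or die "Cannot open ldif_file: $!";
open(OUTFILE, '>', 'new_ldif')  or die "Cannot open new_ldif: $!";
while (my $input = <INFILE>) {
    $input =~ s/ //g;                      # remove spaces
    my @infield = split(/:|\n/, $input);   # name, value, name, value, ...
    my %myhash  = @infield;                # names become the hash keys
    print OUTFILE "$myhash{'ssn'},$myhash{'accttype'},$myhash{'initstatus'},$myhash{'uid'},\n";
}
close(INFILE);
close(OUTFILE) or die "Cannot close new_ldif: $!";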
In summary, this program…
- Defines the input record separator as two consecutive line-end (\n) characters
- Defines an input file, INFILE, and opens it
- Defines an output file, OUTFILE, and opens it for output
- Reads records from the input file (actually four lines per record)
- Takes spaces out of the input record
- Splits up the record into fields based on either a ‘:’ or a line-end character, putting the parts into an array. The array will look like:
[ssn,664450033,accttype,1,initstatus,N,uid,iluvperl]
for the first record
- Shoves the normal array elements into a hash array. The hash array will then have “key” names that are the ldif field names, and data values which are the corresponding ldif values. Or think of the hash key names as being the odd-numbered elements (first, third, …) of the normal array, and the hash values as being the even-numbered elements (second, fourth, …) of the original array.
- Prints the ldif values, without the field names, in the order you need.
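The array-to-hash step is the heart of the program, so here is a minimal sketch of just that step, applied to the out-of-order record from the sample data. The function name record_to_line is made up for this illustration, and it joins the values with a hash slice rather than spelling out each lookup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert one ldif-style record into "ssn,accttype,initstatus,uid",
# whatever order the fields arrived in.
sub record_to_line {
    my ($record) = @_;
    $record =~ s/ //g;                      # "uid: mpoppins" -> "uid:mpoppins"
    my @infield = split /:|\n/, $record;    # (uid, mpoppins, ssn, 088127722, ...)
    my %myhash  = @infield;                 # each pair becomes key => value
    return join ',', @myhash{qw(ssn accttype initstatus uid)};
}

# The last record from the sample data, with uid listed first.
my $record = "uid: mpoppins\nssn: 088127722\naccttype: 7\ninitstatus: W\n";
print record_to_line($record), "\n";    # prints 088127722,7,W,mpoppins
```

Because the values are looked up by key name, the position of each field within the record no longer matters, which is exactly what paste could not cope with.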
Hopefully at this point you have a flavor for what you can do with Perl. But, you’ll never work with data files, right? The point is that Perl is capable of, and often used for, just regular programming chores where shell scripts would involve way too much effort for the outcome.
Other Web Projects
Back to the web, Perl is not only a great CGI mechanism but a wonderful way to maintain time or date sensitive information on a web site that needs to be dynamic but not interactively dynamic. This can be accomplished via a combination of Perl programs and cron. Ever thought a web page with a complete listing of all your system’s man pages would be cool? Do it with Perl and some other system utilities like rman. (Perl has some very cool mechanisms for running any command and piping the output into a Perl program for further processing.) Want to build a user index page so visitors can find pages for your system’s users? Do it with Perl and refresh it nightly. Perl has internal functions for dealing with many system entities like the /etc/passwd and /etc/group files.
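Two of the mechanisms just mentioned can be sketched briefly: opening a command as if it were a file, so its output can be read line by line, and walking the password database with Perl’s built-in getpwent. The helper names command_lines and user_homes, and the choice of uname -s as the command, are only examples; the sketch assumes a Unix-like system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a command and return its output lines, using a piped open.
sub command_lines {
    my @cmd = @_;
    open(my $pipe, '-|', @cmd) or die "Cannot run $cmd[0]: $!";
    chomp(my @lines = <$pipe>);
    close($pipe);
    return @lines;
}

# Return login name => home directory for every account the system
# knows about (normally the contents of /etc/passwd).
sub user_homes {
    my %home;
    setpwent();                           # rewind to the first entry
    while (my @entry = getpwent()) {
        $home{ $entry[0] } = $entry[7];   # field 0 is the name, 7 the home dir
    }
    endpwent();
    return %home;
}

print "Kernel: $_\n" foreach command_lines('uname', '-s');

my %home = user_homes();
print "$_ lives in $home{$_}\n" foreach sort keys %home;
```

Run from cron and redirected into a file under your web server’s document tree, a script along these lines could be the starting point for the nightly user index page described above.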
Further on Down the Perly Road
perl.com
The perl.com web site is the primary resource site for all things Perl. There you’ll find the latest version to download and install. And, as the installation caveat says, you’ve got Linux, you’ve got the tools to do your own compile–just do it! The main Perl site also offers a “Resources” link which you will no doubt find invaluable.
Perl Modules–CPAN
One thing you should investigate relatively soon if you get into much Perl programming work is the CPAN site. CPAN hosts the complete compendium (well, mostly all anyway) of Perl modules. Perl modules, developed using Perl OO (object oriented) techniques, are “add on” packages that can make doing your Perl work much easier, if not downright possible at all. CPAN’s interface for searching for what you want has much improved over the past year, so if you’re interested in some module to deal with “processing IMAP mail” for example, you won’t have any trouble finding what you need. Look for a module before reinventing the wheel. And, installing modules is relatively straightforward if your distro has installed perl in its default location, /usr. If not, well…you may have to mess with the module’s Makefile a bit.
If you intend to use modules much, you’ll probably want to find out a little more about them. Consult any good Perl reference which covers this or use man perlmod and man perlmodinstall. Perl module design makes use of Perl OO techniques, so you may want to look at the Perl OO Beginner’s Guide man perlboot as well at some point. Finally, a good introductory tutorial on OO Perl can be found at the IBM Developerworks site. Work backwards from this one, Chapter 5 in a series on Perl tutorials, and you’ll find other useful information as well.
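Before installing anything, it can help to check whether a module is already on your system. The sketch below is one way to do that, assuming only core Perl; the function name module_version is made up, and the modules listed in the loop are just examples that may or may not be installed on your machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the module's version string if it can be loaded, a note if it
# loads but declares no version, or undef if it is not installed.
sub module_version {
    my ($module) = @_;
    # Only accept names of the form Word or Word::Word, since the name
    # is handed to a string eval.
    return undef unless $module =~ /^\w+(?:::\w+)*$/;
    return undef unless eval "require $module; 1";
    my $version = $module->VERSION;
    return defined $version ? $version : 'no version declared';
}

foreach my $module (qw(strict LWP::UserAgent HTTP::Response)) {
    my $version = module_version($module);
    print "$module: ", (defined $version ? $version : 'not installed'), "\n";
}
```

The require statement is what actually searches Perl’s module directories, so a module reported as “not installed” here is one that a use or require line in your own program would also fail to find.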
A Perl Program Using Modules
Here’s a fun little Perl program using modules that queries a web server and dumps its headers, if available. You’ll be able to see what server is installed along with other information. Consult the documentation for HTTP::Response (one of your assignments) for additional information.
#!/usr/bin/perl
use strict;
use warnings;
require LWP::UserAgent;
require HTTP::Request;
require HTTP::Response;

# Host name to query, given as the first command-line argument
my $my_server = $ARGV[0] or die "Usage: $0 hostname\n";
print "$my_server\n";

my $ua = LWP::UserAgent->new;
#$request = HTTP::Request->new(GET => "http://${my_server}/");
my $response = $ua->get("http://${my_server}/");
my $headers  = $response->headers()->as_string();
print "$headers\n";
ActiveState
A company called ActiveState also offers a Perl version for Linux complete with the Perl man pages all HTML-ized, as an alternative, and its own Perl package manager called ppm. ppm uses ActiveState’s own repository of modules, and has some interesting search features as well. It can not be used directly with CPAN modules, which still need to be installed separately, but the ppm repository contains many of the more popular modules from CPAN. If you install ActiveState’s Perl, it will erase the default perl installation, so be careful on this one!
Assignments
- If you installed Apache on your system, locate the cgi-bin directory and see what’s already there.
- Locate all the Perl programs that came with your system. These can usually be identified with the .pl file extension. Take a look at some of them to see what they do.
- Find all the Perl modules installed on your system. They have a file extension of .pm. Look at the documentation for some of them using man "module name" where “module name” has the form XXXX::something.
- Go to CPAN’s search site and find the documentation for HTTP::Response used in the module example. Pay particular attention to the SYNOPSIS section: this is how you invoke the module, how to “instantiate” an object for the module using the “new” constructor method, and what other methods (routines) are available for the module. If you’re feeling really gutsy, download and install the libwww-perl bundle which contains the HTTP as well as the LWP modules. Then you can play with them on your own system!
- Take the “env.pl” script and modify it to print a “Hi” message to the user (client browser) that says something like:
Hello, ipnumber! I see you’re using browser_info!
Have a nice day.
before it dumps out the rest of the environment info.
Hint: Individual elements of the ENV hash can be referenced as $ENV{element-name}.
- Modify the second little program to only process records if accttype=1. (Look up Perl “if” blocks.)
- Take the on-line Perl course.
- Purchase a good beginning Perl book. The original and still one of the best is Learning Perl, clear and to the point. You’ll be pleasantly surprised at how easy Perl really is!
Terms and Concepts:
- Perl
- CGI
- regular expression
- Perl module
- CGI environment variables
- array
- hash
- CPAN
- perlmongers
Files & Directories
- /usr/bin/perl
- /usr/lib/perl5
- /usr/local/httpd
- /etc/httpd
Copyright © 1997-2001 Henry White. Copyright © 2001-2003 Kay Schenk. All Rights Reserved.
Reproduction or redistribution without prior written consent is strictly prohibited. Address comments and inquiries to info@basiclinux.net.