Writing Web Pages with SSI

		Some Notes on Creating Web Sites
		Using Server-Side Includes
		================================

* What does a web server do?
  --------------------------
Let's start by differentiating between "the Web" and "the Internet".
Even knowledgeable people often use these two terms incorrectly.

The Internet is a world-wide network of computers.  The first two
computers in the Internet were linked in 1969, over twenty years
before the World Wide Web existed.  Before the Web, people all over
the world used the Internet for e-mail, sharing files (via protocols
like ftp), reading news (via newsgroups distributed using a protocol
named nntp), and many other things.

When Tim Berners-Lee created the first "Web Server" in 1990, he was
just adding one more kind of service to the already-rich mix of things
that used the Internet.  A web server's job, at its simplest, is to
accept requests for files and send those files to the requestor.

Usually, someone using "the Web" runs a program called a "web browser"
on his or her computer.  This might be a program like Firefox, Chrome
or Internet Explorer.  The browser's job is to send requests to web
servers.  Each web server responds by sending back files, which often
contain a special kind of marked-up text called HTML ("Hyper Text
Markup Language").  It's "marked-up" in the sense that it contains
special instructions saying that some text should be bold, or in a
different color, or even that a particular bit of text refers to
another document, somewhere else.  When the browser receives such a
file, it displays it for the user.

Here's a simple HTML document:

       <html>
       <body>
       Hi there!  This is a test.
       Here's some text in <b>bold</b>.
       And here's some <font color=blue>blue text</font>.
       And here's a <a href="http://www.phys.virginia.edu">link to another page</a>.
       </body>
       </html>

If the web browser doesn't specify the name of the document, the web
server looks for a document with a default name.  This name will vary
from one web server to another, but often it will be something like
"index.html", "home.html", "default.html", etc.

Each document available from the web server is called a "web page".
The whole collection of documents available on a particular web server
is called a "web site".

A person who creates a web page can add images to it by simply
inserting some special markup commands that point to image
documents. When the web browser sees such markup commands in an HTML
document it has fetched, it asks the web server for those image
documents, too, and displays them in their proper places when it shows
you the web page.
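
As a small sketch of what such markup looks like, the line below could
be added to the example document above.  The file name "logo.png" is
just a placeholder for whatever image you've put on the web server:

       <img src="logo.png" alt="Our logo">

When the browser reaches this line, it sends a second request to the
web server for "logo.png" and draws the image at that spot.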

The simplest web pages are just HTML files sitting on a web server's
disk.  When the web server receives a request for one of these files
(from your web browser), it just reads the file from the disk and
sends a copy of the file's contents to you.



Early on in the history of the Web, people realized that they wanted
more than just static documents, though.  Wouldn't it be great if a
web server could generate a custom-made response by querying a
database, for example?  And wouldn't it be nice if web pages displayed
by your browser could have animated behavior, like pop-up menus?  And
wouldn't it be wonderful if it were possible for web page developers
to specify exactly how their web pages would look, moving things
around on the page into particular positions?

To satisfy these desires, web page developers needed to move beyond
static HTML files.  Today, if you intend to create a highly
interactive "web application" that's as complex as the programs
installed on your computer, you'll need to learn at least five
different languages:

* HTML, obviously, since this is still the basic structure of web
  pages.

* SQL, or some other database query language.

* Perl, PHP, Ruby, or some other "glue" language in which you can
  write programs to convert database results into web pages.  These
  programs are run on the web server.

* CSS, which is a language for specifying, in great detail, the
  "style" of each part of an HTML document.

* JavaScript, which is a language for writing programs that run inside
  the web browsers of people who view your web page.

Developing a web application is thus a Big Deal, not to be attempted
without trepidation.

Fortunately, you don't have to make a choice between completely static
HTML files and an elaborate web application.  There are lots of
intermediate options that go slightly beyond the abilities of HTML,
but are easy to learn.  Today, I'm going to talk about one of these.
It's a system called "Server-Side Includes", or "SSI".

* Server-Side Includes:
  ---------------------
Consider the illustration above, showing what happens when your
browser contacts a web server.  In this simple case, the web server
just fetches the file off of the disk, and hands it across the network to
the browser.

What if we go a step further, though, and allow the web server to make
changes to the file before sending it to your browser?  That's what
SSI does.  It's called "Server-Side" because the content of the file
is modified on the web server before it's sent to your browser.

You use SSI by inserting special directives into your files.  The
simplest of these is the "include" statement.  Take a look at the
following HTML file.  We've just modified the example above by adding
an SSI include:

       <html>
       <body>
       <!--#include virtual="header.html" -->
       Hi there!  This is a test.
       Here's some text in <b>bold</b>.
       And here's some <font color=blue>blue text</font>.
       And here's a <a href="http://www.phys.virginia.edu">link to another page</a>.
       </body>
       </html>

The third line tells the web server to fetch the content of another
file, "header.html", and insert it at this point in our file before
delivering the combined content to a web browser.  The header file
might contain some menus, a logo, or any other content.  We could
easily insert this line into all of the files on our web site, to give
them all a uniform look.  When we wanted to change the header of the
pages, we'd only need to edit header.html, and the change would affect
all of the files on our web site.
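
For illustration, here's one possible "header.html".  The contents are
only an assumption (a site title and two made-up links); the point is
that it holds a fragment of HTML, without its own <html> or <body>
tags, since it will be pasted into the middle of another page:

       <h1>Our Web Site</h1>
       <a href="/index.html">Home</a> |
       <a href="/contact.html">Contact</a>
       <hr>

With that file in place, the browser would receive the test page with
these four lines sitting where the "include" line used to be.  Pages in
subdirectories can share the same header by starting the path with a
slash, as in virtual="/header.html", which the web server looks up
relative to the top of the web site.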

Note that the syntax of SSI statements is very strict.  Write the
"include" line exactly as shown: in particular, there must be no space
between "<!--" and "#include", and there should be a space before the
closing "-->".  SSI works in Apache, IIS, and nginx, the most popular
web servers.  Some web servers allow more freedom in inserting spaces,
but it's best to stick to the strict form if you want your code
to be portable from one web server to another.

Remember that web browsers never see the SSI commands.  They're
processed by the web server before the file is sent over the network.

OK, so "include" statements look useful.  They're the part of SSI that
gave it its name.  But SSI provides a lot of other useful stuff, too.

In general, SSI lets you:

* Dynamically include the content of other files.

* Insert "if" statements into your HTML, so that some sections are
  only sent to the browser under given conditions.

* Display the values of a set of pre-defined variables.

* Set your own variables, for display or for use within "if"
  statements.

* SSI Variables:
  --------------
Let's look at variables next.  You can see a list of all of the
built-in variables by adding a line like the following to an HTML
file:

	<!--#printenv -->

This acts like the "printenv" command at the Linux command line, and
when you point a browser at this page you'll see something like this:

	SERVER_SOFTWARE=Apache/2.2.3 (CentOS)
	SERVER_NAME=example.com
	SERVER_ADDR=192.168.100.106
	SERVER_PORT=80
	REMOTE_ADDR=192.168.4.8
	DOCUMENT_ROOT=/home/httpd/html
	SERVER_ADMIN=root@example.com
	SCRIPT_FILENAME=/home/httpd/html/products/ssitest.html
	REMOTE_PORT=56304
	GATEWAY_INTERFACE=CGI/1.1
	SERVER_PROTOCOL=HTTP/1.1
	REQUEST_METHOD=GET
	QUERY_STRING=
	REQUEST_URI=/products/ssitest.html
	SCRIPT_NAME=/products/ssitest.html
	DATE_LOCAL=Wednesday, 01-Apr-2009 20:32:22 EDT
	DATE_GMT=Thursday, 02-Apr-2009 00:32:22 GMT
	LAST_MODIFIED=Wednesday, 01-Apr-2009 20:31:51 EDT
	DOCUMENT_URI=/products/ssitest.html
	USER_NAME=root
	DOCUMENT_NAME=ssitest.html
	etc...

Consider the variable called "LAST_MODIFIED", for example.  This
contains the date and time at which the current file was last
modified.  You can display the value of this variable in your HTML
files.  Let's modify our HTML example once again:

       <html>
       <body>
       <!--#include virtual="header.html" -->
       Hi there!  This is a test.
       Here's some text in <b>bold</b>.
       And here's some <font color=blue>blue text</font>.
       And here's a <a href="http://www.phys.virginia.edu">link to another page</a>.
       This file was last modified <!--#echo var="LAST_MODIFIED" -->.
       </body>
       </html>

The "echo" SSI directive just prints out the value of a given
variable.  You'll often see the "last modified" date displayed at the
bottom of web pages.  This is one way to put it there, and the date
updates automatically whenever the file is changed.
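
If you'd like the date in a different format, Apache and several other
servers accept a "config" directive that changes how later dates are
printed.  The sketch below is one possibility; the format string is
just an example, and uses the same % codes as the C strftime()
function:

       <!--#config timefmt="%B %d, %Y" -->
       This file was last modified <!--#echo var="LAST_MODIFIED" -->.

With the timestamp from the listing above, this would print something
like "April 01, 2009".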

* SSI Conditional Statements:
  ---------------------------
What if we want a web page to look different, depending on whether the
viewer is inside our organization or outside?  We can accomplish that
by looking at the variable called "REMOTE_ADDR" and using an SSI "if"
statement.  REMOTE_ADDR contains the IP address of the web browser
that's requesting a document from the web server.

Let's modify our HTML example again:

       <html>
       <body>
       <!--#include virtual="header.html" -->
       Hi there!  This is a test.
       Here's some text in <b>bold</b>.
       And here's some <font color=blue>blue text</font>.
       And here's a <a href="http://www.phys.virginia.edu">link to another page</a>.

	<!--#if expr="$REMOTE_ADDR = /^192\.2\./" -->
	Hi there!
	This text is only visible to browsers
	on the 192.2.*.* network.
	<!--#endif -->

       This file was last modified <!--#echo var="LAST_MODIFIED" -->.
       </body>
       </html>

There are several different forms of the "if" statement in SSI, but
this is probably the most useful one.  Here we compare the value of
REMOTE_ADDR with a given Regular Expression.  (Note that Apache uses
the same regular expression syntax as Perl5, documented here:
http://perldoc.perl.org/perlretut.html .)  If it matches, then the web
server inserts the given content before delivering the file to the
browser.
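
An "if" can also have an "else" branch, so that everyone else sees
alternative text rather than nothing.  Here's a sketch that reuses the
address test from the example above (the wording of the two messages
is made up):

       <!--#if expr="$REMOTE_ADDR = /^192\.2\./" -->
       Welcome, colleague!  Internal links are listed below.
       <!--#else -->
       Welcome, visitor!
       <!--#endif -->

This follows the expression syntax used in these notes, which is that
of Apache 2.2.  Other servers and newer Apache releases may write the
test differently, so check your server's documentation if the
condition doesn't behave as expected.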

We can test any variable similarly, even variables we create
ourselves.  Here's how to define your own variables in SSI:

 <!--#set var="PAGE_MAINTAINER" value="Elvis Presley (elvis@graceland.com)" -->

You could then use this variable anywhere within your page, with the
"echo" SSI command.  Sometimes, web developers define variables in a
separate file that can be "include"ed into every page.  If we put the
line above into "variables.html", for example, we could then say:

 <!--#include virtual="variables.html" -->

in each of the files on our web site, and then we'd have access to
those variables.  Then, whenever we need to change the value of a
variable, we only need to change it in one place: variables.html.
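
As a sketch of how a page might use this, the example below pulls in
"variables.html" and then prints the maintainer's name at the bottom of
the page:

       <html>
       <body>
       <!--#include virtual="variables.html" -->
       Hi there!  This is a test.
       <hr>
       This page is maintained by <!--#echo var="PAGE_MAINTAINER" -->.
       </body>
       </html>

The "include" line has to come before the "echo" line, since the
variable doesn't exist until the "set" statement has been processed.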

With the Apache web server, we can also set variables in a .htaccess
file (http://httpd.apache.org/docs/2.2/howto/htaccess.html).  The
syntax there is:

	SetEnv PAGE_MAINTAINER "Elvis Presley (elvis@graceland.com)"

Variables set in the .htaccess file are automatically available
to all web pages in directories underneath the one where the
.htaccess file resides.
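
A page can then print the variable directly, with no "set" or
"include" line of its own.  The sketch below assumes the Apache
2.2-style syntax used elsewhere in these notes.  The "if" test is true
whenever the variable holds a non-empty value, so the sentence is
simply left out on servers where the .htaccess setting is missing:

       <!--#if expr="$PAGE_MAINTAINER" -->
       This page is maintained by <!--#echo var="PAGE_MAINTAINER" -->.
       <!--#endif -->

Without the test, Apache would print "(none)" in place of an unset
variable.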

* Enabling SSI:
  -------------
For most web servers, SSI can be enabled or not, depending on the
preferences of the system's administrator.  In many cases, you'll
find that the SSI commands above just work, without your needing
to do anything else.

In some cases, Apache administrators will give users the ability
to turn SSI on or off for their own web pages.  In this case, you
may find that you need to create a .htaccess file at the top
of your web directory tree containing a line like this:

	Options +Includes
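
A quick way to check whether SSI is active is to put a tiny test page
on the server and look at it in a browser.  The page below is only a
sketch, and the file name is up to you:

       <html>
       <body>
       The server's time is <!--#echo var="DATE_LOCAL" -->.
       </body>
       </html>

If a date appears, SSI is working.  If the sentence ends abruptly and
"View Source" in the browser shows the directive itself, then the
server sent the file without processing it.  Some servers are set up
to process SSI only in files with a particular extension, commonly
".shtml", so trying that file name is a reasonable next step.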

* Further information:
  --------------------
For more information about SSI, see the following web pages:

- Apache SSI tutorial:
	http://httpd.apache.org/docs/current/howto/ssi.html

- Full Apache SSI documentation:
	http://httpd.apache.org/docs/current/mod/mod_include.html

- Nginx SSI documentation:
	http://wiki.nginx.org/HttpSsiModule

- IIS SSI documentation:
	http://msdn.microsoft.com/en-us/library/ms525185(v=vs.90).aspx