IVLE - System Architecture
==========================

    Author: Matt Giuca
    Date: 10/12/2007

This document describes the high-level system architecture of IVLE,
specifically with respect to the "pluggable clients" interface.

Users and authorization
-----------------------

We need some way to authenticate users and to store information about a
logged-in user. Whether that information is stored in a database local to our
system remains to be seen.

Importantly, we need some way to send user information to the clients. This is
discussed in the "pluggable clients" section.

Pluggable clients
-----------------

The IVLE system is largely just a collection of various components, called
"clients", such as the file browser, text editor, console, tutorial sheets,
etc.

The architecture provides a common interface into which clients can be
plugged.

Firstly, we want all HTML pages on the site to be generated with a common
header. The easiest way to do this is to write our own Python handler which
is common to the entire application (this replaces the standard handlers such
as Publisher).

This top-level handler handles all authentication issues (for instance,
checking the session to see if a user is logged in properly and if not,
redirecting to the login page). It then outputs the header, and calls the
appropriate client based on the URL.

Before the handler outputs an HTML header, it needs to consult the client to
see what the MIME type of the requested page is. For instance, many requests
will produce JSON instead of HTML. Therefore part of the client interface will
be a MIME type declaration. The handler queries this first, completes the HTTP
headers, and then writes an XHTML heading section if and only if the MIME type
is appropriate for HTML content.

Clients *must* supply the correct MIME type, as the handler is allowed to make
such inferences based on it. (For example, returning JSON content with a MIME
type of "text/html" is bad because the handler will then write an HTML header
section.)
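The MIME type check described above could look something like the following. This is only a sketch: the function name `needs_html_header` and the set `HTML_MIME_TYPES` are hypothetical, and which types count as "appropriate for HTML content" has not been decided.

```python
# Hypothetical set of MIME types for which the handler writes the common
# HTML header. The real list is an open design decision.
HTML_MIME_TYPES = {"text/html", "application/xhtml+xml"}


def needs_html_header(mime_type):
    """Return True if the handler should write the common HTML header."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return base_type in HTML_MIME_TYPES
```

A client that declares "application/json" would therefore get its output passed through untouched.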

### Plugin interface ###

The top-level handler will keep a Python file (or a text file, JSON file, etc.)
containing a list of valid clients. This is a dictionary mapping clients'
internal names (the top-level directories, as described below in "URLs" and
"Planned Clients") to some other data about the clients (such as a friendly
name to display in the tabs, and a boolean as to whether or not to display the
client in the tabs).

Part of the HTML header which the handler generates is a set of tabs linking
to all of the clients in this list.
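As a rough illustration, the client list and the tab generation might be sketched as below. The friendly names, the key names `friendly_name` and `show_in_tabs`, the choice of which clients appear in the tabs, the `/ivle` root and the `render_tabs` function are all assumptions made for the example, not decisions.

```python
from html import escape

# Hypothetical client list: internal name -> data about the client.
CLIENTS = {
    "home": {"friendly_name": "Files", "show_in_tabs": True},
    "edit": {"friendly_name": "Editor", "show_in_tabs": False},
    "console": {"friendly_name": "Console", "show_in_tabs": True},
    "tutorial": {"friendly_name": "Tutorials", "show_in_tabs": True},
}


def render_tabs(clients, root="/ivle"):
    """Build an HTML list of links to every client flagged for the tabs."""
    items = []
    for name, info in clients.items():
        if not info["show_in_tabs"]:
            continue  # valid client, but not shown in the tab bar
        href = escape("%s/%s" % (root, name), quote=True)
        items.append('<li><a href="%s">%s</a></li>'
                     % (href, escape(info["friendly_name"])))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Keeping the list as a plain dictionary means the same data can drive both the tabs and the lookup of a client from a URL.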

Each client will be located physically in a directory "clients", in a
subdirectory named after the client (e.g. the console is located in
"clients/console"). There *must* be a file in this directory called
**client.py**. This file is called by the handler for most requests.

**Discussion**: We want the handler to handle many requests (such as for CSS,
JavaScript and image files) directly, simply loading a file from an absolute
location and serving it. Is it OK to simply have a list of file extensions
which will automatically be served without going into the client interface
(e.g. .js, .css, .jpg)? If so, can we let the web server do this without
even bothering our main handler?

The remainder of this discussion ignores the possibility of such "unhandled"
files, assuming they have been served up and not passed to the client.

Inside client.py, there is a fixed interface which all clients must follow.
Firstly, there is a set of information which the handler must pass to the
client in numerous calls, such as the username, the URL, nicely split-up parts
of the URL (the path and the GET variables), the POST data, and mod_python's
low-level Request object. This information is encapsulated into an object and
passed as a single argument to the client handling functions.
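One possible shape for this object is sketched below. The class name `ClientRequest`, its field names and the helper `build_client_request` are hypothetical; the only thing taken from the design is the list of things the object carries.

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qs, urlsplit


@dataclass
class ClientRequest:
    """Hypothetical bundle of per-request information passed to a client."""
    username: str
    url: str
    path: str
    get_vars: dict = field(default_factory=dict)
    post_data: bytes = b""
    raw_request: object = None  # stands in for mod_python's Request object


def build_client_request(username, url, post_data=b"", raw_request=None):
    """Split the URL once so that clients do not have to parse it again."""
    parts = urlsplit(url)
    return ClientRequest(
        username=username,
        url=url,
        path=parts.path,
        get_vars=parse_qs(parts.query),  # maps each name to a list of values
        post_data=post_data,
        raw_request=raw_request,
    )
```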

Note that, as stated above, the handler may need to insert HTML content into
the output stream. Instead of having two separate function calls (a call to
find the MIME type and a call to get content), we'll simply provide the client
with a wrapper object whose methods it can call back into.

To this end, the client receives an object containing all of the information,
as well as an object with some methods to call. The handler passes this to a
function in client.py, `handle`. The callback object contains the following
methods:

* set_mime_type(string) - Sets the output MIME type. May be called any number
  of times (including zero, in which case the type defaults to HTML), but may
  not be called after any calls to `write`.
* set_status(string) - Sets the HTTP response status. The string is a numeric
  code followed by a description, for example "404 Not Found". May not be
  called after any calls to `write`.
* set_location(string) - Sets the Location field of the HTTP response to a new
  URL. For use with 300-level HTTP response codes. May not be called after any
  calls to `write`.
* write(string) - Writes raw data to the output.

Note that this is very similar to the CGI interface, but much higher level (we
have functions to call instead of writing strings, and we send the GET and
POST data in a packaged object instead of environment variables and stdin).

Note that, as with CGI, there is a "cutoff point" during the processing
(immediately when the first call to `write` is made) at which the response
headers are written to the server. It is at this point that the handler
also writes the HTML header if the MIME type is appropriate.
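To make the cutoff point concrete, here is a minimal sketch of the callback object. Only the four method names come from the interface above. The class name `Response`, the `emit` callable it writes through, the CGI-style header lines, the default status of "200 OK", the placeholder HTML header and the argument order of `handle` are all assumptions for the example.

```python
class Response:
    """Hypothetical callback object passed to a client's handle() function."""

    def __init__(self, emit, html_header="<html><body>\n"):
        self._emit = emit                # callable that receives raw output
        self._html_header = html_header  # placeholder for the common header
        self._mime_type = "text/html"    # default when the client sets none
        self._status = "200 OK"          # assumed default status
        self._location = None
        self._headers_sent = False

    def _require_before_write(self, method):
        if self._headers_sent:
            raise RuntimeError(method + " may not be called after write")

    def set_mime_type(self, mime_type):
        self._require_before_write("set_mime_type")
        self._mime_type = mime_type

    def set_status(self, status):
        self._require_before_write("set_status")
        self._status = status

    def set_location(self, url):
        self._require_before_write("set_location")
        self._location = url

    def write(self, data):
        # The first write is the cutoff point: headers go out before any data.
        if not self._headers_sent:
            self._send_headers()
        self._emit(data)

    def _send_headers(self):
        self._headers_sent = True
        lines = ["Status: " + self._status,
                 "Content-Type: " + self._mime_type]
        if self._location is not None:
            lines.append("Location: " + self._location)
        self._emit("\r\n".join(lines) + "\r\n\r\n")
        # Only HTML output gets the common page header.
        if self._mime_type == "text/html":
            self._emit(self._html_header)


def handle(info, response):
    """Example client entry point; the argument order is an assumption."""
    response.set_mime_type("text/plain")
    response.write("Hello, " + info.username)
```

One case this sketch leaves open is a client that never calls `write` (for instance a bare redirect); the handler would presumably have to send the headers itself once `handle` returns.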

URLs
----

It would be good if we had full control of URLs and were able to make them
"nice" at all times. The criteria for "nice" URLs are as follows:

* The paths in the URLs reflect a sensible hierarchy of where you are in the
  program at the current time.
* The URLs do not contain any file extensions for the pages (no .html or
  .py), although linked files such as CSS, JavaScript and image files should
  have appropriate file extensions.
* The URLs do not contain unnecessary garbage arguments, and preferably no GET
  arguments at all (for instance, the file browser will specify the path to
  browse in the actual URL path, not the GET arguments).
* The URL does not contain the student's login name. This is implicit in the
  browser session. (This requirement allows for us to link to URLs in
  documentation which will work for any student).

The top-level directory given in the URL determines the client which the
handler will pass off to. For instance, consider the URL:

    http://www.example.com/ivle/console

Since IVLE is located at `http://www.example.com/ivle`, the handler will
consider the "top-level directory" to be "console", and will therefore call
the client "console".

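A sketch of how the handler might pick the client out of a URL follows. The function name `split_client_path` and the `root` parameter are hypothetical, and the sketch assumes IVLE is mounted at `/ivle` as in the example. Anything after the top-level directory is returned separately so that it can be passed on to the client.

```python
from urllib.parse import urlsplit


def split_client_path(url, root="/ivle"):
    """Return (client_name, remaining_path) for a URL under the IVLE root."""
    path = urlsplit(url).path
    if path != root and not path.startswith(root + "/"):
        raise ValueError("URL is outside the IVLE root: %r" % url)
    # Strip the root, then split off the first path component.
    rest = path[len(root):].lstrip("/")
    client, sep, remainder = rest.partition("/")
    # Keep the leading slash on whatever follows the client name.
    return client, sep + remainder
```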
The file browser's client name will be "home". This is a bit of a trick to
allow the file browser URLs to be completely natural. For example:

    http://www.example.com/ivle/home/151/proj1/

In this instance, the handler will see the top-level directory as "home", and
will therefore link to the file browser client. The file browser client will
then receive the additional arguments passed to it in some way, which in this
case are "/151/proj1/". The file browser will know where students' directories
are stored (maybe "/home/students/") and will also know the name of the student
from the session information, and will therefore be able to navigate to
"/home/students/jbloggs/151/proj1/".

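The mapping from URL path to student directory could be sketched as below. The function name `resolve_student_path` is hypothetical and `/home/students` is only the tentative location mentioned above. The check that the result stays inside the student's own directory is an extra precaution added for the example, not something this design specifies. Note that `posixpath.normpath` drops the trailing slash, so "/151/proj1/" for the user "jbloggs" resolves to "/home/students/jbloggs/151/proj1".

```python
import posixpath

# Tentative location of students' directories; not yet decided.
STUDENT_ROOT = "/home/students"


def resolve_student_path(username, url_path, student_root=STUDENT_ROOT):
    """Map the path the file browser receives to a path on disk."""
    home = posixpath.join(student_root, username)
    # Treat the URL path as relative to the student's home directory.
    full = posixpath.normpath(posixpath.join(home, url_path.lstrip("/")))
    # Extra precaution: refuse paths that climb out of the home directory.
    if full != home and not full.startswith(home + "/"):
        raise ValueError("path escapes the student's directory")
    return full
```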
Planned Clients
---------------

### File Browser ###

Top-level directory: `home`

### Text Editor ###

Top-level directory: `edit`

### Console ###

Top-level directory: `console`

### Tutorial Pages ###

Top-level directory: `tutorial`