IVLE - System Architecture
==========================

Author: Matt Giuca
Date: 10/12/2007

This document describes the high-level system architecture of IVLE,
specifically with respect to the "pluggable clients" interface.

Users and authorization
-----------------------

We need some way to authenticate users and store information about a logged-in
user. Whether this information is stored in a database local to our system
remains to be seen.

Importantly, we need some way to send user information to the clients. This is
discussed in the "pluggable clients" section.

Pluggable clients
-----------------

The IVLE system is largely just a collection of various components, called
"clients", such as the file browser, text editor, console, tutorial sheets,
etc.

The architecture provides a common interface into which clients can be
plugged.

Firstly, we want all HTML pages on the site to be generated with a common
header. The easiest way to do this is to write our own Python handler which
is common to the entire application (this replaces the standard mod_python
handlers such as Publisher).

This top-level handler handles all authentication (for instance,
checking the session to see if a user is logged in properly and if not,
redirecting to the login page). It then calls the appropriate client based on
the URL, and writes the common header when the client requests it. The test
of "whether the student is an Informatics student" is considered part of the
authentication layer. (So students who are not enrolled in Informatics are
treated the same way as an invalid username).

Note that some clients ("login" and "exec") do not require authentication.
This will be one of the properties of the client in the global clients file.

Note that the handler does *not* perform authorization - that is left up to
the clients.

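As a rough illustration of the authentication step described above, the
handler's decision for one request could look like the sketch below. The
function name `route`, the `REQUIRES_AUTH` table and the return values are
all hypothetical, chosen only for this example; the real clients file may look
quite different. Note that the sketch only authenticates; authorization is
still left to the clients.

```python
# Hypothetical subset of the global clients file, keyed by action (URL name).
# The value says whether the client requires a logged-in user.
REQUIRES_AUTH = {
    "console": True,
    "files": True,
    "login": False,   # the login client must be reachable when logged out
    "serve": False,   # action name of the "exec" client
}


def route(action, username):
    """Decide what the top-level handler does with one request.

    `username` is the logged-in user taken from the session, or None if
    nobody is logged in. Returns a (decision, target) pair.
    """
    if action not in REQUIRES_AUTH:
        return ("error", "404 Not Found")
    if REQUIRES_AUTH[action] and username is None:
        # Not logged in: send the user to the login page instead.
        return ("redirect", "login")
    # Authenticated (or authentication not needed): call the client.
    return ("call", action)
```

For example, `route("console", None)` asks for a redirect to the login page,
while `route("serve", None)` goes straight to the client.
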
One special feature of the handler will be the ability to write an XHTML
header (which includes the user's name and a link to their profile page, the
IVLE logo, and tabs for all the clients). This is important to keep a
consistent interface between the clients. This header will be available upon
request from the client. It is up to the client to NOT request a header for
non-HTML content (or the content will be corrupted), and also not to request a
header when executing students' code (i.e. the exec module will never request
a header).

### Plugin interface ###

The top-level handler will keep a Python file (or a text, JSON, etc. file)
containing a list of valid clients. This is a dictionary mapping clients'
internal names (the top-level directories, as described below in "URLs" and
"Planned Clients") to some other data about the clients (such as a
friendly name to display in the tabs, and a boolean as to whether or not to
display the client in the tabs).

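To make the shape of this clients file concrete, here is a minimal sketch of
it as a Python dictionary. The field names (`tab_name`, `show_in_tabs`,
`requires_auth`) and the helper `tab_names` are assumptions made for this
example, not a settled format; the client names and tab names are the ones
listed under "Planned Clients".

```python
# Hypothetical clients file. Keys are the clients' internal names; the
# field names are illustrative only.
CLIENTS = {
    "browser":  {"tab_name": "Files",    "show_in_tabs": True,  "requires_auth": True},
    "editor":   {"tab_name": "Edit",     "show_in_tabs": True,  "requires_auth": True},
    "console":  {"tab_name": "Console",  "show_in_tabs": True,  "requires_auth": True},
    "tutorial": {"tab_name": "Tutorial", "show_in_tabs": True,  "requires_auth": True},
    "exec":     {"tab_name": None,       "show_in_tabs": False, "requires_auth": False},
    "admin":    {"tab_name": None,       "show_in_tabs": False, "requires_auth": True},
    "login":    {"tab_name": None,       "show_in_tabs": False, "requires_auth": False},
}


def tab_names(clients):
    """Return the friendly names of the clients that get a tab, in file order."""
    return [info["tab_name"] for info in clients.values() if info["show_in_tabs"]]
```

Keeping this as plain data means the handler can build the tabs and check the
authentication flag without importing any client code.
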
Part of the HTML header which the handler generates is a set of tabs linking
to all of the clients in this list, or at least the ones with "show in tabs"
turned on. Clients such as "exec" and "admin" will not have a tab.

Each client will be located physically in a directory "clients", in a
subdirectory of the client's name (e.g. the console is located in
"clients/console"). There *must* be a file in this directory called
**client.py**. This file is called by the handler for most requests.

All requests will go through the handler, with one exception: there is some
media (such as CSS, JavaScript and image files which are directly part of the
application itself) which we do not want to pass through the handler. These
will be placed in a special top-level directory (e.g. "/media"), which Apache
will be told to serve directly.

This means that the contents of each client directory are a Python program
*only*, and include no files accessible by the browser. Each directory
consists of client.py, plus any Python files imported by client.py (but none
of these files will directly serve web content).

Inside client.py, there is a fixed interface which all clients must follow.
Firstly, there is a set of information which the handler must pass to the
client in numerous calls - such as the username, the URL, and nicely split-up
parts of the URL such as the path, the GET variables, and also the POST data,
as well as mod_python's low-level Request object.
This information is encapsulated into an object and passed as a single
argument to the client handling functions.

Note that as stated above, the handler may need to insert HTML contents into
the output stream. Instead of having two separate function calls (a call to
find the mime type and a call to get content), we'll simply provide a wrapper
object to the client, on which the client can make callbacks.

To this end, the client receives an object containing all of the information,
as well as an object with some methods to call. The handler passes these to a
function in client.py, `handle`. The callback object contains the following
methods:

* set_mime_type(string) - Sets the output mime type. May be called any number
  of times (including 0, in which case it will default to HTML), but may not
  be called after any writing has been done.
* set_status(string) - Sets the HTTP response status. The string is a numeric
  code followed by a description, for example "404 Not Found". May not be
  called after any writing has been done.
* set_location(string) - Sets the Location field of the HTTP response to a new
  URL. For use with 300-level HTTP response codes. May not be called after any
  writing has been done.
* write_html_headers() - Writes the general site headers to the output stream.
  May not be called after any writing has been done.
* write(string) - Writes raw data to the output.

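The methods listed above could be sketched as follows. This is only an
illustration of the rule that nothing header-related may change once writing
has started; the class name `Response`, the exception `ResponseError`, the
in-memory `body` list and the use of a plain dictionary for the request
information are all assumptions made for the example (the real object would
write to mod_python's Request).

```python
class ResponseError(Exception):
    """Raised when a header is changed after output has started."""


class Response:
    """Hypothetical stand-in for the callback object given to `handle`."""

    def __init__(self):
        self.mime_type = "text/html"  # default if set_mime_type is never called
        self.status = "200 OK"
        self.location = None
        self.body = []                # chunks written so far
        self.started = False          # True once any writing has been done

    def _check_not_started(self):
        if self.started:
            raise ResponseError("output has already started")

    def set_mime_type(self, mime_type):
        self._check_not_started()
        self.mime_type = mime_type

    def set_status(self, status):
        self._check_not_started()
        self.status = status

    def set_location(self, url):
        self._check_not_started()
        self.location = url

    def write_html_headers(self):
        self._check_not_started()
        self.started = True
        self.body.append("<!-- common site header goes here -->\n")

    def write(self, data):
        self.started = True
        self.body.append(data)


def handle(req, response):
    """Example client entry point: serves a plain-text greeting."""
    response.set_mime_type("text/plain")  # non-HTML, so no site header
    response.write("Hello, %s\n" % req["username"])
```

A client producing an HTML page would instead call `write_html_headers()`
before its first `write`.
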
Note that this is very similar to the CGI interface, but much higher level (we
have functions to call instead of writing strings, and we send the GET and
POST data in a packaged object instead of environment variables and stdin).

Note that, as with CGI, there is a "cutoff point" during the processing
(immediately when the first call to `write` or `write_html_headers` is made)
at which the response headers are sent. After this point, the mime type,
status and location can no longer be changed.

### Help files ###

There will be a "help" app which is special in that it looks inside all of the
other apps' directories for a help file. So aside from "client.py",
another special file is "help.html", which is a static help file for each
module, sitting in that app's top-level directory.

help.html is not to be served directly. The "help" app will embed it within
another page. Therefore it is not a complete HTML file - it should contain
just the contents of a body element (it should not contain html or body tags).

### Application directory hierarchy ###

Due to the handler, we have a nice property that the application directory
hierarchy is completely decoupled from the apparent hierarchy on the web. This
gives us two opportunities: we can give the applications (in their directory
hierarchy) different names from those the URLs suggest, and we can lay out the
directory hierarchy with developers' interests in mind.

We capitalise on the first opportunity by mapping the "action" (URL name) of a
client to its actual name. (Clients are indexed by URL name so they can
be looked up when a URL is requested).

The proposed application directory hierarchy is:

    /
    /clients            - All clients go in here
    /clients/myclient   - "actual" names of the clients
    /dispatch           - Code files for the top-level dispatch
    /dispatch.py        - Entrypoint for the top-level dispatch
    /media              - Publicly viewable files
                          (Note that this directory hierarchy maps onto the
                          web site)
    /media/myclient     - media files specific to each client go in a subdir
    /media/dispatch     - media files for the top-level dispatch
    /conf               - Special .py files which hold configuration info
                          (for the admin to edit, not the programmers).

URLs
----

It would be good if we had full control of URLs and were able to make them
"nice" at all times. The criteria for "nice" URLs are as follows:

* The paths in the URLs reflect a sensible hierarchy of where you are in the
  program at the current time.
* The URLs do not contain any file extensions for the pages (no .html or
  .py), although linked files such as CSS, JavaScript and image files should
  have appropriate file extensions.
* The URLs do not contain unnecessary garbage arguments, and preferably no GET
  arguments at all (for instance, the file browser will specify the path to
  browse in the actual URL path, not the GET arguments).
* The URL does not contain the student's login name. This is implicit in the
  browser session. (This requirement allows us to link to URLs in
  documentation which will work for any student). (Note that URLs may contain
  other students' login names for browsing their work - this is determined by
  the individual clients).

The top-level directory given in the URL determines the client which the
handler will pass off to. For instance,

    http://www.example.com/ivle/console

Since IVLE is located at `http://www.example.com/ivle`, it will consider the
"top-level directory" to be "console", and therefore will call the client
whose action is "console". This may not be the actual name of the client. For
example, the "edit" action maps onto the "editor" client, while the "serve"
action maps onto the "exec" client. (Perhaps it is best for simplicity if
these do in fact correspond).

For another example, consider the file browser (action name "files"). The URL
may have subdirectories after it which indicate the path to explore. This will
be detailed in the clients section below. An example of a browse URL is:

    http://www.example.com/ivle/files/jdoe/151/proj1/

In this instance, the handler will see the top-level directory as "files", and
will therefore pass the request to the file browser client. The file browser
client will then receive the remainder of the path, which in this case is
"jdoe/151/proj1/". The file browser client will then handle this path
and serve up the correct directory.

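The splitting described above can be sketched in a few lines. The function
name `split_path`, its `site_root` parameter and the `ACTIONS` table are
hypothetical names for this example; the action-to-client pairs themselves
are the ones given in this document.

```python
# Action (URL name) -> actual client name, as listed in this document.
ACTIONS = {
    "files": "browser",
    "edit": "editor",
    "serve": "exec",
    "console": "console",
}


def split_path(url_path, site_root="/ivle"):
    """Split the path part of a URL into (action, remaining path).

    `url_path` is the path only, e.g. "/ivle/files/jdoe/151/proj1/".
    """
    root = site_root.rstrip("/")
    if url_path != root and not url_path.startswith(root + "/"):
        raise ValueError("URL is outside the site root: %r" % url_path)
    rest = url_path[len(root):].lstrip("/")
    # The first path component is the action; everything after it is
    # handed to the client untouched.
    action, _, client_path = rest.partition("/")
    return action, client_path
```

For the browse URL above, `split_path("/ivle/files/jdoe/151/proj1/")` gives
the action "files" and the client path "jdoe/151/proj1/", and
`ACTIONS["files"]` then names the `browser` client.
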
### Relative URLs inside HTML content ###

It is a requirement that the application can be placed anywhere in a web
server's directory hierarchy, not just at the top level. This means HTML
should never contain hard-coded absolute URLs (beginning with, e.g.,
"/browse"), since these would only work if the application were in the site
root.

To solve the problem of how to generate URLs, one of the fields the handler
will pass into the clients (which it will read from a config file somewhere)
will be the "site root". This may be "/ivle", for instance. Therefore all
absolute URLs generated by the applications must be prefixed with the "site
root". (In our case the site root will probably be "/", but it's a good
feature to have).

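A small helper along these lines would keep the prefixing in one place. The
name `make_url` is an assumption for this sketch; in practice the site root
would come from the configuration rather than being passed in by hand.

```python
import posixpath


def make_url(site_root, path):
    """Prefix a site-relative path with the configured site root."""
    # Strip leading slashes so that posixpath.join does not discard the root.
    return posixpath.join(site_root, path.lstrip("/"))
```

With a site root of "/ivle", `make_url("/ivle", "files/jdoe")` produces
"/ivle/files/jdoe"; with a site root of "/", the same call site produces
"/files/jdoe" without any change to the client code.
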
### Student's directory hierarchy, common code ###

Many clients share the concept of exploring the student's directory hierarchy,
as explained above for the browser module. The common code for handling the
student id or group name (etc.) and authorization will be available as a
separate module for all such clients (browser, editor, exec) to use.

Planned Clients
---------------

### File Browser, Text Editor and Executor ###

Three of the most important clients are the file browser ("browser"), text
editor ("editor") and executor ("exec"). These three have in common that they
all access the student's directory hierarchy and files. They share a lot of
code, and in particular, there is a common server-side handler for file
access, directory listings and Subversion.

Firstly, every file and directory is classified into one of the following
categories (based on its inferred MIME type and possibly whether it contains
invalid Unicode characters):

1. Directory
2. Image
3. Audio
4. Text file (unless it fits the above, e.g. SVG files)
5. Any other binary file

How each of these is handled depends on which of the 3 clients is accessing
the file.

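One possible reading of this classification is sketched below, using the
standard `mimetypes` module to infer the type from the file name. The function
name `classify`, its parameters, and the rule "anything that decodes as UTF-8
is text" are assumptions for illustration; the notes above leave the exact
test open.

```python
import mimetypes


def classify(name, is_dir, data=b""):
    """Classify a file as directory, image, audio, text or binary.

    `name` is the file name, `is_dir` says whether it is a directory, and
    `data` is the file's contents as bytes (unused for directories).
    """
    if is_dir:
        return "directory"
    mime_type, _ = mimetypes.guess_type(name)
    if mime_type is not None:
        # Checked before the text test, so SVG (image/svg+xml) is an image.
        if mime_type.startswith("image/"):
            return "image"
        if mime_type.startswith("audio/"):
            return "audio"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"
    return "text"
```

Checking the MIME type first is what makes SVG files land in the image
category even though their contents are valid text.
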
#### File Browser ####

Name: `browser`
Action name: `files`
Tab name: "Files"

1. Directory - Displays a directory listing (this is its primary purpose).
2. Image - Displays the image inside the main navigation interface.
3. Audio - (non-core) Provides a streaming audio player within the main
   navigation interface.
4. Text file - Redirects to the text editor.
5. Binary file - Provides a download link within the main navigation
   interface.

Note that no matter what, using the browser will remain within the navigation
interface so you will never be "lost" inside a raw image or similar. It also
will not throw binary files as downloads directly to you.

Note that the src of the image tag in (2) and the href of the download link in
(5) will simply be links to the exec version of the same file.

The file browser will include the Python file which serves up JSON responses
to requests for directory hierarchies, and performs SVN and file access
commands. This file will be used by the text editor (at least) and possibly
exec.

#### Text Editor ####

Name: `editor`
Action name: `edit`
Tab name: "Edit"

No matter what, the editor provides a text area (with advanced editing
capabilities and syntax highlighting) for any file, even if it is binary. The
only exception is directories, which redirect to the browser.

Note that it will not be possible to click into the editor for a binary file
(the browser will not offer an edit link). However, it will still be possible
to navigate there manually, in which case the user has to deal with the
result themselves.

#### Executor ####

Name: `exec`
Action name: `serve`
Tab name: (not shown)

The executor is used to directly serve files out of a student's directory, as
if it were a standard web server. (It can be thought of as a little web server
inside IVLE). This means that:

* A whitelist of file types is kept which are simply served up raw. This
  includes HTML, JavaScript, CSS, all reasonable image and audio formats, etc.
* Special "executable" file types (.py, .psp) are executed. Exec will call
  popen on a Python process which loads a mod_python handler, cgihandler or
  psphandler on the given file.
* Banned files produce HTTP errors.
* When presented with a directory, it first tries to execute `__init__.py`
  (the default item for the directory). It could also look for `index.html` or
  `index.psp` if that failed. Failing that, it returns an HTTP 403 Forbidden
  error.

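The directory rule in the last point above amounts to a fixed search order.
A minimal sketch, assuming the fallbacks to `index.html` and `index.psp` are
adopted, and using hypothetical names (`DEFAULT_ITEMS`, `resolve_directory`):

```python
# Default items to try, in order, when exec is asked for a directory.
DEFAULT_ITEMS = ["__init__.py", "index.html", "index.psp"]


def resolve_directory(entries):
    """Pick the item to serve for a directory.

    `entries` is the collection of file names in the directory. Returns a
    (status, name) pair; `name` is None when nothing can be served.
    """
    for name in DEFAULT_ITEMS:
        if name in entries:
            return ("200 OK", name)
    return ("403 Forbidden", None)
```

For instance, a directory containing both `index.html` and `__init__.py`
resolves to `__init__.py`, and an empty directory yields the 403 error.
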
### Console ###

Name: `console`
Action name: `console`
Tab name: "Console"

### Tutorial Pages ###

Name: `tutorial`
Action name: `tutorial`
Tab name: "Tutorial"

### Administration ###

Name: `admin`
Action name: `admin`
Tab name: (not shown)

The client checks authorization for admin status. The tab is not shown, so
students will not normally know about this (but even if they find it they will
be denied access).

### Login ###

Name: `login`
Action name: `login`
Tab name: (not shown)

Authentication is not required. Presents a login box.

Other similar clients are "logout" (which just immediately logs the current
user out and redirects to the main page), and "profile" (user settings).