Notes on mod_python.Session
---------------------------

Author: Tom Conway
Date: 12/12/2007

``mod_python`` provides some automagic for cookie-based sessions. It
carefully separates most of the session logic from how the session is
stored. The base class ``BaseSession`` contains most of the logic, and
``mod_python`` itself has three derived classes (``MemorySession``,
``DbmSession`` and ``FileSession``) for storing session objects in
memory, in a dbm file, and on the filesystem. In each case, the
implementations use Apache's locking mechanism to serialize updates to
the session store. This mechanism takes care of mutual exclusion
between the multiple processes of an Apache instance, but does not
provide any kind of locking for multiple servers sharing a filesystem
for file-based session storage. There is code for storing sessions in
MySQL (and SQLite) floating around on the net, though none has made it
into any distribution. This code uses the underlying database to take
care of the locking.

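The split between session logic and storage can be pictured with a
small, self-contained sketch. It is not ``mod_python``'s code: the class
names ``SketchBaseSession`` and ``SketchMemorySession`` are hypothetical,
and the in-process ``threading.Lock`` is only a stand-in for Apache's
lock.

```python
import threading
import uuid


class SketchBaseSession(dict):
    """Hypothetical base class: session logic only, no storage details."""

    _lock = threading.Lock()  # stand-in for Apache's cross-process lock

    def __init__(self, sid=None):
        super().__init__()
        self.id = sid or uuid.uuid4().hex
        self.is_new = True
        if sid is not None:
            stored = self.do_load()
            if stored is not None:
                self.update(stored)
                self.is_new = False

    def save(self):
        # Updates to the store are serialized by the lock.
        with self._lock:
            self.do_save(dict(self))

    # Storage hooks that a derived class must supply.
    def do_load(self):
        raise NotImplementedError

    def do_save(self, data):
        raise NotImplementedError


class SketchMemorySession(SketchBaseSession):
    """Hypothetical derived class: keeps sessions in a process-local dict."""

    _store = {}

    def do_load(self):
        return self._store.get(self.id)

    def do_save(self, data):
        self._store[self.id] = data
```

A dbm or file variant would override only the two storage hooks. Note
that a lock of this kind protects one machine at most, which is exactly
the limitation described above.
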
In the case of IVLE, we wish to be able to share the session objects not
merely between the separate processes of an Apache instance, but between
the multiple servers in a load-balancing cluster. There are three
high-level strategies we could use to deal with this:

1. Use a static load balancing strategy such as hashing the client's IP
address to determine which node in the cluster should serve the
request.

2. Use a SQL backend to store sessions, or create a filesystem-based
storage mechanism that does the necessary locking.

3. Work around the problem by using session objects in a way that avoids
the locking problems.

Strategy 1 has the advantage that we could use in-memory or dbm session
storage without having to worry about race conditions between servers.
On the other hand, it can run into serious problems if the distribution
of IP addresses is such that load is not balanced. This can be the case
if an ISP uses NAT firewalling (some do!), since all the requests from
that ISP will apparently be coming from a single IP address and will
therefore be routed to the same node in the cluster. As well as the
potential for failing to balance the load, such a scheme, if it works,
routes an equal proportion of requests to each node in the cluster. At
times when overall load is light, this may mean that we lose the
opportunity to put superfluous nodes into a power-saving mode.

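The NAT problem with strategy 1 is easy to see in a small sketch. The
node names, the choice of SHA-256 and the helper names ``node_for`` and
``load_per_node`` are all illustrative assumptions, not part of IVLE or
of any particular load balancer.

```python
import hashlib
from collections import Counter

NODES = ["node0", "node1", "node2", "node3"]  # hypothetical cluster


def node_for(client_ip, nodes=NODES):
    """Pick a node from a stable hash of the client's IP address."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


def load_per_node(client_ips):
    """Count how many requests each node would receive."""
    return Counter(node_for(ip) for ip in client_ips)


# 1000 clients with distinct addresses are spread over the nodes...
spread = load_per_node("10.0.%d.%d" % (i // 250, i % 250) for i in range(1000))
# ...but 1000 clients behind one NAT address all hit the same node.
natted = load_per_node("203.0.113.7" for _ in range(1000))
```

Because the mapping depends only on the address, a given client always
reaches the node holding its session, but the balancer has no way to
move load off a busy node or away from one that could be powered down.
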
Strategy 2, while having the advantage of avoiding race conditions, is
likely to be expensive. The use of a SQL backend is likely to be quite
slow, and the SQL backend itself will be subject to significant load
(i.e. at least one operation per request). A filesystem-based solution
is likely to be quite slow too. It has to work on a shared filesystem,
for which locking is a general issue (generally, you end up using
``mkdir`` as the mechanism for creating a lock). If we want mutable
session information, then we will *have* to do something in this vein.

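A minimal sketch of the ``mkdir`` locking idiom follows. It assumes that
directory creation is atomic on the filesystem in question, which needs
to be verified for any particular shared filesystem, and it makes no
attempt to clean up stale locks left by a crashed process. The name
``mkdir_lock`` and the session file name are hypothetical.

```python
import contextlib
import os
import tempfile
import time


@contextlib.contextmanager
def mkdir_lock(lock_dir, timeout=5.0, poll=0.05):
    """Hold a lock represented by the existence of the directory lock_dir."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.mkdir(lock_dir)  # succeeds for exactly one caller
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("could not lock %s" % lock_dir)
            time.sleep(poll)
    try:
        yield
    finally:
        os.rmdir(lock_dir)  # release the lock


# Usage: guard an update to a (hypothetical) session file.
session_dir = tempfile.mkdtemp()
lock_path = os.path.join(session_dir, "abc123.lock")
with mkdir_lock(lock_path):
    with open(os.path.join(session_dir, "abc123"), "w") as f:
        f.write("user=tom\n")
```

Every session update pays for at least a ``mkdir`` and an ``rmdir`` on
the shared filesystem, plus any time spent polling, which is where the
expected slowness comes from.
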
Strategy 3 is fragile because we need to be careful about how we use
session objects, but if the constraints are simple enough to be
practical then avoiding the locking issue is highly desirable. A simple
constraint that may be workable is to require that once created, a
session object is treated as read-only until it is deleted. It is
possible (though unlikely) that we could create session objects that
immediately become orphaned, but we will not ever create a situation in
which the application does anything bad. If we can make strategy 3
work, then it is easily the best strategy to use.

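The read-only constraint can be sketched as a store that offers create,
read and delete but deliberately no update. This is an illustration
under assumptions, not IVLE's implementation: the function names, the
JSON encoding and the temporary directory standing in for the shared
filesystem are all hypothetical, and the sketch assumes that renaming a
file within one directory is atomic there.

```python
import json
import os
import tempfile
import uuid

SESSION_DIR = tempfile.mkdtemp()  # stand-in for the shared filesystem


def create_session(data):
    """Write a new session exactly once and return its id."""
    sid = uuid.uuid4().hex
    path = os.path.join(SESSION_DIR, sid)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
    os.rename(tmp, path)  # publish the complete file under its final name
    return sid


def read_session(sid):
    """Return the session's contents, or None if it does not exist."""
    try:
        with open(os.path.join(SESSION_DIR, sid)) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def delete_session(sid):
    """Remove the session; deleting twice is harmless."""
    try:
        os.remove(os.path.join(SESSION_DIR, sid))
    except FileNotFoundError:
        pass
```

Since a session file never changes after it is published, readers on any
server see either the whole session or nothing, and no lock is needed.
A session whose id never reaches the client is simply an orphan to be
cleaned up later.
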
The main use for session objects in IVLE will be to *cache* authentication and
authorization information. This means that when a user logs in, we authenticate
(the authentication mechanism is not important to our current discussion),
then retrieve the authorization information for that user, and store it in
the session object. For each page access until the user logs out, we can then
use the information from the session object.
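
The login flow described above might look roughly like this. Everything
here is a placeholder: the ``SESSIONS`` dict stands in for the session
store, and the ``AUTHORIZATION`` table, the password check and the
function names are invented for illustration only.

```python
import uuid

SESSIONS = {}  # stand-in for the session store

# Hypothetical authorization data; IVLE's real source is not described here.
AUTHORIZATION = {"tom": {"role": "lecturer"}, "ann": {"role": "student"}}


def authenticate(user, password):
    """Placeholder check; the real mechanism is out of scope."""
    return user in AUTHORIZATION and password == "secret"


def login(user, password):
    """Authenticate, then cache the user's authorization in a new session."""
    if not authenticate(user, password):
        return None
    sid = uuid.uuid4().hex
    SESSIONS[sid] = {"user": user, **AUTHORIZATION[user]}
    return sid


def handle_request(sid):
    """Serve a page using only the cached session information."""
    session = SESSIONS.get(sid)
    if session is None:
        return "please log in"
    return "hello %s (%s)" % (session["user"], session["role"])


def logout(sid):
    """Delete the session; it is never modified in between."""
    SESSIONS.pop(sid, None)
```

The session is written once at login, only read on each page access,
and removed at logout.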