~drizzle-trunk/drizzle/development

1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
1
Drizzle Replication
2
===================
3
4
Replication events are recorded using messages in the `Google Protocol Buffer
5
<http://code.google.com/p/protobuf/>`_ (GPB) format. GPB messages can contain
6
sub-messages. There is a single main "envelope" message, Transaction, that
7
is passed to plugins that subscribe to the replication stream.
8
1938.3.1 by David Shrewsbury
Add --replicate-query option.
9
Configuration Options
10
---------------------
11
12
**transaction_message_threshold**
13
14
    Controls the size, in bytes, of the Transaction messages. When a Transaction
15
    message exceeds this size, a new Transaction message with the same
16
    transaction ID will be created to continue the replication events.
17
    See :ref:`bulk-operations` below.
18
19
20
**replicate_query**
21
22
    Controls whether the originating SQL query will be included within each
23
    Statement message contained in the enclosing Transaction message. The
24
    default global value is FALSE which will not include the query in the
25
    messages. It can be controlled per session, as well. For example:
26
27
    ``drizzle> set @@replicate_query = 1;``
1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
28
1994.5.14 by Stewart Smith
talk about statement rollback (briefly) in replication docs
29
    The stored query should be used as a guide only, and never executed
30
    on a slave to perform replication as this will lead to incorrect results.
31
1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
32
Message Definitions
33
-------------------
34
35
The GPB messages are defined in .proto files in the drizzled/message
36
directory of the Drizzle source code. The primary definition file is
37
transaction.proto. Messages defined in this file are related in the
38
following ways::
39
40
41
  ------------------------------------------------------------------
42
  |                                                                |
43
  | Transaction message                                            |
44
  |                                                                |
45
  |   -----------------------------------------------------------  |
46
  |   |                                                         |  |
47
  |   | TransactionContext message                              |  |
48
  |   |                                                         |  |
49
  |   -----------------------------------------------------------  |
50
  |   -----------------------------------------------------------  |
51
  |   |                                                         |  |
52
  |   | Statement message 1                                     |  |
53
  |   |                                                         |  |
54
  |   -----------------------------------------------------------  |
55
  |   -----------------------------------------------------------  |
56
  |   |                                                         |  |
57
  |   | Statement message 2                                     |  |
58
  |   |                                                         |  |
59
  |   -----------------------------------------------------------  |
60
  |                             ...                                |
61
  |   -----------------------------------------------------------  |
62
  |   |                                                         |  |
63
  |   | Statement message N                                     |  |
64
  |   |                                                         |  |
65
  |   -----------------------------------------------------------  |
66
  ------------------------------------------------------------------
67
68
with each Statement message looking like so::
69
70
  ------------------------------------------------------------------
71
  |                                                                |
72
  | Statement message                                              |
73
  |                                                                |
74
  |   -----------------------------------------------------------  |
75
  |   |                                                         |  |
76
  |   | Common information                                      |  |
77
  |   |                                                         |  |
78
  |   |  - Type of Statement (INSERT, DELETE, etc)              |  |
79
  |   |  - Start Timestamp                                      |  |
80
  |   |  - End Timestamp                                        |  |
81
  |   |  - (OPTIONAL) Actual SQL query string                   |  |
82
  |   |                                                         |  |
83
  |   -----------------------------------------------------------  |
84
  |   -----------------------------------------------------------  |
85
  |   |                                                         |  |
86
  |   | Statement subclass message 1 (see below)                |  |
87
  |   |                                                         |  |
88
  |   -----------------------------------------------------------  |
89
  |                             ...                                |
90
  |   -----------------------------------------------------------  |
91
  |   |                                                         |  |
92
  |   | Statement subclass message N (see below)                |  |
93
  |   |                                                         |  |
94
  |   -----------------------------------------------------------  |
95
  ------------------------------------------------------------------
96
97
The Transaction Message
98
^^^^^^^^^^^^^^^^^^^^^^^
99
100
The main "envelope" message which represents an atomic transaction
101
which changed the state of a server is the Transaction message class.
102
103
The Transaction message contains two pieces:
104
105
#. A TransactionContext message containing information about the
106
   transaction as a whole, such as the ID of the executing server,
107
   the start and end timestamp of the transaction, and a globally-
108
   unique identifier for the transaction.
109
#. A vector of Statement messages representing the distinct SQL
110
   statements which modified the state of the server.  The Statement
111
   message is, itself, a generic envelope message containing a
112
   sub-message which describes the specific data modification which
113
   occurred on the server (such as, for instance, an INSERT statement.
114
115
The Statement Message
116
^^^^^^^^^^^^^^^^^^^^^
117
118
The generic "envelope" message containing information common to each
119
SQL statement executed against a server (such as a start and end timestamp
120
and the type of the SQL statement) as well as a Statement subclass message
121
describing the specific data modification event on the server.
122
123
Each Statement message contains a type member which indicates how readers
124
of the Statement should construct the inner Statement subclass representing
125
a data change.
126
1994.5.14 by Stewart Smith
talk about statement rollback (briefly) in replication docs
127
Statements are recorded seperatley as sometimes individual statements
128
have to be rolled back.
129
1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
130
1938.3.1 by David Shrewsbury
Add --replicate-query option.
131
.. _bulk-operations:
132
1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
133
How Bulk Operations Work
134
------------------------
135
136
Certain operations which change large volumes of data on a server
137
present a specific set of problems for a transaction coordinator or
138
replication service. If all operations must complete atomically on a
139
publishing server before replicas are delivered the complete
140
transactional unit:
141
142
#. The publishing server could consume a large amount of memory
143
   building an in-memory Transaction message containing all the
144
   operations contained  in the entire transaction.
145
#. A replica, or subscribing server, is wasting time waiting on the
146
   eventual completion (commit) of the large transaction on the
147
   publishing server. It could be applying pieces of the large
148
   transaction in the meantime...
149
150
In order to prevent the problems inherent in (1) and (2) above, Drizzle's
151
replication system uses a mechanism which provides bulk change
152
operations.
153
154
When a regular SQL statement modifies or inserts more rows than a
155
certain threshold, Drizzle's replication services component will begin
156
sending Transaction messages to replicas which contain a chunk
157
(or "segment") of the data which has been changed on the publisher.
158
159
When data is inserted, updated, or modified in the database, a
160
header containing information about modified tables and fields is
161
matched with one or more data segments which contain the actual
162
values changed in the statement.
163
164
It's easiest to understand this mechanism by following through a real-world
165
scenario.
166
167
Suppose the following table::
168
169
  CREATE TABLE test.person
170
  (
171
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
172
  , first_name VARCHAR(50)
173
  , last_name VARCHAR(50)
174
  , is_active CHAR(1) NOT NULL DEFAULT 'Y'
175
  );
176
177
Also suppose that test.t1 contains 1 million records.
178
179
Next, suppose a client issues the SQL statement::
180
181
  UPDATE test.person SET is_active = 'N';
182
183
It is clear that one million records could be updated by this statement
184
(we say, "could be" since Drizzle does not actually update a record if
185
the UPDATE would not change the existing record...).
186
187
In order to prevent the publishing server from having to construct an
188
enormous Transaction message, Drizzle's replication services component
189
will do the following:
190
191
#. Construct a Transaction message with a transaction context containing
192
   information about the originating server, the transaction ID, and
193
   timestamp information.
194
#. Construct an UpdateHeader message with information about the tables
195
   and fields involved in the UPDATE statement.  Push this UpdateHeader
196
   message onto the Transaction message's statement vector.
197
#. Construct an UpdateData message.  Set the segment_id member to 1.
198
   Set the end_segment member to true.
199
#. For every record updated in a storage engine, the ReplicationServices
200
   component builds a new UpdateRecord message and appends this message
201
   to the aforementioned UpdateData message's record vector.
202
#. After a certain threshold of records is reached, the
203
   ReplicationServices component sets the current UpdateData message's
204
   end_segment member to false, and proceeds to send the Transaction
205
   message to replicators.
206
#. The ReplicationServices component then constructs a new Transaction
207
   message and constructs a transaction context with the same
208
   transaction ID and server information.
209
#. A new UpdateData message is created.  The message's segment_id is
210
   set to N+1 and as new records are updated, new UpdateRecord messages
211
   are appended to the UpdateData message's record vector.
212
#. While records are being updated, we repeat steps 5 through 7, with
213
   only the final UpdateData message having its end_segment member set
214
   to true.
215
216
Handling ROLLBACKs
217
------------------
218
1994.5.14 by Stewart Smith
talk about statement rollback (briefly) in replication docs
219
Both transactions and individual statements may be rolled back.
220
1861.2.2 by David Shrewsbury
Correct rollback handling section.
221
When a transaction is rolled back, one of two things happen depending
222
on whether the transaction is made up of either a single Transaction
223
message, or if it is made up of multiple Transaction messages (e.g, bulk
224
load).
1861.2.1 by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation.
225
1861.2.2 by David Shrewsbury
Correct rollback handling section.
226
* For a transaction encapsulated entirely within a single Transaction
227
  message, the entire message is simply discarded and not sent through
228
  the replication stream.
229
* For a transaction which is made up of multiple messages, and at least
230
  one message has already been sent through the replication stream, then
231
  the Transaction message will contain a Statement message with type =
1938.3.1 by David Shrewsbury
Add --replicate-query option.
232
  ROLLBACK.