1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
1 |
Drizzle Replication
|
2 |
===================
|
|
3 |
||
4 |
Replication events are recorded using messages in the `Google Protocol Buffer |
|
5 |
<http://code.google.com/p/protobuf/>`_ (GPB) format. GPB messages can contain
|
|
6 |
sub-messages. There is a single main "envelope" message, Transaction, that |
|
7 |
is passed to plugins that subscribe to the replication stream. |
|
8 |
||
1938.3.1
by David Shrewsbury
Add --replicate-query option. |
9 |
Configuration Options
|
10 |
---------------------
|
|
11 |
||
12 |
**transaction_message_threshold**
|
|
13 |
||
14 |
Controls the size, in bytes, of the Transaction messages. When a Transaction |
|
15 |
message exceeds this size, a new Transaction message with the same |
|
16 |
transaction ID will be created to continue the replication events. |
|
17 |
See :ref:`bulk-operations` below. |
|
18 |
||
19 |
||
20 |
**replicate_query**
|
|
21 |
||
22 |
Controls whether the originating SQL query will be included within each |
|
23 |
Statement message contained in the enclosing Transaction message. The |
|
24 |
default global value is FALSE which will not include the query in the |
|
25 |
messages. It can be controlled per session, as well. For example: |
|
26 |
||
27 |
``drizzle> set @@replicate_query = 1;``
|
|
1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
28 |
|
1994.5.14
by Stewart Smith
talk about statement rollback (briefly) in replication docs |
29 |
The stored query should be used as a guide only, and never executed |
30 |
on a slave to perform replication as this will lead to incorrect results. |
|
31 |
||
1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
32 |
Message Definitions
|
33 |
-------------------
|
|
34 |
||
35 |
The GPB messages are defined in .proto files in the drizzled/message |
|
36 |
directory of the Drizzle source code. The primary definition file is |
|
37 |
transaction.proto. Messages defined in this file are related in the |
|
38 |
following ways:: |
|
39 |
||
40 |
||
41 |
------------------------------------------------------------------ |
|
42 |
| |
|
|
43 |
| Transaction message |
|
|
44 |
| |
|
|
45 |
| ----------------------------------------------------------- |
|
|
46 |
| | | |
|
|
47 |
| | TransactionContext message | |
|
|
48 |
| | | |
|
|
49 |
| ----------------------------------------------------------- |
|
|
50 |
| ----------------------------------------------------------- |
|
|
51 |
| | | |
|
|
52 |
| | Statement message 1 | |
|
|
53 |
| | | |
|
|
54 |
| ----------------------------------------------------------- |
|
|
55 |
| ----------------------------------------------------------- |
|
|
56 |
| | | |
|
|
57 |
| | Statement message 2 | |
|
|
58 |
| | | |
|
|
59 |
| ----------------------------------------------------------- |
|
|
60 |
| ... |
|
|
61 |
| ----------------------------------------------------------- |
|
|
62 |
| | | |
|
|
63 |
| | Statement message N | |
|
|
64 |
| | | |
|
|
65 |
| ----------------------------------------------------------- |
|
|
66 |
------------------------------------------------------------------ |
|
67 |
||
68 |
with each Statement message looking like so::
|
|
69 |
||
70 |
------------------------------------------------------------------
|
|
71 |
| |
|
|
72 |
| Statement message |
|
|
73 |
| |
|
|
74 |
| ----------------------------------------------------------- |
|
|
75 |
| | | |
|
|
76 |
| | Common information | |
|
|
77 |
| | | |
|
|
78 |
| | - Type of Statement (INSERT, DELETE, etc) | |
|
|
79 |
| | - Start Timestamp | |
|
|
80 |
| | - End Timestamp | |
|
|
81 |
| | - (OPTIONAL) Actual SQL query string | |
|
|
82 |
| | | |
|
|
83 |
| ----------------------------------------------------------- |
|
|
84 |
| ----------------------------------------------------------- |
|
|
85 |
| | | |
|
|
86 |
| | Statement subclass message 1 (see below) | |
|
|
87 |
| | | |
|
|
88 |
| ----------------------------------------------------------- |
|
|
89 |
| ... |
|
|
90 |
| ----------------------------------------------------------- |
|
|
91 |
| | | |
|
|
92 |
| | Statement subclass message N (see below) | |
|
|
93 |
| | | |
|
|
94 |
| ----------------------------------------------------------- |
|
|
95 |
------------------------------------------------------------------
|
|
96 |
||
97 |
The Transaction Message
|
|
98 |
^^^^^^^^^^^^^^^^^^^^^^^
|
|
99 |
||
100 |
The main "envelope" message which represents an atomic transaction |
|
101 |
which changed the state of a server is the Transaction message class. |
|
102 |
||
103 |
The Transaction message contains two pieces: |
|
104 |
||
105 |
#. A TransactionContext message containing information about the
|
|
106 |
transaction as a whole, such as the ID of the executing server, |
|
107 |
the start and end timestamp of the transaction, and a globally- |
|
108 |
unique identifier for the transaction. |
|
109 |
#. A vector of Statement messages representing the distinct SQL
|
|
110 |
statements which modified the state of the server. The Statement |
|
111 |
message is, itself, a generic envelope message containing a |
|
112 |
sub-message which describes the specific data modification which |
|
113 |
occurred on the server (such as, for instance, an INSERT statement. |
|
114 |
||
115 |
The Statement Message
|
|
116 |
^^^^^^^^^^^^^^^^^^^^^
|
|
117 |
||
118 |
The generic "envelope" message containing information common to each |
|
119 |
SQL statement executed against a server (such as a start and end timestamp |
|
120 |
and the type of the SQL statement) as well as a Statement subclass message |
|
121 |
describing the specific data modification event on the server. |
|
122 |
||
123 |
Each Statement message contains a type member which indicates how readers |
|
124 |
of the Statement should construct the inner Statement subclass representing |
|
125 |
a data change. |
|
126 |
||
1994.5.14
by Stewart Smith
talk about statement rollback (briefly) in replication docs |
127 |
Statements are recorded seperatley as sometimes individual statements |
128 |
have to be rolled back. |
|
129 |
||
1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
130 |
|
1938.3.1
by David Shrewsbury
Add --replicate-query option. |
131 |
.. _bulk-operations: |
132 |
||
1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
133 |
How Bulk Operations Work
|
134 |
------------------------
|
|
135 |
||
136 |
Certain operations which change large volumes of data on a server |
|
137 |
present a specific set of problems for a transaction coordinator or |
|
138 |
replication service. If all operations must complete atomically on a |
|
139 |
publishing server before replicas are delivered the complete |
|
140 |
transactional unit: |
|
141 |
||
142 |
#. The publishing server could consume a large amount of memory
|
|
143 |
building an in-memory Transaction message containing all the |
|
144 |
operations contained in the entire transaction. |
|
145 |
#. A replica, or subscribing server, is wasting time waiting on the
|
|
146 |
eventual completion (commit) of the large transaction on the |
|
147 |
publishing server. It could be applying pieces of the large |
|
148 |
transaction in the meantime... |
|
149 |
||
150 |
In order to prevent the problems inherent in (1) and (2) above, Drizzle's |
|
151 |
replication system uses a mechanism which provides bulk change |
|
152 |
operations. |
|
153 |
||
154 |
When a regular SQL statement modifies or inserts more rows than a |
|
155 |
certain threshold, Drizzle's replication services component will begin |
|
156 |
sending Transaction messages to replicas which contain a chunk |
|
157 |
(or "segment") of the data which has been changed on the publisher. |
|
158 |
||
159 |
When data is inserted, updated, or modified in the database, a |
|
160 |
header containing information about modified tables and fields is |
|
161 |
matched with one or more data segments which contain the actual |
|
162 |
values changed in the statement. |
|
163 |
||
164 |
It's easiest to understand this mechanism by following through a real-world |
|
165 |
scenario. |
|
166 |
||
167 |
Suppose the following table::
|
|
168 |
||
169 |
CREATE TABLE test.person
|
|
170 |
(
|
|
171 |
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
|
|
172 |
, first_name VARCHAR(50)
|
|
173 |
, last_name VARCHAR(50)
|
|
174 |
, is_active CHAR(1) NOT NULL DEFAULT 'Y'
|
|
175 |
);
|
|
176 |
||
177 |
Also suppose that test.t1 contains 1 million records. |
|
178 |
||
179 |
Next, suppose a client issues the SQL statement::
|
|
180 |
||
181 |
UPDATE test.person SET is_active = 'N';
|
|
182 |
||
183 |
It is clear that one million records could be updated by this statement |
|
184 |
(we say, "could be" since Drizzle does not actually update a record if |
|
185 |
the UPDATE would not change the existing record...). |
|
186 |
||
187 |
In order to prevent the publishing server from having to construct an |
|
188 |
enormous Transaction message, Drizzle's replication services component |
|
189 |
will do the following: |
|
190 |
||
191 |
#. Construct a Transaction message with a transaction context containing
|
|
192 |
information about the originating server, the transaction ID, and |
|
193 |
timestamp information. |
|
194 |
#. Construct an UpdateHeader message with information about the tables
|
|
195 |
and fields involved in the UPDATE statement. Push this UpdateHeader |
|
196 |
message onto the Transaction message's statement vector. |
|
197 |
#. Construct an UpdateData message. Set the segment_id member to 1.
|
|
198 |
Set the end_segment member to true. |
|
199 |
#. For every record updated in a storage engine, the ReplicationServices
|
|
200 |
component builds a new UpdateRecord message and appends this message |
|
201 |
to the aforementioned UpdateData message's record vector. |
|
202 |
#. After a certain threshold of records is reached, the
|
|
203 |
ReplicationServices component sets the current UpdateData message's |
|
204 |
end_segment member to false, and proceeds to send the Transaction |
|
205 |
message to replicators. |
|
206 |
#. The ReplicationServices component then constructs a new Transaction
|
|
207 |
message and constructs a transaction context with the same |
|
208 |
transaction ID and server information. |
|
209 |
#. A new UpdateData message is created. The message's segment_id is
|
|
210 |
set to N+1 and as new records are updated, new UpdateRecord messages |
|
211 |
are appended to the UpdateData message's record vector. |
|
212 |
#. While records are being updated, we repeat steps 5 through 7, with
|
|
213 |
only the final UpdateData message having its end_segment member set |
|
214 |
to true. |
|
215 |
||
216 |
Handling ROLLBACKs
|
|
217 |
------------------
|
|
218 |
||
1994.5.14
by Stewart Smith
talk about statement rollback (briefly) in replication docs |
219 |
Both transactions and individual statements may be rolled back. |
220 |
||
1861.2.2
by David Shrewsbury
Correct rollback handling section. |
221 |
When a transaction is rolled back, one of two things happen depending |
222 |
on whether the transaction is made up of either a single Transaction |
|
223 |
message, or if it is made up of multiple Transaction messages (e.g, bulk |
|
224 |
load). |
|
1861.2.1
by David Shrewsbury
Add some initial Transaction Log plugin and Replication documentation. |
225 |
|
1861.2.2
by David Shrewsbury
Correct rollback handling section. |
226 |
* For a transaction encapsulated entirely within a single Transaction
|
227 |
message, the entire message is simply discarded and not sent through |
|
228 |
the replication stream. |
|
229 |
* For a transaction which is made up of multiple messages, and at least
|
|
230 |
one message has already been sent through the replication stream, then |
|
231 |
the Transaction message will contain a Statement message with type = |
|
1938.3.1
by David Shrewsbury
Add --replicate-query option. |
232 |
ROLLBACK. |