4
Replication events are recorded using messages in the `Google Protocol Buffer
5
<http://code.google.com/p/protobuf/>`_ (GPB) format. GPB messages can contain
6
sub-messages. There is a single main "envelope" message, Transaction, that
7
is passed to plugins that subscribe to the replication stream.
12
**transaction_message_threshold**
14
Controls the size, in bytes, of the Transaction messages. When a Transaction
15
message exceeds this size, a new Transaction message with the same
16
transaction ID will be created to continue the replication events.
17
See :ref:`bulk-operations` below.
22
Controls whether the originating SQL query will be included within each
23
Statement message contained in the enclosing Transaction message. The
24
default global value is FALSE which will not include the query in the
25
messages. It can be controlled per session, as well. For example:
29
drizzle> set @@replicate_query = 1;
31
The stored query should be used as a guide only, and never executed
32
on a slave to perform replication as this will lead to incorrect results.
37
The GPB messages are defined in .proto files in the drizzled/message
38
directory of the Drizzle source code. The primary definition file is
39
transaction.proto. Messages defined in this file are related in the
43
------------------------------------------------------------------
45
| Transaction message |
47
| ----------------------------------------------------------- |
49
| | TransactionContext message | |
51
| ----------------------------------------------------------- |
52
| ----------------------------------------------------------- |
54
| | Statement message 1 | |
56
| ----------------------------------------------------------- |
57
| ----------------------------------------------------------- |
59
| | Statement message 2 | |
61
| ----------------------------------------------------------- |
63
| ----------------------------------------------------------- |
65
| | Statement message N | |
67
| ----------------------------------------------------------- |
68
------------------------------------------------------------------
70
with each Statement message looking like so::
72
------------------------------------------------------------------
76
| ----------------------------------------------------------- |
78
| | Common information | |
80
| | - Type of Statement (INSERT, DELETE, etc) | |
81
| | - Start Timestamp | |
82
| | - End Timestamp | |
83
| | - (OPTIONAL) Actual SQL query string | |
85
| ----------------------------------------------------------- |
86
| ----------------------------------------------------------- |
88
| | Statement subclass message 1 (see below) | |
90
| ----------------------------------------------------------- |
92
| ----------------------------------------------------------- |
94
| | Statement subclass message N (see below) | |
96
| ----------------------------------------------------------- |
97
------------------------------------------------------------------
99
The Transaction Message
100
^^^^^^^^^^^^^^^^^^^^^^^
102
The main "envelope" message which represents an atomic transaction
103
which changed the state of a server is the Transaction message class.
105
The Transaction message contains two pieces:
107
#. A TransactionContext message containing information about the
108
transaction as a whole, such as the ID of the executing server,
109
the start and end timestamp of the transaction, and a globally-
110
unique identifier for the transaction.
111
#. A vector of Statement messages representing the distinct SQL
112
statements which modified the state of the server. The Statement
113
message is, itself, a generic envelope message containing a
114
sub-message which describes the specific data modification which
115
occurred on the server (such as, for instance, an INSERT statement.
117
The Statement Message
118
^^^^^^^^^^^^^^^^^^^^^
120
The generic "envelope" message containing information common to each
121
SQL statement executed against a server (such as a start and end timestamp
122
and the type of the SQL statement) as well as a Statement subclass message
123
describing the specific data modification event on the server.
125
Each Statement message contains a type member which indicates how readers
126
of the Statement should construct the inner Statement subclass representing
129
Statements are recorded seperatley as sometimes individual statements
130
have to be rolled back.
135
How Bulk Operations Work
136
------------------------
138
Certain operations which change large volumes of data on a server
139
present a specific set of problems for a transaction coordinator or
140
replication service. If all operations must complete atomically on a
141
publishing server before replicas are delivered the complete
144
#. The publishing server could consume a large amount of memory
145
building an in-memory Transaction message containing all the
146
operations contained in the entire transaction.
147
#. A replica, or subscribing server, is wasting time waiting on the
148
eventual completion (commit) of the large transaction on the
149
publishing server. It could be applying pieces of the large
150
transaction in the meantime...
152
In order to prevent the problems inherent in (1) and (2) above, Drizzle's
153
replication system uses a mechanism which provides bulk change
156
When a regular SQL statement modifies or inserts more rows than a
157
certain threshold, Drizzle's replication services component will begin
158
sending Transaction messages to replicas which contain a chunk
159
(or "segment") of the data which has been changed on the publisher.
161
When data is inserted, updated, or modified in the database, a
162
header containing information about modified tables and fields is
163
matched with one or more data segments which contain the actual
164
values changed in the statement.
166
It's easiest to understand this mechanism by following through a real-world
169
Suppose the following table:
171
.. code-block:: mysql
173
CREATE TABLE test.person
175
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
176
, first_name VARCHAR(50)
177
, last_name VARCHAR(50)
178
, is_active CHAR(1) NOT NULL DEFAULT 'Y'
181
Also suppose that test.t1 contains 1 million records.
183
Next, suppose a client issues the SQL statement:
185
.. code-block:: mysql
187
UPDATE test.person SET is_active = 'N';
189
It is clear that one million records could be updated by this statement
190
(we say, "could be" since Drizzle does not actually update a record if
191
the UPDATE would not change the existing record...).
193
In order to prevent the publishing server from having to construct an
194
enormous Transaction message, Drizzle's replication services component
195
will do the following:
197
#. Construct a Transaction message with a transaction context containing
198
information about the originating server, the transaction ID, and
199
timestamp information.
200
#. Construct an UpdateHeader message with information about the tables
201
and fields involved in the UPDATE statement. Push this UpdateHeader
202
message onto the Transaction message's statement vector.
203
#. Construct an UpdateData message. Set the segment_id member to 1.
204
Set the end_segment member to true.
205
#. For every record updated in a storage engine, the ReplicationServices
206
component builds a new UpdateRecord message and appends this message
207
to the aforementioned UpdateData message's record vector.
208
#. After a certain threshold of records is reached, the
209
ReplicationServices component sets the current UpdateData message's
210
end_segment member to false, and proceeds to send the Transaction
211
message to replicators.
212
#. The ReplicationServices component then constructs a new Transaction
213
message and constructs a transaction context with the same
214
transaction ID and server information.
215
#. A new UpdateData message is created. The message's segment_id is
216
set to N+1 and as new records are updated, new UpdateRecord messages
217
are appended to the UpdateData message's record vector.
218
#. While records are being updated, we repeat steps 5 through 7, with
219
only the final UpdateData message having its end_segment member set
225
Both transactions and individual statements may be rolled back.
227
When a transaction is rolled back, one of two things happen depending
228
on whether the transaction is made up of either a single Transaction
229
message, or if it is made up of multiple Transaction messages (e.g, bulk
232
* For a transaction encapsulated entirely within a single Transaction
233
message, the entire message is simply discarded and not sent through
234
the replication stream.
235
* For a transaction which is made up of multiple messages, and at least
236
one message has already been sent through the replication stream, then
237
the Transaction message will contain a Statement message with type =