Replication events are recorded using messages in the `Google Protocol Buffer
<http://code.google.com/p/protobuf/>`_ (GPB) format. GPB messages can contain
sub-messages. There is a single main "envelope" message, Transaction, that
is passed to plugins that subscribe to the replication stream.

**transaction_message_threshold**

Sets the size threshold, in bytes, for Transaction messages. When a Transaction
message exceeds this size, a new Transaction message with the same
transaction ID is created to continue the replication events.
See :ref:`bulk-operations` below.

**replicate_query**

Controls whether the originating SQL query is included within each
Statement message contained in the enclosing Transaction message. The
default global value is FALSE, which does not include the query in the
messages. It can also be set per session. For example:

``drizzle> set @@replicate_query = 1;``

The GPB messages are defined in .proto files in the drizzled/message
directory of the Drizzle source code. The primary definition file is
transaction.proto. Messages defined in this file are related in the
following way::

  ------------------------------------------------------------------
  | Transaction message                                            |
  |   -----------------------------------------------------------  |
  |   | TransactionContext message                              |  |
  |   -----------------------------------------------------------  |
  |   -----------------------------------------------------------  |
  |   | Statement message 1                                     |  |
  |   -----------------------------------------------------------  |
  |   -----------------------------------------------------------  |
  |   | Statement message 2                                     |  |
  |   -----------------------------------------------------------  |
  |   ...                                                          |
  |   -----------------------------------------------------------  |
  |   | Statement message N                                     |  |
  |   -----------------------------------------------------------  |
  ------------------------------------------------------------------

with each Statement message looking like so::

  ------------------------------------------------------------------
  | Statement message                                              |
  |   -----------------------------------------------------------  |
  |   | Common information                                      |  |
  |   |  - Type of Statement (INSERT, DELETE, etc)              |  |
  |   |  - Start Timestamp                                      |  |
  |   |  - End Timestamp                                        |  |
  |   |  - (OPTIONAL) Actual SQL query string                   |  |
  |   -----------------------------------------------------------  |
  |   -----------------------------------------------------------  |
  |   | Statement subclass message 1 (see below)                |  |
  |   -----------------------------------------------------------  |
  |   ...                                                          |
  |   -----------------------------------------------------------  |
  |   | Statement subclass message N (see below)                |  |
  |   -----------------------------------------------------------  |
  ------------------------------------------------------------------

The Transaction Message
^^^^^^^^^^^^^^^^^^^^^^^

The Transaction message class is the main "envelope" message. It represents
an atomic transaction that changed the state of a server.

The Transaction message contains two pieces:

#. A TransactionContext message containing information about the
   transaction as a whole, such as the ID of the executing server,
   the start and end timestamps of the transaction, and a globally
   unique identifier for the transaction.
#. A vector of Statement messages representing the distinct SQL
   statements that modified the state of the server. The Statement
   message is itself a generic envelope message containing a
   sub-message that describes the specific data modification that
   occurred on the server (for instance, an INSERT statement).
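
The nesting described above can be pictured with a small Python sketch. It is
only an illustration of the envelope structure: the class layout mirrors the
two pieces listed above, but the field names (``server_id``,
``transaction_id``, ``sub_message`` and so on) are assumptions made for this
example and are not copied from transaction.proto.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TransactionContext:
    # Information about the transaction as a whole (hypothetical field names).
    server_id: int
    transaction_id: int
    start_timestamp: int
    end_timestamp: int


@dataclass
class Statement:
    # Generic envelope for one SQL statement (hypothetical field names).
    type: str                           # e.g. "INSERT", "DELETE"
    start_timestamp: int
    end_timestamp: int
    sql: Optional[str] = None           # optional original query string
    sub_message: Optional[dict] = None  # stand-in for the Statement subclass message


@dataclass
class Transaction:
    # One context plus a vector of Statement messages.
    transaction_context: TransactionContext
    statements: List[Statement] = field(default_factory=list)


ctx = TransactionContext(server_id=1, transaction_id=42,
                         start_timestamp=1000, end_timestamp=1005)
txn = Transaction(transaction_context=ctx)
txn.statements.append(
    Statement(type="INSERT", start_timestamp=1001, end_timestamp=1002,
              sub_message={"table": "test.person"}))
```

A reader of such a message would first inspect the context, then walk the
statement vector in order.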

The Statement Message
^^^^^^^^^^^^^^^^^^^^^

The Statement message is the generic "envelope" message containing
information common to each SQL statement executed against a server (such as
a start and end timestamp and the type of the SQL statement), as well as a
Statement subclass message describing the specific data modification event
on the server.

Each Statement message contains a type member which indicates how readers
of the Statement should construct the inner Statement subclass representing

.. _bulk-operations:

How Bulk Operations Work
------------------------

Certain operations which change large volumes of data on a server
present a specific set of problems for a transaction coordinator or
replication service. If all operations must complete atomically on a
publishing server before replicas are delivered the complete

#. The publishing server could consume a large amount of memory
   building an in-memory Transaction message containing all the
   operations contained in the entire transaction.
#. A replica, or subscribing server, wastes time waiting on the
   eventual completion (commit) of the large transaction on the
   publishing server. It could be applying pieces of the large
   transaction in the meantime.

To prevent the problems inherent in (1) and (2) above, Drizzle's
replication system uses a mechanism which provides bulk change

When a regular SQL statement modifies or inserts more rows than a
certain threshold, Drizzle's replication services component will begin
sending Transaction messages to replicas which contain a chunk
(or "segment") of the data which has been changed on the publisher.

When data is inserted, updated, or deleted in the database, a
header containing information about modified tables and fields is
matched with one or more data segments which contain the actual
values changed in the statement.
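
As a rough illustration of how a header is matched with its data segments,
the following Python sketch pairs the field names carried by a header with the
values carried by each segment. The dictionary layout and the ``rows_from``
helper are hypothetical and exist only for this example; the ``segment_id`` and
``end_segment`` keys are modeled on the UpdateData members of the same name
described later in this section.

```python
# Hypothetical in-memory stand-ins for a header and its data segments.
header = {"table": "test.person", "fields": ["id", "is_active"]}
segments = [
    {"segment_id": 1, "end_segment": False, "records": [[1, "N"], [2, "N"]]},
    {"segment_id": 2, "end_segment": True, "records": [[3, "N"]]},
]


def rows_from(header, segments):
    """Pair each record's values with the field names carried by the header."""
    for segment in segments:
        for values in segment["records"]:
            yield dict(zip(header["fields"], values))


rows = list(rows_from(header, segments))
```

The header is stated once, while the values arrive in as many segments as the
statement requires.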

It's easiest to understand this mechanism by following through a real-world
example.

Suppose the following table::

  CREATE TABLE test.person
  (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
  , first_name VARCHAR(50)
  , last_name VARCHAR(50)
  , is_active CHAR(1) NOT NULL DEFAULT 'Y'
  );

Also suppose that test.person contains 1 million records.

Next, suppose a client issues the SQL statement::

  UPDATE test.person SET is_active = 'N';

It is clear that one million records could be updated by this statement
(we say "could be" because Drizzle does not actually update a record if
the UPDATE would not change the existing record).

To prevent the publishing server from having to construct an
enormous Transaction message, Drizzle's replication services component
will do the following:

#. Construct a Transaction message with a transaction context containing
   information about the originating server, the transaction ID, and
   timestamp information.
#. Construct an UpdateHeader message with information about the tables
   and fields involved in the UPDATE statement. Push this UpdateHeader
   message onto the Transaction message's statement vector.
#. Construct an UpdateData message. Set the segment_id member to 1.
   Set the end_segment member to true.
#. For every record updated in a storage engine, the ReplicationServices
   component builds a new UpdateRecord message and appends this message
   to the aforementioned UpdateData message's record vector.
#. After a certain threshold of records is reached, the
   ReplicationServices component sets the current UpdateData message's
   end_segment member to false, and proceeds to send the Transaction
   message to replicators.
#. The ReplicationServices component then constructs a new Transaction
   message and constructs a transaction context with the same
   transaction ID and server information.
#. A new UpdateData message is created. The message's segment_id is
   set to N+1, where N is the segment_id of the previous UpdateData
   message, and as new records are updated, new UpdateRecord messages
   are appended to the UpdateData message's record vector.
#. While records are being updated, we repeat steps 5 through 7, with
   only the final UpdateData message having its end_segment member set
   to true.
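
The steps above can be sketched in a few lines of Python. This is a simplified
model, not Drizzle's implementation: ``segment_update``, the dictionary layout,
and the ``max_records`` parameter are hypothetical names chosen for the
example, and the sketch follows the steps literally by attaching the header
only to the first message, which is an assumption about later segments.

```python
def segment_update(transaction_id, server_id, header, records, max_records):
    """Split updated records into Transaction-like messages of bounded size."""

    def new_transaction(segment_id):
        # Every message repeats the transaction ID and server information.
        return {
            "context": {"server_id": server_id, "transaction_id": transaction_id},
            # Assumption: the header travels only with the first message.
            "header": header if segment_id == 1 else None,
            # end_segment starts as true and is cleared if more data follows.
            "data": {"segment_id": segment_id, "end_segment": True, "records": []},
        }

    sent = []
    current = new_transaction(1)
    for record in records:
        data = current["data"]
        if len(data["records"]) >= max_records:
            # Threshold reached: mark this segment as non-final and send it.
            data["end_segment"] = False
            sent.append(current)
            current = new_transaction(data["segment_id"] + 1)
        current["data"]["records"].append(record)
    # The last message keeps end_segment set to true.
    sent.append(current)
    return sent


messages = segment_update(
    transaction_id=42,
    server_id=1,
    header={"table": "test.person", "fields": ["is_active"]},
    records=[["N"]] * 5,
    max_records=2,
)
```

With five updated records and a threshold of two, the sketch produces three
messages that share one transaction ID, and only the last one has
``end_segment`` set to true.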

When a transaction is rolled back, one of two things happens, depending
on whether the transaction is made up of a single Transaction
message or of multiple Transaction messages (e.g., bulk
operations):

* For a transaction encapsulated entirely within a single Transaction
  message, the entire message is simply discarded and not sent through
  the replication stream.
* For a transaction that is made up of multiple messages, where at least
  one message has already been sent through the replication stream,
  the Transaction message will contain a Statement message with type =