package drizzled.message;
option optimize_for = SPEED;

option java_package = "org.drizzle.messages";
option java_outer_classname = "TransactionMessage";

/*
 * This file contains definitions for the protobuf messages that
 * are involved in Drizzle's replication system.
 *
 * This documentation does not discuss how messages are transmitted
 * across the wire, because those implementation details are immaterial
 * to the definition of the messages used in replication.
 *
 * For a discussion of how messages are transmitted to replicas, see the
 * plugins which implement the replication system.
 */
/*
 * Messages defined in this file are related in the following ways:
 *
 * ------------------------------------------------------------------
 * | Transaction message                                            |
 * |   -----------------------------------------------------------  |
 * |   | TransactionContext message                              |  |
 * |   -----------------------------------------------------------  |
 * |   -----------------------------------------------------------  |
 * |   | Statement message 1                                     |  |
 * |   -----------------------------------------------------------  |
 * |   -----------------------------------------------------------  |
 * |   | Statement message 2                                     |  |
 * |   -----------------------------------------------------------  |
 * |   ...                                                          |
 * |   -----------------------------------------------------------  |
 * |   | Statement message N                                     |  |
 * |   -----------------------------------------------------------  |
 * ------------------------------------------------------------------
 *
 * with each Statement message looking like so:
 *
 * ------------------------------------------------------------------
 * | Statement message                                              |
 * |   -----------------------------------------------------------  |
 * |   | Common information                                      |  |
 * |   |   - Type of Statement (INSERT, DELETE, etc)             |  |
 * |   |   - Start Timestamp                                     |  |
 * |   |   - End Timestamp                                       |  |
 * |   |   - (OPTIONAL) Actual SQL query string                  |  |
 * |   -----------------------------------------------------------  |
 * |   -----------------------------------------------------------  |
 * |   | Statement subclass message 1 (see below)                |  |
 * |   -----------------------------------------------------------  |
 * |   ...                                                          |
 * |   -----------------------------------------------------------  |
 * |   | Statement subclass message N (see below)                |  |
 * |   -----------------------------------------------------------  |
 * ------------------------------------------------------------------
 */
/*
 * The Transaction Message
 * =======================
 *
 * The main "envelope" message which represents an atomic transaction
 * which changed the state of a server is the Transaction message class.
 *
 * The Transaction message contains two pieces:
 *
 * 1) A TransactionContext message containing information about the
 *    transaction as a whole, such as the ID of the executing server,
 *    the start and end timestamp of the transaction, and a globally
 *    unique identifier for the transaction.
 *
 * 2) A vector of Statement messages representing the distinct SQL
 *    statements which modified the state of the server. The Statement
 *    message is, itself, a generic envelope message containing a
 *    sub-message which describes the specific data modification which
 *    occurred on the server (for instance, an INSERT statement).
 */
/*
 * The Statement Message
 * =====================
 *
 * The generic "envelope" message containing information common to each
 * SQL statement executed against a server (such as a start and end timestamp
 * and the type of the SQL statement) as well as a Statement subclass message
 * describing the specific data modification event on the server.
 *
 * Each Statement message contains a type member which tells readers of the
 * Statement which inner Statement subclass message to read.
 *
 * For example, suppose we have read a Transaction message and have executed
 * the following code:
 *
 *   // Generated protobuf accessors return const references.
 *   const message::Statement &statement= transaction.statement(0); // Get first statement
 *
 *   switch (statement.type())
 *   {
 *     case Statement::INSERT:
 *     {
 *       // Assume this is where we land, so we know that the Statement
 *       // message represents an INSERT SQL statement. We must now get the
 *       // specific information describing the INSERT statement.
 *       //
 *       // We assume here a simple INSERT statement that does not represent
 *       // a bulk operation.
 *       const message::InsertHeader &insert_header= statement.insert_header();
 *       const message::TableMetadata &table_metadata= insert_header.table_metadata();
 *
 *       // Grab table name and echo out as start of INSERT SQL string...
 *       cout << "INSERT INTO `" << table_metadata.schema_name();
 *       cout << "`.`" << table_metadata.table_name() << "` (";
 *
 *       // Add field list to SQL string, separating field names with commas
 *       uint32_t num_fields= insert_header.field_metadata_size();
 *       uint32_t x, y;
 *       for (x= 0; x < num_fields; ++x)
 *       {
 *         const message::FieldMetadata &field_metadata= insert_header.field_metadata(x);
 *         if (x != 0)
 *           cout << ", ";
 *         cout << "`" << field_metadata.name() << "`";
 *       }
 *       cout << ") VALUES ";
 *
 *       // Add insert values, one parenthesized tuple per record
 *       const message::InsertData &insert_data= statement.insert_data();
 *       uint32_t num_records= insert_data.record_size();
 *       for (x= 0; x < num_records; ++x)
 *       {
 *         cout << (x != 0 ? ", (" : "(");
 *         for (y= 0; y < num_fields; ++y)
 *         {
 *           if (y != 0)
 *             cout << ", ";
 *           cout << "'" << insert_data.record(x).insert_value(y) << "'";
 *         }
 *         cout << ")";
 *       }
 *       break;
 *     }
 *     default:
 *       break;
 *   }
 */
/*
 * How Bulk Operations Work
 * ========================
 *
 * Certain operations which change large volumes of data on a server
 * present a specific set of problems for a transaction coordinator or
 * replication service. If all operations must complete atomically on a
 * publishing server before replicas are delivered the complete
 * transactional unit:
 *
 * 1) The publishing server could consume a large amount of memory
 *    building an in-memory Transaction message containing all the
 *    operations in the entire transaction.
 *
 * 2) A replica, or subscribing server, wastes time waiting on the
 *    eventual completion (commit) of the large transaction on the
 *    publishing server. It could be applying pieces of the large
 *    transaction in the meantime.
 *
 * To avoid the problems described in 1) and 2) above, Drizzle's
 * replication system uses a mechanism designed for bulk change
 * operations.
 *
 * When a regular SQL statement modifies or inserts more rows than a
 * certain threshold, Drizzle's replication services component will begin
 * sending Transaction messages to replicas which contain a chunk
 * (or "segment") of the data which has been changed on the publisher.
 *
 * When data is inserted, updated, or deleted in the database, a
 * header containing information about modified tables and fields is
 * matched with one or more data segments which contain the actual
 * values changed in the statement.
 */
/*
 * It's easiest to understand this mechanism by walking through an
 * example.
 *
 * Suppose the following table:
 *
 *   CREATE TABLE test.person
 *   (
 *     id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
 *   , first_name VARCHAR(50)
 *   , last_name VARCHAR(50)
 *   , is_active CHAR(1) NOT NULL DEFAULT 'Y'
 *   );
 *
 * Also suppose that test.person contains 1 million records.
 *
 * Next, suppose a client issues the SQL statement:
 *
 *   UPDATE test.person SET is_active = 'N';
 *
 * It is clear that one million records could be updated by this statement
 * (we say "could be" since Drizzle does not actually update a record if
 * the UPDATE would not change the existing record).
 */
/*
 * In order to prevent the publishing server from having to construct an
 * enormous Transaction message, Drizzle's replication services component
 * will do the following:
 *
 * 1) Construct a Transaction message with a transaction context containing
 *    information about the originating server, the transaction ID, and
 *    timestamp information.
 *
 * 2) Construct an UpdateHeader message with information about the tables
 *    and fields involved in the UPDATE statement. Push this UpdateHeader
 *    message onto the Transaction message's statement vector.
 *
 * 3) Construct an UpdateData message. Set the segment_id member to 1.
 *    Set the end_segment member to true.
 *
 * 4) For every record updated in a storage engine, the ReplicationServices
 *    component builds a new UpdateRecord message and appends this message
 *    to the aforementioned UpdateData message's record vector.
 *
 * 5) After a certain threshold of records is reached, the
 *    ReplicationServices component sets the current UpdateData message's
 *    end_segment member to false, and proceeds to send the Transaction
 *    message to replicators.
 *
 * 6) The ReplicationServices component then constructs a new Transaction
 *    message and constructs a transaction context with the same
 *    transaction ID and server information.
 *
 * 7) A new UpdateData message is created. The message's segment_id is
 *    set to N+1, where N is the segment_id of the previous UpdateData
 *    message, and as new records are updated, new UpdateRecord messages
 *    are appended to the UpdateData message's record vector.
 *
 * 8) While records are being updated, we repeat steps 5 through 7, with
 *    only the final UpdateData message having its end_segment member set
 *    to true.
 */
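
/*
 * The segmenting steps above can be sketched in C++ roughly as follows.
 * This is a minimal illustration, not Drizzle's ReplicationServices code:
 * the UpdateRecord and UpdateData structs are hypothetical stand-ins for
 * the generated protobuf classes, segment_updates() is an invented helper,
 * and the threshold is assumed to be a plain record count.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the generated UpdateRecord/UpdateData classes.
struct UpdateRecord
{
  std::string key_value;
  std::string after_value;
};

struct UpdateData
{
  uint32_t segment_id= 1;   // step 3: the first segment is numbered 1
  bool end_segment= true;   // step 3: assume it is the last one until proven otherwise
  std::vector<UpdateRecord> record;
};

// Split the updated records into segments of at most `threshold` records.
std::vector<UpdateData> segment_updates(const std::vector<UpdateRecord> &records,
                                        std::size_t threshold)
{
  assert(threshold > 0);
  std::vector<UpdateData> segments;
  UpdateData current;

  for (const UpdateRecord &updated : records)
  {
    if (current.record.size() == threshold)
    {
      // Step 5: threshold reached and more records follow, so this
      // segment is not the last one. "Send" it by storing it.
      current.end_segment= false;
      uint32_t next_id= current.segment_id + 1;
      segments.push_back(current);

      // Step 7: start a new segment numbered N+1.
      current= UpdateData();
      current.segment_id= next_id;
    }
    current.record.push_back(updated); // step 4
  }

  // Step 8: only the final segment keeps end_segment == true.
  segments.push_back(current);
  return segments;
}
```

/*
 * With five records and a threshold of two, this sketch yields three
 * segments numbered 1, 2 and 3, and only segment 3 has end_segment set
 * to true.
 */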
/*
 * Any time a transaction is rolled back, a single Transaction message
 * containing a Statement with type = ROLLBACK is sent to replicators.
 *
 * What the replicator does with this information depends on the
 * replicator. For most, only rollbacks of bulk operations will actually
 * trigger any real action on a replica, since non-bulk operations won't
 * have changed any data on a replica and the ROLLBACK is only sent to a
 * replica to notify it of a transaction being rolled back (so that the
 * replica can understand transaction ID sequence gaps).
 */
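
/*
 * One way a replica might decide whether a ROLLBACK needs real work is
 * sketched below. This is an assumption-laden illustration only: the
 * RollbackTracker class and its method names are hypothetical, and it
 * assumes the replica applies bulk segments as they arrive instead of
 * buffering them until the final segment.
 */

```cpp
#include <cstdint>
#include <set>

// Hypothetical helper: remembers which transactions have had some, but not
// all, of their bulk segments applied on this replica.
class RollbackTracker
{
public:
  // Call after applying a data segment of the given transaction.
  void on_segment(uint64_t transaction_id, bool end_segment)
  {
    if (end_segment)
      partially_applied_.erase(transaction_id);  // transaction is complete
    else
      partially_applied_.insert(transaction_id); // more segments expected
  }

  // Call when a Statement of type ROLLBACK arrives. Returns true when
  // already-applied segments must be undone, and false when the message
  // only explains a gap in the transaction ID sequence.
  bool on_rollback(uint64_t transaction_id)
  {
    return partially_applied_.erase(transaction_id) > 0;
  }

private:
  std::set<uint64_t> partially_applied_;
};
```

/*
 * In this sketch a rollback of a non-bulk transaction never reaches
 * on_segment(), so on_rollback() returns false for it.
 */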

/*
 * Minimal information transferred in the header of Statement
 * submessage classes which identifies metadata about a specific
 * field involved in a Statement.
 */
message FieldMetadata
{
  required Table.Field.FieldType type = 1; /* Type of the field */
  required string name = 2; /* Name of the field */
}

/*
 * Minimal information transferred in the header of Statement submessage
 * classes which identifies metadata about the schema objects being
 * modified in a Statement.
 */
message TableMetadata
{
  required string schema_name = 1; /* Name of the containing schema */
  required string table_name = 2; /* Name of the table */
  optional string catalog_name = 3; /* Name of the catalog */
}

/*
 * Context for a transaction.
 */
message TransactionContext
{
  required uint32 server_id = 1; /* Unique identifier of a server */
  required uint64 transaction_id = 2; /* Globally-unique transaction ID */
  required uint64 start_timestamp = 3; /* Timestamp of when the transaction started */
  required uint64 end_timestamp = 4; /* Timestamp of when the transaction ended */
}

/*
 * Represents a single record being inserted into a single table.
 *
 * An INSERT Statement contains one or more InsertRecord submessages, each
 * of which represents a single record being inserted into a table.
 */
message InsertRecord
{
  repeated bytes insert_value = 1;
  repeated bool is_null = 2;
}
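
/*
 * A reader has to combine insert_value and is_null to recover the values
 * of a record. The sketch below assumes, without confirmation from this
 * file, that the two repeated fields are parallel (one is_null flag per
 * insert_value entry). InsertRecord here is a hypothetical plain struct
 * standing in for the generated class, render_values() is an invented
 * helper, and no SQL escaping is attempted.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated InsertRecord class.
struct InsertRecord
{
  std::vector<std::string> insert_value;
  std::vector<bool> is_null;
};

// Render one record as a parenthesized SQL value tuple, printing NULL
// wherever the matching is_null flag is set.
std::string render_values(const InsertRecord &record)
{
  // Assumption: one is_null flag per value.
  assert(record.insert_value.size() == record.is_null.size());

  std::string out= "(";
  for (std::size_t i= 0; i < record.insert_value.size(); ++i)
  {
    if (i != 0)
      out+= ", ";
    if (record.is_null[i])
      out+= "NULL";
    else
      out+= "'" + record.insert_value[i] + "'";
  }
  out+= ")";
  return out;
}
```

/*
 * Without the is_null flags, an empty string and a NULL would be
 * indistinguishable in insert_value.
 */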

/*
 * Represents statements which insert data into the database, such as
 * INSERT and REPLACE (which is a delete and an insert).
 *
 * The statement is composed of a header (InsertHeader) containing
 * metadata about affected tables and fields, as well as one or more data
 * segments (InsertData) containing the actual records being inserted.
 *
 * Bulk insert operations will have more than one data segment, with only
 * the last data segment having its end_segment member set to true.
 */
message InsertHeader
{
  required TableMetadata table_metadata = 1; /* Minimal metadata about the table affected */
  repeated FieldMetadata field_metadata = 2; /* Collection of metadata about fields affected */
}

message InsertData
{
  required uint32 segment_id = 1; /* The segment number */
  required bool end_segment = 2; /* Is this the final segment? */
  repeated InsertRecord record = 3; /* The records inserted */
}

/*
 * Represents a single record being updated in a single table.
 *
 * An UPDATE Statement contains one or more UpdateRecord submessages, each
 * of which represents a single record being updated in a table.
 */
message UpdateRecord
{
  repeated bytes key_value = 1; /* The value of keys of updated records (unique or primary key) */
  repeated bytes after_value = 2; /* The value of the field after the update */
  repeated bytes before_value = 3; /* The value of the field before the update (optional) */
  repeated bool is_null = 4;
}

/*
 * Represents statements which update data in the database:
 *
 *   UPDATE
 *   INSERT ... ON DUPLICATE KEY UPDATE
 *   REPLACE INTO when the UPDATE optimization occurs.
 *
 * The statement is composed of a header (UpdateHeader) containing
 * metadata about affected tables and fields, as well as one or more data
 * segments (UpdateData) containing the actual records being updated.
 *
 * Bulk update operations will have more than one data segment, with only
 * the last data segment having its end_segment member set to true.
 */
message UpdateHeader
{
  required TableMetadata table_metadata = 1; /* Minimal metadata about the table affected */
  repeated FieldMetadata key_field_metadata = 2; /* Collection of metadata about key fields */
  repeated FieldMetadata set_field_metadata = 3; /* Collection of metadata about fields affected */
}

message UpdateData
{
  required uint32 segment_id = 1; /* The segment number */
  required bool end_segment = 2; /* Is this the final segment? */
  repeated UpdateRecord record = 3; /* Collection of same size as above metadata containing the
                                       values of all records of all the fields being updated. */
}
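
/*
 * To show how an UpdateHeader and an UpdateRecord fit together, the sketch
 * below rebuilds an UPDATE string from them. It assumes that key_value[i]
 * belongs to key_field_metadata[i] and after_value[i] to
 * set_field_metadata[i]. The structs are hypothetical plain stand-ins for
 * the generated classes (field metadata is reduced to field names), the
 * helper names are invented, and neither is_null nor SQL escaping is
 * handled.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the generated UpdateHeader/UpdateRecord classes.
struct UpdateHeader
{
  std::string schema_name;
  std::string table_name;
  std::vector<std::string> key_field_names;
  std::vector<std::string> set_field_names;
};

struct UpdateRecord
{
  std::vector<std::string> key_value;
  std::vector<std::string> after_value;
};

// Append "`name` = 'value'" pairs to out, joined by the given separator.
static void append_pairs(std::string &out,
                         const std::vector<std::string> &names,
                         const std::vector<std::string> &values,
                         const std::string &separator)
{
  assert(names.size() == values.size());
  for (std::size_t i= 0; i < names.size(); ++i)
  {
    if (i != 0)
      out+= separator;
    out+= "`" + names[i] + "` = '" + values[i] + "'";
  }
}

// Build one UPDATE statement for a single updated record.
std::string render_update(const UpdateHeader &header, const UpdateRecord &record)
{
  std::string sql= "UPDATE `" + header.schema_name + "`.`" + header.table_name + "` SET ";
  append_pairs(sql, header.set_field_names, record.after_value, ", ");
  sql+= " WHERE ";
  append_pairs(sql, header.key_field_names, record.key_value, " AND ");
  return sql;
}
```

/*
 * The header is sent once per statement, so a reader would keep it and
 * call a function like render_update() for every record in every
 * UpdateData segment that follows.
 */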

/*
 * Represents a single record being deleted in a single table.
 *
 * A DELETE Statement contains one or more DeleteRecord submessages, each
 * of which represents a single record being deleted from a table.
 */
message DeleteRecord
{
  repeated bytes key_value = 1;
}

/*
 * Represents statements which delete data from the database, such as
 * DELETE and REPLACE (which is a delete and an insert).
 *
 * The statement is composed of a header (DeleteHeader) containing
 * metadata about affected tables and fields, as well as one or more data
 * segments (DeleteData) containing the actual records being deleted.
 *
 * Bulk delete operations will have more than one data segment, with only
 * the last data segment having its end_segment member set to true.
 */
message DeleteHeader
{
  required TableMetadata table_metadata = 1; /* Minimal metadata about the table affected */
  repeated FieldMetadata key_field_metadata = 2; /* Collection of metadata about key fields */
}

message DeleteData
{
  required uint32 segment_id = 1; /* The segment number */
  required bool end_segment = 2; /* Is this the final segment? */
  repeated DeleteRecord record = 3; /* Collection of same size as above metadata containing the
                                       values of all records of all the fields being deleted. */
}

/*
 * Represents a TRUNCATE TABLE statement
 */
message TruncateTableStatement
{
  required TableMetadata table_metadata = 1; /* Metadata about table to truncate */
}

/*
 * Represents a CREATE SCHEMA statement
 */
message CreateSchemaStatement
{
  required Schema schema = 1; /* Definition of new schema */
}

/*
 * Represents an ALTER SCHEMA statement
 */
message AlterSchemaStatement
{
  required Schema before = 1; /* Definition of old schema */
  required Schema after = 2; /* Definition of new schema */
}

/*
 * Represents a DROP SCHEMA statement
 */
message DropSchemaStatement
{
  required string schema_name = 1; /* Name of the schema to drop */
  optional string catalog_name = 2; /* Name of the catalog containing the schema */
}

/*
 * Represents a CREATE TABLE statement.
 */
message CreateTableStatement
{
  required Table table = 1; /* The full table definition for the new table */
}

/*
 * Represents an ALTER TABLE statement.
 */
message AlterTableStatement
{
  required Table before = 1; /* The prior full table definition */
  required Table after = 2; /* The new full table definition */
}

/*
 * Represents a DROP TABLE statement
 */
message DropTableStatement
{
  required TableMetadata table_metadata = 1; /* Minimal metadata about the table to be dropped */
  optional bool if_exists_clause = 2; /* Did the user specify an IF EXISTS clause? */
}

/*
 * Represents a SET statement
 *
 * This is constructed only for changes in a global variable. For changes
 * to a session-local variable, the variable's contents are not transmitted
 * as Statement messages.
 */
message SetVariableStatement
{
  required FieldMetadata variable_metadata = 1; /* Metadata about the variable to set */
  required bytes variable_value = 2; /* Value to set variable */
}

/*
 * Base message class for a Statement. A Transaction is composed of
 * one or more Statement messages. The Transaction message represents
 * a single atomic unit of change in the server's state. Depending on
 * whether the server/connection is in AUTOCOMMIT mode or not, a
 * Transaction message may contain one or more than one Statement message.
 */
message Statement
{
  enum Type
  {
    ROLLBACK = 0; /* A ROLLBACK indicator */
    INSERT = 1; /* An INSERT statement */
    DELETE = 2; /* A DELETE statement */
    UPDATE = 3; /* An UPDATE statement */
    TRUNCATE_TABLE = 4; /* A TRUNCATE TABLE statement */
    CREATE_SCHEMA = 5; /* A CREATE SCHEMA statement */
    ALTER_SCHEMA = 6; /* An ALTER SCHEMA statement */
    DROP_SCHEMA = 7; /* A DROP SCHEMA statement */
    CREATE_TABLE = 8; /* A CREATE TABLE statement */
    ALTER_TABLE = 9; /* An ALTER TABLE statement */
    DROP_TABLE = 10; /* A DROP TABLE statement */
    ROLLBACK_STATEMENT = 11; /* ROLLBACK current statement */
    SET_VARIABLE = 98; /* A SET statement */
    RAW_SQL = 99; /* A raw SQL statement */
  }
  required Type type = 1; /* The type of the Statement */
  required uint64 start_timestamp = 2; /* Nanosecond precision timestamp of when the
                                          Statement was started on the server */
  required uint64 end_timestamp = 3; /* Nanosecond precision timestamp of when the
                                        Statement finished executing on the server */
  optional string sql = 4; /* May contain the original SQL string */

  /*
   * Each Statement message may contain one or more of
   * the below sub-messages, depending on the Statement's type.
   */
  optional InsertHeader insert_header = 5;
  optional InsertData insert_data = 6;
  optional UpdateHeader update_header = 7;
  optional UpdateData update_data = 8;
  optional DeleteHeader delete_header = 9;
  optional DeleteData delete_data = 10;
  optional TruncateTableStatement truncate_table_statement = 11;
  optional CreateSchemaStatement create_schema_statement = 12;
  optional DropSchemaStatement drop_schema_statement = 13;
  optional AlterSchemaStatement alter_schema_statement = 14;
  optional CreateTableStatement create_table_statement = 15;
  optional AlterTableStatement alter_table_statement = 16;
  optional DropTableStatement drop_table_statement = 17;
  optional SetVariableStatement set_variable_statement = 18;
}

/*
 * Represents a collection of Statement messages that
 * has been executed atomically on a server.
 *
 * For bulk operations, the transaction may contain only
 * a data segment, and the atomic guarantee can only be enforced by XA.
 */
message Transaction
{
  required TransactionContext transaction_context = 1;
  repeated Statement statement = 2;
  optional Event event = 3;

  /*
   * A single transaction in the database may be represented by
   * multiple protobuf Transaction messages if the message grows too large.
   * This can happen with a bulk transaction, a single statement
   * affecting a very large number of rows, or just a large transaction with
   * many statements/changes.
   *
   * In the first two cases, it is likely that the Statement sub-message
   * itself will be segmented, causing another Transaction message to be
   * created to hold the rest of the Statement's row changes. In these cases,
   * it is enough to look at the segment information stored in the Statement
   * sub-message.
   *
   * In the last case, the Statement sub-messages may or may not be
   * segmented, but the Statements may still need to be split across
   * multiple Transaction messages to keep the Transaction message size from
   * growing too large. Here the segment information in the Statement
   * sub-messages is not helpful if the Statement isn't segmented, so the
   * Transaction message itself must carry this information.
   *
   * These values should be set appropriately whether or not the Statement
   * sub-messages are segmented.
   */
  optional uint32 segment_id = 4; /* Segment number of the Transaction msg */
  optional bool end_segment = 5; /* FALSE if more Transaction msgs follow for this transaction */
}
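
/*
 * A reader can use the Transaction-level segment_id and end_segment
 * fields to tell when all pieces of a split transaction have arrived.
 * The sketch below is illustrative only: SegmentInfo and
 * TransactionAssembler are hypothetical names, and it assumes that
 * segment numbers start at 1, increase by one, and arrive in order for
 * a given transaction ID.
 */

```cpp
#include <cstdint>
#include <map>

// Hypothetical summary of the fields a reader needs from one Transaction
// message: its transaction ID plus the Transaction-level segment fields.
struct SegmentInfo
{
  uint64_t transaction_id;
  uint32_t segment_id;
  bool end_segment;
};

class TransactionAssembler
{
public:
  enum Result { INCOMPLETE, COMPLETE, OUT_OF_ORDER };

  // Record one Transaction message and report whether the transaction it
  // belongs to is now complete.
  Result accept(const SegmentInfo &segment)
  {
    // The first segment of a transaction is expected to be number 1.
    uint32_t expected= 1;
    std::map<uint64_t, uint32_t>::iterator it= last_segment_.find(segment.transaction_id);
    if (it != last_segment_.end())
      expected= it->second + 1;

    if (segment.segment_id != expected)
      return OUT_OF_ORDER;

    if (segment.end_segment)
    {
      last_segment_.erase(segment.transaction_id); // nothing more to wait for
      return COMPLETE;
    }

    last_segment_[segment.transaction_id]= segment.segment_id;
    return INCOMPLETE;
  }

private:
  std::map<uint64_t, uint32_t> last_segment_; // transaction ID -> last segment seen
};
```

/*
 * A transaction that fits in one message arrives as segment 1 with
 * end_segment set to true, so accept() reports COMPLETE immediately.
 */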