#
# Drizzle Client & Protocol Library
#
# Copyright (C) 2008 Eric Day (eday@oddments.org)
# All rights reserved.
#
# Use and distribution licensed under the BSD license.  See
# the COPYING file in this directory for full text.
#

STATUS: This is currently a proposed draft as of November 29, 2008

Drizzle Protocol
----------------

The Drizzle protocol works over TCP, UDP, and Unix Domain Sockets
(UDS, also known as IPC sockets), although there are limitations when
using UDP (this is discussed below). In the case of TCP and UDS,
a connection is made, a command is sent, and a response loop is
started. Socket communication ends when either side closes the
connection or a QUIT command is issued.

TCP and UDS communications will be full duplex. This means that as
the client is sending a command, it is possible for the server to
report an error before the sending of data completes. This allows
the server to do preliminary checks (table exists, authentication,
...) before a request is completely sent so the client may abort. This
will primarily be used for large requests (INSERTing large BLOBs).

TCP and UDS communications will also allow for pipe-lining of requests
and concurrent command execution. This means a client does not need
to wait for a command to finish before a new command is sent. It is
even possible a later command issued will complete and have a result
before an earlier command. Result packets may be interleaved so a
client issuing concurrent commands must be able to parse results
concurrently.

UDP sockets are supported to allow small, fast updates for
applications such as statistics gathering. Since UDP does not
guarantee delivery, this method should not be used for applications
that require reliable transport. When using UDP, the authentication
packet (if needed) and command packet are bundled into a single UDP
packet and sent. This limits the size of the request being made, and
the limit can differ between network hosts. The absolute limit is
65,507 bytes (the 65,535-byte maximum IPv4 packet size less 28 bytes
for the IPv4 and UDP headers), but the practical limit again depends
on the network hosts. Responses are optional when issuing UDP
commands, and this preference is specified in the handshake packet.

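The size limit above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are illustrative and not part of the protocol, and the 20-byte IPv4 header assumes no IP options.

```python
# Largest UDP payload over IPv4, following the arithmetic in the text.
IP_MAX_PACKET = 65535   # limit of the 16-bit IPv4 total-length field
IPV4_HEADER = 20        # minimum IPv4 header (assumes no IP options)
UDP_HEADER = 8          # fixed UDP header size

MAX_UDP_PAYLOAD = IP_MAX_PACKET - IPV4_HEADER - UDP_HEADER


def fits_in_udp(auth_packet: bytes, command_packet: bytes) -> bool:
    """Return True if the bundled packets fit in one IPv4 UDP datagram."""
    return len(auth_packet) + len(command_packet) <= MAX_UDP_PAYLOAD


print(MAX_UDP_PAYLOAD)  # 65507
```

A client would typically apply a much lower limit in practice, since the hosts and networks in between may not accept datagrams anywhere near this size.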
All sizes given throughout this document are in bytes. All
multi-byte binary objects, such as lengths and multi-byte bit-fields,
are packed in little-endian byte order.


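As a small illustration of the byte order rule, Python's struct module packs little-endian values when the format string starts with "<". The values used here are arbitrary examples, not protocol constants.

```python
import struct

# "<" selects little-endian with no padding; "H" is a 2-byte unsigned
# integer and "I" is a 4-byte unsigned integer.
length = struct.pack("<H", 0x1234)
bit_field = struct.pack("<I", 0x00000001)

print(length.hex())     # 3412
print(bit_field.hex())  # 01000000
```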
Packet Sequence Overview
------------------------

The sequence of packets for a simple connection and command that
responds with an OK packet:

  C: Command
  S: OK

The sequence of packets for a simple connection and query command
with results:

  C: Command
  S: OK
  S: Fields (optional, multiple packets)
  S: Rows (multiple packets)
  S: EOF

When authentication is required for a command, the server will ask
for it. For example:

  C: Command
  S: Authentication Required
  C: Authentication Credentials
  S: OK
  S: Fields
  S: Rows
  S: EOF

The server will use the most recent credential information when
processing subsequent commands.

If a client wishes to multiplex commands on a single connection,
it can do so using the command identifiers. Here is an example of
how the packets could be ordered, but this will largely depend on
the server's ability to process the commands concurrently and the
processing time for each command.

  C: Command (Command ID=1)
  C: Command (Command ID=2)
  S: OK (Command ID=2)
  S: Field (Command ID=2)
  S: OK (Command ID=1)
  S: Fields (Command ID=2)
  S: Rows (Command ID=2)
  S: EOF (Command ID=2)

As you can see, the commands may be executed with results generated
in any order, and the packets containing the results may be interleaved.


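A client that multiplexes commands has to sort incoming packets by command ID. The following sketch shows one way to do that on already-parsed packets. The (command_id, kind) tuples are a hypothetical in-memory representation, not a wire format, and the final EOF is assumed to belong to command 2.

```python
from collections import defaultdict


def demultiplex(packets):
    """Group (command_id, kind) pairs by command ID, keeping arrival order."""
    per_command = defaultdict(list)
    for command_id, kind in packets:
        per_command[command_id].append(kind)
    return dict(per_command)


# Server responses from the multiplexing example, in arrival order.
responses = [
    (2, "OK"), (2, "Field"), (1, "OK"),
    (2, "Fields"), (2, "Rows"), (2, "EOF"),
]
grouped = demultiplex(responses)
print(grouped[1])  # ['OK']
print(grouped[2])  # ['OK', 'Field', 'Fields', 'Rows', 'EOF']
```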
Length Encoding
---------------

Some lengths used within the protocol packets are length encoded. This
means the size of the length field will vary between 1 and 9 bytes,
and is determined by the value of the first byte.

  0-252 - Actual length
  253   - NULL value (only applicable in row results)
  254   - Following 8 bytes contain actual length
  255   - Depends on context, usually signifies end


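The table above could be implemented along these lines. This is a minimal sketch: the function names are illustrative, the 8-byte form is assumed to follow the document's little-endian rule, and the context-dependent value 255 is deliberately left unhandled.

```python
NULL_MARKER = 253
EIGHT_BYTE_MARKER = 254


def encode_length(n):
    """Encode a length, or None for a NULL value, as a length-encoded field."""
    if n is None:
        return bytes([NULL_MARKER])
    if n < 0 or n >= 1 << 64:
        raise ValueError("length out of range")
    if n <= 252:
        return bytes([n])
    return bytes([EIGHT_BYTE_MARKER]) + n.to_bytes(8, "little")


def decode_length(buf, offset=0):
    """Return (value, next_offset); value is None for a NULL marker."""
    first = buf[offset]
    if first <= 252:
        return first, offset + 1
    if first == NULL_MARKER:
        return None, offset + 1
    if first == EIGHT_BYTE_MARKER:
        if len(buf) < offset + 9:
            raise ValueError("truncated 8-byte length")
        return int.from_bytes(buf[offset + 1:offset + 9], "little"), offset + 9
    raise ValueError("255 depends on context and is not handled in this sketch")


print(encode_length(10).hex())              # 0a
print(decode_length(encode_length(70000)))  # (70000, 9)
```

Note that any length above 252 takes the full 9 bytes, since the table defines no 2-byte or 4-byte intermediate form.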
Packets
-------

Packets consist of two layers. The first is meant to be small,
simple, and have just enough information for fast router and proxy
processing. It consists of a fixed-size part, along with a
variable-sized client ID (explained later), a series of data chunks,
followed by a checksum at the end. The chunked transfer encoding
removes the need to pre-compute the packet data length before sending,
and supports packets of any size. It also allows for a large packet
to be aborted gracefully (without having to close the connection)
in the event of an error.

     |----------------------------- 32 Bits -----------------------------|

     |----------------|----------------|---------------------------------|
 0   |     Magic      |    Protocol    |           Command ID            |
     |----------------|----------------|---------------------------------|
 32  |      Command / Result Code      |        Client ID Length         |
     |---------------------------------|---------------------------------|
 64  |               Client ID (optional, variable length)               |
     |---------------------------------|---------------------------------|
 64+ |     Chunk Length and Value Pairs (optional, variable length)      |
     |---------------------------------|---------------------------------|
 64+ |        Chunk Length = 0         |                                 |
     |---------------------------------|---------------------------------|
 80+ |                             Checksum                              |
     |-------------------------------------------------------------------|

The first part of a packet is:

  1-byte Magic number, the value should be 0x44.

  1-byte Protocol version, currently 1.

  2-byte Command ID. This is a unique number among all other queries
         currently being executed on the connection. The client is
         responsible for choosing a unique number while generating a
         command packet, and all response packets associated with that
         command must have the same command ID. Once a command has been
         completed, the client may reuse the ID.

  2-byte Command/result code. For commands, this may be:

         1 ECHO     - The entire packet is simply echoed back to the caller.
         2 SET      - Set protocol options.
         3 QUERY    - Execute query.
         4 QUERY_RO - Same as QUERY, but hints that this is a read-only
                      query. This is only useful for routers/proxies who may want
                      to redirect the request to a read slave.

         Result codes may be:

         1 OK       - Single packet success response. No data associated
                      with the result besides parameters.
         2 ERROR    - Single packet error response.
         3 DATA     - Start of a multi-packet result set.
         4 DATA_END - Mark the end of a series of data packets. This is
                      useful so a low level router or proxy can know when a
                      response is complete without inspecting the contents of
                      the packets.

  2-byte Client ID length.
  X-byte Client ID (length is value of client ID length). The client ID is
         there for the client and routers/proxies to use. The server
         treats this as opaque data, and will only preserve it to send
         in responses. This can be used as a sharding key, to keep
         state information in a proxy, or any other use.

Next, zero or more chunks are given, terminated by a chunk length of
0. Each chunk consists of a length and then that amount of data.

  2-byte Chunk length
  X-byte Chunk (length is value of chunk length)

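The chunked encoding can be sketched as a pair of helpers. The function names and the chunk_size parameter are illustrative only. The 65,535-byte cap follows from the 2-byte chunk length field.

```python
import struct

MAX_CHUNK = 0xFFFF  # largest value a 2-byte chunk length can hold


def encode_chunks(data, chunk_size=MAX_CHUNK):
    """Split data into length-prefixed chunks and append the 0 terminator."""
    if not 1 <= chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 65535")
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out += struct.pack("<H", len(chunk)) + chunk
    out += struct.pack("<H", 0)  # terminating chunk length of 0
    return bytes(out)


def decode_chunks(buf, offset=0):
    """Reassemble chunk data; return (data, offset just past the terminator)."""
    data = bytearray()
    while True:
        (length,) = struct.unpack_from("<H", buf, offset)
        offset += 2
        if length == 0:
            return bytes(data), offset
        if offset + length > len(buf):
            raise ValueError("truncated chunk")
        data += buf[offset:offset + length]
        offset += length


encoded = encode_chunks(b"hello", chunk_size=2)
print(encoded.hex())  # 0200686502006c6c01006f0000
```

Because each chunk carries its own length, a sender can start transmitting before the total payload size is known and can stop early by writing the terminating 0.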
After the chunk length of 0 is given, a checksum value is given
that was computed for the entire packet.

  4-byte Checksum

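Putting the first layer together, a packet could be assembled as follows. This is a sketch under stated assumptions, not a definitive implementation: the draft does not say exactly which bytes the checksum covers, so CRC32 over every byte preceding the checksum field is assumed here, and build_packet is a hypothetical helper name.

```python
import struct
import zlib

MAGIC = 0x44
PROTOCOL_VERSION = 1
QUERY = 3  # command code from the list in the text


def build_packet(command_id, code, payload, client_id=b""):
    """Assemble a first-layer packet carrying the payload as a single chunk."""
    if len(payload) > 0xFFFF:
        raise ValueError("this sketch sends one chunk of at most 65535 bytes")
    # Fixed-size part: magic, version, command ID, command/result code,
    # client ID length, all little-endian and unpadded (8 bytes total).
    body = struct.pack("<BBHHH", MAGIC, PROTOCOL_VERSION,
                       command_id, code, len(client_id))
    body += client_id
    if payload:
        body += struct.pack("<H", len(payload)) + payload
    body += struct.pack("<H", 0)  # chunk length of 0 ends the chunk list
    # Assumption: CRC32 computed over everything before the checksum field.
    checksum = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", checksum)


packet = build_packet(1, QUERY, b"SELECT 1")
print(len(packet))  # 24
```

The 24 bytes break down as 8 for the fixed part, 2 plus 8 for the single chunk, 2 for the terminating chunk length, and 4 for the checksum.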
The second layer of the protocol is encapsulated inside of the
chunked encoding. This consists of zero or more packet parameters,
an end of parameter marker, followed by an optional data set that is
given until the end of a packet (or the end of all chunks).


Packet Parameters
-----------------

Packet parameter names are defined in a global namespace, although
not all parameters are relevant for all packet types. Parameters are
enumerated, and the name is specified with a 1-byte value representing
the enumerated name. Each packet parameter may have a value associated
with it, and each parameter defines the size of that value and how it
is given. The possible packet parameters are:

  0 END_OF_PARAMETERS - Marks the end of a parameter list.

  Parameters used for setting options:

  1 AUTH              - 1-byte value with authentication mechanism
                        to use. Possible values are:
                          0 - None.
                          1 - MD5 on user and password.
                          2 - 3-way handshake.
  2 CHECKSUM          - 1-byte value with preferred checksum
                        type. Possible values are:
                          0 - None.
                          1 - CRC32
  3 COMPRESSION       - 1-byte value with preferred compression
                        type. Possible values are:
                          0 - None.
                          1 - zlib.
                          2 - bzip2.
  4 FIELD_ENCODING    - 1-byte value with preferred field encoding
                        type. Possible values are:
                          0 - String.
                          1 - Native.
  5 FIELD_INFO        - 1-byte value to determine if field information
                        should be sent. Possible values are:
                          0 - None.
                          1 - Send field info.

  (6-63 Reserved for future options that can be set)

  Parameters used in responses:

  64 STATUS            - 4-byte bit field.
  65 NUM_ROWS_AFFECTED - Length-encoded count of rows affected.
  66 NUM_ROWS_SCANNED  - Length-encoded count of rows scanned.
  67 NUM_WARNINGS      - Length-encoded count of warnings encountered.
  68 INSERT_ID         - Last insert ID.
  69 ERROR_CODE        - 4-byte error code.
  70 ERROR_STRING      - Length-encoded string.
  71 SQL_STATE         - Length-encoded string.
  72 NUM_FIELDS        - 4-byte integer.
  73 FIELD_START       - No value, starts a new set of field parameters.
  74 FIELD_TYPE        - 2-byte enumerated type.
  75 FIELD_LENGTH      - Length-encoded value.
  76 FIELD_FLAGS       - 4-byte bit-field.
  77 DB_NAME           - Length-encoded string.
  78 TABLE_NAME        - Length-encoded string.
  79 ORIG_TABLE_NAME   - Length-encoded string.
  80 FIELD_NAME        - Length-encoded string.
  81 ORIG_FIELD_NAME   - Length-encoded string.
  82 DEFAULT_VALUE     - Length-encoded string.

  (83-255 Reserved for future response parameters)

"Length-encoded string" means a length-encoded value, followed by a
string of that length.


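As an illustration of the option parameters, the sketch below encodes a set of 1-byte options followed by the end marker. It assumes each value immediately follows its 1-byte parameter ID, which the draft implies but does not spell out. encode_options is a hypothetical helper name.

```python
END_OF_PARAMETERS = 0
AUTH, CHECKSUM, COMPRESSION, FIELD_ENCODING, FIELD_INFO = 1, 2, 3, 4, 5


def encode_options(options):
    """Encode {parameter ID: 1-byte value} pairs, then END_OF_PARAMETERS."""
    out = bytearray()
    for param_id, value in options.items():
        if not 1 <= param_id <= 5:
            raise ValueError("not a defined option parameter: %d" % param_id)
        # Assumption: the value directly follows the 1-byte parameter ID.
        out += bytes([param_id, value])
    out.append(END_OF_PARAMETERS)
    return bytes(out)


# Ask for CRC32 checksums, zlib compression, and string field encoding.
settings = encode_options({CHECKSUM: 1, COMPRESSION: 1, FIELD_ENCODING: 0})
print(settings.hex())  # 02010301040000
```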
Command
-------

Inside of the chunked data, command packets consist of zero or more
parameters depending on which options are being set, followed by
an end of parameter marker. All data from there until the end of the
chunks is considered the argument for the command. For a QUERY, this
will be the actual query to run.


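A QUERY payload, as it would appear inside the chunks, might be built and split like this. The helper names are hypothetical, only the 1-byte option parameters (IDs 1 to 5) are handled, and UTF-8 is an assumed text encoding that the draft does not specify.

```python
END_OF_PARAMETERS = 0
FIELD_INFO = 5


def build_query_payload(sql, field_info=True):
    """Option parameters, the end marker, then the query text."""
    params = bytes([FIELD_INFO, 1 if field_info else 0])
    # Assumption: query text is sent as UTF-8.
    return params + bytes([END_OF_PARAMETERS]) + sql.encode("utf-8")


def split_command_payload(payload):
    """Return ({option ID: value}, argument bytes) for a command payload."""
    options = {}
    i = 0
    while payload[i] != END_OF_PARAMETERS:
        param_id = payload[i]
        if not 1 <= param_id <= 5:
            raise ValueError("unsupported parameter: %d" % param_id)
        options[param_id] = payload[i + 1]
        i += 2
    # Everything after the end marker is the command argument.
    return options, payload[i + 1:]


payload = build_query_payload("SELECT 1")
print(split_command_payload(payload))  # ({5: 1}, b'SELECT 1')
```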
OK/ERROR
--------

The server responds with an OK or ERROR if no row data is given. A
list of parameters may follow, and the list is terminated with an end
of parameter marker.


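Reading the parameters of an ERROR response could look like the sketch below. It handles only the parameters whose sizes the draft states outright (STATUS, ERROR_CODE, ERROR_STRING, SQL_STATE), treats the 4-byte values as unsigned, and assumes each value directly follows its parameter ID. All function names are illustrative.

```python
import struct

END_OF_PARAMETERS = 0
STATUS, ERROR_CODE, ERROR_STRING, SQL_STATE = 64, 69, 70, 71


def _read_length(buf, offset):
    """Decode a length-encoded value; NULL and 255 are not expected here."""
    first = buf[offset]
    if first <= 252:
        return first, offset + 1
    if first == 254:
        return int.from_bytes(buf[offset + 1:offset + 9], "little"), offset + 9
    raise ValueError("unexpected length marker %d" % first)


def parse_response_parameters(buf):
    """Return ({parameter ID: value}, offset just past the end marker)."""
    params = {}
    offset = 0
    while True:
        param_id = buf[offset]
        offset += 1
        if param_id == END_OF_PARAMETERS:
            return params, offset
        if param_id in (STATUS, ERROR_CODE):
            # Assumption: 4-byte values are unsigned.
            (params[param_id],) = struct.unpack_from("<I", buf, offset)
            offset += 4
        elif param_id in (ERROR_STRING, SQL_STATE):
            length, offset = _read_length(buf, offset)
            params[param_id] = buf[offset:offset + length]
            offset += length
        else:
            raise ValueError("parameter %d not handled in this sketch" % param_id)


# A made-up error payload: code 1064 with a short message.
error_payload = (bytes([ERROR_CODE]) + struct.pack("<I", 1064)
                 + bytes([ERROR_STRING, 5]) + b"oops!"
                 + bytes([END_OF_PARAMETERS]))
print(parse_response_parameters(error_payload))
```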
DATA
----

A data packet consists of a series of parameters, followed by the end
of parameter marker, and then a series of length-encoded values holding
field values. The NUM_FIELDS parameter must be given before any values,
as it determines where each new row starts. The field values may
either be in string format or native data type, depending on the value
of FIELD_ENCODING.

There may be multiple rows inside of a single DATA result packet. In
the case of large result sets, the result should be split into multiple
DATA packets since other concurrent commands on the connection will
block if a single large packet is sent. By breaking resulting rows
into multiple DATA packets, other commands are then allowed to send
interleaved response packets.
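Row parsing for a DATA payload can be sketched as follows. To stay short, it assumes the payload carries only the NUM_FIELDS parameter before the end marker and that values use the string field encoding. parse_data_payload is a hypothetical name.

```python
import struct

END_OF_PARAMETERS = 0
NUM_FIELDS = 72


def parse_data_payload(buf):
    """Return rows as lists of bytes values, with None for NULL fields."""
    if buf[0] != NUM_FIELDS:
        raise ValueError("this sketch expects NUM_FIELDS as the only parameter")
    (num_fields,) = struct.unpack_from("<I", buf, 1)
    if num_fields == 0:
        raise ValueError("NUM_FIELDS must be at least 1")
    if buf[5] != END_OF_PARAMETERS:
        raise ValueError("expected the end of parameter marker")
    offset = 6
    values = []
    while offset < len(buf):
        first = buf[offset]
        offset += 1
        if first == 253:          # NULL value
            values.append(None)
            continue
        if first == 254:          # next 8 bytes hold the length
            length = int.from_bytes(buf[offset:offset + 8], "little")
            offset += 8
        elif first == 255:
            raise ValueError("255 is context dependent and not handled here")
        else:
            length = first
        values.append(buf[offset:offset + length])
        offset += length
    if len(values) % num_fields:
        raise ValueError("payload ends in the middle of a row")
    # A new row starts after every NUM_FIELDS values.
    return [values[i:i + num_fields] for i in range(0, len(values), num_fields)]


data_payload = (bytes([NUM_FIELDS]) + struct.pack("<I", 2)
                + bytes([END_OF_PARAMETERS])
                + bytes([1]) + b"1" + bytes([3]) + b"abc"   # row 1
                + bytes([1]) + b"2" + bytes([253]))         # row 2, NULL field
print(parse_data_payload(data_payload))  # [[b'1', b'abc'], [b'2', None]]
```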