~drizzle-trunk/drizzle/development

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
=========================
Notable MySQL Differences
=========================

Drizzle was forked from the (now defunct) MySQL 6.0 tree. Since then there
have been a lot of changes. Some areas are similar, others unrecognisable.

This section aims to explore some of the notable differences between MySQL
and Drizzle.

This section was originally adapted from the Drizzle Wiki.

Usage
-----
 * There is no embedded server. The Drizzle Server is not loadable as a shared
   library.
 * Drizzle is optimized for massively concurrent environments. If we have the
   choice of improving performance for 1024 simultaneous connections to the
   detriment of performance with only 64 connections, we will take that choice.
 * Designed for modern POSIX systems
 * Microsoft Windows is not a supported platform (neither is HP-UX or IRIX).
 * No timezones. Everything is UTC.

Installation
------------

 * No scripts/mysql_install_db or similar. A "just works" installation
   mentality without administrative overhead.
 * No system database that needs upgrading between versions.
 * Drizzle can listen on the Drizzle port (4427) and/or MySQL port (3306)
   and speak the respective protocols.

Architecture
------------

Drizzle is designed around the concept of being a micro-kernel. There should
be a small core of the server with most functionality being provided through
small, efficient and hard to misuse plugin interfaces. The goal is a small,
light weight kernel which is easy to maintain, understand and extend.

Drizzle is written in C++ making use of the Standard Template Library (STL)
and Boost. Only where performance or correctness proves to be inadequate will
we consider rolling our own, and our preference is to fix the upstream library
instead.

Network protocol
----------------

Pluggable network protocols allow Drizzle to speak one (or more) of several
protocols. Currently we support the MySQL protocol (compatible with existing
MySQL client libraries) and the Drizzle protocol which is still under
development but aims for several important differences:

 * Client sends first packet instead of server
 * built in sharding
 * multi statement support (without using a semicolon to separate them)
 * room for expansion to include NoSQL type commands inline with SQL.

There is also a console plugin that instead of providing access over a network
socket, allows access from the current tty.

Plugin API
----------

Existing plugin APIs inherited from MySQL have been reworked.

 * User Defined Functions (UDFs) now follow the same API as within the
   server instead of a different C API. This means that UDFs are on the
   exact same level as builtin functions.
 * Storage Engine API has had some parts extensively reworked, especially
   around transactions and DDL.
 * Logging is now pluggable
 * Authentication is pluggable
 * Replication is pluggable
 * INFORMATION_SCHEMA plugins have been replaced by the function_engine, which
   is a lot more space and time efficient.
 * Network protocols are pluggable
 * Scheduler is pluggable (multi_thread, pool_of_threads etc)
 * Plugin points for manipulating rows before/after operations: used for
   replication and the PBMS Blob Streaming plugin.

Stored Procedures
-----------------

Drizzle does not currently have any plugins implement stored procedures. We
viewed the implementation in MySQL to be non-optimal, bloating the parser
and only supporting one language (SQL2003 stored procedures) which was not
well known.

Fundamentally, stored procedures usually are not the correct architectural
decision for applications that need to scale. Pushing more computation down
into the database (which is the trickiest layer to scale) isn't a good idea.

We do recognise that the ability to reduce the time row locks are held
by using stored procedures is valuable, but think the same advantage can
be gotten with improved batching of commands over the wire instead of adding
administering stored procedures to the list of things that can go wrong in
admisistering the database.

Triggers
--------

Drizzle does not currently have any plugin that provides SQL triggers. We
have some hooks for callbacks inside the server so that plugins can hook
into points that triggers could.

Views
-----

SQL Views are not currently supported in Drizzle. We believe they should be
implemented via a query rewrite plugin. See the `Query Rewrite Blueprint <https://blueprints.launchpad.net/drizzle/+spec/query-rewrite>`_ on launchpad.

Partitioning
------------

INFORMATION_SCHEMA
------------------

The INFORMATION_SCHEMA in Drizzle is strictly ANSI compliant. If you write
a query to any of the tables in the INFORMATION_SCHEMA in Drizzle, you can
directly run these on any other ANSI compliant system.

For information that does not fit into the standard, there is also the
DATA_DICTIONARY schema. Use of tables in DATA_DICTIONARY is non-portable.

This allows developers to easily know if the query is portable or not.

Authentication, Authorization and Access
----------------------------------------

Plugins. Currently there are PAM and HTTP AUTH plugins for authentication.
Through the PAM plugin, you can use any PAM module (such as LDAP).

Command line clients
--------------------

We've stopped the confusion: -p means port and -P means password.

No gotcha of using the unix socket when localhost is specified and then
connecting you to the wrong database server.

There is no drizzleadmin command.

Storage Engines
---------------

 * MERGE storage engine has been removed
 * FEDERATED storage engine has been removed (all current development is
   focused on FederatedX, so having FEDERATED made no sense).
 * CSV engine is now for temporary tables only. See the filesystem_engine for
   the future of reading files as database tables.
 * MyISAM is for temporary tables only.
 * ARCHIVE is fully supported
 * PBXT is merged

FRM Files
---------

There are no FRM files in Drizzle. Engines now own their own metadata.
Some choose to still store these in files on disk. These are now in a
documented file format (using the google protobuf library).

SHOW commands
-------------

Several SHOW commands have been removed, replaced with INFORMATION_SCHEMA
or DATA_DICTIONARY views. All SHOW commands are aliases to INFORMATION_SCHEMA
queries. Our INFORMATION_SCHEMA implementation does not have the drawbacks
of the MySQL implementation.

 * SHOW ENGINES: use DATA_DICTIONARY

Removed commands
----------------

 * ALTER TABLE UPGRADE
 * REPAIR TABLE
 * CREATE FUNCTION
 * CONVERT
 * SET NAMES

Operators Removed
-----------------

Bit operators: &&, >>, <<, ~, ^, |, &

Removed functions
-----------------

 * crypt()
 * bit_length()
 * bit_count()

Keywords removed
----------------
 * BIT_AND
 * BIT_OR
 * BIT_XOR
 * CIPHER
 * CLIENT
 * CODE
 * CONTRIBUTORS
 * CPU
 * DEFINER
 * DES_KEY_FILE
 * ENGINES
 * EVERY
 * IO
 * IPC
 * ISSUSER

Objects Removed
---------------

 * There is no requirement for a 'mysql' schema.
 * There is no SET datatype, use ENUM.
 * There is no SET NAMES command, UTF-8 by default
 * There is no CHARSET or CHARACTER SET commands, everything defaults to UTF8
 * There is no TIME type, use DATETIME or INT.
 * There is no TINYINT, SMALLINT or MEDIUMINT. Integer operations have been optimized around 32 and 64 bit integers.
 * There are no TINYBLOB, MEDIUMBLOB and LONGBLOB datatypes. We have optimized a single BLOB container.
 * There are no TINYTEXT, MEDIUMTEXT and LONGTEXT datatypes. Use TEXT or BLOB.
 * There is no UNSIGNED (as per the standard).
 * There are no spatial data types GEOMETRY, POINT, LINESTRING & POLYGON (go use `Postgres <http://www.postgresql.org>`_).
 * No YEAR field type.
 * There are no FULLTEXT indexes for the MyISAM storage engine (the only engine FULLTEXT was supported in). Look at either Lucene, Sphinx, or Solr.
 * No "dual" table.
 * The "LOCAL" keyword in "LOAD DATA LOCAL INFILE" is not supported