~drizzle-trunk/drizzle/development

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
=========================
Notable MySQL Differences
=========================

Drizzle was forked from the (now defunct) MySQL 6.0 tree in 2008. Since then there have been a lot of changes. Drizzle is in some ways similar to MySQL, and in other ways, unrecognizable.

This section of documentation aims to explore some of the notable differences between MySQL and Drizzle, and has been modified from its original state on the Drizzle Wiki.

Usage
-----
 * There is no embedded server. The Drizzle Server is not loadable as a shared library.
 * Drizzle is optimized for massively concurrent environments. If we have the choice of improving performance for 1024 simultaneous connections to the detriment of performance with only 64 connections, we will take that choice.
 * It is designed for modern POSIX systems
 * Microsoft Windows is not a supported platform (neither is HP-UX or IRIX).
 * Drizzle doesn't use timezones. Everything is UTC.

Installation
------------

 * No scripts/mysql_install_db or similar. Drizzle aims for a "just works" installation, without administrative overhead.
 * No system database that needs upgrading between versions.
 * Drizzle can listen on the Drizzle port (4427) and/or MySQL port (3306) and speak the respective protocols.

Architecture
------------

Drizzle is designed around the concept of being a microkernel. There should
be a small core of the server with most functionality being provided through
small, efficient and hard to misuse plugin interfaces. The goal is a small,
light-weight kernel that is easy to maintain, understand and extend.

Drizzle is written in C++ and makes use of the Standard Template Library (STL)
and Boost. Only where performance or correctness proves to be inadequate will
we consider rolling our own; our preference is to fix the upstream library
instead.

Network protocol
----------------

Pluggable network protocols allow Drizzle to speak one (or more) of several
protocols. Currently we support the MySQL protocol (compatible with existing
MySQL client libraries) and the Drizzle protocol, which is still under
development.

The Drizzle protocol embodies several important differences from MySQL:

 * Client sends first packet (rather than the server)
 * Built in sharding
 * Multi statement support (without requiring a semicolon to separate them)
 * Room for expansion to include NoSQL-type commands inline with SQL commands.

There is also a console plugin -- instead of providing access over a network
socket, this plugin allows access from the current tty.

Plugin API
----------

The existing plugin APIs that Drizzle inherited from MySQL have been reworked.

 * User Defined Functions (UDFs) now follow the same API as a given
   server instead of a different C API. This means that UDFs are on the
   exact same level as built-in functions
 * Some parts of the storage Engine API have been extensively reworked, especially
   around transactions and DDL
 * Logging is now pluggable
 * Authentication is pluggable
 * Replication is pluggable
 * INFORMATION_SCHEMA plugins have been replaced by the function_engine, which
   is a lot more space and time efficient
 * Network protocols are pluggable
 * Scheduler is pluggable (multi_thread, pool_of_threads, etc)
 * Plugin points for manipulating rows before/after operations: these can be used for
   replication and the PBMS Blob Streaming plugin

Stored Procedures
-----------------

Drizzle does not currently have any plugins that implement stored procedures. We
viewed the implementation in MySQL to be non-optimal. They bloat the parser
and only support one language (SQL2003 stored procedures), which was not
well known.

Fundamentally, stored procedures usually are not the correct architectural
decision for applications that need to scale. Pushing more computation down
into the database (which is the trickiest layer to scale) isn't a good idea.

We do recognize the value of using stored procedures to reduce the time row locks are held, but think we can achieve the same advantage by improved batching of commands over the wire. This removes adding and administering stored procedures from the list of things that can go wrong in administering the database.

Triggers
--------

Drizzle does not currently have any plugin that provides SQL triggers. We
have some hooks for callbacks inside the server so that plugins can hook
into points that triggers could.

Views
-----

SQL Views are not currently supported in Drizzle. We believe they should be
implemented via a query rewrite plugin. 

See the `Query Rewrite Blueprint <https://blueprints.launchpad.net/Drizzle/+spec/query-rewrite>`_ on launchpad.

Partitioning
------------

INFORMATION_SCHEMA
------------------

The INFORMATION_SCHEMA provides access to database metadata.

The INFORMATION_SCHEMA in Drizzle is strictly ANSI compliant. If you write
a query to any of the tables in the INFORMATION_SCHEMA in Drizzle, you can
directly run these on any other ANSI compliant system.

For information that does not fit into the standard, there is also the
DATA_DICTIONARY schema. Use of tables in DATA_DICTIONARY is non-portable.

This allows developers to easily know if the query is portable or not.

Authentication, Authorization and Access
----------------------------------------

Authentication lies in Drizzle plugins. Currently there are PAM and HTTP AUTH plugins for authentication.
Through the PAM plugin, you can use any PAM module (such as LDAP).

For more information, see our :doc:`authentication` doc.

Command line clients
--------------------

We've stopped the confusion: -p means port and -P means password.

No gotcha of using the unix socket when localhost is specified and then
connecting you to the wrong database server.

There is no Drizzle admin command.

Storage Engines
---------------

 * MERGE storage engine has been removed
 * FEDERATED storage engine has been removed (all current development is
   focused on FederatedX, so having FEDERATED made no sense).
 * CSV engine is now for temporary tables only. See the filesystem_engine for
   the future of reading files as database tables.
 * MyISAM is for temporary tables only.
 * ARCHIVE is fully supported
 * PBXT is merged

FRM Files
---------

There are no FRM files in Drizzle. Engines now own their own metadata.
Some still choose to store these in files on disk. These are now in a
documented file format (using the google protobuf library).

SHOW commands
-------------

Several SHOW commands have been removed, replaced with INFORMATION_SCHEMA
or DATA_DICTIONARY views. All SHOW commands are aliases to INFORMATION_SCHEMA
queries. Our INFORMATION_SCHEMA implementation does not have the drawbacks
of the MySQL implementation.

Here are some specific SHOW differences:

 * SHOW ENGINES: use DATA_DICTIONARY
 * SHOW CREATE TABLE: specifies the AUTO_INCREMENT at CREATE/ALTER TABLE time,
   not the current AUTO_INCREMENT
 * SHOW TABLE STATUS: only shows tables in the current table cache
 * SHOW [ENGINE] STATUS: use the DATA_DICTIONARY tables for that engine

Removed commands
----------------

 * ALTER TABLE UPGRADE
 * REPAIR TABLE
 * CREATE FUNCTION
 * CONVERT
 * SET NAMES
 * Multi-table delete and multi-table update code was removed and can be accomplished through subqueries. More detailed information can be found in the :doc:`dml` section.

Removed functions
-----------------

 * crypt()
 * bit_length()
 * bit_count()

Keywords removed
----------------

 * CIPHER
 * CLIENT
 * CODE
 * CONTRIBUTORS
 * CPU
 * DEFINER
 * DES_KEY_FILE
 * ENGINES
 * EVERY
 * IO
 * IPC
 * ISSUSER

Objects Removed
---------------

 * There is no requirement for a 'mysql' schema.
 * There is no SET datatype, use ENUM.
 * There is no SET NAMES command, UTF-8 by default
 * There is no CHARSET or CHARACTER SET commands, everything defaults to UTF8
 * There is no TIME type, use DATETIME or INT.
 * There is no TINYINT, SMALLINT or MEDIUMINT. Integer operations have been optimized around 32 and 64 bit integers.
 * There are no TINYBLOB, MEDIUMBLOB and LONGBLOB datatypes. We have optimized a single BLOB container.
 * There are no TINYTEXT, MEDIUMTEXT and LONGTEXT datatypes. Use TEXT or BLOB.
 * There is no UNSIGNED (as per the standard).
 * There are no spatial data types GEOMETRY, POINT, LINESTRING & POLYGON (go use `Postgres <http://www.postgresql.org>`_).
 * No YEAR field type.
 * There are no FULLTEXT indexes for the MyISAM storage engine (the only engine FULLTEXT was supported in). Look at either Lucene, Sphinx, or Solr.
 * No "dual" table.
 * The "LOCAL" keyword in "LOAD DATA LOCAL INFILE" is not supported