26
26
This file contains:
29
A module that accepts a condition, index (or partitioning) description,
30
and builds lists of intervals (in index/partitioning space), such that
31
all possible records that match the condition are contained within the
29
A module that accepts a condition, index (or partitioning) description,
30
and builds lists of intervals (in index/partitioning space), such that
31
all possible records that match the condition are contained within the
33
33
The entry point for the range analysis module is get_mm_tree() function.
35
35
The lists are returned in form of complicated structure of interlinked
36
36
SEL_TREE/SEL_IMERGE/SEL_ARG objects.
37
See quick_range_seq_next, find_used_partitions for examples of how to walk
37
See quick_range_seq_next, find_used_partitions for examples of how to walk
39
39
All direct "users" of this module are located within this file, too.
46
46
The module has single entry point - prune_partitions() function.
49
Range/index_merge/groupby-minmax optimizer module
50
A module that accepts a table, condition, and returns
49
Range/index_merge/groupby-minmax optimizer module
50
A module that accepts a table, condition, and returns
51
51
- a QUICK_*_SELECT object that can be used to retrieve rows that match
52
the specified condition, or a "no records will match the condition"
52
the specified condition, or a "no records will match the condition"
55
55
The module entry points are
65
65
The code in this file (and elsewhere) makes operations on key value tuples.
66
66
Those tuples are stored in the following format:
68
68
The tuple is a sequence of key part values. The length of key part value
69
69
depends only on its type (and not depends on the what value is stored)
71
71
KeyTuple: keypart1-data, keypart2-data, ...
73
73
The value of each keypart is stored in the following format:
75
75
keypart_data: [isnull_byte] keypart-value-bytes
77
77
If a keypart may have a NULL value (key_part->field->real_maybe_null() can
78
be used to check this), then the first byte is a NULL indicator with the
78
be used to check this), then the first byte is a NULL indicator with the
79
79
following valid values:
80
80
1 - keypart has NULL value.
81
81
0 - keypart has non-NULL value.
87
87
keypart-value-bytes holds the value. Its format depends on the field type.
88
88
The length of keypart-value-bytes may or may not depend on the value being
89
stored. The default is that length is static and equal to
89
stored. The default is that length is static and equal to
90
90
KEY_PART_INFO::length.
92
Key parts with (key_part_flag & HA_BLOB_PART) have length depending of the
92
Key parts with (key_part_flag & HA_BLOB_PART) have length depending of the
95
95
keypart-value-bytes: value_length value_bytes
97
97
The value_length part itself occupies HA_KEY_BLOB_LENGTH=2 bytes.
99
99
See key_copy() and key_restore() for code to move data between index tuple
102
CAUTION: the above description is only sergefp's understanding of the
102
CAUTION: the above description is only sergefp's understanding of the
103
103
subject and may omit some details.
106
106
#include <drizzled/server_includes.h>
107
107
#include <drizzled/sql_select.h>
110
#define test_rb_tree(A,B) {}
111
#define test_use_count(A) {}
108
#include <drizzled/error.h>
109
#include <drizzled/cost_vect.h>
110
#include <drizzled/item/cmpfunc.h>
111
#include <drizzled/field/num.h>
112
#include <drizzled/check_stack_overrun.h>
118
#if defined(CMATH_NAMESPACE)
119
using namespace CMATH_NAMESPACE;
124
132
class RANGE_OPT_PARAM;
126
134
A construction block of the SEL_ARG-graph.
128
The following description only covers graphs of SEL_ARG objects with
136
The following description only covers graphs of SEL_ARG objects with
129
137
sel_arg->type==KEY_RANGE:
131
139
One SEL_ARG object represents an "elementary interval" in form
133
141
min_value <=? table.keypartX <=? max_value
135
143
The interval is a non-empty interval of any kind: with[out] minimum/maximum
136
144
bound, [half]open/closed, single-point interval, etc.
138
146
1. SEL_ARG GRAPH STRUCTURE
140
148
SEL_ARG objects are linked together in a graph. The meaning of the graph
141
149
is better demostrated by an example:
146
154
| part=1 $ part=2 $ part=3
161
169
+-------+ $ $ +--------+
163
171
... $ $ +--------+
165
173
The entire graph is partitioned into "interval lists".
167
175
An interval list is a sequence of ordered disjoint intervals over the same
168
176
key part. SEL_ARG are linked via "next" and "prev" pointers. Additionally,
169
all intervals in the list form an RB-tree, linked via left/right/parent
177
all intervals in the list form an RB-tree, linked via left/right/parent
170
178
pointers. The RB-tree root SEL_ARG object will be further called "root of the
173
In the example pic, there are 4 interval lists:
181
In the example pic, there are 4 interval lists:
174
182
"kp<1 OR kp1=2 OR kp1=3", "kp2=5", "kp3=10 OR kp3=12", "kp3=11 OR kp3=13".
175
183
The vertical lines represent SEL_ARG::next/prev pointers.
177
185
In an interval list, each member X may have SEL_ARG::next_key_part pointer
178
186
pointing to the root of another interval list Y. The pointed interval list
179
187
must cover a key part with greater number (i.e. Y->part > X->part).
181
189
In the example pic, the next_key_part pointers are represented by
182
190
horisontal lines.
186
194
It represents a condition in a special form (we don't have a name for it ATM)
187
195
The SEL_ARG::next/prev is "OR", and next_key_part is "AND".
189
197
For example, the picture represents the condition in form:
190
(kp1 < 1 AND kp2=5 AND (kp3=10 OR kp3=12)) OR
191
(kp1=2 AND (kp3=11 OR kp3=14)) OR
198
(kp1 < 1 AND kp2=5 AND (kp3=10 OR kp3=12)) OR
199
(kp1=2 AND (kp3=11 OR kp3=14)) OR
192
200
(kp1=3 AND (kp3=11 OR kp3=14))
197
205
Use get_mm_tree() to construct SEL_ARG graph from WHERE condition.
198
206
Then walk the SEL_ARG graph and get a list of dijsoint ordered key
199
207
intervals (i.e. intervals in form
201
209
(constA1, .., const1_K) < (keypart1,.., keypartK) < (constB1, .., constB_K)
203
211
Those intervals can be used to access the index. The uses are in:
208
216
and create QUICK_RANGE_SELECT object that will
209
217
read records within these intervals.
211
4. SPACE COMPLEXITY NOTES
219
4. SPACE COMPLEXITY NOTES
213
221
SEL_ARG graph is a representation of an ordered disjoint sequence of
214
222
intervals over the ordered set of index tuple values.
216
224
For multi-part keys, one can construct a WHERE expression such that its
217
225
list of intervals will be of combinatorial size. Here is an example:
219
(keypart1 IN (1,2, ..., n1)) AND
220
(keypart2 IN (1,2, ..., n2)) AND
227
(keypart1 IN (1,2, ..., n1)) AND
228
(keypart2 IN (1,2, ..., n2)) AND
221
229
(keypart3 IN (1,2, ..., n3))
223
231
For this WHERE clause the list of intervals will have n1*n2*n3 intervals
226
234
(keypart1, keypart2, keypart3) = (k1, k2, k3), where 1 <= k{i} <= n{i}
228
236
SEL_ARG graph structure aims to reduce the amount of required space by
229
237
"sharing" the elementary intervals when possible (the pic at the
230
beginning of this comment has examples of such sharing). The sharing may
238
beginning of this comment has examples of such sharing). The sharing may
231
239
prevent combinatorial blowup:
233
241
There are WHERE clauses that have combinatorial-size interval lists but
234
242
will be represented by a compact SEL_ARG graph.
236
(keypartN IN (1,2, ..., n1)) AND
244
(keypartN IN (1,2, ..., n1)) AND
238
(keypart2 IN (1,2, ..., n2)) AND
246
(keypart2 IN (1,2, ..., n2)) AND
239
247
(keypart1 IN (1,2, ..., n3))
241
249
but not in all cases:
244
252
representation but get_mm_tree() and its callees will construct a
245
253
graph of combinatorial size.
247
(keypart1 IN (1,2, ..., n1)) AND
248
(keypart2 IN (1,2, ..., n2)) AND
255
(keypart1 IN (1,2, ..., n1)) AND
256
(keypart2 IN (1,2, ..., n2)) AND
250
258
(keypartN IN (1,2, ..., n3))
255
263
By induction: Let's take any interval on some keypart in the middle:
259
267
Then let's AND it with this interval 'structure' from preceding and
260
268
following keyparts:
262
270
(kp14=c1 AND kp16=c3) OR keypart14=c2) (*)
264
272
We will obtain this SEL_ARG graph:
266
274
kp14 $ kp15 $ kp16
268
276
+---------+ $ +---------+ $ +---------+
269
277
| kp14=c1 |--$-->| kp15=c0 |--$-->| kp16=c3 |
270
278
+---------+ $ +---------+ $ +---------+
272
+---------+ $ +---------+ $
273
| kp14=c2 |--$-->| kp15=c0 | $
274
+---------+ $ +---------+ $
280
+---------+ $ +---------+ $
281
| kp14=c2 |--$-->| kp15=c0 | $
282
+---------+ $ +---------+ $
277
285
Note that we had to duplicate "kp15=c0" and there was no way to avoid
279
287
The induction step: AND the obtained expression with another "wrapping"
280
288
expression like (*).
281
When the process ends because of the limit on max. number of keyparts
289
When the process ends because of the limit on max. number of keyparts
284
292
WHERE clause length is O(3*#max_keyparts)
319
327
SEL_ARG *left,*right; /* R-B tree children */
320
328
SEL_ARG *next,*prev; /* Links for bi-directional interval list */
321
329
SEL_ARG *parent; /* R-B tree parent */
322
SEL_ARG *next_key_part;
330
SEL_ARG *next_key_part;
323
331
enum leaf_color { BLACK,RED } color;
324
332
enum Type { IMPOSSIBLE, MAYBE, MAYBE_KEY, KEY_RANGE } type;
345
353
inline void merge_flags(SEL_ARG *arg) { maybe_flag|=arg->maybe_flag; }
346
354
inline void maybe_smaller() { maybe_flag=1; }
347
355
/* Return true iff it's a single-point null interval */
348
inline bool is_null_interval() { return maybe_null && max_value[0] == 1; }
356
inline bool is_null_interval() { return maybe_null && max_value[0] == 1; }
349
357
inline int cmp_min_to_min(SEL_ARG* arg)
351
359
return sel_cmp(field,min_value, arg->min_value, min_flag, arg->min_flag);
482
490
uint32_t res= key_tree->store_min(key[key_tree->part].store_length,
483
491
range_key, *range_key_flag);
484
492
*range_key_flag|= key_tree->min_flag;
486
494
if (key_tree->next_key_part &&
487
495
key_tree->next_key_part->part == key_tree->part+1 &&
488
496
!(*range_key_flag & (NO_MIN_RANGE | NEAR_MIN)) &&
604
612
Starting an effort to document this field:
605
(for some i, keys[i]->type == SEL_ARG::IMPOSSIBLE) =>
613
(for some i, keys[i]->type == SEL_ARG::IMPOSSIBLE) =>
606
614
(type == SEL_TREE::IMPOSSIBLE)
608
616
enum Type { IMPOSSIBLE, ALWAYS, MAYBE, KEY, KEY_SMALLER } type;
656
664
#keys elements are not empty)
661
669
If true, the index descriptions describe real indexes (and it is ok to
662
670
call field->optimize_range(real_keynr[...], ...).
663
671
Otherwise index description describes fake indexes.
665
673
bool using_real_indexes;
667
675
bool remove_jump_scans;
670
678
used_key_no -> table_key_no translation table. Only makes sense if
671
679
using_real_indexes==true
673
681
uint32_t real_keynr[MAX_KEY];
674
682
/* Number of SEL_ARG objects allocated by SEL_ARG::clone_tree operations */
675
uint32_t alloced_sel_args;
683
uint32_t alloced_sel_args;
676
684
bool force_default_mrr;
723
731
static bool is_key_scan_ror(PARAM *param, uint32_t keynr, uint8_t nparts);
724
732
static ha_rows check_quick_select(PARAM *param, uint32_t idx, bool index_only,
725
SEL_ARG *tree, bool update_tbl_stats,
733
SEL_ARG *tree, bool update_tbl_stats,
726
734
uint32_t *mrr_flags, uint32_t *bufsize,
727
735
COST_VECT *cost);
728
736
//bool update_tbl_stats);
734
742
QUICK_RANGE_SELECT *get_quick_select(PARAM *param,uint32_t index,
735
SEL_ARG *key_tree, uint32_t mrr_flags,
743
SEL_ARG *key_tree, uint32_t mrr_flags,
736
744
uint32_t mrr_buf_size, MEM_ROOT *alloc);
737
745
static TRP_RANGE *get_key_scans_params(PARAM *param, SEL_TREE *tree,
738
746
bool index_read_must_be_used,
1070
bool SQL_SELECT::check_quick(Session *session, bool force_quick_range,
1075
return test_quick_select(session, tmp, 0, limit,
1076
force_quick_range, false) < 0;
1080
bool SQL_SELECT::skip_record()
1082
return cond ? cond->val_int() == 0 : 0;
1061
1086
QUICK_SELECT_I::QUICK_SELECT_I()
1062
1087
:max_used_key_length(0),
1063
1088
used_key_parts(0)
1066
QUICK_RANGE_SELECT::QUICK_RANGE_SELECT(THD *thd, Table *table, uint32_t key_nr,
1091
QUICK_RANGE_SELECT::QUICK_RANGE_SELECT(Session *session, Table *table, uint32_t key_nr,
1067
1092
bool no_alloc, MEM_ROOT *parent_alloc,
1068
1093
bool *create_error)
1069
1094
:free_file(0),cur_range(NULL),last_range(0),dont_free(0)
1077
1102
key_part_info= head->key_info[index].key_part;
1078
1103
my_init_dynamic_array(&ranges, sizeof(QUICK_RANGE*), 16, 16);
1080
/* 'thd' is not accessible in QUICK_RANGE_SELECT::reset(). */
1081
mrr_buf_size= thd->variables.read_rnd_buff_size;
1105
/* 'session' is not accessible in QUICK_RANGE_SELECT::reset(). */
1106
mrr_buf_size= session->variables.read_rnd_buff_size;
1082
1107
mrr_buf_desc= NULL;
1084
1109
if (!no_alloc && !parent_alloc)
1086
1111
// Allocates everything through the internal memroot
1087
init_sql_alloc(&alloc, thd->variables.range_alloc_block_size, 0);
1088
thd->mem_root= &alloc;
1112
init_sql_alloc(&alloc, session->variables.range_alloc_block_size, 0);
1113
session->mem_root= &alloc;
1091
1116
memset(&alloc, 0, sizeof(alloc));
1156
QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT(THD *thd_param,
1180
QUICK_INDEX_MERGE_SELECT::QUICK_INDEX_MERGE_SELECT(Session *session_param,
1158
:pk_quick_select(NULL), thd(thd_param)
1182
:pk_quick_select(NULL), session(session_param)
1160
1184
index= MAX_KEY;
1162
1186
memset(&read_record, 0, sizeof(read_record));
1163
init_sql_alloc(&alloc, thd->variables.range_alloc_block_size, 0);
1187
init_sql_alloc(&alloc, session->variables.range_alloc_block_size, 0);
1206
QUICK_ROR_INTERSECT_SELECT::QUICK_ROR_INTERSECT_SELECT(THD *thd_param,
1230
QUICK_ROR_INTERSECT_SELECT::QUICK_ROR_INTERSECT_SELECT(Session *session_param,
1208
1232
bool retrieve_full_rows,
1209
1233
MEM_ROOT *parent_alloc)
1210
: cpk_quick(NULL), thd(thd_param), need_to_fetch_row(retrieve_full_rows),
1234
: cpk_quick(NULL), session(session_param), need_to_fetch_row(retrieve_full_rows),
1211
1235
scans_inited(false)
1213
1237
index= MAX_KEY;
1215
1239
record= head->record[0];
1216
1240
if (!parent_alloc)
1217
init_sql_alloc(&alloc, thd->variables.range_alloc_block_size, 0);
1241
init_sql_alloc(&alloc, session->variables.range_alloc_block_size, 0);
1219
1243
memset(&alloc, 0, sizeof(MEM_ROOT));
1220
1244
last_rowid= (unsigned char*) alloc_root(parent_alloc? parent_alloc : &alloc,
1288
if (!(file= head->file->clone(thd->mem_root)))
1311
session= head->in_use;
1312
if (!(file= head->file->clone(session->mem_root)))
1291
1315
Manually set the error flag. Note: there seems to be quite a few
1292
1316
places where a failure could cause the server to "hang" the client by
1293
sending no response to a query. ATM those are not real errors because
1294
the storage engine calls in question happen to never fail with the
1295
existing storage engines.
1317
sending no response to a query. ATM those are not real errors because
1318
the storage engine calls in question happen to never fail with the
1319
existing storage engines.
1297
1321
my_error(ER_OUT_OF_RESOURCES, MYF(0)); /* purecov: inspected */
1298
1322
/* Caller will free the memory */
1302
1326
head->column_bitmaps_set(&column_bitmap, &column_bitmap);
1304
if (file->ha_external_lock(thd, F_RDLCK))
1328
if (file->ha_external_lock(session, F_RDLCK))
1307
1331
if (init() || reset())
1309
file->ha_external_lock(thd, F_UNLCK);
1333
file->ha_external_lock(session, F_UNLCK);
1445
QUICK_ROR_UNION_SELECT::QUICK_ROR_UNION_SELECT(THD *thd_param,
1475
QUICK_ROR_UNION_SELECT::QUICK_ROR_UNION_SELECT(Session *session_param,
1447
: thd(thd_param), scans_inited(false)
1477
: session(session_param), scans_inited(false)
1449
1479
index= MAX_KEY;
1451
1481
rowid_length= table->file->ref_length;
1452
1482
record= head->record[0];
1453
init_sql_alloc(&alloc, thd->variables.range_alloc_block_size, 0);
1454
thd_param->mem_root= &alloc;
1483
init_sql_alloc(&alloc, session->variables.range_alloc_block_size, 0);
1484
session_param->mem_root= &alloc;
1795
1825
KEY_PART_INFO *keyinfo= table->key_info[idx].key_part;
1796
1826
uint32_t n_parts= table->key_info[idx].key_parts;
1797
1827
uint32_t partno= 0;
1800
The below check is sufficient considering we now have either BTREE
1801
indexes (records are returned in order for any index prefix) or HASH
1830
The below check is sufficient considering we now have either BTREE
1831
indexes (records are returned in order for any index prefix) or HASH
1802
1832
indexes (records are not returned in order for any index prefix).
1804
1834
if (!(table->file->index_flags(idx, 0, 1) & HA_READ_ORDER))
1810
1840
((Item_field*)item)->field->eq(keyinfo[partno].field)))
1814
1844
if (!ord && table->key_info[idx].key_length < match_key_len)
1817
1847
Ok, the ordering is compatible and this key is shorter then
1818
1848
previous match (we want shorter keys as we'll have to read fewer
1819
1849
index pages for the same number of records)
1882
1912
/* Table read plans are allocated on MEM_ROOT and are never deleted */
1883
1913
static void *operator new(size_t size, MEM_ROOT *mem_root)
1884
1914
{ return (void*) alloc_root(mem_root, (uint) size); }
1885
static void operator delete(void *ptr __attribute__((unused)),
1886
size_t size __attribute__((unused)))
1915
static void operator delete(void *, size_t)
1887
1916
{ TRASH(ptr, size); }
1888
static void operator delete(void *ptr __attribute__((unused)),
1889
MEM_ROOT *mem_root __attribute__((unused)))
1917
static void operator delete(void *, MEM_ROOT *)
1890
1918
{ /* Never called */ }
1891
1919
virtual ~TABLE_READ_PLAN() {} /* Remove gcc warning */
1918
1946
virtual ~TRP_RANGE() {} /* Remove gcc warning */
1920
QUICK_SELECT_I *make_quick(PARAM *param,
1921
bool retrieve_full_rows __attribute__((unused)),
1922
MEM_ROOT *parent_alloc)
1948
QUICK_SELECT_I *make_quick(PARAM *param, bool, MEM_ROOT *parent_alloc)
1924
1950
QUICK_RANGE_SELECT *quick;
1925
1951
if ((quick= get_quick_select(param, key_idx, key, mrr_flags, mrr_buf_size,
2107
2133
quick_condition_rows value is obtained as follows:
2109
2135
It is a minimum of E(#output rows) for all considered table access
2110
2136
methods (range and index_merge accesses over various indexes).
2112
2138
The obtained value is not a true E(#rows that satisfy table condition)
2113
2139
but rather a pessimistic estimate. To obtain a true E(#...) one would
2114
2140
need to combine estimates of various access methods, taking into account
2115
2141
correlations between sets of rows they will return.
2117
2143
For example, if values of tbl.key1 and tbl.key2 are independent (a right
2118
2144
assumption if we have no information about their correlation) then the
2119
2145
correct estimate will be:
2121
E(#rows("tbl.key1 < c1 AND tbl.key2 < c2")) =
2147
E(#rows("tbl.key1 < c1 AND tbl.key2 < c2")) =
2122
2148
= E(#rows(tbl.key1 < c1)) / total_rows(tbl) * E(#rows(tbl.key2 < c2)
2124
which is smaller than
2150
which is smaller than
2126
2152
MIN(E(#rows(tbl.key1 < c1), E(#rows(tbl.key2 < c2)))
2128
2154
which is currently produced.
2131
2157
* Change the value returned in quick_condition_rows from a pessimistic
2132
estimate to true E(#rows that satisfy table condition).
2133
(we can re-use some of E(#rows) calcuation code from index_merge/intersection
2158
estimate to true E(#rows that satisfy table condition).
2159
(we can re-use some of E(#rows) calcuation code from index_merge/intersection
2136
2162
* Check if this function really needs to modify keys_to_use, and change the
2137
2163
code to pass it by reference if it doesn't.
2146
2172
1 if found usable ranges and quick select has been successfully created.
2149
int SQL_SELECT::test_quick_select(THD *thd, key_map keys_to_use,
2175
int SQL_SELECT::test_quick_select(Session *session, key_map keys_to_use,
2150
2176
table_map prev_tables,
2151
ha_rows limit, bool force_quick_range,
2177
ha_rows limit, bool force_quick_range,
2152
2178
bool ordered_output)
2183
if (check_stack_overrun(thd, 2*STACK_MIN_SIZE, NULL))
2209
if (check_stack_overrun(session, 2*STACK_MIN_SIZE, NULL))
2184
2210
return(0); // Fatal error flag is set
2186
2212
/* set up parameter that is passed to all functions */
2213
param.session= session;
2188
2214
param.baseflag= head->file->ha_table_flags();
2189
2215
param.prev_tables=prev_tables | const_tables;
2190
2216
param.read_tables=read_tables;
2192
2218
param.table=head;
2194
2220
param.mem_root= &alloc;
2195
param.old_root= thd->mem_root;
2221
param.old_root= session->mem_root;
2196
2222
param.needed_reg= &needed_reg;
2197
2223
param.imerge_cost_buff_size= 0;
2198
2224
param.using_real_indexes= true;
2199
2225
param.remove_jump_scans= true;
2200
2226
param.force_default_mrr= ordered_output;
2202
thd->no_errors=1; // Don't warn about NULL
2203
init_sql_alloc(&alloc, thd->variables.range_alloc_block_size, 0);
2228
session->no_errors=1; // Don't warn about NULL
2229
init_sql_alloc(&alloc, session->variables.range_alloc_block_size, 0);
2204
2230
if (!(param.key_parts= (KEY_PART*) alloc_root(&alloc,
2205
2231
sizeof(KEY_PART)*
2206
2232
head->s->key_parts)) ||
2207
2233
fill_used_fields_bitmap(¶m))
2235
session->no_errors=0;
2210
2236
free_root(&alloc,MYF(0)); // Return memory & allocator
2211
2237
return(0); // Can't use range
2213
2239
key_parts= param.key_parts;
2214
thd->mem_root= &alloc;
2240
session->mem_root= &alloc;
2217
2243
Make an array with description of all key parts of all table keys.
2248
2274
if (!head->covering_keys.is_clear_all())
2250
2276
int key_for_use= head->find_shortest_key(&head->covering_keys);
2251
double key_read_time=
2252
param.table->file->index_only_read_time(key_for_use,
2277
double key_read_time=
2278
param.table->file->index_only_read_time(key_for_use,
2253
2279
rows2double(records)) +
2254
2280
(double) records / TIME_FOR_COMPARE;
2255
2281
if (key_read_time < read_time)
2353
2379
new_conj_trp= get_best_disjunct_quick(¶m, imerge, best_read_time);
2354
2380
if (new_conj_trp)
2355
set_if_smaller(param.table->quick_condition_rows,
2381
set_if_smaller(param.table->quick_condition_rows,
2356
2382
new_conj_trp->records);
2357
2383
if (!best_conj_trp || (new_conj_trp && new_conj_trp->read_cost <
2358
2384
best_conj_trp->read_cost))
2545
2571
/* Calculate cost(rowid_to_row_scan) */
2547
2573
COST_VECT sweep_cost;
2548
JOIN *join= param->thd->lex->select_lex.join;
2574
JOIN *join= param->session->lex->select_lex.join;
2549
2575
bool is_interrupted= test(join && join->tables == 1);
2550
2576
get_sweep_read_cost(param->table, non_cpk_scan_records, is_interrupted,
2558
2584
unique_calc_buff_size=
2559
2585
Unique::get_cost_calc_buff_size((ulong)non_cpk_scan_records,
2560
2586
param->table->file->ref_length,
2561
param->thd->variables.sortbuff_size);
2587
param->session->variables.sortbuff_size);
2562
2588
if (param->imerge_cost_buff_size < unique_calc_buff_size)
2564
2590
if (!(param->imerge_cost_buff= (uint*)alloc_root(param->mem_root,
2571
2597
Unique::get_use_cost(param->imerge_cost_buff, (uint)non_cpk_scan_records,
2572
2598
param->table->file->ref_length,
2573
param->thd->variables.sortbuff_size);
2599
param->session->variables.sortbuff_size);
2574
2600
if (imerge_cost < read_time)
2576
2602
if ((imerge_trp= new (param->mem_root)TRP_INDEX_MERGE))
2661
2687
double roru_total_cost;
2663
2689
COST_VECT sweep_cost;
2664
JOIN *join= param->thd->lex->select_lex.join;
2690
JOIN *join= param->session->lex->select_lex.join;
2665
2691
bool is_interrupted= test(join && join->tables == 1);
2666
2692
get_sweep_read_cost(param->table, roru_total_records, is_interrupted,
2881
2907
void ror_intersect_cpy(ROR_INTERSECT_INFO *dst, const ROR_INTERSECT_INFO *src)
2883
2909
dst->param= src->param;
2884
memcpy(dst->covered_fields.bitmap, src->covered_fields.bitmap,
2910
memcpy(dst->covered_fields.bitmap, src->covered_fields.bitmap,
2885
2911
no_bytes_in_map(&src->covered_fields));
2886
2912
dst->out_rows= src->out_rows;
2887
2913
dst->is_covering= src->is_covering;
2981
3007
Selectivity of given ROR scan.
2984
static double ror_scan_selectivity(const ROR_INTERSECT_INFO *info,
3010
static double ror_scan_selectivity(const ROR_INTERSECT_INFO *info,
2985
3011
const ROR_SCAN_INFO *scan)
2987
3013
double selectivity_mult= 1.0;
3084
3110
E(rows_to_retrieve) = #rows_in_table * ror_scan_selectivity(null, scan1) *
3085
3111
ror_scan_selectivity({scan1}, scan2) * ... *
3086
ror_scan_selectivity({scan1,...}, scanN).
3112
ror_scan_selectivity({scan1,...}, scanN).
3088
3114
true ROR scan added to ROR-intersection, cost updated.
3089
3115
false It doesn't make sense to add this ROR scan to this ROR-intersection.
3100
3126
/* Don't add this scan if it doesn't improve selectivity. */
3104
3130
info->out_rows *= selectivity_mult;
3106
3132
if (is_cpk_scan)
3109
CPK scan is used to filter out rows. We apply filtering for
3135
CPK scan is used to filter out rows. We apply filtering for
3110
3136
each record of every scan. Assuming 1/TIME_FOR_COMPARE_ROWID
3111
3137
per check this gives us:
3113
info->index_scan_costs += rows2double(info->index_records) /
3139
info->index_scan_costs += rows2double(info->index_records) /
3114
3140
TIME_FOR_COMPARE_ROWID;
3129
3155
if (!info->is_covering)
3131
3157
COST_VECT sweep_cost;
3132
JOIN *join= info->param->thd->lex->select_lex.join;
3158
JOIN *join= info->param->session->lex->select_lex.join;
3133
3159
bool is_interrupted= test(join && join->tables == 1);
3134
3160
get_sweep_read_cost(info->param->table, double2rows(info->out_rows),
3135
3161
is_interrupted, &sweep_cost);
3272
3298
/* Create and incrementally update ROR intersection. */
3273
3299
ROR_INTERSECT_INFO *intersect, *intersect_best;
3274
if (!(intersect= ror_intersect_init(param)) ||
3300
if (!(intersect= ror_intersect_init(param)) ||
3275
3301
!(intersect_best= ror_intersect_init(param)))
3317
3343
Ok, found the best ROR-intersection of non-CPK key scans.
3318
Check if we should add a CPK scan. If the obtained ROR-intersection is
3344
Check if we should add a CPK scan. If the obtained ROR-intersection is
3319
3345
covering, it doesn't make sense to add CPK scan.
3321
3347
if (cpk_scan && !intersect->is_covering)
3323
if (ror_intersect_add(intersect, cpk_scan, true) &&
3349
if (ror_intersect_add(intersect, cpk_scan, true) &&
3324
3350
(intersect->total_cost < min_cost))
3326
3352
cpk_scan_used= true;
3409
3435
ror_scan_mark= tree->ror_scans;
3411
3437
MY_BITMAP *covered_fields= ¶m->tmp_covered_fields;
3412
if (!covered_fields->bitmap)
3438
if (!covered_fields->bitmap)
3413
3439
covered_fields->bitmap= (my_bitmap_map*)alloc_root(param->mem_root,
3414
3440
param->fields_bitmap_size);
3415
3441
if (!covered_fields->bitmap ||
3509
3535
(except for clustered PK indexes)
3510
3536
update_tbl_stats true <=> update table->quick_* with information
3511
3537
about range scans we've evaluated.
3512
read_time Maximum cost. i.e. don't create read plans with
3538
read_time Maximum cost. i.e. don't create read plans with
3513
3539
cost > read_time.
3516
Find the best "range" table read plan for given SEL_TREE.
3517
The side effects are
3542
Find the best "range" table read plan for given SEL_TREE.
3543
The side effects are
3518
3544
- tree->ror_scans is updated to indicate which scans are ROR scans.
3519
3545
- if update_tbl_stats=true then table->quick_* is updated with info
3520
3546
about every possible range scan.
3555
3581
(*key)->maybe_flag)
3556
3582
param->needed_reg->set_bit(keynr);
3558
bool read_index_only= index_read_must_be_used ||
3584
bool read_index_only= index_read_must_be_used ||
3559
3585
param->table->covering_keys.is_set(keynr);
3561
3587
found_records= check_quick_select(param, idx, read_index_only, *key,
3599
QUICK_SELECT_I *TRP_INDEX_MERGE::make_quick(PARAM *param,
3600
bool retrieve_full_rows __attribute__((unused)),
3601
MEM_ROOT *parent_alloc __attribute__((unused)))
3625
QUICK_SELECT_I *TRP_INDEX_MERGE::make_quick(PARAM *param, bool, MEM_ROOT *)
3603
3627
QUICK_INDEX_MERGE_SELECT *quick_imerge;
3604
3628
QUICK_RANGE_SELECT *quick;
3605
3629
/* index_merge always retrieves full rows, ignore retrieve_full_rows */
3606
if (!(quick_imerge= new QUICK_INDEX_MERGE_SELECT(param->thd, param->table)))
3630
if (!(quick_imerge= new QUICK_INDEX_MERGE_SELECT(param->session, param->table)))
3609
3633
quick_imerge->records= records;
3676
QUICK_SELECT_I *TRP_ROR_UNION::make_quick(PARAM *param,
3677
bool retrieve_full_rows __attribute__((unused)),
3678
MEM_ROOT *parent_alloc __attribute__((unused)))
3700
QUICK_SELECT_I *TRP_ROR_UNION::make_quick(PARAM *param, bool, MEM_ROOT *)
3680
3702
QUICK_ROR_UNION_SELECT *quick_roru;
3681
3703
TABLE_READ_PLAN **scan;
3684
3706
It is impossible to construct a ROR-union that will not retrieve full
3685
3707
rows, ignore retrieve_full_rows parameter.
3687
if ((quick_roru= new QUICK_ROR_UNION_SELECT(param->thd, param->table)))
3709
if ((quick_roru= new QUICK_ROR_UNION_SELECT(param->session, param->table)))
3689
3711
for (scan= first_ror; scan != last_ror; scan++)
3711
3733
gt_value constant that field should be greaterr
3712
3734
cmp_type compare type for the field
3715
3737
# Pointer to tree built tree
3719
static SEL_TREE *get_ne_mm_tree(RANGE_OPT_PARAM *param, Item_func *cond_func,
3741
static SEL_TREE *get_ne_mm_tree(RANGE_OPT_PARAM *param, Item_func *cond_func,
3721
3743
Item *lt_value, Item *gt_value,
3722
3744
Item_result cmp_type)
3745
3767
value constant in the predicate
3746
3768
cmp_type compare type for the field
3747
3769
inv true <> NOT cond_func is considered
3748
(makes sense only when cond_func is BETWEEN or IN)
3770
(makes sense only when cond_func is BETWEEN or IN)
3751
3773
Pointer to the tree built tree
3754
static SEL_TREE *get_func_mm_tree(RANGE_OPT_PARAM *param, Item_func *cond_func,
3776
static SEL_TREE *get_func_mm_tree(RANGE_OPT_PARAM *param, Item_func *cond_func,
3755
3777
Field *field, Item *value,
3756
3778
Item_result cmp_type, bool inv)
3815
3837
We get here for conditions in form "t.key NOT IN (c1, c2, ...)",
3816
where c{i} are constants. Our goal is to produce a SEL_TREE that
3838
where c{i} are constants. Our goal is to produce a SEL_TREE that
3817
3839
represents intervals:
3819
3841
($MIN<t.key<c1) OR (c1<t.key<c2) OR (c2<t.key<c3) OR ... (*)
3821
3843
where $MIN is either "-inf" or NULL.
3823
3845
The most straightforward way to produce it is to convert NOT IN
3824
3846
into "(t.key != c1) AND (t.key != c2) AND ... " and let the range
3825
3847
analyzer to build SEL_TREE from that. The problem is that the
3830
3852
Another problem with big lists like (*) is that a big list is
3831
3853
unlikely to produce a good "range" access, while considering that
3832
range access will require expensive CPU calculations (and for
3854
range access will require expensive CPU calculations (and for
3833
3855
MyISAM even index accesses). In short, big NOT IN lists are rarely
3834
3856
worth analyzing.
3841
3863
#define NOT_IN_IGNORE_THRESHOLD 1000
3842
3864
MEM_ROOT *tmp_root= param->mem_root;
3843
param->thd->mem_root= param->old_root;
3865
param->session->mem_root= param->old_root;
3845
3867
Create one Item_type constant object. We'll need it as
3846
3868
get_mm_parts only accepts constant values wrapped in Item_Type
3848
3870
We create the Item on param->mem_root which points to
3849
per-statement mem_root (while thd->mem_root is currently pointing
3871
per-statement mem_root (while session->mem_root is currently pointing
3850
3872
to mem_root local to range optimizer).
3852
3874
Item *value_item= func->array->create_item();
3853
param->thd->mem_root= tmp_root;
3875
param->session->mem_root= tmp_root;
3855
3877
if (func->array->count > NOT_IN_IGNORE_THRESHOLD || !value_item)
3858
3880
/* Get a SEL_TREE for "(-inf|NULL) < X < c_0" interval. */
3862
3884
func->array->value_to_item(i, value_item);
3863
3885
tree= get_mm_parts(param, cond_func, field, Item_func::LT_FUNC,
3900
3922
new_interval->min_flag= NEAR_MIN;
3904
3926
The following doesn't try to allocate memory so no need to
3905
3927
check for NULL.
3907
3929
tree= tree_or(param, tree, tree2);
3911
3933
if (tree && tree->type != SEL_TREE::IMPOSSIBLE)
3914
Get the SEL_TREE for the last "c_last < X < +inf" interval
3936
Get the SEL_TREE for the last "c_last < X < +inf" interval
3915
3937
(value_item cotains c_last already)
3917
3939
tree2= get_mm_parts(param, cond_func, field, Item_func::GT_FUNC,
3930
3952
for (arg= func->arguments()+2, end= arg+func->argument_count()-2;
3931
3953
arg < end ; arg++)
3933
tree= tree_and(param, tree, get_ne_mm_tree(param, cond_func, field,
3955
tree= tree_and(param, tree, get_ne_mm_tree(param, cond_func, field,
3934
3956
*arg, *arg, cmp_type));
3941
3963
tree= get_mm_parts(param, cond_func, field, Item_func::EQ_FUNC,
3942
3964
func->arguments()[1], cmp_type);
3946
3968
for (arg= func->arguments()+2, end= arg+func->argument_count()-2;
3947
3969
arg < end ; arg++)
3949
tree= tree_or(param, tree, get_mm_parts(param, cond_func, field,
3971
tree= tree_or(param, tree, get_mm_parts(param, cond_func, field,
3950
3972
Item_func::EQ_FUNC,
3951
3973
*arg, cmp_type));
3960
3982
Here the function for the following predicates are processed:
3961
3983
<, <=, =, >=, >, LIKE, IS NULL, IS NOT NULL.
3962
3984
If the predicate is of the form (value op field) it is handled
3984
4006
field_item field in the predicate
3985
4007
value constant in the predicate
3986
4008
(for BETWEEN it contains the number of the field argument,
3987
for IN it's always 0)
4009
for IN it's always 0)
3988
4010
inv true <> NOT cond_func is considered
3989
4011
(makes sense only when cond_func is BETWEEN or IN)
3993
4015
c is a constant, the function builds a conjunction of all SEL_TREES that can
3994
4016
be obtained by the substitution of f for all different fields equal to f.
3997
4019
If the WHERE condition contains a predicate (fi op c),
3998
4020
then not only SELL_TREE for this predicate is built, but
3999
4021
the trees for the results of substitution of fi for
4000
4022
each fj belonging to the same multiple equality as fi
4001
4023
are built as well.
4002
E.g. for WHERE t1.a=t2.a AND t2.a > 10
4024
E.g. for WHERE t1.a=t2.a AND t2.a > 10
4003
4025
a SEL_TREE for t2.a > 10 will be built for quick select from t2
4005
4027
a SEL_TREE for t1.a > 10 will be built for quick select from t1.
4007
4029
A BETWEEN predicate of the form (fi [NOT] BETWEEN c1 AND c2) is treated
4010
4032
Yet a predicate of the form (c BETWEEN f1i AND f2i) is processed
4011
4033
differently. It is considered as a conjuction of two SARGable
4012
4034
predicates (f1i <= c) and (f2i <=c) and the function get_full_func_mm_tree
4013
is called for each of them separately producing trees for
4014
AND j (f1j <=c ) and AND j (f2j <= c)
4035
is called for each of them separately producing trees for
4036
AND j (f1j <=c ) and AND j (f2j <= c)
4015
4037
After this these two trees are united in one conjunctive tree.
4016
4038
It's easy to see that the same tree is obtained for
4017
4039
AND j,k (f1j <=c AND f2k<=c)
4018
which is equivalent to
4040
which is equivalent to
4019
4041
AND j,k (c BETWEEN f1j AND f2k).
4020
4042
The validity of the processing of the predicate (c NOT BETWEEN f1i AND f2i)
4021
4043
which equivalent to (f1i > c OR f2i < c) is not so obvious. Here the
4022
4044
function get_full_func_mm_tree is called for (f1i > c) and (f2i < c)
4023
4045
producing trees for AND j (f1j > c) and AND j (f2j < c). Then this two
4024
trees are united in one OR-tree. The expression
4046
trees are united in one OR-tree. The expression
4025
4047
(AND j (f1j > c) OR AND j (f2j < c)
4026
4048
is equivalent to the expression
4027
AND j,k (f1j > c OR f2k < c)
4028
which is just a translation of
4049
AND j,k (f1j > c OR f2k < c)
4050
which is just a translation of
4029
4051
AND j,k (c NOT BETWEEN f1j AND f2k)
4031
4053
In the cases when one of the items f1, f2 is a constant c1 we do not create
4038
4060
As to IN predicates only ones of the form (f IN (c1,...,cn)),
4039
4061
where f1 is a field and c1,...,cn are constant, are considered as
4040
4062
SARGable. We never try to narrow the index scan using predicates of
4041
the form (c IN (c1,...,f,...,cn)).
4063
the form (c IN (c1,...,f,...,cn)).
4044
4066
Pointer to the tree representing the built conjunction of SEL_TREEs
4047
4069
static SEL_TREE *get_full_func_mm_tree(RANGE_OPT_PARAM *param,
4048
4070
Item_func *cond_func,
4049
Item_field *field_item, Item *value,
4071
Item_field *field_item, Item *value,
4052
4074
SEL_TREE *tree= 0;
4106
4128
while ((item=li++))
4108
4130
SEL_TREE *new_tree=get_mm_tree(param,item);
4109
if (param->thd->is_fatal_error ||
4131
if (param->session->is_fatal_error ||
4110
4132
param->alloced_sel_args > SEL_ARG::MAX_SEL_ARGS)
4111
4133
return(0); // out of memory
4112
4134
tree=tree_and(param,tree,new_tree);
4137
4159
if (cond->const_item())
4140
During the cond->val_int() evaluation we can come across a subselect
4141
item which may allocate memory on the thd->mem_root and assumes
4142
all the memory allocated has the same life span as the subselect
4162
During the cond->val_int() evaluation we can come across a subselect
4163
item which may allocate memory on the session->mem_root and assumes
4164
all the memory allocated has the same life span as the subselect
4143
4165
item itself. So we have to restore the thread's mem_root here.
4145
4167
MEM_ROOT *tmp_root= param->mem_root;
4146
param->thd->mem_root= param->old_root;
4168
param->session->mem_root= param->old_root;
4147
4169
tree= cond->val_int() ? new(tmp_root) SEL_TREE(SEL_TREE::ALWAYS) :
4148
4170
new(tmp_root) SEL_TREE(SEL_TREE::IMPOSSIBLE);
4149
param->thd->mem_root= tmp_root;
4171
param->session->mem_root= tmp_root;
4188
4210
if (cond_func->arguments()[i]->real_item()->type() == Item::FIELD_ITEM)
4190
4212
field_item= (Item_field*) (cond_func->arguments()[i]->real_item());
4191
SEL_TREE *tmp= get_full_func_mm_tree(param, cond_func,
4213
SEL_TREE *tmp= get_full_func_mm_tree(param, cond_func,
4192
4214
field_item, (Item*)(intptr_t)i, inv);
4194
4216
tree= !tree ? tmp : tree_or(param, tree, tmp);
4196
4218
tree= tree_and(param, tree, tmp);
4259
4281
static SEL_TREE *
4260
4282
get_mm_parts(RANGE_OPT_PARAM *param, COND *cond_func, Field *field,
4261
4283
Item_func::Functype type,
4263
Item_result cmp_type __attribute__((unused)))
4284
Item *value, Item_result)
4265
4286
if (field->table != param->table)
4324
4345
the argument can be any, e.g. a subselect. The subselect
4325
4346
items, in turn, assume that all the memory allocated during
4326
4347
the evaluation has the same life span as the item itself.
4327
TODO: opt_range.cc should not reset thd->mem_root at all.
4348
TODO: opt_range.cc should not reset session->mem_root at all.
4329
param->thd->mem_root= param->old_root;
4350
param->session->mem_root= param->old_root;
4330
4351
if (!value) // IS NULL or IS NOT NULL
4332
4353
if (field->table->maybe_null) // Can't use a key on this
4468
4489
/* For comparison purposes allow invalid dates like 2000-01-32 */
4469
4490
orig_sql_mode= field->table->in_use->variables.sql_mode;
4470
4491
if (value->real_item()->type() == Item::STRING_ITEM &&
4471
(field->type() == DRIZZLE_TYPE_NEWDATE ||
4492
(field->type() == DRIZZLE_TYPE_DATE ||
4472
4493
field->type() == DRIZZLE_TYPE_DATETIME))
4473
4494
field->table->in_use->variables.sql_mode|= MODE_INVALID_DATES;
4474
4495
err= value->save_in_field_no_warnings(field, 1);
4491
4512
for the cases like int_field > 999999999999999999999999 as well.
4494
if (err == 3 && field->type() == DRIZZLE_TYPE_NEWDATE &&
4515
if (err == 3 && field->type() == DRIZZLE_TYPE_DATE &&
4495
4516
(type == Item_func::GT_FUNC || type == Item_func::GE_FUNC ||
4496
4517
type == Item_func::LT_FUNC || type == Item_func::LE_FUNC) )
4500
4521
but a non-zero time part was cut off.
4502
4523
In MySQL's SQL dialect, DATE and DATETIME are compared as datetime
4503
values. Index over a DATE column uses DATE comparison. Changing
4524
values. Index over a DATE column uses DATE comparison. Changing
4504
4525
from one comparison to the other is possible:
4506
4527
datetime(date_col)< '2007-12-10 12:34:55' -> date_col<='2007-12-10'
4783
4804
tree->part != 0. (e.g. it could represent "keypart2 < const").
4785
4806
WHY THIS FUNCTION IS NEEDED
4787
4808
Normally we allow construction of SEL_TREE objects that have SEL_ARG
4788
4809
trees that do not allow quick range select construction. For example for
4789
4810
" keypart1=1 AND keypart2=2 " the execution will proceed as follows:
4793
4814
call tree_and(tree1, tree2) -- this joins SEL_ARGs into a usable SEL_ARG
4796
4817
There is an exception though: when we construct index_merge SEL_TREE,
4797
4818
any SEL_ARG* tree that cannot be used to construct quick range select can
4798
4819
be removed, because current range analysis code doesn't provide any way
4799
4820
that tree could be later combined with another tree.
4800
4821
Consider an example: we should not construct
4802
merges = SEL_IMERGE {
4803
SEL_TREE(t.key1part1 = 1),
4823
merges = SEL_IMERGE {
4824
SEL_TREE(t.key1part1 = 1),
4804
4825
SEL_TREE(t.key2part2 = 2) -- (*)
4808
- (*) cannot be used to construct quick range select,
4809
- There is no execution path that would cause (*) to be converted to
4829
- (*) cannot be used to construct quick range select,
4830
- There is no execution path that would cause (*) to be converted to
4810
4831
a tree that could be used.
4812
4833
The latter is easy to verify: first, notice that the only way to convert
4813
4834
(*) into a usable tree is to call tree_and(something, (*)).
4815
4836
Second look at what tree_and/tree_or function would do when passed a
4816
SEL_TREE that has the structure like st1 tree has, and conlcude that
4837
SEL_TREE that has the structure like st1 tree has, and conlcude that
4817
4838
tree_and(something, (*)) will not be called.
4913
4934
/* one tree is index merge tree and another is range tree */
4914
4935
if (tree1->merges.is_empty())
4915
4936
std::swap(tree1, tree2);
4917
4938
if (param->remove_jump_scans && remove_nonrange_trees(param, tree2))
4918
4939
return(new SEL_TREE(SEL_TREE::ALWAYS));
4919
4940
/* add tree2 to tree1->merges, checking if it collapses to ALWAYS */
4930
4951
/* And key trees where key1->part < key2 -> part */
4932
4953
static SEL_ARG *
4933
and_all_keys(RANGE_OPT_PARAM *param, SEL_ARG *key1, SEL_ARG *key2,
4954
and_all_keys(RANGE_OPT_PARAM *param, SEL_ARG *key1, SEL_ARG *key2,
4934
4955
uint32_t clone_flag)
5031
5052
if (key1->type == SEL_ARG::MAYBE_KEY)
5032
5053
{ // Both are maybe key
5033
key1->next_key_part=key_and(param, key1->next_key_part,
5054
key1->next_key_part=key_and(param, key1->next_key_part,
5034
5055
key2->next_key_part, clone_flag);
5035
5056
if (key1->next_key_part &&
5036
5057
key1->next_key_part->type == SEL_ARG::IMPOSSIBLE)
5537
5558
return(0); // Maybe root later
5538
5559
if (remove_color == BLACK)
5539
5560
root=rb_delete_fixup(root,nod,fix_par);
5540
5562
test_rb_tree(root,root->parent);
5563
#endif /* EXTRA_DEBUG */
5542
5565
root->use_count=this->use_count; // Fix root counters
5543
5566
root->elements=this->elements-1;
5772
5798
This function counts how many times the node "key" is referred (via
5773
SEL_ARG::next_key_part) by
5774
- intervals of RB-tree pointed by "root",
5775
- intervals of RB-trees that are pointed by SEL_ARG::next_key_part from
5799
SEL_ARG::next_key_part) by
5800
- intervals of RB-tree pointed by "root",
5801
- intervals of RB-trees that are pointed by SEL_ARG::next_key_part from
5776
5802
intervals of RB-tree pointed by "root",
5779
Here is an example (horizontal links represent next_key_part pointers,
5780
vertical links - next/prev prev pointers):
5805
Here is an example (horizontal links represent next_key_part pointers,
5806
vertical links - next/prev prev pointers):
5783
5809
|root|-----------------+
5870
5896
****************************************************************************/
5872
5898
/* MRR range sequence, SEL_ARG* implementation: stack entry */
5873
typedef struct st_range_seq_entry
5899
typedef struct st_range_seq_entry
5876
5902
Pointers in min and max keys. They point to right-after-end of key
5877
5903
images. The 0-th entry has these pointing to key tuple start.
5879
5905
unsigned char *min_key, *max_key;
5882
5908
Flags, for {keypart0, keypart1, ... this_keypart} subtuple.
5883
5909
min_key_flag may have NULL_RANGE set.
5885
5911
uint32_t min_key_flag, max_key_flag;
5887
5913
/* Number of key parts */
5888
5914
uint32_t min_key_parts, max_key_parts;
5889
5915
SEL_ARG *key_tree;
5899
5925
uint32_t real_keyno; /* Number of the index in tables */
5901
5927
SEL_ARG *start; /* Root node of the traversed SEL_ARG* graph */
5903
5929
RANGE_SEQ_ENTRY stack[MAX_REF_PARTS];
5904
5930
int i; /* Index of last used element in the above array */
5906
5932
bool at_start; /* true <=> The traversal has just started */
5907
5933
} SEL_ARG_RANGE_SEQ;
5915
5941
init_params SEL_ARG tree traversal context
5916
n_ranges [ignored] The number of ranges obtained
5942
n_ranges [ignored] The number of ranges obtained
5917
5943
flags [ignored] HA_MRR_SINGLE_POINT, HA_MRR_FIXED_KEY
5920
5946
Value of init_param
5923
range_seq_t sel_arg_range_seq_init(void *init_param,
5924
uint32_t n_ranges __attribute__((unused)),
5925
uint32_t flags __attribute__((unused)))
5949
range_seq_t sel_arg_range_seq_init(void *init_param, uint32_t, uint32_t)
5927
5951
SEL_ARG_RANGE_SEQ *seq= (SEL_ARG_RANGE_SEQ*)init_param;
5928
5952
seq->at_start= true;
6039
6063
Walk right-up while we can
6041
6065
walk_right_n_up:
6042
while (key_tree->next_key_part && key_tree->next_key_part != &null_element &&
6066
while (key_tree->next_key_part && key_tree->next_key_part != &null_element &&
6043
6067
key_tree->next_key_part->part == key_tree->part + 1 &&
6044
6068
key_tree->next_key_part->type == SEL_ARG::KEY_RANGE)
6055
6079
seq->param->is_ror_scan= false;
6056
6080
if (!key_tree->min_flag)
6057
cur->min_key_parts +=
6081
cur->min_key_parts +=
6058
6082
key_tree->next_key_part->store_min_key(seq->param->key[seq->keyno],
6060
6084
&cur->min_key_flag);
6061
6085
if (!key_tree->max_flag)
6062
cur->max_key_parts +=
6086
cur->max_key_parts +=
6063
6087
key_tree->next_key_part->store_max_key(seq->param->key[seq->keyno],
6065
6089
&cur->max_key_flag);
6071
6095
Ok, current atomic interval is in form "t.field=const" and there is
6072
6096
next_key_part interval. Step right, and walk up from there.
6085
6109
/* Ok got a tuple */
6086
6110
RANGE_SEQ_ENTRY *cur= &seq->stack[seq->i];
6088
6112
range->ptr= (char*)(int)(key_tree->part);
6090
6114
range->range_flag= cur->min_key_flag | cur->max_key_flag;
6092
6116
range->start_key.key= seq->param->min_key;
6093
6117
range->start_key.length= cur->min_key - seq->param->min_key;
6094
6118
range->start_key.keypart_map= make_prev_keypart_map(cur->min_key_parts);
6095
range->start_key.flag= (cur->min_key_flag & NEAR_MIN ? HA_READ_AFTER_KEY :
6119
range->start_key.flag= (cur->min_key_flag & NEAR_MIN ? HA_READ_AFTER_KEY :
6096
6120
HA_READ_KEY_EXACT);
6098
6122
range->end_key.key= seq->param->max_key;
6099
6123
range->end_key.length= cur->max_key - seq->param->max_key;
6100
range->end_key.flag= (cur->max_key_flag & NEAR_MAX ? HA_READ_BEFORE_KEY :
6124
range->end_key.flag= (cur->max_key_flag & NEAR_MAX ? HA_READ_BEFORE_KEY :
6101
6125
HA_READ_AFTER_KEY);
6102
6126
range->end_key.keypart_map= make_prev_keypart_map(cur->max_key_parts);
6164
6188
ha_rows check_quick_select(PARAM *param, uint32_t idx, bool index_only,
6165
SEL_ARG *tree, bool update_tbl_stats,
6189
SEL_ARG *tree, bool update_tbl_stats,
6166
6190
uint32_t *mrr_flags, uint32_t *bufsize, COST_VECT *cost)
6168
6192
SEL_ARG_RANGE_SEQ seq;
6190
6214
param->is_ror_scan= true;
6191
6215
if (file->index_flags(keynr, 0, true) & HA_KEY_SCAN_NOT_ROR)
6192
6216
param->is_ror_scan= false;
6194
6218
*mrr_flags= param->force_default_mrr? HA_MRR_USE_DEFAULT_IMPL: 0;
6195
6219
*mrr_flags|= HA_MRR_NO_ASSOCIATION;
6197
6221
bool pk_is_clustered= file->primary_key_is_clustered();
6199
6223
(file->index_flags(keynr, param->max_key_part, 1) & HA_KEYREAD_ONLY) &&
6200
6224
!(pk_is_clustered && keynr == param->table->s->primary_key))
6201
6225
*mrr_flags |= HA_MRR_INDEX_ONLY;
6203
if (current_thd->lex->sql_command != SQLCOM_SELECT)
6227
if (current_session->lex->sql_command != SQLCOM_SELECT)
6204
6228
*mrr_flags |= HA_MRR_USE_DEFAULT_IMPL;
6206
*bufsize= param->thd->variables.read_rnd_buff_size;
6230
*bufsize= param->session->variables.read_rnd_buff_size;
6207
6231
rows= file->multi_range_read_info_const(keynr, &seq_if, (void*)&seq, 0,
6208
6232
bufsize, mrr_flags, cost);
6209
6233
if (rows != HA_POS_ERROR)
6222
6246
enum ha_key_alg key_alg= param->table->key_info[seq.real_keyno].algorithm;
6223
6247
if ((key_alg != HA_KEY_ALG_BTREE) && (key_alg!= HA_KEY_ALG_UNDEF))
6226
6250
All scans are non-ROR scans for those index types.
6227
TODO: Don't have this logic here, make table engines return
6251
TODO: Don't have this logic here, make table engines return
6228
6252
appropriate flags instead.
6230
6254
param->is_ror_scan= false;
6264
6288
where the index is defined on (key1_1, ..., key1_N [,a_1, ..., a_n])
6266
and the table has a clustered Primary Key defined as
6267
PRIMARY KEY(a_1, ..., a_n, b1, ..., b_k)
6269
i.e. the first key parts of it are identical to uncovered parts ot the
6290
and the table has a clustered Primary Key defined as
6291
PRIMARY KEY(a_1, ..., a_n, b1, ..., b_k)
6293
i.e. the first key parts of it are identical to uncovered parts ot the
6270
6294
key being scanned. This function assumes that the index flags do not
6271
6295
include HA_KEY_SCAN_NOT_ROR flag (that is checked elsewhere).
6345
6369
QUICK_RANGE_SELECT *quick;
6346
6370
bool create_err= false;
6348
quick=new QUICK_RANGE_SELECT(param->thd, param->table,
6372
quick=new QUICK_RANGE_SELECT(param->session, param->table,
6349
6373
param->real_keynr[idx],
6350
6374
test(parent_alloc), NULL, &create_err);
6600
QUICK_RANGE_SELECT *get_quick_select_for_ref(THD *thd, Table *table,
6624
QUICK_RANGE_SELECT *get_quick_select_for_ref(Session *session, Table *table,
6601
6625
TABLE_REF *ref, ha_rows records)
6603
6627
MEM_ROOT *old_root, *alloc;
6609
6633
bool create_err= false;
6610
6634
COST_VECT cost;
6612
old_root= thd->mem_root;
6613
/* The following call may change thd->mem_root */
6614
quick= new QUICK_RANGE_SELECT(thd, table, ref->key, 0, 0, &create_err);
6636
old_root= session->mem_root;
6637
/* The following call may change session->mem_root */
6638
quick= new QUICK_RANGE_SELECT(session, table, ref->key, 0, 0, &create_err);
6615
6639
/* save mem_root set by QUICK_RANGE_SELECT constructor */
6616
alloc= thd->mem_root;
6640
alloc= session->mem_root;
6618
return back default mem_root (thd->mem_root) changed by
6642
return back default mem_root (session->mem_root) changed by
6619
6643
QUICK_RANGE_SELECT constructor
6621
thd->mem_root= old_root;
6645
session->mem_root= old_root;
6623
6647
if (!quick || create_err)
6624
6648
return 0; /* no ranges found */
6678
6702
/* Call multi_range_read_info() to get the MRR flags and buffer size */
6679
quick->mrr_flags= HA_MRR_NO_ASSOCIATION |
6703
quick->mrr_flags= HA_MRR_NO_ASSOCIATION |
6680
6704
(table->key_read ? HA_MRR_INDEX_ONLY : 0);
6681
if (thd->lex->sql_command != SQLCOM_SELECT)
6705
if (session->lex->sql_command != SQLCOM_SELECT)
6682
6706
quick->mrr_flags |= HA_MRR_USE_DEFAULT_IMPL;
6684
quick->mrr_buf_size= thd->variables.read_rnd_buff_size;
6708
quick->mrr_buf_size= session->variables.read_rnd_buff_size;
6685
6709
if (table->file->multi_range_read_info(quick->index, 1, (uint)records,
6686
6710
&quick->mrr_buf_size,
6687
6711
&quick->mrr_flags, &cost))
6698
Perform key scans for all used indexes (except CPK), get rowids and merge
6722
Perform key scans for all used indexes (except CPK), get rowids and merge
6699
6723
them into an ordered non-recurrent sequence of rowids.
6701
6725
The merge/duplicate removal is performed using Unique class. We put all
6702
6726
rowids into Unique, get the sorted sequence and destroy the Unique.
6704
6728
If table has a clustered primary key that covers all rows (true for bdb
6705
6729
and innodb currently) and one of the index_merge scans is a scan on PK,
6706
then rows that will be retrieved by PK scan are not put into Unique and
6730
then rows that will be retrieved by PK scan are not put into Unique and
6707
6731
primary key scan is not performed here, it is performed later separately.
6784
6808
/* index_merge currently doesn't support "using index" at all */
6785
6809
file->extra(HA_EXTRA_NO_KEYREAD);
6786
6810
/* start table scan */
6787
init_read_record(&read_record, thd, head, (SQL_SELECT*) 0, 1, 1);
6811
init_read_record(&read_record, session, head, (SQL_SELECT*) 0, 1, 1);
6788
6812
return(result);
7015
7039
if (!mrr_buf_desc)
7016
7040
empty_buf.buffer= empty_buf.buffer_end= empty_buf.end_of_used_area= NULL;
7019
7043
mrr_flags |= HA_MRR_SORTED;
7020
7044
RANGE_SEQ_IF seq_funcs= {quick_range_seq_init, quick_range_seq_next};
7021
7045
error= file->multi_range_read_init(&seq_funcs, (void*)this, ranges.elements,
7022
mrr_flags, mrr_buf_desc? mrr_buf_desc:
7046
mrr_flags, mrr_buf_desc? mrr_buf_desc:
7029
7053
Range sequence interface implementation for array<QUICK_RANGE>: initialize
7032
7056
quick_range_seq_init()
7033
7057
init_param Caller-opaque paramenter: QUICK_RANGE_SELECT* pointer
7034
7058
n_ranges Number of ranges in the sequence (ignored)
7035
flags MRR flags (currently not used)
7059
flags MRR flags (currently not used)
7038
7062
Opaque value to be passed to quick_range_seq_next
7041
range_seq_t quick_range_seq_init(void *init_param,
7042
uint32_t n_ranges __attribute__((unused)),
7043
uint32_t flags __attribute__((unused)))
7065
range_seq_t quick_range_seq_init(void *init_param, uint32_t, uint32_t)
7045
7067
QUICK_RANGE_SELECT *quick= (QUICK_RANGE_SELECT*)init_param;
7046
7068
quick->qr_traversal_ctx.first= (QUICK_RANGE**)quick->ranges.buffer;
7329
7350
for now, this seems to work right at least.
7332
QUICK_SELECT_DESC::QUICK_SELECT_DESC(QUICK_RANGE_SELECT *q,
7333
uint32_t used_key_parts_arg __attribute__((unused)),
7334
bool *create_err __attribute__((unused)))
7353
QUICK_SELECT_DESC::QUICK_SELECT_DESC(QUICK_RANGE_SELECT *q, uint32_t, bool *)
7335
7354
:QUICK_RANGE_SELECT(*q), rev_it(rev_ranges)
7337
7356
QUICK_RANGE *r;
7437
7456
Compare if found key is over max-value
7438
7457
Returns 0 if key <= range->max_key
7439
TODO: Figure out why can't this function be as simple as cmp_prev().
7458
TODO: Figure out why can't this function be as simple as cmp_prev().
7442
7461
int QUICK_RANGE_SELECT::cmp_next(QUICK_RANGE *range_arg)
7686
7705
static bool get_constant_key_infix(KEY *index_info, SEL_ARG *index_range_tree,
7687
7706
KEY_PART_INFO *first_non_group_part,
7688
7707
KEY_PART_INFO *min_max_arg_part,
7689
KEY_PART_INFO *last_part, THD *thd,
7708
KEY_PART_INFO *last_part, Session *session,
7690
7709
unsigned char *key_infix, uint32_t *key_infix_len,
7691
7710
KEY_PART_INFO **first_non_infix_part);
7729
7748
- NGA = QA - (GA union C) = {NG_1, ..., NG_m} - the ones not in
7730
7749
GROUP BY and not referenced by MIN/MAX functions.
7731
7750
with the following properties specified below.
7732
B3. If Q has a GROUP BY WITH ROLLUP clause the access method is not
7751
B3. If Q has a GROUP BY WITH ROLLUP clause the access method is not
7735
7754
SA1. There is at most one attribute in SA referenced by any number of
7817
7836
other field as in: "select min(a) from t1 group by a" ?
7818
7837
- We assume that the general correctness of the GROUP-BY query was checked
7819
7838
before this point. Is this correct, or do we have to check it completely?
7820
- Lift the limitation in condition (B3), that is, make this access method
7839
- Lift the limitation in condition (B3), that is, make this access method
7821
7840
applicable to ROLLUP queries.
7832
7851
static TRP_GROUP_MIN_MAX *
7833
7852
get_best_group_min_max(PARAM *param, SEL_TREE *tree)
7835
THD *thd= param->thd;
7836
JOIN *join= thd->lex->current_select->join;
7854
Session *session= param->session;
7855
JOIN *join= session->lex->current_select->join;
7837
7856
Table *table= param->table;
7838
7857
bool have_min= false; /* true if there is a MIN function. */
7839
7858
bool have_max= false; /* true if there is a MAX function. */
8095
8114
if (!get_constant_key_infix(cur_index_info, index_range_tree,
8096
8115
first_non_group_part, min_max_arg_part,
8097
last_part, thd, key_infix, &key_infix_len,
8116
last_part, session, key_infix, &key_infix_len,
8098
8117
&first_non_infix_part))
8099
8118
goto next_index;
8162
8181
COST_VECT dummy_cost;
8163
8182
uint32_t mrr_flags= HA_MRR_USE_DEFAULT_IMPL;
8164
8183
uint32_t mrr_bufsize=0;
8165
cur_quick_prefix_records= check_quick_select(param, cur_param_idx,
8166
false /*don't care*/,
8184
cur_quick_prefix_records= check_quick_select(param, cur_param_idx,
8185
false /*don't care*/,
8167
8186
cur_index_tree, true,
8168
8187
&mrr_flags, &mrr_bufsize,
8292
8311
cur_arg= arguments[arg_idx]->real_item();
8293
8312
if (cur_arg->type() == Item::FIELD_ITEM)
8295
if (min_max_arg_item->eq(cur_arg, 1))
8314
if (min_max_arg_item->eq(cur_arg, 1))
8298
8317
If pred references the MIN/MAX argument, check whether pred is a range
8367
8386
first_non_group_part [in] First index part after group attribute parts
8368
8387
min_max_arg_part [in] The keypart of the MIN/MAX argument if any
8369
8388
last_part [in] Last keypart of the index
8370
thd [in] Current thread
8389
session [in] Current thread
8371
8390
key_infix [out] Infix of constants to be used for index lookup
8372
8391
key_infix_len [out] Lenghth of the infix
8373
8392
first_non_infix_part [out] The first keypart after the infix (if any)
8376
8395
Test conditions (NGA1, NGA2) from get_best_group_min_max(). Namely,
8377
8396
for each keypart field NGF_i not in GROUP-BY, check that there is a
8392
get_constant_key_infix(KEY *index_info __attribute__((unused)),
8393
SEL_ARG *index_range_tree,
8411
get_constant_key_infix(KEY *, SEL_ARG *index_range_tree,
8394
8412
KEY_PART_INFO *first_non_group_part,
8395
8413
KEY_PART_INFO *min_max_arg_part,
8396
8414
KEY_PART_INFO *last_part,
8397
THD *thd __attribute__((unused)),
8398
unsigned char *key_infix, uint32_t *key_infix_len,
8415
Session *, unsigned char *key_infix, uint32_t *key_infix_len,
8399
8416
KEY_PART_INFO **first_non_infix_part)
8401
8418
SEL_ARG *cur_range;
8589
8606
void cost_group_min_max(Table* table, KEY *index_info, uint32_t used_key_parts,
8590
8607
uint32_t group_key_parts, SEL_TREE *range_tree,
8591
SEL_ARG *index_tree __attribute__((unused)),
8592
ha_rows quick_prefix_records,
8608
SEL_ARG *, ha_rows quick_prefix_records,
8593
8609
bool have_min, bool have_max,
8594
8610
double *read_cost, ha_rows *records)
8686
8702
QUICK_SELECT_I *
8687
TRP_GROUP_MIN_MAX::make_quick(PARAM *param,
8688
bool retrieve_full_rows __attribute__((unused)),
8689
MEM_ROOT *parent_alloc)
8703
TRP_GROUP_MIN_MAX::make_quick(PARAM *param, bool, MEM_ROOT *parent_alloc)
8691
8705
QUICK_GROUP_MIN_MAX_SELECT *quick;
8693
8707
quick= new QUICK_GROUP_MIN_MAX_SELECT(param->table,
8694
param->thd->lex->current_select->join,
8708
param->session->lex->current_select->join,
8695
8709
have_min, have_max, min_max_arg_part,
8696
8710
group_prefix_len, group_key_parts,
8697
8711
used_key_parts, index_info, index,
8819
8833
assert(!parent_alloc);
8820
8834
if (!parent_alloc)
8822
init_sql_alloc(&alloc, join->thd->variables.range_alloc_block_size, 0);
8823
join->thd->mem_root= &alloc;
8836
init_sql_alloc(&alloc, join->session->variables.range_alloc_block_size, 0);
8837
join->session->mem_root= &alloc;
8826
8840
memset(&alloc, 0, sizeof(MEM_ROOT)); // ensure that it's not used
8996
9010
quick_prefix_select is made over the conditions on the whole key.
8997
It defines a number of ranges of length x.
8998
However when jumping through the prefixes we use only the the first
9011
It defines a number of ranges of length x.
9012
However when jumping through the prefixes we use only the the first
8999
9013
few most significant keyparts in the range key. However if there
9000
are more keyparts to follow the ones we are using we must make the
9001
condition on the key inclusive (because x < "ab" means
9014
are more keyparts to follow the ones we are using we must make the
9015
condition on the key inclusive (because x < "ab" means
9002
9016
x[0] < 'a' OR (x[0] == 'a' AND x[1] < 'b').
9003
9017
To achive the above we must turn off the NEAR_MIN/NEAR_MAX
9510
9527
if ( !(cur_range->flag & NO_MAX_RANGE) )
9512
9529
/* Compose the MAX key for the range. */
9513
unsigned char *max_key= (unsigned char*) my_alloca(real_prefix_len + min_max_arg_len);
9514
memcpy(max_key, group_prefix, real_prefix_len);
9515
memcpy(max_key + real_prefix_len, cur_range->max_key,
9516
cur_range->max_length);
9531
max_key.append(group_prefix, real_prefix_len);
9532
max_key.append(cur_range->max_key, cur_range->max_length);
9517
9533
/* Compare the found key with max_key. */
9518
int cmp_res= key_cmp(index_info->key_part, max_key,
9534
int cmp_res= key_cmp(index_info->key_part,
9519
9536
real_prefix_len + min_max_arg_len);
9520
9537
if ((!((cur_range->flag & NEAR_MAX) && (cmp_res == -1)) || (cmp_res <= 0)))
9627
9646
if ( !(cur_range->flag & NO_MIN_RANGE) )
9629
9648
/* Compose the MIN key for the range. */
9630
unsigned char *min_key= (unsigned char*) my_alloca(real_prefix_len + min_max_arg_len);
9631
memcpy(min_key, group_prefix, real_prefix_len);
9632
memcpy(min_key + real_prefix_len, cur_range->min_key,
9633
cur_range->min_length);
9650
min_key.append(group_prefix, real_prefix_len);
9651
min_key.append(cur_range->min_key, cur_range->min_length);
9634
9653
/* Compare the found key with min_key. */
9635
int cmp_res= key_cmp(index_info->key_part, min_key,
9654
int cmp_res= key_cmp(index_info->key_part,
9636
9656
real_prefix_len + min_max_arg_len);
9637
9657
if ((!((cur_range->flag & NEAR_MIN) && (cmp_res == 1)) ||
9638
9658
(cmp_res >= 0)))
9735
9755
used_lengths->append(buf, length);
9738
static void print_sel_tree(PARAM *param, SEL_TREE *tree, key_map *tree_map,
9739
const char *msg __attribute__((unused)))
9758
static void print_sel_tree(PARAM *param, SEL_TREE *tree, key_map *tree_map, const char *)
9741
9760
SEL_ARG **key,**end;
9766
9785
static void print_ror_scans_arr(Table *table,
9767
const char *msg __attribute__((unused)),
9768
struct st_ror_scan_info **start,
9786
const char *, struct st_ror_scan_info **start,
9769
9787
struct st_ror_scan_info **end)
9771
9789
char buff[1024];