244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
1 |
/* Copyright (C) 2000-2002 MySQL AB
|
2 |
Copyright (C) 2008 eBay, Inc
|
|
3 |
||
4 |
This program is free software; you can redistribute it and/or modify
|
|
5 |
it under the terms of the GNU General Public License as published by
|
|
6 |
the Free Software Foundation; version 2 of the License.
|
|
7 |
||
8 |
This program is distributed in the hope that it will be useful,
|
|
9 |
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
10 |
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
11 |
GNU General Public License for more details.
|
|
12 |
||
13 |
You should have received a copy of the GNU General Public License
|
|
14 |
along with this program; if not, write to the Free Software
|
|
15 |
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
|
|
16 |
||
17 |
/* Implements various base dataspace-related functions - allocate, free, clear */
|
|
18 |
||
19 |
#include "heapdef.h" |
|
20 |
||
21 |
||
22 |
/*
|
|
23 |
MySQL Heap tables keep data in arrays of fixed-size chunks.
|
|
24 |
These chunks are organized into two groups of HP_BLOCK structures:
|
|
25 |
- group1 contains indexes, with one HP_BLOCK per key
|
|
26 |
(part of HP_KEYDEF)
|
|
27 |
- group2 contains record data, with single HP_BLOCK
|
|
28 |
for all records, referenced by HP_SHARE.recordspace.block
|
|
29 |
||
30 |
While columns used in index are usually small, other columns
|
|
31 |
in the table may need to accomodate larger data. Typically,
|
|
32 |
larger data is placed into VARCHAR or BLOB columns. With actual
|
|
33 |
sizes varying, Heap Engine has to support variable-sized records
|
|
34 |
in memory. Heap Engine implements the concept of dataspace
|
|
35 |
(HP_DATASPACE), which incorporates HP_BLOCK for the record data,
|
|
36 |
and adds more information for managing variable-sized records.
|
|
37 |
||
38 |
Variable-size records are stored in multiple "chunks",
|
|
39 |
which means that a single record of data (database "row") can
|
|
40 |
consist of multiple chunks organized into one "set". HP_BLOCK
|
|
41 |
contains chunks. In variable-size format, one record
|
|
42 |
is represented as one or many chunks, depending on the actual
|
|
43 |
data, while in fixed-size mode, one record is always represented
|
|
44 |
as one chunk. The index structures would always point to the first
|
|
45 |
chunk in the chunkset.
|
|
46 |
||
47 |
At the time of table creation, Heap Engine attempts to find out
|
|
48 |
if variable-size records are desired. A user can request
|
|
49 |
variable-size records by providing either row_type=dynamic or
|
|
50 |
block_size=NNN table create option. Heap Engine will check
|
|
51 |
whether block_size provides enough space in the first chunk
|
|
52 |
to keep all null bits and columns that are used in indexes.
|
|
53 |
If block_size is too small, table creation will be aborted
|
|
54 |
with an error. Heap Engine will revert to fixed-size allocation
|
|
55 |
mode if block_size provides no memory benefits (no VARCHAR
|
|
56 |
fields extending past first chunk).
|
|
57 |
||
58 |
In order to improve index search performance, Heap Engine needs
|
|
59 |
to keep all null flags and all columns used as keys inside
|
|
60 |
the first chunk of a chunkset. In particular, this means that
|
|
61 |
all columns used as keys should be defined first in the table
|
|
62 |
creation SQL. The length of data used by null bits and key columns
|
|
63 |
is stored as fixed_data_length inside HP_SHARE. fixed_data_length
|
|
64 |
will extend past last key column if more fixed-length fields can
|
|
65 |
fit into the first chunk.
|
|
66 |
||
67 |
Variable-size records are necessary only in the presence
|
|
68 |
of variable-size columns. Heap Engine will be looking for VARCHAR
|
|
69 |
columns, which declare length of 32 or more. If no such columns
|
|
70 |
are found, table will be switched to fixed-size format. You should
|
|
71 |
always try to put such columns at the end of the table definition.
|
|
72 |
||
73 |
Whenever data is being inserted or updated in the table
|
|
74 |
Heap Engine will calculate how many chunks are necessary.
|
|
75 |
For insert operations, Heap Engine allocates new chunkset in
|
|
76 |
the recordspace. For update operations it will modify length of
|
|
77 |
the existing chunkset, unlinking unnecessary chunks at the end,
|
|
78 |
or allocating and adding more if larger length is necessary.
|
|
79 |
||
80 |
When writing data to chunks or copying data back to record,
|
|
81 |
Heap Engine will first copy fixed_data_length of data using single
|
|
82 |
memcpy call. The rest of the columns are processed one-by-one.
|
|
83 |
Non-VARCHAR columns are copied in their full format. VARCHAR's
|
|
84 |
are copied based on their actual length. Any NULL values after
|
|
85 |
fixed_data_length are skipped.
|
|
86 |
||
87 |
The allocation and contents of the actual chunks varies between
|
|
88 |
fixed and variable-size modes. Total chunk length is always
|
|
481
by Brian Aker
Remove all of uchar. |
89 |
aligned to the next sizeof(unsigned char*). Here is the format of
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
90 |
fixed-size chunk:
|
481
by Brian Aker
Remove all of uchar. |
91 |
unsigned char[] - sizeof=chunk_dataspace_length, but at least
|
92 |
sizeof(unsigned char*) bytes. Keeps actual data or pointer
|
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
93 |
to the next deleted chunk.
|
94 |
chunk_dataspace_length equals to full record length
|
|
481
by Brian Aker
Remove all of uchar. |
95 |
unsigned char - status field (1 means "in use", 0 means "deleted")
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
96 |
Variable-size uses different format:
|
481
by Brian Aker
Remove all of uchar. |
97 |
unsigned char[] - sizeof=chunk_dataspace_length, but at least
|
98 |
sizeof(unsigned char*) bytes. Keeps actual data or pointer
|
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
99 |
to the next deleted chunk.
|
100 |
chunk_dataspace_length is set according to table
|
|
101 |
setup (block_size)
|
|
481
by Brian Aker
Remove all of uchar. |
102 |
unsigned char* - pointer to the next chunk in this chunkset,
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
103 |
or NULL for the last chunk
|
481
by Brian Aker
Remove all of uchar. |
104 |
unsigned char - status field (1 means "first", 0 means "deleted",
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
105 |
2 means "linked")
|
106 |
||
107 |
When allocating a new chunkset of N chunks, Heap Engine will try
|
|
108 |
to allocate chunks one-by-one, linking them as they become
|
|
109 |
allocated. Allocation of a single chunk will attempt to reuse
|
|
110 |
a deleted (freed) chunk. If no free chunks are available,
|
|
111 |
it will attempt to allocate a new area inside HP_BLOCK.
|
|
112 |
Freeing chunks will place them at the front of free list
|
|
113 |
referenced by del_link in HP_DATASPACE. The newly freed chunk
|
|
114 |
will contain reference to the previously freed chunk in its first
|
|
481
by Brian Aker
Remove all of uchar. |
115 |
sizeof(unsigned char*) of the payload space.
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
116 |
|
117 |
Here is open issues:
|
|
118 |
1. It is not very nice to require people to keep key columns
|
|
119 |
at the beginning of the table creation SQL. There are three
|
|
120 |
proposed resolutions:
|
|
121 |
a. Leave it as is. It's a reasonable limitation
|
|
122 |
b. Add new HA_KEEP_KEY_COLUMNS_TO_FRONT flag to handler.h and
|
|
123 |
make table.cpp align columns when it creates the table
|
|
124 |
c. Make HeapEngine reorder columns in the chunk data, so that
|
|
125 |
key columns go first. Add parallel HA_KEYSEG structures
|
|
126 |
to distinguish positions in record vs. positions in
|
|
127 |
the first chunk. Copy all data field-by-field rather than
|
|
128 |
using single memcpy unless DBA kept key columns to
|
|
129 |
the beginning.
|
|
130 |
2. heap_check_heap needs verify linked chunks, looking for
|
|
131 |
issues such as orphans, cycles, and bad links. However,
|
|
132 |
Heap Engine today does not do similar things even for
|
|
133 |
free list.
|
|
134 |
3. With new HP_DATASPACE allocation mechaism, BLOB will become
|
|
135 |
increasingly simple to implement, but I may not have time
|
|
136 |
for that. In one approach, BLOB data can be placed at
|
|
137 |
the end of the same record. In another approach (which I
|
|
138 |
prefer) BLOB data would have its own HP_DATASPACE with
|
|
139 |
variable-size entries.
|
|
140 |
4. In a more sophisticated implementation, some space can
|
|
141 |
be saved even with all fixed-size columns if many of them
|
|
142 |
have NULL value, as long as these columns are not used
|
|
143 |
in indexes
|
|
144 |
5. In variable-size format status should be moved to lower
|
|
145 |
bits of the "next" pointer. Pointer is always aligned
|
|
481
by Brian Aker
Remove all of uchar. |
146 |
to sizeof(unsigned char*), which is at least 4, leaving 2 lower
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
147 |
bits free. This will save 8 bytes per chunk
|
148 |
on 64-bit platform.
|
|
149 |
6. As we do not want to modify FRM format, BLOCK_SIZE option
|
|
150 |
of "CREATE TABLE" is saved as "RAID_CHUNKSIZE" for
|
|
151 |
Heap Engine tables.
|
|
152 |
*/
|
|
153 |
||
481
by Brian Aker
Remove all of uchar. |
154 |
static unsigned char *hp_allocate_one_chunk(HP_DATASPACE *info); |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
155 |
|
156 |
||
157 |
/**
|
|
158 |
Clear a dataspace
|
|
159 |
||
160 |
Frees memory and zeros-out any relevant counters in the dataspace
|
|
161 |
||
162 |
@param info the dataspace to clear
|
|
163 |
*/
|
|
164 |
||
165 |
void hp_clear_dataspace(HP_DATASPACE *info) |
|
166 |
{
|
|
167 |
if (info->block.levels) |
|
168 |
{
|
|
398.1.10
by Monty Taylor
Actually removed VOID() this time. |
169 |
hp_free_level(&info->block,info->block.levels,info->block.root, |
481
by Brian Aker
Remove all of uchar. |
170 |
(unsigned char*) 0); |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
171 |
}
|
172 |
info->block.levels=0; |
|
173 |
info->del_chunk_count= info->chunk_count= 0; |
|
174 |
info->del_link=0; |
|
175 |
info->total_data_length= 0; |
|
176 |
}
|
|
177 |
||
178 |
||
179 |
/**
|
|
180 |
Allocate or reallocate a chunkset in the dataspace
|
|
181 |
||
182 |
Attempts to allocate a new chunkset or change the size of an existing chunkset
|
|
183 |
||
184 |
@param info the hosting dataspace
|
|
185 |
@param chunk_count the number of chunks that we expect as the result
|
|
186 |
@param existing_set non-null value asks function to resize existing chunkset,
|
|
187 |
return value would point to this set
|
|
188 |
||
189 |
@return Pointer to the first chunk in the new or updated chunkset, or NULL if unsuccessful
|
|
190 |
*/
|
|
191 |
||
481
by Brian Aker
Remove all of uchar. |
192 |
static unsigned char *hp_allocate_variable_chunkset(HP_DATASPACE *info, |
482
by Brian Aker
Remove uint. |
193 |
uint32_t chunk_count, unsigned char* existing_set) |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
194 |
{
|
195 |
int alloc_count= chunk_count, i; |
|
481
by Brian Aker
Remove all of uchar. |
196 |
unsigned char *first_chunk= 0, *curr_chunk= 0, *prev_chunk= 0, *last_existing_chunk= 0; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
197 |
|
198 |
assert(alloc_count); |
|
199 |
||
200 |
if (existing_set) |
|
201 |
{
|
|
202 |
first_chunk= existing_set; |
|
203 |
||
204 |
curr_chunk= existing_set; |
|
205 |
while (curr_chunk && alloc_count) |
|
206 |
{
|
|
207 |
prev_chunk= curr_chunk; |
|
481
by Brian Aker
Remove all of uchar. |
208 |
curr_chunk= *((unsigned char**)(curr_chunk + info->offset_link)); |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
209 |
alloc_count--; |
210 |
}
|
|
211 |
||
212 |
if (!alloc_count) |
|
213 |
{
|
|
214 |
if (curr_chunk) |
|
215 |
{
|
|
216 |
/* We came through all chunks and there is more left, let's truncate the list */
|
|
481
by Brian Aker
Remove all of uchar. |
217 |
*((unsigned char**)(prev_chunk + info->offset_link)) = NULL; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
218 |
hp_free_chunks(info, curr_chunk); |
219 |
}
|
|
220 |
||
221 |
return first_chunk; |
|
222 |
}
|
|
223 |
||
224 |
last_existing_chunk = prev_chunk; |
|
225 |
}
|
|
226 |
||
227 |
/* We can reach this point only if we're allocating new chunkset or more chunks in existing set */
|
|
228 |
||
229 |
for (i=0; i<alloc_count; i++) |
|
230 |
{
|
|
231 |
curr_chunk= hp_allocate_one_chunk(info); |
|
232 |
if (!curr_chunk) |
|
233 |
{
|
|
234 |
/* no space in the current block */
|
|
235 |
||
236 |
if (last_existing_chunk) |
|
237 |
{
|
|
238 |
/* Truncate whatever was added at the end of the existing chunkset */
|
|
239 |
prev_chunk= last_existing_chunk; |
|
481
by Brian Aker
Remove all of uchar. |
240 |
curr_chunk= *((unsigned char**)(prev_chunk + info->offset_link)); |
241 |
*((unsigned char**)(prev_chunk + info->offset_link)) = NULL; |
|
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
242 |
hp_free_chunks(info, curr_chunk); |
243 |
}
|
|
244 |
else if (first_chunk) |
|
245 |
{
|
|
246 |
/* free any chunks previously allocated */
|
|
247 |
hp_free_chunks(info, first_chunk); |
|
248 |
}
|
|
249 |
||
250 |
return NULL; |
|
251 |
}
|
|
252 |
||
253 |
/* mark as if this chunk is last in the chunkset */
|
|
481
by Brian Aker
Remove all of uchar. |
254 |
*((unsigned char**) (curr_chunk + info->offset_link))= 0; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
255 |
|
256 |
if (prev_chunk) |
|
257 |
{
|
|
258 |
/* tie them into a linked list */
|
|
481
by Brian Aker
Remove all of uchar. |
259 |
*((unsigned char**) (prev_chunk + info->offset_link))= curr_chunk; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
260 |
curr_chunk[info->offset_status]= CHUNK_STATUS_LINKED; /* Record linked from active */ |
261 |
}
|
|
262 |
else
|
|
263 |
{
|
|
264 |
curr_chunk[info->offset_status]= CHUNK_STATUS_ACTIVE; /* Record active */ |
|
265 |
}
|
|
266 |
||
267 |
if (!first_chunk) |
|
268 |
{
|
|
269 |
first_chunk= curr_chunk; |
|
270 |
}
|
|
271 |
||
272 |
prev_chunk= curr_chunk; |
|
273 |
}
|
|
274 |
||
275 |
return first_chunk; |
|
276 |
}
|
|
277 |
||
278 |
||
279 |
/**
|
|
280 |
Allocate a new chunkset in the dataspace
|
|
281 |
||
282 |
Attempts to allocate a new chunkset
|
|
283 |
||
284 |
@param info the hosting dataspace
|
|
285 |
@param chunk_count the number of chunks that we expect as the result
|
|
286 |
||
287 |
@return Pointer to the first chunk in the new or updated chunkset, or NULL if unsuccessful
|
|
288 |
*/
|
|
289 |
||
482
by Brian Aker
Remove uint. |
290 |
unsigned char *hp_allocate_chunkset(HP_DATASPACE *info, uint32_t chunk_count) |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
291 |
{
|
481
by Brian Aker
Remove all of uchar. |
292 |
unsigned char* result; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
293 |
|
294 |
||
295 |
if (info->is_variable_size) |
|
296 |
{
|
|
297 |
result = hp_allocate_variable_chunkset(info, chunk_count, NULL); |
|
298 |
}
|
|
299 |
else
|
|
300 |
{
|
|
301 |
result= hp_allocate_one_chunk(info); |
|
302 |
if (result) |
|
303 |
{
|
|
304 |
result[info->offset_status]= CHUNK_STATUS_ACTIVE; |
|
305 |
}
|
|
306 |
||
307 |
return(result); |
|
308 |
}
|
|
309 |
||
310 |
return(result); |
|
311 |
}
|
|
312 |
||
313 |
||
314 |
/**
|
|
315 |
Reallocate an existing chunkset in the dataspace
|
|
316 |
||
317 |
Attempts to change the size of an existing chunkset
|
|
318 |
||
319 |
@param info the hosting dataspace
|
|
320 |
@param chunk_count the number of chunks that we expect as the result
|
|
321 |
@param pos pointer to the existing chunkset
|
|
322 |
||
323 |
@return Error code or zero if successful
|
|
324 |
*/
|
|
325 |
||
482
by Brian Aker
Remove uint. |
326 |
int hp_reallocate_chunkset(HP_DATASPACE *info, uint32_t chunk_count, unsigned char* pos) |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
327 |
{
|
328 |
||
329 |
if (!info->is_variable_size) |
|
330 |
{
|
|
331 |
/* Update should never change chunk_count in fixed-size mode */
|
|
332 |
my_errno=HA_ERR_WRONG_COMMAND; |
|
333 |
return my_errno; |
|
334 |
}
|
|
335 |
||
336 |
/* Reallocate never moves the first chunk */
|
|
337 |
if (!hp_allocate_variable_chunkset(info, chunk_count, pos)) |
|
338 |
return(my_errno); |
|
339 |
||
340 |
return(0); |
|
341 |
}
|
|
342 |
||
343 |
||
344 |
/**
|
|
345 |
Allocate a single chunk in the dataspace
|
|
346 |
||
347 |
Attempts to allocate a new chunk or reuse one from deleted list
|
|
348 |
||
349 |
@param info the hosting dataspace
|
|
350 |
||
351 |
@return Pointer to the chunk, or NULL if unsuccessful
|
|
352 |
*/
|
|
353 |
||
481
by Brian Aker
Remove all of uchar. |
354 |
static unsigned char *hp_allocate_one_chunk(HP_DATASPACE *info) |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
355 |
{
|
481
by Brian Aker
Remove all of uchar. |
356 |
unsigned char* curr_chunk; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
357 |
size_t length, block_pos; |
358 |
||
359 |
if (info->del_link) |
|
360 |
{
|
|
361 |
curr_chunk=info->del_link; |
|
481
by Brian Aker
Remove all of uchar. |
362 |
info->del_link= *((unsigned char**) curr_chunk); |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
363 |
info->del_chunk_count--; |
364 |
||
365 |
return curr_chunk; |
|
366 |
}
|
|
367 |
||
368 |
block_pos= (info->chunk_count % info->block.records_in_block); |
|
369 |
if (!block_pos) |
|
370 |
{
|
|
371 |
if (hp_get_new_block(&info->block,&length)) |
|
372 |
{
|
|
373 |
/* no space in the current block */
|
|
374 |
return NULL; |
|
375 |
}
|
|
376 |
||
377 |
info->total_data_length+= length; |
|
378 |
}
|
|
379 |
||
380 |
info->chunk_count++; |
|
481
by Brian Aker
Remove all of uchar. |
381 |
curr_chunk= ((unsigned char*) info->block.level_info[0].last_blocks + |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
382 |
block_pos * info->block.recbuffer); |
383 |
||
384 |
||
385 |
return curr_chunk; |
|
386 |
}
|
|
387 |
||
388 |
||
389 |
/**
|
|
390 |
Free a list of chunks
|
|
391 |
||
392 |
Reclaims all chunks linked by the pointer,
|
|
393 |
which could be the whole chunkset or a part of an existing chunkset
|
|
394 |
||
395 |
@param info the hosting dataspace
|
|
396 |
@param pos pointer to the head of the chunkset
|
|
397 |
*/
|
|
398 |
||
481
by Brian Aker
Remove all of uchar. |
399 |
void hp_free_chunks(HP_DATASPACE *info, unsigned char *pos) |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
400 |
{
|
481
by Brian Aker
Remove all of uchar. |
401 |
unsigned char* curr_chunk= pos; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
402 |
|
403 |
while (curr_chunk) { |
|
404 |
info->del_chunk_count++; |
|
481
by Brian Aker
Remove all of uchar. |
405 |
*((unsigned char**) curr_chunk)= info->del_link; |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
406 |
info->del_link= curr_chunk; |
407 |
||
408 |
curr_chunk[info->offset_status]= CHUNK_STATUS_DELETED; |
|
409 |
||
410 |
||
411 |
if (!info->is_variable_size) |
|
412 |
{
|
|
413 |
break; |
|
414 |
}
|
|
415 |
||
416 |
/* Delete next chunk in this chunkset */
|
|
481
by Brian Aker
Remove all of uchar. |
417 |
curr_chunk= *((unsigned char**)(curr_chunk + info->offset_link)); |
244.1.1
by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns. |
418 |
}
|
419 |
}
|