~drizzle-trunk/drizzle/development

244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
1
/* Copyright (C) 2000-2002 MySQL AB
2
   Copyright (C) 2008 eBay, Inc
3
4
   This program is free software; you can redistribute it and/or modify
5
   it under the terms of the GNU General Public License as published by
6
   the Free Software Foundation; version 2 of the License.
7
8
   This program is distributed in the hope that it will be useful,
9
   but WITHOUT ANY WARRANTY; without even the implied warranty of
10
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
11
   GNU General Public License for more details.
12
13
   You should have received a copy of the GNU General Public License
14
   along with this program; if not, write to the Free Software
15
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
16
17
/* Implements various base dataspace-related functions - allocate, free, clear */
18
19
#include "heapdef.h"
20
21
22
/*
23
  MySQL Heap tables keep data in arrays of fixed-size chunks.
24
  These chunks are organized into two groups of HP_BLOCK structures:
25
    - group1 contains indexes, with one HP_BLOCK per key
26
      (part of HP_KEYDEF)
27
    - group2 contains record data, with single HP_BLOCK
28
      for all records, referenced by HP_SHARE.recordspace.block
29
30
  While columns used in index are usually small, other columns
31
  in the table may need to accomodate larger data. Typically,
32
  larger data is placed into VARCHAR or BLOB columns. With actual
33
  sizes varying, Heap Engine has to support variable-sized records
34
  in memory. Heap Engine implements the concept of dataspace
35
  (HP_DATASPACE), which incorporates HP_BLOCK for the record data,
36
  and adds more information for managing variable-sized records.
37
38
  Variable-size records are stored in multiple "chunks",
39
  which means that a single record of data (database "row") can
40
  consist of multiple chunks organized into one "set". HP_BLOCK
41
  contains chunks. In variable-size format, one record
42
  is represented as one or many chunks, depending on the actual
43
  data, while in fixed-size mode, one record is always represented
44
  as one chunk. The index structures would always point to the first
45
  chunk in the chunkset.
46
47
  At the time of table creation, Heap Engine attempts to find out
48
  if variable-size records are desired. A user can request
49
  variable-size records by providing either row_type=dynamic or
50
  block_size=NNN table create option. Heap Engine will check
51
  whether block_size provides enough space in the first chunk
52
  to keep all null bits and columns that are used in indexes.
53
  If block_size is too small, table creation will be aborted
54
  with an error. Heap Engine will revert to fixed-size allocation
55
  mode if block_size provides no memory benefits (no VARCHAR
56
  fields extending past first chunk).
57
58
  In order to improve index search performance, Heap Engine needs
59
  to keep all null flags and all columns used as keys inside
60
  the first chunk of a chunkset. In particular, this means that
61
  all columns used as keys should be defined first in the table
62
  creation SQL. The length of data used by null bits and key columns
63
  is stored as fixed_data_length inside HP_SHARE. fixed_data_length
64
  will extend past last key column if more fixed-length fields can
65
  fit into the first chunk.
66
67
  Variable-size records are necessary only in the presence
68
  of variable-size columns. Heap Engine will be looking for VARCHAR
69
  columns, which declare length of 32 or more. If no such columns
70
  are found, table will be switched to fixed-size format. You should
71
  always try to put such columns at the end of the table definition.
72
73
  Whenever data is being inserted or updated in the table
74
  Heap Engine will calculate how many chunks are necessary.
75
  For insert operations, Heap Engine allocates new chunkset in
76
  the recordspace. For update operations it will modify length of
77
  the existing chunkset, unlinking unnecessary chunks at the end,
78
  or allocating and adding more if larger length is necessary.
79
80
  When writing data to chunks or copying data back to record,
81
  Heap Engine will first copy fixed_data_length of data using single
82
  memcpy call. The rest of the columns are processed one-by-one.
83
  Non-VARCHAR columns are copied in their full format. VARCHAR's
84
  are copied based on their actual length. Any NULL values after
85
  fixed_data_length are skipped.
86
87
  The allocation and contents of the actual chunks varies between
88
  fixed and variable-size modes. Total chunk length is always
481 by Brian Aker
Remove all of uchar.
89
  aligned to the next sizeof(unsigned char*). Here is the format of
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
90
  fixed-size chunk:
481 by Brian Aker
Remove all of uchar.
91
      unsigned char[] - sizeof=chunk_dataspace_length, but at least
92
               sizeof(unsigned char*) bytes. Keeps actual data or pointer
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
93
               to the next deleted chunk.
94
               chunk_dataspace_length equals to full record length
481 by Brian Aker
Remove all of uchar.
95
      unsigned char   - status field (1 means "in use", 0 means "deleted")
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
96
  Variable-size uses different format:
481 by Brian Aker
Remove all of uchar.
97
      unsigned char[] - sizeof=chunk_dataspace_length, but at least
98
               sizeof(unsigned char*) bytes. Keeps actual data or pointer
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
99
               to the next deleted chunk.
100
               chunk_dataspace_length is set according to table
101
               setup (block_size)
481 by Brian Aker
Remove all of uchar.
102
      unsigned char*  - pointer to the next chunk in this chunkset,
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
103
               or NULL for the last chunk
481 by Brian Aker
Remove all of uchar.
104
      unsigned char  -  status field (1 means "first", 0 means "deleted",
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
105
               2 means "linked")
106
107
  When allocating a new chunkset of N chunks, Heap Engine will try
108
  to allocate chunks one-by-one, linking them as they become
109
  allocated. Allocation of a single chunk will attempt to reuse
110
  a deleted (freed) chunk. If no free chunks are available,
111
  it will attempt to allocate a new area inside HP_BLOCK.
112
  Freeing chunks will place them at the front of free list
113
  referenced by del_link in HP_DATASPACE. The newly freed chunk
114
  will contain reference to the previously freed chunk in its first
481 by Brian Aker
Remove all of uchar.
115
  sizeof(unsigned char*) of the payload space.
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
116
117
  Here is open issues:
118
    1. It is not very nice to require people to keep key columns
119
       at the beginning of the table creation SQL. There are three
120
       proposed resolutions:
121
       a. Leave it as is. It's a reasonable limitation
122
       b. Add new HA_KEEP_KEY_COLUMNS_TO_FRONT flag to handler.h and
123
          make table.cpp align columns when it creates the table
124
       c. Make HeapEngine reorder columns in the chunk data, so that
125
          key columns go first. Add parallel HA_KEYSEG structures
126
          to distinguish positions in record vs. positions in
127
          the first chunk. Copy all data field-by-field rather than
128
          using single memcpy unless DBA kept key columns to
129
          the beginning.
130
    2. heap_check_heap needs verify linked chunks, looking for
131
       issues such as orphans, cycles, and bad links. However,
132
       Heap Engine today does not do similar things even for
133
       free list.
134
    3. With new HP_DATASPACE allocation mechaism, BLOB will become
135
       increasingly simple to implement, but I may not have time
136
       for that. In one approach, BLOB data can be placed at
137
       the end of the same record. In another approach (which I
138
       prefer) BLOB data would have its own HP_DATASPACE with
139
       variable-size entries.
140
    4. In a more sophisticated implementation, some space can
141
       be saved even with all fixed-size columns if many of them
142
       have NULL value, as long as these columns are not used
143
       in indexes
144
    5. In variable-size format status should be moved to lower
145
       bits of the "next" pointer. Pointer is always aligned
481 by Brian Aker
Remove all of uchar.
146
       to sizeof(unsigned char*), which is at least 4, leaving 2 lower
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
147
       bits free. This will save 8 bytes per chunk
148
       on 64-bit platform.
149
    6. As we do not want to modify FRM format, BLOCK_SIZE option
150
       of "CREATE TABLE" is saved as "RAID_CHUNKSIZE" for
151
       Heap Engine tables.
152
*/
153
481 by Brian Aker
Remove all of uchar.
154
static unsigned char *hp_allocate_one_chunk(HP_DATASPACE *info);
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
155
156
157
/**
158
  Clear a dataspace
159
160
  Frees memory and zeros-out any relevant counters in the dataspace
161
162
  @param  info  the dataspace to clear
163
*/
164
165
void hp_clear_dataspace(HP_DATASPACE *info)
166
{
167
  if (info->block.levels)
168
  {
398.1.10 by Monty Taylor
Actually removed VOID() this time.
169
    hp_free_level(&info->block,info->block.levels,info->block.root,
481 by Brian Aker
Remove all of uchar.
170
                  (unsigned char*) 0);
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
171
  }
172
  info->block.levels=0;
173
  info->del_chunk_count= info->chunk_count= 0;
174
  info->del_link=0;
175
  info->total_data_length= 0;
176
}
177
178
179
/**
180
  Allocate or reallocate a chunkset in the dataspace
181
182
  Attempts to allocate a new chunkset or change the size of an existing chunkset
183
184
  @param  info            the hosting dataspace
185
  @param  chunk_count     the number of chunks that we expect as the result
186
  @param  existing_set    non-null value asks function to resize existing chunkset,
187
                          return value would point to this set
188
189
  @return  Pointer to the first chunk in the new or updated chunkset, or NULL if unsuccessful
190
*/
191
481 by Brian Aker
Remove all of uchar.
192
static unsigned char *hp_allocate_variable_chunkset(HP_DATASPACE *info,
482 by Brian Aker
Remove uint.
193
                                           uint32_t chunk_count, unsigned char* existing_set)
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
194
{
195
  int alloc_count= chunk_count, i;
481 by Brian Aker
Remove all of uchar.
196
  unsigned char *first_chunk= 0, *curr_chunk= 0, *prev_chunk= 0, *last_existing_chunk= 0;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
197
198
  assert(alloc_count);
199
200
  if (existing_set)
201
  {
202
    first_chunk= existing_set;
203
204
    curr_chunk= existing_set;
205
    while (curr_chunk && alloc_count)
206
    {
207
      prev_chunk= curr_chunk;
481 by Brian Aker
Remove all of uchar.
208
      curr_chunk= *((unsigned char**)(curr_chunk + info->offset_link));
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
209
      alloc_count--;
210
    }
211
212
    if (!alloc_count)
213
    {
214
      if (curr_chunk)
215
      {
216
        /* We came through all chunks and there is more left, let's truncate the list */
481 by Brian Aker
Remove all of uchar.
217
        *((unsigned char**)(prev_chunk + info->offset_link)) = NULL;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
218
        hp_free_chunks(info, curr_chunk);
219
      }
220
221
      return first_chunk;
222
    }
223
224
    last_existing_chunk = prev_chunk;
225
  }
226
227
  /* We can reach this point only if we're allocating new chunkset or more chunks in existing set */
228
229
  for (i=0; i<alloc_count; i++)
230
  {
231
      curr_chunk= hp_allocate_one_chunk(info);
232
      if (!curr_chunk)
233
      {
234
        /* no space in the current block */
235
236
        if (last_existing_chunk)
237
        {
238
          /* Truncate whatever was added at the end of the existing chunkset */
239
          prev_chunk= last_existing_chunk;
481 by Brian Aker
Remove all of uchar.
240
          curr_chunk= *((unsigned char**)(prev_chunk + info->offset_link));
241
          *((unsigned char**)(prev_chunk + info->offset_link)) = NULL;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
242
          hp_free_chunks(info, curr_chunk);
243
        }
244
        else if (first_chunk)
245
        {
246
          /* free any chunks previously allocated */
247
          hp_free_chunks(info, first_chunk);
248
        }
249
250
        return NULL;
251
      }
252
253
      /* mark as if this chunk is last in the chunkset */
481 by Brian Aker
Remove all of uchar.
254
      *((unsigned char**) (curr_chunk + info->offset_link))= 0;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
255
256
      if (prev_chunk)
257
      {
258
        /* tie them into a linked list */
481 by Brian Aker
Remove all of uchar.
259
        *((unsigned char**) (prev_chunk + info->offset_link))= curr_chunk;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
260
        curr_chunk[info->offset_status]= CHUNK_STATUS_LINKED;			/* Record linked from active */
261
      }
262
      else
263
      {
264
        curr_chunk[info->offset_status]= CHUNK_STATUS_ACTIVE;			  /* Record active */
265
      }
266
267
      if (!first_chunk)
268
      {
269
        first_chunk= curr_chunk;
270
      }
271
272
      prev_chunk= curr_chunk;
273
  }
274
275
  return first_chunk;
276
}
277
278
279
/**
280
  Allocate a new chunkset in the dataspace
281
282
  Attempts to allocate a new chunkset
283
284
  @param  info            the hosting dataspace
285
  @param  chunk_count     the number of chunks that we expect as the result
286
287
  @return  Pointer to the first chunk in the new or updated chunkset, or NULL if unsuccessful
288
*/
289
482 by Brian Aker
Remove uint.
290
unsigned char *hp_allocate_chunkset(HP_DATASPACE *info, uint32_t chunk_count)
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
291
{
481 by Brian Aker
Remove all of uchar.
292
  unsigned char* result;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
293
294
295
  if (info->is_variable_size)
296
  {
297
    result = hp_allocate_variable_chunkset(info, chunk_count, NULL);
298
  }
299
  else
300
  {
301
    result= hp_allocate_one_chunk(info);
302
    if (result)
303
    {
304
      result[info->offset_status]= CHUNK_STATUS_ACTIVE;
305
    }
306
307
    return(result);
308
  }
309
310
  return(result);
311
}
312
313
314
/**
315
  Reallocate an existing chunkset in the dataspace
316
317
  Attempts to change the size of an existing chunkset
318
319
  @param  info            the hosting dataspace
320
  @param  chunk_count     the number of chunks that we expect as the result
321
  @param  pos             pointer to the existing chunkset
322
323
  @return  Error code or zero if successful
324
*/
325
482 by Brian Aker
Remove uint.
326
int hp_reallocate_chunkset(HP_DATASPACE *info, uint32_t chunk_count, unsigned char* pos)
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
327
{
328
329
  if (!info->is_variable_size)
330
  {
331
    /* Update should never change chunk_count in fixed-size mode */
332
    my_errno=HA_ERR_WRONG_COMMAND;
333
    return my_errno;
334
  }
335
336
  /* Reallocate never moves the first chunk */
337
  if (!hp_allocate_variable_chunkset(info, chunk_count, pos))
338
    return(my_errno);
339
340
  return(0);
341
}
342
343
344
/**
345
  Allocate a single chunk in the dataspace
346
347
  Attempts to allocate a new chunk or reuse one from deleted list
348
349
  @param  info            the hosting dataspace
350
351
  @return  Pointer to the chunk, or NULL if unsuccessful
352
*/
353
481 by Brian Aker
Remove all of uchar.
354
static unsigned char *hp_allocate_one_chunk(HP_DATASPACE *info)
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
355
{
481 by Brian Aker
Remove all of uchar.
356
  unsigned char* curr_chunk;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
357
  size_t length, block_pos;
358
359
  if (info->del_link)
360
  {
361
    curr_chunk=info->del_link;
481 by Brian Aker
Remove all of uchar.
362
    info->del_link= *((unsigned char**) curr_chunk);
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
363
    info->del_chunk_count--;
364
365
    return curr_chunk;
366
  }
367
368
  block_pos= (info->chunk_count % info->block.records_in_block);
369
  if (!block_pos)
370
  {
371
    if (hp_get_new_block(&info->block,&length))
372
    {
373
      /* no space in the current block */
374
      return NULL;
375
    }
376
377
    info->total_data_length+= length;
378
  }
379
380
  info->chunk_count++;
481 by Brian Aker
Remove all of uchar.
381
  curr_chunk= ((unsigned char*) info->block.level_info[0].last_blocks +
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
382
    block_pos * info->block.recbuffer);
383
384
385
  return curr_chunk;
386
}
387
388
389
/**
390
  Free a list of chunks
391
392
  Reclaims all chunks linked by the pointer,
393
  which could be the whole chunkset or a part of an existing chunkset
394
395
  @param  info            the hosting dataspace
396
  @param  pos             pointer to the head of the chunkset
397
*/
398
481 by Brian Aker
Remove all of uchar.
399
void hp_free_chunks(HP_DATASPACE *info, unsigned char *pos)
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
400
{
481 by Brian Aker
Remove all of uchar.
401
  unsigned char* curr_chunk= pos;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
402
403
  while (curr_chunk) {
404
    info->del_chunk_count++;
481 by Brian Aker
Remove all of uchar.
405
    *((unsigned char**) curr_chunk)= info->del_link;
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
406
    info->del_link= curr_chunk;
407
408
    curr_chunk[info->offset_status]= CHUNK_STATUS_DELETED;
409
410
411
    if (!info->is_variable_size)
412
    {
413
      break;
414
    }
415
416
    /* Delete next chunk in this chunkset */
481 by Brian Aker
Remove all of uchar.
417
    curr_chunk= *((unsigned char**)(curr_chunk + info->offset_link));
244.1.1 by Harrison Fisk
Port Ebay/Google memory storage engine variable width columns.
418
  }
419
}