builder.go

builder.go - Overview

This file implements the Builder struct, which builds a table (SSTable) by accumulating key-value pairs into blocks, optionally compressing and encrypting each block, and finally constructing an index for efficient lookups.

Detailed Documentation

header struct

type header struct {
	overlap uint16 // Overlap with base key.
	diff    uint16 // Length of the diff.
}

Represents the header of a key-value entry within a block.

  • overlap: Number of leading bytes the current key shares with the block's base key.
  • diff: Length of the remaining suffix of the current key, which differs from the base key.

header.Encode()

func (h header) Encode() []byte

Encodes the header into a byte slice.

Returns:

  • []byte: A byte slice containing the encoded header.

header.Decode()

func (h *header) Decode(buf []byte)

Decodes the header from a byte slice.

Parameters:

  • buf: A byte slice containing the encoded header.
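
The round trip between Encode and Decode can be sketched as follows. This is a minimal illustration that assumes a fixed 4-byte layout with little-endian fields. The byte order and the headerSize constant are assumptions made for the example, not details stated above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// header mirrors the struct documented above.
type header struct {
	overlap uint16
	diff    uint16
}

// headerSize is the encoded size of the two uint16 fields (name assumed).
const headerSize = 4

// Encode writes both fields as little-endian uint16s (byte order assumed).
func (h header) Encode() []byte {
	var b [headerSize]byte
	binary.LittleEndian.PutUint16(b[0:2], h.overlap)
	binary.LittleEndian.PutUint16(b[2:4], h.diff)
	return b[:]
}

// Decode reads the fields back; buf must hold at least headerSize bytes.
func (h *header) Decode(buf []byte) {
	h.overlap = binary.LittleEndian.Uint16(buf[0:2])
	h.diff = binary.LittleEndian.Uint16(buf[2:4])
}

func main() {
	enc := header{overlap: 3, diff: 5}.Encode()
	var h header
	h.Decode(enc)
	fmt.Println(len(enc), h.overlap, h.diff)
}
```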

bblock struct

type bblock struct {
	data         []byte
	baseKey      []byte   // Base key for the current block.
	entryOffsets []uint32 // Offsets of entries present in current block.
	end          int      // Points to the end offset of the block.
}

Represents a block that is being compressed/encrypted in the background.

  • data: The byte slice holding the block's data.
  • baseKey: The base key for the current block. Subsequent keys in the block are stored as diffs from this key.
  • entryOffsets: A slice of offsets to the start of each entry within the block.
  • end: The current end offset of valid data within the data buffer.

Builder struct

type Builder struct {
	// Typically tens or hundreds of meg. This is for one single file.
	alloc            *z.Allocator
	curBlock         *bblock
	compressedSize   atomic.Uint32
	uncompressedSize atomic.Uint32

	lenOffsets    uint32
	keyHashes     []uint32 // Used for building the bloomfilter.
	opts          *Options
	maxVersion    uint64
	onDiskSize    uint32
	staleDataSize int

	// Used to concurrently compress/encrypt blocks.
	wg        sync.WaitGroup
	blockChan chan *bblock
	blockList []*bblock
}

Used for building a table.

  • alloc: Allocator for managing memory.
  • curBlock: The current block being built.
  • compressedSize: Atomic counter for the total compressed size of all blocks.
  • uncompressedSize: Atomic counter for the total uncompressed size of all blocks.
  • lenOffsets: Total length of offsets.
  • keyHashes: Hashes of keys, used for building a Bloom filter.
  • opts: Options for building the table.
  • maxVersion: The maximum version number of the keys added to the table.
  • onDiskSize: The estimated size of the table on disk.
  • staleDataSize: The total size of stale data in the table.
  • wg: WaitGroup for managing concurrent compression/encryption.
  • blockChan: Channel for passing blocks to compression/encryption workers.
  • blockList: List of completed blocks.

Builder.allocate()

func (b *Builder) allocate(need int) []byte

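Reserves need bytes at the end of the current block's data buffer and returns that region. If the buffer does not have enough free space, a larger buffer is obtained from the builder's allocator and the existing data is copied into it first.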

Parameters:

  • need: The number of bytes needed.

Returns:

  • []byte: A byte slice of the requested size.
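
The grow-on-demand behavior can be sketched as follows. This is a simplified stand-in, not the file's implementation: the block type is hypothetical, make replaces the builder's allocator, and the doubling growth policy is an assumption.

```go
package main

import "fmt"

// block is a hypothetical stand-in for bblock: a buffer plus an end offset.
type block struct {
	data []byte
	end  int
}

// allocate reserves need bytes at the end of the block and returns them,
// growing the buffer first when the free space is too small.
func (b *block) allocate(need int) []byte {
	if len(b.data)-b.end < need {
		sz := 2 * len(b.data) // assumed policy: double the buffer
		if b.end+need > sz {
			sz = b.end + need // doubling was not enough
		}
		tmp := make([]byte, sz) // stand-in for the builder's allocator
		copy(tmp, b.data)
		b.data = tmp
	}
	b.end += need
	return b.data[b.end-need : b.end]
}

// append copies p into freshly reserved space at the end of the block.
func (b *block) append(p []byte) {
	copy(b.allocate(len(p)), p)
}

func main() {
	b := &block{data: make([]byte, 4)}
	b.append([]byte("abc"))   // fits in the initial 4 bytes
	b.append([]byte("defgh")) // forces the buffer to grow
	fmt.Println(b.end, len(b.data), string(b.data[:b.end]))
}
```

Returning a sub-slice of the block's buffer lets the caller write in place, so append needs no intermediate copy.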

Builder.append()

func (b *Builder) append(data []byte)

Appends data to the current block's data buffer.

Parameters:

  • data: The data to append.

NewTableBuilder()

func NewTableBuilder(opts Options) *Builder

Creates a new Builder with the given options.

Parameters:

  • opts: The options to use for building the table.

Returns:

  • *Builder: A pointer to the new Builder.

maxEncodedLen()

func maxEncodedLen(ctype options.CompressionType, sz int) int

Calculates the maximum encoded length for a given compression type and size.

Parameters:

  • ctype: The compression type.
  • sz: The original size.

Returns:

  • int: The maximum encoded length.

Builder.handleBlock()

func (b *Builder) handleBlock()

A goroutine that handles compression and encryption of blocks received from blockChan.
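
The worker arrangement implied here (a channel of blocks drained by goroutines tracked with a WaitGroup) can be sketched as follows. The transform function is a placeholder for compression and encryption, and the worker count, processAll, and the block type are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// block is a hypothetical stand-in for bblock.
type block struct {
	data []byte
}

// transform is a placeholder for compression/encryption: it flips every bit.
func transform(p []byte) []byte {
	out := make([]byte, len(p))
	for i, c := range p {
		out[i] = c ^ 0xFF
	}
	return out
}

// processAll starts the given number of workers, feeds them every block
// through a channel, and waits until all blocks have been transformed.
func processAll(blocks []*block, workers int) {
	ch := make(chan *block, len(blocks))
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each block is received by exactly one worker.
			for bl := range ch {
				bl.data = transform(bl.data)
			}
		}()
	}
	for _, bl := range blocks {
		ch <- bl
	}
	close(ch) // lets the workers' range loops terminate
	wg.Wait()
}

func main() {
	blocks := []*block{{data: []byte{0x00, 0x0F}}, {data: []byte{0xAA}}}
	processAll(blocks, 2)
	fmt.Printf("%x %x\n", blocks[0].data, blocks[1].data)
}
```

Closing the channel is what signals the workers to exit, so the producer must close it before waiting on the WaitGroup.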

Builder.Close()

func (b *Builder) Close()

Closes the Builder by returning the allocator to the pool.

Builder.Empty()

func (b *Builder) Empty() bool

Checks if the builder is empty (has no keys).

Returns:

  • bool: True if the builder is empty, false otherwise.

Builder.keyDiff()

func (b *Builder) keyDiff(newKey []byte) []byte

Computes the difference between a new key and the current block's base key.

Parameters:

  • newKey: The new key.

Returns:

  • []byte: The suffix of newKey that differs from the base key.
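
As an illustration of the prefix comparison, the sketch below takes the base key as an explicit argument instead of reading it from the current block, and derives the overlap and diff values that a header would carry. The free-function form and the sample keys are assumptions for the example.

```go
package main

import "fmt"

// keyDiff returns the suffix of newKey that follows its longest common
// prefix with baseKey.
func keyDiff(baseKey, newKey []byte) []byte {
	i := 0
	for i < len(baseKey) && i < len(newKey) && baseKey[i] == newKey[i] {
		i++
	}
	return newKey[i:]
}

func main() {
	base := []byte("user:1001")
	key := []byte("user:1042")
	diff := keyDiff(base, key)
	overlap := len(key) - len(diff) // bytes shared with the base key
	fmt.Printf("overlap=%d diff=%q\n", overlap, diff)
}
```

With these inputs the shared prefix is "user:10", so only the two-byte suffix needs to be stored alongside the header.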

Builder.addHelper()

func (b *Builder) addHelper(key []byte, v y.ValueStruct, vpLen uint32)

Adds a key-value pair to the current block.

Parameters:

  • key: The key.
  • v: The value.
  • vpLen: The length of the value pointer.

Builder.finishBlock()

func (b *Builder) finishBlock()

Finalizes the current block by appending the entry offsets, checksum, and other metadata. Sends the block to the compression/encryption workers if enabled.
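
One plausible trailer layout is sketched below: the entry offsets, then their count, then a checksum, then the checksum length. The exact order, the big-endian encoding, and the use of CRC32 (Castagnoli) are all assumptions for illustration. The point of such a layout is that a reader can parse the block backwards from its last four bytes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// putU32 appends v to dst as four big-endian bytes.
func putU32(dst []byte, v uint32) []byte {
	var b [4]byte
	binary.BigEndian.PutUint32(b[:], v)
	return append(dst, b[:]...)
}

// finish appends the assumed trailer to the entry bytes: every entry offset,
// the offset count, a checksum of everything so far, and the checksum length.
func finish(entries []byte, offsets []uint32) []byte {
	out := append([]byte(nil), entries...)
	for _, off := range offsets {
		out = putU32(out, off)
	}
	out = putU32(out, uint32(len(offsets)))
	sum := crc32.Checksum(out, crc32.MakeTable(crc32.Castagnoli))
	out = putU32(out, sum)
	return putU32(out, 4) // checksum length in bytes
}

// entryOffsets recovers the offsets by walking the trailer from the end.
func entryOffsets(block []byte) []uint32 {
	pos := len(block) - 4
	csLen := int(binary.BigEndian.Uint32(block[pos:]))
	pos -= csLen + 4 // skip the checksum, land on the offset count
	n := int(binary.BigEndian.Uint32(block[pos:]))
	pos -= 4 * n // start of the offset array
	offs := make([]uint32, n)
	for i := range offs {
		offs[i] = binary.BigEndian.Uint32(block[pos+4*i:])
	}
	return offs
}

func main() {
	block := finish([]byte("aaaabbbb"), []uint32{0, 4})
	fmt.Println(len(block), entryOffsets(block))
}
```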

Builder.shouldFinishBlock()

func (b *Builder) shouldFinishBlock(key []byte, value y.ValueStruct) bool

Determines whether the current block should be finished based on its estimated size.

Parameters:

  • key: The key being added.
  • value: The value being added.

Returns:

  • bool: True if the block should be finished, false otherwise.
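
A size estimate of this kind can be sketched as follows. The overhead constants and the parameter list are hypothetical: the sketch adds the current block length, the new entry, and the trailer the block would need, then compares the total with a configured block size.

```go
package main

import "fmt"

const (
	headerSize   = 4  // overlap + diff, two uint16 fields
	offsetSize   = 4  // one uint32 per entry offset
	trailerFixed = 12 // hypothetical: offset count + checksum + checksum length
)

// shouldFinish reports whether adding one more entry would push the
// estimated block size past blockSize.
func shouldFinish(end, numEntries, keyLen, valueLen, blockSize int) bool {
	if numEntries == 0 {
		return false // an empty block always accepts its first entry
	}
	// Trailer size if the new entry were added to this block.
	trailer := (numEntries+1)*offsetSize + trailerFixed
	estimated := end + headerSize + keyLen + valueLen + trailer
	return estimated > blockSize
}

func main() {
	fmt.Println(shouldFinish(0, 0, 10, 10, 4096))
	fmt.Println(shouldFinish(100, 1, 20, 50, 4096))
	fmt.Println(shouldFinish(4000, 10, 20, 50, 4096))
}
```

Checking before the entry is written means a block is closed while still under the limit, at the cost of a slightly conservative estimate.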

Builder.AddStaleKey()

func (b *Builder) AddStaleKey(key []byte, v y.ValueStruct, valueLen uint32)

Adds a stale key-value pair to the block and increments the internal staleDataSize counter.

Parameters:

  • key: The key.
  • v: The value.
  • valueLen: The length of the value.

Builder.Add()

func (b *Builder) Add(key []byte, value y.ValueStruct, valueLen uint32)

Adds a key-value pair to the block.

Parameters:

  • key: The key.
  • value: The value.
  • valueLen: The length of the value.

Builder.addInternal()

func (b *Builder) addInternal(key []byte, value y.ValueStruct, valueLen uint32, isStale bool)

Internal helper shared by Add and AddStaleKey that adds a key-value pair to the current block and, when isStale is set, counts the entry toward staleDataSize.

Parameters:

  • key: The key.
  • value: The value.
  • valueLen: The length of the value.
  • isStale: Indicates whether the key is stale.

Builder.ReachedCapacity()

func (b *Builder) ReachedCapacity() bool

Checks if the builder has reached its capacity.

Returns:

  • bool: True if the capacity has been reached, false otherwise.

Builder.Finish()

func (b *Builder) Finish() []byte

Finishes building the table and returns the final byte slice.

Returns:

  • []byte: The complete table data.

buildData struct

type buildData struct {
	blockList []*bblock
	index     []byte
	checksum  []byte
	Size      int
	alloc     *z.Allocator
}

Structure to hold the data after the build is complete.

  • blockList: List of blocks.
  • index: The index data.
  • checksum: The checksum of the index.
  • Size: Total size of the table.
  • alloc: The allocator used.

buildData.Copy()

func (bd *buildData) Copy(dst []byte) int

Copies the build data into the destination buffer.

Parameters:

  • dst: Destination buffer.

Returns:

  • int: Number of bytes written.
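
A sequential copy of this kind can be sketched as follows. The on-disk order used here (blocks, index, index length, checksum, checksum length) and the big-endian length fields are assumptions for illustration, and the alloc field is omitted.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type bblock struct {
	data []byte
	end  int
}

type buildData struct {
	blockList []*bblock
	index     []byte
	checksum  []byte
	Size      int
}

// u32 encodes v as four big-endian bytes (encoding assumed).
func u32(v uint32) []byte {
	b := make([]byte, 4)
	binary.BigEndian.PutUint32(b, v)
	return b
}

// Copy writes every block's valid bytes, then the index and checksum with
// their lengths, and returns the number of bytes written.
func (bd *buildData) Copy(dst []byte) int {
	var written int
	for _, bl := range bd.blockList {
		written += copy(dst[written:], bl.data[:bl.end])
	}
	written += copy(dst[written:], bd.index)
	written += copy(dst[written:], u32(uint32(len(bd.index))))
	written += copy(dst[written:], bd.checksum)
	written += copy(dst[written:], u32(uint32(len(bd.checksum))))
	return written
}

func main() {
	bd := &buildData{
		blockList: []*bblock{
			{data: []byte("blockA__"), end: 6}, // only the first 6 bytes are valid
			{data: []byte("blockB"), end: 6},
		},
		index:    []byte("idx"),
		checksum: []byte{0xde, 0xad, 0xbe, 0xef},
	}
	bd.Size = 6 + 6 + 3 + 4 + 4 + 4
	dst := make([]byte, bd.Size)
	n := bd.Copy(dst)
	fmt.Println(n == bd.Size, string(dst[:12]))
}
```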

Builder.Done()

func (b *Builder) Done() buildData

Completes the table building process, including compression, encryption, index creation, and checksum calculation.

Returns:

  • buildData: The built table data.

Builder.calculateChecksum()

func (b *Builder) calculateChecksum(data []byte) []byte

Calculates the checksum for the given data.

Parameters:

  • data: The data to calculate the checksum for.

Returns:

  • []byte: The checksum bytes.
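
A checksum helper of this shape can be sketched with the standard library. The choice of CRC32 with the Castagnoli polynomial and the raw 4-byte big-endian encoding are assumptions for the example; the actual algorithm and serialization are not specified above.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C table, built once and reused.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// checksum returns the CRC-32C of data as four big-endian bytes.
func checksum(data []byte) []byte {
	out := make([]byte, 4)
	binary.BigEndian.PutUint32(out, crc32.Checksum(data, castagnoli))
	return out
}

// verify recomputes the checksum and compares it with the stored one.
func verify(data, want []byte) bool {
	return string(checksum(data)) == string(want)
}

func main() {
	data := []byte("block contents")
	sum := checksum(data)
	fmt.Println(verify(data, sum))                      // intact data
	fmt.Println(verify([]byte("block c0ntents"), sum)) // corrupted data
}
```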

Builder.DataKey()

func (b *Builder) DataKey() *pb.DataKey

Returns the data key of the builder.

Returns:

  • *pb.DataKey: The data key.

Builder.Opts()

func (b *Builder) Opts() *Options

Returns the options of the builder.

Returns:

  • *Options: The options.

Builder.encrypt()

func (b *Builder) encrypt(data []byte) ([]byte, error)

Encrypts the given data and appends the IV to the end.

Parameters:

  • data: The data to encrypt.

Returns:

  • []byte: The encrypted data with IV appended.
  • error: An error, if any.
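
The "ciphertext followed by IV" layout can be sketched with the standard library as follows. The sketch assumes AES in CTR mode with a random 16-byte IV and takes the key as an explicit argument. The cipher mode, the key handling, and the decrypt helper are assumptions for the example.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// encrypt returns the AES-CTR ciphertext of data with the IV appended.
func encrypt(key, data []byte) ([]byte, error) {
	blk, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	iv := make([]byte, aes.BlockSize)
	if _, err := rand.Read(iv); err != nil {
		return nil, err
	}
	out := make([]byte, len(data)+len(iv))
	cipher.NewCTR(blk, iv).XORKeyStream(out[:len(data)], data)
	copy(out[len(data):], iv) // IV goes after the ciphertext
	return out, nil
}

// decrypt splits the trailing IV off enc and reverses the key stream.
func decrypt(key, enc []byte) ([]byte, error) {
	if len(enc) < aes.BlockSize {
		return nil, fmt.Errorf("input too short: %d bytes", len(enc))
	}
	blk, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	n := len(enc) - aes.BlockSize
	out := make([]byte, n)
	cipher.NewCTR(blk, enc[n:]).XORKeyStream(out, enc[:n])
	return out, nil
}

func main() {
	key := []byte("0123456789abcdef") // 16 bytes selects AES-128
	enc, err := encrypt(key, []byte("block payload"))
	if err != nil {
		panic(err)
	}
	dec, err := decrypt(key, enc)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(enc), string(dec))
}
```

Storing the IV with each block lets every block be decrypted independently, since the reader needs only the key and the block's own bytes.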

Builder.shouldEncrypt()

func (b *Builder) shouldEncrypt() bool

Determines whether encryption should be performed.

Returns:

  • bool: True if encryption should be performed, false otherwise.

Builder.compressData()

func (b *Builder) compressData(data []byte) ([]byte, error)

Compresses the given data using the configured compression algorithm.

Parameters:

  • data: The data to compress.

Returns:

  • []byte: The compressed data.
  • error: An error, if any.

Builder.buildIndex()

func (b *Builder) buildIndex(bloom []byte) ([]byte, uint32)

Builds the index for the table.

Parameters:

  • bloom: The Bloom filter data.

Returns:

  • []byte: The index data.
  • uint32: The data size.

Builder.writeBlockOffsets()

func (b *Builder) writeBlockOffsets(builder *fbs.Builder) ([]fbs.UOffsetT, uint32)

Writes the block offsets to the FlatBuffers builder.

Parameters:

  • builder: The FlatBuffers builder.

Returns:

  • []fbs.UOffsetT: A slice of FlatBuffers offsets.
  • uint32: The start offset after the last block has been written, i.e. the total size of the block data.

Builder.writeBlockOffset()

func (b *Builder) writeBlockOffset(
	builder *fbs.Builder, bl *bblock, startOffset uint32) fbs.UOffsetT

Writes a single block offset to the FlatBuffers builder.

Parameters:

  • builder: The FlatBuffers builder.
  • bl: The block.
  • startOffset: The start offset.

Returns:

  • fbs.UOffsetT: The FlatBuffers offset.

Getting Started Relevance