Package bw :: Package poseidon :: Module diskindex :: Class OceanIndex

Class OceanIndex

object --+    
         |    
 DiskIndex --+
             |
            OceanIndex

This is the main implementation of OceanIndex, new in version 2. The implementation details are described in the docstring for diskindex.

Instance Methods
 
__init__(self, fn, options)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
__len__(self)
This method doesn't work correctly.
 
__contains__(self, key)
Checks if key is in the index.
 
read_disk(self, key)
Reads the key from disk and returns the item.
 
create(self)
Initializes an empty diskindex.
 
read_header(self)
Read and validate the file header.
 
commit(self)
Commit the diskindex.
 
close(self)
Commits and closes the diskindex.
 
verify(self)
This method reads the whole index file (possibly very large) and sequentially validates that each dirty entry has been written.

Inherited from DiskIndex: __getitem__, __iter__, __setitem__, clean, get, has_key, iteritems, update

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  HEADER = Struct("!2sI")
(MAGIC, VERSION)
  FANOUT = Struct("!Q")
(number of items < n[0])
  ITEM = Struct("!dQL")
(time, offset, length)
  VERSION = 2
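
As a rough illustration of what these format strings encode, the sketch below rebuilds the three structs with the standard `struct` module and prints their packed sizes. The magic bytes `b"OI"` and the sample item values are placeholders chosen for the example; the real magic value is not given on this page.

```python
import struct

# Same format strings as the class variables above. The "!" prefix selects
# network (big-endian) byte order with standard sizes and no padding.
HEADER = struct.Struct("!2sI")   # (MAGIC, VERSION): 2 bytes + unsigned 32-bit int
FANOUT = struct.Struct("!Q")     # one unsigned 64-bit value
ITEM = struct.Struct("!dQL")     # (time, offset, length): double, u64, u32

# b"OI" is a hypothetical magic value used only for this example.
packed_header = HEADER.pack(b"OI", 2)
magic, version = HEADER.unpack(packed_header)

# Sample item values are made up for the example.
packed_item = ITEM.pack(1700000000.0, 4096, 512)
item_time, item_offset, item_length = ITEM.unpack(packed_item)

print(HEADER.size, FANOUT.size, ITEM.size)  # 6 8 20
```

The 20-byte ITEM size agrees with the figure quoted for Ocean in the read_disk description.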
Properties

Inherited from object: __class__

Method Details

__init__(self, fn, options)
(Constructor)

 

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Parameters:
  • fn - The fully qualified pathname of the storage, e.g. "/home/username/brainwave/db/mmap". The path is given without an extension.
  • options - A dictionary containing additional options
Overrides: object.__init__
(inherited documentation)

__len__(self)
(Length operator)

 

This method doesn't work correctly. It returns the size of the cache plus the total item count from the fanout, so items that were read from disk and are now in the cache are counted twice.

Overrides: DiskIndex.__len__
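
The double counting can be illustrated with a toy model. The names `on_disk` and `cache` below are hypothetical stand-ins for the on-disk keys counted by the fanout and the in-memory cache; they are not attributes of the real class.

```python
# Keys already committed to disk (what the fanout total would count).
on_disk = {"a", "b", "c"}

# In-memory cache: "b" was read back from disk, "d" is new and not yet committed.
cache = {"b": "item-b", "d": "item-d"}

# What the description above says __len__ computes: cache size + fanout total.
naive_len = len(cache) + len(on_disk)

# The number of distinct keys, for comparison.
distinct_len = len(on_disk | set(cache))

print(naive_len, distinct_len)  # 5 4
```

The difference between the two values is the number of cached keys that also exist on disk.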

__contains__(self, key)
(In operator)

 

Checks if key is in the index. Since the lookup has to seek to the item anyway, this method also reads the item and populates the index.

Overrides: DiskIndex.__contains__
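
A minimal sketch of this read-on-membership-test pattern is shown below. `ToyIndex`, its `_disk` dictionary and the `disk_reads` counter are hypothetical and only simulate the behaviour described; the real class reads from the index file.

```python
class ToyIndex:
    """Hypothetical stand-in for illustration, not the real OceanIndex."""

    def __init__(self, disk):
        self._disk = disk        # dict simulating the on-disk itemspace
        self._cache = {}         # in-memory items
        self.disk_reads = 0      # counts simulated disk lookups

    def read_disk(self, key):
        self.disk_reads += 1
        return self._disk[key]   # raises KeyError when the key is absent

    def __contains__(self, key):
        if key in self._cache:
            return True
        try:
            # The lookup already pays for the seek, so keep the item.
            self._cache[key] = self.read_disk(key)
        except KeyError:
            return False
        return True

    def __getitem__(self, key):
        if key not in self:
            raise KeyError(key)
        return self._cache[key]


idx = ToyIndex({"a": (1.0, 0, 20)})
found = "a" in idx       # one simulated disk read, item is now cached
value = idx["a"]         # served from the cache, no further read
missing = "z" in idx     # one more simulated read, which fails
```

With this shape, a membership test followed by a lookup of the same key costs a single disk read.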

read_disk(self, key)

 

Reads the key from disk and returns the item. This method takes 2 seeks. The first seek reads the level-1 fanout tablespace, whose size is (total size/128) * 16 bytes, and searches it. The result is used to seek directly to the correct location in the itemspace, where the second seek reads ITEM.size bytes (20 bytes for Ocean, 36 bytes for Sea as of version 2). The itemspace is kept separate from the keyspace to reduce the memory cost of reading the tablespace and to speed up find operations (since there is less data to search).

Parameters:
  • key - The key whose value is to be read
Overrides: DiskIndex.read_disk
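
To give a sense of scale, the arithmetic above can be worked through for a sample index. This sketch assumes that "total size" means the number of items and that the division rounds down; neither is confirmed on this page, and the item count is made up.

```python
# Assumptions for illustration only: "total size" is the item count and
# the division by 128 rounds down.
ITEMS_PER_FANOUT_ENTRY = 128
FANOUT_ENTRY_BYTES = 16
OCEAN_ITEM_BYTES = 20            # ITEM.size for Ocean, as stated above

total_items = 1_000_000          # hypothetical index size

# Bytes read and searched by the first seek.
tablespace_bytes = (total_items // ITEMS_PER_FANOUT_ENTRY) * FANOUT_ENTRY_BYTES

# Bytes the itemspace would occupy for the same index.
itemspace_bytes = total_items * OCEAN_ITEM_BYTES

print(tablespace_bytes, itemspace_bytes)  # 124992 20000000
```

Under these assumptions the tablespace is well over a hundred times smaller than the itemspace, which is consistent with the stated reason for keeping the two separate.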

read_header(self)

 

Read and validate the file header. Also does version checking.

commit(self)

 

Commit the diskindex. This method is thread-safe and is designed to be run asynchronously. It uses generators to reduce the distance between seeks and to keep memory use low. All the data is written to a .il file, which is then atomically renamed. This is why the read file pointer needs to be re-opened at the beginning of every transaction.

Overrides: DiskIndex.commit
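
The write-then-rename step, and the reason an already open reader has to be re-opened afterwards, can be sketched as follows. This is a generic Python 3 illustration and not the actual commit code: `atomic_write` is a hypothetical helper, the temporary name `path + ".il"` is an assumption about how the .il file is named, and the stale-handle behaviour shown is that of POSIX systems.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to a sibling .il file, then rename it over path."""
    tmp_path = path + ".il"          # assumed naming, for illustration
    with open(tmp_path, "wb") as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())       # make sure the bytes reach the disk first
    os.replace(tmp_path, path)       # atomic on POSIX when on the same filesystem


with tempfile.TemporaryDirectory() as workdir:
    index_path = os.path.join(workdir, "index")
    atomic_write(index_path, b"version-1")

    reader = open(index_path, "rb")  # read handle opened before the next commit
    atomic_write(index_path, b"version-2")

    stale = reader.read()            # still bound to the old, replaced file
    reader.close()

    with open(index_path, "rb") as fresh:
        current = fresh.read()       # a re-opened handle sees the new file

    leftover = os.path.exists(index_path + ".il")

print(stale, current, leftover)
```

Readers never observe a half-written file with this approach, at the cost of having to re-open their handle to see the committed data.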

close(self)

 

Commits and closes the diskindex.

Overrides: DiskIndex.close

verify(self)

 

This method reads the whole index file (possibly very large) and sequentially validates that each dirty entry has been written. It is designed for debugging and is NOT to be used in production. Call it inside the commit method (before resetting _dirty) to validate that the commit succeeded.