com.bigdata
Class LRUNexus

java.lang.Object
  extended by com.bigdata.LRUNexus

public class LRUNexus
extends Object

Static singleton factory.

Version:
$Id: LRUNexus.java 2265 2009-10-26 12:51:06Z thompsonbry $

FIXME LRUNexus: writes MUST be "isolated" until the commit. Isolated indices MUST have their own cache backed by the shared LRU (actually, they are on the shared temporary store so that helps). Unisolated indices SHOULD have their own cache backed by the shared LRU. At commit, any records in the "isolated" cache for a B+Tree should be putAll() onto the unisolated cache for the backing store. This way, we do not need to do anything if there is an abort().

There are two quick fixes: (1) Disable the Global LRU; and (2) discard the cache if there is an abort on a store. The latter is pretty easy since we only have one store with abort semantics, which is the AbstractJournal, so that is how this is being handled right now by AbstractJournal.abort().

An optimization would essentially isolate the writes on the cache per BTree or between commits. At the commit point, the written records would be migrated into the "committed" cache for the store. The caller would read on the uncommitted cache, which would read through to the "committed" cache. This would prevent incorrect reads without requiring us to throw away valid records in the cache. This could be a significant performance gain if aborts are common on a machine with a lot of RAM.
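The read-through arrangement described above can be sketched as follows. This is a minimal illustration only, not the LRUNexus implementation: the class name IsolatedCacheSketch and its methods are hypothetical, plain HashMaps stand in for the per-store caches, and eviction by the shared LRU is omitted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not part of the com.bigdata API. */
final class IsolatedCacheSketch {

    /** Records visible as of the last commit (stands in for the "committed" cache). */
    private final Map<Long, Object> committed = new HashMap<>();

    /** Records written since the last commit; discarded on abort. */
    private final Map<Long, Object> uncommitted = new HashMap<>();

    /** Writes only ever touch the uncommitted layer. */
    void put(long addr, Object record) {
        uncommitted.put(addr, record);
    }

    /** Reads check the uncommitted layer first, then read through to the committed layer. */
    Object get(long addr) {
        Object record = uncommitted.get(addr);
        return record != null ? record : committed.get(addr);
    }

    /** At the commit point, migrate the written records into the committed layer. */
    void commit() {
        committed.putAll(uncommitted);
        uncommitted.clear();
    }

    /** On abort, drop only the uncommitted records; committed records stay valid. */
    void abort() {
        uncommitted.clear();
    }

    public static void main(String[] args) {
        IsolatedCacheSketch cache = new IsolatedCacheSketch();
        cache.put(1L, "committed record");
        cache.commit();
        cache.put(2L, "aborted record");
        cache.abort();
        System.out.println("addr 1 -> " + cache.get(1L));
        System.out.println("addr 2 -> " + cache.get(2L));
    }
}
```

In this sketch an abort costs only the records written since the last commit, which is the gain the paragraph above anticipates when aborts are common.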

Author:
Bryan Thompson
See Also:
LRUNexus.Options
TODO:
Test w/ G1 -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC

G1 appears faster for query, but somewhat slower for load. This is probably related to the increased memory demand during load (more of the data winds up buffered). G1 might work for both use cases with a smaller portion of the heap given over to buffers.

G1 can also trip a crash, at least during load. There is a Sun incident ID# 1609804 for this.

Look into the memory pool threshold notification mechanism. See ManagementFactory.getMemoryPoolMXBeans() and MemoryPoolMXBean. TonyP suggests that tracking the old generation occupancy may be a better metric (more stable). The tricky part is to identify which pool (or pools) corresponds to the old generation. Once that is done, the idea is to set a notification threshold using MemoryPoolMXBean.setUsageThreshold(long) and to only clear references from the tail of the global LRU when we have exceeded that threshold. Reading the javadoc, it seems that threshold notification would probably come after a (full) GC. The goal would have to be something like reducing the bytesInMemory to some percentage of its value at threshold notification (e.g., 80%). Since we can't directly control that and the feedback from the JVM is only at full GC intervals, we need to simply discard some percentage of the references from the tail of the global LRU. We could actually adjust the desired #of references on the LRU if that metric appears to be relatively stable. However, note that the average #of bytes per reference and the average #of instances of a reference on the LRU are not necessarily stable values. We could also examine the recordCount (total cache size across all caches). If weak references are cleared on an ongoing basis rather than during the full GC mark phase, then that will be very close to the real hard reference count.

Does it make sense to both buffer the index segment nodes region and buffer the nodes and leaves? [Buffering the nodes region is an option.]

Note that a r/w store will require an approach in which addresses are PURGED from the store's cache during the commit protocol. That might be handled at the tx layer.

Better ergonomics! Perhaps keep some minimum amount for the JVM and then set a trigger on the GC time; if it crosses 5-10% of the CPU time for the application, then reduce the maximum bytes allowed for the global LRU buffer.

Settings should be made public (or package private) and used as the input to a designated constructor method signature for all of the implementations. This would make it possible to plug in a new implementation without hard wiring things in the code.
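The memory pool threshold idea in the TODO list above might look roughly like the following. This is a sketch under stated assumptions, not code from LRUNexus: the class OldGenThresholdSketch, its method names, and the onExceeded callback are hypothetical, and the old generation is guessed heuristically as the first heap pool that supports a usage threshold (on HotSpot the young generation pools typically do not). Only standard java.lang.management and javax.management calls are used.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.NotificationEmitter;

/** Hypothetical sketch; not part of the com.bigdata API. */
final class OldGenThresholdSketch {

    /** Heuristic (an assumption): first heap pool that supports a usage threshold, else null. */
    static MemoryPoolMXBean findOldGenPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                return pool;
            }
        }
        return null;
    }

    /** Threshold in bytes as a fraction of the pool maximum. */
    static long thresholdBytes(long maxBytes, double fraction) {
        if (maxBytes <= 0 || fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException();
        }
        return (long) (maxBytes * fraction);
    }

    /**
     * Sets the usage threshold and runs the callback on each threshold-exceeded
     * notification. The callback is where references would be discarded from the
     * tail of the global LRU. Returns false if no suitable pool was found.
     */
    static boolean install(double fraction, Runnable onExceeded) {
        MemoryPoolMXBean pool = findOldGenPool();
        if (pool == null) {
            return false;
        }
        long max = pool.getUsage().getMax(); // -1 when the maximum is undefined
        if (max <= 0) {
            return false;
        }
        pool.setUsageThreshold(thresholdBytes(max, fraction));
        // The platform MemoryMXBean emits the threshold notifications.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                onExceeded.run();
            }
        }, null, null);
        return true;
    }

    public static void main(String[] args) {
        MemoryPoolMXBean pool = findOldGenPool();
        System.out.println("Candidate old generation pool: " + (pool == null ? "none" : pool.getName()));
        boolean installed = install(0.8, () -> System.out.println("Threshold exceeded; trim the LRU tail"));
        System.out.println("Listener installed: " + installed);
    }
}
```

Whether the chosen pool really is the old generation depends on the collector in use, which is the tricky part the TODO already points out.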


Nested Class Summary
static interface LRUNexus.Options
          These options MUST BE specified as ENVIRONMENT variables on the command line when you start the JVM.
 
Field Summary
static IGlobalLRU<Long,Object> INSTANCE
          Global instance.
protected static org.apache.log4j.Logger log
           
 
Constructor Summary
LRUNexus()
           
 
Method Summary
static IGlobalLRU.ILRUCache<Long,Object> getCache(IRawStore store)
          Factory returns the IGlobalLRU.ILRUCache for the store iff the LRUNexus is enabled.
static boolean getIndexSegmentBuildPopulatesCache()
          Return true if the IndexSegmentBuilder will populate the IGlobalLRU with records for the new IndexSegment during the build.
static void main(String[] args)
          Command line utility that may be used to confirm the environment settings.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final transient org.apache.log4j.Logger log

INSTANCE

public static final IGlobalLRU<Long,Object> INSTANCE
Global instance.

Note: A Sun G1 bug in JDK 1.6.0_16 provides a false estimate of the available memory.

See Also:
LRUNexus.Options
Constructor Detail

LRUNexus

public LRUNexus()
Method Detail

getIndexSegmentBuildPopulatesCache

public static final boolean getIndexSegmentBuildPopulatesCache()
Return true if the IndexSegmentBuilder will populate the IGlobalLRU with records for the new IndexSegment during the build.

See Also:
LRUNexus.Options.INDEX_SEGMENT_BUILD_POPULATES_CACHE

getCache

public static IGlobalLRU.ILRUCache<Long,Object> getCache(IRawStore store)
Factory returns the IGlobalLRU.ILRUCache for the store iff the LRUNexus is enabled.

Parameters:
store - The store.
Returns:
The cache for that store if the LRUNexus is enabled and otherwise null.
Throws:
IllegalArgumentException - if the store is null.

main

public static void main(String[] args)
                 throws ClassNotFoundException
Command line utility that may be used to confirm the environment settings.

Parameters:
args - Ignored. All parameters are specified either in the environment or using JVM -Dcom.bigdata.LRUNexus.foo=bar arguments on the command line.
Throws:
ClassNotFoundException
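
As an illustration of how a setting given either in the environment or as a -D argument could be resolved, consider the sketch below. It is an assumption-laden example rather than the LRUNexus logic: the class SettingLookupSketch and the resolve method are hypothetical, the lookup order (system property first, then environment variable, then default) is an assumption, and com.bigdata.LRUNexus.foo is the placeholder name from the parameter description above.

```java
/** Hypothetical sketch; not part of the com.bigdata API. */
final class SettingLookupSketch {

    /**
     * Resolves a setting by name. Assumed order: JVM system property
     * (-Dname=value), then environment variable, then the default.
     */
    static String resolve(String name, String defaultValue) {
        String value = System.getProperty(name);
        if (value == null) {
            value = System.getenv(name);
        }
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        // Placeholder setting name taken from the documentation above.
        String name = "com.bigdata.LRUNexus.foo";
        System.out.println(name + "=" + resolve(name, "(not set)"));
    }
}
```

Running it as `java -Dcom.bigdata.LRUNexus.foo=bar SettingLookupSketch` would exercise the system property path.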


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.