Package com.bigdata.rdf.internal

This package provides an internal representation of RDF Values.


Interface Summary
DTEFlags Data type enum bit flags.
IDatatypeURIResolver Specialized interface for resolving (and creating if necessary) datatype URIs.
IExtension<V extends BigdataValue> IExtensions are responsible for round-tripping between an RDF Value and an LiteralExtensionIV for a particular datatype.
IExtensionFactory IExtensionFactories are responsible for enumerating what extensions are supported for a particular database configuration.
IExtensionIV  
IInlineUnicode Interface for IVs which have inline Unicode components in their representation.
ILexiconConfiguration<V extends BigdataValue> Configuration determines which RDF Values are inlined into the statement indices rather than being assigned term identifiers by the lexicon.
INonInlineExtensionCodes An interface declaring the one byte extension code for non-inline IVs.
IV<V extends BigdataValue,T> Interface for the internal representation of an RDF Value (the representation which is encoded within the statement indices).
IVCache<V extends BigdataValue,T> Interface for managing the BigdataValue cached on an IV.
 

Class Summary
BSBMExtensionFactory Adds inlining for the http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/USD datatype, which is treated as xsd:float.
DefaultExtensionFactory Default IExtensionFactory.
IVUnicode Utility class supporting IVs having inline Unicode data.
IVUnicode.IVUnicodeComparator Class imposes the natural ordering of the encoded Unicode representation for an IV having inline Unicode data on Java Strings.
IVUtility Helper class for IVs.
LexiconConfiguration<V extends BigdataValue> An object which describes which kinds of RDF Values are inlined into the statement indices and how other RDF Values are coded into the lexicon.
NoExtensionFactory A class which does not support any extensions.
TermIVComparator Places BigdataValues into an ordering determined by their assigned IVs (internal values).
XPathMathFunctions Support for the picky xpath math functions: abs, ceiling, floor, and round.
XSD Collects various XSD URIs as constants.
 

Enum Summary
DTE Data Type Enumeration (DTE) is a class which declares the known intrinsic data types, provides for extensibility to new data types, and provides for data types which either cannot be inlined or are not being inlined.
VTE Value Type Enumeration (VTE) is a class with methods for interpreting and setting the bit flags used to identify the type of an RDF Value (URI, Literal, Blank Node, SID, etc.).
 

Exception Summary
NoSuchVocabularyItem An exception thrown when a request is made for a URI which was not declared in the Vocabulary.
NotMaterializedException Exception thrown by IVCache.getValue() if the IV has not first been cached using IVCache.asValue(LexiconRelation).
 

Package com.bigdata.rdf.internal Description

This package provides an internal representation of RDF Values. The internal representation of an RDF Value is either a term identifier or an inline value. Term identifiers are assigned by consistent writes on the TERM2ID index and the ID2TERM index (which maps the term identifier back onto the RDF Value). Inline values directly code the RDF Value and are generally used for datatype literals having short values (xsd:boolean, xsd:char) or numeric values (xsd:short, xsd:int, xsd:float, xsd:long, xsd:double).

The unsigned byte[] key for an RDF Value begins with bit flags which partition the key space into URIs, blank nodes, literals, and statement identifiers, together with a bit flag indicating whether the value is inline. What follows depends on that flag. For a value which is not inline, the flags are followed by pad bits and the term identifier. For an inline value, they are followed by a code which identifies the data type of the inline value and then the minimum length decodable representation of the data type value, formed using IKeyBuilder. This design clusters different kinds of RDF Values within different regions of the ID2TERM index.
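The key layout described above can be sketched in a few lines of Java. This is a minimal illustration and not the package's actual encoding: the class InlineKeySketch, the bit positions in the flags byte, and the value type and data type codes are all assumptions made for the example, and ByteBuffer stands in for IKeyBuilder.

```java
import java.nio.ByteBuffer;

class InlineKeySketch {
    // Hypothetical 2-bit value type codes; the real VTE constants may differ.
    static final int URI = 0, BNODE = 1, LITERAL = 2, STATEMENT = 3;
    // Hypothetical 4-bit data type code for xsd:int; the real DTE code may differ.
    static final int DT_XSD_INT = 4;

    /** Assumed flags byte: 2 bits value type, 1 bit inline flag, 1 pad bit, 4 bits data type. */
    static byte flags(int valueType, boolean inline, int dataType) {
        return (byte) ((valueType << 6) | (inline ? 0x20 : 0) | (dataType & 0x0F));
    }

    /** Non-inline value: flags byte followed by the term identifier (big-endian). */
    static byte[] termIdKey(int valueType, long termId) {
        return ByteBuffer.allocate(9).put(flags(valueType, false, 0)).putLong(termId).array();
    }

    /** Inline xsd:int: flags byte, then the value with its sign bit flipped so
     *  that unsigned byte order agrees with numeric order. */
    static byte[] inlineIntKey(int value) {
        return ByteBuffer.allocate(5).put(flags(LITERAL, true, DT_XSD_INT))
                .putInt(value ^ 0x80000000).array();
    }

    /** Recovers the int directly from the key, without any index lookup. */
    static int decodeInlineInt(byte[] key) {
        return ByteBuffer.wrap(key, 1, 4).getInt() ^ 0x80000000;
    }

    /** Compares two keys as unsigned byte[]s; on a shared prefix the shorter key sorts first. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // Round trip of an inline value.
        System.out.println(decodeInlineInt(inlineIntKey(-42)));
        // Inline ints sort in numeric order under unsigned byte comparison.
        System.out.println(compareKeys(inlineIntKey(-5), inlineIntKey(3)) < 0);
        // The leading flags cluster URIs apart from literals.
        System.out.println(compareKeys(termIdKey(URI, 100L), termIdKey(LITERAL, 1L)) < 0);
    }
}
```

Because the flags come first, every key of one kind shares a prefix, which is what places each kind of value in its own region of the key space.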

Inline values have a significant performance advantage since the RDF Value object can be recovered directly from the inline value without indirection through the ID2TERM index. This reduces the cost of materializing RDF Values, greatly speeds up aggregation style queries over numeric data, and reduces both the maintenance time and the size on disk of the lexicon indices since inline values are not entered into the lexicon.

Inline values also make it possible to translate an LT/GT style filter against an inlined datatype into key-range queries against the OSP(C) index. Since the inline values for different numeric data types are located in different parts of the key space, the use of a cast in the query will require either that key-range queries are issued against all numeric data types (UNION of access paths) or that the query is evaluated without the use of key-range scans on OSP(C).
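The following Java sketch illustrates why a range filter becomes a key-range scan, and why a cast forces one scan per numeric data type. It is a simplification under stated assumptions: a TreeMap ordered by unsigned byte[] comparison stands in for the OSP(C) index and holds only the object component, and the class KeyRangeFilterSketch and its one-byte data type prefixes are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class KeyRangeFilterSketch {
    // Hypothetical one-byte prefixes that place xsd:int and xsd:long in different key regions.
    static final byte INT_PREFIX = 0x24, LONG_PREFIX = 0x25;

    /** Orders keys as unsigned byte[]s. */
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    };

    static byte[] intKey(int v) {
        return ByteBuffer.allocate(5).put(INT_PREFIX).putInt(v ^ 0x80000000).array();
    }

    static byte[] longKey(long v) {
        return ByteBuffer.allocate(9).put(LONG_PREFIX).putLong(v ^ Long.MIN_VALUE).array();
    }

    /** Key-range scan over the open interval (lo, hi). */
    static List<String> scan(TreeMap<byte[], String> index, byte[] lo, byte[] hi) {
        return new ArrayList<>(index.subMap(lo, false, hi, false).values());
    }

    /** A tiny stand-in index holding a few inline numeric objects. */
    static TreeMap<byte[], String> demoIndex() {
        TreeMap<byte[], String> index = new TreeMap<>(UNSIGNED);
        index.put(intKey(5), "5 (xsd:int)");
        index.put(intKey(15), "15 (xsd:int)");
        index.put(intKey(25), "25 (xsd:int)");
        index.put(longKey(15L), "15 (xsd:long)");
        return index;
    }

    public static void main(String[] args) {
        TreeMap<byte[], String> index = demoIndex();
        // FILTER(?o > 10 && ?o < 20) against xsd:int only: a single key-range scan.
        List<String> ints = scan(index, intKey(10), intKey(20));
        // With a cast, xsd:long values qualify as well and live in another region,
        // so a second scan is needed and the results are combined (UNION of access paths).
        List<String> longs = scan(index, longKey(10L), longKey(20L));
        System.out.println(ints + " " + longs);
    }
}
```

The scan over the xsd:int region never sees the xsd:long value 15, even though the two are numerically equal, which is the reason a cast requires a union over the regions of all numeric data types.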

RDF Values which are inlined obey the equality and order semantics of their value space (think xsd:int). One consequence of this is that comparison of inlined datatype literals MUST occur in the value space. E.g., 005 and 5 represent the same point in the xsd:int value space. Clients should be aware of this, since statements which differ only in their lexical form, and not in the value space, will be mapped onto the same statement internally.
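A small Java sketch shows the effect on statements. It assumes, purely for illustration, that an inlined xsd:int is reduced to its int value; the class ValueSpaceSketch and its string statement keys are hypothetical and are not how the package represents statements.

```java
import java.util.HashSet;
import java.util.Set;

class ValueSpaceSketch {
    /** Hypothetical stand-in for inlining an xsd:int: only the value is kept,
     *  not the lexical form it was parsed from. */
    static int inlineXsdInt(String lexicalForm) {
        return Integer.parseInt(lexicalForm.trim());
    }

    /** A statement key built from the subject, predicate, and inlined object value. */
    static String statementKey(String s, String p, String intLexicalForm) {
        return s + " " + p + " " + inlineXsdInt(intLexicalForm);
    }

    /** Counts the statements that remain distinct once the objects are inlined. */
    static int countDistinct(String[][] statements) {
        Set<String> keys = new HashSet<>();
        for (String[] st : statements) {
            keys.add(statementKey(st[0], st[1], st[2]));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        String[][] statements = {
            {":item", ":count", "005"},
            {":item", ":count", "5"},
            {":item", ":count", "6"},
        };
        // "005" and "5" are the same point in the value space, so the first
        // two statements collapse into one.
        System.out.println(countDistinct(statements));
    }
}
```

In this sketch the lexical form "005" is not retained anywhere, so only the value 5 can be recovered from the stored statement.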



Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.