Class CompactHashSet<E>
- java.lang.Object
  - java.util.AbstractCollection<E>
    - java.util.AbstractSet<E>
      - com.google.common.collect.CompactHashSet<E>
- All Implemented Interfaces:
java.io.Serializable, java.lang.Iterable<E>, java.util.Collection<E>, java.util.Set<E>
- Direct Known Subclasses:
CompactLinkedHashSet
class CompactHashSet<E> extends java.util.AbstractSet<E> implements java.io.Serializable
CompactHashSet is an implementation of a Set. All optional operations (adding and removing) are supported. The elements can be any objects.
contains(x), add(x) and remove(x) are all (expected and amortized) constant time operations. "Expected" is meant in the hashtable sense (it depends on the hash function doing a good job of distributing the elements to the buckets, to a distribution not far from uniform), and "amortized" because some operations can trigger a hash table resize.
Unlike java.util.HashSet, iteration is only proportional to the actual size(), which is optimal, and not to the size of the internal hashtable, which could be much larger than size(). Furthermore, this structure only depends on a fixed number of arrays; add(x) operations do not create objects for the garbage collector to deal with, and for every element added, the garbage collector will have to traverse 1.5 references on average in the marking phase, not 5.0 as in java.util.HashSet.
If there are no removals, then iteration order is the same as insertion order. Any removal invalidates any ordering guarantees.
This class should not be assumed to be universally superior to java.util.HashSet. Generally speaking, this class reduces object allocation and memory consumption at the price of moderately increased constant factors of CPU. Only use this class when there is a specific reason to prioritize memory over CPU.
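The array-based layout described above can be illustrated with a much simplified sketch. `TinyCompactSet` and all of its members are hypothetical names invented for this example; it omits removal, hash smearing, lazy allocation and the packed entry encoding, so it should be read as an illustration of the idea (a bucket table of indices plus dense parallel arrays) and not as the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical, simplified illustration of a set backed only by arrays. */
final class TinyCompactSet<E> {
    private static final int UNSET = 0;        // "null pointer" in table and next
    private int[] table = new int[8];          // bucket -> (entry index + 1)
    private int[] next = new int[8];           // entry -> (next entry index + 1) in the same bucket
    private Object[] elements = new Object[8]; // elements stored densely in insertion order
    private int size;

    private int bucketOf(Object o) {
        // table.length is always a power of two, so this is a cheap modulo
        return Objects.hashCode(o) & (table.length - 1);
    }

    boolean contains(Object o) {
        for (int i = table[bucketOf(o)]; i != UNSET; i = next[i - 1]) {
            if (Objects.equals(elements[i - 1], o)) {
                return true;
            }
        }
        return false;
    }

    boolean add(E e) {
        if (contains(e)) {
            return false;
        }
        if (size == elements.length) {
            grow();
        }
        int bucket = bucketOf(e);
        elements[size] = e;
        next[size] = table[bucket];  // link the new entry in front of the bucket chain
        table[bucket] = size + 1;
        size++;
        return true;                 // no per-element object was allocated
    }

    private void grow() {
        int newCapacity = elements.length * 2;
        elements = Arrays.copyOf(elements, newCapacity);
        next = new int[newCapacity];
        table = new int[newCapacity];
        for (int i = 0; i < size; i++) {  // rebuild the bucket chains
            int bucket = bucketOf(elements[i]);
            next[i] = table[bucket];
            table[bucket] = i + 1;
        }
    }

    int size() {
        return size;
    }

    /** Iterates over exactly size() slots, in insertion order. */
    List<E> toList() {
        List<E> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            @SuppressWarnings("unchecked")
            E e = (E) elements[i];
            out.add(e);
        }
        return out;
    }
}
```

Because the elements live in a dense array, iteration in this sketch touches only the occupied slots and never scans empty buckets, which is the property the description contrasts with java.util.HashSet.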
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- (package private) java.lang.Object[] elements
  The elements contained in the set, in the range of [0, size()).
- private int[] entries
  Contains the logical entries, in the range of [0, size()).
- (package private) static double HASH_FLOODING_FPP
  Maximum allowed false positive probability of detecting a hash flooding attack given random input.
- private static int MAX_HASH_BUCKET_LENGTH
  Maximum allowed length of a hash table bucket before falling back to a j.u.LinkedHashSet based implementation.
- private int metadata
  Keeps track of metadata like the number of hash table bits and modifications of this data structure (to make it possible to throw ConcurrentModificationException in the iterator).
- private int size
  The number of elements contained in the set.
- private java.lang.Object table
  The hashtable object.
-
Constructor Summary
Constructors (Constructor, Description):
- CompactHashSet()
  Constructs a new empty instance of CompactHashSet.
- CompactHashSet(int expectedSize)
  Constructs a new instance of CompactHashSet with the specified capacity.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- boolean add(E object)
- (package private) int adjustAfterRemove(int indexBeforeRemove, int indexRemoved)
  Updates the index an iterator is pointing to after a call to remove: returns the index of the entry that should be looked at after a removal on indexRemoved, with indexBeforeRemove as the index that *was* the next entry that would be looked at.
- (package private) int allocArrays()
  Handle lazy allocation of arrays.
- void clear()
- boolean contains(java.lang.Object object)
- (package private) java.util.Set<E> convertToHashFloodingResistantImplementation()
- static <E> CompactHashSet<E> create()
  Creates an empty CompactHashSet instance.
- static <E> CompactHashSet<E> create(E... elements)
  Creates a mutable CompactHashSet instance containing the given elements in unspecified order.
- static <E> CompactHashSet<E> create(java.util.Collection<? extends E> collection)
  Creates a mutable CompactHashSet instance containing the elements of the given collection in unspecified order.
- private java.util.Set<E> createHashFloodingResistantDelegate(int tableSize)
- static <E> CompactHashSet<E> createWithExpectedSize(int expectedSize)
  Creates a CompactHashSet instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.
- (package private) java.util.Set<E> delegateOrNull()
- (package private) int firstEntryIndex()
- void forEach(java.util.function.Consumer<? super E> action)
- (package private) int getSuccessor(int entryIndex)
- private int hashTableMask()
  Gets the hash table mask using the stored number of hash table bits.
- (package private) void incrementModCount()
- (package private) void init(int expectedSize)
  Pseudoconstructor for serialization support.
- (package private) void insertEntry(int entryIndex, E object, int hash, int mask)
  Creates a fresh entry with the specified object at the specified position in the entry arrays.
- boolean isEmpty()
- (package private) boolean isUsingHashFloodingResistance()
- java.util.Iterator<E> iterator()
- (package private) void moveLastEntry(int dstIndex, int mask)
  Moves the last entry in the entry array into dstIndex, and nulls out its old position.
- (package private) boolean needsAllocArrays()
  Returns whether arrays need to be allocated.
- private void readObject(java.io.ObjectInputStream stream)
- boolean remove(java.lang.Object object)
- (package private) void resizeEntries(int newCapacity)
  Resizes the internal entries array to the specified capacity, which may be greater or less than the current capacity.
- private void resizeMeMaybe(int newSize)
  Resizes the entries storage if necessary.
- private int resizeTable(int mask, int newCapacity, int targetHash, int targetEntryIndex)
- private void setHashTableMask(int mask)
  Stores the hash table mask as the number of bits needed to represent an index.
- int size()
- java.util.Spliterator<E> spliterator()
- java.lang.Object[] toArray()
- <T> T[] toArray(T[] a)
- void trimToSize()
  Ensures that this CompactHashSet has the smallest representation in memory, given its current size.
- private void writeObject(java.io.ObjectOutputStream stream)
-
-
-
Field Detail
-
HASH_FLOODING_FPP
static final double HASH_FLOODING_FPP
Maximum allowed false positive probability of detecting a hash flooding attack given random input.
- See Also:
- Constant Field Values
-
MAX_HASH_BUCKET_LENGTH
private static final int MAX_HASH_BUCKET_LENGTH
Maximum allowed length of a hash table bucket before falling back to a j.u.LinkedHashSet based implementation. Experimentally determined.
- See Also:
- Constant Field Values
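The fallback that this constant controls can be sketched roughly as follows. Everything here is an assumption made for illustration: the class `FloodingFallbackSketch`, the threshold value 9, and the use of list-based buckets are not taken from this page, and the sketch ignores the false positive probability check that HASH_FLOODING_FPP refers to.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch: switch to a JDK set once one bucket grows too long. */
final class FloodingFallbackSketch<E> {
    static final int MAX_BUCKET_LENGTH = 9;  // assumed value, for illustration only

    private final List<List<E>> buckets = new ArrayList<>();
    private Set<E> delegate;                 // non-null once the fallback has happened

    FloodingFallbackSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<E> bucketFor(Object o) {
        return buckets.get(Math.floorMod(Objects.hashCode(o), buckets.size()));
    }

    boolean add(E e) {
        if (delegate != null) {
            return delegate.add(e);          // all work is forwarded after the switch
        }
        List<E> bucket = bucketFor(e);
        if (bucket.contains(e)) {
            return false;
        }
        bucket.add(e);
        if (bucket.size() > MAX_BUCKET_LENGTH) {
            convertToDelegate();             // probable hash flooding
        }
        return true;
    }

    boolean contains(Object o) {
        return delegate != null ? delegate.contains(o) : bucketFor(o).contains(o);
    }

    boolean isUsingDelegate() {
        return delegate != null;
    }

    private void convertToDelegate() {
        Set<E> replacement = new LinkedHashSet<>();
        for (List<E> bucket : buckets) {
            replacement.addAll(bucket);
        }
        buckets.clear();
        delegate = replacement;
    }
}
```

The design trades compactness for predictable worst-case behavior: once the delegate is in place, lookups rely on the JDK collection's own collision handling.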
-
table
private transient java.lang.Object table
The hashtable object. This can be either:
- a byte[], short[], or int[], with size a power of two, created by CompactHashing.createTable, whose values are either
  - UNSET, meaning "null pointer"
  - one plus an index into the entries and elements array
- another java.util.Set delegate implementation. In most modern JDKs, normal java.util hash collections intelligently fall back to a binary search tree if hash table collisions are detected. Rather than going to all the trouble of reimplementing this ourselves, we simply switch over to use the JDK implementation wholesale if probable hash flooding is detected, sacrificing the compactness guarantee in very rare cases in exchange for much more reliable worst-case behavior.
- null, if no entries have yet been added to the set
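The idea of picking the narrowest array type that can hold the table values might look like the sketch below. The class `TableWidthSketch`, its method names and the size thresholds (256 and 65536 buckets) are assumptions for illustration; this page does not state how CompactHashing.createTable actually chooses between the three types.

```java
/** Hypothetical sketch of a hash table stored in the narrowest fitting primitive array. */
final class TableWidthSketch {
    /** Returns a byte[], short[] or int[] with the given power-of-two length. */
    static Object createTable(int buckets) {
        if (buckets < 2 || Integer.bitCount(buckets) != 1) {
            throw new IllegalArgumentException("must be a power of 2 and at least 2: " + buckets);
        }
        if (buckets <= (1 << 8)) {           // assumed threshold
            return new byte[buckets];
        }
        if (buckets <= (1 << 16)) {          // assumed threshold
            return new short[buckets];
        }
        return new int[buckets];
    }

    /** Reads a slot as an unsigned value; 0 plays the role of UNSET. */
    static int get(Object table, int index) {
        if (table instanceof byte[]) {
            return ((byte[]) table)[index] & 0xFF;
        }
        if (table instanceof short[]) {
            return ((short[]) table)[index] & 0xFFFF;
        }
        return ((int[]) table)[index];
    }

    /** Writes a slot; the caller must ensure the value fits the element width. */
    static void set(Object table, int index, int value) {
        if (table instanceof byte[]) {
            ((byte[]) table)[index] = (byte) value;
        } else if (table instanceof short[]) {
            ((short[]) table)[index] = (short) value;
        } else {
            ((int[]) table)[index] = value;
        }
    }
}
```

Masking on read matters because Java's byte and short are signed: without `& 0xFF`, a stored value of 200 would come back as -56.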
-
entries
private transient int[] entries
Contains the logical entries, in the range of [0, size()). The high bits of each int are the part of the smeared hash of the element not covered by the hashtable mask, whereas the low bits are the "next" pointer (pointing to the next entry in the bucket chain), which will always be less than or equal to the hashtable mask.
  hash  = aaaaaaaa
  mask  = 0000ffff
  next  = 0000bbbb
  entry = aaaabbbb
The pointers in [size(), entries.length) are all "null" (UNSET).
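The bit layout above can be expressed with two masking operations. The helper names in `EntryPackingSketch` are hypothetical and chosen for this example; the arithmetic follows directly from the hash/mask/next/entry illustration.

```java
/** Hypothetical helpers for packing a hash prefix and a "next" pointer into one int. */
final class EntryPackingSketch {
    /** High bits come from the smeared hash, low bits from the next pointer. */
    static int pack(int smearedHash, int next, int mask) {
        return (smearedHash & ~mask) | (next & mask);
    }

    /** The part of the hash that the table index does not already encode. */
    static int hashPrefix(int entry, int mask) {
        return entry & ~mask;
    }

    /** The next pointer in the bucket chain. */
    static int next(int entry, int mask) {
        return entry & mask;
    }
}
```

With the values from the illustration, `pack(0xaaaaaaaa, 0x0000bbbb, 0x0000ffff)` yields `0xaaaabbbb`. Storing only the hash prefix loses nothing, since the low bits of the hash are implied by the bucket the entry is chained from.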
-
elements
transient java.lang.Object[] elements
The elements contained in the set, in the range of [0, size()). The elements in [size(), elements.length) are all null.
-
metadata
private transient int metadata
Keeps track of metadata like the number of hash table bits and modifications of this data structure (to make it possible to throw ConcurrentModificationException in the iterator). Note that we choose not to make this volatile, so we do less of a "best effort" to track such errors, for better performance.
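One way to keep both pieces of information in a single int is to reserve a few low bits for the number of hash table bits and use the remaining bits as a modification counter. The sketch below assumes a 5-bit field for the table bits; that width and all names in `MetadataSketch` are assumptions for illustration, not facts stated on this page.

```java
/** Hypothetical sketch: table-bit count and modification counter sharing one int. */
final class MetadataSketch {
    static final int BITS_FIELD_WIDTH = 5;                          // assumed width
    static final int BITS_FIELD_MASK = (1 << BITS_FIELD_WIDTH) - 1; // low 5 bits
    static final int MOD_COUNT_INCREMENT = 1 << BITS_FIELD_WIDTH;   // adds 1 above the field

    /** Replaces the low field, leaving the modification counter untouched. */
    static int withHashTableBits(int metadata, int bits) {
        return (metadata & ~BITS_FIELD_MASK) | (bits & BITS_FIELD_MASK);
    }

    static int hashTableBits(int metadata) {
        return metadata & BITS_FIELD_MASK;
    }

    /** Bumps the counter; overflow wraps around without disturbing the low field. */
    static int incrementModCount(int metadata) {
        return metadata + MOD_COUNT_INCREMENT;
    }

    /** An iterator could compare its snapshot against the current value like this. */
    static boolean modifiedSince(int snapshot, int current) {
        return (snapshot & ~BITS_FIELD_MASK) != (current & ~BITS_FIELD_MASK);
    }
}
```

An iterator built on this scheme would record the metadata when it is created and throw ConcurrentModificationException when a later check shows a different counter. Because the field is not volatile, such a check is best effort only, as the description notes.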
-
size
private transient int size
The number of elements contained in the set.
-
-
Method Detail
-
create
public static <E> CompactHashSet<E> create()
Creates an empty CompactHashSet instance.
-
create
public static <E> CompactHashSet<E> create(java.util.Collection<? extends E> collection)
Creates a mutable CompactHashSet instance containing the elements of the given collection in unspecified order.
- Parameters:
collection - the elements that the set should contain
- Returns:
a new CompactHashSet containing those elements (minus duplicates)
-
create
@SafeVarargs public static <E> CompactHashSet<E> create(E... elements)
Creates a mutable CompactHashSet instance containing the given elements in unspecified order.
- Parameters:
elements - the elements that the set should contain
- Returns:
a new CompactHashSet containing those elements (minus duplicates)
-
createWithExpectedSize
public static <E> CompactHashSet<E> createWithExpectedSize(int expectedSize)
Creates a CompactHashSet instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.
- Parameters:
expectedSize - the number of elements you expect to add to the returned set
- Returns:
a new, empty CompactHashSet with enough capacity to hold expectedSize elements without resizing
- Throws:
java.lang.IllegalArgumentException - if expectedSize is negative
-
init
void init(int expectedSize)
Pseudoconstructor for serialization support.
-
needsAllocArrays
boolean needsAllocArrays()
Returns whether arrays need to be allocated.
-
allocArrays
int allocArrays()
Handle lazy allocation of arrays.
-
delegateOrNull
java.util.Set<E> delegateOrNull()
-
createHashFloodingResistantDelegate
private java.util.Set<E> createHashFloodingResistantDelegate(int tableSize)
-
convertToHashFloodingResistantImplementation
java.util.Set<E> convertToHashFloodingResistantImplementation()
-
isUsingHashFloodingResistance
boolean isUsingHashFloodingResistance()
-
setHashTableMask
private void setHashTableMask(int mask)
Stores the hash table mask as the number of bits needed to represent an index.
-
hashTableMask
private int hashTableMask()
Gets the hash table mask using the stored number of hash table bits.
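Since a hash table mask always has the form 2^k - 1, it can be stored as the bit count k and rebuilt on demand. A minimal sketch of that round trip, using the hypothetical helper class `MaskBitsSketch`:

```java
/** Hypothetical helpers converting between a mask of the form 2^k - 1 and k. */
final class MaskBitsSketch {
    /** Number of bits needed to represent any index up to and including mask. */
    static int bitsForMask(int mask) {
        return Integer.SIZE - Integer.numberOfLeadingZeros(mask);
    }

    /** Rebuilds the mask; valid for bits in the range 0..31. */
    static int maskForBits(int bits) {
        return (1 << bits) - 1;
    }
}
```

For example, a mask of `0xffff` is stored as 16, and `maskForBits(16)` gives back `0xffff`. A count in the range 0..31 fits in five bits, which leaves the rest of an int free for other bookkeeping.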
-
incrementModCount
void incrementModCount()
-
add
public boolean add(E object)
-
insertEntry
void insertEntry(int entryIndex, E object, int hash, int mask)
Creates a fresh entry with the specified object at the specified position in the entry arrays.
-
resizeMeMaybe
private void resizeMeMaybe(int newSize)
Resizes the entries storage if necessary.
-
resizeEntries
void resizeEntries(int newCapacity)
Resizes the internal entries array to the specified capacity, which may be greater or less than the current capacity.
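Resizing two parallel arrays to a new capacity, larger or smaller, can be done with java.util.Arrays.copyOf, which pads with zeros or nulls when growing and truncates when shrinking. The class `ParallelArraysSketch` below is a hypothetical illustration of that approach and makes no claim about how the method is actually written.

```java
import java.util.Arrays;

/** Hypothetical sketch of resizing parallel entry and element arrays together. */
final class ParallelArraysSketch {
    int[] entries = new int[4];
    Object[] elements = new Object[4];

    /** Grows or shrinks both arrays; existing slots below newCapacity are preserved. */
    void resizeEntries(int newCapacity) {
        entries = Arrays.copyOf(entries, newCapacity);   // new slots are 0
        elements = Arrays.copyOf(elements, newCapacity); // new slots are null
    }
}
```

Keeping both arrays at the same length means a single index identifies an entry in both, and shrinking to the current size is the same operation with a smaller argument.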
-
resizeTable
private int resizeTable(int mask, int newCapacity, int targetHash, int targetEntryIndex)
-
contains
public boolean contains(java.lang.Object object)
-
remove
public boolean remove(java.lang.Object object)
-
moveLastEntry
void moveLastEntry(int dstIndex, int mask)
Moves the last entry in the entry array into dstIndex, and nulls out its old position.
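Filling the gap left by a removal with the last entry keeps the arrays dense, and it is also why a removal can change iteration order. The sketch below shows only the element move; `MoveLastSketch` is a hypothetical class, and it leaves out the bucket chain pointers that would also have to be redirected to the new position.

```java
/** Hypothetical sketch of removing from a dense array by moving the last element. */
final class MoveLastSketch {
    Object[] elements;
    int size;

    MoveLastSketch(Object... initial) {
        elements = initial.clone();
        size = initial.length;
    }

    /** Removes the element at dstIndex in constant time. */
    void removeAt(int dstIndex) {
        int srcIndex = size - 1;
        if (dstIndex < srcIndex) {
            elements[dstIndex] = elements[srcIndex]; // last element fills the gap
        }
        elements[srcIndex] = null;                   // null out the old position
        size--;
    }
}
```

Starting from a, b, c, d and removing index 1 leaves a, d, c: the set stays dense, but d now precedes c, so insertion order is no longer preserved.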
-
firstEntryIndex
int firstEntryIndex()
-
getSuccessor
int getSuccessor(int entryIndex)
-
adjustAfterRemove
int adjustAfterRemove(int indexBeforeRemove, int indexRemoved)
Updates the index an iterator is pointing to after a call to remove: returns the index of the entry that should be looked at after a removal on indexRemoved, with indexBeforeRemove as the index that *was* the next entry that would be looked at.
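To see why an iterator's position needs adjusting, consider a dense array in which a removal moves the last element into the freed slot: the slot just visited now holds an element that has not been seen yet. The sketch below assumes exactly that storage scheme and assumes that stepping back by one is the right adjustment for it; `IterateWithRemovalSketch` and its helpers are hypothetical.

```java
import java.util.function.Predicate;

/** Hypothetical sketch of iterating over a dense array while removing elements. */
final class IterateWithRemovalSketch {
    /** Assumed adjustment for the move-last scheme: revisit the refilled slot. */
    static int adjustAfterRemove(int indexBeforeRemove, int indexRemoved) {
        return indexBeforeRemove - 1;
    }

    /** Removes matching elements in place and returns the new size. */
    static int removeIf(Object[] elements, int size, Predicate<Object> shouldRemove) {
        int next = 0;
        while (next < size) {
            int current = next;
            next = current + 1;                      // successor in a dense array
            if (shouldRemove.test(elements[current])) {
                int last = --size;
                elements[current] = elements[last];  // last element fills the gap
                elements[last] = null;
                next = adjustAfterRemove(next, current);
            }
        }
        return size;
    }
}
```

Without the adjustment, the element moved into the gap would be skipped. With it, every surviving element is tested exactly once, although the order of the survivors may differ from the original order.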
-
iterator
public java.util.Iterator<E> iterator()
-
spliterator
public java.util.Spliterator<E> spliterator()
-
forEach
public void forEach(java.util.function.Consumer<? super E> action)
- Specified by:
forEach in interface java.lang.Iterable<E>
-
size
public int size()
-
isEmpty
public boolean isEmpty()
-
toArray
public java.lang.Object[] toArray()
-
toArray
public <T> T[] toArray(T[] a)
-
trimToSize
public void trimToSize()
Ensures that this CompactHashSet has the smallest representation in memory, given its current size.
-
clear
public void clear()
-
writeObject
private void writeObject(java.io.ObjectOutputStream stream) throws java.io.IOException
- Throws:
java.io.IOException
-
readObject
private void readObject(java.io.ObjectInputStream stream) throws java.io.IOException, java.lang.ClassNotFoundException
- Throws:
java.io.IOException
java.lang.ClassNotFoundException
-
-