public class ObjectIntMultiVectorJaccard extends ObjectIntMultiVector implements java.io.Serializable
Modifier and Type | Class and Description |
---|---|
static class | ObjectIntMultiVectorJaccard.WeightedJaccardDistanceFunction: Class for distance functions that compute distances between two ObjectIntMultiVectors using a non-metric weighted Jaccard coefficient. |
Nested classes/interfaces inherited from class ObjectIntMultiVector
ObjectIntMultiVector.ArrayMultiWeightProvider, ObjectIntMultiVector.MapMultiWeightProvider, ObjectIntMultiVector.MultiWeightIgnoreProvider, ObjectIntMultiVector.MultiWeightProvider, ObjectIntMultiVector.SDIteratorIntersectionResult, ObjectIntMultiVector.SortedDataIterator, ObjectIntMultiVector.WeightProvider
Nested classes/interfaces inherited from class LocalAbstractObject
LocalAbstractObject.DataEqualObject, LocalAbstractObject.TextStreamFactory<T extends LocalAbstractObject>, LocalAbstractObject.TrivialDistanceFunction
Fields inherited from class ObjectIntMultiVector
data
Fields inherited from class LocalAbstractObject
counterDistanceComputations, counterLowerBoundDistanceComputations, counterPrecomputedDistanceSavings, counterUpperBoundDistanceComputations, MAX_DISTANCE, MIN_DISTANCE, suppData, trivialDistanceFunction, UNKNOWN_DISTANCE
Constructor and Description |
---|
ObjectIntMultiVectorJaccard(BinaryInput input, BinarySerializator serializator): Creates a new instance of ObjectIntMultiVectorJaccard loaded from binary input buffer. |
ObjectIntMultiVectorJaccard(java.io.BufferedReader stream): Creates a new instance of ObjectIntMultiVectorJaccard from a text stream. The data is expected to be already sorted. |
ObjectIntMultiVectorJaccard(java.io.BufferedReader stream, int arrays): Creates a new instance of ObjectIntMultiVectorJaccard from a text stream. The data is expected to be already sorted. |
ObjectIntMultiVectorJaccard(int[]... data): Creates a new instance of ObjectIntMultiVectorJaccard. |
ObjectIntMultiVectorJaccard(int[][] data, boolean forceSort): Creates a new instance of ObjectIntMultiVectorJaccard. |
ObjectIntMultiVectorJaccard(int arrays, int dimension): Creates a new instance of randomly generated ObjectIntMultiVectorJaccard. |
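The stream-based constructors above expect already-sorted data, and their detail entries require each line to consist of comma-separated or space-separated numbers. A minimal sketch of preparing one such line outside the library follows. It assumes that "sorted" means ascending order, which this page does not state; `SortedLineSketch` and `parseSortedLine` are hypothetical names and not part of the documented API.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the documented class. */
final class SortedLineSketch {

    /**
     * Parses one line of comma- or space-separated integers and returns
     * them sorted in ascending order (the ordering is an assumption).
     *
     * @throws NumberFormatException if a token is not a valid integer
     */
    static int[] parseSortedLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        // Split on commas and/or runs of whitespace.
        String[] tokens = trimmed.split("[,\\s]+");
        int[] values = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Integer.parseInt(tokens[i]);
        }
        Arrays.sort(values);
        return values;
    }

    public static void main(String[] args) {
        // Mixed separators are accepted; the result is sorted.
        System.out.println(Arrays.toString(parseSortedLine("7, 3 12,5")));
    }
}
```

An array prepared this way could then be passed to a constructor that takes `int[]... data`, provided the ascending-order assumption holds for the library.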
Modifier and Type | Method and Description |
---|---|
static void | addMetaObjectKeywordString(MetaObject metaObject, java.lang.String jaccardDescriptorName, TextDescriptorFactory<? extends ObjectIntMultiVector> textDescriptorFactory, java.lang.String additionalKeywords): HACK method for adding search keywords to an existing Jaccard meta object. |
static TextDescriptorFactory<ObjectIntMultiVectorJaccard> | createTextDescriptorFactory(java.util.Set<java.lang.String> stopWords, Stemmer stemmer, IntStorageIndexed<java.lang.String> writableWordIndex, IntStorageIndexed<java.lang.String>[] readonlyWordIndexes): Creates a factory method that converts the given array of strings into ObjectIntMultiVectorJaccard. |
protected float | getDistanceImpl(LocalAbstractObject obj, float distThreshold): Implements the Jaccard coefficient distance function. |
float | getMaxDistance(): Returns a maximal possible distance for this class. |
static float | getMaximalDistance(): Returns a maximal possible distance for this class. |
float | getWeightedJaccardDistance(ObjectIntMultiVector obj, ObjectIntMultiVector.WeightProvider weightProviderThis, ObjectIntMultiVector.WeightProvider weightProviderObj): Implements a non-metric weighted Jaccard coefficient distance function. |
static float | getWeightedJaccardDistance(ObjectIntMultiVector o1, ObjectIntMultiVector.WeightProvider weightProviderO1, ObjectIntMultiVector o2, ObjectIntMultiVector.WeightProvider weightProviderO2): Computes a distance between two ObjectIntMultiVectors using a non-metric weighted Jaccard coefficient. |
Methods inherited from class ObjectIntMultiVector
binarySerialize, cloneRandomlyModify, dataEquals, dataHashCode, getBinarySize, getDimensionality, getSize, getSortedIterator, getVectorData, getVectorData, getVectorDataCount, getVectorDataItem, getVectorDataItemCount, sortData, toString, writeData
Methods inherited from class LocalAbstractObject
clearSurplusData, clone, clone, create, create, createMetaDistancesHolder, excludeUsingPrecompDist, getDistance, getDistance, getDistance, getDistanceFilter, getDistanceFilter, getDistanceFilter, getDistanceFilter, getDistanceLowerBound, getDistanceLowerBoundImpl, getDistanceStorePrecomputed, getDistanceStorePrecomputed, getDistanceStorePrecomputed, getDistanceUpperBound, getDistanceUpperBoundImpl, getFieldsForNames, getNormDistance, getPrecomputedDistance, getPrecomputedDistance, getRandomChar, getRandomNormal, chainDestroy, chainFilter, includeUsingPrecompDist, isDistanceCompatible, parseObjectComment, peekNextChar, readAttributesFromStream, readObjectComments, readObjectCommentsWithoutData, unchainFilter, write, write, writeAttributesToStream, writeObjectComment
Methods inherited from class AbstractObject
clone, getLocatorURI, getNoDataObject, getObjectKey, getObjectKey, getObjectLocatorURI, setObjectKey
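The createTextDescriptorFactory entry above describes converting strings into word identifiers with the help of stop words, a stemmer, and word indexes. A rough, library-free sketch of that general idea follows. `WordIdSketch`, its toy stemmer, and the `HashMap` standing in for a writable word index are all assumptions made for illustration; they do not reflect how Stemmer, IntStorageIndexed, or textsToWordIdentifiersMultiIndex actually behave.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration; not the library's text descriptor factory. */
final class WordIdSketch {
    private final Set<String> stopWords;
    // Stand-in for a writable word index: unknown words get the next free id.
    private final Map<String, Integer> wordIndex = new HashMap<>();

    WordIdSketch(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    /** Toy stemmer (assumption): strips one trailing "s" from longer words. */
    private static String stem(String word) {
        return word.length() > 3 && word.endsWith("s")
                ? word.substring(0, word.length() - 1)
                : word;
    }

    /** Converts text into a sorted array of distinct word identifiers. */
    int[] toSortedIds(String text) {
        TreeSet<Integer> ids = new TreeSet<>();
        for (String token : text.split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            String lower = token.toLowerCase(Locale.ROOT);
            if (stopWords.contains(lower)) {
                continue; // ignored word
            }
            String word = stem(lower);
            Integer id = wordIndex.get(word);
            if (id == null) {
                id = wordIndex.size() + 1;
                wordIndex.put(word, id);
            }
            ids.add(id);
        }
        int[] result = new int[ids.size()];
        int i = 0;
        for (int id : ids) {
            result[i++] = id;
        }
        return result;
    }
}
```

The output is sorted and free of duplicates, which matches the sorted-data expectation stated for the constructors on this page.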
public ObjectIntMultiVectorJaccard(int[][] data, boolean forceSort)
Creates a new instance of ObjectIntMultiVectorJaccard. If forceSort is false, the provided data are expected to be sorted.
Parameters:
data - the data content of the new object
forceSort - if false, the data is expected to be sorted

public ObjectIntMultiVectorJaccard(int[]... data)
Creates a new instance of ObjectIntMultiVectorJaccard.
Parameters:
data - the data content of the new object

public ObjectIntMultiVectorJaccard(int arrays, int dimension)
Creates a new instance of randomly generated ObjectIntMultiVectorJaccard.
Parameters:
arrays - the number of vector data arrays to create
dimension - number of dimensions to generate

public ObjectIntMultiVectorJaccard(java.io.BufferedReader stream, int arrays) throws java.io.IOException, java.lang.NumberFormatException
Creates a new instance of ObjectIntMultiVectorJaccard from a text stream. The data is expected to be already sorted.
Parameters:
stream - text stream to read the data from
arrays - number of arrays to read from the stream
Throws:
java.io.IOException - when an error occurs while reading from the given stream, or EOFException when the end of the given stream is reached
java.lang.NumberFormatException - when the line read from the given stream does not consist of comma-separated or space-separated numbers

public ObjectIntMultiVectorJaccard(java.io.BufferedReader stream) throws java.io.IOException, java.lang.NumberFormatException
Creates a new instance of ObjectIntMultiVectorJaccard from a text stream. The data is expected to be already sorted.
Parameters:
stream - text stream to read the data from
Throws:
java.io.IOException - when an error occurs while reading from the given stream, or EOFException when the end of the given stream is reached
java.lang.NumberFormatException - when the line read from the given stream does not consist of comma-separated or space-separated numbers

public ObjectIntMultiVectorJaccard(BinaryInput input, BinarySerializator serializator) throws java.io.IOException
Creates a new instance of ObjectIntMultiVectorJaccard loaded from binary input buffer.
Parameters:
input - the buffer to read the ObjectIntMultiVectorJaccard from
serializator - the serializator used to write objects
Throws:
java.io.IOException - if there was an I/O error reading from the buffer

public static TextDescriptorFactory<ObjectIntMultiVectorJaccard> createTextDescriptorFactory(java.util.Set<java.lang.String> stopWords, Stemmer stemmer, IntStorageIndexed<java.lang.String> writableWordIndex, IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Creates a factory method that converts the given array of strings into ObjectIntMultiVectorJaccard. The conversion uses the textsToWordIdentifiersMultiIndex method to process the strings.
Parameters:
stopWords - set of words that are ignored
stemmer - a Stemmer for word transformation
writableWordIndex - the index for translating words to addresses where the unknown words can be inserted
readonlyWordIndexes - the indexes for translating words to addresses

public static void addMetaObjectKeywordString(MetaObject metaObject, java.lang.String jaccardDescriptorName, TextDescriptorFactory<? extends ObjectIntMultiVector> textDescriptorFactory, java.lang.String additionalKeywords) throws TextConversionException
HACK method for adding search keywords to an existing Jaccard meta object.
Parameters:
metaObject - the metaobject encapsulation that contains the Jaccard object
jaccardDescriptorName - the name of the encapsulated Jaccard object
textDescriptorFactory - the string-to-identifiers factory
additionalKeywords - the additional keywords string to add
Throws:
TextConversionException - if there was an error converting the identifiers

protected float getDistanceImpl(LocalAbstractObject obj, float distThreshold)
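Implements the Jaccard coefficient distance function.
Specified by:
getDistanceImpl in class LocalAbstractObject
Parameters:
obj - the object to compute distance to
distThreshold - the threshold value on the distance

public float getMaxDistance()
Description copied from class: LocalAbstractObject
Returns a maximal possible distance for this class.
Overrides:
getMaxDistance in class LocalAbstractObject
See Also:
LocalAbstractObject.MAX_DISTANCE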
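getDistanceImpl above is documented only as implementing the Jaccard coefficient distance function. As a reminder of what that usually means, the sketch below computes 1 - |A ∩ B| / |A ∪ B| over two sorted arrays of distinct ints in a single merge pass. It assumes each object can be treated as one sorted set, ignores distThreshold, and defines the distance between two empty sets as 0. None of these choices is confirmed by this page, and `JaccardSketch` is a hypothetical name.

```java
/** Hypothetical illustration; not the library's getDistanceImpl. */
final class JaccardSketch {

    /**
     * Jaccard distance between two ascending arrays of distinct ints.
     * Both inputs must already be sorted; no check is performed.
     */
    static float distance(int[] a, int[] b) {
        int i = 0;
        int j = 0;
        int intersection = 0;
        // Merge pass: advance whichever side currently holds the smaller value.
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                intersection++;
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        int union = a.length + b.length - intersection;
        // Two empty sets: treated as identical here (an assumption).
        return union == 0 ? 0f : 1f - (float) intersection / union;
    }

    public static void main(String[] args) {
        // Two shared items out of four distinct items overall.
        System.out.println(distance(new int[] {1, 2, 3}, new int[] {2, 3, 4})); // prints 0.5
    }
}
```

The merge pass only works on sorted input, which may be one reason the constructors on this page insist on sorted data.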
public static float getMaximalDistance()
Returns a maximal possible distance for this class.
See Also:
LocalAbstractObject.MAX_DISTANCE

public static float getWeightedJaccardDistance(ObjectIntMultiVector o1, ObjectIntMultiVector.WeightProvider weightProviderO1, ObjectIntMultiVector o2, ObjectIntMultiVector.WeightProvider weightProviderO2) throws java.lang.NullPointerException
Computes a distance between two ObjectIntMultiVectors using a non-metric weighted Jaccard coefficient.
Parameters:
o1 - the object to compute distance from
weightProviderO1 - the weight provider for object o1
o2 - the object to compute distance to
weightProviderO2 - the weight provider for object o2
Returns:
the distance between object o1 and object o2
Throws:
java.lang.NullPointerException - if either weightProviderO1 or weightProviderO2 is null

public float getWeightedJaccardDistance(ObjectIntMultiVector obj, ObjectIntMultiVector.WeightProvider weightProviderThis, ObjectIntMultiVector.WeightProvider weightProviderObj)
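Implements a non-metric weighted Jaccard coefficient distance function. If weightProviderThis or weightProviderObj is null, the normal Jaccard distance is returned.
Parameters:
obj - the object to compute distance to
weightProviderThis - the weight provider for this object
weightProviderObj - the weight provider for obj
Returns:
the distance between this object and obj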
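The two getWeightedJaccardDistance variants above take one weight provider per object, and the instance variant falls back to the normal Jaccard distance when a provider is null. This page does not give the weighting formula, so the sketch below substitutes one well-known definition (sum of per-item minimum weights divided by sum of per-item maximum weights, with absent items weighing 0) purely for illustration. The library's own formula is described as non-metric and may well differ. `WeightedJaccardSketch` and its `Weights` interface are hypothetical stand-ins, not ObjectIntMultiVector.WeightProvider.

```java
/** Hypothetical illustration; the weighting formula is an assumption. */
final class WeightedJaccardSketch {

    /** Stand-in for a weight provider: maps an item to its weight. */
    interface Weights {
        float weightOf(int item);
    }

    private static final Weights UNIT = item -> 1f;

    /**
     * Returns 1 - (sum of min weights) / (sum of max weights) over two
     * ascending arrays of distinct ints. If either provider is null,
     * unit weights are used, which reduces to the plain Jaccard distance.
     */
    static float distance(int[] a, Weights wa, int[] b, Weights wb) {
        boolean unweighted = wa == null || wb == null;
        Weights left = unweighted ? UNIT : wa;
        Weights right = unweighted ? UNIT : wb;
        float shared = 0f;
        float total = 0f;
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                float x = left.weightOf(a[i]);
                float y = right.weightOf(b[j]);
                shared += Math.min(x, y);
                total += Math.max(x, y);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                total += left.weightOf(a[i++]);
            } else {
                total += right.weightOf(b[j++]);
            }
        }
        // Items left over on either side belong to the union only.
        while (i < a.length) {
            total += left.weightOf(a[i++]);
        }
        while (j < b.length) {
            total += right.weightOf(b[j++]);
        }
        return total == 0f ? 0f : 1f - shared / total;
    }

    public static void main(String[] args) {
        Weights heavyTwo = item -> item == 2 ? 2f : 1f;
        int[] a = {1, 2, 3};
        int[] b = {2, 3, 4};
        System.out.println(distance(a, null, b, null));         // unit weights
        System.out.println(distance(a, heavyTwo, b, heavyTwo)); // item 2 counts double
    }
}
```

Giving a shared item a larger weight on both sides pulls the two objects closer together, which is the general effect a weight provider is meant to have.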