Package | Description |
---|---|
messif.objects.impl | Implementation of basic data objects. |
messif.objects.text | Support for text data. |
Modifier and Type | Method and Description |
---|---|
static void |
ObjectIntMultiVectorJaccard.addMetaObjectKeywordString(MetaObject metaObject,
java.lang.String jaccardDescriptorName,
TextDescriptorFactory<? extends ObjectIntMultiVector> textDescriptorFactory,
java.lang.String additionalKeywords)
HACK method for adding search keywords to an existing Jaccard meta object.
|
ObjectIntMultiVector.WeightProvider |
MetaObjectProfiSCT.DatabaseSupport.getKeywordWeightProvider(ObjectIntMultiVector keywords,
float[] weights)
Returns the weight provider for keywords based on tf-idf.
|
java.util.Map<java.lang.Integer,java.lang.Float> |
MetaObjectProfiSCT.DatabaseSupport.getKeywordWeights(ObjectIntMultiVector keywords)
Returns weights for keywords based on tf-idf.
|
java.lang.String[] |
MetaObjectProfiSCT.getKeywordWords(IntStorageIndexed<java.lang.String> wordIndex)
Returns the keywords of this object.
|
java.lang.String[] |
MetaObjectProfiSCT.getSearchWords(IntStorageIndexed<java.lang.String> wordIndex)
Returns the search words of this object.
|
static ObjectIntMultiVector.WeightProvider |
MetaObjectProfiSCT.DatabaseSupport.getStaticKeywordWeightProvider(float[] weights)
Returns the weight provider for keywords based on tf-idf that uses
pre-loaded static keyword weights.
|
java.lang.String[] |
MetaObjectProfiSCT.getTitleWords(IntStorageIndexed<java.lang.String> wordIndex)
Returns the title words of this object.
|
static ObjectIntMultiVectorJaccard |
CophirXmlParser.parseKeyWordsType(java.lang.String[] texts,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Parse the keywords descriptor data.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.rankByKeywords(MetaObjectProfiSCT queryObject,
float[] keyWordWeights,
java.util.Iterator<? extends MetaObjectProfiSCT> iterator)
Returns a collection of ranked objects given by the iterator, with
distances provided by the weighted Jaccard keyword distance
with word-frequency weights.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.rankByKeywords(java.lang.String[][] referenceKeywords,
float[] keyWordWeights,
java.util.Iterator<? extends MetaObjectProfiSCT> iterator)
Returns a collection of ranked objects given by the iterator, with
distances provided by the weighted Jaccard keyword distance
with word-frequency weights.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.rerankByKeywords(MetaObjectProfiSCT queryObject,
float[] keyWordWeights,
float originalRankWeight,
java.util.Iterator<? extends RankedAbstractObject> iterator)
Returns a collection of ranked objects given by the iterator, with
distances provided by the weighted Jaccard keyword distance
with word-frequency weights.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.searchByText(MetaObjectProfiSCT object,
float[] weights,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.searchByText(java.lang.String[] searchKeywords,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.searchByText(java.lang.String[] titleWords,
java.lang.String[] keywordWords,
java.lang.String[] searchWords,
float[] weights,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
MetaObjectProfiSCT.DatabaseSupport.searchByText(java.lang.String text,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
int[] |
MetaObjectProfiSCT.DatabaseSupport.wordsToIdentifiers(java.lang.String[] words)
Transforms a list of words into an array of identifiers.
|
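The rankByKeywords and rerankByKeywords methods above describe ranking by a weighted Jaccard keyword distance. The listing does not show how that distance is computed, so the following is only a minimal sketch of one common formulation (one minus the weight of the shared keywords divided by the weight of all keywords), not the library's implementation. The class name WeightedJaccardSketch, the per-identifier weight map, and the default weight of 1 for unlisted identifiers are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; this class is not part of MESSIF.
final class WeightedJaccardSketch {

    // Assumption: keyword identifiers without an explicit weight count as 1.
    static float weight(Map<Integer, Float> weights, int id) {
        return weights.getOrDefault(id, 1.0f);
    }

    // Distance = 1 - weight(intersection) / weight(union), in the range [0, 1].
    static float distance(int[] a, int[] b, Map<Integer, Float> weights) {
        Set<Integer> setA = new HashSet<>();
        for (int id : a) setA.add(id);
        Set<Integer> setB = new HashSet<>();
        for (int id : b) setB.add(id);

        float intersection = 0f;
        float union = 0f;
        for (int id : setA) {
            float w = weight(weights, id);
            union += w;
            if (setB.contains(id)) intersection += w;
        }
        for (int id : setB) {
            // Identifiers present in both sets were already counted above.
            if (!setA.contains(id)) union += weight(weights, id);
        }
        // Two empty keyword sets are treated as identical.
        return union == 0f ? 0f : 1f - intersection / union;
    }
}
```

For keyword sets {1, 2} and {2, 3} with weights 1, 2 and 1, the shared weight is 2 and the total weight is 4, so the sketch yields a distance of 0.5.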
Constructor and Description |
---|
MetaObjectProfiSCT.MultiWeightIgnoreProviderProfi(float[] weights,
float ignoreWeight,
java.lang.String[] ignoredKeywords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> keyWordIndex)
Creates a new instance of MultiWeightProvider with the given array of weights.
|
MetaObjectProfiSCT(MetaObjectProfiSCT object,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT from the given
MetaObjectProfiSCT and the given set of keywords.
|
MetaObjectProfiSCTiDIM(MetaObjectProfiSCT object,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex) |
Modifier and Type | Method and Description |
---|---|
T |
TextDescriptorFactory.createTextDescriptor(java.lang.String... strings)
Creates an object that represents a descriptor
for the given text strings.
|
java.lang.String[] |
WordExpander.expandWords(java.lang.String[] words)
Expand the list of words.
|
static java.lang.String[] |
TextConversion.identifiersToWords(IntStorageIndexed<java.lang.String> wordIndex,
int[] ids)
Convert the given array of word identifiers to words using the given storage.
|
java.lang.String |
Stemmer.stem(java.lang.String word)
Provides a stem for the given word.
|
static java.util.Set<java.lang.String> |
TextConversion.stemWords(java.util.Collection<java.lang.String> words,
Stemmer stemmer)
Processes the given collection of words by stemming.
|
static int[][] |
TextConversion.textsToWordIdentifiersMultiIndex(java.lang.String[] strings,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> writableWordIndex,
IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Transforms multiple strings of words into a multi-array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiers(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Transforms a string of words into an array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiers(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Transforms a string of words into an array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiersMultiIndex(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> writableWordIndex,
IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Transforms a string of words into an array of addresses.
|
static java.lang.String |
TextConversion.unifyWord(java.lang.String keyWord,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
boolean normalize)
Return a stemmed, non-ignored word.
|
static java.util.Collection<java.lang.String> |
TextConversion.unifyWords(java.lang.String[] keyWords,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
boolean normalize)
Return a collection of stemmed, non-ignored words.
|
static int[] |
TextConversion.wordsToIdentifiers(java.lang.String[] words,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean normalize)
Transforms a list of words into an array of addresses.
|
static int[] |
TextConversion.wordsToIdentifiers(java.lang.String[] words,
java.util.Set<java.lang.String> ignoreWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean normalize)
Transforms a list of words into an array of addresses.
|
static int |
TextConversion.wordsToIdentifiersRead(java.util.Collection<java.lang.String> words,
IntStorageIndexed<java.lang.String> wordIndex,
int[] identifiers,
int index)
Transforms a list of words into an array of addresses by reading the given word index.
|
static int |
TextConversion.wordsToIdentifiersStore(java.util.Collection<java.lang.String> words,
IntStorageIndexed<java.lang.String> wordIndex,
int[] identifiers,
int index)
Transforms a list of words into an array of addresses by storing the words
into the given word index and retrieving the generated identifiers.
|
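The TextConversion methods above describe a pipeline: split a string with a regular expression, drop stop words, stem what remains, and map each word to an integer identifier through a word index. The sketch below illustrates that flow under stated assumptions; it does not call the library. A UnaryOperator<String> stands in for Stemmer.stem, a plain Map<String, Integer> stands in for IntStorageIndexed<java.lang.String>, and the lower-casing, the removal of duplicates, and the scheme of assigning the next free integer to an unseen word are all hypothetical choices.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical illustration; this class is not part of MESSIF.
final class TextToIdsSketch {

    static int[] textToWordIdentifiers(String text, String splitRegexp,
            Set<String> stopWords, UnaryOperator<String> stemmer,
            Map<String, Integer> wordIndex) {
        // Split, normalize, filter and stem; the set removes duplicate stems
        // while keeping first-seen order.
        Set<String> unified = new LinkedHashSet<>();
        for (String raw : text.split(splitRegexp)) {
            String word = raw.toLowerCase(Locale.ROOT);
            if (word.isEmpty() || stopWords.contains(word)) continue;
            unified.add(stemmer.apply(word));
        }

        // Look each stem up in the index, storing unseen stems.
        int[] ids = new int[unified.size()];
        int i = 0;
        for (String word : unified) {
            Integer id = wordIndex.get(word);
            if (id == null) {
                // Assumption: identifiers are dense and assigned only here.
                id = wordIndex.size();
                wordIndex.put(word, id);
            }
            ids[i++] = id;
        }
        return ids;
    }
}
```

Storing unseen words mirrors the description of wordsToIdentifiersStore; a read-only variant in the spirit of wordsToIdentifiersRead would skip unknown words instead of adding them.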