Package | Description |
---|---|
messif.objects.impl | Implementation of basic data objects. |
messif.objects.text | Support for text data. |
Modifier and Type | Method and Description |
---|---|
Stemmer | MetaObjectProfiSCT.DatabaseSupport.getStemmer() Returns the current Stemmer instance. |
Modifier and Type | Method and Description |
---|---|
protected ObjectIntMultiVectorJaccard |
MetaObjectProfiSCT.convertWordsToIdentifiers(WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.Object title,
java.lang.Object keyword,
java.lang.Object search)
Converts the given title, keyword, and additional search words into an int multi-vector
object with the Jaccard distance function.
|
static CophirXmlParser |
CophirXmlParser.create(java.io.File file,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Factory method that parses the given CoPhIR XML file.
|
static CophirXmlParser |
CophirXmlParser.create(java.io.File xmlDir,
java.lang.String identifier,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Factory method that parses a CoPhIR XML file with the given identifier.
|
static TextDescriptorFactory<ObjectIntMultiVectorJaccard> |
ObjectIntMultiVectorJaccard.createTextDescriptorFactory(java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> writableWordIndex,
IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Creates a factory that converts the given array of strings into an
ObjectIntMultiVectorJaccard. |
static java.util.Map<java.lang.String,DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT>> |
MetaObjectProfiSCT.DatabaseSupport.getDBColumnMap(boolean addTextStreamColumn,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean useLinkTable)
Returns the database column definitions for the
MetaObjectProfiSCT object. |
static DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT> |
MetaObjectProfiSCT.DatabaseSupport.getTextStreamColumnConvertor(Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean useLinkTable)
Returns the database column convertor for creating the
MetaObjectProfiSCT object
from the text stream. |
static ObjectIntMultiVectorJaccard |
CophirXmlParser.parseKeyWordsType(java.lang.String[] texts,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Parse the keywords descriptor data.
|
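
Several of the methods above produce objects compared with a Jaccard distance over integer word identifiers. As a rough illustration of that measure only, here is a minimal sketch of the textbook Jaccard distance between two identifier arrays. The class `JaccardSketch` and its method are hypothetical names introduced for this example; they are not part of MESSIF, and `ObjectIntMultiVectorJaccard` may compute its distance differently (for instance across several vectors or with weights).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not MESSIF code.
final class JaccardSketch {

    // Textbook Jaccard distance: 1 - |A intersect B| / |A union B|,
    // where A and B are the sets of identifiers in the two arrays.
    static float jaccardDistance(int[] a, int[] b) {
        Set<Integer> setA = new HashSet<>();
        for (int id : a) {
            setA.add(id);
        }
        Set<Integer> setB = new HashSet<>();
        for (int id : b) {
            setB.add(id);
        }
        // Assumption: two empty sets are treated as identical (distance 0).
        if (setA.isEmpty() && setB.isEmpty()) {
            return 0f;
        }
        int intersection = 0;
        for (Integer id : setA) {
            if (setB.contains(id)) {
                intersection++;
            }
        }
        int union = setA.size() + setB.size() - intersection;
        return 1f - (float) intersection / union;
    }
}
```

With this definition, duplicate identifiers within one array do not affect the result, because both arrays are reduced to sets before comparison.
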
Constructor and Description |
---|
CophirXmlParser(Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new handler for parsing CoPhIR XML files.
|
CophirXmlParser(Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.String csvObjectName)
Creates a new handler for parsing CoPhIR XML files.
|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
java.lang.String wordLinkTable,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
java.lang.String wordLinkTable,
java.lang.String wordFrequencyTable,
java.lang.String stopwordTable,
java.lang.String[] stopwordCategories,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
MetaObjectProfiSCT.MetaObjectProfiSCTKwDistCosine(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.Float keywordsWeight,
float[] keywordLayerWeights)
Creates a new instance of MetaObjectProfiSCTKwDistCosine.
|
MetaObjectProfiSCT.MetaObjectProfiSCTKwDistJaccard(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.Float keywordsWeight,
float[] keywordLayerWeights)
Creates a new instance of MetaObjectProfiSCTKwDistJaccard.
|
MetaObjectProfiSCT.MultiWeightIgnoreProviderProfi(float[] weights,
float ignoreWeight,
java.lang.String[] ignoredKeywords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> keyWordIndex)
Creates a new instance of MultiWeightProvider with the given array of weights.
|
MetaObjectProfiSCT(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT.
|
MetaObjectProfiSCT(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.String searchString)
Creates a new instance of MetaObjectProfiSCT.
|
MetaObjectProfiSCT(MetaObjectProfiSCT object,
java.lang.String[] titleWords,
java.lang.String[] keywordWords,
java.lang.String[] searchWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT from the given
MetaObjectProfiSCT
and given title words, key words, and search words. |
MetaObjectProfiSCT(MetaObjectProfiSCT object,
java.lang.String titleString,
java.lang.String keywordString,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT from the given
MetaObjectProfiSCT
and given set of keywords. |
MetaObjectProfiSCT(MetaObjectProfiSCT object,
java.lang.String titleString,
java.lang.String keywordString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT from the given
MetaObjectProfiSCT
and given set of keywords. |
MetaObjectProfiSCT(MetaObjectProfiSCT object,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of MetaObjectProfiSCT from the given
MetaObjectProfiSCT
and given set of keywords. |
MetaObjectProfiSCTiDIM.MetaObjectProfiSCTiDIMKwDistCosine(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.Float keywordsWeight,
float[] keywordLayerWeights)
Creates a new instance of MetaObjectProfiSCTiDIMKwDistCosine.
|
MetaObjectProfiSCTiDIM(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex) |
MetaObjectProfiSCTiDIM(java.io.BufferedReader stream,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
java.lang.String searchString) |
MetaObjectProfiSCTiDIM(MetaObjectProfiSCT object,
java.lang.String titleString,
java.lang.String keywordString,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex) |
MetaObjectProfiSCTiDIM(MetaObjectProfiSCT object,
java.lang.String titleString,
java.lang.String keywordString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex) |
MetaObjectProfiSCTiDIM(MetaObjectProfiSCT object,
java.lang.String searchString,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex) |
Modifier and Type | Method and Description |
---|---|
static java.util.Set<java.lang.String> |
TextConversion.stemWords(java.util.Collection<java.lang.String> words,
Stemmer stemmer)
Processes the given collection of words by stemming.
|
static int[][] |
TextConversion.textsToWordIdentifiersMultiIndex(java.lang.String[] strings,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> writableWordIndex,
IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Transforms multiple strings of words into a multi-array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiers(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Transforms a string of words into an array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiers(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Transforms a string of words into an array of addresses.
|
static int[] |
TextConversion.textToWordIdentifiersMultiIndex(java.lang.String string,
java.lang.String stringSplitRegexp,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> writableWordIndex,
IntStorageIndexed<java.lang.String>[] readonlyWordIndexes)
Transforms a string of words into an array of addresses.
|
static java.lang.String |
TextConversion.unifyWord(java.lang.String keyWord,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
boolean normalize)
Returns a stemmed, non-ignored word.
|
static java.util.Collection<java.lang.String> |
TextConversion.unifyWords(java.lang.String[] keyWords,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
Stemmer stemmer,
boolean normalize)
Returns a collection of stemmed, non-ignored words.
|
static int[] |
TextConversion.wordsToIdentifiers(java.lang.String[] words,
java.util.Set<java.lang.String> ignoreWords,
java.util.Set<java.lang.String> stopWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean normalize)
Transforms a list of words into an array of addresses.
|
static int[] |
TextConversion.wordsToIdentifiers(java.lang.String[] words,
java.util.Set<java.lang.String> ignoreWords,
WordExpander expander,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean normalize)
Transforms a list of words into an array of addresses.
|
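
The `TextConversion` methods listed above share a common shape: split text with a regular expression, drop stop words, stem what remains, and map each stem to an integer address in a word index. The following is a minimal, self-contained sketch of that general pipeline. Everything in it is an assumption made for illustration: the class `TextToIdsSketch`, the use of `UnaryOperator<String>` in place of `Stemmer`, and a plain `Map` in place of `IntStorageIndexed<String>` are stand-ins, not the MESSIF API, and the real methods may differ in details such as normalization, duplicate handling, or how unknown words are stored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical illustration, not MESSIF code.
final class TextToIdsSketch {

    // Splits the text, filters stop words, stems each word, and returns
    // the identifier of every resulting stem in order of appearance.
    static int[] textToWordIdentifiers(String text,
                                       String splitRegexp,
                                       Set<String> stopWords,
                                       UnaryOperator<String> stemmer,
                                       Map<String, Integer> wordIndex) {
        List<Integer> ids = new ArrayList<>();
        for (String token : text.split(splitRegexp)) {
            // Assumption: words are normalized by lower-casing only.
            String word = token.toLowerCase(Locale.ROOT);
            if (word.isEmpty() || stopWords.contains(word)) {
                continue;
            }
            String stem = stemmer.apply(word);
            // Assumption: unknown stems get the next free identifier.
            Integer id = wordIndex.get(stem);
            if (id == null) {
                id = wordIndex.size();
                wordIndex.put(stem, id);
            }
            ids.add(id);
        }
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ids.get(i);
        }
        return result;
    }
}
```

Passing the same `wordIndex` map to repeated calls keeps identifiers stable across texts, which is what makes the resulting arrays comparable with a set-based distance.
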