public static class MetaObjectProfiSCT.DatabaseSupport extends ExtendedDatabaseConnection
Modifier and Type | Class and Description |
---|---|
protected static class |
MetaObjectProfiSCT.DatabaseSupport.KeywordWeightProvider
Implements a database provider for keyword weights.
|
static class |
MetaObjectProfiSCT.DatabaseSupport.Stopword
Encapsulates a database stopword.
|
ExtendedDatabaseConnection.ExtendedDatabaseConnectionPublic
Constructor and Description |
---|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
java.lang.String wordLinkTable,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl,
java.util.Properties dbConnInfo,
java.lang.String dbDriverClass,
java.lang.String tableName,
java.lang.String wordLinkTable,
java.lang.String wordFrequencyTable,
java.lang.String stopwordTable,
java.lang.String[] stopwordCategories,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex)
Creates a new instance of DatabaseSupport.
|
Modifier and Type | Method and Description |
---|---|
MultiExtractor<? extends MetaObjectProfiSCT> |
createImageDirExtractor(java.lang.String extractorCommand,
boolean fileAsArgument,
boolean storeObjects)
Creates a new extractor that uses an external extractor on a directory of images.
|
Extractor<? extends MetaObjectProfiSCT> |
createImageExtractor(java.lang.String extractorCommand,
boolean storeObjects,
java.lang.String[] dataLineParameterNames)
Creates a new extractor that uses an external image extractor and additional parameters
to create instances of
MetaObjectProfiSCT . |
Extractor<? extends MetaObjectProfiSCT> |
createLocatorExtractor(java.lang.String locatorParamName,
java.lang.String additionalKeyWordsParamName,
boolean removeObjects)
Creates a new extractor that uses the locator parameter of the
ExtractorDataSource to get the respective object from the database. |
protected void |
deleteWordLinks(int objectId)
Deletes all word links for the given object id.
|
IntStorageSearch<MetaObjectProfiSCT> |
getAllObjects()
Returns a search over all objects in the storage.
|
IntStorageSearch<MetaObjectProfiSCT> |
getAllObjects(java.lang.Integer fromId,
java.lang.Integer toId)
Returns a search over the objects in the storage with identifiers between fromId and toId.
|
static java.util.Map<java.lang.String,DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT>> |
getDBColumnMap(boolean addTextStreamColumn,
Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean useLinkTable)
Returns the database column definitions for the
MetaObjectProfiSCT object. |
ObjectIntMultiVector.WeightProvider |
getKeywordWeightProvider(ObjectIntMultiVector keywords,
float[] weights)
Returns the weight provider for keywords based on tf-idf.
|
java.util.Map<java.lang.Integer,java.lang.Float> |
getKeywordWeights(ObjectIntMultiVector keywords)
Returns weights for keywords based on tf-idf.
|
static ObjectIntMultiVector.WeightProvider |
getStaticKeywordWeightProvider(float[] weights)
Returns the weight provider for keywords based on tf-idf that uses
pre-loaded static keyword weights.
|
Stemmer |
getStemmer()
Returns the current
Stemmer instance. |
java.util.Collection<MetaObjectProfiSCT.DatabaseSupport.Stopword> |
getStopwords()
Returns the stopwords loaded from database.
|
static DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT> |
getTextStreamColumnConvertor(Stemmer stemmer,
IntStorageIndexed<java.lang.String> wordIndex,
boolean useLinkTable)
Returns the database column convertor for creating the
MetaObjectProfiSCT object
from the text stream. |
static java.lang.String |
getTextStreamColumnName(boolean useLinkTable)
Returns the database column name for creating the
MetaObjectProfiSCT object
from the text stream. |
IntStorageIndexed<java.lang.String> |
getWordIndex()
Returns the current word-index instance.
|
java.util.Map<java.lang.Integer,java.lang.Float> |
initializeKeywordIdfWeights()
Initializes the internal keyword idf weights.
|
protected void |
insertWordLinks(int objectId,
ObjectIntMultiVector wordIds)
Inserts word links for the given object id.
|
java.util.Collection<MetaObjectProfiSCT> |
locatorsToObject(java.lang.String[] locators)
Returns a collection of objects with the given
locators . |
MetaObjectProfiSCT |
locatorToObject(java.lang.String locator)
Returns the object with given
locator . |
MetaObjectProfiSCT |
locatorToObject(java.lang.String locator,
boolean remove,
java.lang.String searchWords,
WordExpander expander)
Returns the object with given
locator . |
MetaObjectProfiSCT |
locatorToObject(java.lang.String locator,
java.lang.String searchWords,
WordExpander expander)
Returns the object with given
locator . |
java.lang.String |
locatorToThumbnail(java.lang.String locator)
Returns the thumbnail path of the object with the given locator.
|
java.util.List<java.lang.String> |
randomLocators(int count)
Returns a list of randomly generated locators from the database.
|
java.util.Collection<RankedAbstractObject> |
rankByKeywords(MetaObjectProfiSCT queryObject,
float[] keyWordWeights,
java.util.Iterator<? extends MetaObjectProfiSCT> iterator)
Returns a collection of ranked objects given by the
iterator with
the distances provided by the weighted Jaccard keyword distance
with word-frequency weights. |
java.util.Collection<RankedAbstractObject> |
rankByKeywords(java.lang.String[][] referenceKeywords,
float[] keyWordWeights,
java.util.Iterator<? extends MetaObjectProfiSCT> iterator)
Returns a collection of ranked objects given by
iterator with
the distances provided by the weighted Jaccard keyword distance
with word-frequency weights. |
void |
removeObject(int objectId)
Removes the object from the database storage.
|
java.util.Collection<RankedAbstractObject> |
rerankByKeywords(MetaObjectProfiSCT queryObject,
float[] keyWordWeights,
float originalRankWeight,
java.util.Iterator<? extends RankedAbstractObject> iterator)
Returns a collection of ranked objects given by the
iterator with
the distances provided by the weighted Jaccard keyword distance
with word-frequency weights. |
java.util.Collection<RankedAbstractObject> |
searchByText(MetaObjectProfiSCT object,
float[] weights,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
searchByText(java.lang.String[] searchKeywords,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
searchByText(java.lang.String[] titleWords,
java.lang.String[] keywordWords,
java.lang.String[] searchWords,
float[] weights,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
java.util.Collection<RankedAbstractObject> |
searchByText(java.lang.String text,
boolean useIdf,
int count)
Returns a collection of objects found by the text search.
|
int |
storeObject(MetaObjectProfiSCT object)
Stores the object into the database storage.
|
void |
updateObject(int objectId,
MetaObjectProfiSCT object)
Updates the object stored in the database storage.
|
int[] |
wordsToIdentifiers(java.lang.String[] words)
Transforms a list of words into an array of identifiers.
|
closeConnection, createConnection, createDriver, executeDataManipulation, executeSingleValue, finalize, getConnection, prepareAndExecute, prepareAndExecute, resultSetToMap, toString
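The constructors summarized above all take a JDBC connection URL and a java.util.Properties object with additional connection parameters. The following is a minimal sketch of preparing those two arguments using only the standard library. The class name, the credentials, and the "objects" table name in the commented-out call are placeholders, and that call assumes a stemmer and wordIndex created elsewhere.

```java
import java.util.Properties;

public class ConnectionArgsSketch {
    /** Builds the connection properties; "user" and "password" are the keys named in the parameter docs. */
    static Properties connectionInfo(String user, String password) {
        Properties info = new Properties();
        info.setProperty("user", user);
        info.setProperty("password", password);
        return info;
    }

    public static void main(String[] args) {
        // Example URL taken from the parameter documentation
        String dbConnUrl = "jdbc:mysql://localhost/somedb";
        // Placeholder credentials, for illustration only
        Properties dbConnInfo = connectionInfo("demo", "secret");
        System.out.println(dbConnUrl + " as user " + dbConnInfo.getProperty("user"));
        // With MESSIF on the classpath, the values would be passed on, for example:
        // new MetaObjectProfiSCT.DatabaseSupport(dbConnUrl, dbConnInfo, null, "objects", stemmer, wordIndex);
    }
}
```

Passing null as dbDriverClass follows the documented case where the driver is already registered.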
public MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl, java.util.Properties dbConnInfo, java.lang.String dbDriverClass, java.lang.String tableName, java.lang.String wordLinkTable, java.lang.String wordFrequencyTable, java.lang.String stopwordTable, java.lang.String[] stopwordCategories, Stemmer stemmer, IntStorageIndexed<java.lang.String> wordIndex) throws java.lang.IllegalArgumentException, java.sql.SQLException
Creates a new instance of DatabaseSupport.
dbConnUrl
- the database connection URL (e.g. "jdbc:mysql://localhost/somedb")
dbConnInfo
- additional parameters of the connection (e.g. "user" and "password")
dbDriverClass
- class of the database driver to use (can be null if the driver is already registered)
tableName
- the name of the table in the database
wordLinkTable
- the name of the table in the database that the word links are inserted into
wordFrequencyTable
- the name of the table in the database that the word frequencies are taken from (optional, if not set, the wordLinkTable is used)
stopwordTable
- the name of the table in the database where the stopwords are kept
stopwordCategories
- list of stopword categories to load, all categories are loaded if null
stemmer
- an instance that provides a Stemmer for word transformation
wordIndex
- the index for translating words to addresses
java.lang.IllegalArgumentException
- if the connection url is null or the driver class cannot be registered
java.sql.SQLException
- if there was a problem connecting to the database
public MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl, java.util.Properties dbConnInfo, java.lang.String dbDriverClass, java.lang.String tableName, java.lang.String wordLinkTable, Stemmer stemmer, IntStorageIndexed<java.lang.String> wordIndex) throws java.lang.IllegalArgumentException, java.sql.SQLException
Creates a new instance of DatabaseSupport.
dbConnUrl
- the database connection URL (e.g. "jdbc:mysql://localhost/somedb")
dbConnInfo
- additional parameters of the connection (e.g. "user" and "password")
dbDriverClass
- class of the database driver to use (can be null if the driver is already registered)
tableName
- the name of the table in the database
wordLinkTable
- the name of the table in the database that the word links are inserted into
stemmer
- an instance that provides a Stemmer for word transformation
wordIndex
- the index for translating words to addresses
java.lang.IllegalArgumentException
- if the connection url is null or the driver class cannot be registered
java.sql.SQLException
- if there was a problem connecting to the database
public MetaObjectProfiSCT.DatabaseSupport(java.lang.String dbConnUrl, java.util.Properties dbConnInfo, java.lang.String dbDriverClass, java.lang.String tableName, Stemmer stemmer, IntStorageIndexed<java.lang.String> wordIndex) throws java.lang.IllegalArgumentException, java.sql.SQLException
Creates a new instance of DatabaseSupport.
dbConnUrl
- the database connection URL (e.g. "jdbc:mysql://localhost/somedb")
dbConnInfo
- additional parameters of the connection (e.g. "user" and "password")
dbDriverClass
- class of the database driver to use (can be null if the driver is already registered)
tableName
- the name of the table in the database
stemmer
- an instance that provides a Stemmer for word transformation
wordIndex
- the index for translating words to addresses
java.lang.IllegalArgumentException
- if the connection url is null or the driver class cannot be registered
java.sql.SQLException
- if there was a problem connecting to the database
public static java.lang.String getTextStreamColumnName(boolean useLinkTable)
Returns the database column name for creating the MetaObjectProfiSCT object from the text stream.
useLinkTable
- flag whether to use the title and keyword link tables
public static DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT> getTextStreamColumnConvertor(Stemmer stemmer, IntStorageIndexed<java.lang.String> wordIndex, boolean useLinkTable)
Returns the database column convertor for creating the MetaObjectProfiSCT object from the text stream.
stemmer
- an instance that provides a Stemmer for word transformation
wordIndex
- the index for translating words to addresses
useLinkTable
- flag whether to use the title and keyword link tables
public static java.util.Map<java.lang.String,DatabaseStorage.ColumnConvertor<MetaObjectProfiSCT>> getDBColumnMap(boolean addTextStreamColumn, Stemmer stemmer, IntStorageIndexed<java.lang.String> wordIndex, boolean useLinkTable)
Returns the database column definitions for the MetaObjectProfiSCT object.
addTextStreamColumn
- flag whether to add the metaobject stream column to the resulting map
stemmer
- an instance that provides a Stemmer for word transformation
wordIndex
- the index for translating words to addresses
useLinkTable
- flag whether to use the title and keyword link tables
public java.util.Collection<MetaObjectProfiSCT.DatabaseSupport.Stopword> getStopwords()
Returns the stopwords loaded from database.
public Stemmer getStemmer()
Returns the current Stemmer instance.
public IntStorageIndexed<java.lang.String> getWordIndex()
Returns the current word-index instance.
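The stemmer and the word index exposed by the two getters above work together when words are translated to integer identifiers. The sketch below illustrates that idea with plain Java collections. It is not the MESSIF implementation: the toy stem method, the Map used as an index, and the choice to skip unknown and duplicate words are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordIdSketch {
    /** Toy stand-in for a stemmer: lower-cases the word and strips one trailing "s". */
    static String stem(String word) {
        String w = word.toLowerCase();
        return (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }

    /** Converts words to identifiers, skipping duplicates and words missing from the index (an assumption). */
    static int[] wordsToIds(String[] words, Map<String, Integer> index) {
        List<Integer> ids = new ArrayList<>();
        for (String word : words) {
            Integer id = index.get(stem(word));
            if (id != null && !ids.contains(id)) {
                ids.add(id);
            }
        }
        int[] result = new int[ids.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ids.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical word index: stemmed word -> identifier
        Map<String, Integer> index = new HashMap<>();
        index.put("tree", 1);
        index.put("lake", 2);
        int[] ids = wordsToIds(new String[] {"Trees", "lake", "tree", "unknown"}, index);
        System.out.println(Arrays.toString(ids)); // prints [1, 2]
    }
}
```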
public int[] wordsToIdentifiers(java.lang.String[] words) throws TextConversionException
Transforms a list of words into an array of identifiers.
words
- the list of keywords to transform
TextConversionException
- if there was an error stemming the word or reading the index
public java.lang.String locatorToThumbnail(java.lang.String locator) throws java.sql.SQLException
Returns the thumbnail path of the object with the given locator.
locator
- the locator of the object for which to get the thumbnail
java.sql.SQLException
- if there was a problem executing the SQL command
public java.util.List<java.lang.String> randomLocators(int count) throws java.sql.SQLException
Returns a list of randomly generated locators from the database.
count
- the number of random locators to retrieve
java.sql.SQLException
- if there was a problem executing the SQL command
public MetaObjectProfiSCT locatorToObject(java.lang.String locator, boolean remove, java.lang.String searchWords, WordExpander expander) throws ExtractorException
Returns the object with the given locator. The object is retrieved from the database.
locator
- the locator of the object to return
remove
- if true, the object is removed from the database after it is retrieved
searchWords
- the search words that will be encapsulated in the keyWords object as the third array
expander
- instance for expanding the list of search words
ExtractorException
- if there was a problem retrieving or instantiating the data
public MetaObjectProfiSCT locatorToObject(java.lang.String locator, java.lang.String searchWords, WordExpander expander) throws ExtractorException
Returns the object with the given locator. The object is retrieved from the database.
locator
- the locator of the object to return
searchWords
- the search words that will be encapsulated in the keyWords object as the third array
expander
- instance for expanding the list of search words
ExtractorException
- if there was a problem retrieving or instantiating the data
public MetaObjectProfiSCT locatorToObject(java.lang.String locator) throws ExtractorException
Returns the object with the given locator. The object is retrieved from the database.
locator
- the locator of the object to return
ExtractorException
- if there was a problem retrieving or instantiating the data
public java.util.Collection<MetaObjectProfiSCT> locatorsToObject(java.lang.String[] locators) throws ExtractorException
Returns a collection of objects with the given locators. The objects are retrieved from the database.
locators
- the locators of the objects to return
ExtractorException
- if there was a problem retrieving or instantiating the data
public java.util.Collection<RankedAbstractObject> searchByText(MetaObjectProfiSCT object, float[] weights, boolean useIdf, int count) throws TextConversionException
Returns a collection of objects found by the text search.
object
- the query object which provides the text
weights
- the weights to use for title, keyword, and search words
useIdf
- flag whether to use inverse-document-frequencies of the words (true) or a simpler weighted sum search (false)
count
- the number of objects to retrieve
TextConversionException
- if there was a problem executing the search on the database
public java.util.Collection<RankedAbstractObject> searchByText(java.lang.String text, boolean useIdf, int count) throws TextConversionException
Returns a collection of objects found by the text search.
text
- the text to search for
useIdf
- flag whether to use inverse-document-frequencies of the words (true) or a simpler weighted sum search (false)
count
- the number of objects to retrieve
TextConversionException
- if there was a problem executing the search on the database
public java.util.Collection<RankedAbstractObject> searchByText(java.lang.String[] searchKeywords, boolean useIdf, int count) throws TextConversionException
Returns a collection of objects found by the text search.
searchKeywords
- the words to search for
useIdf
- flag whether to use inverse-document-frequencies of the words (true) or a simpler weighted sum search (false)
count
- the number of objects to retrieve
TextConversionException
- if there was a problem executing the search on the database
public java.util.Collection<RankedAbstractObject> searchByText(java.lang.String[] titleWords, java.lang.String[] keywordWords, java.lang.String[] searchWords, float[] weights, boolean useIdf, int count) throws TextConversionException
Returns a collection of objects found by the text search. The words are unified using the stemmer and duplicate keywords are removed.
titleWords
- the title words to search for
keywordWords
- the keyword words to search for
searchWords
- the search words to search for
weights
- the weights to use for title, keyword, and search words
useIdf
- flag whether to use inverse-document-frequencies of the words (true) or a simpler weighted sum search (false)
count
- the number of objects to retrieve
TextConversionException
- if there was a problem executing the search on the database
public java.util.Collection<RankedAbstractObject> rerankByKeywords(MetaObjectProfiSCT queryObject, float[] keyWordWeights, float originalRankWeight, java.util.Iterator<? extends RankedAbstractObject> iterator) throws TextConversionException
Returns a collection of ranked objects given by the iterator with the distances provided by the weighted Jaccard keyword distance with word-frequency weights. An iterator of already ranked objects is provided, and a weight for combining the new ranking with the original rank can be given.
queryObject
- the query object with keywords using which the objects in the collection are ranked
keyWordWeights
- the weights for different layers of keywords (title, etc.)
originalRankWeight
- weight of the original object distance rank (if zero, only the new ranking based on tf-idf is used)
iterator
- the iterator that provides the objects to rank
TextConversionException
- if there was a problem retrieving keyword frequencies from the database
public java.util.Collection<RankedAbstractObject> rankByKeywords(MetaObjectProfiSCT queryObject, float[] keyWordWeights, java.util.Iterator<? extends MetaObjectProfiSCT> iterator) throws TextConversionException
Returns a collection of ranked objects given by the iterator with the distances provided by the weighted Jaccard keyword distance with word-frequency weights.
queryObject
- the query object with keywords using which the objects in the collection are ranked
keyWordWeights
- the weights for different layers of keywords (title, etc.)
iterator
- the iterator that provides the objects to rank
TextConversionException
- if there was a problem retrieving keyword frequencies from the database
public java.util.Collection<RankedAbstractObject> rankByKeywords(java.lang.String[][] referenceKeywords, float[] keyWordWeights, java.util.Iterator<? extends MetaObjectProfiSCT> iterator) throws TextConversionException
Returns a collection of ranked objects given by the iterator with the distances provided by the weighted Jaccard keyword distance with word-frequency weights. The given reference keywords are provided in different layers (title, etc.) in the respective sub-arrays. Note that the size of the outer array should not be bigger than the size of the keyWordWeights.
referenceKeywords
- the reference keywords using which the objects in the collection are ranked
keyWordWeights
- the weights for different layers of keywords (title, etc.)
iterator
- the iterator that provides the objects to rank
TextConversionException
- if there was a problem retrieving keyword frequencies from the database
protected void insertWordLinks(int objectId, ObjectIntMultiVector wordIds) throws java.sql.SQLException
Inserts word links for the given object id. [...] insertWordLinkSQL is null.
objectId
- the identifier of the object with the keywords
wordIds
- the identifiers of the keywords in the object
java.sql.SQLException
- if there was an error inserting word links
protected void deleteWordLinks(int objectId) throws java.sql.SQLException
Deletes all word links for the given object id. [...] deleteObjectWordLinksSQL is null.
objectId
- the identifier of the object with the keywords
java.sql.SQLException
- if there was an error deleting word links
public int storeObject(MetaObjectProfiSCT object) throws BucketStorageException
Stores the object into the database storage.
object
- the object to store
BucketStorageException
- if there was an error adding the object to storage
public void updateObject(int objectId, MetaObjectProfiSCT object) throws BucketStorageException
Updates the object stored in the database storage.
objectId
- the identifier of the object to update
object
- the object to store
BucketStorageException
- if there was an error updating the object in the storage
public void removeObject(int objectId) throws BucketStorageException
Removes the object from the database storage.
objectId
- the identifier of the object to remove
BucketStorageException
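- if there was an error removing the object from the storage
public IntStorageSearch<MetaObjectProfiSCT> getAllObjects()
Returns a search over all objects in the storage.
public IntStorageSearch<MetaObjectProfiSCT> getAllObjects(java.lang.Integer fromId, java.lang.Integer toId)
Returns a search over the objects in the storage with identifiers between fromId and toId.
fromId
- identifier of the starting object
toId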
- identifier of the ending object
public Extractor<? extends MetaObjectProfiSCT> createLocatorExtractor(java.lang.String locatorParamName, java.lang.String additionalKeyWordsParamName, boolean removeObjects)
Creates a new extractor that uses the locator parameter of the ExtractorDataSource to get the respective object from the database.
locatorParamName
- the name of the ExtractorDataSource parameter that contains the locator
additionalKeyWordsParamName
- the name of the ExtractorDataSource parameter that contains the additional keywords
removeObjects
- if true, the object is removed from the database after it is retrieved
public Extractor<? extends MetaObjectProfiSCT> createImageExtractor(java.lang.String extractorCommand, boolean storeObjects, java.lang.String[] dataLineParameterNames)
Creates a new extractor that uses an external image extractor and additional parameters to create instances of MetaObjectProfiSCT.
extractorCommand
- the external extractor command for extracting binary images
storeObjects
- if true, every object successfully extracted by this extractor is added to the encapsulated storage via storeObject(messif.objects.impl.MetaObjectProfiSCT)
dataLineParameterNames
- a list of names of the ExtractorDataSource
parameters that are appended to the extracted descriptors
public MultiExtractor<? extends MetaObjectProfiSCT> createImageDirExtractor(java.lang.String extractorCommand, boolean fileAsArgument, boolean storeObjects)
Creates a new extractor that uses an external extractor on a directory of images.
extractorCommand
- the external extractor command for extracting binary images
fileAsArgument
- if true, the "%s" argument of the external command is replaced with the filename
storeObjects
- if true, every object successfully extracted by this extractor is added to the encapsulated storage via storeObject(messif.objects.impl.MetaObjectProfiSCT)
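The fileAsArgument flag described above controls whether the "%s" placeholder in the extractor command is replaced with the file name. A minimal sketch of that substitution follows. The class, the buildCommand helper, and the "extract" command are hypothetical, and what the real extractor does when the flag is false is not stated here, so the sketch simply leaves the command unchanged in that case.

```java
public class ExtractorCommandSketch {
    /** Replaces every "%s" in the command template with the file name when fileAsArgument is set. */
    static String buildCommand(String extractorCommand, String fileName, boolean fileAsArgument) {
        return fileAsArgument ? extractorCommand.replace("%s", fileName) : extractorCommand;
    }

    public static void main(String[] args) {
        // "extract" is a made-up command name used only for illustration
        System.out.println(buildCommand("extract --input %s", "img/0001.jpg", true));
        System.out.println(buildCommand("extract --stdin", "img/0001.jpg", false));
    }
}
```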
public java.util.Map<java.lang.Integer,java.lang.Float> getKeywordWeights(ObjectIntMultiVector keywords) throws TextConversionException
Returns weights for keywords based on tf-idf.
keywords
- the keywords to read the weights for
TextConversionException
- if there was an error reading the keyword weights from the database
public ObjectIntMultiVector.WeightProvider getKeywordWeightProvider(ObjectIntMultiVector keywords, float[] weights) throws TextConversionException
Returns the weight provider for keywords based on tf-idf.
keywords
- the keywords to read the weights for
weights
- the weights for different layers of keywords (title, etc.)
TextConversionException
- if there was an error reading the keyword weights from the database
public static ObjectIntMultiVector.WeightProvider getStaticKeywordWeightProvider(float[] weights) throws TextConversionException
Returns the weight provider for keywords based on tf-idf that uses pre-loaded static keyword weights. initializeKeywordIdfWeights() must be used to load the static keyword weights. The returned object is serializable, but it requires that the static keyword weights are also initialized on the side that deserializes it.
weights
- the weights for different layers of keywords (title, etc.)
TextConversionException
- if there was an error reading the keyword weights from the database
public java.util.Map<java.lang.Integer,java.lang.Float> initializeKeywordIdfWeights() throws java.sql.SQLException
Initializes the internal keyword idf weights.
java.sql.SQLException
- if there was an error reading the keyword weights from the database
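The keyword weight methods above refer to idf weights, and the ranking methods refer to a weighted Jaccard keyword distance. The exact formulas used by this class are not given in this documentation, so the following is only a sketch under common textbook assumptions: idf(w) = ln(N / df(w)) and distance = 1 - (weight of shared keywords) / (weight of all keywords). All names in the sketch are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class KeywordWeightSketch {
    /** Computes idf(w) = ln(totalDocs / docFreq(w)) for every keyword identifier. */
    static Map<Integer, Float> idfWeights(Map<Integer, Integer> docFrequencies, int totalDocs) {
        Map<Integer, Float> idf = new HashMap<>();
        for (Map.Entry<Integer, Integer> e : docFrequencies.entrySet()) {
            idf.put(e.getKey(), (float) Math.log((double) totalDocs / e.getValue()));
        }
        return idf;
    }

    /** Weighted Jaccard distance: 1 - (weight of shared keywords) / (weight of all keywords). */
    static float weightedJaccard(Set<Integer> a, Set<Integer> b, Map<Integer, Float> weights) {
        Set<Integer> all = new HashSet<>(a);
        all.addAll(b);
        float intersection = 0f;
        float union = 0f;
        for (Integer id : all) {
            float w = weights.getOrDefault(id, 0f); // keywords without a weight count as zero
            union += w;
            if (a.contains(id) && b.contains(id)) {
                intersection += w;
            }
        }
        // Design choice of this sketch: no weighted keywords means maximal distance
        return union == 0f ? 1f : 1f - intersection / union;
    }

    public static void main(String[] args) {
        // Hypothetical document frequencies: keyword id -> number of objects containing it
        Map<Integer, Integer> docFreq = new HashMap<>();
        docFreq.put(1, 1);
        docFreq.put(2, 5);
        Map<Integer, Float> idf = idfWeights(docFreq, 10);
        Set<Integer> query = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> candidate = new HashSet<>(Arrays.asList(2));
        System.out.println("distance = " + weightedJaccard(query, candidate, idf));
    }
}
```

Rare keywords receive larger idf values, so sharing a rare keyword lowers the distance more than sharing a common one.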