Advantage: Use DNN models to do this.
Key word: DNN
Usage: Text-based query-to-document retrieval.
Retrieve documents related to an input query.
Proposed DNN architecture:
Term vector
A term vector uses each letter or word as a dimension; the value in that dimension is the count of that term.
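A minimal sketch of a word-level term vector may make this concrete. The function name `term_vector`, the whitespace tokenization, and the tiny vocabulary are assumptions for illustration only; the notes do not specify them.

```python
from collections import Counter


def term_vector(text, vocabulary):
    """Return one count per vocabulary word (hypothetical helper)."""
    # Assumption: lowercase the text and split on whitespace.
    counts = Counter(text.lower().split())
    # Words outside the vocabulary are ignored; missing words count as 0.
    return [counts[word] for word in vocabulary]


# Illustrative vocabulary, one dimension per word.
vocabulary = ["deep", "learning", "search", "web"]
vector = term_vector("deep learning for web search deep", vocabulary)
```

With a realistic vocabulary the vector has one dimension per distinct word, which is why its dimensionality becomes a problem.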
Word hashing
Word hashing reduces the dimensionality of bag-of-words term vectors by representing each word with its letter n-grams. In this paper, the authors used letter trigrams. Collisions, where two different words map to the same set of n-grams, are rare under this setting.
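The letter-trigram idea can be sketched as follows. The `#` word-boundary marker, the lowercasing, and the helper names `letter_ngrams` and `hash_text` are assumptions made for this example rather than details taken from the notes.

```python
from collections import Counter


def letter_ngrams(word, n=3, boundary="#"):
    """Split a word into letter n-grams (hypothetical helper)."""
    # Assumption: mark the start and end of the word with a boundary symbol.
    marked = f"{boundary}{word.lower()}{boundary}"
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


def hash_text(text, n=3):
    """Count letter n-grams over all whitespace-separated words."""
    counts = Counter()
    for word in text.split():
        counts.update(letter_ngrams(word, n))
    return counts


trigrams = letter_ngrams("good")
```

Here `trigrams` is `['#go', 'goo', 'ood', 'od#']`. The number of distinct letter trigrams is bounded by the alphabet size rather than by the vocabulary size, which is where the dimensionality reduction comes from.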
They have click-through logs that consist of a list of queries and their clicked documents. A click indicates that the query is related to the clicked document. So how do they train this model? Given a query and its clicked documents, they maximize the likelihood of the clicked documents. The likelihood of the clicked document is
The probability of a document given the semantic relevance score is
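One common way to turn relevance scores into a probability is a softmax over the candidate documents. The sketch below assumes that the relevance score is the cosine similarity between query and document vectors and that a smoothing factor `gamma` scales the scores. Neither detail is stated in these notes, and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def posterior(query_vec, doc_vecs, gamma=1.0):
    """Softmax over gamma-scaled relevance scores (assumed form)."""
    scores = [gamma * cosine(query_vec, d) for d in doc_vecs]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(query_vec, doc_vecs, clicked_index, gamma=1.0):
    """Loss for one query: -log P(clicked document | query)."""
    return -math.log(posterior(query_vec, doc_vecs, gamma)[clicked_index])
```

Maximizing the likelihood of the clicked documents is equivalent to minimizing this negative log-likelihood summed over all query and clicked-document pairs in the logs.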
They tried to maximize the