Evaluating the Use of Neural Ranking Methods in Search Engines
Abstract
A search engine must balance effectiveness and efficiency to retrieve the best documents in a scalable way. Recent deep learning-based rankers have proven effective and have advanced the state of the art on relevance metrics. However, unlike index-based retrieval methods, neural rankers such as BERT do not scale to large datasets. In this thesis, we propose a query term weighting method that can be used with a standard inverted index without modifying it. Query term weights are learned with a pairwise ranking loss from pairs of relevant and irrelevant documents for each query. The learned weights prove more effective than the term recall values previously used for the task. We further show that these weights can be predicted with a BERT regression model and that they improve the performance of both a BM25-based index and an index already optimized with a term weighting function. In addition, we examine document term weighting methods from the literature that work by manipulating term frequencies or expanding documents for document retrieval tasks. Predicting weights for document terms from contextual knowledge about the document, rather than from term frequencies, significantly increases retrieval and ranking performance.
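As a rough illustration of the idea of learning query term weights with a pairwise ranking loss, the following minimal sketch may help. It is not the thesis implementation: the hinge form of the loss, the margin, the learning rate, the function names, and the toy per-term scores (standing in for BM25 contributions read from an unmodified inverted index) are all assumptions made for this example.

```python
# Hypothetical sketch: learn one weight per query term with a pairwise hinge loss.
# Each document is represented by its per-term retrieval scores for the query
# (assumed here to be BM25 contributions looked up in a standard inverted index).

def weighted_score(weights, term_scores):
    """Document score: per-term scores scaled by the query term weights."""
    return sum(w * s for w, s in zip(weights, term_scores))

def pairwise_hinge_loss(weights, pos_scores, neg_scores, margin=1.0):
    """Zero once the relevant document outscores the irrelevant one by `margin`."""
    gap = weighted_score(weights, pos_scores) - weighted_score(weights, neg_scores)
    return max(0.0, margin - gap)

def learn_query_term_weights(pairs, n_terms, lr=0.1, epochs=50, margin=1.0):
    """Subgradient descent over (relevant, irrelevant) document pairs."""
    weights = [1.0] * n_terms  # start from unweighted scoring
    for _ in range(epochs):
        for pos, neg in pairs:
            if pairwise_hinge_loss(weights, pos, neg, margin) > 0.0:
                # Move each weight in the direction that widens the score gap.
                weights = [w + lr * (p - n) for w, p, n in zip(weights, pos, neg)]
    return weights

# Toy data (invented): a three-term query where the second term is the one
# that actually separates relevant from irrelevant documents.
pairs = [
    ([0.2, 1.5, 0.1], [1.8, 0.1, 0.2]),
    ([0.3, 1.2, 0.0], [1.5, 0.2, 0.1]),
]
learned = learn_query_term_weights(pairs, n_terms=3)
```

With unit weights the irrelevant document in the first toy pair scores higher; after training, the second term receives the largest weight and both relevant documents are ranked first. Because the weights only rescale per-term scores at query time, the index itself stays unchanged in this sketch.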