Skip to content

felipehummel/LuceneQueryingUtils

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 

Repository files navigation

FlexibleANDQuery

A Lucene Query implementation that behaves like a typical AND query. But you may choose to re-run it if there wasn't enough results.

At each re-run it removes the 'least important' term of the query, by basically changing it from MUST to SHOULD occur. The notion of 'least important' is customizable through a given Comparator<String>.

If no comparator is given, the doc frequency of a term is used (more docs, more 'importance').

With FlexibleANDQuery you guarantee that documents with all query terms appear in the top of the ranking before other documents. But you can still return documents that do not have all query terms. Normally, the vector space model (TF-IDF) takes care of that, but not always.

Typical usage:

final FlexibleANDQuery query = new FlexibleANDQuery(keyword_query, "content", analyzer, current_reader);
boolean tryAgain = true;
TopDocs top_docs;
while (tryAgain) {
        top_docs = searcher.search(query, 20);
        if (top_docs.scoreDocs.length > 0 || nquery.getNumberOfMustOccurTerms() < 2) 
    		tryAgain = false;
} 
// ...
// use top_docs

CountingCollector

A simple counting collector. It does not generate any kind of document ranking.

About

Some classes to use with the Lucene search library

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages