Releases: quickwit-oss/tantivy

Tantivy 0.12

19 Feb 13:29
  • Removing static dispatch in tokenizers for simplicity. (#762)
  • Added backward iteration for TermDictionary stream. (@halvorboe)
  • Fixed a performance issue when searching for the posting lists of a missing term (@audunhalland)
  • Added a configurable maximum number of docs (10M by default) for a segment to be considered for merge (@hntd187, landed by @halvorboe #713)
  • Important bugfix for #777, which caused tantivy to retain memory mappings. (diagnosed by @poljar)
  • Added support for field boosting. (#547, @fulmicoton)

Tantivy 0.11.3

20 Dec 12:24
  • Fixed DateTime as a fast field (#735)

Tantivy 0.11.1

17 Dec 12:12

Tantivy 0.11.0

15 Dec 23:56
  • Added f64 field. Internally reuse u64 code the same way i64 does (@fdb-hiroshima)
  • Various bugfixes in the query parser.
    • Better handling of hyphens in query parser. (#609)
    • Better handling of whitespaces.
  • Closes #498: added support for Elastic-style unbounded range queries for alphanumeric types, e.g. "title:>hello", "weight:>=70.5", "height:<200" (@petr-tik)
  • API change around Box<BoxableTokenizer>. See detail in #629
  • Avoid rebuilding Regex automaton whenever a regex query is reused. #639 (@brainlock)
  • Add footer with some metadata to index files. #605 (@fdb-hiroshima)
  • Add a method to check the compatibility of the footer in the index with the running version of tantivy (@petr-tik)
  • TopDocs collector: ensure stable sorting on equal score. #671 (@brainlock)
  • Added handling of pre-tokenized text fields (#642), which will enable users to
    load tokens created outside tantivy. See usage in examples/pre_tokenized_text. (@kkoziara)
  • Fix crash when committing multiple times with deleted documents. #681 (@brainlock)
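
The f64 entry above says the new field type reuses the u64 code path the same way i64 does. One common way to achieve this is an order-preserving mapping from each signed type to u64, so that comparing the encoded values gives the same result as comparing the originals. The sketch below illustrates the idea in plain Rust. The function names are hypothetical, this is not claimed to be tantivy's exact implementation, and NaN values are not handled.

```rust
/// Maps an i64 to a u64 while preserving ordering:
/// flipping the sign bit moves negative values below positive ones.
fn i64_to_u64(val: i64) -> u64 {
    (val as u64) ^ (1u64 << 63)
}

/// Maps an f64 to a u64 while preserving ordering (NaN excluded).
/// Positive floats get their sign bit set; negative floats have all
/// bits inverted so that larger magnitudes sort lower.
fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if val.is_sign_positive() {
        bits ^ (1u64 << 63)
    } else {
        !bits
    }
}

/// Inverse of `f64_to_u64`.
fn u64_to_f64(val: u64) -> f64 {
    if val & (1u64 << 63) != 0 {
        // Top bit set: the original value was positive.
        f64::from_bits(val ^ (1u64 << 63))
    } else {
        // Top bit clear: the original value was negative.
        f64::from_bits(!val)
    }
}

fn main() {
    // Values are listed in increasing order.
    let values = [-70.5_f64, -1.0, 0.0, 0.25, 200.0];
    let encoded: Vec<u64> = values.iter().map(|&v| f64_to_u64(v)).collect();

    // The encoded values must be strictly increasing as well.
    assert!(encoded.windows(2).all(|w| w[0] < w[1]));

    // The mapping round-trips.
    for &v in &values {
        assert_eq!(u64_to_f64(f64_to_u64(v)), v);
    }

    // The same property holds for the i64 mapping.
    assert!(i64_to_u64(-1) < i64_to_u64(0));
}
```

With a mapping of this kind, range queries such as "weight:>=70.5" can be evaluated on the encoded u64 values without a separate floating-point code path.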

How to update?

  • The index format is changed. You are required to reindex your data to use tantivy 0.11.
  • Box<dyn BoxableTokenizer> has been replaced by a BoxedTokenizer struct.
  • Regexes are now compiled when the RegexQuery instance is built. As a result, building
    the query can now return an error, and handling the Result is required.
  • tantivy::version() now returns a Version object. This object implements ToString.

Tantivy 0.10.3

10 Nov 04:51
  • Fix crash when committing multiple times with deleted documents. #681 (@brainlock)

Tantivy 0.10.2

01 Oct 00:43

Hotfix for #656

Tantivy 0.10.1

30 Jul 02:58
  • Closes #544. A few users experienced problems with the directory watching system.
    Avoid watching the mmap directory until someone effectively creates a reader that uses
    this functionality.

Tantivy 0.10.0

11 Jul 10:15

Tantivy 0.10.0 index format is compatible with the index format in 0.9.0.

  • Added an API to easily tweak or entirely replace the
    default score. See TopDocs::tweak_score and TopDocs::custom_score (@pmasurel)
  • Added an ASCII folding filter (@drusellers)
  • Bugfix in query.count in presence of deletes (@pmasurel)
  • Added .explain(...) in Query and Weight (@pmasurel)
  • Added an efficient way to delete_all_documents in IndexWriter (@petr-tik).
    All segments are simply removed.

Minor

  • Switched to Rust 2018 (@uvd)
  • Small simplification of the code.
    Calling .freq() or .doc() when .advance() has never been called
    on segment postings should panic from now on.
  • Tokens exceeding u16::max_value() - 4 chars are discarded silently instead of panicking.
  • Fast fields are now preloaded when the SegmentReader is created.
  • IndexMeta is now public. (@hntd187)
  • IndexWriter is now Sync, making it possible to use it with an Arc<RwLock<IndexWriter>>.
    add_document and delete_term only require a read lock. (@pmasurel)
  • Introducing Opstamp as an expressive type alias for u64. (@petr-tik)
  • Stamper now relies on AtomicU64 on all platforms (@petr-tik)
  • Bugfix - Files get deleted slightly earlier
  • Compilation resources improved (@fdb-hiroshima)
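
The IndexWriter, Opstamp and Stamper entries above fit together: if operation stamps come from an atomic counter, methods like add_document can take &self, and several threads can then index through a shared read lock. The sketch below shows that pattern with the standard library only. MockWriter is a hypothetical stand-in and not tantivy's IndexWriter; the assumptions that add_document returns an Opstamp and that commit needs exclusive access are made for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

/// An operation stamp is just a u64 with a more expressive name.
type Opstamp = u64;

/// Hypothetical stand-in for an index writer (not tantivy's type).
struct MockWriter {
    /// Atomic counter handing out operation stamps.
    stamper: AtomicU64,
}

impl MockWriter {
    fn new() -> Self {
        MockWriter { stamper: AtomicU64::new(0) }
    }

    /// Takes `&self`, so holding a read lock is enough to call it.
    fn add_document(&self, _doc: &str) -> Opstamp {
        self.stamper.fetch_add(1, Ordering::SeqCst)
    }

    /// Takes `&mut self` (an assumption), so it needs the write lock.
    fn commit(&mut self) -> Opstamp {
        self.stamper.load(Ordering::SeqCst)
    }
}

/// Indexes documents from several threads, then commits once.
fn index_concurrently(num_threads: usize, docs_per_thread: usize) -> Opstamp {
    let writer = Arc::new(RwLock::new(MockWriter::new()));

    let handles: Vec<_> = (0..num_threads)
        .map(|t| {
            let writer = Arc::clone(&writer);
            thread::spawn(move || {
                for i in 0..docs_per_thread {
                    // Many threads may hold the read lock at the same time.
                    let guard = writer.read().unwrap();
                    guard.add_document(&format!("doc {}-{}", t, i));
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Committing requires exclusive access through the write lock.
    let opstamp = writer.write().unwrap().commit();
    opstamp
}

fn main() {
    let opstamp = index_concurrently(4, 100);
    println!("committed at opstamp {}", opstamp);
}
```

In this sketch the write lock is only taken for the commit, so indexing threads never block one another on the lock itself.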

How to update?

Your program should be usable as is.

Fast fields

Fast fields used to be accessed directly from the SegmentReader.
The API has changed: you are now required to acquire your fast field reader via
segment_reader.fast_fields(), and use one of the typed methods:

  • .u64(), .i64() if your field is single-valued;
  • .u64s(), .i64s() if your field is multi-valued;
  • .bytes() if your field is a bytes fast field.

Tantivy 0.9.1

28 Mar 00:59

Hotfix. All languages were using the English stemmer.

Tantivy 0.9

20 Mar 13:13

0.9.0 index format is not compatible with the previous index format.

Bugfix

Some Mmap objects were being leaked, and would never get released. (@fulmicoton)

New Features

  • Added IndexReader. By default, index is reloaded automatically upon new commits (@fulmicoton)
  • Stemming in other languages is now possible (@pentlander)
  • Added grouped add and delete operations.
    They are guaranteed to happen together (i.e. they cannot be split by a commit).
    In addition, adds are guaranteed to happen on the same segment. (@elbow-jason)
  • Added DateTime field (@barrotsteindev)

Misc improvements

  • Indexer memory footprint improved (VInt compression, inlining the first block). (@fulmicoton)
  • Removed most unsafe (@fulmicoton)
  • Segments with no docs are deleted earlier (@barrotsteindev)
  • Removed INT_STORED and INT_INDEXED. It is now possible to use STORED and INDEXED
    for int fields. (@fulmicoton)