Releases: huggingface/setfit

v1.1.0 - Sentence Transformers as the finetuning backend; tackle deprecations of other dependencies

19 Sep 09:28

This release introduces a new backend to finetune embedding models, based on the Sentence Transformers Trainer, tackles deprecations of other dependencies like transformers, deprecates Python 3.7 while adding support for new Python versions, and applies some other minor fixes. There shouldn't be any breaking changes.

Install this version with

pip install -U setfit

Defer the embedding model finetuning phase to Sentence Transformers (#554)

In SetFit v1.0, the old model.fit training from Sentence Transformers was replaced by a custom training loop that has some features the former was missing, such as loss logging and useful callbacks. However, Sentence Transformers v3 has since been released, and it added all of the features that were previously lacking. To simplify training moving forward, the training is now (once again) deferred to Sentence Transformers.

Because both the old and new training approaches are inspired by the transformers Trainer, there should not be any breaking changes. The primary notable change is that training now requires accelerate (as Sentence Transformers requires it), and we benefit from some of the Sentence Transformers training features, such as multi-GPU training.

Solve discrepancies with new versions of dependencies

To ensure compatibility with the latest versions of dependencies, the following issues have been addressed:

  • Follow the (soft) deprecation of evaluation_strategy in favor of eval_strategy (#538). This previously resulted in crashes if your transformers version was too new.
  • Avoid the now-deprecated DatasetFilter (#527). This previously resulted in crashes if your huggingface-hub version was too new.
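
A soft deprecation like the evaluation_strategy rename above usually means the old name keeps working for a while but emits a warning. The sketch below illustrates that general pattern only; normalize_training_kwargs is a hypothetical helper and is not part of SetFit or transformers.

```python
import warnings


def normalize_training_kwargs(kwargs: dict) -> dict:
    """Return a copy of `kwargs` with the deprecated key renamed.

    Hypothetical helper for illustration; not SetFit's or transformers' code.
    """
    normalized = dict(kwargs)
    if "evaluation_strategy" in normalized:
        old_value = normalized.pop("evaluation_strategy")
        warnings.warn(
            "`evaluation_strategy` is deprecated; use `eval_strategy` instead.",
            FutureWarning,
            stacklevel=2,
        )
        # If the caller also passed the new name, the new name wins.
        normalized.setdefault("eval_strategy", old_value)
    return normalized
```

In your own code, the simplest fix is to pass eval_strategy directly once you are on SetFit v1.1.0 or newer.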

Python version support

  • Following the deprecation of Python 3.7 by the Python team, SetFit also deprecates Python 3.7 moving forward. (#506)
  • We've added official support for Python 3.11 and 3.12 now that both are included in our test suite. (#550)

Minor changes

  • Firm up max_steps and eval_max_steps: rather than being a rough maximum limit, the limit is now exact. This can be helpful to avoid memory overflow, especially in situations with notable dataset imbalances. (#549)
  • Training and validation losses are now nicely logged in notebooks. (#557)
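
To illustrate what an exact step limit means in practice, here is a minimal sketch that lazily truncates a stream of training pairs so that no more than max_steps batches are ever produced. It is not SetFit's implementation; batches_with_exact_limit and the Pair type are hypothetical names, and this sketch assumes max_steps is non-negative.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# (sentence_a, sentence_b, similarity_label); a hypothetical pair layout.
Pair = Tuple[str, str, float]


def batches_with_exact_limit(
    pairs: Iterable[Pair], batch_size: int, max_steps: int
) -> Iterator[List[Pair]]:
    """Yield at most `max_steps` batches without consuming extra pairs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(pairs)
    for _ in range(max_steps):
        # Pull only as many pairs as one batch needs.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # The pair stream ran out before the limit was reached.
        yield batch
```

Because the pairs are pulled lazily, a heavily imbalanced dataset that could generate a very large number of pairs never materializes more than max_steps * batch_size of them in this sketch.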

Minor bug fixes

  • Fix bug where device parameter in SetFitHead is ignored if CUDA is not available. (#518)

Full Changelog: v1.0.3...v1.1.0

v1.0.3

16 Jan 17:12

This is a patch release with two notable fixes and a feature:

  • Training logs now correctly list the number of training examples (now called "unique pairs")
  • The number of warmup steps is now based on the actual number of training steps rather than args.max_steps whenever args.max_steps exceeds the number of steps. This prevents accidentally staying in warm-up for longer than the desired warmup proportion.
  • When training with string labels, the model now tries to automatically set the string labels to SetFitModel.labels if this variable hasn't been defined yet.
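
The warmup fix can be sketched as a small calculation: the warmup length is derived from the number of steps that will actually run, which is the smaller of the dataset-derived step count and max_steps. This is a sketch under assumptions, not SetFit's code; compute_warmup_steps is a hypothetical name, and treating max_steps <= 0 as "no limit" and rounding up are both assumptions.

```python
import math


def compute_warmup_steps(num_steps: int, max_steps: int, warmup_proportion: float) -> int:
    """Warmup length based on the steps that will actually be executed."""
    # Assumption: a non-positive max_steps means "no explicit limit".
    effective_steps = num_steps if max_steps <= 0 else min(num_steps, max_steps)
    # Assumption: round up so that a non-zero proportion gives at least one step.
    return math.ceil(effective_steps * warmup_proportion)


# 100 real steps with max_steps=1000: warmup covers a quarter of the 100 steps,
# not a quarter of 1000 (which would exceed the whole training run).
print(compute_warmup_steps(num_steps=100, max_steps=1000, warmup_proportion=0.25))
```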

Full Changelog: v1.0.2...v1.0.3

v1.0.2

11 Jan 14:50

What's Changed

  • Fix: Python-ify evaluation results before writing model card by @tomaarsen in #460
  • Resolve crash with predict_proba & multi-output by @tomaarsen in #466
  • Remove breaking shuffle DataLoader option by @tomaarsen in #470
  • Predict for ABSA models with a gold aspect dataset by @tomaarsen in #469
  • Prepare SetFit for upcoming 2.3.0 release of SentenceTransformers by @tomaarsen in #463

Full Changelog: v1.0.1...v1.0.2

v1.0.1

07 Dec 15:57

v1.0.1 Patch Release

  • Fixes a ConstructorError when saving a SetFitModel that was trained with custom evaluation metrics (#460)

v1.0.0

06 Dec 14:49

v1.0.0 Full SetFit Release

This release heavily refactors the SetFit trainer and introduces some much requested features, such as:

  • New Trainer, new TrainingArguments with many, many new arguments.
  • Configurable logging, automatic logging to Weights & Biases and Tensorboard if installed.
  • Evaluation during training, early stopping support to combat overfitting.
  • Checkpointing + loading the best model at the end.
  • SetFit for Aspect Based Sentiment Analysis in collaboration with Intel Labs.
  • Heavily improved automatic model card generation.
  • Extensive callbacks support based on transformers.
  • Full, extensive documentation: http://hf.co/docs/setfit
  • and more!

v1.0.0 Migration Guide

Read the v1.0.0 Migration Guide in the documentation: https://hf.co/docs/setfit/how_to/v1.0.0_migration_guide

v1.0.0 Detailed Release Notes

Read the more detailed release notes in the documentation: https://huggingface.co/docs/setfit/how_to/v1.0.0_migration_guide#v100-changelog

What's Changed

  • Preserve dataset features in sample_dataset by @grofte in #396
  • Allow other datasets in trainer.evaluate() by @grofte in #402
  • Normalize device to CPU when evaluating by @tomaarsen in #363
  • show_progress_bar as parameter on predict and predict_prob by @davidsbatista in #429
  • Refactor to introduce Trainer & TrainingArguments, add SetFit ABSA by @tomaarsen in #265
  • fix: make sampling more reproducible by @yahiaelgamal in #441
  • Allow setting batch size in SetFitModel.predict by @tomaarsen in #443
  • Save differentiable model head on CPU by @tomaarsen in #444
  • Allow 'device' on SetFitModel.from_pretrained() by @tomaarsen in #445
  • Add notebook to demonstrate how efficiently running SetFit with ONNX by @MosheWasserb in #435
  • Add "labels" to SetFitModel, store/load from configuration file by @tomaarsen in #447
  • Allow passing strings to model.predict by @tomaarsen in #448
  • Allow partial column mappings by @tomaarsen in #449
  • Allow normalize_embeddings with a differentiable head by @tomaarsen in #450
  • Heavily improve automatic model card generation by @tomaarsen in #452
  • Also pass metric_kwargs to custom metric callable by @tomaarsen in #456
  • Prepare v1.0.0 release - Trainer, TrainingArguments, SetFitABSA, logging, evaluation during training, callbacks, docs by @tomaarsen in #439

Full Changelog: v0.7.0...v1.0.0

v0.7.0

14 Apr 15:18

v0.7.0 Bug Fixes Galore

This release introduces numerous bug fixes, including critical ones for push_to_hub, save_pretrained and distillation training.

Bug fixes and improvements

  • Add a warning if an unsplit dataset is passed to SetFitTrainer by @jaalu in #299
  • Improve dataset pre-processing speeds for large datasets by @logan-markewich in #309
  • Add Path support to _save_pretrained, resolve TypeError: unsupported operand type(s) for +: 'PosixPath' and 'str' by @tomaarsen in #332
  • Add Hallmarks of Cancer notebook by @MosheWasserb in #333
  • Initialize SetFitModel with cls instead by @kobiche in #341
  • Allow distillation training with models using differentiable heads by @tomaarsen in #343
  • Prevent TypeError on model.predict when using string labels by @tomaarsen in #331
  • Restrict pandas to <2 for compatibility tests by @tomaarsen in #350
  • Update Trainer.push_to_hub to use **kwargs by @tomaarsen in #351
  • Add metric keyword arguments, e.g. add "average" strategy to f1 by @tomaarsen in #353

Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @jaalu
    • Add a warning if an unsplit dataset is passed to SetFitTrainer (#299)
  • @tomaarsen
    • Add comparison plotting script (#319)
    • Resolve IndexError if there is just one K-shot scenario
    • Reintroduce Usage in README until docs are ready
    • Add Path support to _save_pretrained (#332)
    • Allow distillation training with models using differentiable heads (#343)
    • Prevent TypeError on model.predict when using string labels (#331)
    • Restrict pandas to <2 for compatibility tests (#350)
    • Update Trainer.push_to_hub to use **kwargs (#351)
    • Add metric keyword arguments, e.g. add "average" strategy to f1 (#353)
  • @EdAbati
    • Add cache for 🤗 Hub models in the CI (#312)
    • Rerun hyperparameter search notebook (#321)
  • @MosheWasserb
    • Add Hallmarks of Cancer notebook (#333)

v0.6.0

08 Feb 13:59

v0.6.0 OpenVINO exporter, model cards, and various quality of life improvements 🔥

To bring in the new year, this release comes with many bug fixes and quality of life improvements around using SetFit models. It also provides:

  • an OpenVINO exporter that you can use to optimise your models for inference. Check out the notebooks for an example.
  • a dedicated model card with metadata and usage instructions. See here for an example output from push_to_hub(): https://huggingface.co/lewtun/setfit-new-model-card

Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @tomaarsen
    • Always install the checked-out setfit (#235)
    • Add SetFitModel.to (#229) (#236)
    • Prevent overriding the sample size in sample_dataset (#231)
    • Always display test coverage; add tests (#240)
    • Automatically create summary table after scripts/setfit/run_fewshot.py (#262)
    • Fix squared optimization steps bug in distillation trainer (#284)
    • Resolve SentenceTransformer resetting devices after moving a SetFitModel (#283)
    • Reformat according to the newest black version
    • Remove doubled space in warning message
    • Exclude compatibility versions from dev setup (#286)
  • @Yongtae723
    • add related work in readme (#239)
    • Fix type hints (#266)
    • Add multi-target support to SetFitHead (#272)
  • @danielkorat
    • Fix seed in trainer.py (#243)
    • add run_zeroshot.py; add functionality to data.get_templated_dataset() (formerly add_templated_examples()) (#292)
  • @AlexKoff88
    • Added support of OpenVINO export (#214)

v0.5.0 Knowledge distillation trainer & ONNX exporter

14 Dec 15:03

This release comes with two main features:

  • A DistillationSetFitTrainer class that allows users to use unlabeled data to significantly boost the performance of small models like MiniLM. See this workshop for an end-to-end example.
  • An ONNX exporter that converts SetFit model instances into ONNX graphs for downstream inference and optimisation. Check out the notebooks folder for an end-to-end example.

Kudos to @orenpereg and @nbertagnolli for implementing both of these features 🔥

v0.4.1 Patch release

08 Nov 15:26

Fixes an issue on Google Colab, where the default version of Python 3.7 is incompatible with the Literal type. See #162 for more details.
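
For context, typing.Literal was only added to the standard library in Python 3.8, so importing it on Python 3.7 raises an ImportError. One common way to stay compatible is a guarded import, sketched below. This is not necessarily how #162 was resolved; the Strategy alias and describe function are hypothetical names used only for illustration.

```python
try:
    # Python 3.8+ ships Literal in the standard library.
    from typing import Literal
except ImportError:
    # Python 3.7 fallback; requires the third-party typing_extensions package.
    from typing_extensions import Literal

# Hypothetical alias restricting a parameter to a fixed set of strings.
Strategy = Literal["one-vs-rest", "multi-output", "classifier-chain"]


def describe(strategy: Strategy) -> str:
    """Return a human-readable description of the chosen strategy."""
    return f"multi-target strategy: {strategy}"
```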

v0.4.0 Differentiable heads & various quality of life improvements

08 Nov 14:11

Differentiable heads for SetFitModel

@blakechi has implemented a differentiable head in PyTorch for SetFitModel that enables the model to be trained end-to-end. The implementation is backwards compatible with the scikit-learn heads and can be activated by setting use_differentiable_head=True when loading SetFitModel. Here's a full example:

from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss

from setfit import SetFitModel, SetFitTrainer


# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")

# Simulate the few-shot regime by sampling 8 examples per class
num_classes = 2
train_dataset = dataset["train"].shuffle(seed=42).select(range(8 * num_classes))
eval_dataset = dataset["validation"]

# Load a SetFit model from Hub
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    head_params={"out_features": num_classes},
)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=16,
    num_iterations=20, # The number of text pairs to generate for contrastive learning
    num_epochs=1, # The number of epochs to use for contrastive learning
    column_mapping={"sentence": "text", "label": "label"} # Map dataset columns to text/label expected by trainer
)

# Train and evaluate
trainer.freeze() # Freeze the head
trainer.train() # Train only the body

# Choose one of the two options below.
# Option 1: unfreeze the head and keep the body frozen -> head-only training
# trainer.unfreeze(keep_body_frozen=True)
# Option 2: unfreeze both the head and the body -> end-to-end training
trainer.unfreeze(keep_body_frozen=False)

trainer.train(
    num_epochs=25, # The number of epochs to train the head or the whole model (body and head)
    batch_size=16,
    body_learning_rate=1e-5, # The body's learning rate
    learning_rate=1e-2, # The head's learning rate
    l2_weight=0.0, # Weight decay on **both** the body and head. If `None`, will use 0.01.
)
metrics = trainer.evaluate()

# Push model to the Hub
trainer.push_to_hub("my-awesome-setfit-model")

# Download from Hub and run inference
model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")
# Run inference
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"]) 

Significant community contributions

The following contributors have made significant changes to the library over the last release:

  • @pdhall99
    • fix: allow load of pretrained model without head
    • fix: templated examples copy empty vector (#148)
  • @PhilipMay
    • add num_epochs to train_step calculation (#139)
    • redirect call to predict (#142)
    • Fix non default loss_class issue (#154)
    • Add more loss function options (#159)
  • @blakechi
    • Support for the differentiable head (#112)
    • Add the usage and relevant info. of the differentiable head to README (#149)
  • @mpangrazzi
    • Add support to kwargs in compute() method called by trainer.evaluate() (#125)