
Releases: pykeio/ort

v2.0.0-rc.6

10 Sep 21:46
ee5cc20

ort::Error refactor

ort::Error is no longer an enum, but rather an opaque struct with a message and a new ErrorCode field.

ort::Error still implements std::error::Error, so this change shouldn't be too breaking; however, if you were previously matching on ort::Error variants, you'll now need to match on the error's code instead (acquired via the Error::code() function).
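A minimal sketch of the new pattern, assuming ErrorCode is re-exported at the crate root (the variant name used here is an illustrative assumption; check the ErrorCode docs for the real set):

use ort::{Error, ErrorCode};

fn report(err: &Error) {
	// Match on the error's code rather than on enum variants.
	match err.code() {
		// `InvalidArgument` is an assumed variant name, for illustration only.
		ErrorCode::InvalidArgument => eprintln!("bad input: {err}"),
		_ => eprintln!("inference failed: {err}"),
	}
}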

AllocationDevice refactor

The AllocationDevice type has also been converted from an enum to a struct. Common devices like CUDA or DirectML are accessible via associated constants like AllocationDevice::CUDA & AllocationDevice::DIRECTML.
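For example, a minimal sketch using the MemoryInfo constructor shown in the rc.1 notes below (MemoryType::Default is an assumption here):

// Devices are now associated constants rather than enum variants.
let mem = MemoryInfo::new(AllocationDevice::CUDA, 0, AllocatorType::Device, MemoryType::Default)?;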

Features

  • 60f6eca Update to ONNX Runtime v1.19.2.
  • 9f4527c Added ModelMetadata::custom_keys() to get a Vec of all custom keys.
  • bfa791d Add various SessionBuilder options affecting compute & graph optimizations.
  • 5e6fc6b Expose the underlying Allocator API. You can now allocate & free buffers acquired from a session or operator kernel context.
  • 52422ae Added ValueType::Optional.
  • 2576812 Added the Vitis AI execution provider for new AMD Ryzen AI chips.
  • 41ef65a Added the RKNPU execution provider for certain Rockchip NPUs.
  • 6b3e7a0 Added KernelContext::par_for, allowing operator kernels to use ONNX Runtime's thread pool without needing an extra dependency on a crate like rayon.

Fixes

  • edcb219 Make environment initialization thread-safe. This should eliminate intermittent segfaults when running tests concurrently, as seen in #278.
  • 3072279 Linux dylibs no longer require version symlinks, fixing #269.
  • bc70a0a Fixed unsatisfiable lifetime bounds when creating Tensors from &CowArrays.
  • 6592b17 Providing more inputs than the model expects no longer segfaults.
  • b595048 Shave off dependencies by removing tracing's attributes feature - a --no-default-features build of ort now only builds 9 crates!
  • c7ddbdb Removed the operator-libraries feature - you can still use SessionBuilder::with_operator_library, it's just no longer gated behind the feature!

If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-rc.5

18 Aug 19:13
92541cd

Possibly breaking

  • Pre-built static libraries (i.e. not the cuda or tensorrt binaries) are now linked with /MD instead of /MT on Windows, meaning the MSVC CRT is no longer statically linked. This should resolve linking issues in some cases (particularly for crates using other FFI libraries), but may cause issues for others. I have personally tested this in 2 internal pyke projects that depend on ort & many FFI libraries and haven't encountered any issues, but your mileage may vary.

Definitely breaking

  • 069ddfd ort now depends on ndarray 0.16.
  • e2c4549 wasm32-unknown-unknown support has been removed.
    • Getting wasm32-unknown-unknown working in the first place was basically a miracle. Hacking ONNX Runtime to work outside of Emscripten took a lot of effort, but recent changes to Emscripten and ONNX Runtime have made this exponentially more difficult. Given that I am not adequately versed in ONNX Runtime's internals, the nigh-impossibility of debugging weird errors, and the vow I took to write as little C++ as possible ever since I learned Rust, it's no longer feasible for me to work on WASM support for ort.
    • If you were using ort in WASM, I suggest you use and/or support the development of alternative WASM-supporting ONNX inference crates like tract or WONNX.

Features

  • ab293f8 Update to ONNX Runtime v1.19.0.
  • ecf76f9 Use the URL hash for downloaded model filenames. Models previously downloaded & cached with commit_from_url will be redownloaded.
  • 9d25514 Add missing configuration keys for some execution providers.
  • 733b7fa New callbacks for the simple Trainer API, just like HF's TrainerCallbacks! This allows you to write custom logging/LR scheduling callbacks. See the updated train-clm-simple example for usage details.

Fixes

  • 1692d11 Fix OpenVINO EP option bugs.
  • 08aaa0f Fix DirectML output tensor extraction.

If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-rc.4

07 Jul 20:52
04da381

This release addresses important linking issues with rc3, particularly regarding CUDA on Linux.

cuDNN 9 is no longer required for CUDA 12 builds (but is still the default); set the ORT_CUDNN_VERSION environment variable to 8 to use cuDNN 8 with CUDA 12.


If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-rc.3

06 Jul 17:20
3dec017

Training

ort now supports a (currently limited) subset of ONNX Runtime's Training API. You can use the on-device Training API for fine-tuning, online learning, or even full pretraining, on any CPU or GPU.

The train-clm example pretrains a language model from scratch. There's also a 'simple' API and a related example, which offers an essentially one-line training solution akin to 🤗 Transformers' Trainer API:

trainer.train(
	TrainingArguments::new(dataloader)
		.with_lr(7e-5)
		.with_max_steps(5000)
		.with_ckpt_strategy(CheckpointStrategy::Steps(500))
)?;

You can learn more about training with ONNX Runtime in ONNX Runtime's training documentation. Please try it out and let us know how we can improve the training experience!

ONNX Runtime v1.18

ort now ships with ONNX Runtime v1.18.

The CUDA 12 build requires cuDNN 9.x, so if you're using CUDA 12, you need to update cuDNN. The CUDA 11 build still requires cuDNN 8.x.

IoBinding

IoBinding's previously rather unsound API has been reworked and actually documented.

Output selection & pre-allocation

Sometimes, you don't need to calculate all of the outputs of a session. Other times, you need to pre-allocate a session's outputs to save on slow device copies or expensive re-allocations. Now, you can do both of these things without IoBinding through a new API: OutputSelector.

let options = RunOptions::new()?.with_outputs(
	OutputSelector::no_default()
		.with("output")
		.preallocate("output", Tensor::<f32>::new(&Allocator::default(), [1, 3, 224, 224])?)
);

let outputs = model.run_with_options(inputs!["input" => input.view()]?, &options)?;

In this example, each call to run_with_options that uses the same options struct will reuse the same output allocation, saving the cost of re-allocating it; and any outputs other than output aren't calculated at all.

Value ergonomics

String tensors are now Tensor<String> instead of DynTensor. They also no longer require an allocator to be provided to create or extract them. Additionally, Maps can also have string keys, and no longer require allocators.

Since value specialization, IntoTensorElementType has described only primitive elements (i.e. f32, i64). This role has since moved to PrimitiveTensorElementType, a subtrait of IntoTensorElementType. If you have type bounds that depended on IntoTensorElementType, you probably want to update them to use PrimitiveTensorElementType instead.
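A minimal before/after sketch of such a bound (normalize is a hypothetical function):

-fn normalize<T: IntoTensorElementType>(tensor: &Tensor<T>) { /* ... */ }
+fn normalize<T: PrimitiveTensorElementType>(tensor: &Tensor<T>) { /* ... */ }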

Custom operators

Operator kernels now support i64, string, Vec<f32>, Vec<i64>, and TensorRef attributes, along with most other previously missing C API features.

Additionally, the API for adding an operator to a domain has been changed slightly; it is now .add::<Operator>() instead of .add(Operator).
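For example (a sketch; MyOp is a hypothetical operator type implementing the Operator trait, and the OperatorDomain constructor is an assumption based on the custom-ops example):

let domain = OperatorDomain::new("my.domain")?
	.add::<MyOp>()?;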

Other changes

  • 80be206 & 8ae23f2 Miscellaneous WASM build fixes.
  • 1c0a5e4 Allow downcasting ValueRef & ValueRefMut.
  • ce5aaba Add EnvironmentBuilder::with_telemetry.
    • pyke binaries were never compiled with telemetry support; only Microsoft-provided Windows builds of ONNX Runtime had telemetry enabled by default. If you are using Microsoft binaries, this now allows you to disable telemetry.
  • 23fce78 ExecutionProviderDispatch::error_on_failure will immediately error out session creation if the registration of an EP fails.
  • d59ac43 RunOptions is now taken by reference instead of via an Arc.
  • d59ac43 Add Session::run_async_with_options.
  • a92dd30 Enable support for SOCKS proxies when downloading binaries.
  • 19d66de Add AMD MIGraphX execution provider.
  • 882f657 Bundle libonnxruntime in library builds where crate-type=rlib/staticlib.
  • 860e449 Fix build for i686-pc-windows-msvc.
  • 1d89f82 Support pkg-config.

If you have any questions about this release, we're here to help.

Thank you to Florian Kasischke, cagnolone, Ryo Yamashita, and Julien Cretin for contributing to this release!

Thank you to Johannes Laier, Noah, Yunho Cho, Okabintaro, and Matouš Kučera, whose support made this release possible. If you'd like to support ort as well, consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-rc.2

27 Apr 00:19
467d127

Changes

  • f30ba57 Update to ONNX Runtime v1.17.3
    • New: CUDA 12 binaries. ort will automatically detect CUDA 12/11 in your environment and install the correct binary.
    • New: Binaries for ROCm on Linux.
    • Note that WASM is still on v1.17.1.
  • b12c43c Support for wasm32-unknown-unknown, wasm32-wasi
  • cedeb55 Swap specialized value upcast and downcast function names to reflect their actual meaning (thanks @/messense for pointing this out!)
  • de3bca4 Fix a segfault with custom operators.
  • 681da43 Fix compatibility with older versions of rustc.
  • 63a1818 Accept ValueRefMut as a session input.
  • 8383879 Add a function to create tensors from a raw device pointer, allowing you to create tensors directly from a CUDA buffer.
  • 4af33b1 Re-export ort-sys as ort::sys.

If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-rc.1

28 Mar 01:32
69c191d

Value specialization

The Value struct has been refactored into multiple strongly-typed structs: Tensor<T>, Map<K, V>, and Sequence<T>, and their type-erased variants: DynTensor, DynMap, and DynSequence.

Values returned by session inference are now DynValues, which behave exactly the same as Value in previous versions.

Tensors created from Rust, like via the new Tensor::new function, can be directly and infallibly extracted into their underlying data via extract_tensor (no try_):

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;

let array = tensor.extract_tensor();
// no need to specify type or handle errors - Tensor<f32> can only extract into an f32 ArrayView

You can still extract tensors, maps, or sequence values normally from a DynValue using try_extract_*:

let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

DynValue can be upcast()ed to the more specialized types, like DynMap or Tensor<T>:

let tensor: Tensor<f32> = value.upcast()?;
let map: DynMap = value.upcast()?;

Similarly, a strongly-typed value like Tensor<T> can be downcast back into a DynValue or DynTensor.

let dyn_tensor: DynTensor = tensor.downcast();
let dyn_value: DynValue = tensor.into_dyn();

Tensor extraction directly returns an ArrayView

extract_tensor (and now try_extract_tensor) now return an ndarray::ArrayView directly, instead of putting it behind the old ort::Tensor<T> type (not to be confused with the new specialized value type). This means you no longer need to call .view() on the result:

-let generated_tokens: Tensor<f32> = outputs["output1"].extract_tensor()?;
-let generated_tokens = generated_tokens.view();
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

Full support for sequence & map values

You can now construct and extract Sequence/Map values.

Value views

You can now obtain a view of any Value via the new view() and view_mut() functions, which operate similarly to ndarray's own view system. These views can now also be passed as session inputs.

Mutable tensor extraction

You can extract a mutable ArrayViewMut or &mut [T] from a mutable reference to a tensor.

let (raw_shape, raw_data) = tensor.extract_raw_tensor_mut();

Device-allocated tensors

You can now create a tensor on device memory with Tensor::new & an allocator:

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;

The data will be allocated on the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor's data.

What if custom operators were 🚀 blazingly 🔥 fast 🦀?

You can now write custom operator kernels in Rust. Check out the custom-ops example.

Custom operator library feature change

Since custom operators can now be written completely in Rust, the old custom-ops feature, which enabled loading custom operators from an external dynamic library, has been renamed to operator-libraries.

Additionally, Session::with_custom_ops_lib has been renamed to Session::with_operator_library, and the confusingly named Session::with_enable_custom_ops (which does not enable custom operators in general, but rather attempts to load onnxruntime-extensions) has been renamed to Session::with_extensions to reflect its actual behavior.

Asynchronous inference

Session introduces a new run_async method which returns inference results via a future. It's also cancel-safe, so you can simply cancel inference with something like tokio::select! or tokio::time::timeout.
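A minimal sketch of cancelling inference with a timeout (assumes a tokio runtime; the input name and duration are illustrative):

let outputs = tokio::time::timeout(
	std::time::Duration::from_secs(30),
	session.run_async(ort::inputs!["input" => input.view()]?)?
).await??;
// The first `?` after .await surfaces the timeout; the second surfaces inference errors.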


If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-alpha.4

28 Dec 06:07
0aec403

Features

  • af97600 Add support for extracting sequences & maps.

Changes

  • 153d7af Remove built-in ONNX Model Zoo structs (note that you can still use with_model_downloaded, just now only with URLs)

This is likely one of the last alpha releases before v2.0 becomes stable 🎉

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-alpha.3

15 Dec 23:51
dd8dcae

Fixes

  • 863f1f3 Pin Model Zoo URLs to the old repo structure; new models will be coming soon.

Features

  • 32e7fab Add ort::init_from (behind the load-dynamic feature) to set the path to the dylib at runtime; see the sketch after this list.
  • 52559e4 Cache downloaded binaries & models across all projects. Please update to save my bandwidth =)
  • 534a42a Removed with_log_level. Instead, logging level will be controlled entirely by tracing.
  • a9e146b Implement TryFrom<(Vec<i64>, Arc<Box<[T]>>)> for Value, making default-features = false more ergonomic
  • 152f97f Add Value::dtype() to get the dtype of a tensor value.
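A minimal sketch of init_from (the dylib path is illustrative; assumes the load-dynamic feature is enabled and that the builder is committed like ort::init()):

ort::init_from("/usr/lib/libonnxruntime.so").commit()?;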

Changes

  • 32e7fab Remove the dependency on once_cell.
  • acfa782 Remove the ORT_STRATEGY environment variable. No need to specify ORT_STRATEGY=system anymore; you only need to set ORT_LIB_LOCATION.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-alpha.2

28 Nov 04:57
b849219

Fixes

  • 6938da1 Fix compilation on Windows in some cases when using pyke binaries by linking to DirectML dependencies.
  • 2373ec5 Fix linking for Android. (#121)
  • b04a964 Fix linking for iOS and add profile option (#121)

Features

  • 30360eb Update binaries to ONNX Runtime v1.16.3; Linux libraries are now compiled on Ubuntu 20.04 to fix glibc issues.
  • 9ed222f Make XNNPACK & ArmNN structs public

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛

v2.0.0-alpha.1

22 Nov 16:29
deeff50

This is the first alpha release for ort v2.0. This update overhauls the API, improves performance, fixes bugs, and makes ort simpler to use than ever before.

Our shiny new website will help you migrate: https://ort.pyke.io/migrating/v2

Stuck? We're here to help.

A huge thank you to those who contributed to this release: Ben Harris, Ivan Krivosheev, Rui He, Andrea Corradi, & Lenny!

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛