
v2.0.0-rc.1

@decahedron1 released this 28 Mar 01:32 · 69c191d

Value specialization

The Value struct has been refactored into multiple strongly-typed structs: Tensor<T>, Map<K, V>, and Sequence<T>, and their type-erased variants: DynTensor, DynMap, and DynSequence.

Values returned by session inference are now DynValues, which behave exactly the same as Value in previous versions.

Tensors created from Rust, such as via the new Tensor::new function, can be directly and infallibly extracted into their underlying data via extract_tensor (no try_ prefix):

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;

let array = tensor.extract_tensor();
// no need to specify type or handle errors - Tensor<f32> can only extract into an f32 ArrayView

You can still extract tensors, maps, or sequence values normally from a DynValue using try_extract_*:

let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

A DynValue can be downcast() to one of the more specialized types, like DynMap or Tensor<T>:

let tensor: Tensor<f32> = value.downcast()?;
let map: DynMap = value.downcast()?;

Similarly, a strongly-typed value like Tensor<T> can be upcast back into a DynValue or DynTensor:

let dyn_tensor: DynTensor = tensor.upcast();
let dyn_value: DynValue = tensor.into_dyn();

Tensor extraction directly returns an ArrayView

extract_tensor (and the new try_extract_tensor) now return an ndarray::ArrayView directly, instead of wrapping it in the old ort::Tensor<T> type (not to be confused with the new specialized value type). This means you no longer need to call .view() on the result:

-let generated_tokens: Tensor<f32> = outputs["output1"].extract_tensor()?;
-let generated_tokens = generated_tokens.view();
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;

Full support for sequence & map values

You can now construct and extract Sequence/Map values.
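
As a rough sketch of what this could look like (Map::new and try_extract_map are assumed names here; the release itself only names the try_extract_* family, so check the value docs for the exact signatures):

let mut entries = std::collections::HashMap::new();
entries.insert("cat".to_string(), 0.7_f32);
entries.insert("dog".to_string(), 0.3_f32);

// Hypothetical constructor for a strongly-typed map value.
let map = Map::<String, f32>::new(entries)?;

// Erase the type and extract it back, mirroring the tensor API above (into_dyn borrowed from the Tensor example).
let dyn_value: DynValue = map.into_dyn();
let recovered = dyn_value.try_extract_map::<String, f32>()?;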

Value views

You can now obtain a view of any Value via the new view() and view_mut() functions, which operate similarly to ndarray's own view system. These views can also be passed directly as session inputs.
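
For example (a sketch; the ort::inputs! macro and the input name are illustrative rather than quoted from this release):

// Pass a non-owning view of the tensor as a session input - no copy is made.
let outputs = session.run(ort::inputs!["input" => tensor.view()]?)?;

// view_mut() hands back a mutable view for in-place edits.
let mut view = tensor.view_mut();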

Mutable tensor extraction

You can extract a mutable ArrayViewMut or &mut [T] from a mutable reference to a tensor.

let (raw_shape, raw_data) = tensor.extract_raw_tensor_mut();
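
The ndarray path presumably mirrors extract_tensor; assuming the mutable counterpart is named extract_tensor_mut:

// Assumed name: returns an ndarray::ArrayViewMut over the tensor's data.
let mut array = tensor.extract_tensor_mut();
array.fill(0.0);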

Device-allocated tensors

You can now create a tensor on device memory with Tensor::new & an allocator:

let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;

The data will be allocated on the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor's data, as sketched below.
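
Putting it together, a sketch of zero-filling the freshly allocated tensor with the mutable raw extraction shown above (the tensor binding needs to be mut):

let (_raw_shape, raw_data) = tensor.extract_raw_tensor_mut();
raw_data.fill(0.0); // write directly into the device-allocated (CUDA-pinned) buffer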

What if custom operators were 🚀 blazingly 🔥 fast 🦀?

You can now write custom operator kernels in Rust. Check out the custom-ops example.

Custom operator library feature change

Since custom operators can now be written completely in Rust, the old custom-ops feature, which enabled loading custom operators from an external dynamic library, has been renamed to operator-libraries.

Additionally, Session::with_custom_ops_lib has been renamed to Session::with_operator_library, and the confusingly named Session::with_enable_custom_ops (which does not enable custom operators in general, but rather attempts to load onnxruntime-extensions) has been updated to Session::with_extensions to reflect its actual behavior.
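
In builder form, the renamed calls look roughly like this (a sketch; Session::builder() and the model-loading method name are assumptions that may differ in this release):

let session = Session::builder()?
    // formerly with_custom_ops_lib
    .with_operator_library("path/to/libmy_ops.so")?
    // formerly with_enable_custom_ops; only loads onnxruntime-extensions
    .with_extensions()?
    // assumed loader name - adjust to the builder's actual model-loading method
    .commit_from_file("model.onnx")?;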

Asynchronous inference

Session introduces a new run_async method which returns inference results via a future. It's also cancel-safe, so you can simply cancel inference with something like tokio::select! or tokio::time::timeout.
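
A sketch of the cancel-safety in practice (the exact run_async signature and inputs! usage are assumptions, not quoted from this release):

// Cancel-safe: if the timeout fires, the in-flight run is simply cancelled.
let outputs = tokio::time::timeout(
    std::time::Duration::from_secs(1),
    session.run_async(ort::inputs!["input" => input_tensor.view()]?)?
).await??;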


If you have any questions about this release, we're here to help.

Love ort? Consider supporting us on Open Collective 💖

❤️💚💙💛