
Upgrading to v2.0 API

Brice Goglin edited this page Dec 11, 2017 · 45 revisions

Here is a list of recommended ways to work around API changes.

Detecting the hwloc version

To detect whether you're building against hwloc 2.0.0 or later:

#if HWLOC_API_VERSION >= 0x20000
...
#endif

If using official releases, the library soname ensures you're using the right library (6 instead of 5). If using unofficial or nightly tarballs, you may also check the runtime API version:

#include <hwloc.h>
#if HWLOC_API_VERSION >= 0x00020000
  /* headers are recent */
  if (hwloc_get_api_version() < 0x20000)
    ... error out, the hwloc runtime library is older than 2.0 ...
#else
  /* headers are pre-2.0 */
  if (hwloc_get_api_version() >= 0x20000)
    ... error out, the hwloc runtime library is 2.0 or later ...
#endif
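
Both branches above test the same thing: whether the headers and the runtime library sit on the same side of the 2.0 boundary. A minimal sketch of that test as a single helper follows. The helper and the constant are hypothetical names, not part of hwloc; real code would pass HWLOC_API_VERSION and hwloc_get_api_version() as the two arguments.

```c
#include <assert.h>

/* Hypothetical constant, not part of hwloc: the API version of hwloc 2.0.0. */
#define V2_API_VERSION 0x00020000u

/* Hypothetical helper, not part of hwloc. Returns 1 when the headers and the
 * runtime library are both pre-2.0 or both 2.0-or-later, 0 on a mismatch. */
static int versions_agree_on_v2(unsigned header_version, unsigned runtime_version)
{
  int headers_are_v2 = (header_version >= V2_API_VERSION);
  int runtime_is_v2 = (runtime_version >= V2_API_VERSION);
  return headers_are_v2 == runtime_is_v2;
}
```

An application could call this once at startup and error out when it returns 0.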

Object changes

Memory children

NUMA nodes are not in the main tree anymore. They are attached under objects as memory children. This list starts at obj->memory_first_child and its size is obj->memory_arity. An object can therefore have two local NUMA nodes, for instance on KNL.

The normal list of children (starting at obj->first_child, ending at obj->last_child, of size obj->arity, and available as the array obj->children) now only contains CPU-side objects: PUs, Cores, Packages, Caches, Groups and Machine. hwloc_get_next_child() may still be used to iterate over all children of all lists.

Hence there is a CPU hierarchy made of normal children, while memory is attached to that hierarchy depending on its affinity.

For instance:

  • a machine with 2 packages but a single NUMA node is now modeled as a "Machine" object with two "Package" children and one "NUMANode" memory child (displayed first in lstopo below).
Machine (1024MB total)
  NUMANode L#0 (P#0 1024MB)
  Package L#0
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
  • a machine with 2 packages, each with one NUMA node and 2 cores, is now modeled as:
Machine (2048MB total)
  Package L#0
    NUMANode L#0 (P#0 1024MB)
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    NUMANode L#1 (P#1 1024MB)
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
  • if there are two NUMA nodes per package:
Machine (4096MB total)
  Package L#0
    Group0 L#0
      NUMANode L#0 (P#0 1024MB)
      Core L#0 + PU L#0 (P#0)
      Core L#1 + PU L#1 (P#1)
    Group0 L#1
      NUMANode L#1 (P#1 1024MB)
      Core L#2 + PU L#2 (P#2)
      Core L#3 + PU L#3 (P#3)
  Package L#1
    [...]
  • in practice, there is usually an L3 cache instead of the above Group:
Machine (4096MB total)
  Package L#0
    L3 L#0 (16MB)
      NUMANode L#0 (P#0 1024MB)
      Core L#0 + PU L#0 (P#0)
      Core L#1 + PU L#1 (P#1)
    L3 L#1 (16MB)
      NUMANode L#1 (P#1 1024MB)
      Core L#2 + PU L#2 (P#2)
      Core L#3 + PU L#3 (P#3)
  Package L#1
    [...]

Functions for iterating over the level of NUMA nodes still work fine.

However, applications that ever walked up/down to find NUMANode parent/children must now be updated. For instance, finding a NUMANode parent should be replaced with finding a parent that has a memory child, and using that child.
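
The walk described above (go up until a parent has a memory child, then use that child) can be sketched as follows. This is a minimal sketch under stated assumptions: struct obj is a stand-in that mirrors only the three struct hwloc_obj fields involved, so that the code compiles without hwloc, and first_local_numanode() is a hypothetical helper name. With hwloc 2.0 the same loop would operate on hwloc_obj_t.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the struct hwloc_obj fields used below (not the real type). */
struct obj {
  struct obj *parent;              /* NULL for the root object */
  unsigned memory_arity;           /* number of memory children */
  struct obj *memory_first_child;  /* first memory child, or NULL */
};

/* Hypothetical helper: walk up from obj until an object has memory children,
 * and return the first of them. Returns NULL if no ancestor has any. */
static struct obj *first_local_numanode(struct obj *obj)
{
  while (obj != NULL && obj->memory_arity == 0)
    obj = obj->parent;
  return obj != NULL ? obj->memory_first_child : NULL;
}
```

Only the first memory child is returned. When an object has several local NUMA nodes, the caller would have to visit the rest of the memory children list as well.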

It is still possible to look at a nodeset and then iterate over NUMA nodes whose nodeset is included.

I/O and Misc children

I/O children are not in the main object children list anymore either. They are in the list starting at obj->io_first_child, whose size is obj->io_arity.

Misc children are not in the main object children list anymore. They are in the list starting at obj->misc_first_child, whose size is obj->misc_arity.

hwloc_get_next_child() may still be used to iterate over all children of all lists.
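
When the four lists must be handled separately rather than through hwloc_get_next_child(), each one can be walked on its own by following next_sibling from its first child. A minimal sketch, assuming a stand-in struct obj that mirrors only the struct hwloc_obj fields involved (so that it compiles without hwloc) and hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the struct hwloc_obj fields used below (not the real type). */
struct obj {
  struct obj *next_sibling;        /* next object in the same children list */
  struct obj *first_child;         /* normal (CPU-side) children */
  struct obj *memory_first_child;  /* memory children */
  struct obj *io_first_child;      /* I/O children */
  struct obj *misc_first_child;    /* Misc children */
};

/* Count the objects in one children list, starting at its first child. */
static unsigned list_length(const struct obj *child)
{
  unsigned n = 0;
  for (; child != NULL; child = child->next_sibling)
    n++;
  return n;
}

/* Count the children of obj across all four lists. */
static unsigned count_all_children(const struct obj *obj)
{
  assert(obj != NULL);
  return list_length(obj->first_child)
       + list_length(obj->memory_first_child)
       + list_length(obj->io_first_child)
       + list_length(obj->misc_first_child);
}
```

In a real topology the four terms should match obj->arity, obj->memory_arity, obj->io_arity and obj->misc_arity; the loops are only there to show how each list is traversed.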

HWLOC_OBJ_CACHE replaced

Instead of a single HWLOC_OBJ_CACHE, there are now 8 types: HWLOC_OBJ_L1CACHE, ..., HWLOC_OBJ_L5CACHE and HWLOC_OBJ_L1ICACHE, ..., HWLOC_OBJ_L3ICACHE. Cache object attributes are unchanged.

hwloc_get_cache_type_depth() is not really needed to disambiguate cache types anymore since new types can be passed to hwloc_get_type_depth() without ever getting HWLOC_TYPE_DEPTH_MULTIPLE anymore.

hwloc_obj_type_is_cache(), hwloc_obj_type_is_dcache() and hwloc_obj_type_is_icache() may be used to check whether a given type is a cache, data/unified cache or instruction cache.

allowed_cpuset and allowed_nodeset only in the main topology

Objects do not have allowed_cpuset and allowed_nodeset anymore. They are only available for the entire topology using hwloc_topology_get_allowed_cpuset() and hwloc_topology_get_allowed_nodeset().

As usual, those are only needed when the WHOLE_SYSTEM topology flag is given, which means disallowed objects are kept in the topology. If so, you may find out whether some PUs inside an object are allowed by checking whether hwloc_bitmap_intersects(obj->cpuset, hwloc_topology_get_allowed_cpuset(topology)) is true. Replace cpusets with nodesets for NUMA nodes. To find out which ones are allowed, replace hwloc_bitmap_intersects() with hwloc_bitmap_and() to get the actual intersection.

Object depths are now signed int

obj->depth, as well as the depths passed to functions such as hwloc_get_obj_by_depth() or returned by hwloc_topology_get_depth(), are now signed int.

Other depths, such as the cache-specific depth attribute, are still unsigned.
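
One practical consequence is that a depth stored in an unsigned variable, or compared against one, silently wraps around when it is negative. A minimal sketch of a range check done entirely in int follows. is_normal_depth() is a hypothetical helper, not part of hwloc, and the sketch assumes that special values such as HWLOC_TYPE_DEPTH_MULTIPLE are negative and never designate a normal level.

```c
#include <assert.h>

/* Hypothetical helper, not part of hwloc. Returns 1 when depth lies in
 * [0, topology_depth), where topology_depth is the value returned by
 * hwloc_topology_get_depth(). Returns 0 otherwise, including for any
 * negative depth. */
static int is_normal_depth(int depth, int topology_depth)
{
  return depth >= 0 && depth < topology_depth;
}
```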

Memory attributes become NUMANode-specific

Memory attributes such as obj->memory.local_memory are now only available in the NUMANode-specific attributes, as obj->attr->numanode.local_memory.

The exception is obj->memory.total_memory, which is still available in all objects as obj->total_memory.

Topology configuration changes

hwloc_topology_ignore_type(), hwloc_topology_ignore_type_keep_structure() and hwloc_topology_ignore_all_keep_structure() replaced

These are superseded, respectively, by:

hwloc_topology_set_type_filter(topology, type, HWLOC_TYPE_FILTER_KEEP_NONE);
hwloc_topology_set_type_filter(topology, type, HWLOC_TYPE_FILTER_KEEP_STRUCTURE);
hwloc_topology_set_all_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_STRUCTURE);

Also, the meaning of KEEP_STRUCTURE has changed: only entire levels may be ignored, instead of single objects. The old behavior is not available anymore.

HWLOC_TOPOLOGY_FLAG_ICACHES replaced

Superseded by:

hwloc_topology_set_icache_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_ALL);

HWLOC_TOPOLOGY_FLAG_WHOLE_IO, HWLOC_TOPOLOGY_FLAG_IO_DEVICES and HWLOC_TOPOLOGY_FLAG_IO_BRIDGES replaced

To keep all I/O devices (PCI, Bridges, and OS devices), use:

hwloc_topology_set_io_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_ALL);

To only keep important devices (Bridges with children, common PCI devices and OS devices):

hwloc_topology_set_io_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_IMPORTANT);

XML changes

2.0 XML files are not compatible with 1.x

2.0 can load 1.x files, with some caveats:

  • Only NUMA-distances are imported. Other distance matrices are ignored (they were never used by default anyway).

2.0 can export 1.x-compatible files, with some caveats:

  • Only distances attached to the root object are exported (i.e. distances that cover the entire machine). Other distance matrices are dropped (they were never used by default anyway).

Users are advised to negotiate hwloc versions between exporter and importer: if the importer isn't 2.x, the exporter should export to 1.x. Otherwise, things should work by default. See below.

hwloc_topology_export_xml() and hwloc_topology_export_xmlbuffer() have a new flags argument

Flags may be used to force a hwloc-1.x-compatible XML export. This should be negotiated or detected between the importer and the exporter processes so that the most recent XML format is used:

  • If both always support 2.0, don't pass any flag.
  • When the importer uses hwloc 1.x, export with HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1. Otherwise the importer will fail to import.
  • When the exporter uses hwloc 1.x, a 2.0 importer can import without problem.
#if HWLOC_API_VERSION >= 0x20000
   if (need 1.x compatible XML export)
      hwloc_topology_export_xml(...., HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1);
   else /* need 2.x compatible XML export */
      hwloc_topology_export_xml(...., 0);
#else
   hwloc_topology_export_xml(....);
#endif

hwloc_topology_diff_load_xml(), hwloc_topology_diff_load_xmlbuffer(), hwloc_topology_diff_export_xml(), hwloc_topology_diff_export_xmlbuffer() and hwloc_topology_diff_destroy() lost the topology argument

The first argument (topology) isn't needed anymore.

Distances API totally rewritten

The distances API is now declared in hwloc/distances.h.

Distances are not available in objects anymore. One should first call hwloc_distances_get() (or a variant) to retrieve distances (possibly with one call to get the number of available distances structures, and another call to actually get them). The application may then consult these structures, and must finally release them.

The set of objects involved in a distances structure is specified by an array of objects. It does not always cover the entire machine.
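
Once a distances structure has been retrieved, consulting it amounts to indexing a matrix by positions in that array of objects. A minimal sketch follows. It assumes a stand-in struct distances that mirrors only the nbobjs and values fields of struct hwloc_distances_s (so that it compiles without hwloc), and it assumes the matrix is stored as a flat array where the distance from the i-th to the j-th object sits in slot i*nbobjs+j. distance_between() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the struct hwloc_distances_s fields used below
 * (not the real type). */
struct distances {
  unsigned nbobjs;   /* number of objects covered by the matrix */
  uint64_t *values;  /* nbobjs * nbobjs entries, assumed row-major */
};

/* Hypothetical helper: distance from the i-th to the j-th object, where i and
 * j are positions in the structure's array of objects, not logical indexes. */
static uint64_t distance_between(const struct distances *d, unsigned i, unsigned j)
{
  assert(d != NULL && i < d->nbobjs && j < d->nbobjs);
  return d->values[(size_t)i * d->nbobjs + j];
}
```

Because the structure may cover only some objects, callers first have to find the position of each object of interest in the array before looking up a value.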

Misc API changes

Bitmap functions (and a couple other functions) can return errors (in theory)

Most bitmap functions may have to reallocate the internal bitmap storage. In v1.x, they would silently crash if realloc failed. In v2.0, they now return an int that can be negative on error. However, the preallocated storage is 512 bits, hence realloc will not even be used unless you run hwloc on machines with larger PU or NUMA node indexes.

hwloc_obj_add_info(), hwloc_cpuset_from_nodeset() and hwloc_nodeset_to_cpuset() also return an int, which would be -1 in case of allocation errors.

Several functions moved out of hwloc.h

Some functions moved to hwloc/export.h or hwloc/distances.h, but those files are still auto-included by hwloc.h.

hwloc_obj_type_string() replaced, hwloc_obj_snprintf() removed

hwloc_obj_type_string() is replaced by hwloc_type_name().

hwloc_obj_snprintf() is removed because it had long been deprecated in favor of hwloc_obj_type_snprintf() and hwloc_obj_attr_snprintf().

hwloc_obj_type_sscanf() deprecated, hwloc_obj_type_of_string() removed

hwloc_type_sscanf() extends hwloc_obj_type_sscanf() by passing a union hwloc_obj_attr_u which may receive cache, group, bridge or OS device attributes.

hwloc_type_sscanf_as_depth() is also added to directly return the corresponding level depth within a topology.

hwloc_distribute() and hwloc_distributev() removed

Both are removed. They had been deprecated in favor of hwloc_distrib().

hwloc_topology_insert_misc_object_by_cpuset() and hwloc_topology_insert_misc_object_by_parent() replaced

hwloc_topology_insert_misc_object_by_cpuset() is replaced with hwloc_topology_alloc_group_object() and hwloc_topology_insert_group_object().

hwloc_topology_insert_misc_object_by_parent() is replaced with hwloc_topology_insert_misc_object().

*_membind_nodeset() memory binding interfaces deprecated

Use the variant without the _nodeset suffix and pass the new HWLOC_MEMBIND_BYNODESET flag.

hwloc_cpuset_from/to_nodeset_strict() deprecated

Now useless since all topologies are NUMA. Use the variant without the _strict suffix.

API removals

HWLOC_OBJ_SYSTEM removed

The root object is always HWLOC_OBJ_MACHINE.

HWLOC_MEMBIND_REPLICATE removed

Not available anymore (no supported operating system supports it).

Custom interface removed

hwloc_topology_set_custom(), hwloc_custom_insert_topology() and hwloc_custom_insert_group_object_by_parent() are removed from the API.

The corresponding hwloc-assembler and hwloc-assembler-remote command-line tools are also removed.

The custom interface is not available anymore. Topologies always start with objects that have valid cpusets and nodesets.

obj->online_cpuset removed

The field has been removed from hwloc_obj_t. Offline PUs are simply listed in the complete_cpuset as previously.

obj->os_level removed

The object field has been removed.
