Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configury: enable autogen.sh from an embedded hwloc dist tarball #235

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ggouaillardet
Copy link
Contributor

Signed-off-by: Gilles Gouaillardet [email protected]

@ggouaillardet
Copy link
Contributor Author

Refs open-mpi/ompi#3302

@ggouaillardet
Copy link
Contributor Author

with this PR, it is pretty trivial to untar and add an embedded hwloc tarball into ompi and as a standalone package (a la libevent, pmix or romio)
do you have any recollection on why hwloc is currently not a full standalone component in ompi ?
(e.g. ompi configury uses hwloc m4 file and has its own configure.m4, instead of directly invoking hwloc configure)

@bgoglin
Copy link
Contributor

bgoglin commented Apr 11, 2017

@jsquyres Can you answer above?

@bgoglin
Copy link
Contributor

bgoglin commented Apr 11, 2017

@ggouaillardet How would you generate this "embedded" tarball? What's the avantage against bringing a normal full hwloc tarball, running autogen, and then running a script that removes unneeded directories?

I am trying to simplify your commit. For instance, I don't like having the netloc script explictly listed.

@ggouaillardet
Copy link
Contributor Author

./configure --enable-embedded-mode
make dist

then the tarball can be untarred and renamed into opal/mca/hwloc/hwloc2x/hwloc

the pro is from ompi, you can

./autogen.sh
./configure ...
make dist
tar xvf openmpi-gitclone.tar.gz
cd openmpi-gitclone
./autogen.sh
...

without this PR, some steps do not complete successfully

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to remember a reason why we felt compelled to use embedded m4 vs. just invoking hwloc's own configure, and I'm failing to come up with a reason. It could well have simply been hysterical raisins...?

I.e., we probably did it for several prior embedded packages -- e.g., PLPA -- and therefore hwloc fell into the "well, this is the way we have always done things" mindset. And/or it could have been before our "invoke a component's configure" support got good...?

The more I think about it, the more I think the "don't invoke a sub-configure" methodology probably came from long, long ago in Open MPI (circa v1.0) where every component had its own standalone configure script. It was incredibly modular, but it took forever to run (and remembers that computers+filesystems were much older/slower than today). We had a great configure transformation at one point to make all (normal) OMPI components have embedded m4 that could be slurped up into the main Open MPI configure script. This dramatically sped up configure's execution. Embedded projects (like PLPA) probably got swept up in that craze, and we've probably just maintained that position as our default ever since.

However, for standalone projects like hwloc, supporting embeddable m4 is actually fairly complicated. Perhaps hwloc 2.0 is a good time to audit the use and utility of all the embedding stuff in hwloc and see if it's really worth it any more. E.g., is it really a problem to include a few extra directories in the OMPI-embadded hwloc (even if OMPI doesn't use that stuff)? It certainly would be nice to maintain the configure CLI options to disable the stuff that OMPI (and MPICH and ...) don't want/don't use, but the extra gyrations to make "special" embedded tarballs and/or have the ability to have embedded m4 may not be worth it any more.

@@ -0,0 +1,13 @@
#!/bin/sh

chmod u+w $2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor quibble: I usually like to assign $n variables to names to make the code a bit more readable. For example:

top_srcdir=$1
distdir=$2
hwloc_version=$3

chmod u+w $distdir
# ...

@@ -0,0 +1,13 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should include the copyright header here.

# but sadly, automake *requires* it
EOF
done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove trailing blank line.

#!/bin/sh

chmod u+w $2
makefiles="$2/doc/Makefile.am $2/doc/examples/Makefile.am $2/doc/doxygen-config.cfg.in $2/utils/Makefile.am $2/utils/hwloc/Makefile.am $2/utils/lstopo/Makefile.am $2/utils/netloc/infiniband/Makefile.am $2/utils/netloc/draw/Makefile.am $2/utils/netloc/mpi/Makefile.am $2/tests/Makefile.am $2/tests/hwloc/Makefile.am $2/tests/hwloc/linux/Makefile.am $2/tests/hwloc/linux/allowed/Makefile.am $2/tests/hwloc/linux/gather/Makefile.am $2/tests/hwloc/x86/Makefile.am $2/tests/hwloc/xml/Makefile.am $2/tests/hwloc/ports/Makefile.am $2/tests/hwloc/rename/Makefile.am $2/tests/hwloc/linux/allowed/test-topology.sh.in $2/tests/hwloc/linux/gather/test-gather-topology.sh.in $2/tests/hwloc/linux/test-topology.sh.in $2/tests/hwloc/x86/test-topology.sh.in $2/tests/hwloc/xml/test-topology.sh.in $2/tests/hwloc/wrapper.sh.in $2/utils/hwloc/hwloc-compress-dir.in $2/utils/hwloc/hwloc-gather-topology.in $2/utils/hwloc/test-hwloc-annotate.sh.in $2/utils/hwloc/test-hwloc-calc.sh.in $2/utils/hwloc/test-hwloc-compress-dir.sh.in $2/utils/hwloc/test-hwloc-diffpatch.sh.in $2/utils/hwloc/test-hwloc-distrib.sh.in $2/utils/hwloc/test-hwloc-info.sh.in $2/utils/hwloc/test-fake-plugin.sh.in $2/utils/hwloc/test-hwloc-dump-hwdata/Makefile.am $2/utils/hwloc/test-hwloc-dump-hwdata/test-hwloc-dump-hwdata.sh.in $2/utils/lstopo/test-lstopo.sh.in $2/contrib/systemd/Makefile.am $2/contrib/misc/Makefile.am $2/tests/netloc/Makefile.am $2/tests/netloc/tests.sh.in $2/utils/lstopo/lstopo-windows.c"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to not hard-code this list of files? I ask because this list will inevitably get forgotten if new directories (i.e., generated files) are added over time.

@@ -0,0 +1,13 @@
#!/bin/sh

chmod u+w $2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do you need to chmod u-w on the $distdir at the end of this script? I.e., do you need to un-do the chmod changes that you did? I don't know if this has an effect on the generated tarball or not.

@bgoglin
Copy link
Contributor

bgoglin commented Apr 11, 2017

@ggouaillardet Your hunk about adding netloc_ib_gather_raw.in to EXTRA_DIST shouldn't be needed. This file shouldn't be required in embedded anyway, just like everything else under utils/.

What I didn't get yet is why your procedure requires patches that the current procedure doesn't. We are creating dummy directories manually to make automake happy. But why is your automake also looking for Makefiles that the current procedure doesn't require?

By the way the status of netloc vs embedding is unclear. OMPI won't use it anytime soon. You likely want to disable netloc everywhere for now.

I agree with @jsquyres about not listing tons of Makefiles in the embedded distscript.

I am fine with stopping the embedding. I don't know if anybody else is using it outside of OMPI, I'll ping the devel list.

@ggouaillardet
Copy link
Contributor Author

i will make the requested changes sometimes this week

let me emphasize i claim no insightful knowledge on how autotools work.
so what i did is (my builddir is not my srcdir)

cd build/hwloc-embedded
../../src/hwloc/configure --enable-embedded-mode
make dist
tar xvfz hwloc-xxx.tar.gz
cd hwloc-xxx
./autogen.sh

autogen.sh fails with various error (missing files) and this PR is enough to get the job done
(this is how i found the list of hard coded paths)
keep in mind in ompi we do not run autogen.sh based on the full hwloc configure.ac,
but we run it vs a trimmed-for-ompi configure.m4 that hence requires less files.
if i follow the hwloc1116 logic of ompi, i am not able to have both hwloc 1.11.6 and hwloc 2.x, so i had to find a way to have a truly standalone hwloc 2.x
there could be a better and more elegant way to do it, but i do not know it.

@bgoglin
Copy link
Contributor

bgoglin commented Apr 11, 2017

"I am not able to have both hwloc 1.11.6 and hwloc 2.x,"

Did you try with 2 different components or a single one?
I think you will need 2 to cope with ABI differences.

@ggouaillardet
Copy link
Contributor Author

what i meant is i cannot embedded two components in the ompi source tree (e.g. hwloc1116 and hwloc2x)
(so enduser/CI can configure with --with-hwloc=internal (for embedded 1.11.6) or --with-hwloc=future (for embedded v2.x) to test both API without the need to install any external hwloc)

hwloc framework has only one static component, so only one is built and there is no need
to worry about ABI difference here.

@jsquyres
Copy link
Member

Yes, I think we've had that limitation in OMPI for quite a while (you can't have multiple hwloc components simultaneously in the tree -- there's esoteric m4 reasons why, and I don't remember the details at the moment). So moving to a "invoke the configure script in the hwloc component" model would alleviate that issue, too.

But remember -- this does come at a cost (which we may or may not care about): the extra files and space used in git cloning. It would probably be a good idea to look at the resulting size of the ompi source trees (and tarballs) with and without all the extra hwloc files (e.g., the hwloc bz2 tarball is ~4MB, expands out to 16MB, of which the PDF docs alone are about 1.7MB).

@bgoglin
Copy link
Contributor

bgoglin commented Apr 12, 2017

The size can be easily reduced to 1.5MB tarball and 8MB expanded (by removing docs PDF, html and man, while keeping the doxygen tag, and by removing tarballs and xmls in tests/linux and tests/xml). You can still autoreconf, configure and build from the root after that. But we cannot go much lower than that without a careful look at makefiles and configure.

Note that there's still another solution: stop including hwloc in OMPI and use the system-provided hwloc. It seems to me that hwloc is very often installed on clusters nowadays. Versions are somehow old, but does it really matter for OMPI?

@bgoglin
Copy link
Contributor

bgoglin commented Apr 12, 2017

We could try generating the explicit list of dummy makefiles by grepping for AC_CONFIG_FILES and AC_CONFIG_LINKS in configure and config/*.m4

There might be a way to add a regression test just like the current tests/hwloc/embedded

By the way, lstopo-windows.c is only needed because of AC_CONFIG_LINKS. So links are only used for non-Linux testing on Linux. We could manually symlink on Linux to simplify this.

@jsquyres
Copy link
Member

We've discussed exactly this idea (not embedding hwloc in Open MPI any more) recently. We all agree that this is the Right long-term solution. However, there still appear to be some barriers in the short term:

  1. We updated Open MPI to easily be able to handle hwloc >= v1.5 (open-mpi/ompi@7e01be6). Supporting < v1.5 will require a bunch of work that we didn't really want to do.
  2. Some old-but-still-not-unpopular Linux distros ship older versions of hwloc (e.g., RHEL 6.3 has hwloc 1.1).
  3. Other old-but-still-not-unpopular Linux distros don't ship hwloc by default (e.g., SLES 12).
  4. Using the system-installed hwloc means that a secondary package frequently has to be installed (e.g., hwloc-devel), which is never installed by default.
  5. OS X does not have hwloc by default (it's available by MacPorts and Homebrew, but just like the other issues above, this is an additional step).
  6. We need to adjust Open MPI to be able to handle platforms where hwloc is not available. Open MPI used to be able to handle this, but it was a whole pile of #if statements, and we inadvertently broke OMPI on hwloc-less platforms on a not-infrequent basis. We need to resurrect this support, but somehow make it more robust.

All of these things, taken together, made it a bit too scary for us to remove the embedded hwloc from Open MPI -- at least for now.

@bgoglin
Copy link
Contributor

bgoglin commented Mar 24, 2020

Note to open pull requests: some things changed in the CI yesterday, you'll need to rebase on top of master to avoid total CI failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants