Skip to content

Draft: CMake improvements#248

Open
caitlinross wants to merge 20 commits into
masterfrom
cmake-improvements
Open

Draft: CMake improvements#248
caitlinross wants to merge 20 commits into
masterfrom
cmake-improvements

Conversation

@caitlinross

Copy link
Copy Markdown
Member

No description provided.

@caitlinross caitlinross force-pushed the cmake-improvements branch 3 times, most recently from 5838ec3 to 6cac1c3 Compare June 23, 2026 22:19
@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️

- convert global add_definitions(-DUSE_*) to target_compile_definitions on codes
- drop redundant CMAKE_C_FLAGS include appends (already covered by target_include_directories)
- executables/tests inherit includes + libs transitively by linking codes
The various dependencies had some inconsistencies in how they were
handled. Some if found in PATH, could not be turned off. Now all have
three possible states: AUTO, ON, OFF. All are AUTO by default meaning
that we will attempt to find them and if found, build with them. If a
dependency is not found, there is no error or warning, we just build
without it. If a dependency is set to OFF, it won't be looked for at
all, and if a dependency is set to ON and not found, an error will be
given.

Also changed naming from USE_* to CODES_USE_* to prevent any collisions
with other projects that may use CODES.
Optional-dependency state (DUMPI, ONLINE, SWM, UNION, RECORDER, TORCH,
ZEROMQ, DARSHAN) was carried as bare -DUSE_<X> defines added PUBLIC to the
codes target, so every one of those names leaked onto the command line of
everything that links codes. That approach had two real problems:

  - The names are generic and unnamespaced. USE_ONLINE / USE_TORCH and the
    like can collide with whatever a downstream project defines, and
    USE_ONLINE was already baked into an installed public header
    (codes-workload.h) -- so the leak was part of our public surface.
  - #ifdef tests existence, not value. A stray -DUSE_X=0 would *enable* the
    feature, and any source that forgot the define or typo'd it silently
    compiled the feature off, with no diagnostic at all.

Instead, generate a single namespaced header (codes_config.h) defining
CODES_HAVE_<X> to 0/1 for each optional subsystem, install it alongside the
public headers, and query it with `#if CODES_HAVE_<X>`. Build CODES's own
targets with -Wundef.

This is strictly better than the old scheme: feature availability is
namespaced (no clashes), it lives in one installed header consumers can
actually query instead of an invisible command-line contract, and nothing
leaks onto consumer compile lines. The `#if` + -Wundef combination turns the
classic preprocessor foot-guns -- forgotten include, typo'd macro, -DFOO=0
-- into compile-time warnings rather than a silently mis-built feature.
CODES installed a static library and headers but no CMake package, so a
downstream model had no way to find_package(codes) -- it had to hard-code
CODES's include and library paths (and ROSS's, and MPI's) by hand, or vendor
the tree. That is exactly the fragile, manual coupling the rest of this
effort has been removing everywhere else.

This adds a namespaced install(EXPORT) that generates codesConfig.cmake and a
codes::codes imported target, mirroring how ROSS already exposes ROSS::ROSS
(and how CODES now consumes ROSS). A consumer does

    find_package(codes REQUIRED)
    target_link_libraries(mymodel PRIVATE codes::codes)

and inherits CODES's public include dirs plus its ROSS / MPI usage
requirements transitively -- no bespoke CODES_INCLUDE_DIRS / CODES_LIBRARIES
variables, no knowing where ROSS lives. The config re-resolves the
dependencies CODES links (find_dependency for ROSS and MPI C/CXX).

To keep the exported target relocatable, the library's public include
interface moves to BUILD_INTERFACE / INSTALL_INTERFACE generator
expressions: in-tree builds see the source/binary dirs, installed consumers
get <prefix>/include and <prefix>/include/codes, and no build-machine path
leaks into the installed codesTargets.cmake. Verified with a standalone
find_package(codes) consumer that links codes::codes and compiles a CODES
header.
…lve zmqml_*

The director-client.C / dragonfly-dally.C objects in libcodes.a reference the
zmqml_* symbols defined in libzmqmlrequester.so, so the requester is a usage
requirement of the codes library itself. Linking it PUBLIC on `codes` (rather
than PRIVATE per-executable in the CODES_TARGETS loop) makes every consumer
inherit it transitively, at the end of the link line after libcodes.a + ROSS /
SWM / UNION. This fixes two faults on the heavy (UNION+ZeroMQ) build:
the doc/example and tests targets — which link codes but aren't in
CODES_TARGETS — previously linked no requester at all, and even the
CODES_TARGETS exes had it ahead of codes' transitive deps where the linker
dropped it, leaving zmqml_* undefined. zmqmlrequester is an IMPORTED target, so
like PkgConfig::SWM it's exempt from the install(EXPORT) "not in export set"
error.
Add a `full` job to build.yml that runs inside ghcr.io/codes-org/codes-ci-full
and builds ROSS + the zmqml requester + CODES with every optional subsystem on
(UNION, ZEROMQ, DUMPI, TORCH), then runs ctest excluding union-surrogate director tests.
The heavy dependency compiles live in the
prebuilt image, so this job only builds ROSS + CODES.
…de Makefile

The zmqml requester shared lib was built out-of-band by a hand-rolled
Makefile (src/surrogate/zmqml/Makefile) and consumed via a SHARED IMPORTED
target pointed at by -DZEROMQ_BUILD_PATH. That Makefile linked the .so with
no -soname, so consumers linking it by absolute path baked a CMake-
relativized "../src/surrogate/zmqml/libzmqmlrequester.so" into DT_NEEDED.
ld.so resolves a slash-containing DT_NEEDED against the process CWD rather
than RUNPATH, so every binary run from ctest's testing-output/test-* subdir
failed with "error while loading shared libraries" — turning the heavy-deps
`full` CI lane red across the modelnet and ping-pong tests.

Replace the arrangement with real CMake targets:

  - New src/surrogate/zmqml/CMakeLists.txt defines a STATIC zmqmlrequester
    target (add_subdirectory under USE_ZEROMQ) carrying libzmq + its own
    public header dir as usage requirements. zmqml_* is compiled straight
    into libcodes, so there is no runtime .so to locate.
  - codes links it PUBLIC and exports it in codesTargets; its imported
    PkgConfig::ZeroMQ dep stays export-exempt, like the SWM/UNION targets.
  - ZeroMQ is now AUTO-probed like the other optional deps: pkg_check_modules
    (libzmq) + find zmq.hpp + find rapidjson/document.h. Drop the
    ZEROMQ_BUILD_PATH cache var and its FATAL_ERROR.
  - Drop the "Build the zmqml requester" make step and -DZEROMQ_BUILD_PATH
    from build.yml and CODES-compile-instructions.sh.
  - Delete the Makefile and runcppdemo.sh; the standalone demozmqmlrequester
    demo becomes an optional CMake target (-DCODES_BUILD_ZMQML_DEMO=ON,
    default OFF). Drop the stale .gitignore entries; update NOTES.txt.

Verified in the codes-ci-full image: all previously-failing modelnet and
ping-pong tests pass; the requester and demo build with only libzmq.so.5
in DT_NEEDED.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant