feat(logging): add logging macros (4/6)#725
Conversation
152f086 to
a655149
Compare
Resolves @wgtmac's review comments on PR apache#723: - Mark ShouldLog()/SetLevel()/level() noexcept so a custom logger cannot throw from the hot filter path; propagated to NoopLogger and CapturingLogger. - Make CurrentLogger() safe during thread teardown without UB: the per-thread cache is reached via a trivially-destructible thread_local pointer to a never-freed heap slot (kept LSan-reachable by a registry), so logging from a thread_local destructor in any order is safe. Replaces the AliveFlag guard. - Add ICEBERG_EXPORT to LogAttribute/LogMessage, matching Error/ScanReport. - Trim the logger.h \file brief (the macros it referenced arrive in PR apache#725). - Tests: use the IsError matcher and std::ignore; strengthen the teardown test to log from a thread_local destroyed after the cache. Co-authored-by: Isaac
7e99738 to
0942681
Compare
Part 2 of the logging stack (builds on #722). Adds the logging API and a swappable default logger — the foundation the backends and macros plug into. **What's here** - `Logger`: the pluggable sink interface (`ShouldLog` / `Log` / `SetLevel` / `Flush` / `Initialize`). `ShouldLog()` is the single source of truth for runtime filtering. - `LogMessage` owns its formatted text so a sink can safely keep a record; reserves an attributes field for future structured logging. - Process-global default logger: `GetDefaultLogger` / `SetDefaultLogger` / `SetDefaultLevel`, with a lock-free thread-local fast path so logging stays cheap. - `Initialize` applies the `level` property, so config-driven levels actually work. `CurrentLogger()` is safe to call even from a `thread_local` destructor during thread shutdown. - `logger.h` stays backend-agnostic (never includes the build config header), so consumers see one stable API regardless of backend. **Examples** — using the API directly (the `LOG_*` macros that wrap it arrive in #725): ```cpp // A custom sink, installed as the process default. class MySink : public Logger { public: bool ShouldLog(LogLevel level) const override { return level >= level_; } void Log(LogMessage&& m) noexcept override { write_line(m.message); } void SetLevel(LogLevel level) override { level_ = level; } LogLevel level() const override { return level_; } private: std::atomic<LogLevel> level_{LogLevel::kInfo}; }; SetDefaultLogger(std::make_shared<MySink>()); // install process-wide SetDefaultLevel(LogLevel::kDebug); // adjust the threshold auto logger = GetDefaultLogger(); // borrow the current default if (logger->ShouldLog(LogLevel::kInfo)) { logger->Log(LogMessage{.level = LogLevel::kInfo, .message = "scan ready"}); } // Or configure from catalog-style properties (applies the "level" key): auto sink = std::make_shared<MySink>(); auto status = sink->Initialize({{std::string(kLevelProperty), "warn"}}); // -> kWarn ``` The same example is documented inline in `logger.h`. **Tests** — `logger_test`: default-logger API, level-from-property, invalid level rejected, concurrent swap/read, and logging during thread teardown. Built and run with clang/libc++ (spdlog ON and OFF). This pull request and its description were written by Isaac.
bb6a95d to
6c3a85e
Compare
Fourth block: the application-facing macros, the only part most callers touch.
- ICEBERG_LOG_{TRACE,DEBUG,INFO,WARN,ERROR,CRITICAL,FATAL} plus the generic
ICEBERG_LOG(level, ...), ICEBERG_LOG_TO(logger, level, ...) for an explicit
logger, and ICEBERG_LOG_RUNTIME_FMT for a runtime (non-literal) format string.
- ICEBERG_LOG_ACTIVE_LEVEL is a compile-time severity floor: statements below it
are removed entirely via `if constexpr` (no format call site, no source
location emitted). ICEBERG_LOG_FATAL is never gated by the floor -- its abort
is always compiled in; it emits, best-effort Flush()es the same logger it
emitted to, then std::abort().
- Filtering is decided solely by Logger::ShouldLog(); formatting is wrapped in
try/catch so logging never throws (a format failure routes to EmitFormatError).
- Bare Java-style aliases (LOG_INFO, ...) are opt-in via ICEBERG_LOG_SHORT_MACROS
to avoid polluting consumers / colliding with glog/abseil.
Header-only addition to logger.h. macros_test covers injection, the
guard-before-format short-circuit, never-throws, and FATAL aborts;
macros_active_level_test verifies compile-time stripping in a kOff translation unit.
Co-authored-by: Isaac
6c3a85e to
a9878cb
Compare
| // so a concurrent SetDefaultLogger cannot flush a different logger than it emitted to. | ||
| #define ICEBERG_LOG_FATAL(FMT_, ...) \ | ||
| do { \ | ||
| auto _ib_logger = ::iceberg::GetDefaultLogger(); \ |
There was a problem hiding this comment.
This should use the current scoped logger rather than the process default. ScopedLogger says LOG_* routes through the scoped binding, but this path would send a fatal record to the global default and then abort.
wgtmac
left a comment
There was a problem hiding this comment.
Great work on the logging macros! I have a few suggestions regarding performance, binary size, and exception handling for your consideration.
| try { \ | ||
| ::iceberg::internal::Emit(*_ib_logger, (level_), \ | ||
| ::std::source_location::current(), \ | ||
| ::std::format(FMT_ __VA_OPT__(, ) __VA_ARGS__)); \ |
There was a problem hiding this comment.
Performance/Heap Allocation: std::format returns a std::string. If the formatted message exceeds the Small String Optimization (SSO) limit, this will trigger a heap allocation before the logger even processes it. For a high-performance system like Iceberg, this could become a bottleneck on hot paths. Consider using a stack/thread-local buffer (like fmt::memory_buffer in fmtlib or spdlog) to achieve zero-allocation formatting.
| // Internal: fixed-severity emit with compile-time floor then the authoritative | ||
| // Logger::ShouldLog (the single source of truth for runtime filtering), with | ||
| // formatting only on the taken path, never throwing. | ||
| #define ICEBERG_INTERNAL_LOG(level_, FMT_, ...) \ |
There was a problem hiding this comment.
Binary Bloat: Expanding this entire block (including if, try-catch, and ShouldLog) at every log call site can significantly bloat the binary size and reduce instruction cache hit rates. A common optimization is to use type erasure (e.g., std::make_format_args) and push the actual try-catch execution into a non-inline function in the .cc file.
| ::iceberg::internal::Emit(*_ib_logger, (level_), \ | ||
| ::std::source_location::current(), \ | ||
| ::std::format(FMT_ __VA_OPT__(, ) __VA_ARGS__)); \ | ||
| } catch (...) { \ |
There was a problem hiding this comment.
Exception Handling Scope: catch (...) is very broad and will swallow all exceptions, including std::bad_alloc (OOM) or thread cancellation exceptions (e.g., POSIX abi::__forced_unwind). Since std::format typically throws std::format_error, it is safer to catch const std::format_error& or const std::exception& so fatal system exceptions can propagate.
| } \ | ||
| } \ | ||
| if (_ib_logger) _ib_logger->Flush(); \ | ||
| ::std::abort(); \ |
There was a problem hiding this comment.
Extensibility: Hardcoding std::abort() removes the ability for the host application (like a Java JNI or Python process embedding Iceberg) to handle fatal errors gracefully (e.g., flushing resources or printing a stack trace). Consider allowing the user to register a custom FatalHandler callback before falling back to abort().
Part 4 of the logging stack (builds on #724). Adds the application-facing macros — the part most callers actually use.
What's here
ICEBERG_LOG_{TRACE,DEBUG,INFO,WARN,ERROR,CRITICAL,FATAL}, the genericICEBERG_LOG(level, …),ICEBERG_LOG_TO(logger, …), andICEBERG_LOG_RUNTIME_FMTfor runtime format strings.ICEBERG_LOG_ACTIVE_LEVELis a compile-time floor: statements below it are removed entirely.ICEBERG_LOG_FATALalways emits, flushes, then aborts.ShouldLog(); formatting happens only on the taken path and never throws (a bad format falls back safely).LOG_*aliases are opt-in viaICEBERG_LOG_SHORT_MACROS.Build note — sets
/Zc:preprocessoron MSVC, required for the__VA_OPT__used by the macros.Examples — every macro takes a
std::formatstring + args. The rendered line depends on the active backend; with the defaultCerrLoggerit looks like:Less common forms:
ICEBERG_LOG(level, …)(runtime severity),ICEBERG_LOG_TO(logger, level, …)(explicit logger),ICEBERG_LOG_RUNTIME_FMT(level, fmt_string, args…)(non-literal format). The same examples are documented inline inlogger.h.Tests —
macros_test/macros_active_level_test: formatting, level gating, "argument not evaluated when disabled", compile-time stripping, never-throws, and FATAL-aborts death tests. clang/libc++.This pull request and its description were written by Isaac.