-
Notifications
You must be signed in to change notification settings - Fork 883
fix(docker): honor configured supervisor image #1935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -79,7 +79,8 @@ const DOCKER_NETWORK_DRIVER: &str = "bridge"; | |||||||||||
|
|
||||||||||||
| /// Default image holding the Linux `openshell-sandbox` binary. The gateway | ||||||||||||
| /// pulls this image and extracts the binary to a host-side cache when no | ||||||||||||
| /// explicit `supervisor_bin` override or local build is available. | ||||||||||||
| /// explicit `supervisor_bin`, configured `supervisor_image`, sibling binary, | ||||||||||||
| /// or local build is available. | ||||||||||||
| const DEFAULT_DOCKER_SUPERVISOR_IMAGE_REPO: &str = "ghcr.io/nvidia/openshell/supervisor"; | ||||||||||||
|
|
||||||||||||
| /// Return the default `ghcr.io/nvidia/openshell/supervisor:<tag>` reference | ||||||||||||
|
|
@@ -155,10 +156,9 @@ pub struct DockerComputeConfig { | |||||||||||
| /// Optional override for the Linux `openshell-sandbox` binary mounted into containers. | ||||||||||||
| pub supervisor_bin: Option<PathBuf>, | ||||||||||||
|
|
||||||||||||
| /// Optional override for the image the gateway pulls to extract the | ||||||||||||
| /// Linux `openshell-sandbox` binary when no explicit binary path or | ||||||||||||
| /// local build is available. Defaults to | ||||||||||||
| /// `ghcr.io/nvidia/openshell/supervisor:<gateway-image-tag>`. | ||||||||||||
| /// Optional image used to extract the Linux `openshell-sandbox` binary. | ||||||||||||
| /// Ignored when `supervisor_bin` is set. See `resolve_supervisor_bin` for | ||||||||||||
| /// the full resolution order. | ||||||||||||
| pub supervisor_image: Option<String>, | ||||||||||||
|
|
||||||||||||
| /// Host-side CA certificate for Docker sandbox mTLS. | ||||||||||||
|
|
@@ -2948,56 +2948,89 @@ fn normalize_docker_arch(arch: &str) -> String { | |||||||||||
| } | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| pub(crate) async fn resolve_supervisor_bin( | ||||||||||||
| docker: &Docker, | ||||||||||||
| #[derive(Debug, Eq, PartialEq)] | ||||||||||||
| enum SupervisorBinSource { | ||||||||||||
| Binary(PathBuf), | ||||||||||||
| Image(String), | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| fn resolve_supervisor_bin_source( | ||||||||||||
| docker_config: &DockerComputeConfig, | ||||||||||||
| daemon_arch: &str, | ||||||||||||
| ) -> CoreResult<PathBuf> { | ||||||||||||
| current_exe: Option<&Path>, | ||||||||||||
| target_candidates: &[PathBuf], | ||||||||||||
| ) -> CoreResult<SupervisorBinSource> { | ||||||||||||
| // Tier 1: explicit supervisor_bin in [openshell.drivers.docker]. | ||||||||||||
| if let Some(path) = docker_config.supervisor_bin.clone() { | ||||||||||||
| let path = canonicalize_existing_file(&path, "docker supervisor binary")?; | ||||||||||||
| validate_linux_elf_binary(&path)?; | ||||||||||||
| return Ok(path); | ||||||||||||
| return Ok(SupervisorBinSource::Binary(path)); | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| // Tier 2: explicit supervisor_image in [openshell.drivers.docker]. | ||||||||||||
| // A configured image should be the source of truth even when a local | ||||||||||||
| // developer build is present under target/. | ||||||||||||
| if let Some(image) = docker_config.supervisor_image.clone() { | ||||||||||||
| return Ok(SupervisorBinSource::Image(image)); | ||||||||||||
| } | ||||||||||||
|
Comment on lines
+2969
to
+2974
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a comment here: OpenShell/crates/openshell-driver-docker/src/lib.rs Lines 159 to 163 in 242ace2
Could be something like: It could make sense to validate the config and only allow either
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, it would be good to have a test for this new resolution order.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If we think that will be clearer, then breaking now probably makes sense. Having these settings be exclusive does reduce the questions around resolution precedence. I don't want to do that in this PR though. If we think that's a direction we want to go, then let's make that change in a separate PR so that the possibly breaking change is more clearly surfaced. |
||||||||||||
|
|
||||||||||||
| // Tier 2: sibling `openshell-sandbox` next to the running gateway | ||||||||||||
| // Tier 3: sibling `openshell-sandbox` next to the running gateway | ||||||||||||
| // (release artifact layout). Linux-only because the sibling must be a | ||||||||||||
| // Linux ELF to bind-mount into a Linux container. | ||||||||||||
| if cfg!(target_os = "linux") { | ||||||||||||
| let current_exe = std::env::current_exe() | ||||||||||||
| .map_err(|err| Error::config(format!("failed to resolve current executable: {err}")))?; | ||||||||||||
| if let Some(parent) = current_exe.parent() { | ||||||||||||
| let sibling = parent.join("openshell-sandbox"); | ||||||||||||
| if sibling.is_file() { | ||||||||||||
| let path = canonicalize_existing_file(&sibling, "docker supervisor binary")?; | ||||||||||||
| if validate_linux_elf_binary(&path).is_ok() { | ||||||||||||
| return Ok(path); | ||||||||||||
| } | ||||||||||||
| if cfg!(target_os = "linux") | ||||||||||||
| && let Some(current_exe) = current_exe | ||||||||||||
| && let Some(parent) = current_exe.parent() | ||||||||||||
| { | ||||||||||||
| let sibling = parent.join("openshell-sandbox"); | ||||||||||||
| if sibling.is_file() { | ||||||||||||
| let path = canonicalize_existing_file(&sibling, "docker supervisor binary")?; | ||||||||||||
| if validate_linux_elf_binary(&path).is_ok() { | ||||||||||||
| return Ok(SupervisorBinSource::Binary(path)); | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| // Tier 3: local cargo target build (developer workflow). Preferred | ||||||||||||
| // over a registry pull when available because it matches whatever the | ||||||||||||
| // developer just built. | ||||||||||||
| let target_candidates = linux_supervisor_candidates(daemon_arch); | ||||||||||||
| for candidate in &target_candidates { | ||||||||||||
| // Tier 4: local cargo target build (developer workflow). Preferred | ||||||||||||
| // over the default registry image when available because it matches | ||||||||||||
| // whatever the developer just built. | ||||||||||||
| for candidate in target_candidates { | ||||||||||||
| if candidate.is_file() { | ||||||||||||
| let path = canonicalize_existing_file(candidate, "docker supervisor binary")?; | ||||||||||||
| if validate_linux_elf_binary(&path).is_ok() { | ||||||||||||
| return Ok(path); | ||||||||||||
| return Ok(SupervisorBinSource::Binary(path)); | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| // Tier 4: pull the supervisor image from a registry and extract the | ||||||||||||
| // binary to a host-side cache keyed by image content digest. This is | ||||||||||||
| // the default path for released gateway binaries. | ||||||||||||
| let image = docker_config | ||||||||||||
| .supervisor_image | ||||||||||||
| .clone() | ||||||||||||
| .unwrap_or_else(default_docker_supervisor_image); | ||||||||||||
| extract_supervisor_bin_from_image(docker, &image).await | ||||||||||||
| // Tier 5: pull the release-matched default supervisor image and extract | ||||||||||||
| // the binary to a host-side cache keyed by image content digest. | ||||||||||||
| Ok(SupervisorBinSource::Image(default_docker_supervisor_image())) | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| pub(crate) async fn resolve_supervisor_bin( | ||||||||||||
| docker: &Docker, | ||||||||||||
| docker_config: &DockerComputeConfig, | ||||||||||||
| daemon_arch: &str, | ||||||||||||
| ) -> CoreResult<PathBuf> { | ||||||||||||
| let current_exe = | ||||||||||||
| if cfg!(target_os = "linux") | ||||||||||||
| && docker_config.supervisor_bin.is_none() | ||||||||||||
| && docker_config.supervisor_image.is_none() | ||||||||||||
| { | ||||||||||||
| Some(std::env::current_exe().map_err(|err| { | ||||||||||||
| Error::config(format!("failed to resolve current executable: {err}")) | ||||||||||||
| })?) | ||||||||||||
| } else { | ||||||||||||
| None | ||||||||||||
| }; | ||||||||||||
| let target_candidates = linux_supervisor_candidates(daemon_arch); | ||||||||||||
|
|
||||||||||||
| match resolve_supervisor_bin_source(docker_config, current_exe.as_deref(), &target_candidates)? | ||||||||||||
| { | ||||||||||||
| SupervisorBinSource::Binary(path) => Ok(path), | ||||||||||||
| SupervisorBinSource::Image(image) => { | ||||||||||||
| extract_supervisor_bin_from_image(docker, &image).await | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
| } | ||||||||||||
|
|
||||||||||||
| fn linux_supervisor_candidates(daemon_arch: &str) -> Vec<PathBuf> { | ||||||||||||
|
|
||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this setting is ignored when the supervisor_bin setting is set. This means that they are mutually exclusive, correct? Maybe it would make more sense to turn them into a setting that looks like:
Just my two cents. Otherwise lgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @pimlock suggested a related change in #1935 (comment) -- my primary goal for the initial change is to unblock my local testing, but I will look into this as a follow up.