NemoClaw/scripts
Prekshi Vyas b8330ae5c4
fix(scripts): restore .openclaw perms after nemoclaw exec command (#6060)
## Summary

`openclaw doctor --fix` can collapse mutable OpenClaw paths from
NemoClaw's multi-UID `2770/660` contract to OpenClaw's single-user
`700/600` defaults. This change restores that contract after both
entrypoint one-shot commands and the documented `nemoclaw <name> exec`
boundary, with deterministic child-versus-cleanup exit-status
precedence.
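The drift described above can be detected with a simple mode check. The sketch below is a minimal illustration only, assuming `2770/660` means setgid group-writable directories and group-writable regular files. The name `find_permission_drift` is hypothetical, and the path-based walk is for inspection only; it is not the descriptor-based repair the change describes.

```python
import os
import stat

# Assumed reading of the `2770/660` contract: setgid group-writable
# directories and group-writable regular files.
EXPECTED_DIR_MODE = 0o2770
EXPECTED_FILE_MODE = 0o660


def find_permission_drift(root):
    """Return (path, actual_mode, expected_mode) for every drifted entry."""
    drifted = []
    pending = [root]
    while pending:
        path = pending.pop()
        info = os.lstat(path)  # lstat so symlinks are never followed
        mode = stat.S_IMODE(info.st_mode)
        if stat.S_ISDIR(info.st_mode):
            if mode != EXPECTED_DIR_MODE:
                drifted.append((path, mode, EXPECTED_DIR_MODE))
            pending.extend(os.path.join(path, name) for name in os.listdir(path))
        elif stat.S_ISREG(info.st_mode) and mode != EXPECTED_FILE_MODE:
            drifted.append((path, mode, EXPECTED_FILE_MODE))
    return drifted
```

A tree collapsed to `700/600` would show up here as one entry per directory and file, which is the condition the post-command cleanup is meant to repair.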

## Related Issue

Fixes #6047

## Changes

- Supervise entrypoint one-shot commands so `TERM` and `INT` are
forwarded, the direct child is reaped, permission cleanup always runs,
and the remote status is preserved when cleanup succeeds.
- After the public OpenShell exec returns, inspect registered OpenClaw
sandboxes and repair only detected mutable-permission drift. Repairs go
through the installed descriptor-safe normalizer while holding the
timer-bound shields mutation lock, and must pass re-inspection. Hermes,
custom agents, unregistered sandboxes, and active shields locks remain
untouched.
- Move mutable-tree normalization, baseline capture, and empty-config
recovery into an installed root-trusted Python helper that operates
through pinned, no-follow descriptors.
- Authenticate the permanently privilege-dropped owner child with a
private Unix socket, `SO_PASSCRED`, exact credentials, and `SCM_RIGHTS`;
retain the exact directory/config descriptors across the privilege
boundary.
- Replace recovery baseline/config/hash entries with fresh inodes so
hardlinks, symlink swaps, directory replacement, and inode-reuse races
cannot turn root into a confused deputy.
- Fail closed on missing trusted helper, unexpected ownership, unsafe
links, metadata changes, malformed descriptor handoff, or incomplete
verification.
- Document host-side cleanup behavior, atomic recovery, failure
precedence, and safe operator recovery guidance.
- Add unit, integration, container, and live-target coverage for
permission drift, signals, cleanup precedence, capability loss, hardlink
safety, path swaps, protected symlinks, and trusted-helper selection.
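To make the "pinned, no-follow descriptors" idea concrete, here is a minimal sketch of a descriptor-relative mode repair. It is not the installed helper: the function name `normalize_tree`, the choice to raise `PermissionError`, and the rule of rejecting every symlink and multiply-linked file are assumptions made for illustration. It also omits ownership checks and the privilege-dropped child. It relies on Linux-style support for `os.listdir` on a descriptor and `dir_fd` arguments.

```python
import os
import stat

DIR_MODE = 0o2770   # assumed directory mode from the 2770/660 contract
FILE_MODE = 0o660   # assumed regular-file mode from the same contract


def normalize_tree(dir_fd):
    """Repair modes below an open directory descriptor; return the change count.

    Every entry is opened relative to dir_fd with O_NOFOLLOW, and all checks
    and chmods run on the resulting descriptor, not on a path.
    """
    changed = 0
    if stat.S_IMODE(os.fstat(dir_fd).st_mode) != DIR_MODE:
        os.fchmod(dir_fd, DIR_MODE)
        changed += 1
    for name in os.listdir(dir_fd):
        before = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
        if stat.S_ISLNK(before.st_mode):
            raise PermissionError(f"unsafe symlink: {name!r}")  # fail closed
        flags = os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK
        fd = os.open(name, flags, dir_fd=dir_fd)
        try:
            info = os.fstat(fd)
            # The opened object must be the inode that was just inspected.
            if (info.st_dev, info.st_ino) != (before.st_dev, before.st_ino):
                raise PermissionError(f"entry changed during open: {name!r}")
            if stat.S_ISDIR(info.st_mode):
                changed += normalize_tree(fd)
            elif stat.S_ISREG(info.st_mode):
                if info.st_nlink != 1:
                    raise PermissionError(f"hardlinked file: {name!r}")
                if stat.S_IMODE(info.st_mode) != FILE_MODE:
                    os.fchmod(fd, FILE_MODE)
                    changed += 1
            else:
                raise PermissionError(f"unexpected file type: {name!r}")
        finally:
            os.close(fd)
    return changed
```

A caller would open the tree root once with `os.open(root, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)` and pass that descriptor in. Because every later operation is relative to a descriptor that is already held, swapping a path component after the open cannot redirect the `fchmod` to a different inode.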

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [x] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Quality Gates

- [x] Tests added or updated for changed behavior
- [ ] Existing tests cover changed behavior — justification:
- [ ] Tests not applicable — justification:
- [x] Docs updated for user-facing behavior changes
- [ ] Docs not applicable — justification:
- [x] Sensitive paths changed (security, policy, credentials, preflight,
onboarding, inference, runner, sandbox, or messaging)
- [x] Sensitive-path review completed or maintainer-approved waiver
recorded — reviewer/approval link/justification: independent review
through runtime head `85b0744310553ec4a141bdc031cdc881e8c57422`
confirmed the previously reproducible cross-phase ABA, external-hardlink
ownership/mode mutation, path-based recovery TOCTOU, root
helper-selection, and earlier-tree hardlink-alias issues are fixed.
Follow-up `64d6234a9198e14e582155ebae4c75c362373f10` changes test
control flow only. The owner child is permanently privilege-dropped
before recursive mutation and root performs no work without an
authenticated descriptor handoff. No runtime privilege-boundary blocker
remains; the same-UID child retains only authority that the sandbox user
already has over its own inode.
- [ ] Non-success, skipped, or missing CI check accepted by maintainer —
check name, approval link, and follow-up issue:

## Verification

- [x] PR description includes the DCO sign-off declaration and every
commit appears as `Verified` in GitHub
- [x] Git hooks passed during commit and push, or `npx prek run
--from-ref main --to-ref HEAD` passes
- [x] Targeted tests pass for changed behavior
- [ ] Full `npm test` passes (broad runtime changes only)
- [x] Quality Gates section completed with required justifications or
waivers
- [x] No secrets, API keys, or credentials committed
- [ ] `npm run docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

### Evidence

- Signed commit and push hooks: repository checks, Biome, ShellCheck,
Hadolint, gitleaks, CLI typecheck, full CLI test hook, source-shape
budget, and test-size budget passed.
- Focused host tests: the final runtime correction passed 67 CLI and 145
integration tests after the earlier focused suites; CLI type-checking
passed.
- Fresh production image: build passed; new E2E cases 30–30f all passed.
The full script reported 40 passes and two failures from unrelated, stale
base-image profile assertions.
- Independent container probes passed normal repair, exact-config
capture, empty-config recovery, hardlink/protected-target invariants,
and the final `700/600` to `2770/660` production-image repair path.
- Full CLI/coverage hooks, repository checks, source-shape budget, and
test-size budget passed under the repository's expected `umask 022`.
- Docs build completed with 0 errors and 2 pre-existing warnings;
agent-variant synchronization and docs checks passed.
- Security test-depth follow-ups remain non-blocking: malformed
ancillary-message variants, an isolated missing-`CAP_SETUID` case, 16
MiB boundary/source-mutation/temp-cleanup cases, exact-image provenance
when reusing an existing E2E tag, and a same-UID concurrent post-check
hardlink race that cannot increase the child process's existing
authority.

---

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>



## Summary by CodeRabbit

* **New Features**
  * Added safer one-shot command handling with post-command config
    permission cleanup.
  * Improved config recovery and permission repair behavior for OpenClaw
    environments.
  * Updated documentation to describe the new cleanup and recovery
    behavior more clearly.

* **Bug Fixes**
  * Hardened config permission handling against symlinks, ownership
    mismatches, and concurrent changes.
  * Improved failure handling so unsafe or incomplete repairs now fail
    closed with clearer status reporting.


---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
2026-07-01 11:40:56 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| checks | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| e2e | test(e2e): retire legacy shell lanes (#5756) | 2026-06-29 22:32:24 -05:00 |
| lib | fix(scripts): restore .openclaw perms after nemoclaw exec command (#6060) | 2026-07-01 11:40:56 -07:00 |
| scorecard | test(e2e): retire legacy shell lanes (#5756) | 2026-06-29 22:32:24 -05:00 |
| backup-workspace.sh | fix(e2e): restore backup directories file-by-file (#3619) | 2026-05-15 13:44:29 -07:00 |
| bedrock-runtime-adapter.js | fix(inference): auto-detect Bedrock Runtime custom endpoints (#3767) | 2026-05-19 14:10:02 -07:00 |
| benchmark-sandbox-image-build.ts | refactor(scripts): migrate low-hanging JavaScript to TypeScript (#4367) | 2026-05-29 08:33:32 -07:00 |
| bootstrap-windows.ps1 | fix(windows): redact private data from Windows bootstrap WSL output (#6009) | 2026-06-29 21:07:10 -07:00 |
| brev-launchable-ci-cpu.sh | chore(openshell): upgrade supported version to 0.0.71 (#5596) | 2026-06-30 11:44:27 -04:00 |
| bump-version.ts | chore(release): document tag-based workflow (#5545) | 2026-06-19 11:17:19 -07:00 |
| check-coverage-ratchet.ts | chore(tooling): enforce honest coverage reporting (#3154) | 2026-05-08 08:48:02 -07:00 |
| check-dist-sourcemaps.ts | test(cli): stabilize coverage dist sourcemaps (#2961) | 2026-05-05 22:50:42 +00:00 |
| check-env-var-docs.ts | docs: remove legacy markdown docs and refresh MDX checks (#3837) | 2026-05-19 16:39:53 -07:00 |
| check-installer-hash.sh | fix(security): add SHA-256 integrity verification for Ollama installer (#2048) | 2026-04-20 18:43:49 -07:00 |
| check-node-version.js | fix(install): fail fast on unsupported Node before prepare runs (#2399) | 2026-05-04 22:43:27 -07:00 |
| check-spdx-headers.sh | feat(tooling): SPDX --fix, hook ordering, and strict shellcheck fixes (#670) | 2026-03-24 09:21:03 -07:00 |
| check-stale-dist.ts | fix(onboard): warn when compiled dist/ is older than src/ (#1958) (#2161) | 2026-04-21 06:23:52 -07:00 |
| check-test-file-size-budget.ts | ci: guard oversized test files (#4905) | 2026-06-06 23:36:21 -07:00 |
| check-version-tag-sync.sh | fix: derive CLI version from git tags instead of hard-coded package.json (#1221) | 2026-03-31 21:12:08 -07:00 |
| clean-staged-tree.sh | fix: improve gateway lifecycle recovery (#953) | 2026-03-25 23:23:20 -04:00 |
| codex-acp-wrapper.sh | chore: upgrade OpenClaw from 2026.4.9 to 2026.4.24 (#2484) | 2026-04-29 09:25:10 -07:00 |
| debug.sh | refactor(cli): wrap debug script with TypeScript command (#3088) | 2026-05-06 19:51:33 +00:00 |
| dev-tier-selector.ts | refactor(scripts): migrate low-hanging JavaScript to TypeScript (#4367) | 2026-05-29 08:33:32 -07:00 |
| find-source-shape-tests.ts | test(e2e): retire legacy shell lanes (#5756) | 2026-06-29 22:32:24 -05:00 |
| find-test-conditionals.ts | ci(test): ratchet test conditional growth (#5558) | 2026-06-19 16:39:25 -07:00 |
| fix-coredns.sh | refactor(cli): move coredns patcher behind internal command (#3074) | 2026-05-06 12:05:36 -07:00 |
| gateway-control.sh | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| generate-openclaw-config.mts | fix(onboard): default extra-agents workspace and agentDir when omitted (#5661) | 2026-06-24 14:06:34 -07:00 |
| generate-platform-docs.py | docs(platform): publish canonical platform support matrix (#4630) (#5712) | 2026-06-24 08:37:22 -07:00 |
| install-openshell.sh | chore(openshell): upgrade supported version to 0.0.71 (#5596) | 2026-06-30 11:44:27 -04:00 |
| install.sh | feat(cli): add Deep Agents aliases (#5881) | 2026-06-26 14:19:19 -07:00 |
| list-command-helper-uses.ts | chore(scripts): add command-helper usage inventory (#2624) | 2026-04-28 10:14:13 -07:00 |
| managed-gateway-control.py | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| nemoclaw-start.sh | fix(scripts): restore .openclaw perms after nemoclaw exec command (#6060) | 2026-07-01 11:40:56 -07:00 |
| npm-link-or-shim.sh | refactor(cli): move dev shim install behind internal command (#3090) | 2026-05-06 19:56:34 +00:00 |
| ollama-auth-proxy.js | fix(onboard): diagnose Ollama auth-proxy port conflicts and recover startup (#5040) | 2026-06-10 10:42:33 -07:00 |
| openclaw-config-guard.py | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| patch-openclaw-chat-send.js | chore(scripts): add --audit mode to patch-openclaw-chat-send.js (#5462) | 2026-06-24 13:59:45 -07:00 |
| patch-openclaw-tool-catalog.js | chore: upgrade agent runtime dependencies (#3925) | 2026-05-22 08:26:00 -07:00 |
| release-cut-tag.sh | ci(release): automate latest tag promotion (#4702) | 2026-06-03 00:05:57 -07:00 |
| release-latest-tag.sh | fix(release): configure latest tag identity (#4734) | 2026-06-03 17:56:27 -07:00 |
| release-notes-data.ts | fix(release): configure latest tag identity (#4734) | 2026-06-03 17:56:27 -07:00 |
| release-plan.ts | ci(release): automate latest tag promotion (#4702) | 2026-06-03 00:05:57 -07:00 |
| release-wait-latest.sh | fix(release): configure latest tag identity (#4734) | 2026-06-03 17:56:27 -07:00 |
| setup-dns-proxy.sh | refactor(cli): move dns proxy setup behind internal command (#3075) | 2026-05-06 19:20:05 +00:00 |
| setup-jetson.sh | fix(jetson): apply br_netfilter on JetPack R39 when missing (Fixes #2418) (#2419) | 2026-04-24 06:28:52 -07:00 |
| smoke-macos-install.sh | fix(inference): use NVIDIA inference credential env (#5366) | 2026-06-12 18:22:26 -07:00 |
| start-services.sh | fix(cli): render dynamic banner boxes for long URLs (#3370) | 2026-05-12 16:29:00 -07:00 |
| state-dir-guard.py | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| sync-agent-variant-docs.ts | docs: standardize copyable command examples (#4759) | 2026-06-08 11:52:50 -07:00 |
| test-inference-local.sh | fix(deploy): correct vLLM HF model id and pass HF_TOKEN to VM (#2729) | 2026-04-30 10:13:09 -07:00 |
| test-inference.sh | chore: fix all prek lint findings and wire prek into CI (#705) | 2026-03-23 08:09:23 -07:00 |
| type-safety-hotspots.ts | feat(cli): report nullable union hotspots (#5524) | 2026-06-17 00:03:43 -07:00 |
| update-docker-pin.sh | feat(onboard): use OpenShell Docker GPU sandboxes (#3001) | 2026-05-11 10:26:13 -07:00 |
| update-hermes-agent.sh | feat(hermes): bump Hermes Agent to v2026.6.19 (#5594) | 2026-06-25 13:17:36 -07:00 |
| validate-configs.ts | test(scanner): catch source-shape assertions (#4138) | 2026-05-24 16:54:41 -07:00 |
| walkthrough.sh | fix(inference): use NVIDIA inference credential env (#5366) | 2026-06-12 18:22:26 -07:00 |
| watch-fern-preview.ts | docs: generate agent variant code samples at build time (#4721) | 2026-06-03 13:59:57 -07:00 |