NemoClaw/docs
Prekshi Vyas b8330ae5c4
fix(scripts): restore .openclaw perms after nemoclaw exec command (#6060)
<!-- markdownlint-disable MD041 -->
## Summary

`openclaw doctor --fix` can collapse mutable OpenClaw paths from
NemoClaw's multi-UID `2770/660` contract to OpenClaw's single-user
`700/600` defaults. This change restores that contract after both
entrypoint one-shot commands and the documented `nemoclaw <name> exec`
boundary, with deterministic child-versus-cleanup exit-status
precedence.
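
The supervision and exit-status precedence described here can be sketched as follows. This is a minimal illustration, not the entrypoint's actual code: `run_one_shot`, `resolve_exit_status`, and the `cleanup` callback are hypothetical names. The rule that a failed cleanup replaces the child's status is an assumption, since the description only states that the child's status is preserved when cleanup succeeds.

```python
import signal
import subprocess


def resolve_exit_status(child_status: int, cleanup_status: int) -> int:
    # Preserve the child's status when cleanup succeeded.
    # ASSUMPTION: a failed cleanup overrides the child's status.
    return child_status if cleanup_status == 0 else cleanup_status


def run_one_shot(argv, cleanup) -> int:
    """Run argv as a supervised child, then always run cleanup()."""
    child = subprocess.Popen(argv)

    def forward(signum, _frame):
        # Relay TERM/INT to the direct child instead of exiting first.
        child.send_signal(signum)

    previous = {
        sig: signal.signal(sig, forward)
        for sig in (signal.SIGTERM, signal.SIGINT)
    }
    try:
        returncode = child.wait()  # reap the direct child
    finally:
        for sig, handler in previous.items():
            # A handler installed outside Python is reported as None.
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
        cleanup_status = cleanup()  # runs on every path out of wait()

    # Popen reports death by signal N as -N; map it to the shell's 128+N.
    child_status = returncode if returncode >= 0 else 128 - returncode
    return resolve_exit_status(child_status, cleanup_status)
```

A caller would pass the permission-restoring step as `cleanup` and exit with the returned value, for example `sys.exit(run_one_shot(sys.argv[1:], restore_permissions))`, where `restore_permissions` is a hypothetical function that returns `0` on success.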

## Related Issue

Fixes #6047

## Changes

- Supervise entrypoint one-shot commands so `TERM` and `INT` are
forwarded, the direct child is reaped, permission cleanup always runs,
and the remote status is preserved when cleanup succeeds.
- After public OpenShell exec returns, inspect registered OpenClaw
sandboxes, repair only detected mutable-permission drift through the
installed descriptor-safe normalizer while holding the timer-bound
shields mutation lock, and require successful re-inspection. Hermes,
custom agents, unregistered sandboxes, and active shields locks remain
untouched.
- Move mutable-tree normalization, baseline capture, and empty-config
recovery into an installed root-trusted Python helper that operates
through pinned, no-follow descriptors.
- Authenticate the permanently privilege-dropped owner child with a
private Unix socket, `SO_PASSCRED`, exact credentials, and `SCM_RIGHTS`;
retain the exact directory/config descriptors across the privilege
boundary.
- Replace recovery baseline/config/hash entries with fresh inodes so
hardlinks, symlink swaps, directory replacement, and inode-reuse races
cannot turn root into a confused deputy.
- Fail closed on a missing trusted helper, unexpected ownership, unsafe
links, metadata changes, a malformed descriptor handoff, or incomplete
verification.
- Document host-side cleanup behavior, atomic recovery, failure
precedence, and safe operator recovery guidance.
- Add unit, integration, container, and live-target coverage for
permission drift, signals, cleanup precedence, capability loss, hardlink
safety, path swaps, protected symlinks, and trusted-helper selection.
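
The descriptor-based, no-follow normalization named in the list above can be illustrated with a short sketch. It is not the installed helper: `normalize_tree` and `UnsafeEntry` are hypothetical names, and refusing every symlink and every multiply-linked file is a simplifying assumption. Only the `2770/660` target modes come from the description.

```python
import os
import stat

DIR_MODE = 0o2770   # setgid plus rwx for owner and group
FILE_MODE = 0o660   # rw for owner and group


class UnsafeEntry(Exception):
    """Raised instead of repairing when an entry cannot be handled safely."""


def normalize_tree(dir_fd: int, label: str = ".") -> list:
    """Repair mode drift below an already-open directory descriptor.

    Every entry is opened relative to its parent descriptor with
    O_NOFOLLOW, so a symlink swapped in mid-walk is refused, not followed.
    Returns the labels of the entries whose mode was changed.
    """
    repaired = []
    if stat.S_IMODE(os.fstat(dir_fd).st_mode) != DIR_MODE:
        os.fchmod(dir_fd, DIR_MODE)
        repaired.append(label)
    for name in sorted(os.listdir(dir_fd)):
        path = f"{label}/{name}"
        flags = os.O_RDONLY | os.O_NOFOLLOW | os.O_NONBLOCK | os.O_CLOEXEC
        try:
            fd = os.open(name, flags, dir_fd=dir_fd)
        except OSError as exc:
            # Fail closed: a symlink (ELOOP) or unreadable entry stops the walk.
            raise UnsafeEntry(f"{path}: {exc.strerror}") from exc
        try:
            info = os.fstat(fd)  # metadata of the inode actually opened
            if stat.S_ISDIR(info.st_mode):
                repaired += normalize_tree(fd, path)
            elif stat.S_ISREG(info.st_mode) and info.st_nlink == 1:
                if stat.S_IMODE(info.st_mode) != FILE_MODE:
                    os.fchmod(fd, FILE_MODE)
                    repaired.append(path)
            else:
                # Hardlinked files and special files are never chmod-ed.
                raise UnsafeEntry(f"{path}: not a singly linked file or directory")
        finally:
            os.close(fd)
    return repaired
```

Working through `fstat` and `fchmod` on the opened descriptor, rather than `stat` and `chmod` on a path, means the check and the mutation apply to the same inode even if the path is replaced between the two calls.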

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [x] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Quality Gates

- [x] Tests added or updated for changed behavior
- [ ] Existing tests cover changed behavior — justification:
- [ ] Tests not applicable — justification:
- [x] Docs updated for user-facing behavior changes
- [ ] Docs not applicable — justification:
- [x] Sensitive paths changed (security, policy, credentials, preflight,
onboarding, inference, runner, sandbox, or messaging)
- [x] Sensitive-path review completed or maintainer-approved waiver
recorded — reviewer/approval link/justification: independent review
through runtime head `85b0744310553ec4a141bdc031cdc881e8c57422`
confirmed the previously reproducible cross-phase ABA, external-hardlink
ownership/mode mutation, path-based recovery TOCTOU, root
helper-selection, and earlier-tree hardlink-alias issues are fixed.
Follow-up `64d6234a9198e14e582155ebae4c75c362373f10` changes test
control flow only. The owner child is permanently privilege-dropped
before recursive mutation and root performs no work without an
authenticated descriptor handoff. No runtime privilege-boundary blocker
remains; the same-UID child retains only authority that the sandbox user
already has over its own inode.
- [ ] Non-success, skipped, or missing CI check accepted by maintainer —
check name, approval link, and follow-up issue:

## Verification

- [x] PR description includes the DCO sign-off declaration and every
commit appears as `Verified` in GitHub
- [x] Git hooks passed during commit and push, or
`npx prek run --from-ref main --to-ref HEAD` passes
- [x] Targeted tests pass for changed behavior
- [ ] Full `npm test` passes (broad runtime changes only)
- [x] Quality Gates section completed with required justifications or
waivers
- [x] No secrets, API keys, or credentials committed
- [ ] `npm run docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

### Evidence

- Signed commit and push hooks: repository checks, Biome, ShellCheck,
Hadolint, gitleaks, CLI typecheck, full CLI test hook, source-shape
budget, and test-size budget passed.
- Focused host tests: the final runtime correction passed 67 CLI and 145
integration tests after the earlier focused suites; CLI type-checking
passed.
- Fresh production image: build passed; new E2E cases 30–30f all passed.
The full script reported 40 passes and two unrelated failures from stale
base-image profile assertions.
- Independent container probes passed normal repair, exact-config
capture, empty-config recovery, hardlink/protected-target invariants,
and the final `700/600` to `2770/660` production-image repair path.
- Full CLI/coverage hooks, repository checks, source-shape budget, and
test-size budget passed under the repository's expected `umask 022`.
- Docs build completed with 0 errors and 2 pre-existing warnings;
agent-variant synchronization and docs checks passed.
- Security test-depth follow-ups remain non-blocking: malformed
ancillary-message variants, an isolated missing-`CAP_SETUID` case, 16
MiB boundary/source-mutation/temp-cleanup cases, exact-image provenance
when reusing an existing E2E tag, and a same-UID concurrent post-check
hardlink race that cannot increase the child process's existing
authority.

---

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Carlos Villela <cvillela@nvidia.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Added safer one-shot command handling with post-command config
    permission cleanup.
  * Improved config recovery and permission repair behavior for OpenClaw
    environments.
  * Updated documentation to describe the new cleanup and recovery
    behavior more clearly.

* **Bug Fixes**
  * Hardened config permission handling against symlinks, ownership
    mismatches, and concurrent changes.
  * Improved failure handling so unsafe or incomplete repairs now fail
    closed with clearer status reporting.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
2026-07-01 11:40:56 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| _components | chore: retire docs-to-skills and make single compact user skill (#5699) | 2026-06-23 18:44:50 -07:00 |
| _ext | fix(security): harden low-risk code scanning findings (#3657) | 2026-05-17 17:47:40 -07:00 |
| _templates | initiate nemoclaw doc | 2026-03-15 11:22:49 -07:00 |
| about | docs: refresh v0.0.71 release docs (#6070) | 2026-06-30 18:38:44 -04:00 |
| deployment | docs: fix doc-validate findings across guide pages (#5630-#5640) (#5645) | 2026-06-30 12:06:31 -07:00 |
| get-started | feat(credentials): add CLI subcommand to register provider credentials (#5969) | 2026-07-01 00:01:53 -07:00 |
| inference | fix(inference): retire GLM 5.1 endpoint selection (#6069) | 2026-07-01 10:00:36 -07:00 |
| manage-sandboxes | docs: refresh v0.0.70 release docs (#6067) | 2026-06-30 16:59:40 -04:00 |
| monitoring | docs: fix doc-validate findings across guide pages (#5630-#5640) (#5645) | 2026-06-30 12:06:31 -07:00 |
| network-policy | docs: fix doc-validate findings across guide pages (#5630-#5640) (#5645) | 2026-06-30 12:06:31 -07:00 |
| reference | fix(scripts): restore .openclaw perms after nemoclaw exec command (#6060) | 2026-07-01 11:40:56 -07:00 |
| resources | docs: refresh technical documentation style (#5875) | 2026-06-26 11:34:06 -07:00 |
| security | fix(sandbox): add host-mediated gateway restart (#5874) | 2026-06-30 14:46:13 -04:00 |
| .docs-skip | docs: fix QA-reported command drift (#5247) | 2026-06-11 11:11:33 -07:00 |
| AGENTS.md | chore: retire docs-to-skills and make single compact user skill (#5699) | 2026-06-23 18:44:50 -07:00 |
| CONTRIBUTING.md | docs: refresh technical documentation style (#5875) | 2026-06-26 11:34:06 -07:00 |
| index.mdx | docs: clarify copyable how-to examples (#5828) | 2026-06-25 18:43:48 -07:00 |
| index.yml | docs: add model capability audit matrix (#5528) | 2026-06-30 20:49:45 +00:00 |