Commit graph

351 commits

Author SHA1 Message Date
Danila Gornushko
877b2b4241
Merge pull request #118 from okhsunrog/feat/debug-logging-unify
Some checks are pending
CI / kmod (android14-5.15) (push) Waiting to run
CI / kmod (android14-6.1) (push) Waiting to run
CI / kmod (android15-6.6) (push) Waiting to run
CI / kmod (android16-6.12) (push) Waiting to run
CI / setup (push) Waiting to run
CI / lint (push) Blocked by required conditions
CI / kmod (android12-5.10) (push) Waiting to run
CI / kmod (android13-5.10) (push) Waiting to run
CI / kmod (android13-5.15) (push) Waiting to run
CI / zygisk (push) Blocked by required conditions
CI / lsposed (push) Blocked by required conditions
CI / portshide (push) Waiting to run
CI / release (push) Blocked by required conditions
feat(logging): unify kmod and zygisk debug-logging fan-out
2026-04-28 00:24:36 +03:00
okhsunrog
7b4f34be69 style: fix ktlint blank-line warnings in DebugLoggingPrefs.kt
ktlint requires a blank line between a top-level declaration and the
next declaration that's preceded by a comment block (rule
standard:spacing-between-declarations-with-comments). Adds the two
missing blank lines around the new ZYGISK_DEBUG_LOGGING_PERSIST and
ZYGISK_DEBUG_LOGGING_FILE constants from the previous commit.
2026-04-28 00:20:20 +03:00
okhsunrog
f543284ffb refactor(zygisk): persist debug_logging outside module dir
The Zygisk debug_logging flag lived only in the module dir
(/data/adb/modules/vpnhide_zygisk/), which KSU/Magisk wipe on every
module reinstall. Until the user next opened the app, all newly
forked target apps under that module would fall back to the OFF
default — even when the user had toggled debug logging ON. Same
problem the targets.txt file solved long ago by living in the
persistent dir; apply the same fix here.

- Canonical location: /data/adb/vpnhide_zygisk/debug_logging
  (alongside the existing targets.txt). Survives module reinstall.
- writeDebugFlagFiles writes BOTH the persistent file and the
  module-dir mirror on every toggle, so the .so still reads its
  current value via get_module_dir() fd at next fork.
- zygisk/module/service.sh copies persistent → module-dir at every
  boot, mirroring the existing targets.txt copy. Restores the
  module-dir mirror after a reinstall without needing the app
  to open.
- MainActivity.onCreate re-propagation comment updated: this call
  is now a safety-net for "reinstall mid-session, no reboot yet"
  rather than the only defence against module-dir wipe.
- docs/state.md sections 1, 2, 3 updated.

No changelog fragment — internal refactor; the user-visible effect
(less likely to need to re-toggle after zygisk reinstall) is too
subtle for a CHANGELOG line.
2026-04-28 00:12:58 +03:00
okhsunrog
96dd1bf307 feat(logging): include kmod in debug-logging toggle fan-out
The "Debug logging" toggle in Diagnostics drove three sinks: app
process (VpnHideLog.enabled), system_server hooks (/data/system/
vpnhide_debug_logging via inotify FileObserver), and the Zygisk
module (/data/adb/modules/vpnhide_zygisk/debug_logging at fork).
The kernel module's /proc/vpnhide_debug was its own little channel,
flipped only inside exportDebugZip's capture window — so a user
who enabled "Debug logging" persistently still got silent kmod,
and the kmod-flip in exportDebugZip would even reset it back to
0 at the end regardless of the user's preference.

Make kmod the fourth sink:

- writeDebugFlagFiles now drives /proc/vpnhide_debug alongside the
  other two file flags, batched into a single su invocation so the
  toggle UI doesn't pay three round-trips.
- /proc/vpnhide_debug is per-boot in-kernel state, so kmod's
  service.sh re-seeds it from the canonical /data/system/
  vpnhide_debug_logging at every boot — same model as targets.txt
  → /proc/vpnhide_targets.
- Manual echo 1/echo 0 > /proc/vpnhide_debug calls in exportDebugZip
  go away — applyDebugLoggingRuntime + the existing loggingWasForced
  save/restore block now handle kmod uniformly with the other sinks,
  fixing the "kmod stays OFF after capture even if user toggle was
  ON" bug as a side effect.
- Updated en + ru toggle description to mention dmesg, since the
  toggle is no longer logcat-only.
- docs/state.md §3 and §4 updated; new "Debug-logging fan-out"
  subsection in §3 with a diagram.

Behaviour change worth a changelog entry: a user with toggle ON
will now get verbose kmod entries in dmesg too.
2026-04-28 00:09:49 +03:00
Danila Gornushko
9523d13634
Merge pull request #117 from okhsunrog/docs/state-and-paths
docs: catalogue every persistent path the project touches
2026-04-27 23:41:25 +03:00
okhsunrog
3c0fe63289 docs: catalogue every persistent path the project touches
Adds docs/state.md — a reference doc grouped by location prefix
(/data/adb/modules/, /data/adb/vpnhide_*/, /data/system/vpnhide_*,
/proc/vpnhide_*, app SharedPrefs + filesDir, iptables chains,
external paths we read). Each entry records format, writers,
readers (with file:line cites), lifetime, and mode/owner/SELinux
label where it matters.

Includes a heads-up that Vector LSPosed redirects the app's
SharedPreferences to /data/misc/<vector-uuid>/prefs/<pkg>/, so
inspecting /data/data/<pkg>/shared_prefs/ from a root shell
shows nothing even after the user toggles a setting.

Linked from CLAUDE.md's "Read before touching code" list.
2026-04-27 23:33:40 +03:00
Danila Gornushko
604e5cbeb4
Merge pull request #116 from okhsunrog/fix/lsposed-reflection-smoke-check
Some checks failed
CI / kmod (android13-5.15) (push) Waiting to run
CI / kmod (android14-5.15) (push) Waiting to run
CI / kmod (android14-6.1) (push) Waiting to run
CI / kmod (android15-6.6) (push) Waiting to run
CI / kmod (android16-6.12) (push) Waiting to run
CI / setup (push) Waiting to run
CI / lint (push) Blocked by required conditions
CI / kmod (android12-5.10) (push) Waiting to run
CI / kmod (android13-5.10) (push) Waiting to run
CI / zygisk (push) Blocked by required conditions
CI / lsposed (push) Blocked by required conditions
CI / portshide (push) Waiting to run
CI / release (push) Blocked by required conditions
CI Image / build-push (push) Has been cancelled
feat(lsposed): smoke-check private AOSP fields used by writeToParcel hooks
2026-04-27 04:23:23 +03:00
okhsunrog
3ca5920404 feat(lsposed): smoke-check private AOSP fields used by writeToParcel hooks
The three writeToParcel hooks reach into private NetworkCapabilities /
NetworkInfo / LinkProperties fields by reflection (mIfaceName, mRoutes,
mStackedLinks, mTransportTypes, mNetworkCapabilities, mTransportInfo,
mNetworkType, mState, mDetailedState, mIsAvailable) plus two private
constructors. If a future AOSP rev renames or retypes any of those, the
hook silently fails open per call: getObjectField throws NoSuchFieldError,
the inner catch logs it but doesn't replace param.result, the original
writeToParcel runs and the VPN flag leaks to the target app. Worse, the
NC hook reads its critical field BEFORE the isTarget check, so a rename
spams an exception on every system_server NetworkCapabilities IPC, not
just target ones.

HookEntry.installSystemServerHooks now runs runReflectionSmokeCheck()
once at install (one boot, ~12 reflection lookups, not hot path) over a
declarative probe table for the 10 fields + 2 ctors. Each FieldProbe
carries a typeCheck predicate (e.g. mTransportTypes must be `long`,
mRoutes must be MutableList-assignable) and an optional minSdk gate so
mTransportInfo (API 29) is skipped on Android 9 / API 28 rather than
being reported as broken there. Per-hook gates split the keys into
critical vs non-critical: LP installs as long as mIfaceName + copy ctor
are intact (mRoutes / mStackedLinks already have inner try/catch and
degrade gracefully); NC installs as long as the two long bitmasks are
intact (mTransportInfo's existing inner try/catch covers it); NI is
all-or-nothing because its body has no per-field defensive catches.
A broken critical key skips the hook entirely instead of installing
one that would throw NoSuchFieldError on every call.

writeHookStatusFile now always writes `aosp_sdk=<int>` and, when the
smoke-check rejected anything, `broken_fields=<comma list>` (the keys
include `:type=<actual>` for type mismatches, so a bug report carries
exact AOSP-drift information). DashboardData reads both and emits an
ERROR issue with both the SDK and the broken list so users can see and
report the drift even when the lsposed module otherwise heartbeats as
Active.

Verified empirically on Pixel 8 Pro / Android 16 (SDK 36): post-reboot
status file shows aosp_sdk=36 and no broken_fields line; logcat shows
all three hooks firing per call without exceptions.
2026-04-27 04:14:44 +03:00
Danila Gornushko
70315fff21
Merge pull request #115 from okhsunrog/chore/publish-agents-md
chore: publish CLAUDE.md / AGENTS.md repo conventions
2026-04-27 03:29:38 +03:00
okhsunrog
c1514db72a chore: publish CLAUDE.md / AGENTS.md repo conventions
Lives at CLAUDE.md (Claude Code's filename), exposed as AGENTS.md via
symlink so vendor-neutral tooling that follows the AGENTS-convention
(Codex, Cursor, etc.) finds the same file. Both names point at one
canonical source.

Content:
  - layout overview (kmod / zygisk / lsposed / portshide)
  - links to CONTRIBUTING / docs / kmod/BUILDING (the rest of the repo
    rules already live there; this file just makes them prominent)
  - workflow rules: changelog fragment per user-visible change,
    don't bump VERSION / don't run release.py without ask, don't put
    `#NN` in commit messages (auto-links to wrong PRs)
  - build entry points: kmod/build.py, zygisk/build.py, gradle commands
  - design note about a possible KPatch-Next KPM port

The only side effect is that the auto-review workflow (claude-code-review)
now has CLAUDE.md to check compliance against — two of its four agents
were running idle without one. Scope is plain markdown, useful as a
short orientation file regardless of which (if any) AI tooling reads it.
2026-04-27 03:23:10 +03:00
Danila Gornushko
d26f7096f3
Merge pull request #113 from okhsunrog/fix/portshide-reapply-after-netd-flush
fix(portshide): re-apply iptables 30s after boot in case netd flushes
2026-04-27 02:59:08 +03:00
okhsunrog
b8682882c2 ci(claude-review): pass --comment so the plugin actually posts the review
The `code-review` plugin is dry-run by default — it formats the review
into the Job Summary but does NOT publish a PR comment unless `--comment`
is passed in the slash invocation. That's why every auto-review run so
far finished green with permission_denials_count > 0 and zero comments
on the PR: the action ran, Claude reviewed, but the plugin's last
sentence was "No --comment argument was provided, so no GitHub comment
will be posted."

Adding `--comment` to the prompt. Permissions were never the blocker —
the `claude[bot]` GitHub App already has write on issues + PRs at
install time, and the on-demand workflow (`@claude` mentions) has been
posting fine without any of our workflow-level changes.
2026-04-27 02:57:31 +03:00
okhsunrog
3fb956eb74 fix(portshide): re-apply iptables 30s after boot in case netd flushes
service.sh waits for netd's bw_OUTPUT chain to appear, then applies our
rules once. On slow boots netd has been observed to flush/rebuild its
own chains AFTER bw_OUTPUT first appears — wiping the rules we just
installed. The rules then stay missing until the user opens the app
and Save runs apply again.

Add a backgrounded second pass 30 seconds after the first apply. The
apply script is already idempotent (chains created with
`-N ... 2>/dev/null || true`, atomically rebuilt via
`iptables-restore --noflush`), so the extra pass is harmless when
nothing was wiped and self-healing when it was. Cost is one shell
process and one iptables-restore per boot.

shellcheck clean (no new lint).
2026-04-27 02:45:51 +03:00
okhsunrog
4a8f5e3495 ci(claude): revert workflow-level write perms, enable display_report
Earlier commit e977af0 raised pull-requests/issues to `write` in both
workflows on the assumption that workflow-level permissions gate the
Claude action's GitHub API. They don't — the action uses an OIDC token
exchanged for the `claude[bot]` GitHub App installation token, whose
permissions are configured at App install time and already include
read+write on issues + pull requests. Workflow `permissions:` only
control the default `secrets.GITHUB_TOKEN`, which the action doesn't
use unless `github_token: ${{ secrets.GITHUB_TOKEN }}` is passed in.

Revert the perms back to read-only (now matches reality) and add a
short comment explaining why — so the next reader doesn't try the
same dead end.

Also enable `display_report: true` on the auto-review job so Claude's
full review text shows up in the Actions log, even when it classifies
its inline findings as low-confidence and skips publishing them. Right
now we couldn't tell whether the action had nothing to flag or whether
the inline classifier filtered everything out — display_report makes
that visible without leaking secrets (it's the same review body that
would have been posted as a comment).
2026-04-27 02:43:43 +03:00
okhsunrog
e977af0064 ci(claude): grant pull-requests/issues write so reviews are actually posted
Both Claude workflows had only read-level scopes, so the auto-review
job ran for ~3 minutes per PR and silently dropped its findings — the
post-buffered-inline-comments step reported "No buffered inline
comments" because GitHub denied the API write. Same with the on-demand
@claude responder.

  claude-code-review.yml: pull-requests read -> write
  claude.yml:             pull-requests read -> write
                          issues read -> write

Deliberately keeping `contents: read` everywhere — Claude advises,
doesn't commit. If we ever want @claude-driven code edits, we'd add
that scope explicitly to the on-demand workflow only.
2026-04-27 02:29:12 +03:00
Danila Gornushko
f9e0ae7aca
Merge pull request #112 from okhsunrog/fix/lsposed-app-hardening
fix(lsposed/app): suExec timeout + cache refresh preserves edits
2026-04-27 02:28:28 +03:00
okhsunrog
83ebbd340c fix(lsposed/app): suExec timeout + cache refresh preserves edits
Two small Kotlin fixes — both prevent silent UX failures.

ShellUtils.suExec — bound the runtime of a single su invocation.
Previously `proc.waitFor()` blocked indefinitely if the su binary hung
(e.g. waiting on a GUI prompt the user dismissed). UI was already on
Dispatchers.IO so the main thread didn't block, but the spinner just
never went away.

  * Drain stdout on a dedicated thread (an AtomicReference holds the
    result), same as the existing stderr drain. `readText()` directly
    on the process input stream blocks until EOF — without a separate
    drain, a hung child means waitFor(timeout) never even gets reached.
  * `proc.waitFor(timeoutSec, TimeUnit.SECONDS)` with a 10-second cap.
    On timeout, destroyForcibly() the child and return exit=-1.
  * 1-second join on each drain to bound worst-case cleanup.
  * `SU_DEFAULT_TIMEOUT_SEC` constant + `timeoutSec` parameter on both
    `suExec` and `suExecAsync` so callers with a longer-running command
    (none today, but easy to add) can override.

AppHidingScreen / AppPickerScreen / PortsHidingScreen — don't drop
unsaved checkbox edits when a cache refresh fires under us.

  * `LaunchedEffect(cachedApps, targets)` rebuilds `allApps` from disk
    snapshot whenever either flow emits — ON_RESUME via
    `TargetsCache.ensureLoaded`, or another screen calling
    `TargetsCache.refresh()`. With dirty=true (user has unsaved
    checkbox toggles), the rebuild silently overwrote local state and
    flipped dirty back to false. Edits gone.
  * Add an `if (dirty) return@LaunchedEffect` guard at the top of each
    of the three rebuilders. Saves restore the snapshot-as-truth path
    naturally (Save flips dirty=false, then the next emission rebuilds).

Local: ktlint clean, `:app:lintDebug + :app:testDebugUnitTest` BUILD SUCCESSFUL.
Verified on Pixel 8 Pro: dashboard 26/26 PASS in Enforcing — no
regression in any probe.
2026-04-27 02:17:01 +03:00
Danila Gornushko
c92d690694
Merge pull request #110 from okhsunrog/fix/userspace-rust-robustness
fix(rust): tighten netlink/recvmsg edge cases + install rollback
2026-04-27 02:04:20 +03:00
Danila Gornushko
f7540cb39c
Merge pull request #111 from okhsunrog/add-claude-github-actions-1777244526193
Add Claude Code GitHub Workflow
2026-04-27 02:03:00 +03:00
Danila Gornushko
5b21f3a331 "Claude Code Review workflow" 2026-04-27 02:02:08 +03:00
Danila Gornushko
29e871d43c "Claude PR Assistant workflow" 2026-04-27 02:02:07 +03:00
okhsunrog
c63c12bf86 fix(rust): tighten netlink/recvmsg edge cases + install rollback
Four small touches across zygisk and lsposed/native, all on adversarial-
or kernel-edge-case paths. Happy paths and the per-app filter behaviour
are unchanged.

zygisk hooks: recvmsg multi-iov no longer bypasses the filter
  Old code bailed out as soon as `msg_iovlen != 1`. A caller could
  hand recvmsg an iov array with one extra zero-length iov to skip
  filtering entirely. Now we filter the portion of `ret` that landed
  in iov[0] (recvmsg fills iovs in order) and propagate any shrink
  back into the returned byte count. Bytes that fell into iov[1..]
  pass through — bionic only ever uses iov[0], so legitimate dumps
  are unaffected; an attacker who deliberately splits a netlink
  message across iovs at least loses everything that landed in
  iov[0], which is strictly better than the previous bypass.

zygisk install_hooks: roll back partial installs on error
  install_hooks() walks five libc symbols and `?`-propagates the
  first failure. Previously, if hook 1-4 succeeded and hook 5 failed,
  the process kept hooks 1-4 in place — the app would see filtered
  ioctl/getifaddrs/openat/recvmsg but unfiltered recv, a torn plan
  that's worse than no install at all. Now we collect each
  shadowhook stub as it installs and, on the first failure, walk
  back through them in LIFO order calling shadowhook_unhook before
  surfacing the original error. Added a thin `shadowhook::unhook`
  wrapper so the FFI call site stays out of lib.rs.

zygisk hooks: document the TOCTOU window in is_dirfd_proc_net
  match_rel_proc_net resolves dirfd via readlink before the open,
  so a caller who races dup2/fchdir between the two can pass the
  classifier and have the open land elsewhere. This is accepted
  exposure (caller-controlled self-DoS, real detectors don't go
  through dirfd anyway) — added a comment so the next reader
  doesn't think it's an oversight.

lsposed/native for_each_rtattr: pass bounds-checked payload slice
  Old callbacks computed `b.as_ptr().add(rta_off + 4)` and read
  payload from there. If a kernel response set rta_len == 4
  (header only, zero payload), the read would pass the rtattr
  loop check (`if rta.rta_len < 4 { break }`) and then access
  bytes belonging to the next attr or past the message end. Now
  for_each_rtattr verifies `off + rta_len <= end` and yields a
  `&[u8]` payload slice; callbacks check its length before
  reading (e.g. `payload.len() >= 4` for an i32 ifindex).
  Replaces the `*const i32` read with a safe `i32::from_ne_bytes`.

Verified on Pixel 8 Pro (husky, android14-6.1, Android 16):
  Enforcing  : 26/26 PASS, COLD start ~1.9 s.
  Permissive : 22/26 PASS — same four by-design FAILs as before
               (proc_dev / sys_class_net / proc_ipv6_route /
               proc_if_inet6). No regression in netlink_getlink,
               netlink_getroute, getifaddrs, ioctl_*, proc_route,
               proc_fib_trie. zygisk and APK-native md5 match local
               build artifacts.
2026-04-27 01:54:47 +03:00
Danila Gornushko
aa9b33c8bf
Merge pull request #109 from okhsunrog/fix/kmod-robustness-doc-sync
fix(kmod): robustness pass + doc sync
2026-04-27 01:38:50 +03:00
okhsunrog
d2ff69816e fix(kmod): robustness pass + doc sync (review items #21 #22 #29 + chatgpt)
Eight small touches in vpnhide_kmod.c plus a README sync. No behaviour
change for normal probe paths — the per-UID filter, the matcher, and
all six hooks behave identically. Edits target failure modes and
maintainability around the existing logic.

Code:

  arm64 ABI guard
    Handlers read syscall arguments via regs->regs[N] (AAPCS64). Add
    `#ifndef CONFIG_ARM64 #error … #endif` so a non-arm64 build fails
    loudly instead of silently producing a module that reads garbage.

  maxactive 20 -> VPNHIDE_KRETPROBE_MAXACTIVE = 64
    All six probes used .maxactive = 20, only marginally above the
    NR_CPUS*2 default (~18 on a 9-core Pixel 8 Pro). Hot ioctl/netlink
    paths under multi-app concurrency can exhaust that and silently
    bump nmissed (= leaked iface). 64 buys headroom for ~30 KB total.

  dev_ioctl: replace `data->cmd = 0` magic flag with `bool active`
    The old code set cmd to 0 in the entry handler when the caller
    wasn't a target, then keyed the ret handler on `cmd == 0`. Magic
    sentinel; if any future ioctl number ever hashed to 0 the flag
    would silently misbehave.

  filter_ifconf_buf returns enum, sock_ioctl_ret handles partial
    writes
    Old function silently bailed on copy_from_user / copy_to_user
    failure and could leave userspace with a half-compacted buffer
    plus the original (now-stale) ifc_len. Now the function returns
    `FILTER_IFCONF_NO_CHANGE / CHANGED / COPY_FAULT` and the caller
    skips the put_user(ifc_len) on COPY_FAULT — better to leak all
    ifaces visibly than to expose a length-vs-content mismatch.

  put_user(ifc_len) error checked
    Previously dropped on the floor — if updating ifc_len failed,
    userspace would see compacted buffer with old length. Now logs
    via vpnhide_dbg and returns; userspace falls back to the
    pre-compaction view.

  READ_ONCE/WRITE_ONCE around debug_enabled
    Single bool, written from /proc/vpnhide_debug, read from every
    probe handler. Compiler can't tear or hoist now — kosher style
    for unsynchronised flags.

  Header comment: dev_ioctl/sock_ioctl
    Corrected the file-top hook list — it still claimed `dev_ifconf`
    for SIOCGIFCONF, but the actual probe is on `sock_ioctl` (LTO
    inlines dev_ifconf on 5.10 + the symbol moves out of
    sock_do_ioctl on 6.1+, both rationale already in the inline
    comment block at hook 2).

Doc:

  README.md `rtnl_fill_ifinfo` table row + the standalone
  `-EMSGSIZE trick` and `why NOT -EMSGSIZE` sections were stale
  after #103 (which made all three netlink fill probes use
  `skb_trim` + return 0). Replaced with one short joint section
  pointing at issue #38 for context.

Verified on Pixel 8 Pro (husky, android14-6.1, Android 16):
  Enforcing  : 26/26 PASS, COLD start ~1020 ms.
  Permissive : 22/26 PASS, same 4 by-design FAILs as before, no
               regression in netlink_getlink / netlink_getroute /
               getifaddrs / ioctl_* / proc_route / proc_fib_trie.
2026-04-27 01:34:33 +03:00
Danila Gornushko
a4d2f9cb74
Merge pull request #108 from okhsunrog/chore/ci-and-scripts-cleanup
chore: CI + scripts cleanup (review items #12 #13 #24 #31 #32 #37)
2026-04-27 01:18:13 +03:00
okhsunrog
0d4cf09866 chore: CI + scripts cleanup (review items #12 #13 #24 #31 #32 #37)
Six small review-list items rolled together — all CI/dev-tooling, no
runtime behaviour change.

  #12  Dockerfile: pin Rust 1.95.0 and cargo-ndk 4.1.2 (was floating
       `stable` + latest cargo-ndk on monthly rebuild). Versions live
       in ENV vars to make the next bump a one-line edit.

  #13  Add shellcheck to lint job. SC2034/SC3043 excluded — Magisk
       reads SKIPUNZIP externally; Android's /system/bin/sh (mksh on
       Pixel) does support `local` despite POSIX. Verified locally
       that the 11 .sh files (module-side + dev tooling) pass.
       shellcheck baked into the CI image via apt; inline apt-get
       fallback covers the window before image rebuild.

  #24  ci.yml keystore.properties: replace heredoc with `printf '%s\n'`.
       Heredoc without single-quoted EOF re-expands $, backticks and
       backslashes in the password — printf takes the value verbatim.

  #31  scripts/release.py::patch_file now hard-fails when a regex
       pattern doesn't match (was silently leaving stale versions).

  #32  Split rotate_fragments_into_history into rotate + delete steps
       so release.py can save_json + write_md *before* unlinking the
       fragment files. If anything in between fails, fragments are
       still on disk and the run is retryable.

  #37  codegen-interfaces.py: emit `assert!(matches_vpn(…), msg)` /
       `assert!(!matches_vpn(…), msg)` instead of
       `assert_eq!(matches_vpn(…), true/false, msg)` —
       clippy::bool_assert_comparison was firing on every generated
       row under `cargo clippy --tests`. Both generated test modules
       regenerated. CI's clippy steps now also pass `--tests` so this
       class of regression is caught.
2026-04-27 01:14:03 +03:00
okhsunrog
e57417b686 ci: drop broken uniffi-bindgen pre-install from CI image
Some checks are pending
CI / kmod (android14-5.15) (push) Waiting to run
CI / kmod (android14-6.1) (push) Waiting to run
CI / kmod (android15-6.6) (push) Waiting to run
CI / kmod (android16-6.12) (push) Waiting to run
CI / setup (push) Waiting to run
CI / lint (push) Blocked by required conditions
CI / kmod (android12-5.10) (push) Waiting to run
CI / kmod (android13-5.10) (push) Waiting to run
CI / kmod (android13-5.15) (push) Waiting to run
CI / zygisk (push) Blocked by required conditions
CI / lsposed (push) Blocked by required conditions
CI / portshide (push) Waiting to run
CI / release (push) Blocked by required conditions
#106 added \`cargo install uniffi-bindgen --version "^0.29" --locked\`,
which fails ci-image rebuilds:

  error: could not find \`uniffi-bindgen\` in registry \`crates-io\`
         with version \`^0.29\`

Two errors in the original change:
  1. The crate Gobley installs is \`gobley-uniffi-bindgen\` (its own
     fork on crates.io at 0.3.7), not upstream \`uniffi-bindgen\`.
  2. Gobley installs the binary into \`app/build/gobley-tools-install/
     uniffi-bindgen/\`, not \`~/.cargo/bin\`. A globally pre-installed
     binary wouldn't satisfy the task's UP-TO-DATE check anyway.

\`org.gradle.caching=true\` from #106 already makes
\`installUniffiBindgen\` go UP-TO-DATE on warm runs (verified locally),
so the optimisation is in effect via the build cache instead.
2026-04-27 00:41:09 +03:00
Danila Gornushko
005a54a55f
Merge pull request #106 from okhsunrog/ci/more-gradle-speedups
ci: shave another ~80s off lint/lsposed jobs
2026-04-27 00:34:56 +03:00
okhsunrog
5350f8e2f6 ci: shave another ~80s off lint/lsposed jobs
Profiling the warm-cache run on PR #105 showed three remaining hot spots
in the Gradle phase:

  installUniffiBindgen          52s   ← cargo install on every CI build
  cargoBuildAndroidArm64Debug   30s   ← Rust crate compile
  lintAnalyze* (3 variants)     43s   ← AGP Lint × main + unit + androidTest

This PR cuts the first one entirely and trims the third.

  - Dockerfile: pre-install uniffi-bindgen 0.29.x in the CI image so
    Gobley's :app:installUniffiBindgen task finds it ready instead of
    rebuilding it from sources on every run. Triggers a ci-image
    rebuild on merge — wait for that workflow to finish before merging
    consumers (or the first lint/lsposed run will still hit the old
    image and behave as before).
  - lsposed/gradle.properties: enable build cache + configuration
    cache. Verified locally: `./gradlew :app:assembleDebug
    --configuration-cache` reports "Configuration cache entry stored"
    cleanly with Gobley 0.3.7 + AGP 8.9.3 + Kotlin 2.1.20.
  - lsposed/app/build.gradle.kts: `lint { checkTestSources = false }`.
    Skips lintAnalyzeDebugUnitTest / lintAnalyzeDebugAndroidTest. Test
    sources here are pure JVM unit-test logic — functional bugs caught
    by :app:testDebugUnitTest, no Android-lifecycle code to lint.
    Deliberately leave `checkReleaseBuilds` at its default so ad-hoc
    `./gradlew :app:lint` still catches R8/ProGuard issues.
  - .github/workflows/ci.yml: `:app:lint` -> `:app:lintDebug`. Lints
    the debug variant only on PRs; release-variant Lint stays
    available locally / for future tag-time CI.
  - docs/development.md: refresh local-lint snippet.

Expected effect on warm cache (cumulative on top of PR #105):
  lint     286s -> ~190s  (3m10s, -32%)
  lsposed  227s -> ~130s  (2m10s, -42%)
2026-04-27 00:28:25 +03:00
Danila Gornushko
7fd63f5e22
Merge pull request #105 from okhsunrog/ci/speed-up-gradle-jobs
ci: speed up gradle jobs + add ruff for python scripts
2026-04-27 00:01:36 +03:00
okhsunrog
71538b04a1 ci: replace bash [[ with POSIX case in lsposed Build APK step
Container jobs default to /bin/sh, not bash, so '[[ ... ]]' fails with
'[[: not found'. POSIX 'case' supports glob patterns natively.
2026-04-26 23:54:26 +03:00
okhsunrog
ecaddaa21d ci: pin ruff-action to v4.0.0 (no v4 alias — immutable releases) 2026-04-26 23:50:12 +03:00
okhsunrog
91013acb54 ci+chore: add ruff (format + lint) for python scripts
Repo had ~1800 lines of Python (kmod/build.py, scripts/*, zygisk/build.py,
portshide/build-zip.py) with no formatter or linter. Long-lived scripts
like scripts/release.py and scripts/codegen-interfaces.py benefit from
catching unused-import / undefined-name / outdated-syntax issues early.

  pyproject.toml — ruff config, target-py312, line-length 100,
                   rules E F W I B UP SIM. Excludes zygisk/third_party,
                   target/, .claude/.
  ci.yml — astral-sh/ruff-action@v4 for `format --check` and `check`,
           ahead of the slow Rust/Gradle steps so it fails fast.
  docs/development.md — add `uvx ruff …` to the local-lint snippet.

Cleanup applied (`ruff format` + `ruff check --fix`):
  - reformat: kmod/build.py, scripts/{changelog_lib,codegen-interfaces,
    release,stats}.py, zygisk/build.py
  - I001: split multi-name imports onto separate lines after the
    sys.path.insert prelude (kmod/build.py, zygisk/build.py)
  - E501 manual: wrap one console.print line in scripts/release.py

Stdlib-only invariant from scripts/build_lib.py is preserved — ruff is
a dev/CI tool, not imported at runtime.
2026-04-26 23:48:37 +03:00
okhsunrog
9986100a77 ci: speed up gradle jobs (cache + assembleDebug for PRs)
The two slowest CI jobs were both Gradle:
  lint     6m41s (Android lint = 264s of it)
  lsposed  6m44s (assembleRelease = 307s of it)
Cargo was already cached; Gradle was not.

Changes:

  - gradle/actions/setup-gradle@v6 in lint and lsposed jobs. Caches
    ~/.gradle/caches, wrapper, configuration cache. cache-read-only on
    PRs so only main pushes write it.
  - lint job now has a cargo cache too (was missing). Combined key for
    both Cargo.lock files.
  - lint: Android lint and Kotlin unit tests run in one Gradle
    invocation (./gradlew :app:lint :app:testDebugUnitTest). Saves a
    second Gobley/AGP configuration phase + JVM startup.
  - lsposed: assembleDebug for PRs and main pushes, assembleRelease only
    for v* tags. R8/ProGuard runs only when the artifact actually goes
    into a release.
  - Drop --no-daemon: with one invocation per job (or one warm daemon
    between two), keeping the daemon is cheaper than killing it.
  - Drop the manual `export ANDROID_NDK_ROOT="$ANDROID_NDK_HOME"`. The
    CI image's Dockerfile already sets ANDROID_NDK_ROOT (line 63), so
    the workaround is redundant.
2026-04-26 23:39:50 +03:00
Danila Gornushko
447c97a8a2
Merge pull request #104 from okhsunrog/chore/16kb-alignment-build-scripts
build: 16 KiB cdylib alignment + unified kmod/zygisk build scripts
2026-04-26 23:37:41 +03:00
okhsunrog
35b3dcdf50 build: align native cdylib on 16 KiB; unify kmod/zygisk build scripts
Two related changes that ship together because they touch the same
build-script + docs surface and were verified together on-device.

16 KiB alignment
  - zygisk/build.rs: pass `-Wl,-z,max-page-size=16384` to lld so the
    cdylib's LOAD segments line up on 16 KiB pages. NDK r28+ already
    does this by default, but the flag keeps r27 builds compatible.
  - lsposed/native/build.rs: new file, same flag, for libvpnhide_checks.so.
  - docs/development.md: bumped the NDK requirement to r28+ and noted
    the 16 KiB rationale.

Verified via `llvm-readelf -l`: both libvpnhide_zygisk.so and
libvpnhide_checks.so now show `Align 0x4000` on every LOAD segment.

Unified build entry points
  - kmod/build.py replaces kmod/build-zip.py. Single script that
    auto-detects whether to build natively (we're inside the DDK image
    or `--kdir` was passed) or to spawn `ghcr.io/ylarod/ddk-min` via
    podman/docker. CI uses the same script with `--inside-container`.
  - zygisk/build-zip.py renamed to zygisk/build.py for symmetry; logic
    unchanged.
  - kmod/BUILDING.md rewritten — local build is now one command:
    `./kmod/build.py --kmi android14-6.1` (or `--all`). The old
    hand-rolled podman/docker recipes are gone.
  - .github/workflows/ci.yml updated to call the new entry points.
    The DDK image tag in CI now has a comment pointing at
    `DDK_IMAGE_TAG` in kmod/build.py as the source of truth.
  - README.{md,en.md}, kmod/README.md, zygisk/README.md, docs/releasing.md,
    scripts/build_lib.py: reference updates.
  - README.en.md: also fixes a "bacame" typo and tightens the Windows
    zygisk-build note (the aux.rs / libgit2 issue is still real).

Verified end-to-end on Pixel 8 Pro (husky, android14-6.1, Android 16):
APK installs, kmod + zygisk modules load, all 26 self-checks PASS in
Enforcing, 22/26 PASS in Permissive (the same 4 by-design FAILs as
before — kmod doesn't cover those paths in Permissive).
2026-04-26 23:26:30 +03:00
Danila Gornushko
14f4765048
Merge pull request #103 from okhsunrog/fix/kmod-rtnl-fill-emsgsize-infinite-retry
fix(kmod): rtnl_fill_ifinfo skb_trim instead of -EMSGSIZE
2026-04-26 23:02:16 +03:00
okhsunrog
a0e645f506 fix(kmod): replace EMSGSIZE return with skb_trim in rtnl_fill_ifinfo
Returning -EMSGSIZE from rtnl_fill_ifinfo for VPN devs hangs RTM_GETLINK
dumps on android14-6.1: the dump iterator retries the same entry forever
on an empty skb. inet6_fill_ifaddr in this same file already documented
and worked around the issue using skb_trim + return-0; mirror that for
rtnl_fill_ifinfo so RTM_GETLINK skips VPN ifaces cleanly.

Reproduced on Pixel 8 Pro (husky, android14-6.1) with SELinux Permissive:
the netlink_getlink check inside dev.okhsunrog.vpnhide hung on splash
screen. After the fix all three netlink-backed checks (getifaddrs,
netlink_getlink, netlink_getroute) PASS in Permissive in <1s.
2026-04-26 22:54:40 +03:00
Danila Gornushko
e39e05c8eb
Merge pull request #102 from okhsunrog/fix/proc-net-tcp-truncation
fix(zygisk): grow /proc/net read buffer past 64 KiB instead of truncating
2026-04-26 21:15:03 +03:00
okhsunrog
ebe0e3676e fix(zygisk): grow /proc/net read buffer past 64 KiB instead of truncating
`open_filtered_proc_net` used a fixed 65 536-byte stack array as the
read buffer. On devices with several hundred concurrent TCP sockets
(p2p clients, browsers with many tabs, dev devices with tethering),
`/proc/net/tcp6` exceeds 64 KiB — the previous code silently dropped
the tail and handed the truncated content back to the caller via
memfd. An app reading its own socket list could miss real entries,
which is a correctness bug, not a security one (the truncated tail
might contain non-VPN sockets the app needs).

Replace the stack array with a thread-local growable Vec<u8>:

* Initial capacity 64 KiB matches the previous fixed buffer, so the
  first call (and most subsequent ones) does no reallocation.
* When the read fills capacity, `Vec::reserve(8 * 1024)` triggers
  amortised doubling — large files cost O(log size) reallocations.
* `clear()` between calls keeps capacity intact: the second and later
  calls in the same thread are zero-allocation.
* Bounded memory: per-thread overhead is `max-observed-size`. On
  Android only one or two threads ever open /proc/net/* (network-
  info worker, diagnostics probe), so the steady state is well under
  a megabyte of RAM.

Trade-off: had to drop the `const { ... }` block on the thread_local
because `Vec::with_capacity` is not const fn. The per-`with()` lazy-
init flag check is negligible relative to the syscalls we're about
to issue.

Empirically: on the device used for testing /proc/net/tcp6 stays at
~15 KiB (no truncation triggered with the old buffer either). The
fix is preventive for power-user scenarios where the limit is
actually reached. SELinux on stock Android blocks /proc/net/* for
untrusted_app domain entirely, so the new path is exercised only on
permissive ROMs or by privileged callers — testing was therefore
limited to verifying no regression on the happy path.
2026-04-26 17:46:13 +03:00
Danila Gornushko
7c7d725991
Merge pull request #101 from okhsunrog/fix/data-system-files-not-world-readable
fix: tighten /data/system/vpnhide_*.txt to 0640 root:system
2026-04-26 16:59:57 +03:00
okhsunrog
c571e59afd fix: tighten /data/system/vpnhide_*.txt to 0640 root:system
The three files written under /data/system/ to coordinate state between
the LSPosed system_server hook and the app — vpnhide_uids.txt,
vpnhide_hidden_pkgs.txt, vpnhide_observer_uids.txt — were chmodded
0644 root:root. /data/system/ itself is mode 0775 system:system,
traversable by untrusted apps, so any "other"-readable file there is
both enumerable (`ls /data/system/`) and openable by name. Untrusted
apps could:

  cat /data/system/vpnhide_uids.txt          # all target UIDs
  cat /data/system/vpnhide_hidden_pkgs.txt   # the hide list
  cat /data/system/vpnhide_observer_uids.txt # observer UIDs

If the reader's own UID is in vpnhide_uids.txt, that's a positive
"vpnhide is filtering me right now" detection — strictly stronger than
the presence-of-marker fingerprint we already closed for
vpnhide_hook_active in PR #100.

Switch every write site to mode 0640 + chown root:system. system_server
runs as UID 1000 with `system` (GID 1000) in its supplementary groups,
so it still gets read via the group bit. Untrusted apps fall to the
"other" octet (now ---) and get EACCES on open.

Empirically verified on Pixel:
  before: 644 root:root → `cat` from untrusted shell succeeds
  after:  640 root:system → untrusted shell EACCES;
                            `su system -c cat` (uid=1000) reads fine,
                            mirroring what system_server sees

Boot-time service.sh in both kmod and zygisk modules also include an
idempotent migration block that re-stamps any pre-PR files left at
0644 by an older version on the next boot. Closes #36 in REVIEW.
2026-04-26 16:41:06 +03:00
Danila Gornushko
750a2b13db
Merge pull request #100 from okhsunrog/fix/review-cheap-batch
Some checks are pending
CI / kmod (android13-5.15) (push) Waiting to run
CI / kmod (android14-5.15) (push) Waiting to run
CI / kmod (android14-6.1) (push) Waiting to run
CI / kmod (android15-6.6) (push) Waiting to run
CI / kmod (android16-6.12) (push) Waiting to run
CI / setup (push) Waiting to run
CI / lint (push) Blocked by required conditions
CI / kmod (android12-5.10) (push) Waiting to run
CI / kmod (android13-5.10) (push) Waiting to run
CI / zygisk (push) Blocked by required conditions
CI / lsposed (push) Blocked by required conditions
CI / portshide (push) Waiting to run
CI / release (push) Blocked by required conditions
fix: code-review cheap-batch — 10 small correctness/hardening fixes
2026-04-26 16:24:43 +03:00
okhsunrog
cd46097991 ci: narrow workflow contents permission to read; grant write only on release
Workflow-level `contents: write` was granted to every job — lint,
zygisk build, lsposed build, portshide build, kmod matrix — even
though only the release job needs it (to create the draft GitHub
release via softprops/action-gh-release@v2). Tighten to the
least-privilege default of `contents: read` at the workflow level
and override with `permissions: contents: write` on the release job
alone. Reduces blast radius if any of the lint/build jobs ever runs
untrusted code from a PR.
2026-04-26 16:04:43 +03:00
okhsunrog
91204f596a chore(lsposed): include vpnhide-zygisk tag in diagnostics logcat capture
The Diagnostics → Export debug bundle ran logcat with -s filtering
on VPNHideTest, VpnHide, and VpnHide-Dashboard tags only. zygisk
writes under `vpnhide-zygisk` (zygisk/src/lib.rs LOG_TAG), so the
zip the user ends up sharing was missing every native-side hook
trace — exactly the half that's hardest to reproduce locally.
2026-04-26 16:04:43 +03:00
okhsunrog
3c2ea4c1ff fix(lsposed): surface zygisk module-dir copy errors instead of swallowing them
`cp $ZYGISK_TARGETS $ZYGISK_MODULE_TARGETS 2>/dev/null; true` hid
both stderr and the exit code, so a read-only mount or SELinux
denial silently broke "edits in the app aren't picked up by zygisk
on next app launch" — the user had no signal.

Rewrite as `if [ -d ... ]; then cp ... 2>&1; fi`, capture exit and
output, and log a warn() only when both indicate a real failure
(non-zero exit + non-empty stderr). The dir-not-installed case
naturally produces exit=0 + empty output and stays quiet.
2026-04-26 16:04:43 +03:00
okhsunrog
d7ef5c7b8f refactor(lsposed): mStackedLinks rewrite only on actual changes
The else-branch in sanitizeLinkProperties' stacked-links loop set
`modified = true` whenever `stackedCopy !== value`, i.e. after any
successful clone. Cloning is the common case (the catch fallback is
just a defensive shim for a missing copy ctor), so the flag was
effectively always-true: every non-empty stacked map triggered the
trailing `stacked.clear() + putAll(filtered)`, even when no VPN data
was present and no modification happened.

Drop the `stackedCopy !== value` disjunct. `modified` now reflects
only real sanitization changes — the rewrite is skipped when there's
nothing to write back, and the clones in `filtered` are discarded
along with the never-mutated original entries staying in place. No
behavioural change for VPN-stripping callers; just less wasted work
on the non-VPN-bearing path.
2026-04-26 16:04:43 +03:00
okhsunrog
b9eacc63bc fix(lsposed): drop MODIFY from FileObserver mask on config files
inotify MODIFY fires on every write(2), not just on close. With the
current single-write `echo $b64 | base64 -d > $file` writers (see
ShellUtils.ensureSelfInTargets and the AppPicker save path) it
effectively always fires before CLOSE_WRITE, causing the hook to
invalidate the cache and re-read the file. If a writer ever switches
to a multi-write pattern (`echo line >> file` repeated, or anything
that flushes mid-buffer), the hook would see a partially-written
file. CLOSE_WRITE + MOVED_TO covers both single-write and atomic-
rename writers without the mid-write race.
2026-04-26 16:04:43 +03:00
okhsunrog
adec082804 fix(lsposed): stop making hook-status file world-readable
`/data/system/vpnhide_hook_active` was being chmodded to 0644 via
`setReadable(true, false)` so the VPN Hide app could read it
directly. But the app already runs every other read of files under
/data/system/ via `suExec` (see DashboardData.kt's read of this
exact file), so the world-readable bit was redundant — and a
discoverable marker for anti-tamper SDKs that scan /data/system/
for known filenames.

Drop the line; the file stays at the default 0640 system:system
mode, readable by system_server (which writes it) and by anything
running as root (which is how the app reads it).
2026-04-26 16:04:43 +03:00
okhsunrog
a0bec24576 chore(codegen): drop dead c == "" branch in test-name validation
`c` iterates one character at a time over the test name string;
a single character is never the empty string, so the disjunct was
always False. The remaining `0x20 <= ord(c) < 0x7F` already
correctly rejects non-ASCII names, so behaviour is unchanged.
2026-04-26 16:04:43 +03:00