fix(kmod): robustness pass + doc sync (review items #21 #22 #29 + chatgpt)

Eight small touches in vpnhide_kmod.c plus a README sync. No behaviour
change for normal probe paths — the per-UID filter, the matcher, and
all six hooks behave identically. Edits target failure modes and
maintainability around the existing logic.

Code:

  arm64 ABI guard
    Handlers read syscall arguments via regs->regs[N] (AAPCS64). Add
    `#ifndef CONFIG_ARM64 #error … #endif` so a non-arm64 build fails
    loudly instead of silently producing a module that reads garbage.

  maxactive 20 -> VPNHIDE_KRETPROBE_MAXACTIVE = 64
    All six probes used .maxactive = 20, only marginally above the
    NR_CPUS*2 default (~18 on a 9-core Pixel 8 Pro). Hot ioctl/netlink
    paths under multi-app concurrency can exhaust that and silently
    bump nmissed (= leaked iface). 64 buys headroom for ~30 KB total.

  dev_ioctl: replace `data->cmd = 0` magic flag with `bool active`
    The old code set cmd to 0 in the entry handler when the caller
    wasn't a target, then keyed the ret handler on `cmd == 0`. Magic
    sentinel; if any future ioctl number ever hashed to 0 the flag
    would silently misbehave.

  filter_ifconf_buf returns enum, sock_ioctl_ret handles partial
    writes
    Old function silently bailed on copy_from_user / copy_to_user
    failure and could leave userspace with a half-compacted buffer
    plus the original (now-stale) ifc_len. Now the function returns
    `FILTER_IFCONF_NO_CHANGE / CHANGED / COPY_FAULT` and the caller
    skips the put_user(ifc_len) on COPY_FAULT — better to leak all
    ifaces visibly than to expose a length-vs-content mismatch.

  put_user(ifc_len) error checked
    Previously dropped on the floor — if updating ifc_len failed,
    userspace would see compacted buffer with old length. Now logs
    via vpnhide_dbg and returns; userspace falls back to the
    pre-compaction view.

  READ_ONCE/WRITE_ONCE around debug_enabled
    Single bool, written from /proc/vpnhide_debug, read from every
    probe handler. Compiler can't tear or hoist now — kosher style
    for unsynchronised flags.

  Header comment: dev_ioctl/sock_ioctl
    Corrected the file-top hook list — it still claimed `dev_ifconf`
    for SIOCGIFCONF, but the actual probe is on `sock_ioctl` (LTO
    inlines dev_ifconf on 5.10 + the symbol moves out of
    sock_do_ioctl on 6.1+, both rationale already in the inline
    comment block at hook 2).

Doc:

  README.md `rtnl_fill_ifinfo` table row + the standalone
  `-EMSGSIZE trick` and `why NOT -EMSGSIZE` sections were stale
  after #103 (which made all three netlink fill probes use
  `skb_trim` + return 0). Replaced with one short joint section
  pointing at issue #38 for context.

Verified on Pixel 8 Pro (husky, android14-6.1, Android 16):
  Enforcing  : 26/26 PASS, COLD start ~1020 ms.
  Permissive : 22/26 PASS, same 4 by-design FAILs as before, no
               regression in netlink_getlink / netlink_getroute /
               getifaddrs / ioctl_* / proc_route / proc_fib_trie.
This commit is contained in:
okhsunrog 2026-04-27 01:34:33 +03:00
parent a4d2f9cb74
commit d2ff69816e
2 changed files with 108 additions and 43 deletions

View file

@ -10,7 +10,7 @@ Zero footprint in the target app's process -- no modified function prologues, no
|---|---|---|
| `dev_ioctl` | `SIOCGIFFLAGS`, `SIOCGIFNAME`, and other per-interface ioctls: returns `-ENODEV` for VPN interfaces | Direct `ioctl()` calls from native code (Flutter/Dart, JNI, C/C++) |
| `sock_ioctl` | `SIOCGIFCONF`: compacts VPN entries out of the returned interface array | Interface enumeration via `ioctl(SIOCGIFCONF)` |
| `rtnl_fill_ifinfo` | Returns `-EMSGSIZE` for VPN devices during RTM_NEWLINK netlink dumps, causing the kernel to skip them | `getifaddrs()` (which uses netlink internally), any netlink-based interface enumeration |
| `rtnl_fill_ifinfo` | Trims VPN entries from RTM_NEWLINK netlink dumps via `skb_trim` and returns 0 | `getifaddrs()` (which uses netlink internally), any netlink-based interface enumeration |
| `inet6_fill_ifaddr` | Trims VPN entries from RTM_GETADDR IPv6 responses via `skb_trim` | IPv6 address enumeration over netlink |
| `inet_fill_ifaddr` | Trims VPN entries from RTM_GETADDR IPv4 responses via `skb_trim` | IPv4 address enumeration over netlink |
| `fib_route_seq_show` | Forward-scans for VPN lines and compacts them out with `memmove` | `/proc/net/route` reads |
@ -112,17 +112,11 @@ The natural choice would be to hook `dev_ifconf` directly, but on GKI 5.10 (Clan
The entry handler stashes the userspace `argp`; the return handler reads back the buffer, compacts out VPN entries, and updates `ifc_len` via `put_user`. Cost is one `cmd == SIOCGIFCONF` compare per socket ioctl for non-target paths.
### rtnl_fill_ifinfo: -EMSGSIZE trick
### rtnl_fill_ifinfo / inet_fill_ifaddr / inet6_fill_ifaddr: skb_trim
To skip a VPN interface during a netlink RTM_GETLINK dump without corrupting the message stream, the return handler sets the return value to `-EMSGSIZE`. The dump iterator interprets this as "skb too small for this entry" and moves to the next device without adding the current one -- effectively skipping it.
All three netlink fill functions are skipped the same way: the entry handler saves `skb->len` before the fill writes anything; the return handler calls `skb_trim(skb, saved_len)` to undo whatever was written, then returns 0 (success). The dump iterator sees a successful entry of zero new bytes and advances to the next interface/address.
This works because the RTM_GETLINK dump iterator (`rtnl_dump_ifinfo`) processes one device at a time. When it gets `-EMSGSIZE`, it stops the current batch and returns what it has so far. On the next `recv()`, the iterator resumes from the skipped device -- but since `rtnl_fill_ifinfo` returns `-EMSGSIZE` again, it advances to the next device. The VPN entry is never seen by userspace.
### inet_fill_ifaddr / inet6_fill_ifaddr: why NOT -EMSGSIZE
The RTM_GETADDR dump uses a different iterator that behaves differently on `-EMSGSIZE`. When the first address entry in a fresh (empty) skb returns `-EMSGSIZE`, the iterator thinks the skb is too small for even one entry and retries indefinitely -- causing an infinite loop that hangs `getifaddrs()` and can freeze system services at boot.
Instead, we save `skb->len` before the fill function runs. In the return handler, we call `skb_trim(skb, saved_len)` to undo whatever the fill function wrote, then return 0 (success). The dump iterator sees a successful return but no new data was added, so it moves to the next address. This cleanly skips VPN addresses without triggering the retry logic.
We do **not** return `-EMSGSIZE` to skip a VPN entry. On Android 14 / 6.1 GKI kernels, the dump iterator interprets `-EMSGSIZE` on an empty skb as "buffer too small for even one entry" and retries the same entry forever — observed in production as a hang of `getifaddrs()` (issue #38). The `skb_trim`-and-return-0 path avoids the retry loop on every netlink dump function uniformly.
### fib_route_seq_show: seq_file buffer compaction

View file

@ -8,14 +8,18 @@
* works on stock Android GKI kernels with CONFIG_KPROBES=y.
*
* Hooks:
* - dev_ioctl: filters SIOCGIFFLAGS / SIOCGIFNAME
* - dev_ifconf: filters SIOCGIFCONF interface enumeration
* - dev_ioctl: filters SIOCGIFFLAGS / SIOCGIFNAME / SIOCGIFMTU / etc.
* - sock_ioctl: filters SIOCGIFCONF interface enumeration
* - rtnl_fill_ifinfo: filters RTM_NEWLINK netlink dumps (getifaddrs)
* - inet6_fill_ifaddr: filters RTM_GETADDR IPv6 responses (getifaddrs)
* - inet_fill_ifaddr: filters RTM_GETADDR IPv4 responses (getifaddrs)
* - fib_route_seq_show: filters /proc/net/route entries
*
* Target UIDs are written to /proc/vpnhide_targets from userspace.
*
* Architecture: arm64 only. The handlers read syscall arguments via
* `regs->regs[N]` (AAPCS64 calling convention). On other architectures
* those slots have a different meaning, so the build is gated below.
*/
#include <linux/module.h>
@ -39,18 +43,42 @@
#include "generated/iface_lists.h"
#ifndef CONFIG_ARM64
#error "vpnhide_kmod currently supports only arm64 (handlers read regs->regs[N] directly)"
#endif
#define MODNAME "vpnhide"
#define MAX_TARGET_UIDS 64
/*
* Pre-allocated kretprobe instance pool size, applied to every probe.
* Default kernel `register_kretprobe` falls back to NR_CPUS*2 ( 18 on
* a 9-core Pixel 8 Pro), which is too low for hot ioctl/netlink paths
* under multi-app concurrency exhausted pool causes silent
* `nmissed++` and the return handler skipped, which surfaces as a VPN
* iface leaking through a single probe call.
*
* 64 covers a comfortable working set (apps × threads doing
* getifaddrs/SIOCGIFCONF/route reads at once) without burning
* meaningful memory: 6 probes × 64 instances × ~80 B 30 KB total.
*/
#define VPNHIDE_KRETPROBE_MAXACTIVE 64
/* ------------------------------------------------------------------ */
/* Debug logging — toggled via /proc/vpnhide_debug */
/* ------------------------------------------------------------------ */
static bool debug_enabled;
/*
* `debug_enabled` is a single bool, written from /proc/vpnhide_debug
* and read from every probe handler. Use READ_ONCE/WRITE_ONCE so the
* compiler doesn't tear the access or hoist it across the probe-hot
* path kosher kernel style for unsynchronised flags.
*/
#define vpnhide_dbg(fmt, ...) \
do { \
if (debug_enabled) \
if (READ_ONCE(debug_enabled)) \
pr_info(MODNAME ": " fmt, ##__VA_ARGS__); \
} while (0)
@ -174,14 +202,15 @@ static ssize_t debug_write(struct file *file, const char __user *ubuf,
if (get_user(c, ubuf))
return -EFAULT;
debug_enabled = (c == '1' || c == 'Y' || c == 'y');
pr_info(MODNAME ": debug %s\n", debug_enabled ? "enabled" : "disabled");
WRITE_ONCE(debug_enabled, c == '1' || c == 'Y' || c == 'y');
pr_info(MODNAME ": debug %s\n",
READ_ONCE(debug_enabled) ? "enabled" : "disabled");
return count;
}
static int debug_show(struct seq_file *m, void *v)
{
seq_printf(m, "%d\n", debug_enabled ? 1 : 0);
seq_printf(m, "%d\n", READ_ONCE(debug_enabled) ? 1 : 0);
return 0;
}
@ -219,6 +248,7 @@ static const struct proc_ops debug_proc_ops = {
struct dev_ioctl_data {
unsigned int cmd;
struct ifreq *kifr; /* kernel pointer, saved from x2 */
bool active; /* true = caller is target UID, run ret handler */
};
static int dev_ioctl_entry(struct kretprobe_instance *ri, struct pt_regs *regs)
@ -227,16 +257,11 @@ static int dev_ioctl_entry(struct kretprobe_instance *ri, struct pt_regs *regs)
data->cmd = (unsigned int)regs->regs[1];
data->kifr = (struct ifreq *)regs->regs[2];
data->active = is_target_uid();
if (!is_target_uid()) {
vpnhide_dbg("dev_ioctl_entry: uid=%u target=0 cmd=0x%x\n",
from_kuid(&init_user_ns, current_uid()), data->cmd);
data->cmd = 0;
} else {
vpnhide_dbg("dev_ioctl_entry: uid=%u target=1 cmd=0x%x\n",
from_kuid(&init_user_ns, current_uid()), data->cmd);
}
vpnhide_dbg("dev_ioctl_entry: uid=%u target=%d cmd=0x%x\n",
from_kuid(&init_user_ns, current_uid()), data->active,
data->cmd);
return 0;
}
@ -245,7 +270,7 @@ static int dev_ioctl_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
struct dev_ioctl_data *data = (void *)ri->data;
char name[IFNAMSIZ];
if (data->cmd == 0 || regs_return_value(regs) != 0)
if (!data->active || regs_return_value(regs) != 0)
return 0;
/*
@ -272,7 +297,7 @@ static struct kretprobe dev_ioctl_krp = {
.handler = dev_ioctl_ret,
.entry_handler = dev_ioctl_entry,
.data_size = sizeof(struct dev_ioctl_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "dev_ioctl",
};
@ -333,28 +358,51 @@ static int sock_ioctl_entry(struct kretprobe_instance *ri, struct pt_regs *regs)
return 0;
}
/* Compact VPN entries out of the userspace ifreq array and update
* ifc_len. */
static void filter_ifconf_buf(struct ifreq __user *usr_ifr, int n, int *out_len)
/*
* Why user-memory access is OK here:
*
* `sock_ioctl_ret` runs as a kretprobe return handler same process
* context that issued the SIOCGIFCONF syscall, kernel mode, original
* task is still mapped and addressable. copy_from_user/copy_to_user
* are safe in this context (it's the same userspace the original
* sock_ioctl handler accessed). PAN/uaccess primitives are honoured.
*
* Faults are handled cleanly: if the user buffer was unmapped or
* raced, the copy fails with -EFAULT and we report COPY_FAULT to the
* caller, who skips the ifc_len rewrite to avoid a half-filtered
* array (`buffer compacted, length unchanged`) escaping to userspace.
*/
enum filter_ifconf_result {
FILTER_IFCONF_NO_CHANGE,
FILTER_IFCONF_CHANGED,
FILTER_IFCONF_COPY_FAULT,
};
/* Compact VPN entries out of the userspace ifreq array. The caller is
* responsible for updating `ifc_len` only on FILTER_IFCONF_CHANGED. */
static enum filter_ifconf_result filter_ifconf_buf(struct ifreq __user *usr_ifr,
int n, int *out_len)
{
struct ifreq tmp;
int i, dst = 0;
for (i = 0; i < n; i++) {
if (copy_from_user(&tmp, &usr_ifr[i], sizeof(tmp)))
return;
return FILTER_IFCONF_COPY_FAULT;
tmp.ifr_name[IFNAMSIZ - 1] = '\0';
if (is_vpn_ifname(tmp.ifr_name))
continue;
if (dst != i) {
if (copy_to_user(&usr_ifr[dst], &tmp, sizeof(tmp)))
return;
return FILTER_IFCONF_COPY_FAULT;
}
dst++;
}
if (dst < n)
*out_len = dst * (int)sizeof(struct ifreq);
if (dst == n)
return FILTER_IFCONF_NO_CHANGE;
*out_len = dst * (int)sizeof(struct ifreq);
return FILTER_IFCONF_CHANGED;
}
static int sock_ioctl_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
@ -363,6 +411,7 @@ static int sock_ioctl_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
struct ifconf __user *uifc;
struct ifconf ifc;
int orig_len;
enum filter_ifconf_result res;
if (!data->target)
return 0;
@ -380,10 +429,32 @@ static int sock_ioctl_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
return 0;
orig_len = ifc.ifc_len;
filter_ifconf_buf(ifc.ifc_req, ifc.ifc_len / (int)sizeof(struct ifreq),
&ifc.ifc_len);
if (ifc.ifc_len != orig_len) {
put_user(ifc.ifc_len, &uifc->ifc_len);
res = filter_ifconf_buf(ifc.ifc_req,
ifc.ifc_len / (int)sizeof(struct ifreq),
&ifc.ifc_len);
if (res == FILTER_IFCONF_COPY_FAULT) {
/*
* Partial copy failure buffer may already be
* half-rewritten. Don't update ifc_len: a shorter
* length on a partially-compacted buffer hides VPN
* entries past the truncation but lets earlier ones
* through, which is worse than just leaving
* everything visible. Userspace sees the original
* length and the (mostly-original) buffer.
*/
vpnhide_dbg(
"ifconf: copy fault during filter; ifc_len untouched\n");
return 0;
}
if (res == FILTER_IFCONF_CHANGED) {
if (put_user(ifc.ifc_len, &uifc->ifc_len)) {
vpnhide_dbg(
"ifconf: put_user(ifc_len=%d) failed; userspace will see compacted buffer with stale length\n",
ifc.ifc_len);
return 0;
}
vpnhide_dbg("ifconf filtered %d -> %d bytes\n", orig_len,
ifc.ifc_len);
}
@ -395,7 +466,7 @@ static struct kretprobe sock_ioctl_krp = {
.handler = sock_ioctl_ret,
.entry_handler = sock_ioctl_entry,
.data_size = sizeof(struct sock_ioctl_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "sock_ioctl",
};
@ -477,7 +548,7 @@ static struct kretprobe rtnl_fill_krp = {
.handler = rtnl_fill_ret,
.entry_handler = rtnl_fill_entry,
.data_size = sizeof(struct rtnl_fill_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "rtnl_fill_ifinfo",
};
@ -554,7 +625,7 @@ static struct kretprobe inet6_fill_krp = {
.handler = inet6_fill_ret,
.entry_handler = inet6_fill_entry,
.data_size = sizeof(struct inet6_fill_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "inet6_fill_ifaddr",
};
@ -618,7 +689,7 @@ static struct kretprobe inet_fill_krp = {
.handler = inet_fill_ret,
.entry_handler = inet_fill_entry,
.data_size = sizeof(struct inet_fill_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "inet_fill_ifaddr",
};
@ -729,7 +800,7 @@ static struct kretprobe fib_route_krp = {
.handler = fib_route_ret,
.entry_handler = fib_route_entry,
.data_size = sizeof(struct fib_route_data),
.maxactive = 20,
.maxactive = VPNHIDE_KRETPROBE_MAXACTIVE,
.kp.symbol_name = "fib_route_seq_show",
};