Fix CI tests on big-endian builds.
We have a long-standing issue on big-endian archs: it might be related
to utash or about how we use utash in ndpiReader
* Fix JA4 ALPN fingerprint to use first and last characters
According to the JA4 specification (line 2139), the ALPN field should
contain the first and last characters of the first ALPN extension value.
Currently, nDPI uses the first and second characters (alpn[0] and alpn[1]),
which produces incorrect fingerprints that don't match other JA4
implementations like Wireshark.
For example, with ALPN 'http/1.1':
- Current (incorrect): 'ht' (first + second char)
- Fixed (correct): 'h1' (first + last char)
This change ensures nDPI's JA4 implementation conforms to the official
specification and maintains interoperability with other JA4 tools.
Fixes: Incorrect JA4 ALPN fingerprint generation
* Fix JA4 ALPN implementation to correctly parse first ALPN protocol
The previous fix attempted to use strlen(ja->client.alpn)-1 but this was
insufficient because nDPI modifies the ALPN string by:
1. Adding null terminators that truncate the last character
2. Converting semicolons to dashes, affecting multi-protocol ALPNs
This complete fix:
- Adds alpn_original_last field to store the true last character
- Captures the last character of the FIRST ALPN protocol only (before ;/,)
- Preserves the original character before nDPI's string modifications
Now correctly implements JA4 spec: first + last characters of first ALPN protocol
Examples:
- ALPN 'h2;http/1.1' -> 'h2' (not 'h.' or 'h1')
- ALPN 'http/1.1' -> 'h1' (not 'ht' or 'h.')
Fixes: #2914
* Fix JA4 SNI detection to properly handle missing SNI extensions
Previously, nDPI incorrectly set JA4 SNI flag to 'd' (domain present) for
flows without any SNI extension. This was because the logic only checked
for NDPI_NUMERIC_IP_HOST risk (set when SNI contains IP) but didn't
distinguish between missing SNI and domain SNI.
Now properly detects:
- No SNI extension → 'i' flag
- SNI with IP address → 'i' flag
- SNI with domain → 'd' flag
This matches the JA4 specification.
Default ports trees are initialized during
`ndpi_finalize_initialization()`
Make `ndpi_init_detection_module()` less likely to fail, because there
are less memory allocations.
The hard limit of total number of protocols (internal and custom) is ~65535,
because protocol ids are `u_int16_t`...
API changes:
1. From `NDPI_MAX_SUPPORTED_PROTOCOLS + NDPI_MAX_NUM_CUSTOM_PROTOCOLS` to
`ndpi_get_num_protocols()` (after having called
`ndpi_finalize_initialization()`);
2. From `proto_id >= NDPI_MAX_SUPPORTED_PROTOCOLS` to
`ndpi_is_custom_protocol(proto_id)` (after having called
`ndpi_finalize_initialization()`);
Close#2136Close#2545
Partial revert of 88bfe2cf0: in the trees we save the index and no more
a pointer to `ndpi_struct->proto_defaults[]`.
Remove same functions from public API
See #2136
The main goal is not to have the bitmask depending on the total number
of protocols anymore: `NDPI_INTERNAL_PROTOCOL_BITMASK` depends only on
internal protocols, i.e. on `NDPI_MAX_INTERNAL_PROTOCOLS`, i.e.
custom-defined protocols are not counted.
See #2136
Keep the old data structure `NDPI_PROTOCOL_BITMASK` with the old
semantic.
Since we need to change the API (and all the application code...)
anyway, simplify the API: by default all the protocols are enabled.
If you need otherwise, please use `ndpi_init_detection_module_ext()`
instead of `ndpi_init_detection_module()` (you can find an example in
the `ndpiReader` code).
To update the application code you likely only need to remove these 3
lines from your code:
```
- NDPI_PROTOCOL_BITMASK all;
- NDPI_BITMASK_SET_ALL(all);
- ndpi_set_protocol_detection_bitmask2(ndpi_str, &all);
```
Removed an unused field and struct definition.
We use `registr_dissector()` instead of
`ndpi_set_bitmask_protocol_detection()`.
Every file in `src/lib/protocols/*.c` is a dissector.
Every dissector can handle multiple protocols.
The real goal is this small change:
```
struct call_function_struct {
- NDPI_PROTOCOL_BITMASK detection_bitmask;
```
i.e. getting rid of another protocol bitmask: this is mandatory to try
to fix#2136 (see also e845e8205b68752c997d05224d8b2fd45acde714)
As a nice side effect, we remove a bitmask comparison in the hot function
`check_ndpi_detection_func()`
TODO: change logging configuration from per-protocol to per-dissector
- default (0) is the native nDPI format
- MuonOF (1) has been added
The format can be changed using metadata.tcp_fingerprint_format
Added ability to identify mass scanners using TCP fingerprint