88 commits

Author SHA1 Message Date
Ivan Nardi
c216c09e2c fuzz: extend fuzzing coverage
Remove some unused code
2025-06-24 15:04:35 +02:00
Ivan Nardi
978ca1ba1a
New API to enable/disable protocols. Removed NDPI_LAST_IMPLEMENTED_PROTOCOL (#2894)
Change the API to enable/disable protocols: you can now do that via the
standard `ndpi_set_config()` function, like any other configuration
parameter. By default, all protocols are enabled.

Split the (local) context initialization into two phases:
* `ndpi_init_detection_module()`: generic part. It does not depend on the
configuration and on the protocols being enabled or not. It also
calculates the real number of internal protocols
* `ndpi_finalize_initialization()`: apply the configuration. All the
initialization steps that depend on protocols being enabled or not
must be put here

This is the last step towards having the number of protocols fully
calculated at runtime.

Remove a (now) useless fuzzer.

Important API changes:
* remove `NDPI_LAST_IMPLEMENTED_PROTOCOL` define
* remove `ndpi_get_num_internal_protocols()`. To get the number of
configured protocols (internal and custom) you must use
`ndpi_get_num_protocols()` after having called `ndpi_finalize_initialization()`
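
A minimal sketch of the two-phase pattern described above, under stated assumptions: every `toy_*` name is hypothetical and is not part of the nDPI API. It only illustrates why the total protocol count is meaningful after the finalize step and not before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy context: NOT the nDPI structure. */
struct toy_ctx {
  size_t num_internal; /* known after phase 1 */
  size_t num_custom;   /* grows while the context is being configured */
  bool  *enabled;      /* allocated in phase 2, once the total is known */
  bool   finalized;
};

/* Phase 1: generic setup, independent of any configuration. */
static struct toy_ctx *toy_init(size_t num_internal) {
  struct toy_ctx *c = calloc(1, sizeof(*c));
  if(c) c->num_internal = num_internal;
  return c;
}

/* Configuration step: only accepted between the two phases. */
static int toy_add_custom_protocol(struct toy_ctx *c) {
  assert(c != NULL);
  if(c->finalized) return -1;
  c->num_custom++;
  return 0;
}

/* Phase 2: apply the configuration; every protocol starts enabled. */
static int toy_finalize(struct toy_ctx *c) {
  size_t i, n;
  assert(c != NULL);
  if(c->finalized) return -1;
  n = c->num_internal + c->num_custom;
  c->enabled = calloc(n ? n : 1, sizeof(bool));
  if(!c->enabled) return -1;
  for(i = 0; i < n; i++) c->enabled[i] = true;
  c->finalized = true;
  return 0;
}

/* Returns -1 until the context has been finalized. */
static long toy_get_num_protocols(const struct toy_ctx *c) {
  assert(c != NULL);
  return c->finalized ? (long)(c->num_internal + c->num_custom) : -1;
}

static void toy_free(struct toy_ctx *c) {
  if(c) { free(c->enabled); free(c); }
}
```

In this toy model, asking for the count before `toy_finalize()` yields -1, and configuration changes after finalization are rejected.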
2025-06-23 11:24:18 +02:00
Ivan Nardi
2b3fdb4f8a
fuzz: try to improve coverage (#2883)
Revert of 2b14b46df3
2025-06-14 10:48:16 +02:00
Ivan Nardi
6da6991320
Rework sanity checks and remove some functions from API (#2882) 2025-06-12 16:07:56 +02:00
Ivan Nardi
bcfa3f5477 Rename ndpi_bitmask_dealloc into ndpi_bitmask_free 2025-06-09 09:30:30 +02:00
Ivan Nardi
cbd7136b34
Remove NDPI_PROTOCOL_BITMASK; add a new generic bitmask data structure (#2871)
The main difference is that the memory is allocated at runtime

Typical use case:
```
struct ndpi_bitmask b;

ndpi_bitmask_alloc(&b, ndpi_get_num_internal_protocols());

ndpi_bitmask_set(&b, $BIT);
ndpi_bitmask_is_set(&b, $BIT);
[...]

ndpi_bitmask_dealloc(&b);

```

See #2136
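
For illustration only, a bitmask whose storage is sized at runtime could look like the sketch below. The `toy_bitmask_*` names and the internal layout are assumptions made for this example; the actual `struct ndpi_bitmask` implementation may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_WORD_BITS (sizeof(uint32_t) * CHAR_BIT)

/* Hypothetical layout: an array of 32-bit words allocated on the heap. */
struct toy_bitmask {
  uint32_t *words;
  size_t    num_bits;
};

/* Allocate enough zeroed words to hold num_bits bits. */
static int toy_bitmask_alloc(struct toy_bitmask *b, size_t num_bits) {
  size_t num_words = (num_bits + TOY_WORD_BITS - 1) / TOY_WORD_BITS;
  assert(b != NULL);
  if(num_bits == 0) return -1;
  b->words = calloc(num_words, sizeof(uint32_t));
  if(!b->words) return -1;
  b->num_bits = num_bits;
  return 0;
}

/* Out-of-range bits are silently ignored. */
static void toy_bitmask_set(struct toy_bitmask *b, size_t bit) {
  assert(b != NULL);
  if(b->words && bit < b->num_bits)
    b->words[bit / TOY_WORD_BITS] |= (uint32_t)1 << (bit % TOY_WORD_BITS);
}

static bool toy_bitmask_is_set(const struct toy_bitmask *b, size_t bit) {
  assert(b != NULL);
  if(!b->words || bit >= b->num_bits) return false;
  return ((b->words[bit / TOY_WORD_BITS] >> (bit % TOY_WORD_BITS)) & 1u) != 0;
}

static void toy_bitmask_free(struct toy_bitmask *b) {
  assert(b != NULL);
  free(b->words);
  b->words = NULL;
  b->num_bits = 0;
}
```

The size is an argument of the allocation call rather than a compile-time constant, which is the property the commit message highlights.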
2025-06-09 09:00:17 +02:00
Ivan Nardi
5e54531282
Remove ndpi_set_proto_defaults() from the API (#2863)
Add an explicit field to indicate if the protocol is custom or internal
2025-06-03 17:43:28 +02:00
Ivan Nardi
ed21057710
First step into a dynamic number of protocols (#2857)
We want to get rid of the defines `NDPI_MAX_SUPPORTED_PROTOCOLS` and
`NDPI_MAX_NUM_CUSTOM_PROTOCOLS`.

You can use:
```
ndpi_get_num_protocols()
```

See #2136

Removed some unused functions from public API
2025-06-03 10:22:15 +02:00
Ivan Nardi
70a72f1638
New API to enable/disable protocols; remove ndpi_set_protocol_detection_bitmask2() (#2853)
The main goal is not to have the bitmask depending on the total number
of protocols anymore: `NDPI_INTERNAL_PROTOCOL_BITMASK` depends only on
the internal protocols (i.e. on `NDPI_MAX_INTERNAL_PROTOCOLS`):
custom-defined protocols are not counted.
See #2136

Keep the old data structure `NDPI_PROTOCOL_BITMASK` with the old
semantics.

Since we need to change the API (and all the application code...)
anyway, simplify the API: by default all the protocols are enabled.
If you need otherwise, please use `ndpi_init_detection_module_ext()`
instead of `ndpi_init_detection_module()` (you can find an example in
the `ndpiReader` code).

To update the application code you likely only need to remove these 3
lines from your code:
```
- NDPI_PROTOCOL_BITMASK all;
- NDPI_BITMASK_SET_ALL(all);
- ndpi_set_protocol_detection_bitmask2(ndpi_str, &all);
```

Removed an unused field and struct definition.
2025-06-03 09:45:46 +02:00
Ivan Nardi
8df79a7354
Follow-up of c1d372860 (TCP fingerprint format) (#2850) 2025-05-26 12:32:47 +02:00
Ivan Nardi
3e2d69b92a Follow-up of latest Signal call change (see: 4d41588a7) 2025-04-05 14:22:05 +02:00
Ivan Nardi
f4691c518a
fuzz: extend coverage (#2786) 2025-03-31 17:54:14 +02:00
Ivan Nardi
29eb89a88f
Improved configuration to enable/disable export of flow risk info (#2780)
Follow-up of f568313363: now the
configuration is for flow-risk, not global
2025-03-25 21:35:01 +01:00
Leonardo Teixeira Alves
c49d126d36
Add Autonomous System Organization to geoip (#2763)
Co-authored-by: Leonardo Teixeira Alves <leonardo.alves@zerum.com>
2025-03-06 14:47:17 +01:00
Ivan Nardi
f568313363
Add configuration parameter to enable/disable export of flow risk info (#2761)
For the most common protocols, avoid creating the string message if we
are not going to use it
2025-03-05 16:14:03 +01:00
Ivan Nardi
8ee59bb9b9
fuzz: extend fuzzing coverage (#2750) 2025-02-28 12:38:15 +01:00
Leonardo Teixeira Alves
3d0bfc7bfe
Add city as a geoip possibility (#2746) 2025-02-24 19:41:02 +01:00
Ivan Nardi
2d3f08362e
RTP: payload type info should be set only for real RTP flows (#2742) 2025-02-22 13:35:40 +01:00
Ivan Nardi
5f8545d97a
SSDP: add configuration for disabling metadata extraction (#2736) 2025-02-17 15:16:37 +01:00
Luca
64d536752e Compilation fix 2025-02-07 10:58:24 +01:00
Ivan Nardi
dd4807f8ee
bittorrent: add configuration for "hash" metadata (#2706)
Fix confidence value for same TCP flows
2025-01-31 17:42:47 +01:00
Ivan Nardi
cf8f761b93
HTTP: add configuration for some metadata (#2704)
Extend file configuration for just subclassification.
2025-01-31 16:26:53 +01:00
Ivan Nardi
ecf0f8ace3
Create a specific configuration for classification only (#2689)
In some scenarios, you might not be interested in flow metadata or
flow-risks at all, but you might want only flow (sub-)classification.
Examples: you only want to forward the traffic according to the
classification or you are only interested in some protocol statistics.

Create a new configuration file (for `ndpiReader`, but you can trivially
adapt it for the library itself) allowing exactly that. You can use it
via: `ndpiReader --conf=example/only_classification.conf ...`

Note that this way, the nDPI overhead is lower because it might need
fewer packets per flow:
* TLS: nDPI processes only the CH (in most cases) and not also the SH
  and certificates
* DNS: only the request is processed (instead of both request and
  response)

We might extend the same "shortcut-logic" (stop processing the flow
immediately when there is a final sub-classification) to other
protocols.

Add the configuration options to enable/disable the extraction of some
TLS metadata.
2025-01-31 15:10:30 +01:00
Ivan Nardi
d4fb7b0aa1
fuzz: extend fuzzing coverage (#2696) 2025-01-23 15:23:01 +01:00
Ivan Nardi
af011e338e
TLS: remove JA3C (#2679)
Last step of removing JA3C fingerprint

Remove some duplicate tests: testing with ja4c/ja3s disabled is already
performed by `disable_metadata_and_flowrisks` configuration.

Close: #2551
2025-01-14 15:02:20 +01:00
Ivan Nardi
63a3547f99
Add (kind of) support for loading a list of JA4C malicious fingerprints (#2678)
It might be useful to be able to match traffic against a list of
suspicious JA4C fingerprints

Use the same code/logic/infrastructure used for JA3C (note that we are
going to remove JA3C...)

See: #2551
2025-01-14 12:05:03 +01:00
Ivan Nardi
bf830b4236
Add the ability to enable/disable each specific flow risk (#2653) 2025-01-06 16:53:29 +01:00
Ivan Nardi
f20cec4985
fuzz: improve fuzzing coverage (#2642)
Update pl7m code (Fix swap-direction mutation)
2024-12-11 16:41:35 +01:00
Ivan Nardi
cff8bd1bb2
Update flow->flow_multimedia_types to a bitmask (#2625)
In the same flow, we can have multiple multimedia types
2024-11-25 10:12:48 +01:00
Ivan Nardi
43f7dc9ba0
fuzz: extend fuzzing coverage (#2626) 2024-11-20 13:36:41 +01:00
Ivan Nardi
1bda2bf414 SIP: extract some basic metadata 2024-11-12 13:34:25 +01:00
Ivan Nardi
819291b7e4
Add configuration of TCP fingerprint computation (#2598)
Extend configuration of raw format of JA4C fingerprint
2024-10-18 16:58:06 +02:00
Ivan Nardi
521d0ca7a0
Add monitoring capability (#2588)
Allow nDPI to process entire flows, not only the first N packets.
Useful when the application is interested in some metadata spanning the
entire life of the session.

As an initial step, only STUN flows can be put in monitoring.

See `doc/monitoring.md` for further details.

This feature is disabled by default.

Close #2583
2024-10-14 18:05:35 +02:00
Liam Wilson
cdda369e92
Add enable/disable guessing using client IP/port (#2569)
Add configurable options for whether to include the client port or
client IP in the flow's protocol guesses. By default, both the client
port and IP are used if the protocol cannot be guessed from the server
IP/port.

This is intended for when flow direction detection is enabled, so we
know that sport = client port, dport = server port.
2024-09-27 09:23:22 +02:00
Ivan Nardi
ddd08f913c
Add some heuristics to detect encrypted/obfuscated/proxied TLS flows (#2553)
Based on the paper: "Fingerprinting Obfuscated Proxy Traffic with
Encapsulated TLS Handshakes".
See: https://www.usenix.org/conference/usenixsecurity24/presentation/xue-fingerprinting

Basic idea:
* the packets/bytes distribution of a TLS handshake is quite unique
* this fingerprint is still detectable if the handshake is
encrypted/proxied/obfuscated

All heuristics are disabled by default.
2024-09-24 14:20:31 +02:00
Liam Wilson
80971e4a17
Allow IP guess before port in ndpi_detection_giveup (#2562)
Add dpi.guess_ip_before_port which when enabled uses classification
by-ip before classification by-port.
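
The ordering switch can be pictured with the sketch below. It assumes the two lookups have already been done and that the second result is used as a fallback when the first finds nothing; `toy_guess()` and `TOY_PROTO_UNKNOWN` are hypothetical names, and the real logic in `ndpi_detection_giveup()` may be more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PROTO_UNKNOWN 0 /* hypothetical "no match" value */

/*
 * proto_by_port / proto_by_ip: results of the two independent lookups,
 * TOY_PROTO_UNKNOWN when a lookup found nothing.
 * guess_ip_before_port: mirrors the idea of dpi.guess_ip_before_port.
 */
static int toy_guess(int proto_by_port, int proto_by_ip,
                     bool guess_ip_before_port) {
  int first  = guess_ip_before_port ? proto_by_ip : proto_by_port;
  int second = guess_ip_before_port ? proto_by_port : proto_by_ip;

  assert(proto_by_port >= 0 && proto_by_ip >= 0);

  /* Prefer the first lookup; fall back to the other one (assumption). */
  return first != TOY_PROTO_UNKNOWN ? first : second;
}
```

The option only matters when both lookups return a match and the two results disagree.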
2024-09-20 10:25:41 +02:00
Ivan Nardi
0ddbda1f82
Add a heuristic to detect encrypted/obfuscated OpenVPN flows (#2547)
Based on the paper: "OpenVPN is Open to VPN Fingerprinting"
See: https://www.usenix.org/conference/usenixsecurity22/presentation/xue-diwen

Basic idea:
* the distribution of the first byte of the messages (i.e. the distribution
of the op-codes) is quite unique
* this fingerprint might still be detectable even if the OpenVPN packets are
somehow fully encrypted/obfuscated

The heuristic is disabled by default.
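
To make the "distribution of the first byte" idea concrete, here is a small sketch that builds an op-code histogram. It assumes UDP framing, where the first payload byte carries the op-code in its upper 5 bits and the key id in its lower 3 bits. The `toy_*` names are hypothetical, and the heuristic actually implemented in nDPI is likely more elaborate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_OPCODES 32 /* the op-code field is 5 bits wide */

/* Hypothetical per-flow statistics. */
struct toy_opcode_stats {
  unsigned counts[TOY_NUM_OPCODES];
  unsigned packets;
};

static void toy_stats_init(struct toy_opcode_stats *s) {
  assert(s != NULL);
  memset(s, 0, sizeof(*s));
}

/* Record the op-code found in the first payload byte of one packet. */
static void toy_stats_add(struct toy_opcode_stats *s,
                          const uint8_t *payload, size_t len) {
  assert(s != NULL);
  if(payload == NULL || len == 0) return;
  s->counts[payload[0] >> 3]++;
  s->packets++;
}

/* Number of different op-codes observed so far. */
static unsigned toy_stats_distinct(const struct toy_opcode_stats *s) {
  unsigned i, n = 0;
  assert(s != NULL);
  for(i = 0; i < TOY_NUM_OPCODES; i++)
    if(s->counts[i] != 0) n++;
  return n;
}
```

A classifier could then compare the histogram (or just the number of distinct values) against what a genuine OpenVPN session is expected to produce; the thresholds for that comparison are outside the scope of this sketch.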
2024-09-16 18:38:26 +02:00
Nardi Ivan
85ebda434d OpenVPN, Wireguard: improve sub-classification
Allow sub-classification of OpenVPN/Wireguard flows using their server IP.
That is useful to detect the specific VPN application used.
At the moment, the supported protocols are: Mullvad, NordVPN, ProtonVPN.

This feature is configurable.
2024-09-05 16:36:50 +02:00
Ivan Nardi
767f403e0d
fuzz: improve fuzzing coverage (#2535)
Update pl7m code (fix a Use-of-uninitialized-value error and add GTP
support)
2024-09-03 12:40:45 +02:00
Ivan Nardi
338eedd05b
HTTP, QUIC, TLS: allow to disable sub-classification (#2533) 2024-09-03 12:35:45 +02:00
Ivan Nardi
34e1ac0bbb
fuzz: fix compilation (#2532) 2024-08-26 21:01:18 +02:00
Luca Deri
b627ec91d1 Compilation fixes 2024-08-24 17:18:33 +02:00
Ivan Nardi
65e31b0ea3
FPC: small improvements (#2512)
Add printing of fpc_dns statistics and add a general configuration option.
Rework the code to be more generic and ready to handle other logics.
2024-07-22 17:42:23 +02:00
Ivan Nardi
843e487270
Add infrastructure for explicit support of First Packet Classification (#2488)
Let's start with some basic helpers and with FPC based on flow addresses.

See: #2322
2024-07-03 18:02:07 +02:00
Ivan Nardi
26cc1f131f
fuzz: improve fuzzing coverage (#2474)
Remove some code never triggered

AFP: the removed check is included in the following one
MQTT: fix flags extraction
2024-06-17 13:45:47 +02:00
Nardi Ivan
526cf6f291 Zoom: remove "stun_zoom" LRU cache
Since 070a0908b we are able to detect P2P calls directly from the packet
content, without any correlation among flows
2024-06-17 10:19:55 +02:00
Ivan Nardi
b90d39c4ac
RTP/STUN: look for STUN packets after RTP/RTCP classification (#2465)
After a flow has been classified as RTP or RTCP, nDPI might analyse more
packets to look for STUN/DTLS packets, i.e. to try to tell if this flow
is a "pure" RTP/RTCP flow or if the RTP/RTCP packets are multiplexed with
STUN/DTLS.
Useful for proper (sub)classification when the beginning of the flow
is not captured or when packets are missing from the captured traffic.

Disabled by default
2024-06-07 13:12:04 +02:00
Ivan Nardi
070a0908b3
Zoom: faster detection of P2P flows (#2467) 2024-06-07 09:50:41 +02:00
Ivan Nardi
95fe21015d
Remove "zoom" cache (#2420)
This cache was added in b6b4967aa, when there was no real Zoom support.
With 63f349319, proper identification of multimedia streams has been
added, making this cache quite useless: any improvement to Zoom
classification should be done in the Zoom dissector itself.

Tested for some months on a few 10 Gbit/s links of residential traffic: the
cache pretty much never returned a valid hit.
2024-05-06 12:51:45 +02:00
Luca Deri
ad117bfaab
Domain Classification Improvements (#2396)
* Added
`size_t ndpi_compress_str(const char * in, size_t len, char * out, size_t bufsize);`
`size_t ndpi_decompress_str(const char * in, size_t len, char * out, size_t bufsize);`

used to compress short strings such as domain names. This code is based on
https://github.com/Ed-von-Schleck/shoco

* Major code rewrite for ndpi_hash and ndpi_domain_classify

* Improvements to make sure custom categories are loaded and enabled

* Fixed string encoding

* Extended SalesForce/Cloudflare domains list
2024-04-18 23:21:40 +02:00