Commit graph

99 commits

Author SHA1 Message Date
Ivan Nardi
338eedd05b
HTTP, QUIC, TLS: allow to disable sub-classification (#2533) 2024-09-03 12:35:45 +02:00
Ivan Nardi
34e1ac0bbb
fuzz: fix compilation (#2532) 2024-08-26 21:01:18 +02:00
Luca Deri
b627ec91d1 Compilation fixes 2024-08-24 17:18:33 +02:00
Ivan Nardi
65e31b0ea3
FPC: small improvements (#2512)
Add printing of fpc_dns statistics and add a general cconfiguration option.
Rework the code to be more generic and ready to handle other logics.
2024-07-22 17:42:23 +02:00
Ivan Nardi
843e487270
Add infrastructure for explicit support of Fist Packet Classification (#2488)
Let's start with some basic helpers and with FPC based on flow addresses.

See: #2322
2024-07-03 18:02:07 +02:00
Ivan Nardi
26cc1f131f
fuzz: improve fuzzing coverage (#2474)
Remove some code never triggered

AFP: the removed check is included in the following one
MQTT: fix flags extraction
2024-06-17 13:45:47 +02:00
Nardi Ivan
526cf6f291 Zoom: remove "stun_zoom" LRU cache
Since 070a0908b we are able to detect P2P calls directly from the packet
content, without any correlation among flows
2024-06-17 10:19:55 +02:00
Ivan Nardi
b90d39c4ac
RTP/STUN: look for STUN packets after RTP/RTCP classification (#2465)
After a flow has been classified as RTP or RTCP, nDPI might analyse more
packets to look for STUN/DTLS packets, i.e. to try to tell if this flow
is a "pure" RTP/RTCP flow or if the RTP/RTCP packets are multiplexed with
STUN/DTLS.
Useful for proper (sub)classification when the beginning of the flows
are not captured or if there are lost packets in the the captured traffic.

Disabled by default
2024-06-07 13:12:04 +02:00
Ivan Nardi
070a0908b3
Zoom: faster detection of P2P flows (#2467) 2024-06-07 09:50:41 +02:00
Ivan Nardi
95fe21015d
Remove "zoom" cache (#2420)
This cache was added in b6b4967aa, when there was no real Zoom support.
With 63f349319, a proper identification of multimedia stream has been
added, making this cache quite useless: any improvements on Zoom
classification should be properly done in Zoom dissector.

Tested for some months with a few 10Gbits links of residential traffic: the
cache pretty much never returned a valid hit.
2024-05-06 12:51:45 +02:00
Luca Deri
ad117bfaab
Domain Classification Improvements (#2396)
* Added
size_t ndpi_compress_str(const char * in, size_t len, char * out, size_t bufsize);
size_t ndpi_decompress_str(const char * in, size_t len, char * out, size_t bufsize);

used to compress short strings such as domain names. This code is based on
https://github.com/Ed-von-Schleck/shoco

* Major code rewrite for ndpi_hash and ndpi_domain_classify

* Improvements to make sure custom categories are loaded and enabled

* Fixed string encoding

* Extended SalesForce/Cloudflare domains list
2024-04-18 23:21:40 +02:00
Ivan Nardi
0535e54484
STUN: fix boundary checks on attribute list parsing (#2387)
Restore all unit tests.
Add some configuration knobs.
Fix the endianess.
2024-04-12 22:55:51 +02:00
Ivan Nardi
1b3ef7d7b2
STUN: improve extraction of Mapped-Address metadata (#2370)
Enable parsing of Mapped-Address attribute for all STUN flows: that
means that STUN classification might require more packets.

Add a configuration knob to enable/disable this feature.

Note that we can have (any) STUN metadata also for flows *not*
classified as STUN (because of DTLS).

Add support for ipv6.

Restore the correct extra dissection logic for Telegram flows.
2024-04-08 10:24:51 +02:00
Toni
41eef9246c
Disable -Wno-unused-parameter -Wno-unused-function. (#2358)
* unused parameters and functions pollute the code and decrease readability

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-04-03 14:10:21 +02:00
Ivan Nardi
6152d595e8
STUN: add a parameter to configure how long the extra dissection lasts (#2336)
Tradeoff: performance (i.e. number of packets) vs sub-classification
2024-03-07 14:39:32 +01:00
Ivan Nardi
03ecb026ff
fuzz: improve fuzzing coverage (#2309) 2024-02-09 19:19:03 +01:00
Ivan Nardi
400cd516b5
Allow multiple struct ndpi_detection_module_struct to share some state (#2271)
Add the concept of "global context".

Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent from each other. This provide optimal performances in
multithreaded environment, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.

Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> Peer B) is PROTOCOL_X; then
  flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in the far
most common scenario where there is only one local context, but it might
be impractical in some more complex scenarios.

Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read/write from
different local contexts.
This is an optional feature, disabled by default.

Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.

Right now, only the LRU caches can be shared; the other main structures
(trees and automas) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.

Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.

It's up to the user to find the right trade-off between performances
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.

Add some basic examples of library initialization in
`doc/library_initialization.md`.

This code needs libpthread as external dependency. It shouldn't be a big
issue; however a configure flag has been added to disable global context
support. A new CI job has been added to test it.

TODO: we should need to find a proper way to add some tests on
multithreaded enviroment... not an easy task...

*** API changes ***

If you are not interested in this feature, simply add a NULL parameter to
any `ndpi_init_detection_module()` calls.
2024-02-01 15:33:11 +01:00
Ivan Nardi
92c2ac5a0f
fuzz: fuzz_config: try restoring good coverage (#2291)
Last changes reduce fuzzing coverage of this fuzzer :(
2024-01-29 10:53:28 +01:00
Ivan Nardi
b3efa7d3fe
fuzz: fuzz_config: we need bigegr inputs (#2285) 2024-01-25 09:58:20 +01:00
Ivan Nardi
d577508727
fuzz: extend fuzzing coverage (#2281) 2024-01-24 21:16:58 +01:00
Ivan Nardi
42d23cff6a
config: follow-up (#2268)
Some changes in the parameters names.
Add a fuzzer to fuzz the configuration file format.
Add the infrastructure to configuratin callbacks.
Add an helper to map LRU cache indexes to names.
2024-01-20 16:14:41 +01:00
Nardi Ivan
0712d496fe config: allow configuration of guessing algorithms 2024-01-18 10:21:24 +01:00
Nardi Ivan
6c85f10cd5 config: move debug/log configuration to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
c704be1a20 config: DNS: add two configuration options
* Enable/disable sub-classification of DNS flows
* Enable/disable processing of DNS responses
2024-01-18 10:21:24 +01:00
Nardi Ivan
950f209a17 config: HTTP: enable/disable processing of HTTP responses 2024-01-18 10:21:24 +01:00
Nardi Ivan
c669044a44 config: configure TLS certificate expiration with the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
88720331ae config: remove enum ndpi_prefs 2024-01-18 10:21:24 +01:00
Nardi Ivan
1289951b32 config: remove ndpi_set_detection_preferences() 2024-01-18 10:21:24 +01:00
Nardi Ivan
311d8b6dae config: move cfg of aggressiviness and opportunistic TLS to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
4cbe2674ab config: move IP lists configurations to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
f55358973f config: move LRU cache configurations to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
d72a760ac3 New API for library configuration
This is the first step into providing (more) configuration options in nDPI.

The idea is to have a simple way to configure (most of) nDPI: only one
function (`ndpi_set_config()`) to set any configuration parameters
(in the present or on in the future) and we try to keep this function
prototype as agnostic as possible.

You can configure the library:
* via API, using `ndpi_set_config()`
* via a configuration file, in a text format

This way, anytime we need to add a new configuration parameter:
* we don't need to add two public functions (a getter and a setter)
* we don't break API/ABI compatibility of the library; even changing
the parameter type (from integer to a list of integer, for example)
doesn't break the compatibility.

The complete list of configuration options is provided in
`doc/configuration_parameters.md`.

As a first example, two configuration knobs are provided:
* the ability to enable/disable the extraction of the sha1 fingerprint of
the TLS certificates.
* the upper limit on the number of packets per flow that will be subject
to inspection
2024-01-18 10:21:24 +01:00
Ivan Nardi
b3f2b1bb7f
STUN: rework extra dissection (#2202)
Keep looking for RTP packets but remove the monitoring concept.
We will re-introduce a more general concept of "flow in monitoring
state" later.
The function was disabled by default.
Some configuration knobs will be provided when/if #2190 is merged.
2023-12-11 14:53:12 +01:00
Ivan Nardi
3b35cb37d9
Keep separating public and private API (#2157)
See: b08c787fe
2023-11-29 17:13:00 +01:00
Ivan Nardi
03fd155ae3
IPv6: add support for custom categories (#2126) 2023-10-29 12:56:44 +01:00
Nardi Ivan
1366d94156 fuzzing: extend fuzzing coverage
Try fuzzing some functions which write to file/file descriptor; to avoid
slowing the fuzzer, close its stdout
2023-10-09 15:41:46 +02:00
Ivan Nardi
cc4461f424
fuzz: extend coverage (#2073) 2023-08-20 15:18:19 +02:00
Ivan Nardi
950f5cc4e3
fuzz: extend fuzzing coverage (#2040)
Some notes:
* libinjection: according to https://github.com/libinjection/libinjection/issues/44,
it seems NULL characters are valid in the input string;
* RTP: `rtp_get_stream_type()` is called only for RTP packets; if you
want to tell RTP from RTCP you should use `is_rtp_or_rtcp()`;
* TLS: unnecessary check; we already make the same check just above, at
the beginning of the `while` loop
2023-07-11 10:12:08 +02:00
Ivan Nardi
3608ab01b6
STUN: keep monitoring/processing STUN flows (#2012)
Look for RTP packets in the STUN sessions.
TODO: tell RTP from RTCP
2023-06-21 09:16:20 +02:00
Ivan Nardi
40b6d5a2e1
fuzz: extend fuzzers coverage (#1952) 2023-04-25 16:37:28 +02:00
Toni
b6629ba2be
Improved debug output. (#1951)
* try to get rid of some `printf(..)`s as they do not belong to a shared library
 * replaced all `exit(..)`s with `abort()`s to indicate an abnormal process termination

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-04-21 12:40:26 +02:00
Ivan Nardi
032e778a6d
Simplify ndpi_internal_guess_undetected_protocol() (#1941)
`ndpi_guess_undetected_protocol()/ndpi_internal_guess_undetected_protocol()`
is a strange function:
* it is exported by the library and it is actively used by `ntopng`
* it is intrinsecally ipv4-only
* it returns basically something like "classification_by_ip"/"classification_by_port"
(these information have already been calculated in `ndpi_do_guess()`...)
* it access the bittorrent LRU caches (similarly to
`ndpi_detection_giveup()` but without all the other caches...)

So:
* make the interface IPv4/6 agnostic
* use the classifications already available

This work will allow to make the Bittorrent caches IPV6-aware (see
81e1ea5).

Handle Dropbox classification "by-port" in the "standard" way.
2023-04-12 14:39:10 +02:00
Ivan Nardi
4d11941d32
Ookla: rework detection (#1922)
The logic of the LRU cache has been changed: once we know an ip has
connected to an Ookla server, all the following (unknown) flows (for
a short time interval) from the same ip to the port 8080 are treated
as Ookla ones.

Most of the changes in this commit are about introducing the concept of
"aggressive detection". In some cases, to properly detect a
protocol we might use some statistical/behavior logic that, from one
side, let us to identify the protocol more often but, from the other
side, might lead to some false positives.
To allow the user/application to easily detect when such logic has been
triggered, the new confidence value `NDPI_CONFIDENCE_DPI_AGGRESSIVE` has been
added.
It is always possible to disable/configure this kind of logic via the
API.

Detection of Ookla flows using plain TLS over port 8080 is the first
example of aggressive detection in nDPI.

Tested with:
* Android 9.0 with app 4.8.3
* Ubuntu 20.04 with Firefox 110
* Win 10 with app 1.15 and 1.16
* Win 10 with Chrome 108, Edge 108 and Firefox 106
2023-03-30 17:13:51 +02:00
Ivan Nardi
4075324e2b
fuzz: extend fuzz coverage (#1888) 2023-02-16 18:04:34 +01:00
Ivan Nardi
b51a2ac72a
fuzz: some improvements and add two new fuzzers (#1881)
Remove `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` define from
`fuzz/Makefile.am`; it is already included by the main configure script
(when fuzzing).

Add a knob to force disabling of AESNI optimizations: this way we can
fuzz also no-aesni crypto code.

Move CRC32 algorithm into the library.

Add some fake traces to extend fuzzing coverage. Note that these traces
are hand-made (via scapy/curl) and must not be used as "proof" that the
dissectors are really able to identify this kind of traffic.

Some small updates to some dissectors:

CSGO: remove a wrong rule (never triggered, BTW). Any UDP packet starting
with "VS01" will be classified as STEAM (see steam.c around line 111).
Googling it, it seems right so.

XBOX: XBOX only analyses UDP flows while HTTP only TCP ones; therefore
that condition is false.

RTP, STUN: removed useless "break"s

Zattoo: `flow->zattoo_stage` is never set to any values greater or equal
to 5, so these checks are never true.

PPStream: `flow->l4.udp.ppstream_stage` is never read. Delete it.

TeamSpeak: we check for `flow->packet_counter == 3` just above, so the
following check `flow->packet_counter >= 3` is always false.
2023-02-09 20:02:12 +01:00
Ivan Nardi
9fc724de5a
Add some fuzzers to test other data structures. (#1870)
Start using a dictionary for fuzzing (see:
https://llvm.org/docs/LibFuzzer.html#dictionaries).
Remove some dead code.
Fuzzing with debug enabled is not usually a great idea (from performance
POV). Keep the code since it might be useful while debugging.
2023-01-25 11:44:59 +01:00
Ivan Nardi
5e8c1ebbb7
fuzz: fix memory allocation failure logic (#1867)
We *do* want to have some allocation errors.
Fix some related bugs
Fix: 29be01ef
2023-01-20 14:27:33 +01:00
Ivan Nardi
1b98bec0ab
LRU caches: add a generic (optional and configurable) expiration logic (#1855)
Two caches already implemented a similar mechanism: make it generic.
2023-01-18 18:18:36 +01:00
Ivan Nardi
560280e6f0
fuzz: add fuzzer testing nDPI (initial) configurations (#1830)
The goal of this fuzzer is to test init and deinit of the library, with
different configurations. In details:
* random memory allocation failures, even during init phase
* random `ndpi_init_prefs` parameter of `ndpi_init_detection_module()`
* random LRU caches sizes
* random bitmask of enabled protocols
* random parameters of `ndpi_set_detection_preferences()`
* random initialization of opportunistic TLS
* random load/don't load of configuration files

This new fuzzer is a C++ file, because it uses `FuzzedDataProvider`
class (see
https://github.com/google/fuzzing/blob/master/docs/split-inputs.md).
Note that the (existing) fuzzers need to be linked with C++ compiler
anyway, so this new fuzzer doesn't add any new requirements.
2022-12-23 19:07:13 +01:00