Commit graph

39 commits

Author SHA1 Message Date
Luca Deri
37ca034697 (C) update 2026-01-01 10:31:40 +01:00
Ivan Nardi
83d85775a8
Provide an explicit state for the flow classification process (#2942)
Application should keep calling nDPI until flow state became
`NDPI_STATE_CLASSIFIED`.

The main loop in the application is simplified to something like:
```
res = ndpi_detection_process_packet(...);
if(res->state == NDPI_STATE_CLASSIFIED) {
  /* Done: you can get finale classification and all metadata.
     nDPI doesn't need more packets for this flow */
} else {
  /* nDPI needs more packets for this flow. The provided
     classification is not final and more metadata might be
     extracted.
     If `res->state` is `NDPI_STATE_PARTIAL`, partial/initial
     classification is available in `res->proto`
     as usual but it can be updated later.
  */
}

/*
    Example A (QUIC flow):
     pkt 1: proto QUIC state NDPI_STATE_PARTIAL
     pkt 2: proto QUIC/Youtube  state NDPI_STATE_CLASSIFIED
    Example B (GoogleMeet call):
     pkt 1:   proto STUN state NDPI_STATE_PARTIAL
     pkt N:   proto DTLS state NDPI_STATE_PARTIAL
     pkt N+M: proto DTLS/GoogleCall state NDPI_STATE_CLASSIFIED
    Example C (standard TLS flow):
     pkt 1:   proto Unknown state NDPI_STATE_INSPECTING
     pkt 2:   proto Unknown state NDPI_STATE_INSPECTING
     pkt 3:   proto Unknown state NDPI_STATE_INSPECTING
     pkt 4:   proto TLS/Facebook state NDPI_STATE_PARTIAL
     pkt N:   proto TLS/Facebook state NDPI_STATE_CLASSIFIED
 */
}
```
You can take a look at `ndpiReader` for a slightly more complex example.

API changes:
* remove the third parameter from `ndpi_detection_giveup()`. If you need
to know if the classification flow has been guessed, you can access
`flow->protocol_was_guessed`
* remove `ndpi_extra_dissection_possible()`
* change some prototypes from accepting `ndpi_protocol foo` to
`ndpi_master_app_protocol bar`. The update is trivial: from `foo` to
`foo.proto`
2025-11-03 12:08:15 +01:00
Ivan Nardi
28ae2e14d8
Check ndpi_finalize_initialization() return value (#2884) 2025-06-14 11:31:23 +02:00
Ivan Nardi
70a72f1638
New API to enable/disable protocols; remove ndpi_set_protocol_detection_bitmask2() (#2853)
The main goal is not to have the bitmask depending on the total number
of protocols anymore: `NDPI_INTERNAL_PROTOCOL_BITMASK` depends only on
internal protocols, i.e. on `NDPI_MAX_INTERNAL_PROTOCOLS`, i.e.
custom-defined protocols are not counted.
See #2136

Keep the old data structure `NDPI_PROTOCOL_BITMASK` with the old
semantic.

Since we need to change the API (and all the application code...)
anyway, simplify the API: by default all the protocols are enabled.
If you need otherwise, please use `ndpi_init_detection_module_ext()`
instead of `ndpi_init_detection_module()` (you can find an example in
the `ndpiReader` code).

To update the application code you likely only need to remove these 3
lines from your code:
```
- NDPI_PROTOCOL_BITMASK all;
- NDPI_BITMASK_SET_ALL(all);
- ndpi_set_protocol_detection_bitmask2(ndpi_str, &all);
```

Removed an unused field and struct definition.
2025-06-03 09:45:46 +02:00
Luca Deri
1577955fca Added ndpi_find_protocol_qoe() API call
Updated (C)
2025-02-10 21:21:51 +01:00
Ivan Nardi
eb133b8fa5
TLS: better state about handshake (#2534)
Keep track if we received CH or/and SH messsages: usefull with
unidirectional flows
2024-09-03 12:44:22 +02:00
Luca Deri
2315f44efa Compilation fixes 2024-08-24 16:59:56 +02:00
Toni
41eef9246c
Disable -Wno-unused-parameter -Wno-unused-function. (#2358)
* unused parameters and functions pollute the code and decrease readability

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-04-03 14:10:21 +02:00
Ivan Nardi
400cd516b5
Allow multiple struct ndpi_detection_module_struct to share some state (#2271)
Add the concept of "global context".

Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent from each other. This provide optimal performances in
multithreaded environment, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.

Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> Peer B) is PROTOCOL_X; then
  flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in the far
most common scenario where there is only one local context, but it might
be impractical in some more complex scenarios.

Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read/write from
different local contexts.
This is an optional feature, disabled by default.

Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.

Right now, only the LRU caches can be shared; the other main structures
(trees and automas) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.

Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.

It's up to the user to find the right trade-off between performances
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.

Add some basic examples of library initialization in
`doc/library_initialization.md`.

This code needs libpthread as external dependency. It shouldn't be a big
issue; however a configure flag has been added to disable global context
support. A new CI job has been added to test it.

TODO: we should need to find a proper way to add some tests on
multithreaded enviroment... not an easy task...

*** API changes ***

If you are not interested in this feature, simply add a NULL parameter to
any `ndpi_init_detection_module()` calls.
2024-02-01 15:33:11 +01:00
Nardi Ivan
0712d496fe config: allow configuration of guessing algorithms 2024-01-18 10:21:24 +01:00
Nardi Ivan
88720331ae config: remove enum ndpi_prefs 2024-01-18 10:21:24 +01:00
Ivan Nardi
a5595d16c0
CI: update list of compilers (#2223)
Try using latest gcc and clang versions.
We still care about RHEL7: since handling a RHEL7 runner on GitHub is
quite complex, let try to use a similar version of gcc, at least
2023-12-20 19:22:22 +01:00
rl1987
59d476195c
Fix typos (#2204)
* Fix typo in ndpiSimpleIntegration.c

* Fix misspelling in a comment
2023-12-10 19:58:22 +01:00
Toni
0cb6f4cb75
Fixed hash buffer size in ndpiSimpleIntegration. (#2143)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-11-10 10:23:37 +01:00
Gowa2017
c882120afd
thread_index may by negative. (#1814)
* thread_index may by negative.

Like:
192.168.8.155:55848 --> 183.3.224.139

* reader thread index also need to uint32
2022-12-05 10:22:05 +01:00
Ivan Nardi
ca5ffc4988
TLS: improve handling of ALPN(s) (#1784)
Tell "Advertised" ALPN list from "Negotiated" ALPN; the former is
extracted from the CH, the latter from the SH.

Add some entries to the known ALPN list.

Fix printing of "TLS Supported Versions" field.
2022-10-25 17:06:29 +02:00
Nardi Ivan
f3a74d97d8 TLS/DTLS: we process certificate for UDP flows, too
Note that current code access `certificate_processed` state even before
setting the protocol classification, so this piece of information can't
be saved in `flow->protos` union.
2022-09-30 06:53:29 +02:00
Ivan Nardi
e6b332aa4a
Add support for flow client/server information (#1671)
In a lot of places in ndPI we use *packet* source/dest info
(address/port/direction) when we are interested in *flow* client/server
info, instead.

Add basic logic to autodetect this kind of information.

nDPI doesn't perform any "flow management" itself but this task is
delegated to the external application. It is then likely that the
application might provide more reliable hints about flow
client/server direction and about the TCP handshake presence: in that case,
these information might be (optionally) passed to the library, disabling
the internal "autodetect" logic.

These new fields have been used in some LRU caches and in the "guessing"
algorithm.
It is quite likely that some other code needs to be updated.
2022-07-24 17:46:24 +02:00
Luca Deri
3ad989f6a8 Added BPF filtering for discarding non-IP packets 2022-04-27 17:05:33 +02:00
Toni Uhlig
c3df3a12aa Fixed msys2 build warnings and re-activated CI Mingw64 build.
* Removed Visual Studio leftovers. Maintaining an autotools project with VS integration requires some additional overhead.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: lns <matzeton@googlemail.com>
2022-04-14 19:17:48 +02:00
ol-andreyizrailev
8cc5cb9f76
Increment current/total number of active flows on successful flow insertion (#1434)
Memory allocation or ndpi_tsearch might fail, so the two values should be
incremented only when insertion actually happened.

Co-authored-by: Andrey Izrailev <Andrey.Izrailev@oktetlabs.ru>
2022-02-09 11:43:10 +01:00
Ivan Nardi
5bb5bec477
Remove struct ndpi_id_struct (#1427)
Remove the last uses of `struct ndpi_id_struct`.
That code is not really used and it has not been updated for a very long
time: see #1279 for details.

Correlation among flows is achieved via LRU caches.

This change allows to further reduce memory consumption (see also
91bb77a8).

At nDPI 4.0 (more precisly, at a6b10cf, because memory stats
were wrong until that commit):
```
nDPI Memory statistics:
	nDPI Memory (once):      221.15 KB
	Flow Memory (per flow):  2.94 KB
```
Now:
```
nDPI Memory statistics:
	nDPI Memory (once):      235.27 KB
	Flow Memory (per flow):  688 B        <--------
```
i.e. memory usage per flow has been reduced by 77%.

Close #1279
2022-01-30 19:18:12 +01:00
Toni
9b8679a320
Fix some race conditions by using atomic operations. (#1420)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-01-29 09:18:51 +01:00
Toni
011ee3ecbd
Fixed wrong ip tuple comparison. #1386 (#1418)
* Added u32 pads to `union ip_tuple` so btree search should now work as expected.
   The bug caused new flow's when the remote answers, resulting in two Flows per direction. Fail.
 * Fixed a race condition during shutdown phase.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-01-26 15:47:38 +01:00
Ivan Nardi
91bb77a880
A final(?) effort to reduce memory usage per flow (#1389)
Remove some unused fields and re-organize other ones.
In particular:
* Update the parameters of `ndpi_ssl_version2str()` function
* Zattoo, Thunder: these timestamps aren't really used.
* Ftp/mail: these protocols are dissected only over TCP.
* Attention must be paid to TLS.Bittorrent flows to avoid invalid
read/write to `flow->protos.bittorrent.hash` field.

This is the last(?) commit of a long series (see 22241a1d, 227e586e,
730c2360, a8ffcd8b) aiming to reduce library memory consumption.

Before, at nDPI 4.0 (more precisly, at a6b10cf7, because memory stats
were wrong until that commit):
```
nDPI Memory statistics:
	nDPI Memory (once):      221.15 KB
	Flow Memory (per flow):  2.94 KB
```
Now:
```
nDPI Memory statistics:
	nDPI Memory (once):      231.71 KB
	Flow Memory (per flow):  1008 B       <---------
```
i.e. memory usage per flow has been reduced by 66%, dropping below the
psychological threshold of 1 KB.

To further reduce this value, we probably need to look into #1279:
let's fight this battle another day.
2021-12-22 19:54:06 +01:00
Ivan Nardi
a8ffcd8bb0
Rework how hostname/SNI info is saved (#1330)
Looking at `struct ndpi_flow_struct` the two bigger fields are
`host_server_name[240]` (mainly for HTTP hostnames and DNS domains) and
`protos.tls_quic.client_requested_server_name[256]`
(for TLS/QUIC SNIs).

This commit aims to reduce `struct ndpi_flow_struct` size, according to
two simple observations:
 1) maximum one of these two fields is used for each flow. So it seems safe
to merge them;
 2) even if hostnames/SNIs might be very long, in practice they are rarely
longer than a fews tens of bytes. So, using a (single) large buffer is a
waste of memory for all kinds of flows. If we need to truncate the name,
we keep the *last* characters, easing domain matching.

Analyzing some real traffic, it seems safe to assume that the vast
majority of hostnames/SNIs is shorter than 80 bytes.

Hostnames/SNIs are always converted to lowercase.

Attention was given so as to be sure that unit-tests outputs are not
affected by this change.

Because of a bug, TLS/QUIC SNI were always truncated to 64 bytes (the
*first* 64 ones): as a consequence, there were some "Suspicious DGA
domain name" and "TLS Certificate Mismatch" false positives.
2021-11-24 10:46:48 +01:00
Ivan Nardi
afc2b641eb
Fix writes to flow->protos union fields (#1354)
We can write to `flow->protos` only after a proper classification.

This issue has been found in Kerberos, DHCP, HTTP, STUN, IMO, FTP,
SMTP, IMAP and POP code.
There are two kinds of fixes:
 * write to `flow->protos` only if a final protocol has been detected
 * move protocol state out of `flow->protos`
The hard part is to find, for each protocol, the right tradeoff between
memory usage and code complexity.

Handle Kerberos like DNS: if we find a request, we set the protocol
and an extra callback to further parsing the reply.

For all the other protocols, move the state out of `flow->protos`. This
is an issue only for the FTP/MAIL stuff.

Add DHCP Class Identification value to the output of ndpiReader and to
the Jason serialization.

Extend code coverage of fuzz tests.

Close #1343
Close #1342
2021-11-15 16:20:57 +01:00
Ivan Nardi
9e0f0ce3df
Fix access to some TLS fields in flow structure (#1277)
Fields 'tls.hello_processed` and `tls.subprotocol_detected` are used by
QUIC (i.e UDP...), too.
2021-08-20 18:11:37 +02:00
Toni
8d0c7b1fae
Fixed Mingw64 build, SonerCloud-CI and more. (#1273)
* Added ARM build and unit test run for SonarCloud-CI.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

* Fixed Mingw64 build.

 * adapted to SonarCloud-CI workflow
 * removed broken and incomplete Windows example (tested on VS2017/VS2019)
 * removed unnecessary include (e.g. pthread.h for the library which does not make use of it)

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2021-08-18 11:34:16 +02:00
Toni
6ad0d6666c
Implemented function to retrieve flow information. #1253 (#1254)
* fixed [h]euristic typo

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2021-07-23 10:37:20 +02:00
Luca Deri
87ad2b58dc Compilation fix 2021-04-26 10:50:19 +02:00
Luca Deri
fc3db8f169 Implemented TLS Certificate Sibject matching
Improved AnyDesk detection
2021-02-22 22:37:33 +01:00
Luca Deri
288ccd6215 Fixes due to datatype rename 2021-01-22 09:17:34 +01:00
Luca Deri
019a64630b
Merge pull request #983 from lnslbrty/fix/libpcap-obsolete-pcap_lookupdev-usage
Replaced obsolete libpcap pcap_lookupdev with pcap_findalldevs.
2020-08-16 10:03:33 +02:00
Toni Uhlig
b31fde4bbb
Replaced obsolete libpcap pcap_lookupdev with pcap_findalldevs.
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2020-08-15 12:35:49 +02:00
Toni Uhlig
aa856735c0
num_extra_packets_checked check can be 0 for some protocols and therefor requires lesser-or-equal condition for max_extra_packets_to_check
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2020-08-13 10:32:31 +02:00
Toni Uhlig
8da5f42fa0
Changed ndpi_ssl_version2str function call in ndpiSimpleIntegration.
Fixes build error introduced with 23c072153.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2020-08-08 12:05:11 +02:00
lucaderi
ecdf7df454 Compilation fixes for non-Linux (or outdated Linux) platforms 2020-06-25 10:25:24 +02:00
Toni Uhlig
17c26911fb
ndpiSimpleIntegration: added another integration example
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2020-06-24 22:03:18 +02:00