Commit graph

269 commits

Author SHA1 Message Date
Luca
44a290286b More NDPI_PROBING_ATTEMPT changes 2024-05-22 18:04:33 +02:00
Luca Deri
b83eb7c7a2 Implemented STUN peer_address, relayed_address, response_origin, other_address parsing
Added code to ignore invalid STUN realm
Extended JSON output with STUN information
2024-04-12 19:50:04 +02:00
Ivan Nardi
1b3ef7d7b2
STUN: improve extraction of Mapped-Address metadata (#2370)
Enable parsing of Mapped-Address attribute for all STUN flows: that
means that STUN classification might require more packets.

Add a configuration knob to enable/disable this feature.

Note that we can have (any) STUN metadata also for flows *not*
classified as STUN (because of DTLS).

Add support for ipv6.

Restore the correct extra dissection logic for Telegram flows.
2024-04-08 10:24:51 +02:00
Luca Deri
9185c2ccc4 Added support for STUN Mapped IP address 2024-04-03 23:03:46 +02:00
Toni
41eef9246c
Disable -Wno-unused-parameter -Wno-unused-function. (#2358)
* unused parameters and functions pollute the code and decrease readability

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-04-03 14:10:21 +02:00
Ivan Nardi
ad25affcb7
reader_util: fix GRE detunneling (#2314) 2024-02-10 09:16:27 +01:00
Ivan Nardi
400cd516b5
Allow multiple struct ndpi_detection_module_struct to share some state (#2271)
Add the concept of "global context".

Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent from each other. This provide optimal performances in
multithreaded environment, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.

Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> Peer B) is PROTOCOL_X; then
  flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in the far
most common scenario where there is only one local context, but it might
be impractical in some more complex scenarios.

Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read/write from
different local contexts.
This is an optional feature, disabled by default.

Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.

Right now, only the LRU caches can be shared; the other main structures
(trees and automas) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.

Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.

It's up to the user to find the right trade-off between performances
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.

Add some basic examples of library initialization in
`doc/library_initialization.md`.

This code needs libpthread as external dependency. It shouldn't be a big
issue; however a configure flag has been added to disable global context
support. A new CI job has been added to test it.

TODO: we should need to find a proper way to add some tests on
multithreaded enviroment... not an easy task...

*** API changes ***

If you are not interested in this feature, simply add a NULL parameter to
any `ndpi_init_detection_module()` calls.
2024-02-01 15:33:11 +01:00
Ivan Nardi
9b26e74bb7
example: rework code between ndpiReader.c and reader_util.c (#2273) 2024-01-22 18:12:06 +01:00
Nardi Ivan
0712d496fe config: allow configuration of guessing algorithms 2024-01-18 10:21:24 +01:00
Nardi Ivan
6c85f10cd5 config: move debug/log configuration to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
88720331ae config: remove enum ndpi_prefs 2024-01-18 10:21:24 +01:00
Ivan Nardi
111015b872
ndpiReader: improve the check on max number of pkts processed per flow (#2261)
Allow to disable this check.

I don't know how much sense these limits have in the application
(especially with those default values...) since we have always had a
hard limit on the library itself (`max_packets_to_process` set to 32).
The only value might be that they provide different limits for TCP and
UDP traffic.

Keep them for the time being...
2024-01-15 20:12:57 +01:00
Ivan Nardi
dd8be1fcb1
Fix some warnings reported by CODESonar (#2227)
Remove some unreached/duplicated code.

Add error checking for `atoi()` calls.

About `isdigit()` and similar functions. The warning reported is:
```
Negative Character Value help
isdigit() is invoked here with an argument of signed type char, but only
has defined behavior for int arguments that are either representable
as unsigned char or equal to the value of macro EOF(-1).
Casting the argument to unsigned char will avoid the undefined behavior.
In a number of libc implementations, isdigit() is implemented using lookup
tables (arrays): passing in a negative value can result in a read underrun.
```
Switching to our macros fix that.
Add a check to `check_symbols.sh` to avoid using the original functions
from libc.
2024-01-12 13:30:43 +01:00
Toni
6c3d162cd6
Add realtime protocol output to ndpiReader. (#2197)
* support for using a new flow callback invoked before the flow memory is free'd
 * minor fixes

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-01-09 00:39:59 +01:00
Luca Deri
8285fffdae Implements JA4 Support (#2191) 2023-12-22 20:40:42 +01:00
Ivan Nardi
241c42ad7e
ndpiReader: fix guessed_flow_protocols statistic (#2203)
Increment the counter only if the flow has been guessed
2023-12-12 19:44:03 +01:00
Nardi Ivan
4a0eda69ad QUIC: export QUIC version as metadata 2023-10-11 15:15:20 +02:00
Nardi Ivan
86115a8a65 fuzz: extend fuzzing coverage 2023-10-07 13:34:37 +02:00
Luca
77e5daf03e Cleaned up mining datastructure 2023-09-27 17:05:12 +02:00
Toni
ef3adb9830
Added printf/fprintf replacement for some internal modules. (#1974)
* logging is instead redirected to `ndpi_debug_printf`

Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-09-26 23:10:57 +02:00
Ivan Nardi
2a0052f25e
fuzz: add fuzzers to test reader_util code (#2080) 2023-09-10 15:07:52 +02:00
Ivan Nardi
cc4461f424
fuzz: extend coverage (#2073) 2023-08-20 15:18:19 +02:00
Luca Deri
196395398e Compilation fixes for older C compilers 2023-08-01 15:35:33 +02:00
Toni
e4d3d619bc
Add Service Location Protocol dissector. (#2036)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-08-01 08:50:46 +02:00
Ivan Nardi
5019022e13
DNS: extract geolocation information, if available (#2065)
The option NSID (RFC5001) is used by Google DNS to report the
airport code of the metro where the DNS query is handled.

This option is quite rare, but the added overhead in DNS code is pretty
much zero for "normal" DNS traffic
2023-07-31 07:44:43 +02:00
Ivan Nardi
c85f2fb0f4
TLS: add basic, basic, detection of Encrypted ClientHello (#2053) 2023-07-21 03:41:43 +02:00
Luca Deri
1f55dc511f Implemented Count-Min Sketch [count how many times a value has been observed]
- ndpi_cm_sketch_init()
- ndpi_cm_sketch_add()
- ndpi_cm_sketch_count()
- ndpi_cm_sketch_destroy()
2023-07-13 21:54:51 +02:00
Chiara Maggi
0b0f255cc2
added feature to extract filename from http attachment (#2037)
* added feature to extract filename from http attachment

* fixed some issues

* added check for filename format

* added check for filename format

* remove an unnecessary print

* changed the size from 952 to 960

* modified some test result files

* small changes string size

* comment removed and mallocs checked
2023-07-11 22:45:19 +02:00
Ivan Nardi
88425e0199
Simplify the report of streaming multimedia info (#2026)
The two fields `flow->flow_type` and `flow->protos.rtp.stream_type` are
pretty much identical: rename the former in `flow->flow_multimedia_type`
and remove the latter.
2023-06-26 12:05:16 +02:00
Luca Deri
9cc4cbb9d1 Reworked teams handling 2023-06-15 22:31:11 +02:00
Luca Deri
d0609ea601 Implemented Zoom/Teams stream type detection 2023-06-14 23:44:57 +02:00
Luca Deri
63a2b359b9 Added vlan_id in ndpi_flow2json() prototype 2023-06-09 19:50:47 +02:00
Toni
04f5c5196e
Changed logging callback function sig. (#2000)
* make user data available for any build config

Signed-off-by: lns <matzeton@googlemail.com>
2023-05-30 08:27:37 +02:00
Ivan Nardi
efb261a95c
Fix some memory errors triggered by allocation failures (#1995)
Some low hanging fruits found using nallocfuzz.
See: https://github.com/catenacyber/nallocfuzz
See: https://github.com/google/oss-fuzz/pull/9902

Most of these errors are quite trivial to fix; the only exception is the
stuff in the uthash.
If the insertion fails (because of an allocation failure), we need to
avoid some memory leaks. But the only way to check if the `HASH_ADD_*`
failed, is to perform a new lookup: a bit costly, but we don't use that
code in any critical data-path.
2023-05-29 19:24:00 +02:00
Ivan Nardi
46ff069117
ndpiReader: improve printing of payload statistics (#1989)
Add a basic unit test

Fix an endianess issue
2023-05-29 16:53:11 +02:00
Ivan Nardi
86b56646b5
ndpiReader: fix export of DNS/BitTorrent attributes (#1985)
There is no BitTorrent hash in the DNS flows
2023-05-20 17:23:48 +02:00
Ivan Nardi
9004d5c2ca
ndpiReader: fix export of HTTP attributes (#1982) 2023-05-20 15:12:14 +02:00
headshog
dd7198fcaf fixed numeric truncation error 2023-05-20 15:09:51 +02:00
Ivan Nardi
0223d3c4f5
HTTP: improve extraction of metadata and of flow risks (#1959) 2023-05-05 13:35:20 +02:00
Ivan Nardi
ba99330031
ndpiReader: fix flow stats (#1943) 2023-04-14 11:17:37 +02:00
Ivan Nardi
25c1111911
fuzz: add a new fuzzer triggering the payload analyzer function(s) (#1926) 2023-04-04 14:39:29 +02:00
Maatuq
f1193d5e6f
add support for gre decapsulation (#1442) (#1921) 2023-04-04 14:20:11 +02:00
Ivan Nardi
04a426feef
ndpiReader: fix VXLAN de-tunneling (#1913)
```
==20665==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6040000aec81 at pc 0x0000004f5c6f bp 0x7fff07e9e1f0 sp 0x7fff07e9e1e8
READ of size 1 at 0x6040000aec81 thread T0
SCARINESS: 12 (1-byte-read-heap-buffer-overflow)
    #0 0x4f5c6e in ndpi_is_valid_vxlan ndpi/example/reader_util.c:1784:6
    #1 0x4f5c6e in ndpi_workflow_process_packet ndpi/example/reader_util.c:2292:16
    #2 0x4dd821 in LLVMFuzzerTestOneInput ndpi/fuzz/fuzz_ndpi_reader.c:135:7
    #3 0x4f91ba in ExecuteFilesOnyByOne /src/aflplusplus/utils/aflpp_driver/aflpp_driver.c:234:7
    #4 0x4f8f8c in main /src/aflplusplus/utils/aflpp_driver/aflpp_driver.c:318:12
    #5 0x7f2289324082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16
    #6 0x41e6cd in _start
```
Found by oss-fuzz.
See: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=57369
2023-03-25 19:19:51 +01:00
Maatuq
530d0de438
Add support for vxlan decapsulation (#1441) (#1900)
Close #1441
2023-03-22 18:18:12 +01:00
Ivan Nardi
22fb8349b9
ndpiReader: print how many packets (per flow) were needed to perform full DPI (#1891)
Average values are already printed, but this change should ease to
identify regressions/improvements.
2023-03-01 21:50:47 +01:00
Ivan Nardi
b51a2ac72a
fuzz: some improvements and add two new fuzzers (#1881)
Remove `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` define from
`fuzz/Makefile.am`; it is already included by the main configure script
(when fuzzing).

Add a knob to force disabling of AESNI optimizations: this way we can
fuzz also no-aesni crypto code.

Move CRC32 algorithm into the library.

Add some fake traces to extend fuzzing coverage. Note that these traces
are hand-made (via scapy/curl) and must not be used as "proof" that the
dissectors are really able to identify this kind of traffic.

Some small updates to some dissectors:

CSGO: remove a wrong rule (never triggered, BTW). Any UDP packet starting
with "VS01" will be classified as STEAM (see steam.c around line 111).
Googling it, it seems right so.

XBOX: XBOX only analyses UDP flows while HTTP only TCP ones; therefore
that condition is false.

RTP, STUN: removed useless "break"s

Zattoo: `flow->zattoo_stage` is never set to any values greater or equal
to 5, so these checks are never true.

PPStream: `flow->l4.udp.ppstream_stage` is never read. Delete it.

TeamSpeak: we check for `flow->packet_counter == 3` just above, so the
following check `flow->packet_counter >= 3` is always false.
2023-02-09 20:02:12 +01:00
Ivan Nardi
9f27cd56b0
ndpiReader: fix packet dissection (CAPWAP and TSO) (#1878)
Fix decapsulation of CAPWAP; we are interested only in "real" user data
tunneled via CAPWAP.
When Tcp Segmentation Offload is enabled in the NIC, the received packet
might have 0 as "ip length" in the IPv4 header
(see
https://osqa-ask.wireshark.org/questions/16279/why-are-the-bytes-00-00-but-wireshark-shows-an-ip-total-length-of-2016/)

The effect of these two bugs was that some packets were discarded.

Be sure that flows order is deterministic
2023-01-30 10:59:18 +01:00
Ivan Nardi
3e6cadbb76
ndpireader: fix "Discarded bytes" statistics (#1877) 2023-01-27 07:09:34 +01:00
Ivan Nardi
29c5cc39fb
Some small changes (#1869)
All dissector callbacks should not be exported by the library; make static
some other local functions.
The callback logic in `ndpiReader` has never been used.
With internal libgcrypt, `gcry_control()` should always return no
errors.
We can check `categories` length at compilation time.
2023-01-25 11:44:09 +01:00
Ivan Nardi
ad6bfbad4d
Add protocol disabling feature (#1808)
The application may enable only some protocols.
Disabling a protocol means:
*) don't register/use the protocol dissector code (if any)
*) disable classification by-port for such a protocol
*) disable string matchings for domains/certificates involving this protocol
*) disable subprotocol registration (if any)

This feature can be tested with `ndpiReader -B list_of_protocols_to_disable`.

Custom protocols are always enabled.

Technically speaking, this commit doesn't introduce any API/ABI
incompatibility. However, calling `ndpi_set_protocol_detection_bitmask2()`
is now mandatory, just after having called `ndpi_init_detection_module()`.

Most of the diffs (and all the diffs in `/src/lib/protocols/`) are due to
the removing of some function parameters.

Fix the low level macro `NDPI_LOG`. This issue hasn't been detected
sooner simply because almost all the code uses only the helpers `NDPI_LOG_*`
2022-12-18 08:10:57 +00:00