Commit graph

250 commits

Author SHA1 Message Date
Ivan Nardi
338eedd05b
HTTP, QUIC, TLS: allow to disable sub-classification (#2533) 2024-09-03 12:35:45 +02:00
Ivan Nardi
34e1ac0bbb
fuzz: fix compilation (#2532) 2024-08-26 21:01:18 +02:00
Vladimir Gavrilov
aec2e2fbb8
Endian-independent implementation of IEEE 802.3 CRC32 (#2529) 2024-08-25 11:54:17 +02:00
Luca Deri
b627ec91d1 Compilation fixes 2024-08-24 17:18:33 +02:00
Ivan Nardi
653175e724
Fix verify_dist_tarball.sh after latest release (#2519)
Moving from 4.8 to 4.10 (and so, from 4.9 to 4.11 for development builds)
made some paths one character longer; that triggers an error with tar
when running `verify_dist_tarball.sh` script:

```
tar: libndpi-4.11.0/fuzz/corpus/fuzz_filecfg_config/flow_risk.anonymous_subscriber.list.protonvpn.load.txt: file name is too long (max 99); not dumped
```
As a quick fix, reduce the length of that file name.
2024-08-07 10:04:33 +02:00
Ivan Nardi
65e31b0ea3
FPC: small improvements (#2512)
Add printing of fpc_dns statistics and add a general cconfiguration option.
Rework the code to be more generic and ready to handle other logics.
2024-07-22 17:42:23 +02:00
Ivan Nardi
c3ba65311e
fuzzing: improve coverage (#2495)
Fix detection of WebDAV and Gnutella (over HTTP)
Fix detection of z3950

Add two fuzzers to test `ndpi_memmem()` and `ndpi_strnstr()`

Remove some dead code:
* RTP: the same exact check is performed at the very beginning of the
function
* MQTT: use a better helper to exclude the protocol
* Colletd: `ndpi_hostname_sni_set()` never fails

Update pl7m code (fix a Use-of-uninitialized-value error)
2024-07-12 14:22:25 +02:00
Ivan Nardi
843e487270
Add infrastructure for explicit support of Fist Packet Classification (#2488)
Let's start with some basic helpers and with FPC based on flow addresses.

See: #2322
2024-07-03 18:02:07 +02:00
Ivan Nardi
ebcea42e2b
fuzz: improve fuzzers using pl7m (#2486)
For some unclear reasons, fuzzers using pl7m create huge corpus,
triggering OOM in oss-fuzz runs (where the memory RSS limit is set to
2560Mb). Example:

```
==25340== ERROR: libFuzzer: out-of-memory (used: 2564Mb; limit: 2560Mb)
To change the out-of-memory limit use -rss_limit_mb=<N>

Live Heap Allocations: 2364004039 bytes in 133791 chunks; quarantined: 60662293 bytes in 3664 chunks; 176432 other chunks; total chunks: 313887; showing top 95% (at most 8 unique contexts)
1285841683 byte(s) (54%) in 2956 allocation(s)
    #0 0x56f814ef4bde in __interceptor_malloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x56f814e04416 in operator new(unsigned long) cxa_noexception.cpp:0
    #2 0x56f814de6b2d in assign<unsigned char *, 0> /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1443:3
    #3 0x56f814de6b2d in operator= /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1412:9
    #4 0x56f814de6b2d in fuzzer::InputCorpus::AddToCorpus(std::__Fuzzer::vector<unsigned char, std::__Fuzzer::allocator<unsigned char>> const&, unsigned long, bool, bool, bool, std::__Fuzzer::chrono::duration<long long, std::__Fuzzer::ratio<1l, 1000000l>>, std::__Fuzzer::vector<unsigned int, std::__Fuzzer::allocator<unsigned int>> const&, fuzzer::DataFlowTrace const&, fuzzer::InputInfo const*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerCorpus.h:221:10
    #5 0x56f814de60e5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:539:16
    #6 0x56f814de7df2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:829:7
    #7 0x56f814de8127 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:867:3
    #8 0x56f814dd6736 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:914:6
    #9 0x56f814e02c62 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #10 0x7fa11e2c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16

1031350683 byte(s) (43%) in 2468 allocation(s)
    #0 0x56f814ef4bde in __interceptor_malloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x56f814e04416 in operator new(unsigned long) cxa_noexception.cpp:0
    #2 0x56f814de6b2d in assign<unsigned char *, 0> /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1443:3
    #3 0x56f814de6b2d in operator= /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1412:9
    #4 0x56f814de6b2d in fuzzer::InputCorpus::AddToCorpus(std::__Fuzzer::vector<unsigned char, std::__Fuzzer::allocator<unsigned char>> const&, unsigned long, bool, bool, bool, std::__Fuzzer::chrono::duration<long long, std::__Fuzzer::ratio<1l, 1000000l>>, std::__Fuzzer::vector<unsigned int, std::__Fuzzer::allocator<unsigned int>> const&, fuzzer::DataFlowTrace const&, fuzzer::InputInfo const*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerCorpus.h:221:10
    #5 0x56f814de60e5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:539:16
    #6 0x56f814de7635 in fuzzer::Fuzzer::MutateAndTestOne() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:760:19
    #7 0x56f814de8425 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:905:5
    #8 0x56f814dd6736 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:914:6
    #9 0x56f814e02c62 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #10 0x7fa11e2c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16
```

See: https://oss-fuzz.com/testcase-detail/4717811415449600
See: https://oss-fuzz.com/testcase-detail/6164130982068224

Let's *try* the following workaround: set the parameter `-max-len` to
512K, to force the engine to not genereate inputs (i.e. pcap files...)
larger than 512K. Right now the value used is 1MB, i.e the default,
because we have file larger than 1MB in the initial seeds (i.e.
`/tests/pcaps/*`).

Let's hope than having smaller files lead to smaller corpus...

Update pl7m code (fix a Use-of-uninitialized-value error)
2024-06-29 18:01:05 +02:00
Ivan Nardi
83e6e753af
fuzz: pl7m: add a custom mutator for better fuzzing of pcap files (#2483)
Pl7m is a custom mutator (used for structure aware fuzzing) for network
traffic packet captures (i.e. pcap files).

The output of the mutator is always a valid pcap file, containing the
same flows/sessions of the input file. That's it: the mutator only
changes the packet payload after the TCP/UDP header, keeping all the
original L2/L3 information (IP addresses and L4 ports).

See: https://github.com/IvanNardi/pl7m
2024-06-27 18:07:43 +02:00
Nardi Ivan
556f892a56 wireshark: lua: export some metadata
Export some metadata (for the moment, SNI and TLS fingerprints) to
Wireshark/tshark via extcap.
Note that:
* metadata are exported only once per flow
* metadata are exported (all together) when nDPI stopped processing
the flow

Still room for a lot of improvements!
In particular:
* we need to add some boundary checks (if we are going to export other
attributes)
* we should try to have a variable length trailer
2024-06-25 16:39:45 +02:00
Ivan Nardi
26cc1f131f
fuzz: improve fuzzing coverage (#2474)
Remove some code never triggered

AFP: the removed check is included in the following one
MQTT: fix flags extraction
2024-06-17 13:45:47 +02:00
Nardi Ivan
526cf6f291 Zoom: remove "stun_zoom" LRU cache
Since 070a0908b we are able to detect P2P calls directly from the packet
content, without any correlation among flows
2024-06-17 10:19:55 +02:00
Ivan Nardi
b90d39c4ac
RTP/STUN: look for STUN packets after RTP/RTCP classification (#2465)
After a flow has been classified as RTP or RTCP, nDPI might analyse more
packets to look for STUN/DTLS packets, i.e. to try to tell if this flow
is a "pure" RTP/RTCP flow or if the RTP/RTCP packets are multiplexed with
STUN/DTLS.
Useful for proper (sub)classification when the beginning of the flows
are not captured or if there are lost packets in the the captured traffic.

Disabled by default
2024-06-07 13:12:04 +02:00
Ivan Nardi
070a0908b3
Zoom: faster detection of P2P flows (#2467) 2024-06-07 09:50:41 +02:00
Ivan Nardi
399be12585
Small fixes after API cleanup done in c63446e59 (#2449) 2024-05-20 19:06:24 +02:00
Ivan Nardi
7c6910d9e5
Fix/improve fuzzing (#2426) 2024-05-08 11:46:02 +02:00
Ivan Nardi
95fe21015d
Remove "zoom" cache (#2420)
This cache was added in b6b4967aa, when there was no real Zoom support.
With 63f349319, a proper identification of multimedia stream has been
added, making this cache quite useless: any improvements on Zoom
classification should be properly done in Zoom dissector.

Tested for some months with a few 10Gbits links of residential traffic: the
cache pretty much never returned a valid hit.
2024-05-06 12:51:45 +02:00
Ivan Nardi
a62679952c
fuzz: fix shoco fuzzer (#2405)
```
==22779==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f0900701020 at pc 0x555bcd2a6f02 bp 0x7ffe3ba5e790 sp 0x7ffe3ba5e788
READ of size 1 at 0x7f0900701020 thread T0
    #0 0x555bcd2a6f01 in shoco_decompress /home/ivan/svnrepos/nDPI/src/lib/third_party/src/shoco.c:184:26
    #1 0x555bcd2a4018 in LLVMFuzzerTestOneInput /home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco.cpp:18:3
    #2 0x555bcd1aa816 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco+0x4f816) (BuildId: c54d1c32163c9937e06f62127348ae6bd26d9309)
    #3 0x555bcd193be8 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco+0x38be8) (BuildId: c54d1c32163c9937e06f62127348ae6bd26d9309)
    #4 0x555bcd1996fa in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco+0x3e6fa) (BuildId: c54d1c32163c9937e06f62127348ae6bd26d9309)
    #5 0x555bcd1c3c92 in main (/home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco+0x68c92) (BuildId: c54d1c32163c9937e06f62127348ae6bd26d9309)
    #6 0x7f090257a082 in __libc_start_main /build/glibc-e2p3jK/glibc-2.31/csu/../csu/libc-start.c:308:16
    #7 0x555bcd18e96d in _start (/home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco+0x3396d) (BuildId: c54d1c32163c9937e06f62127348ae6bd26d9309)

Address 0x7f0900701020 is located in stack of thread T0 at offset 4128 in frame
    #0 0x555bcd2a3d97 in LLVMFuzzerTestOneInput /home/ivan/svnrepos/nDPI/fuzz/fuzz_alg_shoco.cpp:5

  This frame has 4 object(s):
    [32, 4128) 'out' (line 9) <== Memory access at offset 4128 overflows this variable
    [4256, 8352) 'orig' (line 9)
```

Found by oss-fuzzer.
See: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=68211
2024-04-22 18:39:23 +02:00
Ivan Nardi
ef89183469
fuzz: improvements (#2400)
Create the zip file with all the traces only once.

Add a new fuzzer to test "shoco" compression algorithm
2024-04-20 18:15:23 +02:00
Luca Deri
ad117bfaab
Domain Classification Improvements (#2396)
* Added
size_t ndpi_compress_str(const char * in, size_t len, char * out, size_t bufsize);
size_t ndpi_decompress_str(const char * in, size_t len, char * out, size_t bufsize);

used to compress short strings such as domain names. This code is based on
https://github.com/Ed-von-Schleck/shoco

* Major code rewrite for ndpi_hash and ndpi_domain_classify

* Improvements to make sure custom categories are loaded and enabled

* Fixed string encoding

* Extended SalesForce/Cloudflare domains list
2024-04-18 23:21:40 +02:00
Ivan Nardi
0535e54484
STUN: fix boundary checks on attribute list parsing (#2387)
Restore all unit tests.
Add some configuration knobs.
Fix the endianess.
2024-04-12 22:55:51 +02:00
Ivan Nardi
1b3ef7d7b2
STUN: improve extraction of Mapped-Address metadata (#2370)
Enable parsing of Mapped-Address attribute for all STUN flows: that
means that STUN classification might require more packets.

Add a configuration knob to enable/disable this feature.

Note that we can have (any) STUN metadata also for flows *not*
classified as STUN (because of DTLS).

Add support for ipv6.

Restore the correct extra dissection logic for Telegram flows.
2024-04-08 10:24:51 +02:00
Ivan Nardi
700637a162
fuzzing: extend fuzzing coverage (#2371) 2024-04-05 21:02:54 +02:00
Toni
41eef9246c
Disable -Wno-unused-parameter -Wno-unused-function. (#2358)
* unused parameters and functions pollute the code and decrease readability

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-04-03 14:10:21 +02:00
Luca Deri
51f5fc7140
Added support for roaring bitmap v3 (#2355)
* Integrated RoaringBitmap v3

* Renamed ndpi_bitmap64 ro ndpi_bitmap64_fuse

* Fixes to ndpi_bitmap for new roaring library

* Fixes for bitmap serialization

* Fixed format

* Warning fix

* Conversion fix

* Warning fix

* Added check for roaring v3 support

* Updated file name

* Updated path

* Uses clang-9 (instead of clang-7) for builds

* Fixed fuzz_ds_bitmap64_fuse

* Fixes nDPI printf handling

* Disabled printf

* Yet another printf fix

* Cleaup

* Fx for compiling on older platforms

* Fixes for old compilers

* Initialization changes

* Added compiler check

* Fixes for old compilers

* Inline function is not static inline

* Added missing include
2024-03-25 08:15:19 +01:00
Ivan Nardi
6152d595e8
STUN: add a parameter to configure how long the extra dissection lasts (#2336)
Tradeoff: performance (i.e. number of packets) vs sub-classification
2024-03-07 14:39:32 +01:00
Ivan Nardi
03ecb026ff
fuzz: improve fuzzing coverage (#2309) 2024-02-09 19:19:03 +01:00
Ivan Nardi
400cd516b5
Allow multiple struct ndpi_detection_module_struct to share some state (#2271)
Add the concept of "global context".

Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent from each other. This provide optimal performances in
multithreaded environment, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.

Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> Peer B) is PROTOCOL_X; then
  flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in the far
most common scenario where there is only one local context, but it might
be impractical in some more complex scenarios.

Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read/write from
different local contexts.
This is an optional feature, disabled by default.

Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.

Right now, only the LRU caches can be shared; the other main structures
(trees and automas) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.

Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.

It's up to the user to find the right trade-off between performances
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.

Add some basic examples of library initialization in
`doc/library_initialization.md`.

This code needs libpthread as external dependency. It shouldn't be a big
issue; however a configure flag has been added to disable global context
support. A new CI job has been added to test it.

TODO: we should need to find a proper way to add some tests on
multithreaded enviroment... not an easy task...

*** API changes ***

If you are not interested in this feature, simply add a NULL parameter to
any `ndpi_init_detection_module()` calls.
2024-02-01 15:33:11 +01:00
Toni
44c2e59661
Provide a u64 wrapper for ndpi_set_config() (#2292)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-01-30 17:13:35 +01:00
Ivan Nardi
92c2ac5a0f
fuzz: fuzz_config: try restoring good coverage (#2291)
Last changes reduce fuzzing coverage of this fuzzer :(
2024-01-29 10:53:28 +01:00
Ivan Nardi
b3efa7d3fe
fuzz: fuzz_config: we need bigegr inputs (#2285) 2024-01-25 09:58:20 +01:00
Ivan Nardi
d577508727
fuzz: extend fuzzing coverage (#2281) 2024-01-24 21:16:58 +01:00
Ivan Nardi
9b26e74bb7
example: rework code between ndpiReader.c and reader_util.c (#2273) 2024-01-22 18:12:06 +01:00
Ivan Nardi
42d23cff6a
config: follow-up (#2268)
Some changes in the parameters names.
Add a fuzzer to fuzz the configuration file format.
Add the infrastructure to configuratin callbacks.
Add an helper to map LRU cache indexes to names.
2024-01-20 16:14:41 +01:00
Nardi Ivan
0712d496fe config: allow configuration of guessing algorithms 2024-01-18 10:21:24 +01:00
Nardi Ivan
6c85f10cd5 config: move debug/log configuration to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
c704be1a20 config: DNS: add two configuration options
* Enable/disable sub-classification of DNS flows
* Enable/disable processing of DNS responses
2024-01-18 10:21:24 +01:00
Nardi Ivan
950f209a17 config: HTTP: enable/disable processing of HTTP responses 2024-01-18 10:21:24 +01:00
Nardi Ivan
c669044a44 config: configure TLS certificate expiration with the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
88720331ae config: remove enum ndpi_prefs 2024-01-18 10:21:24 +01:00
Nardi Ivan
1289951b32 config: remove ndpi_set_detection_preferences() 2024-01-18 10:21:24 +01:00
Nardi Ivan
311d8b6dae config: move cfg of aggressiviness and opportunistic TLS to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
4cbe2674ab config: move IP lists configurations to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
f55358973f config: move LRU cache configurations to the new API 2024-01-18 10:21:24 +01:00
Nardi Ivan
d72a760ac3 New API for library configuration
This is the first step into providing (more) configuration options in nDPI.

The idea is to have a simple way to configure (most of) nDPI: only one
function (`ndpi_set_config()`) to set any configuration parameters
(in the present or on in the future) and we try to keep this function
prototype as agnostic as possible.

You can configure the library:
* via API, using `ndpi_set_config()`
* via a configuration file, in a text format

This way, anytime we need to add a new configuration parameter:
* we don't need to add two public functions (a getter and a setter)
* we don't break API/ABI compatibility of the library; even changing
the parameter type (from integer to a list of integer, for example)
doesn't break the compatibility.

The complete list of configuration options is provided in
`doc/configuration_parameters.md`.

As a first example, two configuration knobs are provided:
* the ability to enable/disable the extraction of the sha1 fingerprint of
the TLS certificates.
* the upper limit on the number of packets per flow that will be subject
to inspection
2024-01-18 10:21:24 +01:00
Vladimir Gavrilov
7f9973bd0c
Add HL7 protocol dissector (#2240)
* Add HL7 protocol dissector

* Small fixes

* Small fixes
2024-01-02 20:57:05 +01:00
Vladimir Gavrilov
0180c1f04a
Add IEC62056 (DLMS/COSEM) protocol dissector (#2229)
* Add IEC62056 (DLMS/COSEM) protocol dissector

* Fix detection on big endian architectures

* Update protocols.rst

* Add ndpi_crc16_x25 to fuzz/fuzz_alg_crc32_md5.c

* Update pcap sample

* Remove empty .out file

* iec62056: add some documentation

---------

Co-authored-by: Nardi Ivan <nardi.ivan@gmail.com>
2024-01-02 16:45:54 +01:00
Ivan Nardi
3c7ed34ce9
fuzz: improve fuzzing coverage (#2239) 2024-01-02 15:22:44 +01:00
Vladimir Gavrilov
6fc8aa4e61
Add WebDAV detection support (#2224)
* Add WebDAV detection support

* Add pcap example

* Update test results

* Remove redundant checks

* Add WebDAV related HTTP methods to fuzz/dictionary.dict

* Add note about WebDAV
2023-12-22 13:23:37 +01:00