Commit graph

4969 commits

Author SHA1 Message Date
Petr
c35a5ca087
shell: reformatted, fixed inspections, typos (#2506)
Reformatted shell scripts according to [ShellCheck](https://github.com/koalaman/shellcheck/).

I. Most common changes:
1. https://github.com/koalaman/shellcheck/wiki/SC2086
	`$var` → `"$var"`
	Note: this isn't always necessary and I've been careful not to substitute where it wasn't necessary in meaning.
2. https://github.com/koalaman/shellcheck/wiki/SC2006
	`` `command` `` → `$(command)`
3. https://github.com/koalaman/shellcheck/wiki/SC2004
	`$(( $a + $b ))` → `$(( a + b ))`
4. https://github.com/koalaman/shellcheck/wiki/SC2164
	`cd "$dir"` → `cd "$dir" || exit 1`
5. https://github.com/koalaman/shellcheck/wiki/SC2166
	`[ check1 -o check2 ]` → `[ check1 ] || [ check2 ]`
6. https://github.com/koalaman/shellcheck/wiki/SC2002
	`cat "${file}" | wc -c` → `< "${file}" wc -c`
	Note: this looks a bit uglier but works faster.

II. Some special changes:
1. In file `utils/common.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2112
	This script is interpreted by `sh`, not by `bash`, but uses the keyword `function`.
	So I replaced `#!/usr/bin/env sh` to `#!/usr/bin/env bash`.
2. After that I thought of replacing all shebangs to `#!/usr/bin/env bash` for consistency and cross-platform compatibility, especially since most of the files already use bash.
3. But in cases when it was `#!/bin/sh -e` or `#!/bin/bash -eu` another problem appears:
	https://github.com/koalaman/shellcheck/wiki/SC2096
	So I decided to make all shebangs look uniform:
	```
	#!/usr/bin/env bash
	set -e (or set -eu) (if needed)
	```
4. In file `tests/ossfuzz.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2162
	`read i` → `read -r i`
	Note: I think that there is no need in special treatment for backslashes, but I could be wrong.
5. In file `tests/do.sh.in`:
	https://github.com/koalaman/shellcheck/wiki/SC2035
	`ls *.*cap*` → `ls -- *.*cap*`
6. In file `utils/verify_dist_tarball.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2268
	`[ "x${TARBALL}" = x ]` → `[ -z "${TARBALL}" ]`
7. In file `utils/check_symbols.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2221
	`'[ndpi_utils.o]'|'[ndpi_memory.o]'|'[roaring.o]')` → `'[ndpi_utils.o]'|'[ndpi_memory.o]')`
8. In file `autogen.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2145
	`echo "./configure $@"` → `echo "./configure $*"`
	https://github.com/koalaman/shellcheck/wiki/SC2068
	`./configure $@` → `./configure "$@"`

III. `LIST6_MERGED` and `LIST_MERGED6`
	There were typos with this variables in files `utils/aws_ip_addresses_download.sh`, `utils/aws_ip_addresses_download.sh` and `utils/microsoft_ip_addresses_download.sh` where variable `LIST6_MERGED` was defined, but `LIST_MERGED6` was removed by `rm`.
	I changed all `LIST_MERGED6` to `LIST6_MERGED`.

Not all changes are absolutely necessary, but some may save you from future bugs.
2024-07-18 17:32:49 +02:00
Ivan Nardi
ca429669ac
smpp: fix parsing of Generic Nack message (#2496) 2024-07-18 17:31:07 +02:00
Vladimir Gavrilov
6a77a891a8
Add Nano (XNO) protocol support (#2508) 2024-07-18 16:18:12 +02:00
Luca
86b67e6687 Added ClickHouse protocol 2024-07-17 14:21:45 +02:00
Petr
0a3a82680d
python: reformatted, fixed bugs (#2504) 2024-07-17 11:00:42 +02:00
Petr
989dde1a40
.gitignore: reformatted, added patterns for IDEs, for deb packages and for test results (#2503) 2024-07-16 17:39:55 +02:00
Vladimir Gavrilov
c3fff52646
Add HLS support (#2502) 2024-07-16 12:01:28 +02:00
Vladimir Gavrilov
996cddbd18
Refactor ndpi_strnstr to use ndpi_memmem (#2500) 2024-07-15 20:43:19 +02:00
Petr
2f66a6a3e1
ndpi_memmem: optimized, fixed bug, added tests (#2499) 2024-07-15 08:35:10 +02:00
Petr
e059daa0f1
Optimize performance of ndpi_strnstr() and possible bugfix (#2494) 2024-07-15 08:34:08 +02:00
Petr
f8e32bc75b
Fixed mistake in shebang (SC1113) (#2498) 2024-07-15 07:21:03 +02:00
Ivan Nardi
c3ba65311e
fuzzing: improve coverage (#2495)
Fix detection of WebDAV and Gnutella (over HTTP)
Fix detection of z3950

Add two fuzzers to test `ndpi_memmem()` and `ndpi_strnstr()`

Remove some dead code:
* RTP: the same exact check is performed at the very beginning of the
function
* MQTT: use a better helper to exclude the protocol
* Colletd: `ndpi_hostname_sni_set()` never fails

Update pl7m code (fix a Use-of-uninitialized-value error)
2024-07-12 14:22:25 +02:00
Ivan Nardi
456f0fd427
Improve detection of Cloudflare WARP traffic (#2491)
See: #2484
2024-07-04 08:59:04 +02:00
Ivan Nardi
843e487270
Add infrastructure for explicit support of Fist Packet Classification (#2488)
Let's start with some basic helpers and with FPC based on flow addresses.

See: #2322
2024-07-03 18:02:07 +02:00
Ivan Nardi
e5661337d0
Minor fix in CI action (#2489) 2024-07-03 16:17:21 +02:00
Ivan Nardi
1e411e98fa
Reduce snaplen of some traces (#2490)
To avoid the following error with some old libpcap versions:
```
ERROR: could not open pcap file: invalid file capture length 524288, bigger than maximum of 262144
```
2024-07-03 16:17:08 +02:00
Ivan Nardi
d42f0e6ab3
Add detection of Twitter bot (#2487)
Update the global list of crawlers ips
2024-07-03 16:16:54 +02:00
Ivan Nardi
dab8d3056e
Make the CI faster (#2475)
Without the `-fsanitize-memory-track-origins` flag, MSAN job is ~30%
faster. Since this flag is useful only while debugging (and not to
simply discover memory issues), avoid it on the CI. Note that, by
default it is still enabled by default.

Right now, MingW runs on *every* ubuntu builds: limit it only to the
standard matrix (i.e. ubuntu 20.04, 22.04, 24.04 with default
configuration), without any sanitizers (note that MingW doesn't support
*san anyway).

armhf job is by far the longest job in the CI: remove asan configuration
to make it faster. Note that we already have a lot of different jobs (on
x86_64) with some sanitizers, and that the other 2 jobs on arm/s390x don't
have asan support anyway.
If we really, really want a job with arm + asan we can add it as a
async/scheduled job.

Remove an old workaround for ubuntu jobs

Avoid installing packages needed only for the documentation

About `check_symbols.sh` script: even if uses the compiled library/objects,
it basicaly only checks if we are using, in the source code, same functions
that we shoudn't. We don't need to perform the same kind of check so
many times..
2024-07-01 11:55:08 +02:00
Luca Deri
731b75b44c Modified separator from , (comma) to | (pipe) as some fields such as the HTTP user agent as sometimes they contain commas and create parsing problems 2024-07-01 09:53:38 +02:00
Ivan Nardi
fc334d56c4
tunnelbear: improve detection over wireguard (#2485)
See #2484
2024-07-01 08:20:18 +02:00
Ivan Nardi
4f05d21441
Improve detection of Twitter/X (#2482) 2024-07-01 08:20:04 +02:00
Ivan Nardi
2009623762
Add detection of OpenAI ChatGPT bots (#2481) 2024-07-01 08:19:35 +02:00
Ivan Nardi
ebcea42e2b
fuzz: improve fuzzers using pl7m (#2486)
For some unclear reasons, fuzzers using pl7m create huge corpus,
triggering OOM in oss-fuzz runs (where the memory RSS limit is set to
2560Mb). Example:

```
==25340== ERROR: libFuzzer: out-of-memory (used: 2564Mb; limit: 2560Mb)
To change the out-of-memory limit use -rss_limit_mb=<N>

Live Heap Allocations: 2364004039 bytes in 133791 chunks; quarantined: 60662293 bytes in 3664 chunks; 176432 other chunks; total chunks: 313887; showing top 95% (at most 8 unique contexts)
1285841683 byte(s) (54%) in 2956 allocation(s)
    #0 0x56f814ef4bde in __interceptor_malloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x56f814e04416 in operator new(unsigned long) cxa_noexception.cpp:0
    #2 0x56f814de6b2d in assign<unsigned char *, 0> /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1443:3
    #3 0x56f814de6b2d in operator= /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1412:9
    #4 0x56f814de6b2d in fuzzer::InputCorpus::AddToCorpus(std::__Fuzzer::vector<unsigned char, std::__Fuzzer::allocator<unsigned char>> const&, unsigned long, bool, bool, bool, std::__Fuzzer::chrono::duration<long long, std::__Fuzzer::ratio<1l, 1000000l>>, std::__Fuzzer::vector<unsigned int, std::__Fuzzer::allocator<unsigned int>> const&, fuzzer::DataFlowTrace const&, fuzzer::InputInfo const*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerCorpus.h:221:10
    #5 0x56f814de60e5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:539:16
    #6 0x56f814de7df2 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:829:7
    #7 0x56f814de8127 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:867:3
    #8 0x56f814dd6736 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:914:6
    #9 0x56f814e02c62 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #10 0x7fa11e2c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16

1031350683 byte(s) (43%) in 2468 allocation(s)
    #0 0x56f814ef4bde in __interceptor_malloc /src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
    #1 0x56f814e04416 in operator new(unsigned long) cxa_noexception.cpp:0
    #2 0x56f814de6b2d in assign<unsigned char *, 0> /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1443:3
    #3 0x56f814de6b2d in operator= /work/llvm-stage2/runtimes/runtimes-bins/compiler-rt/lib/fuzzer/libcxx_fuzzer_x86_64/include/c++/v1/vector:1412:9
    #4 0x56f814de6b2d in fuzzer::InputCorpus::AddToCorpus(std::__Fuzzer::vector<unsigned char, std::__Fuzzer::allocator<unsigned char>> const&, unsigned long, bool, bool, bool, std::__Fuzzer::chrono::duration<long long, std::__Fuzzer::ratio<1l, 1000000l>>, std::__Fuzzer::vector<unsigned int, std::__Fuzzer::allocator<unsigned int>> const&, fuzzer::DataFlowTrace const&, fuzzer::InputInfo const*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerCorpus.h:221:10
    #5 0x56f814de60e5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:539:16
    #6 0x56f814de7635 in fuzzer::Fuzzer::MutateAndTestOne() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:760:19
    #7 0x56f814de8425 in fuzzer::Fuzzer::Loop(std::__Fuzzer::vector<fuzzer::SizedFile, std::__Fuzzer::allocator<fuzzer::SizedFile>>&) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:905:5
    #8 0x56f814dd6736 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:914:6
    #9 0x56f814e02c62 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #10 0x7fa11e2c3082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/libc-start.c:308:16
```

See: https://oss-fuzz.com/testcase-detail/4717811415449600
See: https://oss-fuzz.com/testcase-detail/6164130982068224

Let's *try* the following workaround: set the parameter `-max-len` to
512K, to force the engine to not genereate inputs (i.e. pcap files...)
larger than 512K. Right now the value used is 1MB, i.e the default,
because we have file larger than 1MB in the initial seeds (i.e.
`/tests/pcaps/*`).

Let's hope than having smaller files lead to smaller corpus...

Update pl7m code (fix a Use-of-uninitialized-value error)
2024-06-29 18:01:05 +02:00
Ivan Nardi
83e6e753af
fuzz: pl7m: add a custom mutator for better fuzzing of pcap files (#2483)
Pl7m is a custom mutator (used for structure aware fuzzing) for network
traffic packet captures (i.e. pcap files).

The output of the mutator is always a valid pcap file, containing the
same flows/sessions of the input file. That's it: the mutator only
changes the packet payload after the TCP/UDP header, keeping all the
original L2/L3 information (IP addresses and L4 ports).

See: https://github.com/IvanNardi/pl7m
2024-06-27 18:07:43 +02:00
Anshul Thakur
8f6f73505d
Remove duplicate API for ndpi_serialize_uint32_double in ndpi_api.h (#2480)
Co-authored-by: Anshul Thakur <athakur@cdot.in>
2024-06-26 14:10:57 +02:00
Nardi Ivan
556f892a56 wireshark: lua: export some metadata
Export some metadata (for the moment, SNI and TLS fingerprints) to
Wireshark/tshark via extcap.
Note that:
* metadata are exported only once per flow
* metadata are exported (all together) when nDPI stopped processing
the flow

Still room for a lot of improvements!
In particular:
* we need to add some boundary checks (if we are going to export other
attributes)
* we should try to have a variable length trailer
2024-06-25 16:39:45 +02:00
Nardi Ivan
f44832cc51 wireshark: lua: filtering and trailer dissection work with tshark, too
```
ivan@ivan-Latitude-E6540:~/$ tshark -C "nDPI extcap" -i ndpi -o extcap.ndpi.i:/home/ivan/svnrepos/nDPI/tests/pcap/anydesk.pcapng -Y "ndpi.protocol.name contains DNS"
Capturing on 'nDPI interface: ndpi'
   62 22635386.425683 192.168.1.187 DNS.AnyDesk 192.168.1.1  128   Standard query 0xec22 A relay-3185a847.net.anydesk.com
   63 22635386.439540  192.168.1.1 DNS.AnyDesk 192.168.1.187 144   Standard query response 0xec22 A relay-3185a847.net.anydesk.com A 37.61.223.15
   64 22635386.721277 192.168.1.187 DNS.AnyDesk 192.168.1.1  128   Standard query 0xea89 A relay-9b6827f2.net.anydesk.com
   65 22635386.732444  192.168.1.1 DNS.AnyDesk 192.168.1.187 144   Standard query response 0xea89 A relay-9b6827f2.net.anydesk.com A 138.199.36.115
4 packets captured

```
2024-06-25 16:39:45 +02:00
Nardi Ivan
3f0ea18866 wireshark: lua: fix DNS dissection
Not sure when we (or Wireshark, or Lua...) broke it, but we can't call
tonumber() on Bool variables.
2024-06-25 16:39:45 +02:00
Nardi Ivan
b5afa165f0 wireshark: extcap: restore filtering mechanism 2024-06-25 16:39:45 +02:00
Nardi Ivan
2daab3f248 wireshark: lua: latest Wireshark versions correctly handle 64 bit mask 2024-06-25 16:39:45 +02:00
Nardi Ivan
46e450456b wireshark: lua: minor improvements
* Use a proper TVB to parse the nDPI trailer
* Fix some flow risks definitions
2024-06-25 16:39:45 +02:00
Luca Deri
e92568e4bb Improved logic for checking invalid DNS queries 2024-06-23 22:27:19 +02:00
Mark Jeffery
aa1d7247d1
Added default port mappings to ndpiReader help -H (#2477)
Close #2125
2024-06-19 13:47:18 +02:00
Ivan Nardi
9feede9a14
Zoom: fix detection of screen sharing (#2476) 2024-06-17 17:30:24 +02:00
Ivan Nardi
26cc1f131f
fuzz: improve fuzzing coverage (#2474)
Remove some code never triggered

AFP: the removed check is included in the following one
MQTT: fix flags extraction
2024-06-17 13:45:47 +02:00
Nardi Ivan
a35fae6b75 Sync unit tests results 2024-06-17 11:23:39 +02:00
Toni
8fd649ab1e
Add Ripe Atlas probe protocol. (#2473)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-06-17 11:00:38 +02:00
Nardi Ivan
bbe52da5cf Zoom: harden RTP/RTCP detection 2024-06-17 10:19:55 +02:00
Nardi Ivan
526cf6f291 Zoom: remove "stun_zoom" LRU cache
Since 070a0908b we are able to detect P2P calls directly from the packet
content, without any correlation among flows
2024-06-17 10:19:55 +02:00
Ivan Nardi
2bedd14aae
Update main Readme and remove commented out code (#2471)
Avoid that useless code is copy-and-pasted into new dissector, as in
https://github.com/ntop/nDPI/pull/2470#discussion_r1639384434
2024-06-17 10:11:17 +02:00
Mark Jeffery
f796c94375
Added protocol - JRMI - Java Remote Method Invocation (#2470) 2024-06-15 10:52:28 +02:00
Luca
7c7c375b45 Improved detection of Android connectiity checks 2024-06-12 10:55:07 +02:00
Mark Jeffery
312dc424bd
Added NDPI_PROTOCOL_NTOP assert and removed percentage comparison (#2460)
Close #2413
2024-06-10 19:45:19 +02:00
Ivan Nardi
aee2c81f76
Zoom: fix integer overflow (#2469)
```
AddressSanitizer:DEADLYSIGNAL
=================================================================
==29508==ERROR: AddressSanitizer: SEGV on unknown address 0x50710145d51d (pc 0x55cb788f25fe bp 0x7ffcfefa15f0 sp 0x7ffcfefa1240 T0)
==29508==The signal is caused by a READ memory access.
    #0 0x55cb788f25fe in ndpi_search_zoom /home/ivan/svnrepos/nDPI/src/lib/protocols/zoom.c:210:24
    #1 0x55cb787e9418 in check_ndpi_detection_func /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:7174:6
    #2 0x55cb7883f753 in check_ndpi_udp_flow_func /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:7209:10
    #3 0x55cb7883bc9d in ndpi_check_flow_func /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:7240:12
```
Found by oss-fuzzer
See: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=69520
2024-06-10 11:06:17 +02:00
Ivan Nardi
b90d39c4ac
RTP/STUN: look for STUN packets after RTP/RTCP classification (#2465)
After a flow has been classified as RTP or RTCP, nDPI might analyse more
packets to look for STUN/DTLS packets, i.e. to try to tell if this flow
is a "pure" RTP/RTCP flow or if the RTP/RTCP packets are multiplexed with
STUN/DTLS.
Useful for proper (sub)classification when the beginning of the flows
are not captured or if there are lost packets in the the captured traffic.

Disabled by default
2024-06-07 13:12:04 +02:00
Ivan Nardi
070a0908b3
Zoom: faster detection of P2P flows (#2467) 2024-06-07 09:50:41 +02:00
Ivan Nardi
619005c5b2
STUN: add support for Microsoft Multiplexed TURN channels (#2464) 2024-06-05 16:55:58 +02:00
Ivan Nardi
b6beeef623
TLS: add support for DTLS (over STUN) over TCP (#2463)
TODO: TCP reassembler on top of UDP reassembler

See: #2414
2024-06-05 13:49:34 +02:00
Ivan Nardi
6a9d4b1ab6
CI: more parallel work (#2459) 2024-06-05 11:56:03 +02:00
Ivan Nardi
7be482f5b1
Update unit tests results (#2466) 2024-06-05 11:55:30 +02:00