Commit graph

73 commits

Author SHA1 Message Date
Luca Deri
1c4ae434ab Improved statistics 2024-10-16 23:55:21 +02:00
Luca Deri
97ce729392 Parser for ndpiReader JSON files 2024-10-15 22:25:02 +02:00
Nardi Ivan
5b0374c28b Add detection of SurfShark VPN 2024-09-05 16:36:50 +02:00
Nardi Ivan
f350379e95 Add detection of NordVPN 2024-09-05 16:36:50 +02:00
Ivan Nardi
7558bcd39f
Fix url for downloading X/Twitter crawler IPs (#2526) 2024-08-22 15:35:33 +02:00
Petr
2a3f4dc8b4
Performed some grammar and typo fixes (#2511) 2024-07-19 11:22:35 +02:00
Petr
be0b2c2d90
ipaddr2list.py, ndpi2timeline.py: reformatted (#2509) 2024-07-18 20:35:47 +02:00
Petr
c35a5ca087
shell: reformatted, fixed inspections, typos (#2506)
Reformatted shell scripts according to [ShellCheck](https://github.com/koalaman/shellcheck/).

I. Most common changes:
1. https://github.com/koalaman/shellcheck/wiki/SC2086
	`$var` → `"$var"`
	Note: this isn't always necessary, and I was careful not to substitute where the meaning didn't require it.
2. https://github.com/koalaman/shellcheck/wiki/SC2006
	`` `command` `` → `$(command)`
3. https://github.com/koalaman/shellcheck/wiki/SC2004
	`$(( $a + $b ))` → `$(( a + b ))`
4. https://github.com/koalaman/shellcheck/wiki/SC2164
	`cd "$dir"` → `cd "$dir" || exit 1`
5. https://github.com/koalaman/shellcheck/wiki/SC2166
	`[ check1 -o check2 ]` → `[ check1 ] || [ check2 ]`
6. https://github.com/koalaman/shellcheck/wiki/SC2002
	`cat "${file}" | wc -c` → `< "${file}" wc -c`
	Note: this looks a bit uglier, but it avoids an extra `cat` process and so runs slightly faster.

II. Some special changes:
1. In file `utils/common.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2112
	This script is interpreted by `sh`, not by `bash`, but uses the keyword `function`.
	So I replaced `#!/usr/bin/env sh` with `#!/usr/bin/env bash`.
2. After that I thought of replacing all shebangs with `#!/usr/bin/env bash` for consistency and cross-platform compatibility, especially since most of the files already use bash.
3. But in cases when it was `#!/bin/sh -e` or `#!/bin/bash -eu` another problem appears:
	https://github.com/koalaman/shellcheck/wiki/SC2096
	So I decided to make all shebangs look uniform:
	```
	#!/usr/bin/env bash
	set -e (or set -eu) (if needed)
	```
4. In file `tests/ossfuzz.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2162
	`read i` → `read -r i`
	Note: I think that there is no need for special treatment of backslashes, but I could be wrong.
5. In file `tests/do.sh.in`:
	https://github.com/koalaman/shellcheck/wiki/SC2035
	`ls *.*cap*` → `ls -- *.*cap*`
6. In file `utils/verify_dist_tarball.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2268
	`[ "x${TARBALL}" = x ]` → `[ -z "${TARBALL}" ]`
7. In file `utils/check_symbols.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2221
	`'[ndpi_utils.o]'|'[ndpi_memory.o]'|'[roaring.o]')` → `'[ndpi_utils.o]'|'[ndpi_memory.o]')`
8. In file `autogen.sh`:
	https://github.com/koalaman/shellcheck/wiki/SC2145
	`echo "./configure $@"` → `echo "./configure $*"`
	https://github.com/koalaman/shellcheck/wiki/SC2068
	`./configure $@` → `./configure "$@"`

III. `LIST6_MERGED` and `LIST_MERGED6`
	There were typos in these variables in files `utils/aws_ip_addresses_download.sh`, `utils/aws_ip_addresses_download.sh` and `utils/microsoft_ip_addresses_download.sh`: the variable `LIST6_MERGED` was defined, but `LIST_MERGED6` was removed by `rm`.
	I changed all `LIST_MERGED6` to `LIST6_MERGED`.

Not all changes are absolutely necessary, but some may save you from future bugs.
2024-07-18 17:32:49 +02:00
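The common ShellCheck fixes listed in c35a5ca087 (#2506) can be sketched in one small script. This is a minimal illustration under assumptions, not code from the nDPI repository: the helper names (`file_size`, `sum_sizes`, `is_blank_or_comment`, `in_dir`, `run_logged`) are hypothetical and exist only to show each rewritten form in context.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helpers for illustration only; none of them exist in nDPI.

# SC2086 + SC2002: quote the expansion and redirect the file into wc
# instead of piping it through cat.
file_size() {
    < "${1}" wc -c | tr -d ' '
}

# SC2006 + SC2004: $(...) instead of backticks, no `$` inside $(( )).
sum_sizes() {
    local a b
    a=$(file_size "$1")
    b=$(file_size "$2")
    echo $(( a + b ))
}

# SC2166: join two tests with || rather than the -o operator.
# The second test is true when stripping a leading '#' changes the string.
is_blank_or_comment() {
    [ -z "$1" ] || [ "${1#\#}" != "$1" ]
}

# SC2164: give up if cd fails; the ( ) body runs in a subshell,
# so the caller's working directory is left untouched.
in_dir() (
    cd "$1" || exit 1
    pwd
)

# SC2145 + SC2068: "$*" builds one printable string,
# "$@" forwards every argument unchanged.
run_logged() {
    echo "running: $*" >&2
    "$@"
}
```

Each comment names the ShellCheck code that the corresponding pre-fix form would trigger. For example, `run_logged printf '%s|' "a b" c` passes `a b` through as a single argument.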
Petr
0a3a82680d
python: reformatted, fixed bugs (#2504) 2024-07-17 11:00:42 +02:00
Petr
f8e32bc75b
Fixed mistake in shebang (SC1113) (#2498) 2024-07-15 07:21:03 +02:00
Ivan Nardi
d42f0e6ab3
Add detection of Twitter bot (#2487)
Update the global list of crawlers ips
2024-07-03 16:16:54 +02:00
Toni
3639d2045b
Remove unused code. (#2450)
* some `#ifdef`ed code dates back to 2019, 2020 and 2021
 * some function signatures were still present in `ndpi_main.h`
   which may cause linker errors for libnDPI dependents
 * return an error while trying to serialize a double instead
   of `fprintf(stderr, ...)`

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-05-21 13:08:07 +02:00
Nardi Ivan
478c31b6c9 utils: update script to download Cloudflare ips 2024-02-26 09:26:21 +01:00
Ivan Nardi
12e142565e
Add a script to download/update the domain suffix list (#2321) 2024-02-20 11:51:58 +01:00
Toni
f4d7aa45fe
Improved Polish gambling sites fetch script. (#2315)
* fails quite often in the CI, so ignore potential xmllint error

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2024-02-10 13:33:56 +01:00
Ivan Nardi
dd8be1fcb1
Fix some warnings reported by CODESonar (#2227)
Remove some unreached/duplicated code.

Add error checking for `atoi()` calls.

About `isdigit()` and similar functions. The warning reported is:
```
Negative Character Value help
isdigit() is invoked here with an argument of signed type char, but only
has defined behavior for int arguments that are either representable
as unsigned char or equal to the value of macro EOF(-1).
Casting the argument to unsigned char will avoid the undefined behavior.
In a number of libc implementations, isdigit() is implemented using lookup
tables (arrays): passing in a negative value can result in a read underrun.
```
Switching to our macros fixes that.
Add a check to `check_symbols.sh` to avoid using the original functions
from libc.
2024-01-12 13:30:43 +01:00
Toni
5fb631c8fe
Improved Belgian gambling sites regex. (#2184)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-11-29 16:15:23 +01:00
Ivan Nardi
55664392a9
Ip address list: aggregate Mullvad and Tor lists too (#2154)
Missing from bdb73db1a
See #2150
2023-11-21 08:28:32 +01:00
Ivan Nardi
bdb73db1a4
IP lists: aggregate addresses wherever possible (#2152)
See #2150
2023-11-17 12:26:23 +01:00
Ivan Nardi
6c9571d9a9
Remove duplicate address list (#2151)
We are loading the same AS list as GOTO
See #2150
2023-11-16 20:08:06 +01:00
Toni
6dcecd73d3
Added malicious sites from the Polish CERT. (#2121)
* added handling of parsing errors

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-11-02 09:04:04 +01:00
Ivan Nardi
e8e4b9e8ff
IPv6: add support for IPv6 risk tree (#2118)
Fix the script to download crawler addresses
2023-10-27 13:58:15 +02:00
Ivan Nardi
611c3b66f0
ipv6: add support for ipv6 addresses lists (#2113) 2023-10-26 20:15:44 +02:00
Luca Deri
1832d247b3 Tool for creating Bitcoin IP files 2023-10-26 00:01:44 +02:00
Maatuq
4a8e7105b2
add ethereum protocol dissector. (#2111)
As explained for Bitcoin at https://www.ntop.org/guides/nDPI/protocols.html#ndpi-protocol-bitcoin,
the same applies to Ethereum.
Ethereum detection was removed from the mining protocol and is now handled separately.

Signed-off-by: Mahmoud Maatuq <mahmoudmatook.mm@gmail.com>
2023-10-25 12:44:33 +02:00
Toni
ef3adb9830
Added printf/fprintf replacement for some internal modules. (#1974)
* logging is instead redirected to `ndpi_debug_printf`

Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-09-26 23:10:57 +02:00
Luca Deri
36abf06c6f Swap from Aho-Corasick to an experimental/home-grown algorithm that uses a probabilistic
approach for handling Internet domain names.

For switching back to Aho-Corasick it is necessary to edit
ndpi-typedefs.h and uncomment the line
// #define USE_LEGACY_AHO_CORASICK

[1] With Aho-Corasick
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.34 KB
Flow Memory (per flow):  960 B
Actual Memory:           33.09 MB
Peak Memory:             33.09 MB

[2] With the new algorithm
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.31 KB
Flow Memory (per flow):  960 B
Actual Memory:           7.42 MB
Peak Memory:             7.42 MB

In essence, memory usage drops from ~33 MB to ~7 MB.

This new algorithm will enable larger lists to be loaded (e.g. top 1M domains
https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)

Files in ./lists are named <category>_<string>.list.
With -G, ndpiReader can load all of them at startup.
2023-08-29 17:34:04 +02:00
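The saving reported in commit 36abf06c6f can be worked out directly from the two `Actual Memory` figures (33.09 MB and 7.42 MB). A small sketch: `memory_savings` is a hypothetical helper, not part of nDPI, and `awk` does the math because shell arithmetic is integer-only.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the percentage reduction and the shrink
# factor between a "before" and an "after" size given in the same unit.
memory_savings() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%.1f%% smaller, %.2fx reduction\n",
               (before - after) / before * 100, before / after
    }'
}

# Figures taken from the ndpiReader output quoted in the commit message.
memory_savings 33.09 7.42   # prints: 77.6% smaller, 4.46x reduction
```

So the "~33 MB to ~7 MB" summary corresponds to roughly a 4.5x reduction for this particular pcap and list set.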
Luca Deri
eeeee46b1e Changes for supporting more efficient sub-string matching 2023-08-26 17:55:50 +02:00
snicket2100
1fbe8a2385
Mullvad VPN service added (based on entry node IP addresses) (#2062) 2023-08-02 19:44:16 +02:00
Ivan Nardi
bc91192aca
ProtonVPN: split the ip list (#2060)
Use two separate lists:
* one for the ingress nodes, which triggers a ProtonVPN classification
* one for the egress nodes, which triggers the
`NDPI_ANONYMOUS_SUBSCRIBER` risk

Add a command line option (to `ndpiReader`) to easily test IP/port
matching.

Add another example of custom rule.
2023-07-27 09:05:22 +02:00
Ivan Nardi
fa0bd515b5
Add detection of Roblox games (#2054) 2023-07-21 03:39:40 +02:00
Ivan Nardi
09548bb7cf
tests: restore some old paths as symbolic links (#2050) 2023-07-16 13:47:35 +02:00
snicket2100
abee1a2a6f
Included Gambling website data from the Polish hazard.mf.gov.pl list (#2041)
* Refreshed the Belgium Gambling Site list data

Unfortunately some hostnames have been removed from that list,
which means they are disappearing from the `ndpi_gambling_match.c.inc`
file as well.

* build: added `libxml2-utils` (for `xmllint`)

* Included Gambling website data from the Polish `hazard.mf.gov.pl` list

The list contains over 30k gambling website hostnames as of today.
2023-07-14 09:55:46 +02:00
Luca Deri
be8178fc8d Refreshed ASN lists
Enhanced the Line IP list with https://ipinfo.io/AS23576/125.209.252.0/24, which is used by Line
2023-06-13 19:04:08 +02:00
Ivan Nardi
3e673e91a9
ProtonVPN: add basic detection (#2006) 2023-06-08 16:52:55 +02:00
Toni
6da3474203
Improved helper scripts. (#1986)
* added additional (more restrictive) checks

Signed-off-by: lns <matzeton@googlemail.com>
2023-05-28 12:45:44 +02:00
Ivan Nardi
b11e6a453b
Add support for Epic Games and GeForceNow/Nvidia (#1990) 2023-05-27 12:13:54 +02:00
Ivan Nardi
63ac50e4f4
Improve detection of Alibaba flows (#1991) 2023-05-27 10:19:58 +02:00
Toni
334b43579e
Fixed invalid use of ndpi_free(). Sorry, my fault. (#1988)
* Fixed invalid use of ndpi_free(). Sorry, my fault.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

* Fine tuned symbol check script.

 * added check for expected syms in modules

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

---------

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-05-24 13:19:06 +02:00
Toni
5e8f93c2d1
Improved missing usage of nDPI's malloc wrapper. Fixes #1978. (#1979)
* added CI check

Signed-off-by: lns <matzeton@googlemail.com>
2023-05-20 16:18:52 +02:00
Toni
c97e2d67ff
Added scripts to auto generate hostname/SNI *.inc files. (#1984)
* add illegal gambling sites (Belgium)

Signed-off-by: lns <matzeton@googlemail.com>
2023-05-20 15:41:15 +02:00
Ivan Nardi
684e041998
Improve detection of crawlers/bots (#1968)
Add support for Facebook crawler
2023-05-09 16:42:29 +02:00
Luca Deri
bfe79243bc Refreshed lists 2023-05-08 16:51:38 +02:00
Ivan Nardi
6b94c9675a
Improve detection of crawler/bot traffic (#1956) 2023-05-04 11:27:34 +02:00
Ivan Nardi
2a3ade397b
DisneyPlus/Hulu ip lists should be auto-generated (#1905)
Remove two stale ip lists:
1) these 3 ips are in the Amazon ranges (now)...
2) the Instagram list originated from AS32934, which is now a Facebook
AS; see https://github.com/ntop/nDPI/pull/1264/commits/8dabd06301a802dd38616ba8684a1d995783e023
2023-03-20 19:26:40 +01:00
0xA50C1A1
ba4e145aad
Add Yandex services detection (#1882)
Add Yandex services detection

Add VK and Yandex to the TLS certificate match list
2023-02-09 20:02:43 +01:00
0xA50C1A1
4bb851384e
Add VK detection (#1880) 2023-02-02 15:27:59 +01:00
sharonenoch
503aac70bc
Line app support (#1759)
* Standard support for LINE app

* Added test pcap for LINE app

* make check result for LINE app

* Make check success as 1kxun has LINE packets

* Added the ASN inc file for LINE

* Removed extra lines as they were affecting make check

* Editing the SNI required a new pcap output file for TLS.Line format

* Run Configure with --with-pcre --with-maxminddb to enable the generation of h323-overflow.pcap.out

Co-authored-by: Sharon Enoch <sharone@amzetta.com>
2022-10-01 12:01:41 +02:00
Toni
ac24b35b1f
Add Discord dissector. (#1694)
* fixed RiotGames false positive

Signed-off-by: lns <matzeton@googlemail.com>
2022-08-03 12:03:36 +02:00
Ivan Nardi
d8d525fff2
Update the protocol bitmask for some protocols (#1675)
TCP retransmissions should be ignored.

Remove some unused protocol bitmasks.

Update script to download WhatsApp IP list.
2022-07-27 11:46:45 +02:00