10 commits

Author SHA1 Message Date
Ivan Nardi
2740a4f4e3
Update all IP lists (#2515)
The `suffix_id` is simply an incremental index (see
`ndpi_load_domain_suffixes`), so its value might change every time we
update the public suffix list.
2024-08-02 15:06:08 +02:00
Petr
2a3f4dc8b4
Performed some grammar and typo fixes (#2511) 2024-07-19 11:22:35 +02:00
Luca
162c38f18f Added new API calls
- ndpi_load_domain_suffixes()
- ndpi_get_host_domain_suffix()

whose goal is to find the domain suffix of a hostname. Example:

www.bbc.co.uk   -> co.uk
mail.apple.com  -> com
2024-01-15 19:03:46 +01:00
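
The mapping in the example above (www.bbc.co.uk -> co.uk) can be read as a longest-suffix match on label boundaries. The following is a minimal sketch of that idea only, not the nDPI implementation: `demo_suffixes` and `demo_host_domain_suffix` are hypothetical names, the three-entry table stands in for a real public suffix list, and returning the table index as an id is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, tiny stand-in for a loaded public suffix list. */
static const char *const demo_suffixes[] = { "com", "uk", "co.uk" };
static const size_t demo_num_suffixes =
  sizeof(demo_suffixes) / sizeof(demo_suffixes[0]);

/*
 * Return the longest entry of demo_suffixes that matches the end of
 * hostname on a label boundary, or NULL if none matches. When suffix_id
 * is not NULL and a match is found, it receives the entry's table index.
 */
const char *demo_host_domain_suffix(const char *hostname, size_t *suffix_id) {
  const char *best = NULL;
  size_t best_len = 0, host_len = strlen(hostname), i;

  for (i = 0; i < demo_num_suffixes; i++) {
    size_t len = strlen(demo_suffixes[i]);

    /* Skip entries that cannot fit or cannot improve on the current match. */
    if (len > host_len || len <= best_len)
      continue;
    /* The entry must match the tail of the hostname... */
    if (strcmp(hostname + host_len - len, demo_suffixes[i]) != 0)
      continue;
    /* ...and must start at the beginning of a label. */
    if (len != host_len && hostname[host_len - len - 1] != '.')
      continue;

    best = demo_suffixes[i];
    best_len = len;
    if (suffix_id != NULL)
      *suffix_id = i;
  }

  return best;
}
```

With this table, "www.bbc.co.uk" resolves to "co.uk" and "mail.apple.com" to "com", while "telecom" matches nothing because "com" does not start on a label boundary. Since the id in this sketch is just a position in the table, it shifts whenever entries are added or reordered.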
Toni
5fb631c8fe
Improved Belgian gambling sites regex. (#2184)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-11-29 16:15:23 +01:00
Toni
6dcecd73d3
Added malicious sites from the Polish CERT. (#2121)
* added handling of parsing errors

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-11-02 09:04:04 +01:00
Ivan Nardi
611c3b66f0
ipv6: add support for ipv6 addresses lists (#2113) 2023-10-26 20:15:44 +02:00
Ivan Nardi
0aa1cf7245
Update all IP lists (#2079) 2023-09-10 12:13:20 +02:00
Luca Deri
3dbc4f65fc Added README file 2023-08-29 17:45:36 +02:00
Luca Deri
36abf06c6f Swap from Aho-Corasick to an experimental/home-grown algorithm that uses a probabilistic
approach for handling Internet domain names.

For switching back to Aho-Corasick it is necessary to edit
ndpi-typedefs.h and uncomment the line
// #define USE_LEGACY_AHO_CORASICK

[1] With Aho-Corasick
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.34 KB
Flow Memory (per flow):  960 B
Actual Memory:           33.09 MB
Peak Memory:             33.09 MB

[2] With the new algorithm
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.31 KB
Flow Memory (per flow):  960 B
Actual Memory:           7.42 MB
Peak Memory:             7.42 MB

In essence from ~33 MB to ~7 MB

This new algorithm will enable larger lists to be loaded (e.g. top 1M domains
https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)

In ./lists there are files named <category>_<string>.list
With -G, ndpiReader can load all of them at startup
2023-08-29 17:34:04 +02:00
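
The <category>_<string>.list naming described above can be split with a few string operations. This is a minimal sketch under stated assumptions, not the code ndpiReader uses: `demo_list_category` is a hypothetical helper, and it assumes the category is everything before the first underscore.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the <category> part of a "<category>_<string>.list" file name into
 * out. Returns 0 on success, or -1 if the name does not follow the pattern
 * or out is too small to hold the category and its terminator.
 */
int demo_list_category(const char *filename, char *out, size_t out_len) {
  const char *ext = ".list";
  size_t name_len = strlen(filename), ext_len = strlen(ext), cat_len;
  const char *sep = strchr(filename, '_');

  /* The name must end with ".list" and have something before it. */
  if (name_len <= ext_len ||
      strcmp(filename + name_len - ext_len, ext) != 0)
    return -1;

  /* A non-empty <category> must precede the first underscore. */
  if (sep == NULL || sep == filename)
    return -1;

  /* A non-empty <string> must sit between the underscore and ".list". */
  if ((size_t)(sep - filename) + 1 >= name_len - ext_len)
    return -1;

  cat_len = (size_t)(sep - filename);
  if (cat_len + 1 > out_len)
    return -1;

  memcpy(out, filename, cat_len);
  out[cat_len] = '\0';
  return 0;
}
```

For a hypothetical file name such as "ads_example.list" the helper yields "ads"; names like "readme.txt" or "ads_.list" are rejected.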
Luca Deri
2c565c77c9 Added ndpi_domain_classify_XXX() API 2023-08-26 00:24:33 +02:00