Commit graph

2 commits

Author SHA1 Message Date
Ivan Nardi
2b883b93be
Fix some errors found by fuzzers (#2078)
Fix compilation on Windows.
"dirent.h" file has been taken from https://github.com/tronkko/dirent/

Fix Python bindings

Fix some warnings with x86_64-w64-mingw32-gcc:
```
protocols/dns.c: In function ‘ndpi_search_dns’:
protocols/dns.c:775:41: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  775 |       unsigned long first_element_len = (unsigned long)dot - (unsigned long)_hostname;
      |                                         ^
protocols/dns.c:775:62: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  775 |       unsigned long first_element_len = (unsigned long)dot - (unsigned long)_hostname;
      |
```
```
In file included from ndpi_bitmap64.c:31:
third_party/include/binaryfusefilter.h: In function ‘binary_fuse8_hash’:
third_party/include/binaryfusefilter.h:160:32: error: left shift count >= width of type [-Werror=shift-count-overflow]
  160 |     uint64_t hh = hash & ((1UL << 36) - 1);
```
```
In function ‘ndpi_match_custom_category’,
    inlined from ‘ndpi_fill_protocol_category.part.0’ at ndpi_main.c:7056:16:
ndpi_main.c:3419:3: error: ‘strncpy’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
 3419 |   strncpy(buf, name, name_len);
```
2023-09-10 11:09:59 +02:00
Luca Deri
36abf06c6f Swap from Aho-Corasick to an experimental/home-grown algorithm that uses a probabilistic
approach for handling Internet domain names.

For switching back to Aho-Corasick it is necessary to edit
ndpi-typedefs.h and uncomment the line
// #define USE_LEGACY_AHO_CORASICK

[1] With Aho-Corasick
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.34 KB
Flow Memory (per flow):  960 B
Actual Memory:           33.09 MB
Peak Memory:             33.09 MB

[2] With the new algorithm
$ ./example/ndpiReader -G ./lists/ -i tests/pcap/ookla.pcap | grep Memory
nDPI Memory statistics:
nDPI Memory (once):      37.31 KB
Flow Memory (per flow):  960 B
Actual Memory:           7.42 MB
Peak Memory:             7.42 MB

In essence from ~33 MB to ~7 MB

This new algorithm will enable larger lists to be loaded (e.g. top 1M domans
https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)

In ./lists there are file names that are named as <category>_<string>.list
With -G ndpiReader can load all of them at startup
2023-08-29 17:34:04 +02:00