ndpiReader --cfg=dpi.address_cache_size,1000 -i <pcap>.pcap
In the above example the cache has up to 1000 entries.
If ndpiReader exports data in JSON, the cached hostname (if found) is exported in the field server_hostname.
Quick fix for the latest Windows image on GitHub CI, where we got:
```
ndpiReader.c:2860:38: error: '%s' directive output may be truncated writing up to 64 bytes into a region of size 63 [-Werror=format-truncation=]
2860 | snprintf(srcip, sizeof(srcip), "[%s]", flow->src_name);
| ^~
ndpiReader.c:2860:5: note: 'snprintf' output between 3 and 67 bytes into a destination of size 64
2860 | snprintf(srcip, sizeof(srcip), "[%s]", flow->src_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ndpiReader.c:2861:38: error: '%s' directive output may be truncated writing up to 64 bytes into a region of size 63 [-Werror=format-truncation=]
2861 | snprintf(dstip, sizeof(dstip), "[%s]", flow->dst_name);
| ^~
ndpiReader.c:2861:5: note: 'snprintf' output between 3 and 67 bytes into a destination of size 64
2861 | snprintf(dstip, sizeof(dstip), "[%s]", flow->dst_name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
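The diagnostics above say that `"[%s]"` can produce up to 67 bytes (including the terminating NUL) while the destination holds only 64. A minimal sketch of one way to silence the warning is shown below: size the destination from the compiler-reported upper bound and check the `snprintf` return value. The helper name `format_bracketed` and the `BRACKETED_MAX` constant are hypothetical and not taken from ndpiReader.c; the actual fix in the commit may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Upper bound taken from the GCC note above: "output between 3 and 67 bytes".
 * Hypothetical constant, not part of nDPI. */
#define BRACKETED_MAX 67

/* Hypothetical helper: writes "[name]" into out.
 * Returns 1 if the whole string fit, 0 if it was truncated or on error. */
static int format_bracketed(char *out, size_t out_len, const char *name) {
  int n;

  assert(out != NULL && out_len > 0 && name != NULL);

  /* snprintf returns the length it would have written, excluding the NUL */
  n = snprintf(out, out_len, "[%s]", name);

  return (n >= 0 && (size_t)n < out_len) ? 1 : 0;
}
```

With a destination declared as `char srcip[BRACKETED_MAX];`, a 64-character name plus the two brackets and the NUL fits exactly, so no truncation is possible for the input sizes GCC reports.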
Based on the paper: "Fingerprinting Obfuscated Proxy Traffic with
Encapsulated TLS Handshakes".
See: https://www.usenix.org/conference/usenixsecurity24/presentation/xue-fingerprinting
Basic idea:
* the packets/bytes distribution of a TLS handshake is quite unique
* this fingerprint is still detectable if the handshake is
encrypted/proxied/obfuscated
All heuristics are disabled by default.
Some fuzzers don't really need a real and complete local context.
Try to avoid setting it up, creating a simpler fake version with only the
features really needed.
This is an experiment: if it works, we can extend the same logic
to other fuzzers.
Avoid forcing `DLT_EN10MB` but use the same data link type of the input
pcap.
This way, we can use the extcap functionality with input traces that have Linux
"cooked" capture encapsulation, i.e. traces captured on the "any" interface.
The `suffix_id` is simply an incremental index (see
`ndpi_load_domain_suffixes`), so its value might change every time we
update the public suffix list.
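A toy model of why an incremental index is not stable across list updates is sketched below. It assumes, for illustration only, that the id of a suffix is its zero-based position in the loaded list; the helper `suffix_index` and the two sample lists are hypothetical and do not reproduce the internals of `ndpi_load_domain_suffixes`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the id of a suffix is its position in the list,
 * or -1 if the suffix is not present. */
static int suffix_index(const char *const *list, size_t n, const char *suffix) {
  size_t i;

  assert(list != NULL && suffix != NULL);

  for (i = 0; i < n; i++) {
    if (strcmp(list[i], suffix) == 0)
      return (int)i;
  }
  return -1;
}

/* Two made-up snapshots of a suffix list: the second one gains an entry
 * that sorts before the existing ones. */
static const char *const suffixes_v1[] = { "com", "net", "org" };
static const char *const suffixes_v2[] = { "co.uk", "com", "net", "org" };
```

In this model `"com"` has index 0 in the first snapshot and index 1 in the second, so any value stored from an earlier list no longer identifies the same suffix after an update.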
Export some metadata (for the moment, SNI and TLS fingerprints) to
Wireshark/tshark via extcap.
Note that:
* metadata are exported only once per flow
* metadata are exported (all together) when nDPI stops processing
the flow
Still room for a lot of improvements!
In particular:
* we need to add some boundary checks (if we are going to export other
attributes)
* we should try to have a variable length trailer
This cache was added in b6b4967aa, when there was no real Zoom support.
With 63f349319, proper identification of multimedia streams has been
added, making this cache quite useless: any improvements to Zoom
classification should be done in the Zoom dissector.
Tested for some months on a few 10 Gbit/s links of residential traffic: the
cache pretty much never returned a valid hit.
* Added
    size_t ndpi_compress_str(const char * in, size_t len, char * out, size_t bufsize);
    size_t ndpi_decompress_str(const char * in, size_t len, char * out, size_t bufsize);
  used to compress short strings such as domain names. This code is based on
  https://github.com/Ed-von-Schleck/shoco
* Major code rewrite for ndpi_hash and ndpi_domain_classify
* Improvements to make sure custom categories are loaded and enabled
* Fixed string encoding
* Extended SalesForce/Cloudflare domains list