Commit graph

21 commits

Author SHA1 Message Date
Toni
c521595383
Add Elasticsearch protocol dissector. (#1782)
* all credits goes to @verzulli

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-10-21 20:01:54 +02:00
Toni Uhlig
29242cbcb6 Add Munin protocol dissector.
* all credits goes to @verzulli

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-10-20 22:10:35 +02:00
Ivan Nardi
a7c2734b38
Remove classification "by-ip" from protocol stack (#1743)
Basically:
* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip` is
NEVER returned in the protocol stack (i.e.
`flow->detected_protocol_stack[]`);
* if the application is interested into such information, it can access
`ndpi_protocol->protocol_by_ip` itself.

There are mainly 4 points in the code that set the "classification
by-ip" in the protocol stack:  the generic `ndpi_set_detected_protocol()`/
`ndpi_detection_giveup()` functions and the HTTP/STUN  dissectors.

In the unit tests output, a print about `ndpi_protocol->protocol_by_ip`
has been added for each flow: the huge diff of this commit is mainly due
to that.

Strictly speaking, this change is NOT an API/ABI breakage, but there are
important differences in the classification results. For examples:
* TLS flows without the initial handshake (or without a matching
SNI/certificate) are simply classified as `TLS`;
* similar for HTTP or QUIC flows;
* DNS flows without a matching request domain are simply classified as
`DNS`; we don't have `DNS/Google` anymore just because the server is
8.8.8.8 (that was an outrageous behaviour...);
* flows previusoly classified only "by-ip" are now classified as
`NDPI_PROTOCOL_UNKNOWN`.

See #1425 for other examples of why adding the "classification by-ip" in
the protocol stack is a bad idea.

Please, note that IPV6 is not supported :(  (long standing issue in nDPI) i.e.
`ndpi_protocol->protocol_by_ip` wil be always `NDPI_PROTOCOL_UNKNOWN` for
IPv6 flows.

Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed.

Close #1687
2022-09-20 22:24:47 +02:00
Ivan Nardi
4f584f78a0
Fix ndpi_do_guess() (#1731)
Avoid a double call of `ndpi_guess_host_protocol_id()`.
Some code paths work for ipv4/6 both
Remove some never used code.
2022-09-12 19:28:41 +02:00
Nardi Ivan
678dd61866 STUN: several improvements
Add detection over TCP and fix detection over IPv6.
Rename some variables since Stun dissector is no more "udp-centric".
Stun dissector should always classified the flow as `STUN` or
`STUN/Something`.
Don't touch `flow->guessed_host_protocol_id` field, which should be
always be related to "ip-classification" only.
2022-09-11 13:33:32 +02:00
Ivan Nardi
0a47f745cc
Avoid useless host automa lookup (#1724)
The host automa is used for two tasks:
* protocol sub-classification (obviously);
* DGA evaluation: the idea is that if a domain is present in this
automa, it can't be a DGA, regardless of its format/name.

In most dissectors both checks are executed, i.e. the code is something
like:

```
ndpi_match_host_subprotocol(..., flow->host_server_name, ...);
ndpi_check_dga_name(..., flow->host_server_name,...);

```

In that common case, we can perform only one automa lookup: if we check the
sub-classification before the DGA, we can avoid the second lookup in
the DGA function itself.
2022-09-05 13:59:51 +02:00
Nardi Ivan
b9cb391756 Add support to opportunistic TLS
A lot of protocols provide the feature to upgrade their plain text
connections to an encrypted one, via some kind of "STARTTLS" command.

Add generic code to support this extension, and allow dissection of the
entire TLS handshake.

As examples, SMTP, POP, IMAP and FTP dissectors have been updated.

Since this feature requires to process more packets per flow, add the
possibility to disable it.

Fix some log messages.

Slight improvement on TCP sequence number tracking.

As a side effect, this commit fix also a memory leak found by
oss-fuzzer
```
==108966==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 22 byte(s) in 1 object(s) allocated from:
    #0 0x55f8b367a0be in malloc (/home/ivan/svnrepos/nDPI/fuzz/fuzz_ndpi_reader_with_main+0x5480be) (BuildId: 94debacb4a6784c30420ab748c8bf3cc59621063)
    #1 0x55f8b36e1345 in ndpi_malloc_wrapper /home/ivan/svnrepos/nDPI/example/reader_util.c:321:10
    #2 0x55f8b379c7d2 in ndpi_malloc /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:212:25
    #3 0x55f8b379cb18 in ndpi_strdup /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:279:13
    #4 0x55f8b386ce46 in processClientServerHello /home/ivan/svnrepos/nDPI/src/lib/protocols/tls.c:2153:34
    #5 0x55f8b385ebf7 in processTLSBlock /home/ivan/svnrepos/nDPI/src/lib/protocols/tls.c:867:5
    #6 0x55f8b39e708c in ndpi_extra_search_mail_smtp_tcp /home/ivan/svnrepos/nDPI/src/lib/protocols/mail_smtp.c:422:9
    #7 0x55f8b37e636c in ndpi_process_extra_packet /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:5884:9
    #8 0x55f8b37edc05 in ndpi_detection_process_packet /home/ivan/svnrepos/nDPI/src/lib/ndpi_main.c:6276:5
    #9 0x55f8b3701ffc in packet_processing /home/ivan/svnrepos/nDPI/example/reader_util.c:1619:31
    #10 0x55f8b36faf14 in ndpi_workflow_process_packet /home/ivan/svnrepos/nDPI/example/reader_util.c:2189:10
    #11 0x55f8b36b6a50 in LLVMFuzzerTestOneInput /home/ivan/svnrepos/nDPI/fuzz/fuzz_ndpi_reader.c:107:7

```
See: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=50765
2022-09-04 17:22:19 +02:00
Ivan Nardi
405a52ed65
Patricia tree, Ahocarasick automa, LRU cache: add statistics (#1683)
Add (basic) internal stats to the main data structures used by the
library; they might be usefull to check how effective these structures
are.

Add an option to `ndpiReader` to dump them; enabled by default in the
unit tests.
This new option enables/disables dumping of "num dissectors calls"
values, too (see b4cb14ec).
2022-07-29 15:25:00 +02:00
Ivan Nardi
172e698bb8
TINC: avoid processing SYN packets (#1676)
Since e6b332aa, we have proper support for detecting client/server
direction. So Tinc dissector is now able to properly initialize the
cache entry only when needed and not anymore at the SYN time; initializing
that entry for **every** SYN packets was a complete waste of resources.

Since 4896dabb, the various `struct ndpi_call_function_struct`
structures are not more separate objects and therefore comparing them
using only their pointers is bogus: this bug was triggered by this
change because `ndpi_str->callback_buffer_size_tcp_no_payload` is now 0.
2022-07-28 12:39:18 +02:00
Ivan Nardi
d8d525fff2
Update the protocol bitmask for some protocols (#1675)
Tcp retransmissions should be ignored.

Remove some unused protocol bitmasks.

Update script to download Whatsapp IP list.
2022-07-27 11:46:45 +02:00
Toni
ae2bedce3a
Improved Jabber/XMPP detection. (#1661)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-07-13 17:55:33 +02:00
Ivan Nardi
b4cb14ec19
Keep track of how many dissectors calls we made for each flow (#1657) 2022-07-11 09:47:47 +02:00
Toni
175f863665
Label SMTP w/ STARTTLS as SMTPS *and* dissect TLS clho. (#1639)
* Label SMTP w/ STARTTLS as SMTPS *and* dissect TLS clho.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

* Revert "SMTP with STARTTLS is now identified as SMTPS"

This reverts commit 52d987b603.

* Revert "Compilation fix"

This reverts commit c019946f60.

* Sync unit tests.

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-07-06 12:40:25 +02:00
Luca Deri
52d987b603 SMTP with STARTTLS is now identified as SMTPS 2022-07-05 17:00:21 +02:00
Toni
f4a1739f9c
Detect SMTPs w/ STARTTLS as TLS and dissect client/server hello. Fixes #1630. (#1637)
* FTP needs to get updated as well as it has similiar STARTTLS semantics -> follow-up

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-07-05 16:35:23 +02:00
Ivan Nardi
fdb1649a49
Fix category for mail sessions (#1621)
Close #629
2022-07-03 17:47:58 +02:00
Ivan Nardi
3a087e951d
Add a "confidence" field about the reliability of the classification. (#1395)
As a general rule, the higher the confidence value, the higher the
"reliability/precision" of the classification.

In other words, this new field provides an hint about "how" the flow
classification has been obtained.
For example, the application may want to ignore classification "by-port"
(they are not real DPI classifications, after all) or give a second
glance at flows classified via LRU caches (because of false positives).

Setting only one value for the confidence field is a bit tricky: more
work is probably needed in the next future to tweak/fix/improve the logic.
2022-01-11 15:23:39 +01:00
Ivan Nardi
b1e9245d94
ndpiReader: slight simplificaton of the output (#1378) 2021-11-27 17:32:23 +01:00
Ivan Nardi
0f168d9150
IMAP, POP3, SMTP: improve dissection (#1368)
Avoid NATS false positives
2021-11-11 11:55:56 +01:00
Luca Deri
9e97d20c25 Refreshed results list 2021-10-16 12:03:16 +02:00
Luca Deri
5c33fbf19b Added extraction of hostname in SMTP
Fixed mail incalid subprotocol calculation
2021-08-11 11:52:24 +02:00