Commit graph

11 commits

Author SHA1 Message Date
Ivan Nardi
a7c2734b38
Remove classification "by-ip" from protocol stack (#1743)
Basically:
* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip` is
NEVER returned in the protocol stack (i.e.
`flow->detected_protocol_stack[]`);
* if the application is interested into such information, it can access
`ndpi_protocol->protocol_by_ip` itself.

There are mainly 4 points in the code that set the "classification
by-ip" in the protocol stack:  the generic `ndpi_set_detected_protocol()`/
`ndpi_detection_giveup()` functions and the HTTP/STUN  dissectors.

In the unit tests output, a print about `ndpi_protocol->protocol_by_ip`
has been added for each flow: the huge diff of this commit is mainly due
to that.

Strictly speaking, this change is NOT an API/ABI breakage, but there are
important differences in the classification results. For examples:
* TLS flows without the initial handshake (or without a matching
SNI/certificate) are simply classified as `TLS`;
* similar for HTTP or QUIC flows;
* DNS flows without a matching request domain are simply classified as
`DNS`; we don't have `DNS/Google` anymore just because the server is
8.8.8.8 (that was an outrageous behaviour...);
* flows previusoly classified only "by-ip" are now classified as
`NDPI_PROTOCOL_UNKNOWN`.

See #1425 for other examples of why adding the "classification by-ip" in
the protocol stack is a bad idea.

Please, note that IPV6 is not supported :(  (long standing issue in nDPI) i.e.
`ndpi_protocol->protocol_by_ip` wil be always `NDPI_PROTOCOL_UNKNOWN` for
IPv6 flows.

Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed.

Close #1687
2022-09-20 22:24:47 +02:00
Ivan Nardi
4f584f78a0
Fix ndpi_do_guess() (#1731)
Avoid a double call of `ndpi_guess_host_protocol_id()`.
Some code paths work for ipv4/6 both
Remove some never used code.
2022-09-12 19:28:41 +02:00
Ivan Nardi
405a52ed65
Patricia tree, Ahocarasick automa, LRU cache: add statistics (#1683)
Add (basic) internal stats to the main data structures used by the
library; they might be usefull to check how effective these structures
are.

Add an option to `ndpiReader` to dump them; enabled by default in the
unit tests.
This new option enables/disables dumping of "num dissectors calls"
values, too (see b4cb14ec).
2022-07-29 15:25:00 +02:00
Ivan Nardi
b4cb14ec19
Keep track of how many dissectors calls we made for each flow (#1657) 2022-07-11 09:47:47 +02:00
Luca Deri
ab09b8ce2e Added unidirectional traffic flow risk 2022-06-20 00:22:13 +02:00
Ivan Nardi
7aee856aa0
Extend tests coverage (#1476)
Now there is at least one flow under `tests/pcap` for 249 protocols out
of the 284 ones supported by nDPI.

The 35 protocols without any tests are:

* P2P/sharing protocols: DIRECT_DOWNLOAD_LINK, OPENFT, FASTTRACK,
EDONKEY, SOPCAST, THUNDER, APPLEJUICE, DIRECTCONNECT, STEALTHNET

* games: CSGO, HALFLIFE2, ARMAGETRON, CROSSFIRE, DOFUS, FIESTA,
FLORENSIA, GUILDWARS, MAPLESTORY, WORLD_OF_KUNG_FU

* voip/streaming: VHUA, ICECAST, SHOUTCAST, TVUPLAYER, TRUPHONE

* other: AYIYA, SOAP, TARGUS_GETDATA, RPC, ZMQ, REDIS, VMWARE, NOE,
LOTUS_NOTES, EGP, SAP

Most of these protocols (expecially the P2P and games ones) have been
inherited by OpenDPI and have not been updated since then: even if they
are still used, the detection rules might be outdated.

However code coverage (of `lib/protocols`) only increases from 65.6% to
68.9%.

Improve Citrix, Corba, Fix, Aimini, Megaco, PPStream, SNMP and Some/IP
dissection.
Treat IPP as a HTTP sub protocol.
Fix Cassandra false positives.

Remove `NDPI_PROTOCOL_QQLIVE` and `NDPI_PROTOCOL_REMOTE_SCAN`:
these protocol ids are defined but they are never used.

Remove Collectd support: its code has never been called. If someone is
really interested in this protocol, we can re-add it later, updating the
dissector.

Add decoding of PPI (Per-Packet Information) data link type.
2022-03-09 22:37:35 +01:00
Luca Deri
7aef27f85e Added NDPI_ERROR_CODE_DETECTED risk 2022-02-03 13:20:54 +01:00
Ivan Nardi
3a087e951d
Add a "confidence" field about the reliability of the classification. (#1395)
As a general rule, the higher the confidence value, the higher the
"reliability/precision" of the classification.

In other words, this new field provides an hint about "how" the flow
classification has been obtained.
For example, the application may want to ignore classification "by-port"
(they are not real DPI classifications, after all) or give a second
glance at flows classified via LRU caches (because of false positives).

Setting only one value for the confidence field is a bit tricky: more
work is probably needed in the next future to tweak/fix/improve the logic.
2022-01-11 15:23:39 +01:00
Luca Deri
e8455236bd Updated output 2021-08-07 17:38:33 +02:00
Ivan Nardi
cccf794265
ndpiReader: add statistics about nDPI performance (#1240)
The goal is to have a (roughly) idea about how many packets nDPI needs
to properly classify a flow.

Log this information (and guessed flows number too) during unit tests,
to keep track of improvements/regressions across commits.
2021-07-13 12:28:39 +02:00
rafaliusz
1ecc6d323e
Add a connectionless DCE/RPC detection (#1078)
* Add connectionless DCE/RPC detection

* Add DCE/RPC pcap file as well as its test result

Co-authored-by: rafal <rafal.burzynski@cryptomage.com>
2020-12-08 15:48:53 +01:00