Commit graph

541 commits

Author SHA1 Message Date
Ivan Nardi
c85f2fb0f4
TLS: add basic, basic, detection of Encrypted ClientHello (#2053) 2023-07-21 03:41:43 +02:00
Ivan Nardi
890f17788b
ndpireader: fix detection of DoH traffic based on packet distributions (#2045) 2023-07-14 23:20:06 +02:00
Luca Deri
1f55dc511f Implemented Count-Min Sketch [count how many times a value has been observed]
- ndpi_cm_sketch_init()
- ndpi_cm_sketch_add()
- ndpi_cm_sketch_count()
- ndpi_cm_sketch_destroy()
2023-07-13 21:54:51 +02:00
Chiara Maggi
0b0f255cc2
added feature to extract filename from http attachment (#2037)
* added feature to extract filename from http attachment

* fixed some issues

* added check for filename format

* added check for filename format

* remove an unnecessary print

* changed the size from 952 to 960

* modified some test result files

* small changes string size

* comment removed and mallocs checked
2023-07-11 22:45:19 +02:00
Ivan Nardi
88425e0199
Simplify the report of streaming multimedia info (#2026)
The two fields `flow->flow_type` and `flow->protos.rtp.stream_type` are
pretty much identical: rename the former in `flow->flow_multimedia_type`
and remove the latter.
2023-06-26 12:05:16 +02:00
Ivan Nardi
3608ab01b6
STUN: keep monitoring/processing STUN flows (#2012)
Look for RTP packets in the STUN sessions.
TODO: tell RTP from RTCP
2023-06-21 09:16:20 +02:00
Luca Deri
d0609ea601 Implemented Zoom/Teams stream type detection 2023-06-14 23:44:57 +02:00
Toni
4e284b5e40
Set _DEFAULT_SOURCE and _GNU_SOURCE globally. (#2010)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-06-12 19:53:57 +02:00
Ivan Nardi
9987e5b482
ndpiReader: allow to configure LRU caches TTL and size (#2004) 2023-06-08 17:06:32 +02:00
Ivan Nardi
46ff069117
ndpiReader: improve printing of payload statistics (#1989)
Add a basic unit test

Fix an endianess issue
2023-05-29 16:53:11 +02:00
Luca Deri
5ca6f0ac62 Implemented ndpi_predict_linear() for predicting a timeseries value overtime 2023-05-19 11:46:03 +02:00
Ivan Nardi
0223d3c4f5
HTTP: improve extraction of metadata and of flow risks (#1959) 2023-05-05 13:35:20 +02:00
Ivan Nardi
02a2c80453
ndpiReader: fix print of flow payload (#1960) 2023-05-04 11:52:48 +02:00
Luca Deri
f7138f07a6 Missing module termination 2023-04-28 23:00:33 +02:00
Ivan Nardi
8934f7b45f
Add an heuristic to detect/ignore some anomalous TCP ACK packets (#1948)
In some networks, there are some anomalous TCP flows where the smallest
ACK packets have some kind of zero padding.
It looks like the IP and TCP headers in those frames wrongly consider the
0x00 Ethernet padding bytes as part of the TCP payload.
While this kind of packets is perfectly valid per-se, in some conditions
they might be treated by the TCP reassembler logic as (partial) overlaps,
deceiving the classification engine.
Add an heuristic to detect these packets and to ignore them, allowing
correct detection/classification.

This heuristic is configurable. Default value:
* in the library, it is disabled
* in `ndpiReader` and in the fuzzers, it is enabled (to ease testing)

Credit to @vel21ripn for the initial patch.

Close #1946
2023-04-25 19:25:07 +02:00
Toni
b6629ba2be
Improved debug output. (#1951)
* try to get rid of some `printf(..)`s as they do not belong to a shared library
 * replaced all `exit(..)`s with `abort()`s to indicate an abnormal process termination

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-04-21 12:40:26 +02:00
Ivan Nardi
ba99330031
ndpiReader: fix flow stats (#1943) 2023-04-14 11:17:37 +02:00
lns
77ea80862f Improved debug logging.
Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2023-04-11 15:39:09 +02:00
Luca Deri
c47e9d201d Implemented ndpi_XXX_reset() API calls whre XXX is ses, des, hw 2023-04-08 09:52:14 +02:00
Ivan Nardi
7a627296f0
Fix LRU/Patricia/Automa stats in ndpiReader with multiple threads (#1934) 2023-04-06 09:36:11 +02:00
Ivan Nardi
25c1111911
fuzz: add a new fuzzer triggering the payload analyzer function(s) (#1926) 2023-04-04 14:39:29 +02:00
Ivan Nardi
4d11941d32
Ookla: rework detection (#1922)
The logic of the LRU cache has been changed: once we know an ip has
connected to an Ookla server, all the following (unknown) flows (for
a short time interval) from the same ip to the port 8080 are treated
as Ookla ones.

Most of the changes in this commit are about introducing the concept of
"aggressive detection". In some cases, to properly detect a
protocol we might use some statistical/behavior logic that, from one
side, let us to identify the protocol more often but, from the other
side, might lead to some false positives.
To allow the user/application to easily detect when such logic has been
triggered, the new confidence value `NDPI_CONFIDENCE_DPI_AGGRESSIVE` has been
added.
It is always possible to disable/configure this kind of logic via the
API.

Detection of Ookla flows using plain TLS over port 8080 is the first
example of aggressive detection in nDPI.

Tested with:
* Android 9.0 with app 4.8.3
* Ubuntu 20.04 with Firefox 110
* Win 10 with app 1.15 and 1.16
* Win 10 with Chrome 108, Edge 108 and Firefox 106
2023-03-30 17:13:51 +02:00
Luca Deri
64ebf73b29 Added the ability to define custom protocols with arbitrary Ids in proto.txt
Example
- ip:213.75.170.11/32:443@CustomProtocol
nDPI assigns an is that can change based on protos.txt content

- ip:213.75.170.11/32:443@CustomProtocol=9999
nDPI assigns 9999 as protocolId to CustomProtocol and won't change when
protos.txt content will chaneg
2023-03-22 00:15:56 +01:00
Luca Deri
3585e2d201 Added ability to define an unlimited number of custom rules IP:port for the same IP (it used tobe limited to 2) 2023-03-13 21:57:14 +01:00
Ivan Nardi
22fb8349b9
ndpiReader: print how many packets (per flow) were needed to perform full DPI (#1891)
Average values are already printed, but this change should ease to
identify regressions/improvements.
2023-03-01 21:50:47 +01:00
Luca Deri
96f0f85e56 Indent fix 2023-02-27 12:20:06 +01:00
Ivan Nardi
b51a2ac72a
fuzz: some improvements and add two new fuzzers (#1881)
Remove `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` define from
`fuzz/Makefile.am`; it is already included by the main configure script
(when fuzzing).

Add a knob to force disabling of AESNI optimizations: this way we can
fuzz also no-aesni crypto code.

Move CRC32 algorithm into the library.

Add some fake traces to extend fuzzing coverage. Note that these traces
are hand-made (via scapy/curl) and must not be used as "proof" that the
dissectors are really able to identify this kind of traffic.

Some small updates to some dissectors:

CSGO: remove a wrong rule (never triggered, BTW). Any UDP packet starting
with "VS01" will be classified as STEAM (see steam.c around line 111).
Googling it, it seems right so.

XBOX: XBOX only analyses UDP flows while HTTP only TCP ones; therefore
that condition is false.

RTP, STUN: removed useless "break"s

Zattoo: `flow->zattoo_stage` is never set to any values greater or equal
to 5, so these checks are never true.

PPStream: `flow->l4.udp.ppstream_stage` is never read. Delete it.

TeamSpeak: we check for `flow->packet_counter == 3` just above, so the
following check `flow->packet_counter >= 3` is always false.
2023-02-09 20:02:12 +01:00
Ivan Nardi
9f27cd56b0
ndpiReader: fix packet dissection (CAPWAP and TSO) (#1878)
Fix decapsulation of CAPWAP; we are interested only in "real" user data
tunneled via CAPWAP.
When Tcp Segmentation Offload is enabled in the NIC, the received packet
might have 0 as "ip length" in the IPv4 header
(see
https://osqa-ask.wireshark.org/questions/16279/why-are-the-bytes-00-00-but-wireshark-shows-an-ip-total-length-of-2016/)

The effect of these two bugs was that some packets were discarded.

Be sure that flows order is deterministic
2023-01-30 10:59:18 +01:00
Ivan Nardi
29c5cc39fb
Some small changes (#1869)
All dissector callbacks should not be exported by the library; make static
some other local functions.
The callback logic in `ndpiReader` has never been used.
With internal libgcrypt, `gcry_control()` should always return no
errors.
We can check `categories` length at compilation time.
2023-01-25 11:44:09 +01:00
Luca Deri
5849863ef9 Added new risk NDPI_TCP_ISSUES 2023-01-24 22:58:17 +01:00
Ivan Nardi
1b98bec0ab
LRU caches: add a generic (optional and configurable) expiration logic (#1855)
Two caches already implemented a similar mechanism: make it generic.
2023-01-18 18:18:36 +01:00
Ivan Nardi
ad6bfbad4d
Add protocol disabling feature (#1808)
The application may enable only some protocols.
Disabling a protocol means:
*) don't register/use the protocol dissector code (if any)
*) disable classification by-port for such a protocol
*) disable string matchings for domains/certificates involving this protocol
*) disable subprotocol registration (if any)

This feature can be tested with `ndpiReader -B list_of_protocols_to_disable`.

Custom protocols are always enabled.

Technically speaking, this commit doesn't introduce any API/ABI
incompatibility. However, calling `ndpi_set_protocol_detection_bitmask2()`
is now mandatory, just after having called `ndpi_init_detection_module()`.

Most of the diffs (and all the diffs in `/src/lib/protocols/`) are due to
the removing of some function parameters.

Fix the low level macro `NDPI_LOG`. This issue hasn't been detected
sooner simply because almost all the code uses only the helpers `NDPI_LOG_*`
2022-12-18 08:10:57 +00:00
Ivan Nardi
5704e4c142
STUN: add detection of ZOOM peer-to-peer flows (#1825)
See: "Enabling Passive Measurement of Zoom Performance in Production Networks"
https://dl.acm.org/doi/pdf/10.1145/3517745.3561414
2022-12-11 23:07:35 +01:00
Luca Deri
eacc2b8e32 Added Zoom screen share detection 2022-12-09 21:32:45 +01:00
Luca Deri
fc7b070030 Added RTP stream type in flow metadata 2022-12-09 14:26:53 +01:00
Luca Deri
e0afc16aa2 Exported HTTP server in metadata 2022-12-05 21:27:30 +01:00
Ivan Nardi
cd41ab7c8f
Improve export/print of L4 protocol information (#1799)
Close #1797
2022-11-13 22:35:46 +01:00
Ivan Nardi
db9f6ec1b4
Add basic profiling of memory allocations on data-path (#1789)
The goal is to have an idea of the memory allocation sizes performed in
the **library data-path**, i.e. excluding init/deinit phases and all
the allocations made by the application itself.
In other words, how much memory is needed per-flow, by nDPI, other than
`struct ndpi_flow_struct`?

It works only on single-thread configurations.

It is not enabled by default (in the unit tests) since different
canfiguration options (example: `--enable-pcre`) lead to diffferent
results.

See: #1781
2022-10-28 20:41:37 +02:00
Ivan Nardi
ca5ffc4988
TLS: improve handling of ALPN(s) (#1784)
Tell "Advertised" ALPN list from "Negotiated" ALPN; the former is
extracted from the CH, the latter from the SH.

Add some entries to the known ALPN list.

Fix printing of "TLS Supported Versions" field.
2022-10-25 17:06:29 +02:00
Ivan Nardi
6c84ce85e4
ndpiReader: fix help message. There isn't a 'J' option (#1770) 2022-10-14 20:16:47 +02:00
Nardi Ivan
cca585053e Fix compilation and sync utests results 2022-10-04 22:17:05 +02:00
Luca
de59eb8237 Added the ability to track the payload via -E and via the new option 'ndpi_track_flow_payload' 2022-10-04 11:26:44 +02:00
Nardi Ivan
1f345b311f Sizes of LRU caches are now configurable
0 as size value disable the cache.

The diffs in unit tests are due to the fact that some lookups are
performed before the first insert: before this change these lookups
weren't counted because the cache was not yet initialized, now they are.
2022-09-23 18:33:48 +02:00
Toni
644ad34962
Improved NATPMP dissection. (#1745)
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>

Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-09-21 18:24:04 +02:00
Toni Uhlig
d6701e8979 Build ndpiReader and run regression tests.
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-09-21 18:03:22 +02:00
Ivan Nardi
a7c2734b38
Remove classification "by-ip" from protocol stack (#1743)
Basically:
* "classification by-ip" (i.e. `flow->guessed_protocol_id_by_ip` is
NEVER returned in the protocol stack (i.e.
`flow->detected_protocol_stack[]`);
* if the application is interested into such information, it can access
`ndpi_protocol->protocol_by_ip` itself.

There are mainly 4 points in the code that set the "classification
by-ip" in the protocol stack:  the generic `ndpi_set_detected_protocol()`/
`ndpi_detection_giveup()` functions and the HTTP/STUN  dissectors.

In the unit tests output, a print about `ndpi_protocol->protocol_by_ip`
has been added for each flow: the huge diff of this commit is mainly due
to that.

Strictly speaking, this change is NOT an API/ABI breakage, but there are
important differences in the classification results. For examples:
* TLS flows without the initial handshake (or without a matching
SNI/certificate) are simply classified as `TLS`;
* similar for HTTP or QUIC flows;
* DNS flows without a matching request domain are simply classified as
`DNS`; we don't have `DNS/Google` anymore just because the server is
8.8.8.8 (that was an outrageous behaviour...);
* flows previusoly classified only "by-ip" are now classified as
`NDPI_PROTOCOL_UNKNOWN`.

See #1425 for other examples of why adding the "classification by-ip" in
the protocol stack is a bad idea.

Please, note that IPV6 is not supported :(  (long standing issue in nDPI) i.e.
`ndpi_protocol->protocol_by_ip` wil be always `NDPI_PROTOCOL_UNKNOWN` for
IPv6 flows.

Define `NDPI_CONFIDENCE_MATCH_BY_IP` has been removed.

Close #1687
2022-09-20 22:24:47 +02:00
Alfredo Cardigliano
973950d881 Replace obsolete linux macro 2022-09-13 10:41:44 +02:00
Ivan Nardi
0a47f745cc
Avoid useless host automa lookup (#1724)
The host automa is used for two tasks:
* protocol sub-classification (obviously);
* DGA evaluation: the idea is that if a domain is present in this
automa, it can't be a DGA, regardless of its format/name.

In most dissectors both checks are executed, i.e. the code is something
like:

```
ndpi_match_host_subprotocol(..., flow->host_server_name, ...);
ndpi_check_dga_name(..., flow->host_server_name,...);

```

In that common case, we can perform only one automa lookup: if we check the
sub-classification before the DGA, we can avoid the second lookup in
the DGA function itself.
2022-09-05 13:59:51 +02:00
lns
93d65ed650 Support serialization of double-precision floating-point numbers. Fixes #1702.
Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
2022-08-24 10:49:27 +02:00
Toni
2e25c36396
Add TiVoConnect dissector. Fixes #1697. (#1699)
* added static assert if supported, to complain if the flow struct changes

Signed-off-by: lns <matzeton@googlemail.com>
2022-08-08 19:04:20 +02:00