vrr/nDPI

mirror of https://github.com/vel21ripn/nDPI.git synced 2026-04-29 15:39:42 +00:00

Author	SHA1	Message	Date
Ivan Nardi	87440c59bb	fuzz: extend fuzzing coverage and fix loading of TCP fingerprints from file (#3059 )	2025-12-09 14:03:46 +01:00
Ivan Nardi	1b6db29080	fuzz: fix	2025-11-13 14:14:33 +01:00
Ivan Nardi	83d85775a8	Provide an explicit state for the flow classification process (#2942 ) Application should keep calling nDPI until the flow state becomes `NDPI_STATE_CLASSIFIED`. The main loop in the application is simplified to something like: ``` res = ndpi_detection_process_packet(...); if(res->state == NDPI_STATE_CLASSIFIED) { /* Done: you can get the final classification and all metadata. nDPI doesn't need more packets for this flow */ } else { /* nDPI needs more packets for this flow. The provided classification is not final and more metadata might be extracted. If `res->state` is `NDPI_STATE_PARTIAL`, partial/initial classification is available in `res->proto` as usual but it can be updated later. */ } /* Example A (QUIC flow): pkt 1: proto QUIC state NDPI_STATE_PARTIAL pkt 2: proto QUIC/Youtube state NDPI_STATE_CLASSIFIED Example B (GoogleMeet call): pkt 1: proto STUN state NDPI_STATE_PARTIAL pkt N: proto DTLS state NDPI_STATE_PARTIAL pkt N+M: proto DTLS/GoogleCall state NDPI_STATE_CLASSIFIED Example C (standard TLS flow): pkt 1: proto Unknown state NDPI_STATE_INSPECTING pkt 2: proto Unknown state NDPI_STATE_INSPECTING pkt 3: proto Unknown state NDPI_STATE_INSPECTING pkt 4: proto TLS/Facebook state NDPI_STATE_PARTIAL pkt N: proto TLS/Facebook state NDPI_STATE_CLASSIFIED */ } ``` You can take a look at `ndpiReader` for a slightly more complex example. API changes: * remove the third parameter from `ndpi_detection_giveup()`. If you need to know whether the flow classification has been guessed, you can access `flow->protocol_was_guessed` * remove `ndpi_extra_dissection_possible()` * change some prototypes from accepting `ndpi_protocol foo` to `ndpi_master_app_protocol bar`. The update is trivial: from `foo` to `foo.proto`	2025-11-03 12:08:15 +01:00
Ivan Nardi	20892cf4fc	Extend values saved in hash data structure to `u_int64_t` (#3013 ) Move from `u_int32_t` to `u_int64_t`. We want to be able to save protocol + category + breed in the same entry.	2025-10-24 17:58:08 +02:00
Ivan Nardi	9d22805954	Add statistics about hash data structures (#2995 )	2025-10-17 20:39:15 +02:00
Ivan Nardi	a07d55005d	fuzz: try to improve fuzzing coverage (#2981 )	2025-10-06 20:44:31 +02:00
Ivan Nardi	ddd277fc44	HTTP: add further configuration to enable/disable metadata extraction (#2972 ) Rename existing configuration knobs, to better separate metadata from requests, from metadata from responses	2025-09-23 15:11:25 +02:00
Ivan Nardi	efccc7d5e4	Rework flow breed (#2926 ) Right now, there is, in essence, a static mapping between flow protocols and flow breeds. Make it dynamic: allow different flows with the same classification but different breeds. This is the same logic that we already have for categories. Preliminary work to support breed in category lists. API change from the app POV: to get the flow breed, don't use `ndpi_get_proto_breed()` anymore, but directly access `struct ndpi_proto->breed`. The functions `ndpi_domain_classify_*()` and `ndpi_get_host_domain_suffix()` now have a `u_int32_t` parameter as `class_id` (instead of `u_int16_t`), with the following logic: ``` class_id = (breed << 16) | category ``` instead of the old: ``` class_id = category ``` Please note that this change is backward-compatible: if you are not interested in breeds, you don't need to update the application code.	2025-09-02 16:54:34 +02:00
Ivan Nardi	44c94e924f	fuzz: extend fuzzing coverage (#2951 )	2025-08-31 20:12:53 +02:00
Ivan Nardi	b7cb6cf408	Follow-up of `8e1b17215`: `NDPI_UNRESOLVED_HOSTNAME` (#2933 ) Add fuzzing, documentation and unit tests	2025-08-05 11:32:29 +02:00
Ivan Nardi	8dd2220116	Add the concept of protocols stack: more than 2 protocols per flow (#2913 ) The idea is to remove the limitation of only two protocols ("master" and "app") in the flow classification. This is quite handy especially for STUN flows and, in general, for any flows where there is some kind of transition from a cleartext protocol to TLS: HTTP_PROXY -> TLS/Youtube; SMTP -> SMTPS (via STARTTLS msg). In the vast majority of the cases, the protocol stack is simply Master/Application. Examples of real stacks (from the unit tests) different from the standard "master/app": * "STUN.WhatsAppCall.SRTP": a WA call * "STUN.DTLS.GoogleCall": a Meet call * "Telegram.STUN.DTLS.TelegramVoip": a Telegram call * "SMTP.SMTPS.Google": an SMTP connection to a Google server started in cleartext and upgraded to TLS * "HTTP.Google.ntop": an HTTP connection to a Google domain (match via "Host" header) and to an ntop server (match via "Server" header) The logic to create the stack is still a bit coarse: we have a decade of code trying to push everything into only two protocols... Therefore, the content of the stack is still highly experimental and might change in the near future; do you have any suggestions? It is quite likely that the legacy fields "master_protocol" and "app_protocol" will be there for a long time. Add some helpers to use the stack: ``` ndpi_stack_get_upper_proto(); ndpi_stack_get_lower_proto(); bool ndpi_stack_contains(struct ndpi_proto_stack *s, u_int16_t proto_id); bool ndpi_stack_is_tls_like(struct ndpi_proto_stack *s); bool ndpi_stack_is_http_like(struct ndpi_proto_stack *s); ``` Be sure new stack logic is compatible with legacy code: ``` assert(ndpi_stack_get_upper_proto(&flow->detected_protocol.protocol_stack) == ndpi_get_upper_proto(flow->detected_protocol)); assert(ndpi_stack_get_lower_proto(&flow->detected_protocol.protocol_stack) == ndpi_get_lower_proto(flow->detected_protocol)); ```	2025-08-01 10:05:50 +02:00
Ivan Nardi	c216c09e2c	fuzz: extend fuzzing coverage Remove some unused code	2025-06-24 15:04:35 +02:00
Ivan Nardi	978ca1ba1a	New API to enable/disable protocols. Removed `NDPI_LAST_IMPLEMENTED_PROTOCOL` (#2894 ) Change the API to enable/disable protocols: you can set that via the standard `ndpi_set_config()` function, like every other configuration parameter. By default, all protocols are enabled. Split the (local) context initialization into two phases: * `ndpi_init_detection_module()`: generic part. It does not depend on the configuration or on the protocols being enabled or not. It also calculates the real number of internal protocols * `ndpi_finalize_initialization()`: apply the configuration. All the initialization that depends on protocols being enabled or not must be put here. This is the last step to have the number of protocols fully calculated at runtime. Remove a (now) useless fuzzer. Important API changes: * remove `NDPI_LAST_IMPLEMENTED_PROTOCOL` define * remove `ndpi_get_num_internal_protocols()`. To get the number of configured protocols (internal and custom) you must use `ndpi_get_num_protocols()` after having called `ndpi_finalize_initialization()`	2025-06-23 11:24:18 +02:00
Ivan Nardi	2b3fdb4f8a	fuzz: try to improve coverage (#2883 ) Revert of `2b14b46df3`	2025-06-14 10:48:16 +02:00
Ivan Nardi	6da6991320	Rework sanity checks and remove some functions from API (#2882 )	2025-06-12 16:07:56 +02:00
Ivan Nardi	bcfa3f5477	Rename `ndpi_bitmask_dealloc` into `ndpi_bitmask_free`	2025-06-09 09:30:30 +02:00
Ivan Nardi	cbd7136b34	Remove `NDPI_PROTOCOL_BITMASK`; add a new generic bitmask data structure (#2871 ) The main difference is that the memory is allocated at runtime. Typical use case: ``` struct ndpi_bitmask b; ndpi_bitmask_alloc(&b, ndpi_get_num_internal_protocols()); ndpi_bitmask_set(&b, $BIT); ndpi_bitmask_is_set(&b, $BIT); [...] ndpi_bitmask_dealloc(&b); ``` See #2136	2025-06-09 09:00:17 +02:00
Ivan Nardi	5e54531282	Remove `ndpi_set_proto_defaults()` from the API (#2863 ) Add an explicit field to indicate if the protocol is custom or internal	2025-06-03 17:43:28 +02:00
Ivan Nardi	ed21057710	First step into a dynamic number of protocols (#2857 ) We want to get rid of the defines `NDPI_MAX_SUPPORTED_PROTOCOLS` and `NDPI_MAX_NUM_CUSTOM_PROTOCOLS`. You can use: ``` ndpi_get_num_protocols() ``` See #2136 Removed some unused functions from public API	2025-06-03 10:22:15 +02:00
Ivan Nardi	70a72f1638	New API to enable/disable protocols; remove `ndpi_set_protocol_detection_bitmask2()` (#2853 ) The main goal is not to have the bitmask depending on the total number of protocols anymore: `NDPI_INTERNAL_PROTOCOL_BITMASK` depends only on internal protocols, i.e. on `NDPI_MAX_INTERNAL_PROTOCOLS`, i.e. custom-defined protocols are not counted. See #2136 Keep the old data structure `NDPI_PROTOCOL_BITMASK` with the old semantic. Since we need to change the API (and all the application code...) anyway, simplify the API: by default all the protocols are enabled. If you need otherwise, please use `ndpi_init_detection_module_ext()` instead of `ndpi_init_detection_module()` (you can find an example in the `ndpiReader` code). To update the application code you likely only need to remove these 3 lines from your code: ``` - NDPI_PROTOCOL_BITMASK all; - NDPI_BITMASK_SET_ALL(all); - ndpi_set_protocol_detection_bitmask2(ndpi_str, &all); ``` Removed an unused field and struct definition.	2025-06-03 09:45:46 +02:00
Ivan Nardi	8df79a7354	Follow-up of `c1d372860` (TCP fingerprint format) (#2850 )	2025-05-26 12:32:47 +02:00
Ivan Nardi	3e2d69b92a	Follow-up of latest Signal call change (see: `4d41588a7`)	2025-04-05 14:22:05 +02:00
Ivan Nardi	f4691c518a	fuzz: extend coverage (#2786 )	2025-03-31 17:54:14 +02:00
Ivan Nardi	29eb89a88f	Improved configuration to enable/disable export of flow risk info (#2780 ) Follow-up of `f568313363`: now the configuration is for flow-risk, not global	2025-03-25 21:35:01 +01:00
Leonardo Teixeira Alves	c49d126d36	Add Autonomous System Organization to geoip (#2763 ) Co-authored-by: Leonardo Teixeira Alves <leonardo.alves@zerum.com>	2025-03-06 14:47:17 +01:00
Ivan Nardi	f568313363	Add configuration parameter to enable/disable export of flow risk info (#2761 ) For the most common protocols, avoid creating the string message if we are not going to use it	2025-03-05 16:14:03 +01:00
Ivan Nardi	8ee59bb9b9	fuzz: extend fuzzing coverage (#2750 )	2025-02-28 12:38:15 +01:00
Leonardo Teixeira Alves	3d0bfc7bfe	Add city as a geoip possibility (#2746 )	2025-02-24 19:41:02 +01:00
Ivan Nardi	2d3f08362e	RTP: payload type info should be set only for real RTP flows (#2742 )	2025-02-22 13:35:40 +01:00
Ivan Nardi	5f8545d97a	SSDP: add configuration for disabling metadata extraction (#2736 )	2025-02-17 15:16:37 +01:00
Luca	64d536752e	Compilation fix	2025-02-07 10:58:24 +01:00
Ivan Nardi	dd4807f8ee	bittorrent: add configuration for "hash" metadata (#2706 ) Fix confidence value for same TCP flows	2025-01-31 17:42:47 +01:00
Ivan Nardi	cf8f761b93	HTTP: add configuration for some metadata (#2704 ) Extend file configuration for just subclassification.	2025-01-31 16:26:53 +01:00
Ivan Nardi	ecf0f8ace3	Create a specific configuration for classification only (#2689 ) In some scenarios, you might not be interested in flow metadata or flow-risks at all, but you might want only flow (sub-)classification. Examples: you only want to forward the traffic according to the classification or you are only interested in some protocol statistics. Create a new configuration file (for `ndpiReader`, but you can trivially adapt it for the library itself) allowing exactly that. You can use it via: `ndpiReader --conf=example/only_classification.conf ...` Note that this way, the nDPI overhead is lower because it might need fewer packets per flow: * TLS: nDPI processes only the CH (in most cases) and not also the SH and certificates * DNS: only the request is processed (instead of both request and response) We might extend the same "shortcut-logic" (stop processing the flow immediately when there is a final sub-classification) to other protocols. Add the configuration options to enable/disable the extraction of some TLS metadata.	2025-01-31 15:10:30 +01:00
Ivan Nardi	d4fb7b0aa1	fuzz: extend fuzzing coverage (#2696 )	2025-01-23 15:23:01 +01:00
Ivan Nardi	af011e338e	TLS: remove JA3C (#2679 ) Last step of removing JA3C fingerprint Remove some duplicate tests: testing with ja4c/ja3s disabled is already performed by `disable_metadata_and_flowrisks` configuration. Close:#2551	2025-01-14 15:02:20 +01:00
Ivan Nardi	63a3547f99	Add (kind of) support for loading a list of JA4C malicious fingerprints (#2678 ) It might be useful to be able to match traffic against a list of suspicious JA4C fingerprints. Use the same code/logic/infrastructure used for JA3C (note that we are going to remove JA3C...) See: #2551	2025-01-14 12:05:03 +01:00
Ivan Nardi	bf830b4236	Add the ability to enable/disable every specific flow risks (#2653 )	2025-01-06 16:53:29 +01:00
Ivan Nardi	f20cec4985	fuzz: improve fuzzing coverage (#2642 ) Update pl7m code (Fix swap-direction mutation)	2024-12-11 16:41:35 +01:00
Ivan Nardi	cff8bd1bb2	Update `flow->flow_multimedia_types` to a bitmask (#2625 ) In the same flow, we can have multiple multimedia types	2024-11-25 10:12:48 +01:00
Ivan Nardi	43f7dc9ba0	fuzz: extend fuzzing coverage (#2626 )	2024-11-20 13:36:41 +01:00
Ivan Nardi	1bda2bf414	SIP: extract some basic metadata	2024-11-12 13:34:25 +01:00
Ivan Nardi	819291b7e4	Add configuration of TCP fingerprint computation (#2598 ) Extend configuration of raw format of JA4C fingerprint	2024-10-18 16:58:06 +02:00
Ivan Nardi	521d0ca7a0	Add monitoring capability (#2588 ) Allow nDPI to process entire flows and not only the first N packets. Useful when the application is interested in some metadata spanning the entire life of the session. As an initial step, only STUN flows can be put in monitoring. See `doc/monitoring.md` for further details. This feature is disabled by default. Close #2583	2024-10-14 18:05:35 +02:00
Liam Wilson	cdda369e92	Add enable/disable guessing using client IP/port (#2569 ) Add configurable options for whether to include client port or client IP in the flow's protocol guesses. This defaults to include both client port/IP if the protocol is not guessed with the server IP/port. This is intended for when flow direction detection is enabled, so we know that sport = client port, dport = server port.	2024-09-27 09:23:22 +02:00
Ivan Nardi	ddd08f913c	Add some heuristics to detect encrypted/obfuscated/proxied TLS flows (#2553 ) Based on the paper: "Fingerprinting Obfuscated Proxy Traffic with Encapsulated TLS Handshakes". See: https://www.usenix.org/conference/usenixsecurity24/presentation/xue-fingerprinting Basic idea: * the packets/bytes distribution of a TLS handshake is quite unique * this fingerprint is still detectable if the handshake is encrypted/proxied/obfuscated All heuristics are disabled by default.	2024-09-24 14:20:31 +02:00
Liam Wilson	80971e4a17	Allow IP guess before port in ndpi_detection_giveup (#2562 ) Add dpi.guess_ip_before_port which when enabled uses classification by-ip before classification by-port.	2024-09-20 10:25:41 +02:00
Ivan Nardi	0ddbda1f82	Add a heuristic to detect encrypted/obfuscated OpenVPN flows (#2547 ) Based on the paper: "OpenVPN is Open to VPN Fingerprinting" See: https://www.usenix.org/conference/usenixsecurity22/presentation/xue-diwen Basic idea: * the distribution of the first byte of the messages (i.e. the distribution of the op-codes) is quite unique * this fingerprint might still be detectable even if the OpenVPN packets are somehow fully encrypted/obfuscated The heuristic is disabled by default.	2024-09-16 18:38:26 +02:00
Nardi Ivan	85ebda434d	OpenVPN, Wireguard: improve sub-classification Allow sub-classification of OpenVPN/Wireguard flows using their server IP. That is useful to detect the specific VPN application/app used. At the moment, the supported protocols are: Mullvad, NordVPN, ProtonVPN. This feature is configurable.	2024-09-05 16:36:50 +02:00
Ivan Nardi	767f403e0d	fuzz: improve fuzzing coverage (#2535 ) Update pl7m code (fix a Use-of-uninitialized-value error and add GTP support)	2024-09-03 12:40:45 +02:00
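
Note on `efccc7d5e4` (Rework flow breed): the commit message describes `class_id` as a `u_int32_t` built as `(breed << 16) | category`. The sketch below only illustrates that arithmetic. The helper names `pack_class_id`, `class_id_breed` and `class_id_category` are hypothetical and are not part of the nDPI API; the 16-bit width of the category field is an assumption based on the old `u_int16_t` parameter type.

```c
#include <stdint.h>

/* Hypothetical helper (not an nDPI function): build a class_id as
 * described in the commit message: breed in the upper 16 bits,
 * category in the lower 16 bits. */
static uint32_t pack_class_id(uint16_t breed, uint16_t category) {
  return ((uint32_t)breed << 16) | (uint32_t)category;
}

/* Hypothetical helper: extract the breed (upper 16 bits). */
static uint16_t class_id_breed(uint32_t class_id) {
  return (uint16_t)(class_id >> 16);
}

/* Hypothetical helper: extract the category (lower 16 bits). */
static uint16_t class_id_category(uint32_t class_id) {
  return (uint16_t)(class_id & 0xFFFFu);
}
```

With a breed of 0, `pack_class_id(0, category)` equals the bare category value, which is consistent with the commit's statement that applications not interested in breeds do not need to change.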

99 commits