nDPI/python
Ivan Nardi 400cd516b5
Allow multiple struct ndpi_detection_module_struct to share some state (#2271)
Add the concept of "global context".

Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent of the others. This provides optimal performance in
multithreaded environments, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.

Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> PeerB) is PROTOCOL_X; then
  flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in what
is by far the most common scenario, where there is only one local
context, but it might be impractical in some more complex scenarios.

Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read and written from
different local contexts.
This is an optional feature, disabled by default.
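
The effect of sharing can be illustrated with a toy model in Python. This
is only a conceptual sketch, not nDPI code: `LocalContext`, `GlobalContext`
and the `hints` dictionary are hypothetical names, a plain dict stands in
for an LRU cache, and locking is deliberately left out.

```python
class GlobalContext:
    """Hypothetical container for state shared by several local contexts."""

    def __init__(self):
        self.hints = {}  # peer -> protocol expected on its future flows


class LocalContext:
    """Hypothetical stand-in for one detection module instance."""

    def __init__(self, global_ctx=None):
        # Without a global context, each instance keeps its own private hints.
        self.hints = global_ctx.hints if global_ctx is not None else {}

    def learn(self, peer, protocol):
        # Called while classifying flow1: remember what to expect from `peer`.
        self.hints[peer] = protocol

    def classify(self, peer):
        # Called for flow2: fall back to "unknown" when no hint is available.
        return self.hints.get(peer, "unknown")


# Independent local contexts: the hint learned by `a` is invisible to `b`.
a, b = LocalContext(), LocalContext()
a.learn("PeerC", "PROTOCOL_Y")
isolated_result = b.classify("PeerC")

# Local contexts attached to the same global context share the hint.
g = GlobalContext()
c, d = LocalContext(g), LocalContext(g)
c.learn("PeerC", "PROTOCOL_Y")
shared_result = d.classify("PeerC")
```

In this model `isolated_result` stays "unknown" because flow2 reached a
different local context than flow1, while `shared_result` benefits from the
hint stored through the common global context.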

Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.
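
As a rough illustration of "optionally thread safe", the following Python
sketch takes a lock only when the cache is declared shared. It is not a
port of the C implementation in this PR: the class name `LRUCache`, the
`shared` flag and the eviction policy details are assumptions made for the
example.

```python
import threading
from collections import OrderedDict
from contextlib import nullcontext


class LRUCache:
    """Small LRU cache whose locking can be switched on at creation time."""

    def __init__(self, capacity, shared=False):
        self._capacity = capacity
        self._entries = OrderedDict()
        # Only pay for locking when the cache is shared between threads;
        # nullcontext() is a no-op context manager for the private case.
        self._lock = threading.Lock() if shared else nullcontext()

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value
            self._entries.move_to_end(key)  # mark as most recently used
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # evict least recently used

    def get(self, key, default=None):
        with self._lock:
            if key not in self._entries:
                return default
            self._entries.move_to_end(key)  # a hit refreshes the entry
            return self._entries[key]

    def __len__(self):
        with self._lock:
            return len(self._entries)
```

A private cache (`shared=False`) keeps the lock-free fast path, which
mirrors the idea that local contexts pay no synchronization cost unless
they opt in to sharing.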

Right now, only the LRU caches can be shared; the other main structures
(trees and automata) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.

Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.

It's up to the user to find the right trade-off between performance
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.

Add some basic examples of library initialization in
`doc/library_initialization.md`.

This code needs libpthread as an external dependency. It shouldn't be a big
issue; however, a configure flag has been added to disable global context
support. A new CI job has been added to test it.

TODO: we need to find a proper way to add some tests for multithreaded
environments... not an easy task...

*** API changes ***

If you are not interested in this feature, simply add a NULL parameter to
every `ndpi_init_detection_module()` call.
2024-02-01 15:33:11 +01:00
..
ndpi Allow multiple struct ndpi_detection_module_struct to share some state (#2271) 2024-02-01 15:33:11 +01:00
DEV_GUIDE.md Update Python bindings guide. 2022-03-22 15:01:55 +01:00
dev_requirements.txt Complete rework of nDPI Python bindings (cffi API, automatic generation, packaging and CI integration) 2022-03-22 13:19:27 +01:00
ndpi_example.py Add support for flow client/server information (#1671) 2022-07-24 17:46:24 +02:00
README.md Add a note about required Python version. 2022-10-31 13:50:59 +01:00
requirements.txt Complete rework of nDPI Python bindings (cffi API, automatic generation, packaging and CI integration) 2022-03-22 13:19:27 +01:00
setup.py Fix supported versions. 2022-10-31 13:53:23 +01:00
tests.py Add support for flow client/server information (#1671) 2022-07-24 17:46:24 +02:00

ndpi

This package contains Python bindings for nDPI. nDPI is an Open and Extensible LGPLv3 Deep Packet Inspection Library.

ndpi is implemented using CFFI (out-of-line API mode). Consequently, it is fast and PyPy compliant.

Installation

Build nDPI

git clone --branch dev https://github.com/ntop/nDPI.git
cd nDPI
./autogen.sh
./configure
make
sudo make install

Install ndpi package

cd python
# IMPORTANT: the nDPI bindings require Python version >= 3.7
python3 -m pip install --upgrade pip
python3 -m pip install -r dev_requirements.txt
python3 -m pip install .

Usage

API

from ndpi import NDPI, NDPIFlow

nDPI = NDPI()

# Your per-flow processing here
# ...

ndpi_flow = NDPIFlow()
nDPI.process_packet(ndpi_flow, ip_bytes, time_ms)
nDPI.giveup(ndpi_flow) # If you want to guess it instead (DPI fallback)
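
The snippet above leaves `ip_bytes` and `time_ms` undefined. The helpers below are a minimal sketch of how they might be derived from a captured Ethernet frame and a capture timestamp in seconds. They assume, based only on the argument names, that process_packet expects the packet starting at the IP header and a timestamp in milliseconds; `ethernet_to_ip_bytes` and `to_time_ms` are hypothetical helpers, not part of the ndpi package.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_VLAN = 0x8100
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def ethernet_to_ip_bytes(frame):
    """Return the IP packet carried by an Ethernet frame, or None."""
    if len(frame) < ETH_HEADER_LEN:
        return None
    # The EtherType field follows the two 6-byte MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset = ETH_HEADER_LEN
    # Skip a single 802.1Q VLAN tag (2-byte TCI + 2-byte inner EtherType).
    if ethertype == ETHERTYPE_VLAN:
        if len(frame) < offset + 4:
            return None
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if ethertype not in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return None  # not IP traffic (e.g. ARP)
    return frame[offset:]


def to_time_ms(timestamp_seconds):
    """Convert a capture timestamp in seconds to integer milliseconds."""
    return int(timestamp_seconds * 1000)
```

Frames for which `ethernet_to_ip_bytes` returns None would simply be skipped before calling the bindings.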

Example Application

ndpi_example.py is provided to demonstrate how ndpi can be integrated within your Python application.

Using nDPI 4.3.0-3532-8dd70b70
usage: ndpi_example.py [-h] [-u] input

positional arguments:
  input                 input pcap file path

optional arguments:
  -h, --help            show this help message and exit
  -u, --include-unknowns

Example with a Skype capture file

python3 ndpi_example.py -u ../tests/pcap/skype.pcap

The provided example is for demo purposes only. For additional features (live capture, multiplatform support, multiprocessing, ML-based classification, system visibility, etc.), please check the nDPI-based framework, NFStream.

License

This project is licensed under the LGPLv3 License - see the License file for details.