The issue about `config.txt` files is that they contains paths:
* to configuration files, which are in the source tree
* to the dynamic plugins, which are in the build tree
Solution:
* copy all configuration files into the build tree
* all those paths are about the build tree
* tests run from the build tree, no from the source tree anymore
We should pay attention to tell ndpiReader configuration files and
libnDPI configuration files!! Better solution?
Be sure that configuration files are located where they are expected.
In oss-fuzz enviroment we can't make any assumptions about the current
working directory of your fuzz target.
Initial work to support out-of-tree builds
```
./autogen.sh
mkdir build
cd build
../configure
make
make check
```
IMPORTANT: `autogen.sh` doesn't call `configure` automatically anymore!!
You have to do: `./autogen.sh && ./configure --$OPTIONS`.
A little bit annoying but the pattern `autogen && configure && make` is
very common on Linux.
Known issues:
* `make doc` doesn't work in out-of-tree builds, yet
* Windows/MinGW/DPDK (out-of-tree) builds have not been tested, so it is unlikely they work
See: #2992
It might be usefull to be able to match traffic against a list of
suspicious JA4C fingerprints
Use the same code/logic/infrastructure used for JA3C (note that we are
going to remove JA3C...)
See: #2551
Right now the CI takes ~30 minutes; the goal is to have it ending in
< 15 min.
The basic trick is to run the longer jobs (no_x86_64 and masan) only
with the recently updated pcaps. The same jobs will run again on schedule
(every night) testing all the traces.
This way the CI will be "green" (hopefully!) earlier while pushing new
commit/PR; full tests are simply delayed.
Details: when `NDPI_TEST_ONLY_RECENTLY_UPDATED_PCAPS` is set,
`tests/do.sh` checks only the latest 10 pcaps (i.e. the more recent pcap
added/updated) for *every* configuration.
Notes that no_x86_64 and masan jobs run twice: when pushing/merging and
on schedule (every night)
TODO: enable parallel tests when using docker with no-x86_64 archs.
When I tried the obviuos solutions:
```
NDPI_FORCE_PARALLEL_UTESTS=1 NDPI_SKIP_PARALLEL_BAR=1 make check VERBOSE=1
```
I got:
```
Run configuration "caches_cfg" [--cfg=lru.ookla.size,0 --cfg=lru.msteams.ttl,1]
ookla.pcap /bin/sh: 1: run_single_pcap: not found
teams.pcap /bin/sh: 1: run_single_pcap: not found
Run configuration "caches_global" [--cfg=lru.ookla.scope,1 --cfg=lru.bittorrent.scope,1 --cfg=lru.stun.scope,1 --cfg=lru.tls_cert.scope,1 --cfg=lru.mining.scope,1 --cfg=lru.msteams.scope,1 --cfg=lru.stun_zoom.scope,1]
bittorrent.pcap /bin/sh: 1: run_single_pcap: not found
lru_ipv6_caches.pcapng /bin/sh: 1: run_single_pcap: not found
mining.pcapng /bin/sh: 1: run_single_pcap: not found
...
```
Running unit tests is quite a bottleneck while developing or while
waiting for GitHub CI results...
Try to run the tests in parallel, using the `parallel` tool.
By default, tests still run one after the other, as usual; to enable
parallel execution you need `NDPI_FORCE_PARALLEL_UTESTS=1 ./tests/do.sh`
Please note that the output is quite different in parallel mode!
A big part of the script has been rewritten to avoid code dupication
between "serial" and "parallel" path
On my notebook:
```
ivan@ivan-Latitude-E6540:~/svnrepos/nDPI(parallel)$ time ./tests/do.sh
[...]
real 3m12,684s
[...]
ivan@ivan-Latitude-E6540:~/svnrepos/nDPI(parallel)$ time NDPI_FORCE_PARALLEL_UTESTS=1 ./tests/do.sh
[...]
real 0m58,463s
```
Add the concept of "global context".
Right now every instance of `struct ndpi_detection_module_struct` (we
will call it "local context" in this description) is completely
independent from each other. This provide optimal performances in
multithreaded environment, where we pin each local context to a thread,
and each thread to a specific CPU core: we don't have any data shared
across the cores.
Each local context has, internally, also some information correlating
**different** flows; something like:
```
if flow1 (PeerA <-> Peer B) is PROTOCOL_X; then
flow2 (PeerC <-> PeerD) will be PROTOCOL_Y
```
To get optimal classification results, both flow1 and flow2 must be
processed by the same local context. This is not an issue at all in the far
most common scenario where there is only one local context, but it might
be impractical in some more complex scenarios.
Create the concept of "global context": multiple local contexts can use
the same global context and share some data (structures) using it.
This way the data correlating multiple flows can be read/write from
different local contexts.
This is an optional feature, disabled by default.
Obviously data structures shared in a global context must be thread safe.
This PR updates the code of the LRU implementation to be, optionally,
thread safe.
Right now, only the LRU caches can be shared; the other main structures
(trees and automas) are basically read-only: there is little sense in
sharing them. Furthermore, these structures don't have any information
correlating multiple flows.
Every LRU cache can be shared, independently from the others, via
`ndpi_set_config(ndpi_struct, NULL, "lru.$CACHE_NAME.scope", "1")`.
It's up to the user to find the right trade-off between performances
(i.e. without shared data) and classification results (i.e. with some
shared data among the local contexts), depending on the specific traffic
patterns and on the algorithms used to balance the flows across the
threads/cores/local contexts.
Add some basic examples of library initialization in
`doc/library_initialization.md`.
This code needs libpthread as external dependency. It shouldn't be a big
issue; however a configure flag has been added to disable global context
support. A new CI job has been added to test it.
TODO: we should need to find a proper way to add some tests on
multithreaded enviroment... not an easy task...
*** API changes ***
If you are not interested in this feature, simply add a NULL parameter to
any `ndpi_init_detection_module()` calls.
This is the first step into providing (more) configuration options in nDPI.
The idea is to have a simple way to configure (most of) nDPI: only one
function (`ndpi_set_config()`) to set any configuration parameters
(in the present or on in the future) and we try to keep this function
prototype as agnostic as possible.
You can configure the library:
* via API, using `ndpi_set_config()`
* via a configuration file, in a text format
This way, anytime we need to add a new configuration parameter:
* we don't need to add two public functions (a getter and a setter)
* we don't break API/ABI compatibility of the library; even changing
the parameter type (from integer to a list of integer, for example)
doesn't break the compatibility.
The complete list of configuration options is provided in
`doc/configuration_parameters.md`.
As a first example, two configuration knobs are provided:
* the ability to enable/disable the extraction of the sha1 fingerprint of
the TLS certificates.
* the upper limit on the number of packets per flow that will be subject
to inspection
Move from PCRE to PCRE2. PCRE is EOL and won't receive any security
updates anymore. Convert to PCRE2 by converting any function PCRE2 new
API.
Also update every entry in github workflows and README to point to the
new configure flag. (--with-pcre2)
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Extend internal unit tests to handle multiple configurations.
As some examples, add tests about:
* disabling some protocols
* disabling Ookla aggressiveness
Every configurations data is stored in a dedicated directory under
`tests\cfgs`
Remove `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` define from
`fuzz/Makefile.am`; it is already included by the main configure script
(when fuzzing).
Add a knob to force disabling of AESNI optimizations: this way we can
fuzz also no-aesni crypto code.
Move CRC32 algorithm into the library.
Add some fake traces to extend fuzzing coverage. Note that these traces
are hand-made (via scapy/curl) and must not be used as "proof" that the
dissectors are really able to identify this kind of traffic.
Some small updates to some dissectors:
CSGO: remove a wrong rule (never triggered, BTW). Any UDP packet starting
with "VS01" will be classified as STEAM (see steam.c around line 111).
Googling it, it seems right so.
XBOX: XBOX only analyses UDP flows while HTTP only TCP ones; therefore
that condition is false.
RTP, STUN: removed useless "break"s
Zattoo: `flow->zattoo_stage` is never set to any values greater or equal
to 5, so these checks are never true.
PPStream: `flow->l4.udp.ppstream_stage` is never read. Delete it.
TeamSpeak: we check for `flow->packet_counter == 3` just above, so the
following check `flow->packet_counter >= 3` is always false.
Start using a dictionary for fuzzing (see:
https://llvm.org/docs/LibFuzzer.html#dictionaries).
Remove some dead code.
Fuzzing with debug enabled is not usually a great idea (from performance
POV). Keep the code since it might be useful while debugging.
Load some custom configuration (like in the unit tests) and factorize some
(fuzzing) common code.
There is no way to pass file paths to the fuzzers as parameters. The safe
solution seems to be to load them from the process working dir. Anyway,
missing file is not a blocking error.
Remove some dead code (found looking at the coverage report)
* fixed autoconf CFLAGS/LDFLAGS MSAN issue which could lead to build errors
* introduced portable version of gmtime_r aka ndpi_gmtime_r
* do as most as possible of the serialization work in ndpi_utils.c
* use flow2json in ndpiReader
Signed-off-by: lns <matzeton@googlemail.com>
Add (basic) internal stats to the main data structures used by the
library; they might be usefull to check how effective these structures
are.
Add an option to `ndpiReader` to dump them; enabled by default in the
unit tests.
This new option enables/disables dumping of "num dissectors calls"
values, too (see b4cb14ec).
* use -ltcmalloc_and_profiler and try to get rid of LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libprofiler.so
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
* fixed possible memory leak caused by an invalid call to `node_proto_guess_walker()` during serialization
* execute serialization code while running regression tests
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
GCC analyzer won't complain about possible use-after-free (false positive).
* tests/do.sh prints word diff's only once and not the same over and over again
* sync unit tests
Signed-off-by: lns <matzeton@googlemail.com>
* Improved ASN update script
* Ran `utils/update_every_lists.sh'
* `tests/do.sh.in' prints the amount of failed pcap(s)
* `utils/asn_update.sh' prints the amount of failed download(s)
Signed-off-by: lns <matzeton@googlemail.com>
* Removed Visual Studio leftovers. Maintaining an autotools project with VS integration requires some additional overhead.
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Signed-off-by: lns <matzeton@googlemail.com>
Now there is at least one flow under `tests/pcap` for 249 protocols out
of the 284 ones supported by nDPI.
The 35 protocols without any tests are:
* P2P/sharing protocols: DIRECT_DOWNLOAD_LINK, OPENFT, FASTTRACK,
EDONKEY, SOPCAST, THUNDER, APPLEJUICE, DIRECTCONNECT, STEALTHNET
* games: CSGO, HALFLIFE2, ARMAGETRON, CROSSFIRE, DOFUS, FIESTA,
FLORENSIA, GUILDWARS, MAPLESTORY, WORLD_OF_KUNG_FU
* voip/streaming: VHUA, ICECAST, SHOUTCAST, TVUPLAYER, TRUPHONE
* other: AYIYA, SOAP, TARGUS_GETDATA, RPC, ZMQ, REDIS, VMWARE, NOE,
LOTUS_NOTES, EGP, SAP
Most of these protocols (expecially the P2P and games ones) have been
inherited by OpenDPI and have not been updated since then: even if they
are still used, the detection rules might be outdated.
However code coverage (of `lib/protocols`) only increases from 65.6% to
68.9%.
Improve Citrix, Corba, Fix, Aimini, Megaco, PPStream, SNMP and Some/IP
dissection.
Treat IPP as a HTTP sub protocol.
Fix Cassandra false positives.
Remove `NDPI_PROTOCOL_QQLIVE` and `NDPI_PROTOCOL_REMOTE_SCAN`:
these protocol ids are defined but they are never used.
Remove Collectd support: its code has never been called. If someone is
really interested in this protocol, we can re-add it later, updating the
dissector.
Add decoding of PPI (Per-Packet Information) data link type.
* As there is now a builtin, lightweight libgcrypt
there is no need to disable tls-clho decryption.
* It is still possible to use a host libgcrypt
with `--with-local-libgcrypt'.
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Implementation borrowed from the
https://github.com/ARMmbed/mbedtls.git project (v3.1.0)
Speed testing (Xeon(R) CPU E3-1230 V2 @ 3.30GHz):
gcrypt-gnu Test md 2897 ms enc 2777 ms dec 942 ms
gcrypt-int Test md 3668 ms enc 1312 ms dec 2836 ms
gcrypt-int-noaesni Test md 3652 ms enc 1916 ms dec 4458 ms
gcrypt-gnu-nonopt Test md 3763 ms enc 4978 ms dec 3999 ms
gcrypt-gnu-nonopt - libgcrypt compiled without hardware acceleration
--disable-padlock-support --disable-aesni-support \
--disable-shaext-support --disable-pclmul-support \
--disable-sse41-support --disable-drng-support \
--disable-avx-support --disable-avx2-support \
--disable-neon-support --disable-arm-crypto-support \
--disable-ppc-crypto-support
--disable-amd64-as-feature-detection
* fixed several memory errors (heap-overflow, unitialized memory, etc)
* ability to build fuzz_process_packet with a main()
allowing to replay crash data generated with fuzz_process_packet
by LLVMs libfuzzer
* temporarily disable fuzzing if `tests/do.sh`
executed with env FUZZY_TESTING_ENABLED=1
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Zeroing large structures (i.e. size > KB) is quite costly (from a CPU point
of view): we can safely avoid doing that for a couple of big structures.
Standard and Valgrind tests have been diverging quite a lot: it is time
to re-sync them. Use the same script and enable Valgrind via an
enviroment variable:
NDPI_TESTS_VALGRIND=1 ./tests/do.sh