Move docs to nfstream website.

Zied Aouini 2020-03-29 21:44:32 +02:00
parent b48b33bb6c
commit 477143d7e8
14 changed files with 6 additions and 913 deletions


@ -1,4 +1,4 @@
<p align="center"><a href="https://nfstream.github.io/"><img width=35% alt="" src="https://raw.githubusercontent.com/aouinizied/nfstream/master/docs/source/asset/logo_main.png?raw=true"></a></p>
<p align="center"><a href="https://nfstream.github.io/"><img width=35% alt="" src="https://raw.githubusercontent.com/aouinizied/nfstream/master/logo_main.png?raw=true"></a></p>
<h1 align="center">nfstream: a flexible network data analysis framework</h1>
[**nfstream**][repo] is a Python package providing fast, flexible, and expressive data structures designed to make working with **online** or **offline** network data both easy and intuitive. It aims to be the fundamental high-level building block for
@ -67,14 +67,6 @@ the broader goal of becoming **a common network data processing framework for re
</a>
</td>
</tr>
<tr>
<td><b>Documentation Status</b></td>
<td>
<a href="https://nfstream.readthedocs.io/en/latest/?badge=latest">
<img src="https://readthedocs.org/projects/nfstream/badge/?version=latest" alt="ReadTheDocs" />
</a>
</td>
</tr>
<tr>
<td><b>Code Quality</b></td>
<td>
@ -235,13 +227,13 @@ As with any packet monitoring tool, **nfstream** could potentially be misused.
This project is licensed under the GPLv3 License - see the [**License**][license] file for details
[license]: https://github.com/aouinizied/nfstream/blob/master/LICENSE
[contribute]: https://nfstream.readthedocs.io/en/latest/contributing.html
[contribute]: https://nfstream.github.io/docs/community
[contributors]: https://github.com/aouinizied/nfstream/graphs/contributors
[linkedin]: https://www.linkedin.com/in/dr-zied-aouini
[github]: https://github.com/aouinizied
[documentation]: https://nfstream.readthedocs.io/en/latest/index.html
[ndpi]: https://github.com/ntop/nDPI
[nfplugin]: https://nfstream.readthedocs.io/en/latest/plugins.html
[documentation]: https://nfstream.github.io/
[ndpi]: https://nfstream.github.io/docs/visibility
[nfplugin]: https://nfstream.github.io/docs/api#nfplugin
[reliable]: http://people.ac.upc.edu/pbarlet/papers/ground-truth.pam2014.pdf
[repo]: https://github.com/aouinizied/nfstream
[demo]: https://mybinder.org/v2/gh/aouinizied/nfstream-tutorials/master?filepath=demo_notebook.ipynb


@ -1,20 +0,0 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


@ -1,35 +0,0 @@
@ECHO OFF
pushd %~dp0
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end
:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
:end
popd


@ -1,94 +0,0 @@
############
Architecture
############
.. image:: asset/arch.png
:scale: 100%
:align: center
This section walks step by step through each process involved in flow monitoring.
The aim is to recall how things work in theory, which makes nfstream's features
and implementation easier to understand.
******************
Packet observation
******************
Packet observation is the starting point of a flow monitoring architecture and
therefore a key stage. The steps involved in this phase are detailed below:
**Packet capture:** This step is performed at the Network Interface Card (NIC) level.
After passing various checks (e.g. checksum verification), packets stored in on-card
reception buffers are moved to the memory of the hosting device. Several libraries are
available to capture network traffic, such as libpcap for UNIX-based operating systems
and WinPcap for Windows. These libraries run on top of the operating system stack, so
packets pass through several layers, which may reduce performance.
To overcome this limitation in high-speed networks, software optimization
techniques have been proposed and could be considered (e.g. Intel DPDK, PF_RING, netmap).
**Timestamping:** As packets may come from several observation points, they are
reordered based on their timestamps. Hardware timestamping provides high accuracy
(up to 100 nanoseconds with the IEEE 1588 protocol), but it is not supported by most
commodity NICs. Software timestamping is widely used to make up for this, providing
an accuracy of up to 100 microseconds.
**Truncation (optional):** Given a snapshot length, the process keeps only the first
bytes of each packet. It is performed in some cases to reduce the amount of data
captured by the probe, and therefore the CPU and bus bandwidth load.
**Packet sampling (optional):** Sampling is generally performed to reduce the load on
subsequent stages. It can be systematic (periodic sampling scheme) or random. The latter
is recommended, as a periodic scheme may introduce unwanted correlation in the observed
network data.
**Packet filtering (optional):** Filtering separates packets that have specific
properties from those that do not. A packet is selected if some specific fields are
equal to, or within the range of, given values. Another technique is hash-based
filtering: a hash function is applied to a portion of the packet, and the result is
compared to a value or a range of values.
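The sampling and filtering steps above can be sketched in a few lines of Python. This is only an illustration and not nfstream code: the function names, the use of CRC32 as the hash function, and the choice of hashing the first 20 bytes are all assumptions made for the example.

```python
import random
import zlib


def systematic_sample(packets, period):
    """Periodic scheme: keep every `period`-th packet."""
    return [pkt for i, pkt in enumerate(packets) if i % period == 0]


def random_sample(packets, probability, seed=None):
    """Random scheme: keep each packet independently with the given probability."""
    rng = random.Random(seed)
    return [pkt for pkt in packets if rng.random() < probability]


def hash_filter(packets, buckets, selected):
    """Hash-based filtering: hash a portion of each packet (here its first
    20 bytes, an arbitrary choice) and keep the packet when the hash value
    falls into one of the selected buckets."""
    return [pkt for pkt in packets if zlib.crc32(pkt[:20]) % buckets in selected]


# Ten dummy 64-byte packets stand in for captured traffic.
packets = [bytes([i]) * 64 for i in range(10)]
print(len(systematic_sample(packets, period=5)))  # 2 (packets 0 and 5)
```

Because the hash depends only on packet content, several observation points applying the same `hash_filter` would select the same packets, which is one reason this technique is used.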
*************
Flow metering
*************
Flow metering covers the aggregation of packets into flows and the management of flow entry expiration.
First, the metering process associates each packet with a flow entry using a defined key.
Second, it aggregates packets into the flow entry based on a set of metrics.
Then, the flow entry is cached until it is considered terminated (entry expiration).
Finally, optional steps such as flow sampling and filtering may be performed.
**Flow Cache:** The flow cache consists of a table in which the metering process stores
information about the active flows in the network. A flow key (typically source
and destination IP addresses, source and destination ports, protocol, and VLAN
identifier) determines whether a packet matches an existing flow entry in the cache.
If it does, the flow counters are updated; otherwise, a new entry is created.
Non-key fields are used to collect flow metrics (e.g. packet and byte counts).
If IP addresses are part of the flow key, traffic between two hosts generates flows
in both directions. A flow is defined as bidirectional when a pair and its reverse
belong to the same entry. The cache size depends on the memory capacity of the
exporter device and should be configured based on criteria such as the key/non-key
fields, the maximum number of expected flows, and the expiration policy.
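A bidirectional flow cache can be sketched as a dictionary keyed by a direction-independent flow key. The `Packet` tuple, `flow_key`, and `update_cache` below are hypothetical names introduced for this illustration; they do not reflect how nfstream implements its cache.

```python
from collections import namedtuple

# Hypothetical packet record holding only the fields used by the flow key.
Packet = namedtuple("Packet", "src_ip dst_ip src_port dst_port protocol vlan_id size")


def flow_key(pkt):
    """Build a direction-independent key: the two (ip, port) endpoints are
    sorted so that a packet and its reverse map to the same cache entry."""
    endpoints = sorted([(pkt.src_ip, pkt.src_port), (pkt.dst_ip, pkt.dst_port)])
    return (endpoints[0], endpoints[1], pkt.protocol, pkt.vlan_id)


def update_cache(cache, pkt):
    """Update the counters of the matching entry, creating it if needed."""
    entry = cache.setdefault(flow_key(pkt), {"packets": 0, "bytes": 0})
    entry["packets"] += 1
    entry["bytes"] += pkt.size
    return entry


cache = {}
update_cache(cache, Packet("10.0.0.1", "10.0.0.2", 1234, 80, 6, 0, 60))
update_cache(cache, Packet("10.0.0.2", "10.0.0.1", 80, 1234, 6, 0, 1500))
print(len(cache))  # 1: both directions share a single entry
```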
**Entry expiration:** Cache entries are maintained in the cache table until they are
considered terminated. Termination of a flow is triggered by an expiration event.
The metering process should consider an entry as expired based on:
* Natural expiration: a TCP packet with the FIN or RST flag is observed for the flow.
* Active timeout: a flow entry expires after having been active for a certain period.
* Idle timeout: a flow entry expires if no packets belonging to it are observed during a specific period.
The expiration policy of the metering process can be configured to
reduce the number of records exported.
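The three expiration rules can be expressed as a single check. This is a sketch under stated assumptions: `expiration_reason` is a hypothetical helper, times are in seconds, and the default timeouts simply reuse the 30 and 300 second values that appear as `idle_timeout` and `active_timeout` defaults elsewhere in these docs.

```python
def expiration_reason(first_seen, last_seen, now,
                      idle_timeout=30, active_timeout=300,
                      tcp_fin_or_rst=False):
    """Return why a flow entry should expire, or None if it stays cached."""
    if tcp_fin_or_rst:
        # Natural expiration: a FIN or RST flag was observed for the flow.
        return "natural"
    if now - last_seen >= idle_timeout:
        # Idle timeout: no packet observed for too long.
        return "idle"
    if now - first_seen >= active_timeout:
        # Active timeout: the flow has been active for too long.
        return "active"
    return None


print(expiration_reason(first_seen=0, last_seen=10, now=45))  # idle
```

Lower timeouts export records sooner but produce more of them; higher timeouts do the opposite, which is the trade-off the expiration policy controls.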
**Flow Sampling and Filtering:** Flow sampling and filtering are much like the packet sampling and filtering
processes explained above. The major difference is the unit processed: packet sampling and filtering process
packets, whereas flow sampling and filtering process flow records coming from the metering process.
******
Export
******
Export involves two main steps: **formatting** and the **export protocol**. The first decides how an export is
formatted (number of flows per export, JSON or another format, etc.), while the latter determines the
export protocol used (file, mqtt, zmq, etc.).
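The formatting step can be illustrated with a small sketch that groups flow records into JSON messages. The `format_export` helper and the record fields are hypothetical; the transport (file, mqtt, zmq) is deliberately left out.

```python
import json


def format_export(flows, flows_per_export):
    """Group flow records into JSON messages of at most
    `flows_per_export` flows each."""
    return [json.dumps(flows[i:i + flows_per_export])
            for i in range(0, len(flows), flows_per_export)]


flows = [{"id": i, "total_packets": 10 * (i + 1)} for i in range(3)]
messages = format_export(flows, flows_per_export=2)
print(len(messages))  # 2: one message with two flows, one with the remaining flow
```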

Binary file not shown.



@ -1,166 +0,0 @@
#########
Changelog
#########
**3.2.2 (2020-02-29)**
* Add to_pandas method.
* Live demo using binder
* Fix previous broken package.
**3.2.1 (2020-02-29)**
* Add libpcap to built libs.
* Packet parsing improvements (expose several packet size levels).
* Observer code refactoring.
* Improve nDPI structures handling.
**3.2.0 (2020-02-02)**
* nDPI 3.2 support.
* Fix metadata extraction issues.
**3.1.2 (2020-01-07)**
* Fix tests workflow.
* Update nDPI (commit: 73c7ccdb65a1e13e3fb1726af7882dd34534906f).
**3.1.1 (2019-12-29)**
* Fix generated wheels. (drop sdist)
**3.1.0 (2019-12-29)**
* Initial support for nDPI3.1 (commit: 73c7ccdb65a1e13e3fb1726af7882dd34534906f).
* Add wrapping for pandas.
* pypy7.2 support.
* Add py36, py38 for macOS wheels.
* Move continuous integration to GitHub Actions.
**3.0.4 (2019-12-18)**
* Fix pypi description rendering.
**3.0.3 (2019-12-18)**
* MacOS Catalina support.
* Implement random port selection for zmq.
**3.0.2 (2019-12-06)**
* ether type double stacking implementation.
* Minor fixes.
**3.0.1 (2019-12-04)**
* Fix macOS wheels 10.14
**3.0.0 (2019-12-04)**
* Sync with nDPI major.minor versions.
* New NFPlugin API definition.
* Fix macOS wheels for 10.13 and 10.14
**2.0.1 (2019-11-29)**
* Fix pypy3 wheel.
**2.0.0 (2019-11-28)**
* Pypy support.
* Major performances improvements.
* NFPlugin as main extension API.
* nDPI memory usage improved.
* nDPI implemented using cffi.
* tcp_max_dissections, udp_max_dissections options.
* NFFlow dynamic attributes creation.
* HTTP, SSH, DNS client and server information extraction.
* FlowCache management implemented in pure Python.
**1.2.1 (2019-11-15)**
* Fix ndpi padding and alignment issues.
* nDPI3.1 compatibility.
**1.2.0 (2019-11-14)**
* Fix ndpi bindings.
* Add TLS dissection features (server sni, client sni, version, organization, expiration dates)
* Improve documentation.
**1.1.8 (2019-11-07)**
* Fix ndpi wrap missing fields.
* Add host_server_name metric.
* Update doc.
**1.1.7 (2019-11-07)**
* Fix minor bugs.
**1.1.6 (2019-11-03)**
* TCP flags extraction.
* Minor bug fixes.
**1.1.5 (2019-11-02)**
* Add BPF filtering feature.
* Fix radiotap parsing.
**1.1.2-3-4 (2019-11-01)**
* Fix broken macos wheels on pypi.
**1.1.1 (2019-11-01)**
* Fix broken linux wheels on pypi.
* Py38 compatibility.
**1.1.0 (2019-11-01)**
* Add OSX support.
**1.0.1-2-3 (2019-10-31)**
* Fix deployment CI
**1.0.0 (2019-10-30)**
* cffi based packet capture.
* fast parsing mechanism.
* Minor bug fixes.
* auto-generate binaries.
**0.5.0 (2019-10-21)**
* Classifier mechanism introduced.
* Custom export_reason.
* Fix minor bugs.
* Improve documentation.
**0.4.0 (2019-10-20)**
* Pypi package description readable.
**0.3.1 (2019-10-20)**
* Add category_name as flow feature.
**0.3.0 (2019-10-20)**
* Add user defined callbacks feature.
* Fix live capture handling.
* Fix library loading path.
* Json support for flow printing.
* Add examples.
**0.2.0 (2019-10-19)**
* Add nDPI bindings as part of the released package
* Documentation improvement
**0.1.0 (2019-10-19)**
* First release on PyPI.


@ -1,53 +0,0 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# -- Project information -----------------------------------------------------
project = 'nfstream'
copyright = '2019, Zied Aouini'
author = 'Zied Aouini'
# The full version, including alpha/beta/rc tags
release = '3.2.2'
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = []
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
master_doc = 'index'


@ -1,113 +0,0 @@
############
Contributing
############
Contributions are welcome, and they are greatly appreciated! Every little bit
helps, and credit will always be given.
You can contribute in many ways:
*********************
Types of contribution
*********************
**Report bugs**
Report bugs at https://github.com/aouinizied/nfstream/issues.
If you are reporting a bug, please include:
* Your operating system name and version.
* Any details about your local setup that might be helpful in troubleshooting.
* Detailed steps to reproduce the bug.
* pcap file if you are reporting a bug on offline mode
**Fix bugs**
Look through the GitHub issues for bugs. Anything tagged with "bug" and "help
wanted" is open to whoever wants to implement it.
**Implement features**
Look through the GitHub issues for features. Anything tagged with "enhancement"
and "help wanted" is open to whoever wants to implement it.
**Write documentation**
nfstream could always use more documentation, whether as part of the
official nfstream docs, in docstrings, or even on the web in blog posts,
articles, and such.
**Submit feedback**
The best way to send feedback is to file an issue at https://github.com/aouinizied/nfstream/issues.
If you are proposing a feature:
* Explain in detail how it would work.
* Keep the scope as narrow as possible, to make it easier to implement.
* Remember that this is a volunteer-driven project, and that contributions
are welcome.
***********
Get started
***********
Ready to contribute? Here's how to set up nfstream for local development.
1. Fork the nfstream repo on GitHub.
2. Clone your fork locally::

    $ git clone git@github.com:your_name_here/nfstream.git

3. Install your local copy into a virtualenv. Here is an example of how to set up your fork for local development with Python 3.6::

    $ cd nfstream
    $ virtualenv venv-nfstream-py36 -p /usr/bin/python3.6
    $ source venv-nfstream-py36/bin/activate
    $ python setup.py develop

4. Create a branch for local development::

    $ git checkout -b name-of-your-bugfix-or-feature

5. When you're done making changes, check that your changes pass the
   tests (run them as root to trigger live capture testing)::

    $ python tests.py

6. Commit your changes and push your branch to GitHub::

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature

7. Submit a pull request through the GitHub website.
***********************
Pull request guidelines
***********************
Before you submit a pull request, check that it meets these guidelines:
1. The pull request should include tests.
2. If the pull request adds functionality, the docs should be updated. Put
your new functionality into a function with a docstring, and add the
feature to the list in README.rst.
3. The pull request should work for Python 3.6, 3.7, and 3.8. Check
https://travis-ci.org/aouinizied/nfstream/pull_requests
and make sure that the tests pass for all supported Python versions.
*********
Deploying
*********
A reminder for the maintainers on how to deploy.
Make sure all your changes are committed (including an entry in /docs/source/changelog.rst).
Then run::
$ bumpversion patch
$ git push
$ git push --tags
Travis will then deploy to PyPI if tests pass.


@ -1,227 +0,0 @@
#########################
Get started with nfstream
#########################
Dealing with a big pcap file and just want to aggregate it into network flows?
nfstream makes this easy in a few lines:
.. code-block:: python

    from nfstream import NFStreamer

    my_capture_streamer = NFStreamer(source="facebook.pcap",
                                     snaplen=65535,
                                     idle_timeout=30,
                                     active_timeout=300,
                                     plugins=(),
                                     dissect=True,
                                     max_tcp_dissections=10,
                                     max_udp_dissections=16)

    my_live_streamer = NFStreamer(source="eth1")  # or capture from a network interface

    for flow in my_capture_streamer:  # or: for flow in my_live_streamer
        print(flow)  # print it, send it to Kafka, or whatever you want :)

From pcap to pandas dataframe?
.. code-block:: python

    from nfstream import NFStreamer

    my_df = NFStreamer(source="facebook.pcap").to_pandas()
    my_df.head()

*****************
NFStreamer object
*****************
* ``source`` [default= ``None`` ]
- Source of packets. Can be ``live_interface_name`` or ``pcap_file_path``.
* ``snaplen`` [default= ``65535`` ]
- Packet capture length.
* ``idle_timeout`` [default= ``30`` ]
- Flows that are inactive for more than this value in seconds will be exported.
* ``active_timeout`` [default= ``300`` ]
- Flows that are active for more than this value in seconds will be exported.
* ``plugins`` [default= ``()`` ]
- Set of user defined NFPlugins.
* ``dissect`` [default= ``True`` ]
- Enable nDPI deep packet inspection library for Layer 7 visibility.
* ``max_tcp_dissections`` [default= ``10`` ]
- Maximum per flow TCP packets to dissect (ignored when dissect=False).
* ``max_udp_dissections`` [default= ``16`` ]
- Maximum per flow UDP packets to dissect (ignored when dissect=False).
NFStreamer returns an iterator of **NFEntry** objects.
**************
NFEntry object
**************
.. list-table:: NFEntry object
:widths: 25 25 50
:header-rows: 1
* - attribute name
- attribute type
- attribute description
* - id
- int
- Flow identifier.
* - first_seen
- int
- First packet timestamp in milliseconds.
* - last_seen
- int
- Last packet timestamp in milliseconds.
* - version
- int
- IP version.
* - src_port
- int
- Transport layer source port.
* - dst_port
- int
- Transport layer destination port.
* - protocol
- int
- Transport layer protocol.
* - vlan_id
- int
- Virtual LAN identifier.
* - src_ip
- str
- Source IP address string representation.
* - dst_ip
- str
- Destination IP address string representation.
* - ip_src
- int
- Source IP address int value. [``volatile``]
* - ip_dst
- int
- Destination IP address int value. [``volatile``]
* - total_packets
- int
- Flow packets accumulator.
* - total_bytes
- int
- Flow bytes (full packet length) accumulator.
* - duration
- int
- Flow duration in milliseconds.
* - src2dst_packets
- int
- Flow packets accumulator (source->destination).
* - src2dst_bytes
- int
- Flow bytes (full packet length) accumulator (source->destination).
* - dst2src_packets
- int
- Flow packets accumulator (destination->source).
* - dst2src_bytes
- int
- Flow bytes (full packet length) accumulator (destination->source).
* - expiration_id
- int
- Identifier of flow expiration trigger. Can be ``0`` for idle_timeout, ``1`` for active_timeout or 'negative' for custom expiration.
* - master_protocol
- int
- nDPI master protocol identifier.
* - app_protocol
- int
- nDPI app protocol identifier.
* - application_name
- str
- nDPI application name.
* - category_name
- str
- nDPI application category name.
* - client_info
- str
- Dissected client information. Can be ``http_detected_os`` for HTTP, ``client_signature`` for SSH or ``client_requested_server_name`` for SSL.
* - server_info
- str
- Dissected server information. Can be ``host_server_name`` for HTTP or DNS, ``server_signature`` for SSH or ``server_names`` for SSL.
* - j3a_client
- str
- `JA3 <https://github.com/salesforce/ja3>`__ client fingerprint.
* - j3a_server
- str
- `JA3 <https://github.com/salesforce/ja3>`__ server fingerprint.
**NFEntry** is an aggregation of **NFPacket** objects.
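The ``expiration_id`` values listed in the table can be turned into readable labels with a small helper. This is a hypothetical convenience function, not part of nfstream; the fallback label for other non-negative values is an assumption, since the table only documents ``0``, ``1`` and negative identifiers.

```python
def describe_expiration(expiration_id):
    """Map an expiration_id to a label, following the table above."""
    if expiration_id < 0:
        # Negative identifiers are reserved for custom expiration.
        return "custom"
    # Anything other than 0 or 1 is not documented here (assumed "unknown").
    return {0: "idle_timeout", 1: "active_timeout"}.get(expiration_id, "unknown")


print(describe_expiration(0))   # idle_timeout
print(describe_expiration(-1))  # custom
```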
***************
NFPacket object
***************
.. list-table:: NFPacket object
:widths: 25 25 50
:header-rows: 1
* - attribute name
- attribute type
- attribute description
* - time
- int
- Packet timestamp in milliseconds.
* - raw_size
- int
- Packet raw size.
* - ip_size
- int
- IP packet size.
* - transport_size
- int
- Transport packet size.
* - payload_size
- int
- Packet payload size.
* - ip_src
- int
- Source IP address int value.
* - ip_dst
- int
- Destination IP address int value.
* - src_port
- int
- Transport layer source port.
* - dst_port
- int
- Transport layer destination port.
* - protocol
- int
- Transport layer protocol.
* - vlan_id
- int
- Virtual LAN identifier.
* - version
- int
- IP version.
* - tcp_flags
- int
- Packet observed TCP flags.
* - ip_packet
- bytes
- Raw content starting from IP Header.
* - direction
- int
- Packet direction: ``0`` for src_to_dst and ``1`` for dst_to_src.
.. _J3A: https://github.com/salesforce/ja3
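The integer ``tcp_flags`` attribute can be decoded into flag names. The sketch below assumes the value uses the standard TCP header bit layout (FIN = 0x01, SYN = 0x02, RST = 0x04, PSH = 0x08, ACK = 0x10, URG = 0x20); this page does not confirm the encoding, and the helper names are hypothetical.

```python
# Standard TCP header flag bits; assumed to match the tcp_flags attribute.
TCP_FLAG_BITS = (
    ("FIN", 0x01),
    ("SYN", 0x02),
    ("RST", 0x04),
    ("PSH", 0x08),
    ("ACK", 0x10),
    ("URG", 0x20),
)


def decode_tcp_flags(tcp_flags):
    """Return the names of the flags set in the given integer."""
    return [name for name, bit in TCP_FLAG_BITS if tcp_flags & bit]


def ends_flow(tcp_flags):
    """True when FIN or RST is set, i.e. a natural expiration candidate."""
    return bool(tcp_flags & (0x01 | 0x04))


print(decode_tcp_flags(0x12))  # ['SYN', 'ACK']
```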


@ -1,42 +0,0 @@
.. nfstream documentation master file, created by
sphinx-quickstart on Sat Oct 19 16:26:59 2019.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
####################################################
nfstream: a flexible network data analysis framework
####################################################
.. image:: asset/logo_main.png
:width: 140
:height: 140
:align: right
**nfstream** is a Python package providing fast, flexible, and expressive data structures designed to make working with **online** or **offline** network data both easy and intuitive. It aims to be the fundamental high-level building block for
doing practical, **real world** network data analysis in Python. Additionally, it has
the broader goal of becoming **a common network data processing framework for researchers** providing data reproducibility across experiments.
**Main Features**
* **Performance:** **nfstream** is designed to be fast (x10 faster with pypy3 support) with a small CPU and memory footprint.
* **Layer-7 visibility:** The **nfstream** deep packet inspection engine is based on the nDPI_ library. It allows nfstream to perform reliable_ encrypted application identification and metadata extraction (e.g. TLS, SSH, DNS, HTTP).
* **Flexibility:** add a flow feature in 2 lines as an NFPlugin_.
* **Machine Learning oriented:** add your trained model as an NFPlugin_.
.. toctree::
:maxdepth: 2
:caption: Table of Contents:
installation
architecture
get_started
plugins
contributing
changelog
.. _nDPI: https://www.ntop.org/products/deep-packet-inspection/ndpi/
.. _NFPlugin: https://nfstream.readthedocs.io/en/latest/plugins.html
.. _reliable: http://people.ac.upc.edu/pbarlet/papers/ground-truth.pam2014.pdf


@ -1,36 +0,0 @@
###################
Installing nfstream
###################
************
Installation
************
**using pip**
Binary installers for the latest released version are available:
.. code-block:: bash
python3 -m pip install nfstream
**from source: linux**
.. code-block:: bash
sudo apt-get install autoconf automake libtool pkg-config libpcap-dev
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel
**from source: macos**
.. code-block:: bash
brew install autoconf automake libtool pkg-config
git clone https://github.com/aouinizied/nfstream.git
cd nfstream
python3 -m pip install -r requirements.txt
python3 setup.py bdist_wheel


@ -1,109 +0,0 @@
##################
Extending nfstream
##################
nfstream is designed to be flexible and machine learning oriented. In the following section, we depict the use of NFPlugin
in both cases.
.. code-block:: python

    from nfstream import NFPlugin, NFStreamer

    class my_awesome_plugin(NFPlugin):
        def on_update(self, obs, entry):
            # Count the packets whose raw size is at least 666 bytes.
            if obs.raw_size >= 666:
                entry.my_awesome_plugin += 1

    streamer_awesome = NFStreamer(source='devil.pcap', plugins=[my_awesome_plugin()])
    for flow in streamer_awesome:
        print(flow.my_awesome_plugin)  # your dynamically created metric now appears in generated flows

*******************
NFPlugin parameters
*******************
* ``name`` [default= ``class name`` ]
- Plugin name. Must be unique as it's dynamically created as a flow attribute.
* ``volatile`` [default= ``False`` ]
- A volatile plugin is available only while the flow is being processed. At flow expiration, the plugin is automatically removed (it will not appear as a flow attribute).
* ``user_data`` [default= ``None`` ]
- user_data passed to the plugin. Example: external module, pickled sklearn model, etc.
****************
NFPlugin methods
****************
* ``on_init(self, obs)`` [default= ``return 0`` ]
- Method called at entry creation. When aggregating packets into flows, this method is called when an ``NFEntry`` object is created, based on the first ``NFPacket`` object belonging to it.
* ``on_update(self, obs, entry)`` [default= ``pass`` ]
- Method called to update each entry with an obs belonging to it. When aggregating packets into flows, the entry is an ``NFEntry`` object and the obs is an ``NFPacket`` object.
* ``on_expire(self, entry)`` [default= ``pass`` ]
- Method called at entry expiration. When aggregating packets into flows, the entry is an ``NFEntry`` object.
* ``cleanup(self)`` [default= ``pass`` ]
- Method called for plugin cleanup.
In the following, we want to run an early classification of flows based on a trained machine learning model that takes
as features the sizes of the first 3 packets of a flow.
***************************
Computing required features
***************************
.. code-block:: python

    from nfstream import NFPlugin

    class feat_1(NFPlugin):
        def on_init(self, obs):
            # The value returned by on_init initializes the plugin attribute
            # (the default implementation returns 0).
            return obs.raw_size

    class feat_2(NFPlugin):
        def on_update(self, obs, entry):
            if entry.total_packets == 2:
                entry.feat_2 = obs.raw_size  # size of the second packet

    class feat_3(NFPlugin):
        def on_update(self, obs, entry):
            if entry.total_packets == 3:
                entry.feat_3 = obs.raw_size  # size of the third packet

************************
Trained model prediction
************************
.. code-block:: python

    class model_prediction(NFPlugin):
        def on_update(self, obs, entry):
            if entry.total_packets == 3:
                # scikit-learn style models expect a 2D input (one row per sample).
                entry.model_prediction = self.user_data.predict_proba([[entry.feat_1, entry.feat_2, entry.feat_3]])
                # optionally, we can force NFStreamer to expire the flow immediately
                # entry.expiration_id = -1

***********************
Start your new streamer
***********************
.. code-block:: python

    from nfstream import NFStreamer

    my_model = function_to_load_your_model()  # or whatever
    ml_streamer = NFStreamer(source='devil.pcap',
                             plugins=[feat_1(volatile=True),
                                      feat_2(volatile=True),
                                      feat_3(volatile=True),
                                      model_prediction(user_data=my_model)
                                      ])
    for flow in ml_streamer:
        print(flow.model_prediction)  # your trained model prediction is now part of the flow :)

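To try the plugin chain without a trained model, any object exposing a ``predict_proba`` method can be passed as ``user_data``. The class below is a hypothetical stand-in written for this illustration, with an arbitrary 500-byte threshold; it mimics the scikit-learn convention of taking one row per sample and returning one probability pair per row.

```python
class ThresholdModel:
    """Hypothetical stand-in for a trained classifier.

    Labels a flow as "large" when the mean size of its first packets
    exceeds a threshold (500 bytes by default, an arbitrary value)."""

    def __init__(self, threshold=500):
        self.threshold = threshold

    def predict_proba(self, rows):
        # Return one [p_small, p_large] pair per row of packet sizes.
        return [[0.0, 1.0] if sum(row) / len(row) > self.threshold else [1.0, 0.0]
                for row in rows]


my_model = ThresholdModel()
print(my_model.predict_proba([[60, 1500, 1500]]))  # [[0.0, 1.0]]
```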




@ -94,10 +94,6 @@ install_requires = ['cffi>=1.14.0',
'pyzmq>=19.0.0',
'pandas>=1.0.1']
if os.getenv('READTHEDOCS'):
install_requires.append('numpydoc>=0.8')
install_requires.append('sphinx_rtd_theme>=0.4.3')
try:
from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
@ -152,6 +148,6 @@ setup(
'Topic :: Scientific/Engineering :: Artificial Intelligence'
],
project_urls={
'Documentation': 'https://nfstream.readthedocs.io',
'GitHub': 'https://nfstream.github.io',
}
)