diff --git a/README.md b/README.md
index c74e084..f38c87c 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-

+

nfstream: a flexible network data analysis framework

[**nfstream**][repo] is a Python package providing fast, flexible, and expressive data structures designed to make working with **online** or **offline** network data both easy and intuitive. It aims to be the fundamental high-level building block for @@ -67,14 +67,6 @@ the broader goal of becoming **a common network data processing framework for re - - Documentation Status - - - ReadTheDocs - - - Code Quality @@ -235,13 +227,13 @@ As with any packet monitoring tool, **nfstream** could potentially be misused. This project is licensed under the GPLv3 License - see the [**License**][license] file for details [license]: https://github.com/aouinizied/nfstream/blob/master/LICENSE -[contribute]: https://nfstream.readthedocs.io/en/latest/contributing.html +[contribute]: https://nfstream.github.io/docs/community [contributors]: https://github.com/aouinizied/nfstream/graphs/contributors [linkedin]: https://www.linkedin.com/in/dr-zied-aouini [github]: https://github.com/aouinizied -[documentation]: https://nfstream.readthedocs.io/en/latest/index.html -[ndpi]: https://github.com/ntop/nDPI -[nfplugin]: https://nfstream.readthedocs.io/en/latest/plugins.html +[documentation]: https://nfstream.github.io/ +[ndpi]: https://nfstream.github.io/docs/visibility +[nfplugin]: https://nfstream.github.io/docs/api#nfplugin [reliable]: http://people.ac.upc.edu/pbarlet/papers/ground-truth.pam2014.pdf [repo]: https://github.com/aouinizied/nfstream [demo]: https://mybinder.org/v2/gh/aouinizied/nfstream-tutorials/master?filepath=demo_notebook.ipynb diff --git a/docs/Makefile b/docs/Makefile deleted file mode 100644 index d0c3cbf..0000000 --- a/docs/Makefile +++ /dev/null @@ -1,20 +0,0 @@ -# Minimal makefile for Sphinx documentation -# - -# You can set these variables from the command line, and also -# from the environment for the first two. -SPHINXOPTS ?= -SPHINXBUILD ?= sphinx-build -SOURCEDIR = source -BUILDDIR = build - -# Put it first so that "make" without argument is like "make help". 
-help: - @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) - -.PHONY: help Makefile - -# Catch-all target: route all unknown targets to Sphinx using the new -# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). -%: Makefile - @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) diff --git a/docs/make.bat b/docs/make.bat deleted file mode 100644 index 9534b01..0000000 --- a/docs/make.bat +++ /dev/null @@ -1,35 +0,0 @@ -@ECHO OFF - -pushd %~dp0 - -REM Command file for Sphinx documentation - -if "%SPHINXBUILD%" == "" ( - set SPHINXBUILD=sphinx-build -) -set SOURCEDIR=source -set BUILDDIR=build - -if "%1" == "" goto help - -%SPHINXBUILD% >NUL 2>NUL -if errorlevel 9009 ( - echo. - echo.The 'sphinx-build' command was not found. Make sure you have Sphinx - echo.installed, then set the SPHINXBUILD environment variable to point - echo.to the full path of the 'sphinx-build' executable. Alternatively you - echo.may add the Sphinx directory to PATH. - echo. - echo.If you don't have Sphinx installed, grab it from - echo.http://sphinx-doc.org/ - exit /b 1 -) - -%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% -goto end - -:help -%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% - -:end -popd diff --git a/docs/source/architecture.rst b/docs/source/architecture.rst deleted file mode 100644 index 6bed527..0000000 --- a/docs/source/architecture.rst +++ /dev/null @@ -1,94 +0,0 @@ -############ -Architecture -############ - -.. image:: asset/arch.png - :scale: 100% - :align: center - - -A step by step walk through each process involved when performing flow monitoring is -developed in this section. Our aim is to provide you with a reminder about how -things works in theory. Consequently, an easier understanding of nfstream features -and implementation is possible. 
- -****************** -Packet observation -****************** - -Packet observation is a key stage in a flow monitoring architecture as it is the -starting point. Consequently, we detail in the following each step involved at this -phase: - -**Packet capture:** This step is performed on the Network Interface Card (NIC) level. -After passing various checks such as checksum error, packets stored in on-card -reception buffers are moved to the hosting device memory. Several libraries are -available to capture network traffic such as libpcap for UNIX based operating systems -Winpcap for Windows. These libraries are running on the top of the operating system -stack which may reduce performances passing through several layers. -To overcome such limitation in a high speed network context, software optimization -technique are proposed and could be considered (e.g. Intel DPDK, PF-RING, netmap). - -**Timestamping:** As packets may come from several observation points, reordering -process is based on packet’s timestamp. While hardware timestamping provides a high -accuracy up to 100 nanoseconds in case of the IEEE 1588 protocol, it’s not supported -by most of commodity NIC. Software timestamping is widely used to outcome this lack -providing an accuracy up to 100 microseconds. - -**Truncation (optional):** Defining a snapshot length, the process selects precise -bytes from the packet. It is performed in some cases to reduce the amount of data -captured by the probe and therefore CPU and bus bandwidth load. - -**Packet sampling (optional):** is generally performed to reduce load on subsequent -stages. It can be systematic (periodic sampling scheme) or random. The latter is -recommended as periodic scheme may introduce unwanted correlation in the observed -network data. - -**Packet filtering (optional):** performs filtering of packets to separate packets -having specific properties from those not having them. 
A packet is selected if -some specific fields are equal or in the range of given values. Another technique is -a hash based filtering, applying a hash function on a portion of the packet, -the result is compared to a value or a range of values. - -************* -Flow metering -************* -It includes packets aggregation into flows and flow entry expiration management. -Second, the metering process associates a packet to a flow entry using a defined key. -Third, it performs the aggregation of packets into flow entry based on a set of metrics. -Then, a flow entry is cached until it is considered as terminated (entry expiration). -Finally, optional steps such as flow sampling and filtering may be performed. - -**Flow Cache:** Flow cache consist of table in which the metering process stores -information regarding active flows in the network. A flow key (typically IP source -and destination addresses, source and destination ports, protocol and the VLAN -identifier) determines whether a packet is matching an existing flow entry in the cache -or not. In the first case, flow’s counters are updated. In the latter one, a new entry -is created. Non-key fields are utilized to collect flow metrics (e.g. packets/bytes -count, etc.). If IP addresses are part of flows key, and that traffic between two -pairs generates flows on both directions. We define a flow as bidirectional when we consider that pair and it reverse -belongs to same entry.The cache’s size depends on exporter device memory capacity -and should be configured based on criteria such as key/non-key fields, maximum number -of flows expected and expiration policy. - -**Entry expiration:** Cache entries are maintained in the cache table until they are -considered as terminated. Termination of a flow is triggered by an expiration event. -The metering process should consider an entry as expired based on: - -* Natural expiration: observed TCP packet belonging to a flow with FIN/RST flag. 
-* Active timeout: a flow entry expires after being considered active during a certain period. -* Idle timeout: a flow entry expires if no packets belonging to it are observed during a specific period.. - -It is possible to configure our metering process based on expiration policy to -reduce the amount of records exported. - -**Flow Sampling and Filtering:** Flow sampling and filtering processes are quite like packet sampling and filtering -process explained above. The major differences are the processed unit; while packet sampling and filtering process -packets, flow sampling and filtering process flow records coming from the metering process. - -****** -Export -****** -Export involves two steps which are mainly **formatting** and **export protocol**. While the first decide how an export is -formatted (number of flow per export, json or other, etc.), the latter determine the used -export protocol (file, mqtt, zmq, etc.). \ No newline at end of file diff --git a/docs/source/asset/arch.png b/docs/source/asset/arch.png deleted file mode 100644 index 95a2839..0000000 Binary files a/docs/source/asset/arch.png and /dev/null differ diff --git a/docs/source/changelog.rst b/docs/source/changelog.rst deleted file mode 100644 index 8a7cbef..0000000 --- a/docs/source/changelog.rst +++ /dev/null @@ -1,166 +0,0 @@ -######### -Changelog -######### - -**3.2.2 (2020-02-29)** - -* Add to_pandas method. -* Live demo using binder -* Fix previous broken package. - -**3.2.1 (2020-02-29)** - -* Add libpcap to builded libs. -* Packet parsing improvements (expose several packet size levels). -* Observer code refactoring. -* Improve nDPI structures handling. - -**3.2.0 (2020-02-02)** - -* nDPI 3.2 support. -* Fix metadata extraction issues. - -**3.1.2 (2020-01-07)** - -* Fix tests workflow. -* Update nDPI (commit: 73c7ccdb65a1e13e3fb1726af7882dd34534906f). - -**3.1.1 (2019-12-29)** - -* Fix generated wheels. 
(drop sdist) - -**3.1.0 (2019-12-29)** - -* Initial support for nDPI3.1 (commit: 73c7ccdb65a1e13e3fb1726af7882dd34534906f). -* Add wrapping for pandas. -* pypy7.2 support. -* Add py36, py38 for macOS wheels. -* Move continous integration to GitHub Actions. - -**3.0.4 (2019-12-18)** - -* Fix pypi description rendering. - -**3.0.3 (2019-12-18)** - -* MacOS Catalina support. -* Implement random port selection for zmq. - -**3.0.2 (2019-12-06)** - -* ether type double stacking implementation. -* Minor fixes. - -**3.0.1 (2019-12-04)** - -* Fix macOS wheels 10.14 - -**3.0.0 (2019-12-04)** - -* Sync with nDPI major.minor versions. -* New NFPlugin API definition. -* Fix macOS wheels for 10.13 and 10.14 - -**2.0.1 (2019-11-29)** - -* Fix pypy3 wheel. - -**2.0.0 (2019-11-28)** - -* Pypy support. -* Major performances improvements. -* NFPlugin as main extension API. -* nDPI memory usage improved. -* nDPI implemented using cffi. -* tcp_max_dissections, udp_max_dissections options. -* NFFlow dynamic attributes creation. -* HTTP, SSH, DNS client and server informations extraction. -* FlowCache management implemented in pure Python. - -**1.2.1 (2019-11-15)** - -* Fix ndpi padding and alignement issues. -* nDPI3.1 compatibility. - -**1.2.0 (2019-11-14)** - -* Fix ndpi bindings. -* Add TLS dissection features (server sni, client sni, version, organization, expiration dates) -* Improve documentation. - -**1.1.8 (2019-11-07)** - -* Fix ndpi wrap missing fields. -* Add host_server_name metric. -* Update doc. - -**1.1.7 (2019-11-07)** - -* Fix minor bugs. - -**1.1.6 (2019-11-03)** - -* TCP flags extraction. -* Minor bug fixes. - -**1.1.5 (2019-11-02)** - -* Add BPF filtering feature. -* Fix radiotap parsing. - -**1.1.2-3-4 (2019-11-01)** - -* Fix broken macos wheels on pypi. - -**1.1.1 (2019-11-01)** - -* Fix broken linux wheels on pypi. -* Py38 compatibility. - -**1.1.0 (2019-11-01)** - -* Add OSX support. 
- -**1.0.1-2-3 (2019-10-31)** - -* Fix deployment CI - - -**1.0.0 (2019-10-30)** - -* cffi based packet capture. -* fast parsing mechanism. -* Minor bug fixes. -* auto-generate binaries. - -**0.5.0 (2019-10-21)** - -* Classifier mechanism introduced. -* Custom export_reason. -* Fix minor bugs. -* Improve documentation. - -**0.4.0 (2019-10-20)** - -* Pypi package description readable. - -**0.3.1 (2019-10-20)** - -* Add category_name as flow feature. - -**0.3.0 (2019-10-20)** - -* Add user defined callbacks feature. -* Fix live capture handling. -* Fix library loading path. -* Json support for flow printing. -* Add examples. - -**0.2.0 (2019-10-19)** - -* Add nDPI bindings as part of the released package -* Documentation improvement - -**0.1.0 (2019-10-19)** - -* First release on PyPI. diff --git a/docs/source/conf.py b/docs/source/conf.py deleted file mode 100644 index f0f605f..0000000 --- a/docs/source/conf.py +++ /dev/null @@ -1,53 +0,0 @@ -# Configuration file for the Sphinx documentation builder. -# -# This file only contains a selection of the most common options. For a full -# list see the documentation: -# https://www.sphinx-doc.org/en/master/usage/configuration.html - -# -- Path setup -------------------------------------------------------------- - -# If extensions (or modules to document with autodoc) are in another directory, -# add these directories to sys.path here. If the directory is relative to the -# documentation root, use os.path.abspath to make it absolute, like shown here. -# -# import os -# import sys -# sys.path.insert(0, os.path.abspath('.')) - - -# -- Project information ----------------------------------------------------- - -project = 'nfstream' -copyright = '2019, Zied Aouini' -author = 'Zied Aouini' - -# The full version, including alpha/beta/rc tags -release = '3.2.2' - -# -- General configuration --------------------------------------------------- - -# Add any Sphinx extension module names here, as strings. 
They can be -# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom -# ones. -extensions = [] - -# Add any paths that contain templates here, relative to this directory. -templates_path = ['_templates'] - -# List of patterns, relative to source directory, that match files and -# directories to ignore when looking for source files. -# This pattern also affects html_static_path and html_extra_path. -exclude_patterns = [] - -# -- Options for HTML output ------------------------------------------------- - -# The theme to use for HTML and HTML Help pages. See the documentation for -# a list of builtin themes. -# -html_theme = "sphinx_rtd_theme" -# Add any paths that contain custom static files (such as style sheets) here, -# relative to this directory. They are copied after the builtin static files, -# so a file named "default.css" will overwrite the builtin "default.css". -html_static_path = ['_static'] - -master_doc = 'index' \ No newline at end of file diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst deleted file mode 100644 index e46e849..0000000 --- a/docs/source/contributing.rst +++ /dev/null @@ -1,113 +0,0 @@ -############ -Contributing -############ - -Contributions are welcome, and they are greatly appreciated! Every little bit -helps, and credit will always be given. - -You can contribute in many ways: - -********************* -Types of contribution -********************* - -**Report bugs** - -Report bugs at https://github.com/aouinizied/nfstream/issues. - -If you are reporting a bug, please include: - -* Your operating system name and version. -* Any details about your local setup that might be helpful in troubleshooting. -* Detailed steps to reproduce the bug. -* pcap file if you are reporting a bug on offline mode - -**Fix bugs** - -Look through the GitHub issues for bugs. Anything tagged with "bug" and "help -wanted" is open to whoever wants to implement it. 
- -**Implement features** - -Look through the GitHub issues for features. Anything tagged with "enhancement" -and "help wanted" is open to whoever wants to implement it. - -**Write documentation** - -nfstream could always use more documentation, whether as part of the -official nfstream docs, in docstrings, or even on the web in blog posts, -articles, and such. - -**Submit feedback** - -The best way to send feedback is to file an issue at https://github.com/aouinizied/nfstream/issues. - -If you are proposing a feature: - -* Explain in detail how it would work. -* Keep the scope as narrow as possible, to make it easier to implement. -* Remember that this is a volunteer-driven project, and that contributions - are welcome. - -*********** -Get started -*********** - -Ready to contribute? Here's how to set up nfstream for local development. - -1. Fork the nfstream repo on GitHub. -2. Clone your fork locally:: - - $ git clone git@github.com:your_name_here/nfstream.git - -3. Install your local copy into a virtualenv. This is an example how you set up your fork for local development for Python3.6:: - - $ cd nfstream - $ virtualenv venv-nfstream-py36 -p /usr/bin/python3.6 - $ source venv-nfstream-py36/bin/activate - $ python setup.py develop - -4. Create a branch for local development:: - - $ git checkout -b name-of-your-bugfix-or-feature - -5. When you're done making changes, check that your changes pass the - tests (run it as root to trigger live capture testing):: - - $ python tests.py - -6. Commit your changes and push your branch to GitHub:: - - $ git add . - $ git commit -m "Your detailed description of your changes." - $ git push origin name-of-your-bugfix-or-feature - -7. Submit a pull request through the GitHub website. - -*********************** -Pull request guidelines -*********************** - -Before you submit a pull request, check that it meets these guidelines: - -1. The pull request should include tests. -2. 
If the pull request adds functionality, the docs should be updated. Put - your new functionality into a function with a docstring, and add the - feature to the list in README.rst. -3. The pull request should work for 3.6 and 3.7 and 3.8 Check - https://travis-ci.org/aouinizied/nfstream/pull_requests - and make sure that the tests pass for all supported Python versions. - -********* -Deploying -********* - -A reminder for the maintainers on how to deploy. -Make sure all your changes are committed (including an entry in /docs/source/changelog.rst). -Then run:: - -$ bumpversion patch -$ git push -$ git push --tags - -Travis will then deploy to PyPI if tests pass. diff --git a/docs/source/get_started.rst b/docs/source/get_started.rst deleted file mode 100644 index 8d0c2c6..0000000 --- a/docs/source/get_started.rst +++ /dev/null @@ -1,227 +0,0 @@ -######################### -Get started with nfstream -######################### - - -Dealing with a big pcap file and just want to aggregate it as network flows? -nfstream make this path easier in few lines: - -.. code-block:: python - - from nfstream import NFStreamer - my_capture_streamer = NFStreamer(source="facebook.pcap", - snaplen=65535, - idle_timeout=30, - active_timeout=300, - plugins=(), - dissect=True, - max_tcp_dissections=10, - max_udp_dissections=16) - - my_live_streamer = NFStreamer(source="eth1") # or capture from a network interface - for flow in my_capture_streamer: # or for flow in my_live_streamer - print(flow) # print, send it to Kafka or whatever you want :)! - - -From pcap to pandas dataframe? - -.. code-block:: python - - from nfstream import NFStreamer - my_df = NFStreamer(source="facebook.pcap").to_pandas() - my_df.head() - -***************** -NFStreamer object -***************** -* ``source`` [default= ``None`` ] - - - Source of packets. Can be ``live_interface_name`` or ``pcap_file_path``. - -* ``snaplen`` [default= ``65535`` ] - - - Packet capture length. 
- -* ``idle_timeout`` [default= ``30`` ] - - - Flows that are inactive for more than this value in seconds will be exported. - -* ``active_timeout`` [default= ``300`` ] - - - Flows that are active for more than this value in seconds will be exported. - -* ``plugins`` [default= ``()`` ] - - - Set of user defined NFPlugins. - -* ``dissect`` [default= ``True`` ] - - - Enable nDPI deep packet inspection library for Layer 7 visibility. - -* ``max_tcp_dissections`` [default= ``10`` ] - - - Maximum per flow TCP packets to dissect (ignored when dissect=False). - -* ``max_udp_dissections`` [default= ``16`` ] - - - Maximum per flow UDP packets to dissect (ignored when dissect=False). - -NFStreamer returns an iterator of **NFEntry** object. - -************** -NFEntry object -************** - -.. list-table:: NFEntry object - :widths: 25 25 50 - :header-rows: 1 - - * - attribute name - - attribute type - - attribute description - * - id - - int - - Flow identifier. - * - first_seen - - int - - First packet timestamp in milliseconds. - * - last_seen - - int - - Last packet timestamp in milliseconds. - * - version - - int - - IP version. - * - src_port - - int - - Transport layer source port. - * - dst_port - - int - - Transport layer destination port. - * - protocol - - int - - Transport layer protocol. - * - vlan_id - - int - - Virtual LAN identifier. - * - src_ip - - str - - Source IP address string representation. - * - dst_ip - - str - - Destination IP address string representation. - * - ip_src - - int - - Source IP address int value. [``volatile``] - * - ip_dst - - int - - Destination IP address int value. [``volatile``] - * - total_packets - - int - - Flow packets accumulator. - * - total_bytes - - int - - Flow bytes (full packet lentgh) accumulator. - * - duration - - int - - Flow duration in milliseconds. - * - src2dst_packets - - int - - Flow packets accumulator (source->destination). 
- * - src2dst_bytes - - int - - Flow bytes (full packet lentgh) accumulator (source->destination). - * - dst2src_packets - - int - - Flow packets accumulator (destination->source). - * - dst2src_bytes - - int - - Flow bytes (full packet lentgh) accumulator (destination->source). - * - expiration_id - - int - - Identifier of flow expiration trigger. Can be ``0`` for idle_timeout, ``1`` for active_timeout or 'negative' for custom expiration. - * - master_protocol - - int - - nDPI master protocol identifier. - * - app_protocol - - int - - nDPI app protocol identifier. - * - application_name - - str - - nDPI application name. - * - category_name - - str - - nDPI application category name. - * - client_info - - str - - Dissected client informations. Can be ``http_detected_os`` for HTTP, ``client_signature`` for SSH or ``client_requested_server_name`` for SSL. - * - server_info - - str - - Dissected server informations. Can be ``host_server_name`` for HTTP or DNS, ``server_signature`` for SSH or ``server_names`` for SSL. - * - j3a_client - - str - - J3A_ client fingerprint. - * - j3a_server - - str - - J3A_ server fingerprint. - -**NFEntry** is an aggregation of **NFPacket** objects. - -*************** -NFPacket object -*************** - -.. list-table:: NFPacket object - :widths: 25 25 50 - :header-rows: 1 - - * - attribute name - - attribute type - - attribute description - * - time - - int - - Packet timestamp in milliseconds. - * - raw_size - - int - - Packet raw size. - * - ip_size - - int - - IP packet size. - * - transport_size - - int - - Transport packet size. - * - payload_size - - int - - Packet payload size. - * - ip_src - - int - - Source IP address int value. - * - ip_dst - - int - - Destination IP address int value. - * - src_port - - int - - Transport layer source port. - * - dst_port - - int - - Transport layer destination port. - * - protocol - - int - - Transport layer protocol. - * - vlan_id - - int - - Virtual LAN identifier. 
- * - version - - int - - IP version. - * - tcp_flags - - int - - Packet observed TCP flags. - * - ip_packet - - bytes - - Raw content starting from IP Header. - * - direction - - int - - Packet direction: ``0`` for src_to_dst and ``1`` for dst_to_src. - - -.. _J3A: https://github.com/salesforce/ja3 diff --git a/docs/source/index.rst b/docs/source/index.rst deleted file mode 100644 index f850e56..0000000 --- a/docs/source/index.rst +++ /dev/null @@ -1,42 +0,0 @@ -.. nfstream documentation master file, created by - sphinx-quickstart on Sat Oct 19 16:26:59 2019. - You can adapt this file completely to your liking, but it should at least - contain the root `toctree` directive. - -#################################################### -nfstream: a flexible network data analysis framework -#################################################### - -.. image:: asset/logo_main.png - :width: 140 - :height: 140 - :align: right - -**nfstream** is a Python package providing fast, flexible, and expressive data structures designed to make working with **online** or **offline** network data both easy and intuitive. It aims to be the fundamental high-level building block for -doing practical, **real world** network data analysis in Python. Additionally, it has -the broader goal of becoming **a common network data processing framework for researchers** providing data reproducibility across experiments. - -**Main Features** - -* **Performance:** **nfstream** is designed to be fast (x10 faster with pypy3 support) with a small CPU and memory footprint. -* **Layer-7 visibility:** **nfstream** deep packet inspection engine is based on nDPI_ library. It allows nfstream to perform reliable_ encrypted applications identification and metadata extraction (e.g. TLS, SSH, DNS, HTTP). -* **Flexibility:** add a flow feature in 2 lines as an NFPlugin_. -* **Machine Learning oriented:** add your trained model as an NFPlugin_. - - - -.. 
toctree:: - :maxdepth: 2 - :caption: Table of Contents: - - installation - architecture - get_started - plugins - contributing - changelog - - -.. _nDPI: https://www.ntop.org/products/deep-packet-inspection/ndpi/ -.. _NFPlugin: https://nfstream.readthedocs.io/en/latest/plugins.html -.. _reliable: http://people.ac.upc.edu/pbarlet/papers/ground-truth.pam2014.pdf diff --git a/docs/source/installation.rst b/docs/source/installation.rst deleted file mode 100644 index eb51c1d..0000000 --- a/docs/source/installation.rst +++ /dev/null @@ -1,36 +0,0 @@ -################### -Installing nfstream -################### - -************ -Installation -************ - -**using pip** - -Binary installers for the latest released version are available: - -.. code-block:: bash - - python3 -m pip install nfstream - - -**from source: linux** - -.. code-block:: bash - - sudo apt-get install autoconf automake libtool pkg-config libpcap-dev - git clone https://github.com/aouinizied/nfstream.git - cd nfstream - python3 -m pip install -r requirements.txt - python3 setup.py bdist_wheel - -**from source: macos** - -.. code-block:: bash - - brew install autoconf automake libtool pkg-config - git clone https://github.com/aouinizied/nfstream.git - cd nfstream - python3 -m pip install -r requirements.txt - python3 setup.py bdist_wheel diff --git a/docs/source/plugins.rst b/docs/source/plugins.rst deleted file mode 100644 index 5461683..0000000 --- a/docs/source/plugins.rst +++ /dev/null @@ -1,109 +0,0 @@ -################## -Extending nfstream -################## - -nfstream is designed to be flexible and machine learning oriented. In the following section, we depict the use of NFPlugin -in both cases. - -.. 
code-block:: python - - from nfstream import NFPlugin - - class my_awesome_plugin(NFPlugin): - def on_update(self, obs, entry): - if obs.raw_size >= 666: - entry.my_awesome_plugin += 1 - - - streamer_awesome = NFStreamer(source='devil.pcap', plugins=[my_awesome_plugin()]) - for flow in streamer_awesome: - print(flow.my_awesome_plugin) # now you will see your dynamically created metric in generated flows - -******************* -NFPlugin parameters -******************* -* ``name`` [default= ``class name`` ] - - - Plugin name. Must be unique as it's dynamically created as a flow attribute. - -* ``volatile`` [default= ``False`` ] - - - Volatile plugin is available only when flow is processed. At flow expiration level, plugin is automatically removed (will not appear as flow attribute). - -* ``user_data`` [default= ``None`` ] - - - user_data passed to the plugin. Example: external module, pickled sklearn model, etc. - -**************** -NFPlugin methods -**************** -* ``on_init(self, obs)`` [default= ``return 0`` ] - - - Method called at entry creation). When aggregating packets into flows, this method is called on ``NFEntry`` object creation based on first ``NFPacket`` object belonging to it. - -* ``on_update(self, obs, entry)`` [default= ``pass`` ] - - - Method called to update each entry with its belonging obs. When aggregating packets into flows, the entry is an ``NFEntry`` object and the obs is an ``NFPacket`` object. - -* ``on_expire(self, entry)`` [default= ``pass`` ] - - - Method called at entry expiration. When aggregating packets into flows, the entry is an ``NFEntry`` - -* ``cleanup(self)`` [default= ``pass`` ] - - - Method called for plugin cleanup. - -In the following, we want to run an early classification of flows based on a trained machine learning model than takes -as features the 3 first packets size of a flow. - -*************************** -Computing required features -*************************** - -.. 
code-block:: python - - from nfstream import NFPlugin - - class feat_1(NFPlugin): - def on_init(self, obs): - entry.feat_1 == obs.raw_size - - class feat_2(NFPlugin): - def on_update(self, obs, entry): - if entry.total_packets == 2: - entry.feat_2 == obs.raw_size - - class feat_3(NFPlugin): - def on_update(self, obs, entry): - if entry.total_packets == 3: - entry.feat_3 == obs.raw_size - -************************ -Trained model prediction -************************ - -.. code-block:: python - - class model_prediction(NFPlugin): - def on_update(self, obs, entry): - if entry.total_packets == 3: - entry.model_prediction = self.user_data.predict_proba([entry.feat_1 , entry.feat_2 , entry.feat_3]) - # optionally we can force NFStreamer to immediately expires the flow - # entry.expiration_id = -1 - - -*********************** -Start your new streamer -*********************** - -.. code-block:: python - - my_model = function_to_load_your_model() # or whatever - ml_streamer = NFStreamer(source='devil.pcap', - plugins=[feat_1(volatile=True), - feat_2(volatile=True), - feat_3(volatile=True), - model_prediction(user_data=my_model) - ]) - for flow in ml_streamer: - print(flow.model_prediction) # now you will see your trained model prediction as part of the flow :) \ No newline at end of file diff --git a/docs/source/asset/logo_main.png b/logo_main.png similarity index 100% rename from docs/source/asset/logo_main.png rename to logo_main.png diff --git a/setup.py b/setup.py index 2e0079c..1f0f126 100644 --- a/setup.py +++ b/setup.py @@ -94,10 +94,6 @@ install_requires = ['cffi>=1.14.0', 'pyzmq>=19.0.0', 'pandas>=1.0.1'] -if os.getenv('READTHEDOCS'): - install_requires.append('numpydoc>=0.8') - install_requires.append('sphinx_rtd_theme>=0.4.3') - try: from wheel.bdist_wheel import bdist_wheel as _bdist_wheel @@ -152,6 +148,6 @@ setup( 'Topic :: Scientific/Engineering :: Artificial Intelligence' ], project_urls={ - 'Documentation': 'https://nfstream.readthedocs.io', + 'GitHub': 
'https://nfstream.github.io', } )
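
The removed docs/source/architecture.rst describes flow metering in prose only: a flow key selects a cache entry, a pair and its reverse share one bidirectional entry, and entries expire on idle or active timeout. For whoever ports that page to the new site, the description could be sketched roughly as below. This is an illustration under stated assumptions, not nfstream's implementation: `Flow`, `FlowCache`, `flow_key`, and the full-cache scan in `expire` are hypothetical names and choices. The timeout defaults (30 s idle, 300 s active) and the `expiration_id` values (`0` for idle, `1` for active) are taken from the removed get_started.rst.

```python
from dataclasses import dataclass
from typing import Optional

IDLE_TIMEOUT_MS = 30_000     # documented idle_timeout default (30 s), in ms
ACTIVE_TIMEOUT_MS = 300_000  # documented active_timeout default (300 s), in ms


@dataclass
class Flow:
    """Hypothetical flow entry holding a few of the documented counters."""
    first_seen: int
    last_seen: int
    total_packets: int = 0
    total_bytes: int = 0
    expiration_id: Optional[int] = None  # 0 = idle timeout, 1 = active timeout


def flow_key(src_ip, dst_ip, src_port, dst_port, protocol, vlan_id):
    """Direction-independent key: a pair and its reverse map to one entry."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    low, high = (a, b) if a <= b else (b, a)
    return (low, high, protocol, vlan_id)


class FlowCache:
    """Hypothetical cache: a dict from flow key to Flow, with two timeouts."""

    def __init__(self, idle_ms=IDLE_TIMEOUT_MS, active_ms=ACTIVE_TIMEOUT_MS):
        self.idle_ms = idle_ms
        self.active_ms = active_ms
        self.entries = {}

    def expire(self, now):
        """Remove and return entries whose idle or active timeout elapsed."""
        expired = []
        # Scanning every entry is a simplification chosen for readability.
        for key, flow in list(self.entries.items()):
            if now - flow.last_seen >= self.idle_ms:
                flow.expiration_id = 0
            elif now - flow.first_seen >= self.active_ms:
                flow.expiration_id = 1
            else:
                continue
            expired.append(self.entries.pop(key))
        return expired

    def update(self, key, time, size):
        """Account one packet; return flows that expired before it arrived."""
        expired = self.expire(time)
        flow = self.entries.get(key)
        if flow is None:
            flow = self.entries[key] = Flow(first_seen=time, last_seen=time)
        flow.last_seen = time
        flow.total_packets += 1
        flow.total_bytes += size
        return expired
```

Sorting the two endpoints inside `flow_key` is what makes the entry bidirectional; dropping that step would yield one entry per direction, which is the unidirectional case the removed page contrasts it with.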