
duetector🔍: Data Usage Extensible Detector (eBPF Support)


Introduction

duetector is a component of the DataUCON project, which is designed to support data usage control. See Intro DataUCON.

duetector🔍 is an extensible data usage control detector. It supports data usage control by probing for data usage behavior in the Linux kernel (based on eBPF).

🐛🐞🧪 The project is under heavy development; bug reports, feature requests, and pull requests are all welcome!

In the ABAUC control model, duetector can be used as a PIP (Policy Information Point) that observes data usage behavior and provides this information to the PDP (Policy Decision Point).

Try a simple use case: Simplest Open Count.

Join our slack channel.

Feature

  • Plug-in system support, see examples for more details
    • Custom Tracer and TracerManager
    • Custom Filters and FilterManager
    • Custom Collector and CollectorManager
    • Custom Analyzer and AnalyzerManager
  • Configuration Management
    • Configuration using a single configuration file
    • Generate Plugin Configuration
    • Support for dynamically loading configurations
  • Tracer Support
    • eBPF-based tracer
    • Shell command tracer
    • Subprocess tracer
  • Filter Support
    • Pattern matching, based on regular expressions
  • Data Collection and Analysis
    • Analyzer supports SQL databases
    • Collector supports SQL databases and OpenTelemetry (experimental)
  • User Interface
    • CLI Tools
    • PIP Service
    • Control Panel
  • Enhancements
    • RunC container identification

The eBPF program requires kernel support; see Kernel Support.

Installation

The code is distributed via PyPI, and you can install it with the following command:

pip install duetector

Currently, the code relies on BCC for on-the-fly compilation of eBPF code, so we recommend installing the latest BCC compiler.

Alternatively, use the Docker image that we provide, which uses JupyterLab as the example user application. You can modify the Dockerfile and startup script to customize the user application.

docker pull dataucon/duetector:latest

Pre-releases are not published under the latest tag. To pull one, specify its tag, e.g. v0.0.1a:

docker pull dataucon/duetector:v0.0.1a

For more details on running with Docker images, see here.

Quick start

More documentation and examples can be found here.

Start detector

Start monitoring from the command line. Because BCC requires root privileges, the command is run with sudo. It starts all probes and collects their output into the duetector-dbcollector.sqlite3 file in the current directory:

sudo duectl start

Press Ctrl+C to stop monitoring, and a summary is printed to the screen:

{'DBCollector': {'OpenTracer': {'count': 31, 'first at': 249920233249912, 'last': Tracking(tracer='OpenTracer', pid=641616, uid=1000, gid= 1000, comm='node', cwd=None, fname='SOME-FILE', timestamp=249923762308577, extended={})}}}
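The collected data can also be inspected directly with Python's standard sqlite3 module. The following is a minimal sketch, not part of duetector: it assumes only that duetector-dbcollector.sqlite3 is an ordinary SQLite file, and it makes no assumption about table names or columns. The helper name table_row_counts is hypothetical.

```python
import sqlite3
from pathlib import Path
from typing import Dict


def table_row_counts(db_path: str) -> Dict[str, int]:
    """Return {table name: row count} for every user table in an SQLite file.

    Hypothetical helper for illustration; it does not rely on duetector's
    schema, only on SQLite's built-in sqlite_master catalog.
    """
    # Open read-only so a mistyped path does not create an empty database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
            # Skip SQLite's internal bookkeeping tables.
            if not row[0].startswith("sqlite_")
        ]
        # Table names come from the catalog itself; quote them for safety.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling table_row_counts("duetector-dbcollector.sqlite3") in the directory where duectl start was run should list whatever tables the collector created, along with how many rows each holds.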

Enable DEBUG logging:

sudo DUETECTOR_LOG_LEVEL=DEBUG duectl start

At startup, a configuration file is generated automatically under ~/.config/duetector. Use --config to specify a different configuration file:

sudo duectl start --config <config-file-path>

Configuration using environment variables is also supported:

Usage: duectl start [OPTIONS]

  Start A bcc monitor and wait for KeyboardInterrupt

Options:
  ...
  --load_env BOOLEAN            Weather load env variables,Prefix: DUETECTOR_,
                                Separator:__, e.g. DUETECTOR_config__a means
                                config.a, default: True
  ...
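The naming rule in the help text above (prefix DUETECTOR_, separator __) can be illustrated with a short sketch. This is not duetector's own loader: the function name env_to_nested is hypothetical, and details such as case handling or type conversion in the real implementation may differ.

```python
from typing import Any, Dict, Mapping

PREFIX = "DUETECTOR_"  # prefix stated in the help text
SEPARATOR = "__"       # separator stated in the help text


def env_to_nested(
    environ: Mapping[str, str],
    prefix: str = PREFIX,
    separator: str = SEPARATOR,
) -> Dict[str, Any]:
    """Turn prefixed environment variables into a nested dict.

    Illustrative only: DUETECTOR_config__a=1 becomes {"config": {"a": "1"}}.
    Values are kept as strings and key case is preserved as given.
    """
    nested: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable
        keys = name[len(prefix):].split(separator)
        node = nested
        for key in keys[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                # Create (or overwrite a scalar with) an intermediate level.
                child = {}
                node[key] = child
            node = child
        node[keys[-1]] = value
    return nested
```

For example, passing os.environ to env_to_nested after exporting DUETECTOR_config__a=1 yields a dictionary in which config.a holds the string "1", matching the example in the help text.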

When using a plugin, the default configuration file does not contain the plugin's configuration. Use the generate-dynamic-config command to generate a configuration file that includes it. This command also supports merging existing configuration files and environment variables.

duectl generate-dynamic-config --help

Use generate-config to restore the default state in case of configuration file errors.

duectl generate-config

To run in the background, use the duectl-daemon start command, which starts a background daemon. Stop it with duectl-daemon stop.

Use duectl-daemon --help for more details:

Usage: duectl-daemon [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  start   Start a background process of command `duectl start`.
  status  Show status of process.
  stop    Stop the process.

Analyzing with analyzer

We provide an Analyzer that can query the data in storage. Try it in the user case.

Using duetector server

We provide a Duetector Server as an external PIP service and control interface.

A Duetector Server can be started using duectl-server and listens on 0.0.0.0:8120 by default. You can change this using --host and --port.

$ duectl-server start --help
Usage: duectl-server start [OPTIONS]

  Start duetector server

Options:
  --config TEXT       Config file path, default:
                      ``~/.config/duetector/config.toml``.
  --load_env BOOLEAN  Weather load env variables, Prefix: ``DUETECTOR_``,
                      Separator:``__``, e.g. ``DUETECTOR_config__a`` means
                      ``config.a``, default: True
  --workdir TEXT      Working directory, default: ``.``.
  --host TEXT         Host to listen, default: ``0.0.0.0``.
  --port INTEGER      Port to listen, default: ``8120``.
  --workers INTEGER   Number of worker processes, default: ``1``.
  --help              Show this message and exit.

After the service has started, visit http://{ip}:{port}/docs to see the API documentation.
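As a quick check that the server is up, the documentation URL can be requested from Python. This is a minimal sketch using only the standard library. The function names docs_url and docs_reachable are hypothetical, and the only facts taken from this README are the default port 8120 and the /docs path.

```python
from urllib.request import urlopen


def docs_url(host: str, port: int = 8120) -> str:
    """Build the API documentation URL for a running Duetector Server."""
    return f"http://{host}:{port}/docs"


def docs_reachable(host: str, port: int = 8120, timeout: float = 3.0) -> bool:
    """Return True if the /docs page answers with HTTP 200.

    Connection failures and HTTP errors raised by urlopen are subclasses
    of OSError, so they are reported as False rather than raised.
    """
    try:
        with urlopen(docs_url(host, port), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False
```

Note that 0.0.0.0 is a listen address; from the same machine, a client would typically call docs_reachable("127.0.0.1").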

Similarly, duectl-server-daemon start runs a Duetector Server in the background, and duectl-server-daemon stop stops it.

$ duectl-server-daemon
Usage: duectl-server-daemon [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  start   Start a background process of command ``duectl-server start``.
  status  Show status of process.
  stop    Stop the process.

API documentation

See the duetector documentation.

Maintainers

This project was initiated by the Institute of Data Security, Harbin Institute of Technology (Shenzhen). If you are interested in this project or the DataUCON project and would like to help improve them, you are welcome to join our open source community.

Contributors

  • wunder957 💻
  • MayDown 💻
  • tsdsnk 📖
  • zhemulin 📖
  • Mortal 📖
  • mingzhedream 📖

How to contribute

Start with a good first issue and read our contributing guidelines.

Learn about the design and architecture of this project here: docs/design.

License

This project uses the Apache-2.0 license; please refer to LICENSE.

About

duetector🔍: Data Usage Extensible Detector for data usage observability.
