Getting Started

Veil is a framework for building and running text processing and masking pipelines.

Installation

Environment installation (support for all included entity detectors)

make build

Activate the environment with:

mamba activate ./env

Development packages

To build the documentation or run the tests, you also need the development requirements. With the environment activated, run:

python3 -m pip install -r requirements_dev.txt

Documentation

You can build the documentation with:

make docs/html

and serve it locally with:

make docs/serve

which will start a local server at http://localhost:5500.

Running from the CLI

Veil is highly configurable. Every configuration class defined in veil/config maps one-to-one to CLI parameters. List the available parameters with:

python3 -m veil --help

Running from a file

For example, create a configuration file like run_configs/example_offline.yml:

mode: offline
dataloader:
  path: data/input/example.jsonl
entity_detectors:
  - type: regex
    min_confidence: 0.3

And run:

python3 -m veil --pipeline-config-from-file run_configs/example_offline.yml

Each record in the input data must contain at least an input field holding the text to be processed.
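
As a minimal sketch of preparing such a file, the script below writes a small JSONL file and checks that every record carries the required field. It assumes the field is literally named "input" and that the file holds one JSON object per line. The helpers write_records and find_invalid_lines are hypothetical names used for illustration and are not part of Veil.

```python
import json
from pathlib import Path


def write_records(path, texts):
    """Write one JSON object per line, each with an "input" field (assumed name)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for text in texts:
            f.write(json.dumps({"input": text}, ensure_ascii=False) + "\n")


def find_invalid_lines(path):
    """Return 1-based line numbers of records lacking a string "input" field."""
    invalid = []
    with Path(path).open(encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                invalid.append(number)  # not valid JSON at all
                continue
            if not isinstance(record, dict) or not isinstance(record.get("input"), str):
                invalid.append(number)
    return invalid


if __name__ == "__main__":
    # Path taken from the example configuration above.
    write_records("data/input/example.jsonl", ["Contact Jane Doe at jane@example.com."])
    print(find_invalid_lines("data/input/example.jsonl"))
```

Running a check like this before starting the pipeline surfaces malformed records early, with line numbers that are easy to locate in the file.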

See docs/architecture.md for more details.

Docker

Running the Veil API with Docker

We provide a Docker image for reproducible deployment of the API. The image uses the configuration in run_configs/prod_pipeline_v1.yml. You can build the image yourself:

# Replace docker-username with your Docker Hub username
make docker/build

or download it from Docker Hub:

docker pull docker-username/veil:gpu-latest

Next, run the container, replacing hf_your_token with your Hugging Face Hub token:

docker run --gpus all -t -e HUGGINGFACE_HUB_TOKEN=hf_your_token -p 8000:8000 docker-username/veil:gpu-latest

This starts the API server inside the container and publishes it on host port 8000. See veil/api_server.py for the API details.
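
To confirm that the container is up, a minimal sketch like the following checks whether the published port accepts TCP connections. It deliberately does not call any API route, since the routes are defined in veil/api_server.py and are not described here. The helper port_is_open is a hypothetical name, and the host localhost is an assumption about where the container runs.

```python
import socket


def port_is_open(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("API port reachable:", port_is_open())
```

An open port only shows that something is listening. The server may still be loading models, so treat a successful check as a precondition rather than proof of readiness.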