WIPP deployment: General information

Version 1.0.0

Disclaimer

This software was developed at the National Institute of Standards and Technology by employees of the Federal Government in the course of their official duties. Pursuant to title 17 Section 105 of the United States Code this software is not subject to copyright protection and is in the public domain. This software is an experimental system. NIST assumes no responsibility whatsoever for its use by other parties, and makes no guarantees, expressed or implied, about its quality, reliability, or any other characteristic. We would appreciate acknowledgement if the software is used.

Important security information

YOU SHOULD NOT DEPLOY THIS SYSTEM ON A PUBLIC SERVER SINCE THE SOFTWARE DOES NOT INCLUDE ANY ACCOUNT AND UPLOAD ACCESS MANAGEMENT. THE CURRENT IMAGE UPLOAD IS COMPLETELY UNRESTRICTED AND COULD BE USED TO UPLOAD MALWARE, VIRUSES, OR INAPPROPRIATE CONTENT.

The Web Image Processing Pipelines system (WIPP) version 1.0.0 does not include any web security management. WIPP 1.0.0 allows unrestricted uploading of files via the web browser interface, and the uploaded files interact with the file system as well as with a MongoDB database instance.
WIPP 1.0.0 is intended for deployment on private networks behind a firewall. Future releases will include account and upload access management.

WIPP deployment using Docker

We recommend using Docker containers for deploying the WIPP system.
Due to the inclusion of many libraries and their configurations in WIPP, we pre-installed and packaged all software in Docker containers. The containers simplify the WIPP deployment and make it more reproducible with consistent configurations.

What is Docker?

Docker is a container software platform: software is packaged, together with its libraries and settings, in containers. Multiple instances of a Docker container can be deployed to a set of machines running the Docker Engine.
Machines running the Docker Engine can form a cluster of nodes by using Docker Swarm technology. A Docker Swarm consists of a manager node and worker nodes providing services, as well as an overlay network for multi-host networking. The manager node assigns tasks to the worker nodes in the form of Docker containers that perform specific services. For more information about Docker Swarm, visit the Docker Swarm documentation page.

Requirements

Please make sure that the system hosting WIPP meets the following requirements:
- Unix operating system meeting the requirements for running the Docker Engine (we tested on Ubuntu 16.04 LTS x86_64, with Docker version 17.03.1-ce).
- At least 16 GB of available RAM (some algorithms require the RAM size to be up to ten times the size of the input image).
- At least 50 GB of available disk space (should be scaled according to the expected total amount of uploaded and computed data).
The WIPP system deployed using Docker on a single host will take 6 GB of disk space.
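
The RAM and disk requirements above can be checked before starting the installation. The following is a minimal sketch, not part of the WIPP distribution: the function name meets_minimum_gb is hypothetical, the script assumes a Linux host that exposes /proc/meminfo, and it assumes that Docker will store its data on the file system holding the current directory.

```shell
#!/bin/sh
# Thresholds taken from the requirements listed above.
MIN_RAM_GB=16
MIN_DISK_GB=50

# meets_minimum_gb VALUE_KIB MIN_GIB (hypothetical helper):
# succeeds if VALUE_KIB kibibytes is at least MIN_GIB gibibytes.
meets_minimum_gb() {
    [ "$1" -ge $(( $2 * 1024 * 1024 )) ]
}

# Available RAM in KiB; falls back to 0 if /proc/meminfo cannot be read.
ram_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null || echo 0)
ram_kib=${ram_kib:-0}

# Free space in KiB on the file system holding the current directory
# (assumption: adjust the path if Docker stores its data elsewhere).
disk_kib=$(df -Pk . | awk 'NR==2 {print $4}')
disk_kib=${disk_kib:-0}

if meets_minimum_gb "$ram_kib" "$MIN_RAM_GB"; then
    echo "RAM: ok"
else
    echo "RAM: less than ${MIN_RAM_GB} GB available"
fi

if meets_minimum_gb "$disk_kib" "$MIN_DISK_GB"; then
    echo "Disk: ok"
else
    echo "Disk: less than ${MIN_DISK_GB} GB available"
fi
```

The script only reports; it does not stop the installation, since the thresholds should be scaled to the data that will actually be processed.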

WIPP deployment on a single host

Deploying WIPP on a single host consists of six steps:
1. installing Docker,
2. configuring a Docker Swarm,
3. configuring Docker volumes for database and file system storage,
4. deploying WIPP using the provided script,
5. opening firewall ports for the WIPP system, and
6. accessing the WIPP system web interface.

Step-by-step instructions for Linux

Download the WIPP Zip file, which contains the installation script and a README file, here.

1. Installing Docker

To install Docker, please follow the instructions on the Docker web site (menu "Get Docker") for the operating system the host is running.

Example: installing Docker Community Edition (CE) on Ubuntu Xenial 16.04

The official Docker instructions are available here.

a. Update the apt package index and upgrade the installed packages:
sudo apt-get update
sudo apt-get upgrade
b. Set up the Docker CE apt-get repository:
- Install packages to allow apt to use a repository over HTTPS:
sudo apt-get install \
	apt-transport-https \
	ca-certificates \
	curl \
	software-properties-common
- Add Docker’s official GPG key:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
- Set up the Docker stable repository (for architecture amd64):
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
c. Install Docker:
- Update the apt package index:
sudo apt-get update
- Install the latest version of Docker:
sudo apt-get install docker-ce
- Verify that Docker CE is properly installed (will print an informational message):
sudo docker run hello-world

Running Docker as non-root user

After installing Docker, you will have to use sudo for all Docker commands. To avoid that, follow the instructions from the Docker Linux postinstall web page:

- Create the docker group:
sudo groupadd docker
- Add your user to the docker group:
sudo usermod -aG docker $USER
- Log out and log back in so that your group membership is re-evaluated, then verify that you can run Docker commands without sudo:
docker run hello-world

Configure Docker to start on boot

In order to have the Docker service automatically started when the system boots, follow the instructions from the Docker Linux postinstall web page:
- For Ubuntu 16.04 and higher, run:

sudo systemctl enable docker

2. Configuring a Docker Swarm for WIPP

The WIPP Docker deployment uses Docker Swarm to create a cluster of Docker nodes in a single-host or multi-host configuration.

- Initialize the swarm:
IP_MGR=129.1.2.3  # replace 129.1.2.3 with the IP address of the host
docker swarm init --advertise-addr ${IP_MGR} --listen-addr ${IP_MGR}
- Configure the swarm network overlay:
docker network create --driver overlay wippnet
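
Because the swarm initialization depends on IP_MGR holding a usable address, it can help to validate the value before passing it to docker swarm init. The following is a minimal sketch under that assumption: the helper is_ipv4 is hypothetical, it only checks the dotted-decimal IPv4 form used in the example above, and it prints the swarm command instead of running it.

```shell
#!/bin/sh
# is_ipv4 ADDR (hypothetical helper): succeeds if ADDR consists of four
# dot-separated decimal numbers, each between 0 and 255.
# The body runs in a subshell so that the IFS change does not leak out.
is_ipv4() (
    addr=$1
    # Reject empty input, characters other than digits and dots,
    # and leading, trailing, or doubled dots.
    case $addr in
        ""|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    IFS=.
    set -- $addr                # split the address into its octets
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
)

IP_MGR=129.1.2.3   # placeholder from the step above; replace with the host's address

if is_ipv4 "$IP_MGR"; then
    # Printed rather than executed, so the sketch has no side effects.
    echo "docker swarm init --advertise-addr $IP_MGR --listen-addr $IP_MGR"
else
    echo "IP_MGR is not a valid IPv4 address: $IP_MGR" >&2
fi
```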

3. Configuring Docker volumes for WIPP database and file system storage

The WIPP Docker containers use Docker volumes to store data on the host (see the Docker volumes web page for more information about Docker volumes).

- Create a Docker volume for the WIPP database:
docker volume create --name wippdbvolume
- Create a Docker volume for the WIPP data storage:
docker volume create --name wippdatavolume

4. Deploying WIPP

A deployment script setup.sh is provided in the WIPP Zip file.

- Make sure that the execution permissions are set for the script:
chmod a+x setup.sh
- Run the script to deploy WIPP on the Docker Swarm:
./setup.sh wippnet wippdbvolume wippdatavolume 4G
Replace "4G" with the maximum amount of RAM you want to allow for the image processing algorithms, in the format size[g|G|m|M|k|K]. Scale this amount according to the RAM available on the host and the size of the data you want to process: algorithms may require up to 10 times the size of a single image in RAM, e.g., a 1 GB image may need up to 10 GB of RAM.
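
The sizing rule and the argument format described above can be sketched as two small helpers. Both function names are hypothetical, the sketch assumes that the unit suffix is required, and it treats "up to 10 times the image size" as a conservative upper bound rather than a measured value. The setup.sh command line is printed, not executed.

```shell
#!/bin/sh
# is_valid_mem_limit VALUE (hypothetical helper): succeeds if VALUE is a run
# of digits followed by one unit suffix from g, G, m, M, k, K (e.g. 4G, 512m).
is_valid_mem_limit() {
    case $1 in
        *[gGmMkK]) ;;              # must end in a unit suffix
        *) return 1 ;;
    esac
    digits=${1%?}                  # everything before the suffix
    case $digits in
        ""|*[!0-9]*) return 1 ;;   # must be a non-empty run of digits
    esac
    return 0
}

# suggested_limit_gb IMAGE_GB (hypothetical helper): prints a limit of ten
# times the largest expected image size, given in whole GB.
suggested_limit_gb() {
    echo "$(( $1 * 10 ))G"
}

LIMIT=$(suggested_limit_gb 1)      # the 1 GB image from the example above
if is_valid_mem_limit "$LIMIT"; then
    echo "./setup.sh wippnet wippdbvolume wippdatavolume $LIMIT"
fi
```

The suggested value still has to fit within the RAM actually available on the host.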
This script will pull the WIPP Docker images from the WIPP Docker Hub repository, as well as the public MongoDB (database) and HTCondor (job scheduler) images, and deploy them as services in the Docker Swarm.
After running the script, the services will start their initialization. To get the status of the WIPP services, run:
docker service ls
The output should be similar to the following:
ID            NAME     MODE        REPLICAS  IMAGE
1x4y8ee96kn2  master   replicated  0/1       dscnaf/htcondor-debian:release-0.2.0
3wx2zpmbj6om  wipp     replicated  0/1       wipp/wipp:latest
iyyrjoln0t0t  mongodb  replicated  0/1       mongo:3.4
n3bgifaet7vf  exec     replicated  0/1       wipp/wipp_executor:latest
REPLICAS 0/1 means that the service has not started yet (it is either still loading or failing to start).

For detailed information about a service, run the following command, replacing $NAME with the name of the service:

docker service ps $NAME
For example, to check the state of the mongodb (database) service:
docker service ps mongodb
ID            NAME       IMAGE         NODE            DESIRED STATE  CURRENT STATE           ERROR  PORTS
02zpjbxq42a7  mongodb.1  mongo:latest  vm-itl-ssd-063  Running        Running 44 seconds ago

WIPP system services will start in this order: mongodb, master, wipp, exec.
Once everything is started (after a few minutes), the state of the services should be similar to the following:
docker service ls

ID            NAME     MODE        REPLICAS  IMAGE
1x4y8ee96kn2  master   replicated  1/1       dscnaf/htcondor-debian:release-0.2.0
3wx2zpmbj6om  wipp     replicated  1/1       wipp/wipp:latest
iyyrjoln0t0t  mongodb  replicated  1/1       mongo:3.4
n3bgifaet7vf  exec     replicated  1/1       wipp/wipp_executor:latest
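
Instead of re-running docker service ls by hand, the REPLICAS column can be checked in a script. The following is a minimal sketch: the helper all_replicas_ready is hypothetical, and the usage comment assumes a Docker version whose docker service ls command supports the --format option with the .Name and .Replicas placeholders. The demonstration at the end uses sample lines rather than a live swarm.

```shell
#!/bin/sh
# all_replicas_ready (hypothetical helper): reads "NAME RUNNING/DESIRED" lines
# on standard input and succeeds only if there is at least one line and every
# service has as many running replicas as desired.
all_replicas_ready() {
    awk '{ split($2, r, "/"); n++; if (r[1] != r[2]) bad++ }
         END { exit (n == 0 || bad > 0) }'
}

# Usage on the Docker host (assumption: --format is supported):
#   until docker service ls --format '{{.Name}} {{.Replicas}}' | all_replicas_ready
#   do sleep 10; done

# Demonstration with sample lines: two services are still at 0/1.
if printf '%s\n' 'master 1/1' 'wipp 1/1' 'mongodb 0/1' 'exec 0/1' | all_replicas_ready
then
    echo "all services started"
else
    echo "still starting"
fi
```

A service that keeps failing never reaches 1/1, so a real polling loop should also give up after some time and fall back to docker service ps for details.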

5. Opening firewall ports for the WIPP system

WIPP uses ports 18080 (web interface) and 15005 (Pegasus dashboard). These ports must be open for communication in order to access the WIPP system.

- On Ubuntu 16.04 and up, one option is to use the Uncomplicated Firewall (ufw):
sudo ufw allow 18080
sudo ufw allow 15005
- Check that the ports are open:
sudo ufw status verbose
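
Since WIPP 1.0.0 is intended for private networks, one possible variation is to open the two ports only to a known source network instead of to all hosts. The following is a minimal sketch of that idea: the helper name and the subnet 192.168.1.0/24 are hypothetical, and the commands are printed for review rather than executed.

```shell
#!/bin/sh
# ufw_rules_for_subnet CIDR (hypothetical helper): prints one ufw command per
# WIPP port, allowing TCP access only from the given source network.
ufw_rules_for_subnet() {
    for port in 18080 15005; do
        echo "sudo ufw allow from $1 to any port $port proto tcp"
    done
}

# Hypothetical private subnet; replace with the network that should reach WIPP.
ufw_rules_for_subnet 192.168.1.0/24
```

If these restricted rules are used, the unrestricted "sudo ufw allow" rules for the same ports should not be added as well, since they would permit access from any address.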

6. Accessing WIPP system web interface

Once deployed, the WIPP web interface will be accessible from http://host-ip:18080 and the Pegasus dashboard (for troubleshooting any failing jobs) will be accessible from https://host-ip:15005 (replace host-ip with the IP address of the host).

The Pegasus dashboard is accessible via HTTPS, but does not ship with an SSL certificate. Some web browsers will complain about the lack of a valid certificate and ask the user to add a security exception, or to confirm "Proceed to the website (unsafe)", before allowing access to the dashboard. The dashboard is set up to be accessible with the credentials "wipp/zaq123".
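
Reachability of both endpoints can also be checked from the command line. The following is a minimal sketch, assuming that curl is installed on the client machine; the helper names are hypothetical. The -k option tells curl to skip certificate verification, which mirrors the browser security exception described above. The sketch only prints the URLs; the probing loop is left as a comment because it needs network access to the host.

```shell
#!/bin/sh
# wipp_urls HOST (hypothetical helper): prints the web interface URL and the
# Pegasus dashboard URL for HOST, one per line.
wipp_urls() {
    echo "http://$1:18080"
    echo "https://$1:15005"
}

# http_status URL (hypothetical helper): prints the HTTP status code returned
# by URL. curl prints 000 when no response is received.
http_status() {
    curl -k -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

HOST_IP=host-ip   # placeholder from the text; replace with the host's IP address
wipp_urls "$HOST_IP"

# On a machine that can reach the host, each URL could then be probed with:
#   for url in $(wipp_urls "$HOST_IP"); do
#       echo "$url -> $(http_status "$url")"
#   done
```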

WIPP deployment on multiple hosts

Under construction.

WIPP deployment without Docker (manual installation)

Under construction.