...

  • The container must be able to perform DNS lookups of the APIs that will be scanned. For simple deployments, if the host system for a container can resolve hostnames in the private network, so can the container.

  • The container must be configured with an SSH private key and a port assigned to the appliance by Data Theorem. The sections below discuss coordinating this configuration with Data Theorem.

  • A particular appliance’s container should only exist once; it should not be scaled or replicated across a cluster (e.g., Docker Swarm or Kubernetes). The container is an appliance, and that instance represents where network traffic from Data Theorem will originate within the private network.

  • If you have multiple isolated private networks where you have APIs to scan, then each network will need its own appliance/container configured with Data Theorem.

  • The container currently logs all output to STDOUT and STDERR.
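The DNS requirement above can be checked from the host (or from inside a container on the same network) before deploying the appliance. The following is a minimal sketch using only the Python standard library; the hostnames in the usage section are hypothetical placeholders, not names supplied by Data Theorem.

```python
import socket


def unresolved_hosts(hostnames, resolve=socket.getaddrinfo):
    """Return the subset of hostnames that fail DNS resolution here.

    `resolve` defaults to socket.getaddrinfo and is a parameter only so
    the check can be exercised without real network access.
    """
    failed = []
    for name in hostnames:
        try:
            # Port is None because only name resolution matters here.
            resolve(name, None)
        except socket.gaierror:
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Hypothetical internal API hostnames; replace with your own.
    candidates = ["api.internal.example", "billing.internal.example"]
    for host in unresolved_hosts(candidates):
        print(f"cannot resolve: {host}")
```

If the script prints nothing, every listed hostname resolved from where it was run.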

The appliance’s Docker container image is available at: gcr.io/datatheorem-public-images/private-network-proxy-client-v1

...

Run the appliance on GCP’s Compute Engine, within a Container Optimized OS

GCP’s Compute Engine provides a convenient way to spin up a VM that runs a single Docker container without having to worry about container orchestration. This example shows how to launch such a VM, in order to test-launch the appliance.

Create a somefilename.env file that contains the ENV configuration for the appliance. Note that the newlines in the private key have been replaced with \n so that the key fits on a single line, due to a limitation of this ENV file format (the deployed container handles \n and \r character sequences within a key file or key data automatically, replacing the former with a newline and removing the latter):

Code Block
PROXY_PORT=10123
SSH_PRIVATE_KEY_DATA=-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----
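
The single-line key encoding described above can be produced with a small script rather than by hand. This is a sketch under assumptions: the function names are hypothetical, and `decode_key_from_env` only mirrors the behavior the text attributes to the container (literal \n becomes a newline, literal \r is removed); it is not the container's actual code.

```python
def encode_key_for_env(pem_text: str) -> str:
    """Collapse a multi-line key into one line joined by literal \\n."""
    # splitlines() handles both \n and \r\n line endings.
    lines = pem_text.strip().splitlines()
    return "\\n".join(lines)


def decode_key_from_env(value: str) -> str:
    """Assumed inverse: drop literal \\r, turn literal \\n into newlines."""
    return value.replace("\\r", "").replace("\\n", "\n")


def render_env_file(proxy_port: int, pem_text: str) -> str:
    """Build the two-line ENV file content shown in the example above."""
    return (
        f"PROXY_PORT={proxy_port}\n"
        f"SSH_PRIVATE_KEY_DATA={encode_key_for_env(pem_text)}\n"
    )


if __name__ == "__main__":
    import sys

    # Usage: python make_env.py 10123 /path/to/private_key > somefilename.env
    port, key_path = int(sys.argv[1]), sys.argv[2]
    with open(key_path, encoding="utf-8") as handle:
        sys.stdout.write(render_env_file(port, handle.read()))
```

Because the base64 body of a key never contains a backslash, the encoding is unambiguous and decoding restores the original lines.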

...

The appliance container’s security primarily depends on how/where it is deployed and configured. However, Data Theorem has taken steps to minimize the attack surface of the client and follow container best practices:

  • The Docker image is based on Alpine Linux, which is known for having a significantly smaller footprint compared to other popular distributions. It also relies on Linux hardening features like PIE, and it uses MUSL as its libc instead of GNU’s libc.

  • The service that runs in the container does not run as root. The Docker image runs commands as a normal, non-root user, minimizing what code running in the container can modify, and minimizing concerns about root processes running within containers.

  • The SSH client is configured to only trust a specific server key, instead of the default of prompting or auto-trusting new keys for a new host (via a UserKnownHostsFile and the StrictHostKeyChecking option). The trusted server key can be overridden using the SERVER_PUBKEY ENV variable.
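
To make the pinned-host-key setup concrete, the sketch below assembles an OpenSSH client command line that refuses unknown server keys and opens a reverse port forward. It is an illustration only, not the command the appliance actually runs: the function name, user name, host, paths, and local proxy port are all hypothetical. The options themselves (`StrictHostKeyChecking`, `UserKnownHostsFile`, `ExitOnForwardFailure`, `-N`, `-R`) are standard OpenSSH client options.

```python
def build_ssh_command(server_host, server_port, proxy_port, local_proxy_port,
                      key_path, known_hosts_path, user="tunnel"):
    """Return an argv list for a reverse-forwarding SSH client.

    All argument values are supplied by the caller; nothing here reflects
    Data Theorem's real hostnames or ports.
    """
    return [
        "ssh",
        "-N",                                   # no remote command, forwarding only
        "-i", key_path,
        "-p", str(server_port),
        "-o", "StrictHostKeyChecking=yes",      # never prompt or auto-trust new keys
        "-o", f"UserKnownHostsFile={known_hosts_path}",  # file holding the one trusted key
        "-o", "ExitOnForwardFailure=yes",       # fail fast if the port cannot be opened
        "-R", f"{proxy_port}:127.0.0.1:{local_proxy_port}",
        f"{user}@{server_host}",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(" ".join(build_ssh_command(
        "proxy.example.com", 2222, 10123, 8080,
        "/run/secrets/id_key", "/etc/appliance/known_hosts")))
```

With `StrictHostKeyChecking=yes`, the client aborts the connection if the server presents any key other than the one listed in the known-hosts file.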

Server Security

The server component is also run in a container in a VM on GCP.

  • The VM runs Google’s Container Optimized OS (COS), which is a hardened, minimal OS optimized for deploying individual Docker containers on GCP. GCP also builds its GKE (Kubernetes) offering and other container-based offerings on COS.

  • The VM is firewalled to only publicly expose a non-standard SSHD port

  • The server’s container runs SSHD and nothing else

  • The server’s container runs SSHD as a non-root user

  • SSHD is configured to lock down what SSH features and services it offers

    • In addition to being run as a non-root user, it is configured to disallow root logins and to disable password-based authentication entirely

    • It disables all SSH sub-services except for remote/reverse port forwarding

    • It only allows authorized keys to authenticate

    • It restricts each authorized key to disable running commands, and to disable all services except reverse port forwarding

    • Each authorized key is granted a single port that it can open that SSHD will listen on to receive traffic meant for the proxy running in the client

  • The SSHD server’s private key is kept out of source code. Instead, it is protected using GCP’s Secret Manager, and it is only accessed in order to deploy it to the VM and provide it to the container.

  • The proxy ports are only accessible to a VPC that is restricted to the security scanner components of Data Theorem’s analyzer engine that need traffic to originate from a static IP address, or that may need to go through the private-network-proxy

  • The VM is hosted on a GCP project separate from any other Data Theorem services, isolating it from other services, and making it easier to manage who has internal access
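
The per-key restrictions described in the SSHD bullets above can be expressed with standard OpenSSH `authorized_keys` options. The sketch below is one possible way to do so and is an assumption, not Data Theorem's actual server configuration: the function name and the example key are hypothetical, while `restrict`, `port-forwarding`, `permitlisten`, and `command` are documented OpenSSH options (`permitlisten` requires OpenSSH 7.8 or later).

```python
def authorized_keys_entry(public_key: str, port: int) -> str:
    """Build an authorized_keys line limiting a key to one reverse-forward port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    options = [
        "restrict",                  # disable PTY, agent/X11 forwarding, and all forwarding
        "port-forwarding",           # re-enable port forwarding only
        f'permitlisten="{port}"',    # the single port this key may ask sshd to listen on
        'command="/bin/false"',      # forced command, so no shell or arbitrary command runs
    ]
    return ",".join(options) + " " + public_key.strip()


if __name__ == "__main__":
    # Hypothetical public key and port for illustration.
    print(authorized_keys_entry("ssh-ed25519 AAAAEXAMPLE appliance", 10123))
```

Starting from `restrict` and re-enabling only what is needed means that features added to SSHD in future releases stay disabled for the key by default.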