...
In order to analyze APIs and services on private networks and VPCs, Data Theorem needs a proxy/connector that gives Data Theorem’s analyzer engine access to private networks. Data Theorem’s Private Network Proxy appliance, provided as a Docker image, creates an SSH port forwarding “tunnel” between Data Theorem and private networks to proxy the analyzer engine's network traffic.
...
...
The diagram below shows the architecture for how this works:
A Private Network Proxy appliance is deployed within a private network
It establishes an SSH tunnel/port forward back to Data Theorem that connects to a proxy
Data Theorem’s analyzer engine uses the tunnel to connect to the proxy and scan APIs within the private network
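As a rough illustration of the flow above, the port forward is conceptually similar to an OpenSSH remote forward. This is a sketch only, not the appliance's actual command: the hostname, SSH user name, and both port numbers below are hypothetical placeholders, and the real values are coordinated with Data Theorem during setup. The snippet only builds and prints the command; it does not open a connection.

```shell
# All values below are hypothetical placeholders, not real Data Theorem endpoints.
DT_TUNNEL_HOST="tunnel.example.invalid"   # assumed SSH endpoint on the Data Theorem side
DT_ASSIGNED_PORT=20000                    # assumed port assigned to this appliance
LOCAL_PROXY_PORT=8080                     # assumed port of the proxy inside the appliance
KEY_FILE="my_keyfile"                     # private key whose public half was shared with DT

# -N: do not run a remote command, only forward ports.
# -R: listen on DT_ASSIGNED_PORT at the remote end and forward connections
#     back to the proxy running next to this SSH client.
tunnel_cmd="ssh -N -i ${KEY_FILE} -R ${DT_ASSIGNED_PORT}:localhost:${LOCAL_PROXY_PORT} tunnel-user@${DT_TUNNEL_HOST}"

# Print the command instead of running it.
echo "$tunnel_cmd"
```

With a remote forward of this kind, the SSH connection is initiated from inside the private network, and traffic sent to the assigned port on the far side comes out at the proxy next to the SSH client.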
...
Setting up a Private Network Proxy Appliance
These instructions are for the initial “v1” implementation. Data Theorem expects to refine and improve the setup flow with future releases.
...
Plan out how you want to deploy the appliance
Create an SSH keypair, and provide the public key to DT support.
DT will prepare the private-network-proxy service for the appliance and will provide some additional configuration parameters
Set up a host to run the appliance as a Docker container
Configure the container with the SSH keypair and the other necessary parameters
Upload API definitions for the private APIs, and notify DT support that they should use the Private Network Proxy
Planning out the Deployment
Data Theorem’s Private Network Proxy appliance is provided as a Docker image that can be deployed to a host with access to a private environment containing APIs that are not publicly addressable. There are several requirements for how this image should be deployed:
The container must be able to perform DNS lookups of the APIs that will be scanned. For simple deployments, if the host system for a container can resolve hostnames in the private network, so can the container.
The container must be configured with an SSH private key and a port assigned to the appliance by Data Theorem. The sections below discuss coordinating this configuration with Data Theorem.
A particular appliance’s container should only exist once: it should not be scaled or replicated across a cluster (e.g., Docker Swarm or Kubernetes). The container is an appliance, and that single instance represents where network traffic from Data Theorem will originate within the private network.
If you have multiple isolated private networks containing APIs to scan, then each network will need its own appliance/container configured with Data Theorem.
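A quick way to check the DNS requirement before deploying is to confirm that the intended host can resolve each private API hostname. The sketch below assumes a Linux host where `getent` is available; `check_resolves` is a helper name made up for this example, and the hostnames in the usage comment are hypothetical.

```shell
# Report whether each hostname resolves using the host's resolver configuration.
# Returns non-zero if at least one hostname fails to resolve.
check_resolves() {
    status=0
    for host in "$@"; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done
    return "$status"
}

# Usage (hypothetical private hostnames):
#   check_resolves api.internal.example.com billing.internal.example.com
```

If a hostname fails here, the container is unlikely to resolve it either in a simple deployment, so the host's DNS configuration is the first thing to check.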
...
Creating an ED25519 keypair:
Code Block | ||
---|---|---|
| ||
# This will create my_keyfile and my_keyfile.pub
ssh-keygen -t ed25519 -C "description for my appliance" -f my_keyfile |
Alternately, create an RSA keypair:
Code Block | ||
---|---|---|
| ||
# This will create my_keyfile and my_keyfile.pub
ssh-keygen -t rsa -b 3072 -C "description for my appliance" -f my_keyfile |
When prompted for a passphrase, leave it blank and press Enter. You can set a passphrase to encrypt the private key if you want the extra protection, but you will need to remove it before the private key is provided to the appliance’s Docker container.
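Before handing the key to the container, it can be useful to confirm that the private key really has no password (passphrase). The sketch below assumes OpenSSH's `ssh-keygen` is installed; `key_is_unencrypted` is a helper name invented for this example. It relies on `ssh-keygen -y`, which derives the public key from a private key, together with `-P ""` to supply an empty passphrase so the command fails instead of prompting.

```shell
# Succeeds only if the private key file can be read with an empty passphrase.
key_is_unencrypted() {
    ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

if key_is_unencrypted my_keyfile; then
    echo "my_keyfile has no passphrase"
else
    echo "my_keyfile is missing, unreadable, or encrypted"
    echo "to remove a passphrase, run: ssh-keygen -p -f my_keyfile"
fi
```

`ssh-keygen -p` asks for the old passphrase and then a new one; leaving the new passphrase empty stores the key unencrypted.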
...
Exactly one of the following ENV variables must be specified:
SSH_PRIVATE_KEY_DATA – The raw data of the SSH private key for the appliance. It uniquely identifies the appliance and the proxy it provides. The key data must not be encrypted with a password. If you are unable to specify newline characters when setting this key’s value, the container has special support for replacing a \n character sequence with a newline character: if you use this ENV variable, you can replace any newlines in the private key with a \n. Container orchestration that specifies configuration through YAML files usually can specify newlines, but specifying values through command line arguments or .env files may not be able to easily include newlines in the ENV values.
SSH_PRIVATE_KEY_FILE – A path to the private key file within the container. Offered as an alternative to specifying the contents of the private key file directly, this allows you to interoperate with various key/secrets managers, or to mount the key file from a volume or from the host. The file must be readable by the appliance user within the container because the container does not run as root. The container’s logs will print the user’s UID when it first starts up.
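For the SSH_PRIVATE_KEY_DATA option, the newline replacement described above can be done with standard tools. The following is a minimal sketch, assuming the key was saved as `my_keyfile` and that the value is being written to an env file named `client1_vm.env`; `encode_key_for_env` is a helper name invented for this example.

```shell
# Print the contents of a key file on a single line, with every real newline
# replaced by the two-character sequence \n (backslash followed by n).
encode_key_for_env() {
    awk '{ printf "%s%s", $0, "\\n" }' "$1"
}

# Append the encoded key to the env file, only if the key file exists.
if [ -f my_keyfile ]; then
    printf 'SSH_PRIVATE_KEY_DATA=%s\n' "$(encode_key_for_env my_keyfile)" >> client1_vm.env
fi
```

The resulting env file contains the private key in plain text, so it should be protected like the key file itself.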
...
Code Block | ||
---|---|---|
| ||
gcloud --project=${PROJECT} compute instances create-with-container \
my-vm-name \
--container-image gcr.io/datatheorem-public-images/private-network-proxy-client-v1:latest \
--container-env-file=client1_vm.env \
... # Any additional flags for creating the VM, such as network tags, zone, etc. |
Configuring Private APIs
Scanning a private API requires providing an API definition for the private API
Upload API definitions for any private APIs you want scanned. The hostnames must be resolvable from within the Private Network Proxy appliance’s container.
Notify support@datatheorem.com about the private APIs/hostnames and the appliance they should be scanned with
Data Theorem will configure the analyzer engine to use the Private Network Proxy for those APIs
Security Architecture
Client Security
...