...
If you prefer not to be limited by the above, we recommend utilizing our dedicated Github / Bitbucket / Gitlab integrations, which are built around Data Theorem’s Cloud infrastructure and provide the most polished developer experience (see onboarding instructions at DevSecOps > SAST Code Analysis).
Requirements
The machine running the scanner must have docker installed.
The machine running the scanner must have internet access.
We recommend a base of 8 GB RAM / 4 CPUs to run the scans. Here are our base spec recommendations for running the on-prem scanner:
Repository Size | CPUs | RAM | Disk Size (SSD) |
---|---|---|---|
0-5 GB | 4 CPUs | 8 GB | 16 GB |
5-10 GB | 8 CPUs | 16 GB | 32 GB |
10-20 GB | 16 CPUs | 32 GB | 64 GB |
Note: Scan time is relative to the repository size so the specs that fit your needs may vary based on the size of your
...
repository.
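The sizing table above can be turned into a quick pre-flight check. The sketch below is illustrative only: `recommend_spec` and `repo_size_gb` are hypothetical helper names, not part of the Data Theorem tooling, and treating a boundary size (for example exactly 5 GB) as belonging to the smaller tier is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a repository size in whole GB to the matching
# row of the base spec table. Boundary values go to the smaller tier (assumption).
recommend_spec() {
  local size_gb="$1"
  if [ "$size_gb" -le 5 ]; then
    echo "4 CPUs, 8 GB RAM, 16 GB SSD"
  elif [ "$size_gb" -le 10 ]; then
    echo "8 CPUs, 16 GB RAM, 32 GB SSD"
  elif [ "$size_gb" -le 20 ]; then
    echo "16 CPUs, 32 GB RAM, 64 GB SSD"
  else
    # The table does not cover repositories above 20 GB
    echo "larger than the documented tiers"
    return 1
  fi
}

# Hypothetical helper: size of a directory in whole GB, rounded up.
repo_size_gb() {
  local kb
  kb=$(du -sk "${1:-.}" | cut -f1)
  echo $(( (kb + 1048575) / 1048576 ))
}
```

From the root of the repository, `recommend_spec "$(repo_size_gb .)"` prints the matching row.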
Step 1: Generate a SAST Security Results API Key
...
Set DT_SAST_FAIL_MODE=true if you want the process to return a non-zero status when issues are found. This can be used to make Data Theorem SAST a blocking step of your workflow.
Set DT_SAST_NO_FORWARD_MODE=true if you want to skip forwarding scan results/metadata to Data Theorem. Note that no scan results will then be visible in the Data Theorem Portal.
...
Set DT_SAST_INCLUDE_CODE_SNIPPETS=false if you want to hide code snippets from the printed scan results (you will still see the issue location from the file path and line).
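Because DT_SAST_FAIL_MODE=true makes the process return a non-zero status when issues are found, a calling script can branch on that status. The following is a minimal sketch; `scan_gate` is a hypothetical wrapper name, and it cannot tell "issues found" apart from other failures (for example docker not starting), since both surface as a non-zero status.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs the command given as arguments and turns its
# exit status into a pass/block decision for the surrounding workflow.
scan_gate() {
  if "$@"; then
    echo "scan passed: no blocking issues reported"
    return 0
  else
    local rc=$?   # exit status of the scan command
    echo "scan blocked the workflow (exit status ${rc})"
    return "$rc"
  fi
}

# Usage (not executed here): prefix the documented docker command, with
# -e DT_SAST_FAIL_MODE=true added, by scan_gate:
#   scan_gate docker run -e DT_SAST_FAIL_MODE=true <other documented options>
```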
Local Scanning example
The Data Theorem on-prem scanner can run from your local machine.
From the root of the git repository you wish to scan, run the following command
Code Block |
---|
docker run -it \
-e DT_SAST_API_KEY=$DT_SAST_API_KEY \
-e DT_SAST_REPOSITORY_NAME="<my_org>/<my_repo>" \
-e DT_SAST_NO_FORWARD_MODE=true \
--mount type=bind,source="$(pwd)"/,target=/target \
us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast \
data_theorem_sast_analyzer scan /target |
Sample output:
...
Example with inputs to forward scan results to the [Data Theorem Portal](https://www.securetheorem.com/api/v2/security/sast):
Code Block |
---|
docker run -it \
-e DT_SAST_API_KEY=$DT_SAST_API_KEY \
-e DT_SAST_REPOSITORY_NAME="<my_org>/<my_repo>" \
-e DT_SAST_REPOSITORY_PLATFORM=BITBUCKET \
-e DT_SAST_REPOSITORY_ID={1e734a1b-8d0e-4787-a205-aba048c00a89} \
-e DT_SAST_REPOSITORY_HTML_URL="https://bitbucket.org/<my_org>/<my_repo>" \
-e DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME="main" \
-e DT_SAST_SCANNED_BRANCH="main" \
--mount type=bind,source="$(pwd)"/,target=/target \
us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast \
data_theorem_sast_analyzer scan /target |
Sample output:
Code Block |
---|
Scanning completed in 15.65 seconds
Scan results: 1 issues on commit=f719d004ef98254b46187c53ef1b3ed2f8643082
Total Issues: 1
Issues per types:
- First Party Code: 1
- SCA: 1
Issues per severity:
- High Severity: 1
- Medium Severity: 1
[
  {
    "issue_title": "Unauthenticated Route Found for Flask API",
    "issue_description": "The security of this code is compromised due to the presence of unauthenticated access to specific routes within the Flask API. This vulnerability poses a significant risk as it can potentially expose sensitive data or allow unauthorized actions to be performed. To mitigate this risk, it is crucial to implement robust authentication mechanisms that ensure only authorized users can access the protected routes.\n\nBy allowing unauthenticated access, the code fails to validate the identity of users before granting them access to certain routes. This lack of authentication opens the door for malicious actors to exploit the system and gain unauthorized access to sensitive information or perform actions that they should not be able to.\n\nTo address this issue, it is recommended to implement a secure authentication process that verifies the identity of users before granting them access to protected routes. This can be achieved through various methods such as username/password authentication, token-based authentication, or integration with third-party authentication providers.\n\nAdditionally, it is important to consider implementing other security measures such as encryption of sensitive data, input validation to prevent injection attacks, and proper error handling to avoid leaking sensitive information.\n\nBy implementing these security measures, the code can ensure that only authenticated and authorized users can access the protected routes, significantly reducing the risk of unauthorized access or data breaches. It is essential to prioritize security in the development process to safeguard sensitive data and protect the integrity of the system.",
    "issue_type": "FIRST_PARTY_CODE",
    "severity": "HIGH",
    "detected_in_file_path": "sample_code/bad_python.py",
    "detected_on_line": 7,
    "issue_code_snippet": "@app.route(\"/\")\ndef index():\n cmd = request.args.get(\"cmd\", \"\")\n exec(cmd)\n return \"\""
  },
  {
    "issue_title": "jinja2 version 3.1.2 contains a known vulnerability (via PyPI dependency): Jinja vulnerable to HTML attribute injection when passing user input as keys to xmlattr filter",
    "issue_description": "Jinja vulnerable to HTML attribute injection when passing user input as keys to xmlattr filter",
    "issue_type": "SCA",
    "severity": "MEDIUM",
    "detected_in_file_path": "sample_code/requirements.txt",
    "detected_on_line": 1,
    "issue_code_snippet": "jinja2==3.1.2\n"
  }
]
Visit https://www.securetheorem.com/api/v2/security/sast for more details |
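The printed results carry a "severity" field per issue, so a script can tally them once the output has been captured. This is a rough sketch under assumptions: the output is assumed to have been saved to a file by the caller, `count_severity` is a hypothetical helper, and it matches the field as plain text rather than parsing the JSON.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts issues of one severity level in a saved copy
# of the scan output by matching the "severity" field as plain text.
count_severity() {
  local file="$1" level="$2"
  # grep -o emits one line per match; tr strips the padding some wc builds add
  grep -o "\"severity\": \"${level}\"" "$file" | wc -l | tr -d ' '
}
```

With the sample above saved to a file, `count_severity <file> HIGH` and `count_severity <file> MEDIUM` would each report one issue.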
GitHub Actions example
Set the Data Theorem API Key as a secret variable
Go to your repository > Settings > Security > Secrets and variables > Actions > Secrets.
Click on New Repository Secret and create a secret variable named DT_SAST_API_KEY with the value retrieved in Step 1.
Scans on pushes
Code Block |
---|
name: Data Theorem SAST
# Controls when the workflow will run, adapt to your own needs
on:
  # Triggers the workflow on push or pull request events but only for the "main" branch
  # Adapt triggers to your own needs
  push:
    branches: [ "main" ]
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
jobs:
  scan:
    continue-on-error: true
    timeout-minutes: 30
    runs-on: ubuntu-latest
    container:
      image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
      env:
        DT_SAST_API_KEY: ${{ secrets.DT_SAST_API_KEY }}
        DT_SAST_REPOSITORY_NAME: ${{ github.event.repository.full_name }}
        DT_SAST_REPOSITORY_PLATFORM: GITHUB
        DT_SAST_REPOSITORY_ID: ${{ github.event.repository.id }}
        DT_SAST_REPOSITORY_HTML_URL: ${{ github.event.repository.html_url }}
        DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME: ${{ github.event.repository.default_branch }}
        DT_SAST_OUTPUT_DIR: ./
    steps:
      - uses: actions/checkout@v4
      - name: Start Data Theorem SAST Scan
        run: data_theorem_sast_analyzer scan ./
      - uses: actions/upload-artifact@v4
        with:
          name: dt-sast-scan-result
          path: ./scan-results-sarif.json |
Scans on pull requests
Code Block |
---|
name: Data Theorem SAST
# Controls when the workflow will run, adapt to your own needs
on:
  # Triggers the workflow on push or pull request events but only for the "main" branch
  # Adapt triggers to your own needs
  pull_request:
jobs:
  scan:
    continue-on-error: true
    timeout-minutes: 30
    runs-on: ubuntu-latest
    container:
      image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
      env:
        DT_SAST_API_KEY: ${{ secrets.DT_SAST_API_KEY }}
        DT_SAST_REPOSITORY_NAME: ${{ github.event.repository.full_name }}
        DT_SAST_REPOSITORY_PLATFORM: GITHUB
        DT_SAST_REPOSITORY_ID: ${{ github.event.repository.id }}
        DT_SAST_REPOSITORY_HTML_URL: ${{ github.event.repository.html_url }}
        DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME: ${{ github.event.repository.default_branch }}
        DT_SAST_SCAN_HEAD_REF: "refs/remotes/origin/${{ github.head_ref }}"
        DT_SAST_SCAN_TARGET_REF: "refs/remotes/origin/${{ github.base_ref }}"
        DT_SAST_FAIL_MODE: true
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # IMPORTANT: Needed because by default, actions/checkout@v4 doesn't load the full git history/refs
      - name: Start Data Theorem SAST Scan
        run: data_theorem_sast_analyzer scan ./ |
Bitbucket pipeline example
Set the Data Theorem API Key as a secret variable
Go to your repository > Repository Settings > Repository Variables.
Add a variable named DT_SAST_API_KEY with the value retrieved in Step 1 and make sure the Secured option is checked.
Code Block |
---|
image: atlassian/default-image:3
pipelines:
  # Triggers the pipeline on push events but only for the "main" branch
  # Adapt triggers to your own needs
  branches:
    main:
      - step:
          name: 'Data Theorem SAST'
          image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
          script:
            - echo "Your security scan goes here..."
            - export DT_SAST_API_KEY=$DT_SAST_API_KEY
            - export DT_SAST_REPOSITORY_NAME=$BITBUCKET_REPO_FULL_NAME
            - export DT_SAST_REPOSITORY_PLATFORM=BITBUCKET
            - export DT_SAST_REPOSITORY_ID=$BITBUCKET_REPO_UUID
            - export DT_SAST_REPOSITORY_HTML_URL=$BITBUCKET_GIT_HTTP_ORIGIN
            - export DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME="main"
            - data_theorem_sast_analyzer scan ./
  pull-requests:
    # Triggers the pipeline on pull request events
    # Adapt triggers to your own needs
    "**":
      - step:
          name: 'Data Theorem SAST'
          image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
          script:
            - echo "Your security scan goes here..."
            - export DT_SAST_API_KEY=$DT_SAST_API_KEY
            - export DT_SAST_REPOSITORY_NAME=$BITBUCKET_REPO_FULL_NAME
            - export DT_SAST_REPOSITORY_PLATFORM=BITBUCKET
            - export DT_SAST_REPOSITORY_ID=$BITBUCKET_REPO_UUID
            - export DT_SAST_REPOSITORY_HTML_URL=$BITBUCKET_GIT_HTTP_ORIGIN
            - export DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME="main"
            - export DT_SAST_SCAN_HEAD_REF=$BITBUCKET_COMMIT
            - export DT_SAST_SCAN_TARGET_REF=$BITBUCKET_PR_DESTINATION_COMMIT
            - export DT_SAST_FAIL_MODE=true
            - data_theorem_sast_analyzer scan ./ |
Gitlab pipeline example
Set the Data Theorem API Key as a secret variable
Go to your project > Settings > CI/CD > Variables.
Add a variable named DT_SAST_API_KEY with the value retrieved in Step 1 and make sure the Masked option is checked.
Note: the Gitlab pipeline must run the Data Theorem SAST step on an executor that supports the image feature.
See https://docs.gitlab.com/runner/executors/#compatibility-chart for more information on compatible executors.
Code Block |
---|
stages:
  - security-scan

datatheorem-sast-scan-branch-job:
  only:
    - main # Trigger on default branch push, replace 'main' with the name of your default branch
  tags:
    - gitlab-runner-docker # Needs to be an executor compatible with the `image` feature
  stage: security-scan
  image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
  script:
    - export DT_SAST_API_KEY=$DT_SAST_API_KEY
    - export DT_SAST_REPOSITORY_NAME=$CI_PROJECT_PATH
    - export DT_SAST_REPOSITORY_PLATFORM="GITLAB_ON_PREM"
    - export DT_SAST_REPOSITORY_ID=$CI_PROJECT_ID
    - export DT_SAST_REPOSITORY_HTML_URL=$CI_PROJECT_URL
    - export DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME=$CI_DEFAULT_BRANCH
    - export DT_SAST_SCAN_HEAD_REF=$CI_COMMIT_REF_NAME
    - data_theorem_sast_analyzer scan ./

datatheorem-sast-scan-merge-request-job:
  only:
    - merge_requests
  tags:
    - gitlab-runner-docker # Needs to be an executor compatible with the `image` feature
  stage: security-scan
  image: us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
  script:
    - export DT_SAST_API_KEY=$DT_SAST_API_KEY
    - export DT_SAST_REPOSITORY_NAME=$CI_PROJECT_PATH
    - export DT_SAST_REPOSITORY_PLATFORM="GITLAB_ON_PREM"
    - export DT_SAST_REPOSITORY_ID=$CI_PROJECT_ID
    - export DT_SAST_REPOSITORY_HTML_URL=$CI_PROJECT_URL
    - export DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME=$CI_DEFAULT_BRANCH
    - export DT_SAST_SCAN_TARGET_REF=$CI_MERGE_REQUEST_TARGET_BRANCH_NAME
    - data_theorem_sast_analyzer scan ./ |
Azure DevOps Pipeline Example
Create a new Azure DevOps Pipeline.
Add a variable named DT_SAST_API_KEY with the value retrieved in Step 1 and make sure the Keep this value secret option is checked. (See https://learn.microsoft.com/en-us/azure/devops/pipelines/process/set-secret-variables?view=azure-devops&tabs=yaml%2Cbash)
The Azure Pipeline definition should look like this:
Code Block |
---|
trigger:
  - main
pool:
  vmImage: ubuntu-latest
steps:
  - script: |
      docker run \
        -e DT_SAST_API_KEY='$(DT_SAST_API_KEY)' \
        -e DT_SAST_REPOSITORY_NAME=$(Build.Repository.Name) \
        -e DT_SAST_REPOSITORY_PLATFORM=AZURE_DEVOPS \
        -e DT_SAST_REPOSITORY_ID=$(Build.Repository.ID) \
        -e DT_SAST_REPOSITORY_HTML_URL=$(Build.Repository.Uri) \
        -e DT_SAST_REPOSITORY_DEFAULT_BRANCH_NAME="main" \
        -e DT_SAST_SCANNED_BRANCH=$(Build.SourceBranchName) \
        -e DT_SAST_SCAN_HEAD_REF="HEAD" \
        --mount type=bind,source="$(pwd)"/,target=/target \
        us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest \
        data_theorem_sast_analyzer scan /target
    displayName: 'Data Theorem On-Prem SAST' |
Troubleshooting
SSL Errors
If the scanner is failing because of SSL errors, you may be running it behind a proxy that makes SSL verification fail.
If this is the case, we recommend the following:
Build a custom Docker image that embeds your own valid SSL certificates.
Make sure these certificates allow calls to api.securetheorem.com from the machine that is running the Data Theorem On-Prem Scanner.
The Dockerfile would look like this:
Code Block |
---|
FROM us-central1-docker.pkg.dev/prod-scandal-us/datatheorem-sast/datatheorem-sast:latest
# Copy SSL certificates to add
COPY --from=<YOUR_SOURCE> <PATH_TO_YOUR_SSL_CERTS> /usr/share/ca-certificates/custom
# Add certificates to /etc/ca-certificates.conf
RUN for crt in /usr/share/ca-certificates/custom/*.crt; do \
echo "Adding $crt" && echo "custom/$(basename "$crt")" >> /etc/ca-certificates.conf; \
done
# Update bundled CA certificates at /etc/ssl/certs/ca-certificates.crt
RUN update-ca-certificates
ENV DT_SAST_PATH_TO_SSL_CERTS_FILE=/etc/ssl/certs/ca-certificates.crt |
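The loop in the Dockerfile above registers one `custom/<file>.crt` entry per certificate found in the copied directory. As a sanity check before building, the sketch below previews those entries for a local directory; `preview_cert_entries` is a hypothetical helper and only mirrors the loop's naming logic, it does not modify any certificate store.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the entries the Dockerfile loop would append to
# /etc/ca-certificates.conf for the .crt files found in the given directory.
# Returns non-zero when the directory contains no .crt file.
preview_cert_entries() {
  local dir="$1" crt found=0
  for crt in "$dir"/*.crt; do
    [ -e "$crt" ] || continue   # the glob matched nothing
    echo "custom/$(basename "$crt")"
    found=1
  done
  [ "$found" -eq 1 ]
}
```

An empty result usually means the certificates use another extension (for example .pem), which the Dockerfile loop would skip as well.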
If this is not working, please contact support@datatheorem.com for help.