Overview
The Data Theorem Splunk application is a private Splunk App distributed by Data Theorem that automates log analysis. It searches for Web events as defined by the Splunk Common Information Model (CIM) add-on and sends the resulting access logs to Data Theorem for analysis.
System Requirements
Splunk Enterprise version 9.x (Python 3.9 recommended)
Operating System | Requirements |
---|---|
Linux | Architecture: x86_64 |
Dependencies
Supported Splunk Deployment Types
Standalone deployment: The app can be deployed on a standalone Splunk instance, where Splunk performs both the search head and indexer roles.
Distributed deployment: In a distributed setup, where the search head is separate from indexers, ensure the app is deployed on the search head to perform the scheduled queries.
App Permissions
Access to run scheduled queries.
Access to the indexes storing web-related data (tag='web').
Access to export data and make REST API calls.
Access to read and write the API key stored in the Splunk secrets store.
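As a rough illustration of the kind of search these permissions need to cover, the sketch below builds an SPL string in Python that selects events tagged `web` and keeps only named fields. The function name `build_web_search`, the 15-minute window, and the three field names are assumptions made for this example; they are not taken from the app's actual scheduled query.

```python
def build_web_search(fields, earliest="-15m"):
    """Return an SPL string that selects events tagged 'web' and keeps only `fields`.

    Hypothetical helper: the real app's query is not documented here.
    """
    if not fields:
        raise ValueError("at least one field is required")
    field_list = ", ".join(fields)
    # tag=web matches events tagged for the CIM Web data model;
    # `table` restricts the output to the listed fields.
    return f"search tag=web earliest={earliest} | table {field_list}"


# Example field names are assumed for illustration only.
query = build_web_search(["http_method", "url", "status"])
print(query)
```

The role that runs a search like this needs read access to every index holding events tagged `web`; otherwise the search returns no results for those indexes.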
Data Protection
The Data Theorem Splunk App searches only indexes containing Web CIM data, and uses only the fields listed below. The Web CIM fields contain metadata about requests, similar to the information in an nginx or other web server access log. Because the app reads only these defined fields from indexed Web CIM logs, the risk of accidentally disclosing sensitive data is minimal.
The Web CIM model includes a `cookie` field that may contain HTTP Cookie values. We exclude this field and do not store or transmit any HTTP cookies.
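The allowlist approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's code: `ALLOWED_FIELDS` holds a small assumed subset of field names, and `filter_event` is a hypothetical helper. Because `cookie` is not in the allowlist, it is dropped along with any other field that is not explicitly listed.

```python
# Illustrative subset only; the app's real allowlist is the field table in this document.
ALLOWED_FIELDS = frozenset(
    {"http_method", "url", "status", "src", "dest", "http_user_agent"}
)


def filter_event(event):
    """Return a copy of `event` containing only allowlisted fields."""
    return {name: value for name, value in event.items() if name in ALLOWED_FIELDS}


# A made-up event for demonstration.
raw_event = {
    "http_method": "GET",
    "url": "https://example.com/api/items",
    "status": "200",
    "cookie": "session=abc123",   # sensitive: must not leave Splunk
    "custom_field": "internal",   # not allowlisted, so also dropped
}

safe_event = filter_event(raw_event)
print(safe_event)
```

An allowlist fails closed: a field added to the index later is excluded by default until someone adds it deliberately, which is the property that keeps the disclosure risk low.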
Complete list of Web CIM fields
Dataset name | Field name | Data type | Description | Abbreviated list of example values |
---|---|---|---|---|
Web | | string | The action taken by the server or proxy. | |
Web | | string | The application detected or hosted by the server/site such as WordPress, Splunk, or Facebook. | |
Web | | number | The total number of bytes transferred ( | |
Web | | number | The number of inbound bytes transferred. | |
Web | | number | The number of outbound bytes transferred. | |
Web | | boolean | Indicates whether the event data is cached or not. | prescribed values: |
Web | | string | The category of traffic, such as may be provided by a proxy server. | required for pytest-splunk-addon |
Web | | string | The destination of the network traffic (the remote host). You can alias this from more specific fields, such as | |
Web | | string | These fields are automatically provided by asset and identity correlation features of applications like Splunk Enterprise Security. Do not define extractions for these fields when writing add-ons. | |
Web | | string | | |
Web | | string | | |
Web | | number | The destination port of the web traffic. | required for pytest-splunk-addon |
Web | | number | The time taken by the proxy event, in milliseconds. | |
Web | | string | The content-type of the requested HTTP resource. | recommended |
Web | | string | The HTTP method used in the request. | |
Web | | string | The HTTP referrer used in the request. The W3C specification and many implementations misspell this as | recommended |
Web | | string | The domain name contained within the HTTP referrer used in the request. | recommended |
Web | | string | The user agent used in the request. | |
Web | | number | The length of the user agent used in the request. | required for pytest-splunk-addon |
Web | | number | The amount of time it took to receive a response, if applicable, in milliseconds. | |
Web | | string | The virtual site which services the request, if applicable. | |
Web | | string | The source of the network traffic (the client requesting the connection). | |
Web | | string | These fields are automatically provided by asset and identity correlation features of applications like Splunk Enterprise Security. Do not define extractions for these fields when writing add-ons. | |
Web | | string | | |
Web | | string | | |
Web | | string | The HTTP response code indicating the status of the proxy request. | |
Web | | string | This automatically generated field is used to access tags from within datamodels. Do not define extractions for this field when writing add-ons. | |
Web | | string | The path of the resource served by the webserver or proxy. | other: |
Web | | string | The path of the resource requested by the client. | other: |
Web | | string | The URL of the requested HTTP resource. | |
Web | | string | The domain name contained within the URL of the requested HTTP resource. | recommended |
Web | | number | The length of the URL. | |
Web | | string | The user that requested the HTTP resource. | recommended |
Web | | string | These fields are automatically provided by asset and identity correlation features of applications like Splunk Enterprise Security. Do not define extractions for these fields when writing add-ons. | |
Web | | string | | |
Web | | string | | |
Web | | string | The vendor and product of the proxy server, such as | recommended |
Storage | | string | The error code that occurred while accessing the storage account. | other: |
Storage | | string | The operation performed on the storage account. | other: |
Storage | | string | The name of the bucket or storage account. | other: |
Installation
The