Feature/pysigma version increase (#11)
* Updated pysigma core + backends
* Added Kusto, netwitness
* Added async support for LLMs
slincoln-systemtwo authored Oct 10, 2024
1 parent 06621bf commit b4c7c05
Showing 37 changed files with 2,879 additions and 2,084 deletions.
28 changes: 28 additions & 0 deletions .github/workflows/release-test.yml
@@ -0,0 +1,28 @@
name: Release to Test.PyPI
on:
  push:
    branches-ignore:  # a `branches` filter needs at least one positive pattern, so main/master are excluded here instead
      - main
      - master
jobs:
  build-and-publish:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install Poetry
        run: pipx install poetry
      - name: Install dependencies
        run: poetry install --with dev,llm
      - name: Build packages
        run: poetry build
      - name: Configure Poetry
        run: |
          poetry config repositories.testpypi https://test.pypi.org/legacy/
          poetry config pypi-token.testpypi ${{ secrets.TEST_PYPI_API_TOKEN }}
      - name: Publish to test PyPI
        if: ${{ github.event_name == 'push' }}
        run: poetry publish -r testpypi
17 changes: 9 additions & 8 deletions .github/workflows/release.yml
@@ -11,19 +11,20 @@ jobs:
     runs-on: ubuntu-20.04
     steps:
       - uses: actions/checkout@v3
-      - name: Install Poetry
-        run: pipx install poetry
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: 3.8
-          cache: poetry
+          python-version: '3.11'
+      - name: Install Poetry
+        run: pipx install poetry
       - name: Verify versioning
         run: |
           [ "$(poetry version -s)" == "${GITHUB_REF#refs/tags/v}" ]
       - name: Install dependencies
-        run: poetry install
+        run: poetry install --with dev,llm
       - name: Run tests
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
         run: poetry run pytest
       - name: Build packages
         run: poetry build
@@ -32,9 +33,9 @@ jobs:
           poetry config repositories.testpypi https://test.pypi.org/legacy/
           poetry config pypi-token.testpypi ${{ secrets.TEST_PYPI_API_TOKEN }}
           poetry config pypi-token.pypi "${{ secrets.PYPI_API_TOKEN }}"
-      - name: Publish to test PyPI
-        if: ${{ github.event_name == 'push' }}
-        run: poetry publish -r testpypi
+      #- name: Publish to test PyPI
+      #  if: ${{ github.event_name == 'push' }}
+      #  run: poetry publish -r testpypi
       - name: Publish to PyPI
         if: ${{ github.event_name == 'release' }}
         run: poetry publish
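The "Verify versioning" step in this workflow compares the Poetry package version with the pushed tag after stripping the `refs/tags/v` prefix. A minimal Python sketch of the same comparison follows; the function name and the example version strings are hypothetical and only illustrate the logic of the shell test, they are not part of the repository.

```python
def tag_matches_version(git_ref: str, package_version: str) -> bool:
    """Mirror `[ "$(poetry version -s)" == "${GITHUB_REF#refs/tags/v}" ]`."""
    prefix = "refs/tags/v"
    # `${GITHUB_REF#refs/tags/v}` removes the prefix only when the ref starts with it;
    # otherwise the ref is left unchanged and the comparison fails.
    tag_version = git_ref[len(prefix):] if git_ref.startswith(prefix) else git_ref
    return tag_version == package_version


# Hypothetical values: a tag ref that matches, and a branch ref that does not.
print(tag_matches_version("refs/tags/v1.2.3", "1.2.3"))
print(tag_matches_version("refs/heads/main", "1.2.3"))
```

A mismatch makes the shell test exit non-zero, which fails the job before anything is built or published.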
18 changes: 12 additions & 6 deletions .github/workflows/test.yml
@@ -11,20 +11,26 @@ jobs:
     strategy:
       matrix:
         os: [ 'ubuntu-20.04', 'windows-2019', 'macos-12' ]
-        python-version: [ '3.8', '3.9', '3.10', '3.11' ]
+        python-version: [ '3.9', '3.10', '3.11', '3.12' ]
     runs-on: ${{ matrix.os }}
     steps:
       - uses: actions/checkout@v3
-      - name: Install Poetry
-        run: pipx install poetry
-      - name: Set up Python
+      - name: Set up Python ${{ matrix.python-version }}
         uses: actions/setup-python@v4
         with:
           python-version: ${{ matrix.python-version }}
-          cache: poetry
+      - name: Install Poetry
+        run: pipx install poetry
+      - name: Configure Poetry
+        run: |
+          poetry env use ${{ matrix.python-version }}
+          poetry --version
+          poetry env info
       - name: Install dependencies
-        run: poetry install
+        run: poetry install --with dev,llm
       - name: Run tests
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
         run: poetry run pytest --cov=sigmaiq --cov-report term --cov-report xml:cov.xml -vv
       - name: Store coverage for badge
         if: ${{ runner.os == 'Linux' }}
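The `strategy.matrix` block above runs one job per combination of operating system and Python version. As a rough illustration of how that expands, the sketch below enumerates the combinations in plain Python; the variable names are illustrative, and the values are copied from the updated matrix.

```python
from itertools import product

# Values taken from the updated matrix in test.yml.
os_list = ["ubuntu-20.04", "windows-2019", "macos-12"]
python_versions = ["3.9", "3.10", "3.11", "3.12"]

# Each (os, python-version) pair becomes one job in the matrix.
jobs = [{"os": os_name, "python-version": version}
        for os_name, version in product(os_list, python_versions)]

print(len(jobs))
for job in jobs:
    print(job)
```

Dropping 3.8 and adding 3.12 keeps the matrix at three operating systems by four Python versions.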
214 changes: 125 additions & 89 deletions README.md

Large diffs are not rendered by default.

96 changes: 47 additions & 49 deletions examples/custom_field_mappings.py
@@ -1,49 +1,47 @@
# %% This example shows how to use the SigmAIQ pySigma wrapper to provide custom field mappings for a backend
# %% This will allow you to translate specific field names to custom field names during rule translation

# %% Import SigmAIQ
from sigmaiq import SigmAIQBackend, SigmAIQPipeline

# %% Import pprint for pretty printing, and copy for copying rules
from pprint import pprint
from copy import copy

# %% A basic Sigma Rule in YAML str to convert to a query.
# %% SigmAIQ also accepts a rule in JSON/Dict format, SigmaRule objects, and SigmaCollection objects

sigma_rule = """
title: whoami Command
description: Detects a basic whoami commandline execution
logsource:
    product: windows
    category: process_creation
detection:
    selection1:
        - CommandLine|contains: 'whoami.exe'
    condition: selection1
"""

# %% Create a SigmAIQ backend and translate the rule to a Splunk query
sigmaiq_backend = SigmAIQBackend(backend="splunk").create_backend()
query = sigmaiq_backend.translate(copy(sigma_rule)) # Returns List of queries

print("\nSplunk Query: ", end="\n\n")
pprint(query[0])
print("\n-------------------")

# %% Create custom field mappings
# %% This will map the CommandLine field to a custom field named "CustomCommandLine"
custom_field_mappings = {"CommandLine": "CustomCommandLine"}
my_custom_pipeline = SigmAIQPipeline.from_fieldmap(custom_field_mappings, priority=0).create_pipeline()

# %% Create a SigmAIQ backend and translate the rule to a Splunk query with our custom field mappings
sigmaiq_backend = SigmAIQBackend(
    backend="splunk",
    processing_pipeline=my_custom_pipeline).create_backend()

query = sigmaiq_backend.translate(copy(sigma_rule)) # Returns List of queries

print("\nSplunk Query with Custom Fieldmappings: ", end="\n\n")
pprint(query[0])
print("\n-------------------")

# %% This example shows how to use the SigmAIQ pySigma wrapper to provide custom field mappings for a backend
# %% This will allow you to translate specific field names to custom field names during rule translation

# %% Import SigmAIQ
from sigmaiq import SigmAIQBackend, SigmAIQPipeline

# %% Import pprint for pretty printing, and copy for copying rules
from pprint import pprint
from copy import copy
from typing import Dict, Union, List

# %% A basic Sigma Rule in YAML str to convert to a query.
# %% SigmAIQ also accepts a rule in JSON/Dict format, SigmaRule objects, and SigmaCollection objects

sigma_rule = """
title: whoami Command
description: Detects a basic whoami commandline execution
logsource:
    product: windows
    category: process_creation
detection:
    selection1:
        - CommandLine|contains: 'whoami.exe'
    condition: selection1
"""

# %% Create a SigmAIQ backend and translate the rule to a Splunk query
sigmaiq_backend = SigmAIQBackend(backend="splunk").create_backend()
query = sigmaiq_backend.translate(copy(sigma_rule)) # Returns List of queries

print("\nSplunk Query: ", end="\n\n")
pprint(query[0])
print("\n-------------------")

# %% Create custom field mappings
# %% This will map the CommandLine field to a custom field named "CustomCommandLine"
custom_field_mappings: Dict[str, Union[str, List[str]]] = {"CommandLine": "CustomCommandLine"}
my_custom_pipeline = SigmAIQPipeline.from_fieldmap(custom_field_mappings, priority=0).create_pipeline()

# %% Create a SigmAIQ backend and translate the rule to a Splunk query with our custom field mappings
sigmaiq_backend = SigmAIQBackend(backend="splunk", processing_pipeline=my_custom_pipeline).create_backend()

query = sigmaiq_backend.translate(copy(sigma_rule)) # Returns List of queries

print("\nSplunk Query with Custom Fieldmappings: ", end="\n\n")
pprint(query[0])
print("\n-------------------")
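To make the idea of a field mapping concrete, the sketch below renames keys in a plain dictionary using the same `Dict[str, Union[str, List[str]]]` shape as `custom_field_mappings`. It is a toy illustration only: `apply_field_map` is a hypothetical helper, not part of SigmAIQ or pySigma, and the real pipeline rewrites field names inside the parsed rule rather than in event dictionaries.

```python
from typing import Dict, List, Union

FieldMap = Dict[str, Union[str, List[str]]]


def apply_field_map(fields: Dict[str, str], field_map: FieldMap) -> Dict[str, str]:
    """Return a copy of `fields` with names rewritten according to `field_map`."""
    renamed: Dict[str, str] = {}
    for name, value in fields.items():
        # Unmapped fields keep their original name.
        target = field_map.get(name, name)
        # A list target fans one source field out to several names (toy behaviour).
        targets = target if isinstance(target, list) else [target]
        for target_name in targets:
            renamed[target_name] = value
    return renamed


mapped = apply_field_map(
    {"CommandLine": "whoami.exe", "Image": "C:\\Windows\\System32\\whoami.exe"},
    {"CommandLine": "CustomCommandLine"},
)
print(mapped)
```

Only `CommandLine` is listed in the mapping, so `Image` passes through unchanged.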
73 changes: 37 additions & 36 deletions examples/llm_basic_usage.py
@@ -1,36 +1,37 @@
# %% This example will demonstrate how to use SigmAIQ to perform the following:
# %% 1. Download the latest Sigma Rule package release
# %% 2. Create embeddings of the Sigma Rules in the package
# %% 3. Create and save a VectorDB of the Sigma Rule embeddings
# %% 4. Use a similarity search on the VectorDB to find Sigma Rules similar to a provided query
from pprint import pprint

# %% NOTE, this example uses OpenAI for embeddings. Ensure you have an OpenAI API key set in your environment variable
# %% OPENAI_API_KEY

# %% Also ensure you have installed the correct requirements with:
# `pip install -r requirements/common.txt -r requirements/llm.txt`


# %% Import the SigmAIQ LLM wrapper
from sigmaiq.llm.base import SigmaLLM

# %% Create a SigmaLLM object with default settings. See the class docstring for more information
sigma_llm = SigmaLLM()

# %% The `create_sigma_vectordb()` method will automatically do all the work for you :) (only run this once)
sigma_llm.create_sigma_vectordb(save=True) # Save locally to disk

# %% Run a similarity search on the vectordb for encoded powershell commands and print top 3 results
query = "Encoded powershell commands"
matching_rules = sigma_llm.simple_search(query, k=3)
for matching_rule in matching_rules:
    print(matching_rule.page_content, end="\n\n-------------------\n\n")

# %% You can also load an existing vector store from disk (recommended)
sigma_llm.load_sigma_vectordb()

query = "certutil"
matching_rules = sigma_llm.simple_search(query, k=3)
for matching_rule in matching_rules:
    print(matching_rule.page_content, end="\n\n-------------------\n\n")
# %% This example will demonstrate how to use SigmAIQ to perform the following:
# %% 1. Download the latest Sigma Rule package release
# %% 2. Create embeddings of the Sigma Rules in the package
# %% 3. Create and save a VectorDB of the Sigma Rule embeddings
# %% 4. Use a similarity search on the VectorDB to find Sigma Rules similar to a provided query

# %% NOTE, this example uses OpenAI for embeddings. Ensure you have an OpenAI API key set in your environment variable
# %% OPENAI_API_KEY

# %% Also ensure you have installed the correct requirements with:
# `pip install -r requirements/common.txt -r requirements/llm.txt`


# %% Import SigmaLLM and OpenAIEmbeddings
from sigmaiq.llm.base import SigmaLLM
from langchain_openai import OpenAIEmbeddings

# %% Create a SigmaLLM object with an explicit OpenAI embedding model. See the class docstring for more information
sigma_llm = SigmaLLM(embedding_model=OpenAIEmbeddings(model="text-embedding-3-large"))

# %% The `create_sigma_vectordb()` method will automatically do all the work for you :) (only run this once)
sigma_llm.create_sigma_vectordb(save=True) # Save locally to disk

# %% Run a similarity search on the vectordb for encoded powershell commands and print top 3 results
query = "Encoded powershell commands"
matching_rules = sigma_llm.simple_search(query, k=3)
for matching_rule in matching_rules:
    print(matching_rule.page_content, end="\n\n-------------------\n\n")

# %% You can also load an existing vector store from disk (recommended)
sigma_llm.load_sigma_vectordb()

query = "certutil"
matching_rules = sigma_llm.simple_search(query, k=3)
for matching_rule in matching_rules:
    print(matching_rule.page_content, end="\n\n-------------------\n\n")
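The similarity search in this example ranks stored rule embeddings by how close they are to the query embedding. The sketch below shows the general idea with cosine similarity over tiny hand-written vectors. It assumes nothing about how `SigmaLLM.simple_search` is implemented; the rule names and vectors are made up for illustration, and real embeddings come from the embedding model and have far more dimensions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], rule_vecs: Dict[str, Sequence[float]], k: int = 3) -> List[str]:
    """Return the names of the k rules whose vectors are most similar to the query."""
    ranked = sorted(rule_vecs, key=lambda name: cosine(query_vec, rule_vecs[name]), reverse=True)
    return ranked[:k]


# Hypothetical 3-dimensional "embeddings" standing in for real ones.
toy_rules = {
    "encoded_powershell": [0.9, 0.1, 0.0],
    "certutil_download": [0.1, 0.9, 0.0],
    "whoami_execution": [0.0, 0.1, 0.9],
}

print(top_k([1.0, 0.0, 0.0], toy_rules, k=2))
```

A vector store does the same ranking at scale, usually with an index so that it does not have to compare the query against every stored vector.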