OSSEM – A Tool To Assess Data Quality

A tool to assess data quality, built on top of the awesome OSSEM project.

Mission

Answer the question: I want to start hunting ATT&CK techniques, which log sources and events are most suitable?
Create transparency on the strengths and weaknesses of your log sources
Provide an easy way to evaluate your logs

OSSEM Power-up Overview
Power-up uses the OSSEM Detection Data Model (DDM) as the foundation of its data quality assessment. The main reason is that the DDM provides a structured way to correlate ATT&CK Data Sources, Common Information Model (CIM) entities, and Data Dictionaries (events) with each other.
For those unfamiliar with the DDM structure, here is a sample:

ATT&CK Data Source	Sub Data Source	Source Data Object	Relationship	Destination Data Object	EventID
Process monitoring	process creation	process	created	process	4688
Process monitoring	process creation	process	created	process	1
Process monitoring	process termination	process	terminated	–	4689
Process monitoring	process termination	process	terminated	–	5

As you can see, each entry in the DDM defines a sub data source (scope) using abstract entities like process, user, file, etc. Each entry also contains the event ID to which the scope applies. You can read more about these entities here.
In a nutshell, DDM entries play a major role in removing the complexity of raw events, by providing a scope that defines how a log source (data channel) can be consumed.
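To make the idea of a scope concrete, here is a minimal sketch that models the sample rows above and groups event IDs by sub data source. The `DDMEntry` type and the `events_by_scope` helper are illustrative names, not part of power-up or OSSEM.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class DDMEntry(NamedTuple):
    """One row of the DDM sample table (hypothetical representation)."""
    data_source: str
    sub_data_source: str
    source: str
    relationship: str
    destination: Optional[str]  # None where the table shows no destination
    event_id: int


# The four rows from the sample table above.
ENTRIES = [
    DDMEntry("Process monitoring", "process creation", "process", "created", "process", 4688),
    DDMEntry("Process monitoring", "process creation", "process", "created", "process", 1),
    DDMEntry("Process monitoring", "process termination", "process", "terminated", None, 4689),
    DDMEntry("Process monitoring", "process termination", "process", "terminated", None, 5),
]


def events_by_scope(entries):
    """Group event IDs by the sub data source (scope) they satisfy."""
    scopes = defaultdict(list)
    for entry in entries:
        scopes[entry.sub_data_source].append(entry.event_id)
    return dict(scopes)


print(events_by_scope(ENTRIES))
```

Grouping by scope shows which events can be consumed interchangeably for the same hunting question, for example event IDs 4688 and 1 for process creation.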

Data Quality Dimensions
Power-up scores data quality along five distinct dimensions:

Dimension	Type	Description
Coverage	Data channel	How many devices or network segments are covered by the data channel
Timeliness	Data channel	How long does it take for the event to be available
Retention	Data channel	How long does the event remain available
Structure	Event	How complete is the event, if relevant fields are available
Consistency	Event	How standard are the event fields, if fields have been normalized

Every dimension is rated with a score from 0 (none) to 5 (excellent).

Coverage, Timeliness and Retention
These dimensions are tied to data channels, and propagate to all events provided by a channel.
Due to the nature of these dimensions, they must be rated manually, according to the specifics of each data channel.
Power-up uses resources/dcs.yml to define data channels and rate the dimensions:

data channel: sysmon
description: sysmon monitoring
coverage: 2
timeliness: 5
retention: 2
---
data channel: security
description: windows security auditing
coverage: 5
timeliness: 5
retention: 2
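The propagation from a data channel to its events can be sketched as follows. This is only an illustration under stated assumptions: the channel ratings are copied from the example above, while the `EVENTS` list and the `propagate` helper are hypothetical and do not reflect how power-up stores its data internally.

```python
# Channel ratings copied from the dcs.yml example above.
DATA_CHANNELS = {
    "sysmon": {"coverage": 2, "timeliness": 5, "retention": 2},
    "security": {"coverage": 5, "timeliness": 5, "retention": 2},
}

# Hypothetical event records: each event names the channel that provides it.
EVENTS = [
    {"event_id": 1, "data_channel": "sysmon"},
    {"event_id": 4688, "data_channel": "security"},
]


def propagate(events, channels):
    """Copy each channel's manual ratings onto every event it provides."""
    enriched = []
    for event in events:
        # Events from an unrated channel are left without channel scores.
        scores = channels.get(event["data_channel"], {})
        enriched.append({**event, **scores})
    return enriched


for row in propagate(EVENTS, DATA_CHANNELS):
    print(row)
```

Because the ratings live on the channel, changing a single value in the channel definition updates every event that channel provides.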

Structure
In order to calculate how complete the event structure is, power-up compares the data dictionary standard names with the fields of the entities (CIM) referenced in the DDM entry (source and destination).
Because not all entity fields are relevant (depends on the context), power-up uses the concept of profiles to select which fields need to match the data dictionary standard names. For example:

# OSSEM CIM Profile
process:
  - process_name
  - process_path
  - process_command_line

Note: There is an example profile in profiles/default.yml for you to play with.

The structure score is calculated with the following formula:
SCORE_PERCENT = (MATCHED_FIELDS / TOTAL_RELEVANT_FIELDS) * 100
For the sake of clarity, here is an example of how structure score is calculated:

Note: Because the Sysmon Event ID 1 data dictionary matches 100% of the relevant entity fields, the structure score will be rated as 5 (excellent).

The structure score is translated to the 0-5 scale in the following way:

Percentage	Score
0	0
1 to 25	1
26 to 50	2
51 to 75	3
76 to 99	4
100	5
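The formula and the conversion table can be combined into a short sketch. The function names are illustrative, and the handling of fractional percentages (for example 25.5) is an assumption: each upper bound in the table is treated as inclusive.

```python
def structure_percent(profile_fields, dictionary_standard_names):
    """SCORE_PERCENT = (MATCHED_FIELDS / TOTAL_RELEVANT_FIELDS) * 100."""
    relevant = set(profile_fields)
    if not relevant:
        return 0.0  # assumption: an empty profile yields 0 rather than an error
    matched = relevant & set(dictionary_standard_names)
    return len(matched) / len(relevant) * 100


def structure_score(percent):
    """Translate a percentage to the 0-5 scale from the table above."""
    if percent <= 0:
        return 0
    if percent <= 25:
        return 1
    if percent <= 50:
        return 2
    if percent <= 75:
        return 3
    if percent < 100:
        return 4
    return 5


# Profile fields taken from the CIM profile example above.
profile = ["process_name", "process_path", "process_command_line"]

# Hypothetical data dictionary that lacks process_command_line.
partial_dictionary = ["process_name", "process_path", "user_name"]

percent = structure_percent(profile, partial_dictionary)
print(round(percent, 1), structure_score(percent))
```

With two of the three profile fields matched, the percentage falls in the 51 to 75 band, so the event is rated 3.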

Note: Depending on the use case (SIEM, Threat Hunting, Forensics), you can define different profiles so that you can rate your logs differently.

Consistency
To calculate consistency, power-up simply calculates the percentage of fields with a standard name in a data dictionary. Data dictionaries with a high number of fields mapped to a standard name are more likely to correlate with CIM entities.
The consistency score is calculated with the following formula:
SCORE_PERCENT = (STANDARD_NAME_FIELDS / TOTAL_FIELDS) * 100
The consistency score is translated to the 0-5 scale in the following way:

Percentage	Score
0	0
1 to 50	1
51 to 99	3
100	5
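A minimal sketch of the consistency calculation follows. The field mapping is hypothetical (raw field name to standard name, or None when no standard name is assigned), and fractional percentages are assumed to fall into the band whose upper bound they do not exceed.

```python
def consistency_percent(fields):
    """SCORE_PERCENT = (STANDARD_NAME_FIELDS / TOTAL_FIELDS) * 100."""
    if not fields:
        return 0.0  # assumption: an empty data dictionary yields 0
    named = sum(1 for standard_name in fields.values() if standard_name)
    return named / len(fields) * 100


def consistency_score(percent):
    """Translate a percentage to the scale above (only 0, 1, 3 and 5 are used)."""
    if percent <= 0:
        return 0
    if percent <= 50:
        return 1
    if percent < 100:
        return 3
    return 5


# Hypothetical data dictionary: three of four fields have a standard name.
dictionary_fields = {
    "Image": "process_path",
    "CommandLine": "process_command_line",
    "User": "user_name",
    "RuleName": None,
}

percent = consistency_percent(dictionary_fields)
print(percent, consistency_score(percent))
```

Unlike the structure scale, this table skips the scores 2 and 4, so a dictionary moves from 1 to 3 as soon as more than half of its fields are normalized.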

How to use

Before you start

Power-up is a Python script; be sure to run pip install -r requirements.txt first
Be sure to have a local copy of the OSSEM repository

Running power-up

$> python3 powerup.py --help
[ASCII art banner]
usage: powerup.py [-h] [-o OSSEM] [-y OSSEM_YAML] [-p PROFILE] [--excel] [--elastic] [--yaml]

A tool to assess ATT&CK data source coverage, built on top of awesome OSSEM.

optional arguments:
  -h, --help            show this help message and exit
  -o OSSEM, --ossem OSSEM
                        path to import OSSEM markdown
  -y OSSEM_YAML, --ossem-yaml OSSEM_YAML
                        path to import OSSEM yaml
  -p PROFILE, --profile PROFILE
                        path to CIM profile
  --excel               export OSSEM DDM to excel
  --elastic             export OSSEM data models to elastic
  --yaml                export OSSEM data models to yaml
  --layer               export OSSEM data models to navigator layer

As you can see, power-up can consume OSSEM data in two different formats:

OSSEM markdown – The native format of OSSEM when you clone from git.
OSSEM yaml – A summarized format of OSSEM, containing only the data fields and a few metadata fields. You can use power-up to convert OSSEM markdown to yaml.

Currently, Power-up exports OSSEM output to:

Yaml – Creates OSSEM structures in yaml, in the output/ folder
Excel – Creates an OSSEM DDM table, enriched with the data quality scores, in the output/ folder
Elastic – Creates an OSSEM structure in elastic, the indexes are as follows:
- ossem.ddm – OSSEM DDM table, enriched with the data quality scores
- ossem.cim – OSSEM CIM entries
- ossem.dds – OSSEM Data Dictionaries
- ossem.dcs – OSSEM Data Channels

Note: if no profile file path is specified power-up uses profiles/default.yml by default.

Exporting to YAML

$> python3 powerup.py -o ../OSSEM --yaml
[ASCII art banner]
[*] Profile path: profiles/default.yml
[*] Parsing OSSEM from markdown
[*] Exporting OSSEM to YAML
[*] Created output/ddm_20191114_160246.yml
[*] Created output/cim_20191114_160246.yml
[*] Created output/dds_20191114_160246.yml

The goal of exporting to and importing from YAML is to facilitate OSSEM customization. Chances are that the first thing you will do is create your own data dictionaries and then add new DDM entries, so YAML will make updates easier.

Note 1: modify resources/config.yml to tell power-up the file names of the respective structures. Then you just need to place them in a folder and pass that folder to the OSSEM_YAML argument.

Note 2: power-up does not parse the entire OSSEM objects to YAML, only the data fields and some metadata (i.e. description). The reason for this is that I wanted to keep the YAML object as lean as possible, just with the data you need to assess data quality.

Exporting to EXCEL

$> python3 powerup.py -o ../OSSEM --excel
[ASCII art banner]
[*] Profile path: profiles/default.yml
[*] Parsing OSSEM from markdown
[*] Exporting OSSEM DDM to Excel
[*] Saved Excel to output/ddm_enriched_20191114_160041.xlsx

When exporting to Excel, power-up will create an eye-candy DDM, with the respective data quality dimensions for every entry:

Exporting to ELASTIC

$> python3 powerup.py -o ../OSSEM --elastic
[ASCII art banner]
[*] Profile path: profiles/default.yml
[*] Parsing OSSEM from markdown
[*] Exporting OSSEM to Elastic
[*] Creating elastic index ossem.ddm
[*] Creating elastic index ossem.cim
[*] Creating elastic index ossem.dds
[*] Creating elastic index ossem.dcs

When exporting to Elastic, power-up will store all OSSEM data in elastic. Because the DDM is also enriched with the respective data quality dimensions, you will be able to create dashboards like this:

Exporting to ATT&CK Navigator

$> python3 powerup.py -o ../OSSEM --layer
[ASCII art banner]
[*] Profile path: profiles/default.yml
[*] Parsing OSSEM from markdown
[*] Exporting OSSEM to Naviagator Layer
[*] Pulling ATT&CK data
[*] Generating data source quality layer
[*] Created output/ds_layer_20191119_220141.json

When exporting to a layer, power-up will create an ATT&CK Navigator layer JSON file, with the respective data quality dimensions for every technique:

Note: technique scores are derived from data sources average scores in the DDM.
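One way to read that note is as a plain average over the data sources linked to a technique. The sketch below assumes exactly that; the score values, the technique-to-data-source mapping, and the `technique_score` helper are all hypothetical, and power-up's actual aggregation may differ.

```python
from statistics import mean

# Hypothetical average quality score per ATT&CK data source, as found in the DDM.
DATA_SOURCE_SCORES = {
    "Process monitoring": 3.0,
    "Process command-line parameters": 4.0,
}

# Hypothetical mapping of a technique to the data sources that can detect it.
TECHNIQUE_DATA_SOURCES = {
    "T1059": ["Process monitoring", "Process command-line parameters", "File monitoring"],
}


def technique_score(technique_id, technique_sources, source_scores):
    """Average the scores of the technique's data sources that have a score.

    Data sources without a score are skipped (an assumption); a technique
    with no scored data source gets 0.
    """
    scored = [
        source_scores[name]
        for name in technique_sources.get(technique_id, [])
        if name in source_scores
    ]
    return mean(scored) if scored else 0


print(technique_score("T1059", TECHNIQUE_DATA_SOURCES, DATA_SOURCE_SCORES))
```

Skipping unscored data sources keeps a technique from being penalized for sources that are not in the DDM; treating them as 0 instead would be an equally defensible, stricter choice.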

Acknowledgements

To-Do

Create additional documentation
Export to ATT&CK Navigator Layer
Properly handle data dictionaries that share the same data channel, but have different schema depending on the operating system
Provide Kibana objects (visualizations and dashboards)

Source : KitPloit – PenTest Tools!
