# Filesystem Forensics

## Forensic Analysis with VQL

---

## Module overview

* Velociraptor implements many forensic capabilities in VQL
* This module will focus on typical forensic analysis and deep inspection capabilities. We will learn how to put these capabilities together to produce effective artifacts, and when to use them.
---

## Searching for files - glob()

* One of the most common operations in DFIR is searching for files efficiently.
* Velociraptor has the `glob()` plugin to search for files using a glob expression.
* Glob expressions use wildcards to search the filesystem for matches.
* Paths are separated by / or \ into components
* A `*` is a wildcard match (e.g. `*.exe` matches all files ending with .exe)
* Alternatives are expressed as comma separated strings in `{}` e.g. `*.{exe,dll,sys}`
* A `**` denotes recursive search, e.g. `C:\Users\**\*.exe`

---

## Exercise: Search for exe

* Search the user's home directory for binaries.

```sql
SELECT * FROM glob(globs='C:\\Users\\**\\*.exe')
```

Note the need to escape `\` in strings. You can use `/` instead and specify multiple globs to search all at the same time:

```sql
SELECT * FROM glob(globs=['C:/Users/**/*.exe', 'C:/Users/**/*.dll'])
```

---

## Filesystem accessors

* Glob is a very useful concept to search hierarchical trees
* Velociraptor supports direct access to many different such trees via accessors (essentially FS drivers):
  * `file` - uses OS APIs to access files.
  * `ntfs` - uses raw NTFS parsing to access low level files
  * `reg` - uses OS APIs to access the Windows registry
  * `raw_reg` - search in a registry hive
  * `zip` - search inside zip files

---

## The registry accessor

* Uses the OS API to access the registry
* Top level consists of the major hives (`HKEY_USERS` etc)
* Values appear as files, Keys appear as directories
* Default value is named "@"
* Value content is included inside the Data attribute
* Can escape components containing / using quotes: `HKLM\Microsoft\Windows\"http://www.microsoft.com/"`

---

## The registry accessor

* The `OSPath` column includes the key (as directory) and the value (as a filename) in the path.
* The registry accessor also includes value contents in the `Data` column if they are small enough.

---

## Exercise - RunOnce artifact

* Write an artifact which hashes every binary mentioned in Run/RunOnce keys.
* "Run and RunOnce registry keys cause programs to run each time that a user logs on."
* https://learn.microsoft.com/en-us/windows/win32/setupapi/run-and-runonce-registry-keys
* You can test this by adding a key:

```text
REG ADD "HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run" /v Notepad /t REG_SZ /d "C:\Windows\notepad.exe"
```

* Can you think of limitations?

---

## Exercise - RunOnce artifact
```
LET RunGlob = '''HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\*'''

SELECT Name, Mtime, Data.value AS Data
FROM glob(globs=RunGlob, accessor="registry")
```
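The exercise asks for a hash of each referenced binary, which the query above does not yet produce. A minimal sketch of one way to extend it is shown below - the regex that extracts an executable path from the value data is illustrative only and will miss quoted paths, arguments and environment variables (one of the limitations worth thinking about):

```sql
LET RunGlob = '''HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run\*'''

-- Naive extraction of an executable path from the value data, then hash it.
-- Real Run values may be quoted, carry arguments or use %SystemRoot% style
-- variables, so treat this regex as a starting point only.
SELECT Name, Mtime, Data.value AS Target,
       hash(path=parse_string_with_regex(
              string=Data.value,
              regex='''(?P<Path>[a-zA-Z]:\\[^",]+\.exe)''').Path) AS Hash
FROM glob(globs=RunGlob, accessor="registry")
```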
---

## Exercise: Hash all files provided in the globs

* Create an artifact that hashes files found by user provided globs.
* BONUS:
  * Support a number of concurrent threads.
  * Accept a list of hashes and filter on those.

---

# Searching data

## Scanning file contents

---

## Searching data

* A powerful DFIR technique is searching bulk data for patterns
  * Searching for CC data in process memory
  * Searching for URLs in process memory
  * Searching binaries for malware signatures
  * Searching registry for patterns

Bulk searching helps to identify evidence without needing to parse file formats

---

## YARA - The Swiss army knife

* YARA is a powerful keyword scanner
* Uses rules designed to identify binary patterns in bulk data
* YARA is optimized to scan for many rules simultaneously.
* Velociraptor supports YARA scanning of bulk data (via accessors) and memory: `yara()` and `proc_yara()`

---

## YARA rules

YARA rules use a special domain specific language

```yara
rule X {
   strings:
     $a = "hello" nocase
     $b = "Goodbye" wide
     $c = /[a-z]{5,10}[0-9]/i

   condition:
     $a and ($b or $c)
}
```

---

## Exercise: drive-by download

* You suspect a user was compromised by a drive-by download (i.e. they clicked and downloaded malware delivered by mail, ads etc).
* You think the user used the Edge browser but you have no idea of the internal structure of the browser cache/history etc.
* Write an artifact to extract potential URLs from the Edge browser directory (also, where is it?)

---

## Step 1: Figure out where to look

Looks like somewhere in `C:\Users\<username>\AppData\Local\Microsoft\Edge\**`

```
LET Globs = "C:/Users/Administrator/AppData/Local/Microsoft/Edge/**"

SELECT OSPath FROM glob(globs=Globs)
```

---

## Step 2: Recover URLs

* We don't exactly understand how Edge stores data but we know roughly what a URL is supposed to look like!
* YARA is our sledgehammer!

```
rule URL {
  strings:
    $a = /https?:\/\/[a-z0-9\/+:\?.-]+/i
  condition:
    any of them
}
```

---

## Step 3: Let's do this!
```
LET Globs = "C:/Users/Administrator/AppData/Local/Microsoft/Edge/**"

LET URLRule = '''
rule URL {
  strings:
    $a = /https?:\/\/[a-z0-9\/+:\?.-]+/i
  condition:
    any of them
}
'''

SELECT * FROM foreach(row={
    SELECT OSPath FROM glob(globs=Globs)
  }, query={
    SELECT OSPath, String.Data AS Hit
    FROM yara(files=OSPath, rules=URLRule, number=10000000)
  })
```
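The raw output repeats the same URL many times. A possible refinement, sketched below and reusing the `Globs` and `URLRule` definitions from the query above, is to deduplicate and count the hits:

```sql
-- Deduplicate the recovered URLs and count how often each one appears.
-- Reuses the Globs and URLRule variables defined in the previous query.
SELECT Hit AS URL, count() AS Hits
FROM foreach(row={
    SELECT OSPath FROM glob(globs=Globs)
  }, query={
    SELECT OSPath, String.Data AS Hit
    FROM yara(files=OSPath, rules=URLRule, number=10000000)
  })
GROUP BY URL
```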
---

## YARA best practice

* You can get YARA rules from many sources (threat intel, blog posts etc)
* YARA is really a first level triage tool:
  * Depending on the signature, many false positives are expected
  * Some signatures are extremely specific so make a great signal
  * Try to collect additional context around the hits to eliminate false positives.
* YARA scanning is relatively expensive! Consider more targeted glob expressions and client side throttling, since YARA scanning is usually not time critical.

---

## Uploading files

* Velociraptor can collect file data:
  * Over the network
  * Locally to a collection zip file.
* Driven by VQL

The `upload()` VQL function copies a file using an accessor to the relevant container

---

## Exercise: Upload recent executables

* Collect all recent executables in users' home directories
  * Written in the past week
* Write your own VQL by combining `glob()` and `upload()`
```
LET Globs = "C:/Users/Administrator/**/*.exe"

SELECT Mtime, OSPath, upload(file=OSPath) AS Upload
FROM glob(globs=Globs)
WHERE Mtime > now() - 7 * 24 * 60 * 60
```
---

## NTFS Overview

* NTFS is the file system in all modern Windows operating systems.
* Feature packed, with a design focused on storage optimization and resilience.
* NTFS implements journalling to record metadata changes and track the state and integrity of the file system.
  * Allows for recovery after system crashes to avoid data loss
* File system objects are referenced in the Master File Table (MFT)

---

## New Technology File System

* In NTFS, the Master File Table (MFT) is at the heart of the file system: a structured database that stores metadata entries for every file and folder.
* Every object gets an entry within the MFT. Each entry is usually 1024 bytes long and contains a series of attributes that fully describe the object.

---

## MFT entries contain attributes
## File entry examples

* $STANDARD_INFORMATION
* $FILE_NAME (Windows long name)
* $FILE_NAME (short name)
* $DATA
* $DATA (alternate data stream, sometimes)

## Folder entry examples

* $STANDARD_INFORMATION
* $FILE_NAME (Windows long name)
* $FILE_NAME (short name)
* $INDEX_ROOT
* $INDEX_ALLOCATION (sometimes)
---

## NTFS Analysis

Velociraptor offers a number of plugins to access detailed information about NTFS:

* `parse_mft()`: parses each MFT entry and returns high level metadata about the entry - including reconstructing the full path of the entry by traversing parent MFT entries.
* `parse_ntfs()`: given an MFT ID, this function displays information about the various streams (e.g. `$DATA`, `$FILE_NAME` etc)
* `parse_ntfs_i30()`: scans the `$I30` stream in directories to recover potentially deleted entries.

---

## Finding suspicious files

Parse the MFT using `Windows.NTFS.MFT`

* A common DFIR use case is finding files by:
  * File name
  * Path
  * File type
  * Content
* Velociraptor plugins:
  * glob
  * parse_mft (see the sketch below)
  * yara
  * other content based plugins
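As a quick sketch of the `parse_mft()` plugin described above, the query below parses the live `$MFT` on the C: drive and keeps executables under `C:\PerfLogs` (the staging folder used later in this module). Column names such as `OSPath`, `LastModified0x10` and `InUse` reflect recent Velociraptor releases - check the output of your version:

```sql
-- Parse the raw $MFT via the ntfs accessor and filter on path and extension.
-- The $STANDARD_INFORMATION modification time is exposed as LastModified0x10
-- in current releases; adjust the column names if your build differs.
SELECT EntryNumber, OSPath, FileSize, LastModified0x10, InUse
FROM parse_mft(filename="C:/$MFT", accessor="ntfs")
WHERE OSPath =~ "(?i)PerfLogs" AND OSPath =~ "(?i)\\.exe$"
```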
---

## Windows.Forensics.FilenameSearch

* Apply YARA on the MFT
  * fast yara
  * simple string based
  * filename / top level folder only
  * comma separated
* Crude and less control
* Verbose results
---

## Windows.NTFS.MFT

* Parses MFT
* Easy to use
* Filters
  * Path
  * File name
  * Drive
  * Time bounds
  * Size
* Performance optimised
---

## Exercise - Generate test data

To automatically prep your machine run this script:

```powershell
### NTFS exercise setup
## 1. Download some files to test various content and add ADS to simulate a manual download from a browser
$downloads = (
    "https://live.sysinternals.com/PsExec64.exe",
    "https://live.sysinternals.com/procdump64.exe",
    "https://live.sysinternals.com/sdelete64.exe"
)

foreach ( $url in $downloads){
    "Downloading " + $Url
    $file = Split-Path $Url -Leaf
    $dest = "C:\PerfLogs\" + $file
    $ads = "[ZoneTransfer]`r`nZoneId=3`r`nReferrerUrl=https://18.220.58.123/yolo/`r`nHostUrl=https://18.220.58.123/yolo/" + $file + "`r`n"

    Remove-Item -Path $dest -force -ErrorAction SilentlyContinue
    Invoke-WebRequest -Uri $Url -OutFile $dest -UseBasicParsing
    Set-Content -Path $dest":Zone.Identifier" $ads
}
```

---

## More setup

```powershell
## 2. Create a PS1 file in the staging folder (any text will do but this is a powershell extension)
echo "Write-Host 'this is totally a resident file'" > C:\Perflogs\test.ps1

## 3. Modify the shortname on a file
fsutil file setshortname C:\PerfLogs\psexec64.exe fake.exe

## 4. Create a process dump - open calculator (calc.exe) first
calc.exe ; start-sleep 2
C:\PerfLogs\procdump64.exe -accepteula -ma win32calc C:\PerfLogs\calc.dmp
get-process | where-object { $_.Name -like "*win32calc*" } | Stop-Process

## 5. Create a zip file in the staging folder
Compress-Archive -Path C:\PerfLogs\* -DestinationPath C:\PerfLogs\exfil.zip -CompressionLevel Fastest

## 6. Delete dmp, zip and ps1 files - deleted file discovery is important for later!
Remove-Item -Path C:\PerfLogs\*.zip, C:\PerfLogs\*.dmp, C:\PerfLogs\*.ps1
```

Note:

Manual alternative to the script above:

* Download and copy to staging folder C:\PerfLogs\
  * https://live.sysinternals.com/procdump64.exe
  * https://live.sysinternals.com/sdelete64.exe
  * https://live.sysinternals.com/psexec64.exe
* Add ADS to simulate Mark of the Web

Create a PS1 file in the staging folder (any text will do but this is a powershell extension)

```
echo "Write-Host 'this is totally a resident file'" > C:\Perflogs\test.ps1
```

Modify the short name on a file

```
fsutil file setshortname C:\PerfLogs\psexec64.exe fake.exe
```

Create a process dump - open calculator (`calc.exe`) first

```
C:\PerfLogs\procdump64.exe -accepteula -ma calc C:\PerfLogs\calc.dmp
```

Create a zip file in the staging folder - open `C:\Perflogs` in Explorer, highlight the files and select: Send to > Compressed (zipped) folder.

Delete the dmp, zip and ps1 files - deleted file discovery is important for later!

```
Remove-Item -Path C:\PerfLogs\*.zip, C:\PerfLogs\*.dmp, C:\PerfLogs\*.ps1
```
---

## Exercise

* Find contents of `C:\Perflogs`
* Review metadata of objects
* Explore leveraging filters
  * to target specific files or file types
  * to find files limited to a time frame
* Can you find the deleted files?
  * You may get lucky and have an unallocated file show.
  * Try `Windows.Forensics.Usn` with filters looking for suspicious extensions in our staging location!
---

## The USN journal

* The Update Sequence Number Journal, or Change Journal, is maintained by NTFS to record filesystem changes.
* Records metadata about filesystem changes.
* Resides in the path `$Extend\$UsnJrnl:$J`

![](../../modules/ntfs_forensics/usnj.png)

---

## USN Journal

* Records are appended to the end of the file
* The file is sparse - periodically NTFS will remove the range at the start of the file to keep it sparse
* Therefore the file will report a huge size but will actually only take about 30-40 MB on disk.
* When collecting the journal file, Velociraptor will collect the sparse file.

---

## Exercise - Windows.Forensics.Usn
Target `C:\PerfLogs` with the `PathRegex` field.

* Typically the USN journal only records the filename, MFT ID and parent MFT ID of each record. Velociraptor automatically reconstructs the expected path so the user can filter on path.
* This artifact reports `FullPath` results using "/" as the path separator.
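If you would rather experiment with raw VQL than the packaged artifact, a rough sketch using the `parse_usn()` plugin is shown below. The column names (`Timestamp`, `OSPath`, `Reason`) follow recent releases and are meant as a guide only:

```sql
-- Read the live USN journal on C: and keep entries for our staging folder
-- with the extensions we deleted during setup. Verify the exact column
-- names against your Velociraptor version.
SELECT Timestamp, OSPath, Reason
FROM parse_usn(device="C:")
WHERE OSPath =~ "(?i)PerfLogs" AND OSPath =~ "(?i)\\.(zip|dmp|ps1)$"
```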
---

## Exercise - UsnJ solution

* There are many entries even for a simple file action like download to disk.

![](../../modules/ntfs_forensics/USN_results.png)

---

## Exercise - UsnJ solution
* But these are simple to detect when you know what to look for!
![](../../modules/ntfs_forensics/USN_groupby.png)
![](../../modules/ntfs_forensics/USN_delete.png)
---

## Advanced NTFS: Alternate Data Stream

* Most browsers attach an ADS to files downloaded from the internet.
* Use the VFS viewer to view the ADS of downloaded files.
* Use ADS Hunter to discover more interesting ADS
* Use `Windows.Analysis.EvidenceOfDownload` to identify downloaded files and unpacked ZIP files.
Note: The inset shows a typical frequency analysis of naturally occurring ADS.

What is the `Wof` stuff? https://devblogs.microsoft.com/oldnewthing/20190618-00/?p=102597
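To inspect the `Zone.Identifier` streams created by the earlier setup script, a small sketch using the `ntfs` accessor (which lists alternate data streams as separate entries named `file:stream`) might look like the following - the glob pattern is illustrative:

```sql
-- List Zone.Identifier streams under C:\PerfLogs and read their content.
-- With the ntfs accessor an ADS shows up as its own entry, e.g.
-- "PsExec64.exe:Zone.Identifier".
SELECT OSPath,
       read_file(filename=OSPath, accessor="ntfs", length=1024) AS ZoneData
FROM glob(globs="C:/PerfLogs/*:Zone.Identifier", accessor="ntfs")
```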
---

## Volume Shadow Copies

NTFS allows for a special copy-on-write snapshot feature called `Volume Shadow Copy`.

Create a VSS copy on your own machine using WMI:

```sh
wmic shadowcopy call create Volume='C:\'
```

Ensure your system contains a volume shadow copy:

```bash
vssadmin list shadows
```

Note: On Windows server OS you can use:

```bash
vssadmin create shadow
```

---

## NTFS accessor and VSS

* When a VSS copy is created, it is accessible via a special device. Velociraptor allows the VSS copies to be enumerated by listing them at the top level of the filesystem.
* At the top level, the accessor provides metadata about each device in the "Data" column, including its creation time. This is essentially the same output as `vssadmin list shadows`.

---

### Velociraptor shows VSS at the top level of the filesystem

![](vss.png)

---

## Exercise: Find all VSS copies of the event logs

* We can glob the VSS just as if they were a directory
* Makes it easy to fetch every version of a certain file (e.g. a log file).

---

## Exercise: Find all VSS copies of the event logs
```
SELECT * FROM glob(globs="/*/windows/system32/winevt/logs/system.evtx",
                   accessor="ntfs")
```

![](evtx_in_vss.png)
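A possible extension of this exercise, sketched below, is to hash each copy so identical snapshots of the log can be spotted before deciding what to collect:

```sql
-- Hash every copy of the System event log found across the VSS devices so
-- duplicate snapshots can be identified before uploading anything.
SELECT OSPath, hash(path=OSPath, accessor="ntfs").SHA256 AS SHA256
FROM glob(globs="/*/windows/system32/winevt/logs/system.evtx",
          accessor="ntfs")
```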
---

## Carving The USN Journal

* The USN Journal is very useful to determine filesystem activities.
* However, it is normally limited to about 30 MB
  * In practice the USN Journal rolls over quickly
* However, the journal data is not overwritten!
  * There is a large likelihood that entries remain for a long time.

Let's carve them with `Windows.Carving.USN`

---

### Carving the USN Journal can recover events in the distant past

![](carving_usnj.png)
---

# More about Accessors

## Accessing data in many ways.

---

## What is an accessor?

* Velociraptor offers many plugins that operate on file data.
* Sometimes file data is available in many different contexts.
* An `accessor` is a driver that presents data in the shape of a filesystem:
  * Hierarchical data can be searched using the `glob()` plugin.
  * Data can be opened using a filename just like a file.
* The actual implementation of how to read the data is varied.
* Accessors deal with `OSPath` objects.

---

## The OSPath object

* Consists of the following data points:
  * `Components`: Directories are represented as a series of components.
  * `Path`: The stringified version of the components above (using path separator etc).
  * `Type`: The type of the OSPath controls how to serialize and parse the Components into a string (e.g. path separator).
  * `DelegateAccessor`: Some accessors work on the output from other accessors.
  * `DelegatePath`: The path that will be passed to the `DelegateAccessor`.
* Has convenience methods and behaviors:
  * Indexing or slicing the `OSPath` gets the indexed `Component`
  * Has `Basename` and `Dirname` methods
  * Addition with a string is overloaded.

---

## Parsing a string into an OSPath

* Paths are strings that are interpreted by the `accessor` to reference a file or directory.
* Accessors are free to interpret the string in whatever way they wish.
* Accessors consume an `OSPath` object and return `OSPath` objects.
* Within the query all paths are represented by `OSPath` objects.
* On the way in (via plugin args) or out (via JSON encoding) the `OSPath` objects are converted to strings.
* Use the `pathspec()` function to control parsing of strings more carefully.

---

## Exercise: Parsing paths

In a VQL notebook parse the following paths using the `pathspec()` function:

* `/usr/bin/ls`
* `\\.\C:\Windows\Notepad.exe`
* `HKLM\Software\"http://www.google.com"\Some Key\Value`

---

## Exercise: Parsing paths

```sql
LET X = '''Path
/usr/bin/ls
\\.\C:\Windows\Notepad.exe
HKLM\Software\"http://www.google.com"\Some Key\Value
'''

SELECT pathspec(parse=Path, path_type="windows").Components,
       pathspec(parse=Path, path_type="ntfs").Components,
       pathspec(parse=Path, path_type="linux").Components,
       pathspec(parse=Path, path_type="registry").Components
FROM parse_csv(accessor="data", filename=X)
```

---

### Parsing paths

![](parsing_paths.png)

---

## Life of a Path

```
SELECT * FROM glob(globs="*", root='''\\.\C:\Windows''', accessor="ntfs")
```

1. The glob plugin accepts a `pathspec` for the root parameter.
2. It is given a string `\\.\C:\Windows`.
3. `glob()` will now attempt to convert the string to an `OSPath` object. This depends on the accessor to interpret the data.
4. The `ntfs` accessor interprets the string into a list of path components: `\\.\C:`, `Windows`
5. The plugin will now list all the files in the directory using the `ntfs` accessor. For each file, we get an OSPath object.

---

## Exercise: OSPath operations

* The OSPath object can be used to manipulate paths
* Useful methods: `Basename`, `Dirname`, `Components`
* Adding components, indexing gets specific components.
* Works for complex nested paths

```sql
LET ZipPath = "C:/Users/Administrator/Documents/test.docx"

SELECT OSPath, OSPath.Basename, OSPath[0],
       OSPath.Components, OSPath.Dirname,
       OSPath.Dirname + "Hello.txt", OSPath.Path
FROM glob(globs="**",
          root=pathspec(DelegateAccessor="file", DelegatePath=ZipPath),
          accessor="zip")
```

---

### OSPath operations

![](ospath_operations.png)

---

## Basic accessors: file, auto

* We already encountered the `file` and `auto` accessors previously.
* Provide access to files.
* There are a number of different flavors:
  * A Windows path starts with a drive letter, or a device name, and uses `\` (preferred) or `/` as the path separator.
  * Linux paths are rooted at `/`

---

## The data and scope accessors

* Velociraptor contains many plugins that read files via accessors
* Sometimes data is already available as a string.
* The `data` accessor allows VQL plugins to treat a string as a file.
  * The filename is taken as the content of the file.
* The `scope` accessor is similar
  * The filename is taken as the name of a scope variable that contains the data.
  * Useful for uploads as the original path is also sent

---

## The ZIP accessor

* Zip files are a common basis for many file formats
  * e.g. `docx`, `pptx`, `jar`, `odt`
* Velociraptor makes it easy to access them using the `zip` accessor:
  * `Path`: The path within the zip file
  * `DelegateAccessor`: The zip accessor will use this to open the underlying file.
  * `DelegatePath`: The zip accessor will use this to open the underlying file.

---

## Exercise: Search a word document for a keyword

* Create a `docx` document using `wordpad`
* Apply the `glob()` plugin with the zip accessor to view all the files.
* Apply the `yara()` plugin to search the content of the zip for a keyword.

---

## Solution: Search a word document for a keyword
```sql
LET ZipPath = "C:/Users/Administrator/Documents/test.docx"

LET Rule = '''
rule X {
  strings:
    $a="secret"
  condition:
    any of them
}
'''

SELECT * FROM foreach(row={
    SELECT * FROM glob(globs="**",
        root=pathspec(DelegateAccessor="file", DelegatePath=ZipPath),
        accessor="zip")
  }, query={
    SELECT * FROM yara(rules=Rule, files=OSPath, accessor="zip")
  })
```
---

## Exercise: Identify vulnerable Java programs

* Java programs can be compiled into a `JAR` file.
* This is basically a zip file bundling all dependencies.
* Because dependencies are embedded in the JAR file:
  * If a bundled library is vulnerable, the entire program is vulnerable
  * It is hard to know exactly which version of each library exists
* Write a VQL artifact to detect JAR files that contain a particular set of hashes.

---

## Exercise: Identify vulnerable Java programs

* Download the vulnerable JAR from: https://github.com/tothi/log4shell-vulnerable-app/releases
* Download vulnerable hashes from: https://gist.github.com/xanda/7ac663984f3560f0b39b18437362d924

---

## Solution: Identify vulnerable Java programs
```
LET HashList = SELECT Content
FROM http_client(url="https://gist.github.com/xanda/7ac663984f3560f0b39b18437362d924/raw/79d765296634c0407db99763d0b2c7c318e30078/Vulnerable_JndiLookup_class_hashes.csv")

LET VulnHashes <= SELECT *
FROM parse_csv(accessor="data", filename=HashList[0].Content)

LET VulnMD5 <= VulnHashes.md5sum
LET VulnMD5Regex <= join(array=VulnHashes.md5sum, sep="|")

SELECT * FROM foreach(row={
    SELECT OSPath AS JAR
    FROM glob(globs="C:/Users/Administrator/Downloads/*.jar")
  }, query={
    SELECT JAR, OSPath.Path AS Member, Size,
           hash(accessor="zip", path=OSPath) AS Hash
    FROM glob(globs="**", root=pathspec(DelegatePath=JAR), accessor="zip")
  })
WHERE Hash.MD5 =~ VulnMD5Regex // OR Hash.MD5 IN VulnMD5
```
---

## Raw registry parsing

* In a previous exercise we looked for a key in the `HKEY_CURRENT_USER` hive.
* Any artifacts looking in `HKEY_USERS` using the Windows API are limited to the set of users currently logged in! We need to parse the raw hive to reliably recover all users.
* Each user's settings are stored in: `C:\Users\<username>\ntuser.dat`
* It is a raw registry hive file format. We need to use the `raw_reg` accessor.
  * The raw_reg accessor uses a PathSpec to access the underlying file.

---

## Exercise: Repeat the Run/RunOnce example with raw registry
```
LET RunGlob = '''SOFTWARE\Microsoft\Windows\CurrentVersion\Run\*'''

SELECT * FROM foreach(row={
    SELECT OSPath FROM glob(globs="C:/Users/*/NTUser.dat")
  }, query={
    SELECT Name, Mtime, Data.value AS Data
    FROM glob(globs=RunGlob, accessor="raw_reg",
              root=pathspec(DelegatePath=OSPath))
  })
```
---

## The process accessor: accessing process memory

* Velociraptor can read process memory using the `process` accessor
* Process memory is not contiguous - it is very sparse.
* Velociraptor handles the sparse nature automatically
  * The yara plugin automatically handles sparse regions
  * The upload plugin skips uploading unmapped memory

---

## Exercise: Write an artifact that uploads process memory

* Search for a keyword hit and upload the entire process memory if there is a hit.
* Option: Use `proc_dump()` to get a Windows debugger compatible crashdump
* Option: Use `upload()` to get a Velociraptor sparse file.

---

## Exercise: Write an artifact that uploads process memory

* Using `proc_dump()`
```
SELECT * FROM foreach(row={
    SELECT Pid, Name FROM pslist()
    WHERE Name =~ "wordpad"
  }, query={
    SELECT upload(file=FullPath) AS Upload, Name, Pid
    FROM proc_dump(pid=Pid)
  })
```
---

## Exercise: Write an artifact that uploads process memory

* Using `upload()`
```
SELECT Pid, Name,
       upload(accessor="process", file=str(str=Pid)) AS Upload
FROM pslist()
WHERE Name =~ "wordpad"
```
---

## The sparse accessor

* Velociraptor can handle sparse files correctly.
* The `sparse` accessor allows you to create a sparse overlay over other data
* Velociraptor will skip sparse regions when scanning or uploading
* Useful when we want to avoid reading certain data
  * e.g. memory scanning or carving

```
FileName = pathspec(
    DelegateAccessor="data",
    DelegatePath=MyData,
    Path=[dict(Offset=0, Length=5), dict(Offset=10, Length=5)])
```

---

## Exercise: Upload only the first 10k of each file

* Write an artifact that uploads only the first 10kb of each file.
```
LET Globs = "C:/Users/*/Downloads/*"

SELECT OSPath,
       upload(accessor="sparse",
              file=pathspec(DelegatePath=OSPath,
                            Path=[dict(offset=0, length=10000),]),
              name=OSPath) AS Upload
FROM glob(globs=Globs)
```
---

## The smb accessor

* It is possible to access a remote SMB server using the `smb` accessor.
* The accessor requires credentials for accessing the remote server.
* Credentials are provided via a scope parameter

```sql
LET SMB_CREDENTIALS <= dict(`192.168.1.11`="admin:password")

-- Or build from artifact args
LET SMB_CREDENTIALS <= set(item=dict(), field=ServerName,
                           value=format(format="%s:%s", args=[Username, Password]))
```

---

## Exercise: Configuring an SMB share

* Configure an SMB share on your server and place a file there.
* Write a VQL query that searches the SMB share.
* See https://docs.velociraptor.app/docs/offline_triage/remote_uploads/#smb-share
```
LET SMB_CREDENTIALS <= dict(`192.168.1.112`="administrator:test!password")

SELECT * FROM glob(globs="**", root="//192.168.1.112/uploads", accessor="smb")
WHERE OSPath =~ ".exe$"
```
---

# Parsing

## Processing and analyzing evidence on the endpoint

---

## Parsing evidence on the endpoint

* By analyzing files directly on the endpoint we can extract relevant data immediately.
* Velociraptor supports sophisticated parsing strategies that allow VQL artifacts to extract maximum detail directly on the endpoint:
  * Built in parsers (`parse_ntfs`, `parse_xml`, `parse_json`)
  * Text based parsers (`parse_string_with_regex`, `split`)
  * Binary parser
* By eliminating the need for post processing we can scale analysis across a larger number of endpoints

---

## Built in parsers - SQLite

* SQLite is used in many contexts and many applications
* Velociraptor has a built in parser for SQLite that can be controlled via VQL.
* If the SQLite file is locked, Velociraptor will make a local copy!
* This allows Velociraptor to access many different types of evidence.

---

## Exercise: Parse the Chrome Top Sites file

* Location is:

```sh
C:\Users\*\AppData\Local\Google\Chrome\User Data\Default\Top Sites
```

* SQLite query to see the schema:

```
SELECT * FROM sqlite_master
```

---

## Sqlite analysis

![](sqlite_analysis.png)

* Streamlined artifact: https://github.com/Velocidex/SQLiteHunter

---

## Complex RegEx parsing

* Sometimes log files are less structured and a single regex based approach is not reliable enough.
* In this case think about how to split the data in a reliable way and apply regular expressions multiple times.
* Divide and conquer

---

## Parsing with Regular Expressions

* Two main regex parsing tools:
  * `parse_records_with_regex()` splits text into larger "records"
  * `parse_string_with_regex()` extracts specific fields from each "record"

---

## Exercise: MPLogs

* [Mind the MPLog: Leveraging Microsoft Protection Logging for Forensic Investigations](https://www.crowdstrike.com/blog/how-to-use-microsoft-protection-logging-for-forensic-investigations/)
* MPLog files are found in `C:\ProgramData\Microsoft\Windows Defender\Support`
* Events described in [this reference](https://learn.microsoft.com/en-us/microsoft-365/security/defender-endpoint/troubleshoot-performance-issues?view=o365-worldwide)

Write a VQL parser to parse these logs.

---

## Steps for solution

1. Locate data from disk and split into separate log lines (records).
   * Use `glob()`, `parse_lines()` and `utf16()`
2. Find a strategy to parse each record:
   * Will a one pass regex work?
   * What is the structure of the line?
   * Use `split()`
3. Think about how to present the data:
   * Dict addition to combine several fields.

---

## Possible solution

* Not really perfect because the log is not very consistent.
```sql
LET LogGlob = '''C:\ProgramData\Microsoft\Windows Defender\Support\MPLog*.log'''

LET AllLines = SELECT * FROM foreach(row={
    SELECT utf16(string=read_file(filename=OSPath, length=10000000)) AS Data,
           OSPath
    FROM glob(globs=LogGlob)
  }, query={
    SELECT Line, OSPath
    FROM parse_lines(filename=Data, accessor="data")
  })

LET ParseData(Data) = to_dict(item={
    SELECT split(sep_string=":", string=_value)[0] AS _key,
           split(sep_string=":", string=_value)[1] AS _value
    FROM foreach(row=split(sep=", ", string=Data))
  })

LET Lines = SELECT OSPath, Line,
       parse_string_with_regex(string=Line,
          regex="^(?P<Timestamp>[^ ]+) (?P<Data>.+)") AS P
FROM AllLines
WHERE P.Timestamp

SELECT * FROM foreach(row={
    SELECT dict(Timestamp=P.Timestamp, _Line=Line, _OSPath=OSPath) +
           ParseData(Data=P.Data) AS Data
    FROM Lines
  }, column="Data")
```
---

# The Binary Parser

---

## Parsing binary data

* A lot of the data we want to parse is binary only
* Having a powerful binary parser built into VQL allows the VQL query to parse many more things!
* The [VQL binary parser](https://github.com/Velocidex/vtypes) is declarative.
  * Focus on **what** the data means, not how to extract it.
  * The exact data layout is specified by a `Profile`

---

## What is binary data?

* Serialized representation of abstract data structures
* Declare the layout of the data and let the parser recover the data from the binary stream.
* Example: Parsing integers from a binary stream.

```sql
LET Data = unhex(string="0102030405060708")

LET Parsed = parse_binary(accessor="data", filename=Data,
                          offset=4, struct="uint32")

SELECT Parsed, format(format="%#08x", args=Parsed) FROM scope()
```

---

## Parsing a struct

* In practice most software arranges simple types into "records" or "structs". This lays the data out in "fields".
* We can define a profile to interpret the binary data as fields.

```sql
LET Data = unhex(string="0102030405060708")

LET Profile = '''[
  ["Header", 12, [
    ["Field1", 0, "uint16"],
    ["Field2", 4, "uint32"]
  ]]]
'''

SELECT parse_binary(accessor="data", filename=Data,
                    struct="Header", profile=Profile)
FROM scope()
```

---

## Parsing Structs

![](parsing_structs.png)

---

## Calculating fields

* In practice many fields, such as offsets or sizes, are calculated based on the data.
* Velociraptor supports these derived fields using a `VQL Lambda`.
* A VQL Lambda is a function that receives the current struct as a parameter and returns a single value.
* The calculated value will be used to parse the field.

```sql
LET Profile = '''[
  ["Header", 12, [
    ["OffsetOfField2", 1, "uint8"],
    ["Field2", "x=>x.OffsetOfField2 + 2", "uint32"]
  ]]]
'''
```

---

## Calculating fields

![](parsing_structs_derived_offset.png)

---

## Unsecured SSH keys
A common mechanism of privilege escalation is compromise of SSH keys that are not protected by a password.

* Such keys can be immediately leveraged to gain access to other hosts
* e.g. AWS instances by default do not use passwords!
![](ssh_keys_aws.png)
---
## Traditional approach

1. Collect all SSH private key files in the environment.
2. Store them in a central location.
3. Run a specialized parser to determine if the keys are protected.

## Velociraptor approach

1. Write a (reusable) artifact to parse SSH private key files and determine if they are protected.
2. Hunt across the environment for unprotected files.
3. Remediate or focus on weak keys.
---

## How can I tell if a file is protected?

### Parsing SSH private key files

* Private key files come in various formats and types
* Let's develop some VQL to parse them
* File format reference: https://coolaj86.com/articles/the-openssh-private-key-format/
![](ssh_keys_format.png)
---

## Exercise: Parse SSH private keys

* Create some new ssh keys using `ssh-keygen`

![](ssh-keygen.png)

---

## Step 1: Read the file
```sql
LET Filename = '''C:\Users\Administrator/.ssh/id_rsa'''

SELECT read_file(filename=Filename) FROM scope()
```

![](read_ssh_keyfile.png)
---

## Step 2: Extract the base64 encoded part

* Using regular expressions we can isolate the base64 encoded data.
* Apply `base64decode()` to recover the binary data.
* What is the binary data though?
* Write a "Profile" and apply it to the binary data to extract fields.
```sql
LET Filename = '''C:\Users\Administrator/.ssh/id_rsa'''

LET Decoded(Filename) = base64decode(
    string=parse_string_with_regex(
        string=read_file(filename=Filename),
        regex="(?sm)KEY-----(.+)-----END").g1)

SELECT Decoded(Filename=Filename) FROM scope()
```
---

## Step 3: Binary parser built into VQL

* Declare the struct layout as a data driven "profile".
```sql
LET SSHProfile = '''[
  ["Header", 0, [
    ["Magic", 0, "String", {
        "length": 100,
    }],
    ["cipher_length", 15, "uint32b"],
    ["cipher", 19, "String", {
        "length": "x=>x.cipher_length",
    }]
  ]]]
'''
```
* We can update the profile at any time without rebuilding the client.

---

### Step 4: Parse the header and find if the key is encrypted
```sql
LET Filename = '''C:\Users\Administrator/.ssh/id_rsa'''

LET SSHProfile = '''[
  ["Header", 0, [
    ["Magic", 0, "String", {
        "length": 100,
    }],
    ["cipher_length", 15, "uint32b"],
    ["cipher", 19, "String", {
        "length": "x=>x.cipher_length",
    }]
  ]]]
'''

LET Decoded(Filename) = base64decode(
    string=parse_string_with_regex(
        string=read_file(filename=Filename),
        regex="(?sm)KEY-----(.+)-----END").g1)

SELECT parse_binary(
    accessor="data",
    filename=Decoded(Filename=Filename),
    profile=SSHProfile,
    struct="Header") AS Parsed
FROM scope()
```
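The `cipher` field is what answers the original question: in the OpenSSH format an unprotected key records the cipher name `none`, while a passphrase protected key records the cipher actually used (for example `aes256-ctr`). A compact, self-contained sketch of the final check might look like this:

```sql
LET Filename = '''C:\Users\Administrator/.ssh/id_rsa'''

-- Only the cipher fields are needed to decide whether the key is protected.
LET SSHProfile = '''[
  ["Header", 0, [
    ["cipher_length", 15, "uint32b"],
    ["cipher", 19, "String", {"length": "x=>x.cipher_length"}]
  ]]]
'''

LET Decoded(Filename) = base64decode(
    string=parse_string_with_regex(
        string=read_file(filename=Filename),
        regex="(?sm)KEY-----(.+)-----END").g1)

LET Parsed <= parse_binary(accessor="data",
                           filename=Decoded(Filename=Filename),
                           profile=SSHProfile, struct="Header")

-- "none" means no passphrase; any other value (e.g. aes256-ctr) means the
-- key material is encrypted.
SELECT Filename, Parsed.cipher AS Cipher,
       Parsed.cipher = "none" AS Unprotected
FROM scope()
```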
---

**Full SSH private key parser**

*Uses the binary parser, regular expressions and file search*

![](private_key_artifact.png)

---

## Exercise: Parsing root certificates in the registry

* Subverting the certificate root store is an effective technique to intercept encrypted traffic: https://attack.mitre.org/techniques/T1553/004/

![](mitre_cert_subversion.png)

---

## Exercise: Parsing root certificates in the registry

* Root certs are stored in the registry as a binary blob.
* Inspect the binary data
* Parse the binary data

https://blog.nviso.eu/2019/08/28/extracting-certificates-from-the-windows-registry/

```sql
LET ColumnTypes = dict(Blob="base64hex")

LET Glob = '''HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SystemCertificates\ROOT\Certificates\**\Blob'''

SELECT base64encode(string=Data.value) AS Blob
FROM glob(globs=Glob, accessor="registry")
```

---

## Solution
```sql
LET ColumnTypes = dict(Blob="base64hex")

LET Glob = '''HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SystemCertificates\ROOT\Certificates\**\Blob'''

LET profile = '''[
  ["Record", "x=>x.Length + 12", [
    ["Type", 0, "uint32"],
    ["Length", 8, "uint32"],
    ["Data", 12, "String", {
        length: "x=>x.Length",
        term: "",
    }],
    ["UnicodeString", 12, "String", {
        encoding: "utf16",
    }]
  ]],
  ["Records", 0, [
    ["Items", 0, "Array", {
        type: "Record",
        count: 20,
        sentinel: "x=>x.Length = 0",
    }]
  ]]
]'''

SELECT OSPath, Certificate
FROM foreach(row={
    SELECT OSPath, base64encode(string=Data.value) AS Blob,
           parse_binary(accessor="data", filename=Data.value,
                        profile=profile, struct="Records") AS Parsed
    FROM glob(globs=Glob, accessor="registry")
  }, query={
    SELECT OSPath, parse_x509(data=Data)[0] AS Certificate
    FROM foreach(row=Parsed.Items)
    WHERE Type = 32
  })
WHERE Certificate
```
---

## Parsing the trusted certificates from the registry

![](parsing_certificates_from_registry.png)
---

# Timelines

## Combining different sources of information

---

## What is a timeline?

* It is a way to visualize time based rows from multiple sources.
* The main concepts:
  * `Timeline`: Just a series of rows keyed on a time column. The rows can be anything at all, as long as a single column is specified as the time column and it is sorted by time order.
  * `Super Timeline`: A grouping of several timelines viewed together on the same timeline.

---

## Timeline workflow

* Timelines are created from post processed results in the notebook:
  1. Collect a set of artifacts with relevant information:
     * e.g. MFT entries, Prefetch, Event logs etc.
  2. Create a `Super Timeline` to hold all the timelines together.
  3. Reduce the data from each artifact source by manipulating the VQL query:
     * Reduce the number of rows by keeping only interesting rows.
     * Reduce the columns by adding only important columns.
  4. Add the table to the `Super Timeline` by selecting the time column.

---

## Example: Correlating execution with files

* Run the following command:

```
curl.exe -o test.ps1 https://www.google.com/
```

* Collect two sources of evidence:
  * `Windows.Timeline.Prefetch`: Collects execution times.
  * `Windows.NTFS.MFT`: Collects filesystem information.
* For the sake of the exercise, limit times to the previous day or so.

---

### Example: Correlating execution with files

![](collecting_prefetch_and_mft.png)

---

### Example: Correlating execution with files

* We want to reduce the total data in each table to make it easier to see.
* Usually a time column and a single other column

![](reducing_table_for_timeline.png)

---

### Example: Correlating execution with files

![](reducing_table_for_timeline_2.png)

---

### Example: Correlating execution with files

* Create a super timeline to hold the individual timelines.

![](add_super_timeline.png)

---

### Example: Correlating execution with files

![](new_empty_timeline.png)

![](adding_timeline.png)

---

### Example: Correlating execution with files

* Investigating temporal correlation

![](temporal_correlation.png)
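As a rough illustration of the "reduce" step, the notebook cell for each source can be trimmed to a time column plus one descriptive column before it is added to the Super Timeline. The query below is a sketch only - `source()` is the standard way to read a collection from a notebook, but the column names shown (`LastRunTimes`, `ExePath`) are placeholders and should be replaced with whatever your collected artifact actually returns:

```sql
-- Keep just a timestamp and one descriptive column from the Prefetch
-- collection before adding it to the Super Timeline. Column names are
-- placeholders; check the actual artifact output in your notebook.
SELECT LastRunTimes AS EventTime, ExePath AS Description
FROM source(artifact="Windows.Timeline.Prefetch")
WHERE EventTime > now() - 86400
```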
---

## Review And Summary

* Searching for files is a powerful technique
  * We can search by file names
  * Search file content with YARA
* Velociraptor's powerful NTFS parser provides access to filesystem specific information.
* Accessors allow Velociraptor's plugins to be applied to a wide range of situations.

---

## Review And Summary

* Velociraptor has a number of powerful parsing tools
  * Allows deep analysis to be performed on the endpoint
* Many built in parsers for common file formats (SQLite, Zip, PST etc).
* Using regular expressions we can parse text based files
* A declarative binary parser can extract information from binary files.
* Velociraptor timelines allow us to visualize multiple time based events together.