Support psort filters when ingesting into Timesketch #1931

Open
mpilking opened this issue Aug 28, 2021 · 7 comments

Comments

@mpilking

mpilking commented Aug 28, 2021

Is your feature request related to a problem? Please describe.
Some hosts produce very large plaso data sets. As an example, a domain controller produced nearly 18 million parsed events when processed with log2timeline. Often I don't want to ingest all of those events. Many of them are from a timeframe that is irrelevant. Also, the majority of parsed events from the DC are Windows event logs. The way log2timeline parses Windows event logs results in duplicate events: one for Creation Time and one for Last Modification Time. I understand where these come from, but 99.99% of the time I only care about the Creation Time events. So I have a psort filter that outputs just the Windows event log Creation Time events and also restricts the output to a time range of interest. This works great for psort output to CSV. However, I don't know of a way to apply an equivalent filter when importing into Timesketch. It would be great to be able to use psort filters with timesketch_importer.

Describe the solution you'd like
Here is an example of a psort filter that will narrow down those 18 million events to under 2 million. This is what I'd like to replicate with timesketch_importer or a similar option.

psort.py --output-time-zone 'UTC' -o dynamic -w dc-triage.csv dc-triage.plaso "(((parser == 'winevtx') and (timestamp_desc == 'Creation Time')) or (parser != 'winevtx')) and ( date > datetime('2021-02-01T00:00:00'))"
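To make the intent of the filter above easier to check, here is a rough Python restatement of its logic. This is only an illustrative sketch: `keep_event` and `CUTOFF` are hypothetical names, not part of plaso or Timesketch, and the cutoff is treated as a naive UTC datetime.

```python
from datetime import datetime

# Same cutoff as in the psort filter above, assumed to be UTC.
CUTOFF = datetime(2021, 2, 1)


def keep_event(parser: str, timestamp_desc: str, event_time: datetime) -> bool:
    """Return True if the event would pass the psort filter shown above.

    The filter keeps every non-winevtx event, keeps winevtx events only
    when they are 'Creation Time' events, and in both cases requires the
    event to be later than the cutoff.
    """
    wanted_type = parser != "winevtx" or timestamp_desc == "Creation Time"
    return wanted_type and event_time > CUTOFF
```

The expression `parser != "winevtx" or timestamp_desc == "Creation Time"` is logically equivalent to the longer `((parser == 'winevtx') and (timestamp_desc == 'Creation Time')) or (parser != 'winevtx')` form used in the command.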

Describe alternatives you've considered
The only other option I'm aware of is to use timesketch_importer or the webui to upload all the data and then go back and delete unwanted documents from the Elasticsearch index. For example, this will delete documents prior to 2021-02-01 and will delete any winevtx events with the timestamp_desc containing "Modification":

curl -XPOST --header 'Content-Type: application/json' localhost:9200/plaso-dc-triage-index/_delete_by_query -d '{"query": {"range": {"datetime": {"time_zone": "+00:00","lt": "2021-02-01T00:00:00"}}}}'
curl -XPOST --header 'Content-Type: application/json' localhost:9200/plaso-dc-triage-index/_delete_by_query -d '{"query" : {"bool" : {"must": [{"match": {"parser": "winevtx"}}, {"match": {"timestamp_desc": "Modification"}}]}}}'
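For reference, the two deletes above could probably be folded into a single request by wrapping both conditions in a `bool` query with `should` clauses. The sketch below only builds the JSON body in Python; `build_cleanup_query` is a hypothetical helper name, and the field names and values are copied from the curl commands above.

```python
import json


def build_cleanup_query(cutoff: str) -> dict:
    """Build one _delete_by_query body covering both deletes shown above."""
    # Documents older than the cutoff (same range clause as the first curl).
    too_old = {"range": {"datetime": {"time_zone": "+00:00", "lt": cutoff}}}
    # winevtx documents whose timestamp_desc matches "Modification"
    # (same bool/must clause as the second curl).
    winevtx_modification = {
        "bool": {
            "must": [
                {"match": {"parser": "winevtx"}},
                {"match": {"timestamp_desc": "Modification"}},
            ]
        }
    }
    # A document is deleted if it matches at least one of the two clauses.
    return {
        "query": {
            "bool": {
                "should": [too_old, winevtx_modification],
                "minimum_should_match": 1,
            }
        }
    }


# JSON string that could be sent as the -d payload of a single request.
body = json.dumps(build_cleanup_query("2021-02-01T00:00:00"))
```

This still requires ingesting everything first, so it is a cleanup shortcut rather than a replacement for filtering at import time.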

Additional context
This issue is tangentially related to Plaso issue #3813

@berggren
Contributor

This is a great idea, and it shouldn't be too difficult to implement. We call psort in the background worker, and we can pass in arguments to the command. We can add a filter argument and let the user (timesketch_importer for example) set that. This should replicate exactly what you do with normal CSV output.
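As a rough illustration of that idea (not the actual Timesketch worker code), an optional filter could simply be appended to the psort argument list. `build_psort_command` is a hypothetical helper; the flags are the ones from the psort.py command in the original report, and the real worker would likely use a different output module and options.

```python
import shlex
from typing import List, Optional


def build_psort_command(
    storage_file: str,
    output_file: str,
    event_filter: Optional[str] = None,
) -> List[str]:
    """Assemble a psort.py argument list, optionally ending with a filter."""
    command = [
        "psort.py",
        "--output-time-zone", "UTC",
        "-o", "dynamic",
        "-w", output_file,
        storage_file,
    ]
    # The filter expression is passed as one trailing argument, exactly as
    # in the command line from the original report.
    if event_filter:
        command.append(event_filter)
    return command


example_filter = "parser != 'winevtx' or timestamp_desc == 'Creation Time'"
command = build_psort_command("dc-triage.plaso", "dc-triage.csv", example_filter)
# shlex.join quotes the filter so the printed line can be pasted into a shell.
print(shlex.join(command))
```

Passing the command as a list (for example to `subprocess.run`) avoids shell quoting problems with the quotes and parentheses that psort filters usually contain.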

@56616c6f72

Just wanted to say it would be awesome if we could implement this! I had actually completely reverted to using psort.py, as timesketch_importer is missing this.

Is this still being worked on?

@jaegeral jaegeral added this to the Future milestone Oct 13, 2021
@jaegeral
Collaborator

> Just wanted to say it would be awesome if we could implement this! I had actually completely reverted to using psort.py, as timesketch_importer is missing this.
>
> Is this still being worked on?

Hey, afaik this is currently not being worked on.

@jleaniz
Collaborator

jleaniz commented Nov 9, 2021

This issue is currently being worked on (ref: #1987)

@WoBuGs

WoBuGs commented Feb 5, 2023

> This issue is currently being worked on (ref: #1987)

Hi all :) Do you know if a maintainer has had the opportunity to take a look at the pull request?

@pemontto
Contributor

pemontto commented Jul 31, 2023

Would love to see this implemented. My current workflow uses psort externally with opensearch_ts so I can add filters - https://timesketch.org/developers/api-upload-data/#import-data-already-ingested-into-opensearch

Unless I'm mistaken, this assumes you already have an existing timeline; in my case, however, the evidence always goes into a new timeline. I generally just add 1 to the previous ID and pass that as --timeline_identifier, but this doesn't always work with concurrent imports. So I may later have to go in and re-index the OpenSearch documents to update the timeline ID.

POST timesketch_index/_update_by_query
{
  "script": {
    "source": "ctx._source.__ts_timeline_id = 4",
    "lang": "painless"
  }
}
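If the timeline ID has to be corrected for more than one import, the request body above can be generated instead of edited by hand. This is a minimal sketch: `build_timeline_id_update` is a hypothetical helper, and it passes the ID through the script's `params` rather than hard-coding it into the Painless source.

```python
import json


def build_timeline_id_update(timeline_id: int) -> dict:
    """Build an _update_by_query body that sets __ts_timeline_id."""
    return {
        "script": {
            "lang": "painless",
            # The value is read from params, so the script source stays the
            # same for every timeline ID.
            "source": "ctx._source.__ts_timeline_id = params.timeline_id",
            "params": {"timeline_id": timeline_id},
        }
    }


# Equivalent in effect to the hand-written body above, which uses ID 4.
body = json.dumps(build_timeline_id_update(4), indent=2)
print(body)
```

Like the original request, this body has no `query` section, so it updates every document in the target index; that is only safe when the index holds a single timeline.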

Unless I'm mistaken there's no easy way to create a timeline ID prior to uploading content?

@berggren berggren removed this from the Future milestone Sep 25, 2023
@Camel0101

This feature would really help reduce the size of timelines. Is this still being worked on?
