Seeing data already ingested into a timeline when the related search index is being updated #3219
Comments
Hi @jbaptperez, I want to make sure I fully understand your workflow and needs. To clarify, it seems like you're:
Is that correct?
Would you be willing to contribute to such a feature?
Hi @jkppr, I am away from work for the rest of the week (and from my project source code that uses Timesketch), but I can already answer most of your questions.
Yes, that's exactly what I want.
Plaso files are sent using the importer client (one chunk, but using chunks would not change anything here). Upstream, we parallelise this and we send those small Plaso files to Timesketch.
In the main case, a single timeline ingests a set of 100 MiB Plaso files once and that's it. In another situation, further sets of Plaso files can complete the first ones (covering different time ranges). A concrete example would be posting twenty 100 MiB Plaso files with the importer client, using 5 POST requests in parallel (the parallelism can be adjusted).
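The bounded parallelism described above can be sketched as follows. This is only an illustration under stated assumptions: `upload_all`, `upload_one` and `fake_upload` are hypothetical names, and `fake_upload` stands in for whatever the importer client actually does for a single Plaso file. Nothing here is taken from the Timesketch code base.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def upload_all(
    paths: Iterable[str],
    upload_one: Callable[[str], str],
    max_parallel: int = 5,
) -> List[str]:
    """Run upload_one on every path with at most max_parallel in flight.

    upload_one is a hypothetical callable wrapping a single upload.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() yields results in the same order as the input paths.
        return list(pool.map(upload_one, paths))


def fake_upload(path: str) -> str:
    # Stand-in for a real upload: it only reports which file it "sent".
    return f"uploaded {path}"


# Twenty files, five uploads at a time, as in the example above.
plaso_files = [f"part_{i:02d}.plaso" for i in range(20)]
results = upload_all(plaso_files, fake_upload, max_parallel=5)
```

Changing `max_parallel` is the only knob needed to adjust the parallelism.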
I'm trying to find a proper workaround but I haven't found one yet. Your idea is interesting, in particular if a single timeline can be "completed" with data coming from others (the user only has to refresh the same page), assuming the others are deleted once merged. Actually, I really need your advice to find a correct solution that respects the philosophy of Timesketch. The 2 solutions I have thought of so far are the one described in the issue and a very ugly quick-and-dirty one: constantly forcing the ready status for a timeline in the database with a scheduled task (an UPDATE SQL statement every n seconds).
First of all, I am no OpenSearch expert but, since it is a database, I would bet we can query one of its indices while it is being updated (getting partial data). I studied the source code and could see that both the frontend and the backend exclude a running search index (i.e. timeline) from an OpenSearch request. Moreover, the frontend part seems harder because a timeline widget behaves differently depending on its status. So the idea of "unlocking" timelines would mean rethinking the way a timeline appears and can be manipulated in Timesketch.
This is actually a big change, and I'm afraid I'll face technical / philosophical limitations; that's why I need clarification for such a feature.
Yes, definitely. I'll work on that feature starting next week and for the whole month (November). To sum up, my target is "just" to be able to see timeline data while it is being imported. Thank you for your help.
Thanks for the additional context, @jbaptperez. Also summoning @berggren here for his expertise. I've considered your feature request and how it can be implemented within Timesketch's design philosophy. These suggestions assume you're using the latest Timesketch release with the default (frontend-ng) UI.
Clarifications
Any solution would require:
Allow Querying Processing Timelines/Indices:
Alternative to be considered: However, this requires a significant backend change. Currently, events in OpenSearch are linked to timelines via the
What do you think? Would this be a solution you are willing to tackle?
After some internal discussion with @berggren, we recommend proceeding with the proposed "Allow Querying Processing Timelines/Indices" solution and ignoring the alternative for now, since it would need too many changes to our existing backend logic.
Is your feature request related to a problem? Please describe.
I cannot see data already ingested into a timeline while the related search index is being updated with new data (a new Plaso file is being imported).
An automated process sends a set of Plaso files successively to the same timeline/searchindex, and I can only see the final result, once everything has finished.
I need access to the data ingested so far, as soon as possible, even if it is incomplete.
Describe the solution you'd like
The frontend and the backend should not exclude the search indices whose status is "running".
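A minimal sketch of the requested change, assuming the exclusion boils down to a status check before a query is built. The `Timeline` class, the `queryable` helper and the status strings are illustrative names based on the wording of this issue, not the actual Timesketch models or constants.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Status names follow the wording of this issue; they are assumptions,
# not constants copied from the Timesketch code base.
READY_ONLY: Set[str] = {"ready"}
READY_OR_RUNNING: Set[str] = {"ready", "running"}


@dataclass
class Timeline:
    """Hypothetical stand-in for a timeline and its status."""
    name: str
    status: str


def queryable(timelines: Iterable[Timeline], allowed: Set[str]) -> List[Timeline]:
    """Keep only the timelines whose status is in the allowed set."""
    return [t for t in timelines if t.status in allowed]


timelines = [
    Timeline("disk-image", "ready"),
    Timeline("live-import", "running"),
    Timeline("broken-upload", "fail"),
]

# Behavior described in this issue: the running timeline is left out.
current = [t.name for t in queryable(timelines, READY_ONLY)]
# Requested behavior: the running timeline can be searched as well.
requested = [t.name for t in queryable(timelines, READY_OR_RUNNING)]
```

Keeping the allowed statuses in one set would let the frontend and the backend agree on a single definition of "queryable".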
Describe alternatives you've considered
Waiting for the complete set of Plaso files to be ingested.
Additional context
The whole timeline status depends on the status of its data sources (see the timesketch.lib.tasks._set_datasource_status method).
Both the frontend and the backend filter out timelines whose status is not ready or fail.
Moreover, assuming the modification above is done, it would require another frontend change for consistency:
Allowing a timeline to be included in or excluded from a query, that is, changing the timeline widget to behave as if the timeline were ready (no spinning wheel).
I imagine identical behavior (full access to the menu), with the spinning wheel shown only while the timeline is being updated.
The regular polling mechanism (which currently runs only while the status is running) would be extended to all timeline states, for real-time adjustment. Otherwise, a manual refresh is necessary, as it is now.
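The polling idea above could look roughly like this: the status is fetched on every cycle regardless of the current state, so a timeline that goes from ready back to running is noticed without a manual refresh. `poll_status` and `fetch_status` are hypothetical names; the real frontend is not written in Python, so this only illustrates the logic.

```python
import time
from typing import Callable, List


def poll_status(
    fetch_status: Callable[[], str],
    interval_s: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll in every state and return each status change that was observed."""
    changes: List[str] = []
    for i in range(max_polls):
        status = fetch_status()
        # Record the status only when it differs from the last one seen.
        if not changes or changes[-1] != status:
            changes.append(status)
        if i < max_polls - 1:
            sleep(interval_s)
    return changes


# Simulated status sequence: a new Plaso file arrives after "ready".
statuses = iter(["running", "running", "ready", "running", "ready"])
changes = poll_status(lambda: next(statuses), 0.0, 5, sleep=lambda _s: None)
```

A poller that stopped at the first "ready" would miss the second import in this sequence.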