Reasonably big JSON files within JSONL can't be uploaded via web uploader #3243

Open
xSAVIKx opened this issue Dec 6, 2024 · 2 comments

xSAVIKx commented Dec 6, 2024

Describe the bug
Uploading a JSONL file that contains fairly large JSON objects fails during the header check with a confusing "Unterminated string in JSON at position" error message.

The cause is that only the first 10000 bytes of the file are used to check the headers. With larger JSON objects, that cutoff still falls somewhere in the middle of the first object. I believe this is the relevant code:

reader.readAsText(file.slice(0, 10000))
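
To illustrate the failure mode, here is a minimal sketch of what happens when only a fixed-size prefix is parsed. This is not the actual Timesketch code: `parseFirstRecordFromPrefix` and `limit` are hypothetical names, and the string slice counts UTF-16 code units rather than bytes, which is close enough for ASCII-only input.

```javascript
// Hypothetical reproduction, not the Timesketch implementation:
// keep only the first `limit` characters of the file content and try to
// parse the first line of that prefix as JSON.
function parseFirstRecordFromPrefix(text, limit) {
  const prefix = text.slice(0, limit);      // mimics file.slice(0, limit)
  const firstLine = prefix.split('\n')[0];  // may be cut off mid-object
  return JSON.parse(firstLine);             // throws SyntaxError if truncated
}

// A first record that fits inside the prefix parses fine.
console.log(parseFirstRecordFromPrefix('{"a":1}\n{"b":2}\n', 10000));

// A first record longer than the prefix is cut inside a string value,
// so JSON.parse rejects it.
const big = JSON.stringify({ message: 'x'.repeat(20000) }) + '\n{"b":2}\n';
try {
  parseFirstRecordFromPrefix(big, 10000);
} catch (err) {
  console.log(err instanceof SyntaxError); // true
}
```

The exact error text depends on the JavaScript engine version, so the sketch only checks the error type.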

I've attached a redacted copy of a real file I wanted to upload. It's something from AWS WAF logs, but I guess other similar systems may produce objects of comparable size nowadays.

To Reproduce
Steps to reproduce the behavior:

  1. Go to timelines upload form.
  2. Select the attached file.
  3. See error

Expected behavior
The JSONL file is processed and ingested.

Screenshots
Image

Desktop (please complete the following information):

  • OS: Ubuntu
  • Browser: Chrome
  • Version: 131.0.6778.108

Additional information
Here's the example file: timesketch-long-json.jsonl.txt
(I had to change the extension to .txt as GitHub does not allow uploading jsonl).
It contains only 2 JSON objects. If you drop the second one, the upload actually succeeds.

Timesketch version: 20241129

xSAVIKx added the Bug label Dec 6, 2024

jkppr (Collaborator) commented Dec 12, 2024

Thanks for raising this issue and providing a sample file as well.

I can reproduce the issue as explained above using the sample file.

Temporary workaround: Using the Python importer client should circumvent the issue.

@xSAVIKx I think you are on the right track with the header limit. Would you like to contribute a fix PR?
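
One possible direction for such a fix, sketched under assumptions rather than taken from the Timesketch code base: instead of parsing a fixed-size prefix, keep reading chunks until the first newline shows up and only then parse the complete first line. `createFirstLineReader`, `readFirstJsonlRecord`, and `chunkSize` are hypothetical names. `Blob.slice`, `Blob.arrayBuffer`, and `TextDecoder` are standard web APIs.

```javascript
// Accumulates decoded chunks until the first newline is seen.
// Hypothetical helper, not part of Timesketch.
function createFirstLineReader() {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  return {
    // Feed the next chunk of bytes. Returns the parsed first record once a
    // newline has been seen, otherwise undefined.
    push(bytes) {
      // stream: true keeps multi-byte characters split across chunks intact
      text += decoder.decode(bytes, { stream: true });
      const newline = text.indexOf('\n');
      return newline === -1 ? undefined : JSON.parse(text.slice(0, newline));
    },
    // Call when the input ended without a newline (single-record file).
    finish() {
      text += decoder.decode(); // flush any buffered bytes
      return JSON.parse(text);
    },
  };
}

// Browser-side wrapper: reads a File/Blob chunk by chunk and stops as soon
// as the first record is complete.
async function readFirstJsonlRecord(blob, chunkSize = 10000) {
  const reader = createFirstLineReader();
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const buffer = await blob.slice(offset, offset + chunkSize).arrayBuffer();
    const record = reader.push(new Uint8Array(buffer));
    if (record !== undefined) return record;
  }
  return reader.finish();
}
```

With this shape, small files still need only one chunk, while a large first object just costs a few extra reads. A real fix would probably also want an upper bound on how much is read before giving up.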

xSAVIKx (Author) commented Dec 12, 2024

I may try to find some time during the weekend to take a look 👌 but if anyone else is willing to fix this, no problem at all 😄
