This repository contains the source code for our paper "Automatic log parser to support forensic analysis", published in the 16th Australian Digital Forensics Conference, pp. 1-10, 2018. We named the tool nerlogparser
because it uses the named entity recognition (NER) technique to parse log files. This repository is a fork of sequence_tagging by Guillaume Genthial.
- Python 3.5
- TensorFlow 1.4.1
- nltk 3.4
- Create a new directory for nerlogparser in your home directory:
    mkdir $HOME/nerlogparser
- Create a virtual environment in the newly created directory with a specific Python version (3.5):
    virtualenv $HOME/nerlogparser -p /usr/bin/python3.5
- Activate the virtual environment:
    source $HOME/nerlogparser/bin/activate
- Install nerlogparser:
    pip install nerlogparser
- Make sure you are still in the virtual environment.
- For example, run nerlogparser to parse the authentication log file /var/log/auth.log and print the output to the screen:
    nerlogparser -i /var/log/auth.log
- We can save the parsing results in an output file such as parsed-auth.json. At the moment, the only supported output file format is JSON.
    nerlogparser -i /var/log/auth.log -o parsed-auth.json
- Show the nerlogparser help:
    nerlogparser -h
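A file saved with the -o option can be read back with Python's standard json module. The following is a minimal sketch, not part of nerlogparser: the helper names load_parsed_logs and iter_entries are hypothetical, and the layout of the output file is an assumption, so the code accepts either a top-level JSON object or a top-level JSON array.

```python
import json


def load_parsed_logs(path):
    """Load a JSON file such as one written by `nerlogparser -o` (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def iter_entries(data):
    """Yield (line_id, parsed) pairs from a JSON object or a JSON array.

    The exact layout of the output file is an assumption here, so both
    top-level shapes are accepted.
    """
    if isinstance(data, dict):
        # JSON object: keys act as line ids.
        yield from data.items()
    elif isinstance(data, list):
        # JSON array: use the position as the line id.
        yield from enumerate(data)
    else:
        raise TypeError("unexpected top-level JSON type: " + type(data).__name__)
```

For instance, iterating over iter_entries(load_parsed_logs('parsed-auth.json')) would visit each parsed line in turn, assuming parsed-auth.json was created as in the step above.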
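nerlogparser can also be used from a Python script:

    import pprint
    from nerlogparser.nerlogparser import Nerlogparser

    # Create the parser and parse an authentication log file.
    parser = Nerlogparser()
    parsed_logs = parser.parse_logs('/var/log/auth.log')

    # Print the parsed result for each log line.
    for line_id, parsed in parsed_logs.items():
        print('Line:', line_id)
        pprint.pprint(parsed)
        print()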
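When several log files need to be parsed, one parser instance can be reused for all of them. The sketch below is an illustration, not part of nerlogparser: parse_files is a hypothetical helper, and it only assumes that the object passed in has the parse_logs(path) method shown in the example above.

```python
def parse_files(parser, paths):
    """Parse every path with one shared parser instance (hypothetical helper).

    `parser` is any object with a `parse_logs(path)` method, such as a
    `Nerlogparser()` instance. Returns a dict mapping each path to the
    value returned by `parse_logs` for that path.
    """
    results = {}
    for path in paths:
        results[path] = parser.parse_logs(path)
    return results
```

With the real tool, this could be called as parse_files(Nerlogparser(), ['/var/log/auth.log']), where any additional paths in the list are up to the user.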
Apache License 2.0. Please check LICENSE.