Comments

Kevin Schiroo edited this page Dec 24, 2015 · 1 revision

Linear merging

Currently implemented
In this algorithm, text blocks are merged linearly, in document order, until a signature is found; the blocks accumulated up to and including the signed block form one comment. The indentation of the resulting comment is the minimum indentation level of all of its component text blocks.

Example 1

~~~~ is used to denote a signature.

Input Text

Text Block 1
Text Block 2 ~~~~
  Text Block 3
    Text Block 4
  Text Block 5 ~~~~
Text Block 6 ~~~~

Extracted Comments

Comment 1

Text Block 1
Text Block 2 ~~~~

Comment 2

Text Block 3
  Text Block 4
Text Block 5 ~~~~

Comment 3

Text Block 6 ~~~~
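
The linear merging described above can be sketched roughly as follows. This is only an illustration, not the project's actual implementation: the names `TextBlock`, `Comment`, `parse_blocks`, and `linear_merge` are hypothetical, each non-empty line is treated as one text block, indentation is counted in leading spaces, and `~~~~` stands in for a real signature as in Example 1. What happens to trailing blocks with no signature is not stated on this page, so the sketch simply drops them.

```python
from dataclasses import dataclass
from typing import List

SIGNATURE = "~~~~"  # placeholder signature marker, as used in Example 1


@dataclass
class TextBlock:  # hypothetical type: one line of text plus its indentation
    indent: int
    text: str


@dataclass
class Comment:  # hypothetical type: merged blocks plus the comment's indentation
    indent: int
    blocks: List[TextBlock]


def parse_blocks(raw: str) -> List[TextBlock]:
    """Assumption: every non-empty line is a single text block."""
    blocks = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        blocks.append(TextBlock(indent, line.strip()))
    return blocks


def linear_merge(blocks: List[TextBlock]) -> List[Comment]:
    """Accumulate blocks in order; close a comment at each signed block."""
    comments = []
    pending = []
    for block in blocks:
        pending.append(block)
        if SIGNATURE in block.text:
            # The comment's indentation is the minimum over its blocks.
            comments.append(Comment(min(b.indent for b in pending), pending))
            pending = []
    # Assumption: unsigned trailing blocks are discarded.
    return comments
```

Run on the input text of Example 1, this sketch yields three comments with indentation levels 0, 2, and 0, matching the extracted comments shown there.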

Example 2

Input Text

Extracted Comments

Level merging

Implementation in progress
We view the text blocks as forming a tree, with each block's position in the tree determined by its indentation level. We first try to find the owner of a text block by searching its siblings for a signature.
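
Since this algorithm is still in progress, the following is only a rough sketch of the two steps described so far: building the indentation tree and searching a block's siblings for a signature. All names (`Node`, `build_tree`, `find_signed_sibling`) are hypothetical. The sketch assumes the search covers the block itself and the siblings that follow it, and it does not guess what happens when no sibling carries a signature, because the page does not say.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SIGNATURE = "~~~~"  # placeholder signature marker


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class Node:  # hypothetical tree node for one text block
    indent: int
    text: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)


def build_tree(raw: str) -> Node:
    """Place each block under the nearest preceding block with smaller indentation."""
    root = Node(indent=-1, text="")
    stack = [root]
    for line in raw.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        # Pop back to the closest ancestor that is strictly less indented.
        while stack[-1].indent >= indent:
            stack.pop()
        node = Node(indent, line.strip(), parent=stack[-1])
        stack[-1].children.append(node)
        stack.append(node)
    return root


def find_signed_sibling(node: Node) -> Optional[Node]:
    """Assumption: look at the block itself, then at its later siblings."""
    if node.parent is None:
        return None
    siblings = node.parent.children
    start = next(i for i, s in enumerate(siblings) if s is node)
    for sibling in siblings[start:]:
        if SIGNATURE in sibling.text:
            return sibling
    return None  # no signed sibling; the fallback is not specified on this page
```

On the input text of Example 1, Text Block 3 would be matched to Text Block 5, while Text Block 4 has no signed sibling and would need whatever fallback step the finished algorithm defines.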