Unexpected Word Count #82
Comments
Not sure what the problem is, but I figured it shouldn't take too long to diagnose. Let me know what you think a good time estimate is.
Probably because we keep only a single newline and remove the extra ones. As a result, everything below the last list element is counted as part of that element. Keeping all newlines should fix it, but make sure it doesn't break anything else. The overall score here should still be correct, since those words are counted under li/ol instead of p; p is simply short of them in its count.
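To make the newline point concrete, here is a minimal sketch. It is a toy model, not the project's purge module or a real Markdown parser: `collapseNewlines` and `splitBlocks` are hypothetical helpers that only imitate the rule that a blank line ends a list.

```typescript
// Hypothetical stand-in for the clean-up step: collapse any run of
// newlines into a single newline.
function collapseNewlines(raw: string): string {
  return raw.replace(/\n{2,}/g, "\n");
}

type Block = { kind: "list" | "paragraph"; text: string };

// Toy block splitter: blank lines separate blocks, and a block is a list
// when its first line starts with a list marker. Real parsers do far more;
// this only models "a blank line ends the list".
function splitBlocks(markdown: string): Block[] {
  return markdown
    .split(/\n{2,}/)
    .map((text) => text.trim())
    .filter((text) => text.length > 0)
    .map((text): Block => ({
      kind: /^(\d+\.|[-*+])\s/.test(text) ? "list" : "paragraph",
      text,
    }));
}

const rawComment = "1. first item\n2. second item\n\nA closing paragraph.";

// With the blank line intact, the paragraph is its own block.
const intact = splitBlocks(rawComment);

// After collapsing, the paragraph is glued onto the list block, so its
// words would be attributed to the list.
const purged = splitBlocks(collapseNewlines(rawComment));
```

In this model `intact` holds a list block and a paragraph block, while `purged` holds a single list block that also contains the closing paragraph.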
There is also the case of nested elements, which I think introduces a count error. I'm not sure I understand what you mean, @EresDev. Why are the newlines relevant, since we count inside each tag and newlines do not count as words?
It doesn't appear to be a counting problem. It is a problem in the conversion to HTML elements, and I think it comes from newlines that go missing when we clean up a raw comment by removing unnecessary content in the purge module. Newline characters in the raw comment are the delimiter used by …
So far it is not a problem when nested elements depend on each other, like ol/li. You just need to make sure you score one of them as zero. That also happens automatically, because any element missing from the config gets a score of 0.
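The "missing from the config means 0" behaviour can be sketched as a simple lookup with a fallback. The config shape, tag names, and values below are illustrative assumptions, not the project's actual configuration format.

```typescript
// Hypothetical per-tag score table (illustrative values only).
type FormattingConfig = Readonly<Record<string, number | undefined>>;

const exampleConfig: FormattingConfig = { p: 1, code: 1, li: 1 };

// Any tag that is absent from the config falls back to a score of 0,
// so leaving "ol" out is enough to avoid scoring both ol and li.
function scoreFor(tag: string, config: FormattingConfig): number {
  return config[tag] ?? 0;
}

const liScore = scoreFor("li", exampleConfig); // present in the config
const olScore = scoreFor("ol", exampleConfig); // absent, so it scores 0
```

With this fallback, only one of the two nested tags contributes to the score without any special-casing.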
This also seems to be the case with `ul` and `li`…
@gentlementlegen the deadline is at Fri, Sep 13, 7:58 AM UTC
I don't understand why there are these mistakes in the implementation when the code is ready from v1. My best guess is that it's related to draft pull requests.
Updates on this task: the first counting issue is that GitHub uses … The second problem, for which I can't find a workaround so far, concerns the content. For example, with the following structure:
<li>
<p>text</p>
</li>
when JSDOM gives …
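One way a nested structure like the one above might inflate counts is if every element is counted from its full text content, so the same word is attributed to both the outer and the inner tag. The sketch below uses a small hand-built tree instead of jsdom; the node types and both counting functions are assumptions made for illustration.

```typescript
type TextNode = { type: "text"; value: string };
type ElementNode = { type: "element"; tag: string; children: TreeNode[] };
type TreeNode = TextNode | ElementNode;

function countWords(text: string): number {
  return (text.match(/\b\w+\b/g) ?? []).length;
}

// Concatenated text of a node and all of its descendants.
function fullText(node: TreeNode): string {
  return node.type === "text" ? node.value : node.children.map(fullText).join("");
}

// Counts every element from its full text: nested words are counted once
// per enclosing element.
function countByFullText(node: TreeNode, out: Record<string, number> = {}): Record<string, number> {
  if (node.type === "element") {
    out[node.tag] = (out[node.tag] ?? 0) + countWords(fullText(node));
    node.children.forEach((child) => countByFullText(child, out));
  }
  return out;
}

// Counts only the text that is a direct child of each element.
function countByOwnText(node: TreeNode, out: Record<string, number> = {}): Record<string, number> {
  if (node.type === "element") {
    const own = node.children
      .map((child) => (child.type === "text" ? child.value : ""))
      .join(" ");
    out[node.tag] = (out[node.tag] ?? 0) + countWords(own);
    node.children.forEach((child) => countByOwnText(child, out));
  }
  return out;
}

// The <li><p>text</p></li> structure from the comment above.
const listItem: ElementNode = {
  type: "element",
  tag: "li",
  children: [{ type: "element", tag: "p", children: [{ type: "text", value: "text" }] }],
};

const doubled = countByFullText(listItem); // { li: 1, p: 1 }
const ownOnly = countByOwnText(listItem); // { li: 0, p: 1 }
```

Counting by own text attributes the single word to `p` only, whereas counting by full text attributes it to both `li` and `p`.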
The first implementation just counted the total number of words in the entire comment in one go. You don't need to associate a word count with each element, as I mentioned before. The emphasis is on tag count, not word count. For some elements, like …
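A rough sketch of that approach: count words once over the whole comment, and count tags separately. The `\b\w+\b` pattern is the one shown in the reward tables; the regex-based tag handling is a simplification for illustration (a real implementation would more likely use a DOM parser), and all function names are hypothetical.

```typescript
// Word pattern as it appears in the reward tables.
function countWords(text: string): number {
  return (text.match(/\b\w+\b/g) ?? []).length;
}

// Simplified tag stripping; good enough for this well-formed example only.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, " ");
}

// Counts opening tags by name; closing tags such as </li> are not matched.
function countTags(html: string): Record<string, number> {
  const counts: Record<string, number> = {};
  const openingTag = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let match: RegExpExecArray | null;
  while ((match = openingTag.exec(html)) !== null) {
    const tag = (match[1] ?? "").toLowerCase();
    counts[tag] = (counts[tag] ?? 0) + 1;
  }
  return counts;
}

const html =
  "<ol><li>first item</li><li>second item</li></ol><p>Some closing words.</p>";

const totalWords = countWords(stripTags(html)); // 7
const tagCounts = countTags(html); // { ol: 1, li: 2, p: 1 }
```

Because the words are counted once for the whole comment, nesting cannot cause a word to be counted twice; the tags only contribute their own counts.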
|
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Task | 1 | 100 |
Issue | Comment | 5 | 0 |
Review | Comment | 5 | 0 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
There is also the case of nested elements that I think introduce… | 0content: p: symbols: \b\w+\b: count: 63 multiplier: 0 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0 score: 1 multiplier: 0 | 0.8 | - |
This also seems to be the case with `ul` and `li`… | 0content: p: symbols: \b\w+\b: count: 29 multiplier: 0 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0 score: 1 multiplier: 0 | 0.6 | - |
I think there is an issue with the unassign. It doesn't seem to … | 0content: p: symbols: \b\w+\b: count: 34 multiplier: 0 score: 1 multiplier: 0 | 0.2 | - |
Updates on this task: The first issue about counting issue is t… | 0content: p: symbols: \b\w+\b: count: 112 multiplier: 0 score: 1 code: symbols: \b\w+\b: count: 15 multiplier: 0 score: 1 pre: symbols: \b\w+\b: count: 1 multiplier: 0 score: 0 a: symbols: \b\w+\b: count: 6 multiplier: 0 score: 1 multiplier: 0 | 1 | - |
@0x4007 Yep got it, I think this would be part of https://github… | 0content: p: symbols: \b\w+\b: count: 19 multiplier: 0 score: 1 multiplier: 0 | 0.2 | - |
Resolves #82 QA: https://github.com/ubiquibot/conversation-rewa… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0 score: 1 h2: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 ul: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 li: symbols: \b\w+\b: count: 3 multiplier: 0 score: 1 multiplier: 0 | 0.9 | - |
Body comes in a text form. Then it gets transformed into MD ->… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Correct, will be part of https://github.com/ubiquibot/conversati… | 0content: p: symbols: \b\w+\b: count: 13 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Will be changed in https://github.com/ubiquibot/conversation-rew… | 0content: p: symbols: \b\w+\b: count: 12 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I think it is nice to get `info` within the logs by defa… | 0content: p: symbols: \b\w+\b: count: 21 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
[ 23.511 WXDAI ]
@0x4007
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Specification | 1 | 7.8 |
Issue | Comment | 3 | 8.451 |
Review | Comment | 5 | 7.26 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
122 `ol` and `li` counted in this comment, which… | 7.8content: p: symbols: \b\w+\b: count: 28 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 5 em: symbols: \b\w+\b: count: 17 multiplier: 0.1 score: 0 multiplier: 3 | 1 | 7.8 |
Not sure what the problem is but I figured it shouldn't take too… | 3.4content: p: symbols: \b\w+\b: count: 28 multiplier: 0.2 score: 1 multiplier: 1 | 0.2 | 0.68 |
I don't understand why there are these mistakes in the implement… | 3.5content: p: symbols: \b\w+\b: count: 29 multiplier: 0.2 score: 1 multiplier: 1 | 0.5 | 1.75 |
The first implementation just counted the total amount of words … | 6.69content: p: symbols: \b\w+\b: count: 60 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 1 | 0.9 | 6.021 |
Shouldn't the default be errors? | 0.46content: p: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.46 |
If we use a virtual DOM creator like `jsdom` or `mda… | 3.29content: p: symbols: \b\w+\b: count: 57 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 3.29 |
textContent of the top level parent element will do the right th… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
I suppose that for the statistics it might be interesting to cou… | 1.85content: p: symbols: \b\w+\b: count: 31 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.85 |
HTML comments shouldn't be included in element.textContent is my… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
[ 2.259 WXDAI ]
@EresDev
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Comment | 2 | 2.259 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
Probably because we are keeping only single new line and removin… | 0.9content: p: symbols: \b\w+\b: count: 68 multiplier: 0.1 score: 1 multiplier: 0.25 | 0.8 | 0.72 |
It doesn't appear to be a counting problem. It is a problem in c… | 1.71content: p: symbols: \b\w+\b: count: 138 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 0.25 | 0.9 | 1.539 |
|
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Task | 1 | 100 |
Issue | Comment | 5 | 11.594 |
Review | Comment | 6 | 0 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
There is also the case of nested elements that I think introduce… | 3.56content: p: symbols: \b\w+\b: count: 63 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 0.8 | 2.848 |
This also seems to be the case with `ul` and `li`… | 2content: p: symbols: \b\w+\b: count: 29 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 1 | 0.6 | 1.2 |
I think there is an issue with the unassign. It doesn't seem to … | 2content: p: symbols: \b\w+\b: count: 34 multiplier: 0.1 score: 1 multiplier: 1 | 0.1 | 0.2 |
Updates on this task: The first issue about counting issue is t… | 6.98content: p: symbols: \b\w+\b: count: 112 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 15 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 0 a: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 6.98 |
@0x4007 Yep got it, I think this would be part of https://github… | 1.22content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 multiplier: 1 | 0.3 | 0.366 |
Resolves #82 QA: https://github.com/ubiquibot/conversation-rewa… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0 score: 1 h2: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 ul: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 li: symbols: \b\w+\b: count: 3 multiplier: 0 score: 1 multiplier: 0 | 0.7 | - |
Body comes in a text form. Then it gets transformed into MD ->… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Correct, will be part of https://github.com/ubiquibot/conversati… | 0content: p: symbols: \b\w+\b: count: 13 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Will be changed in https://github.com/ubiquibot/conversation-rew… | 0content: p: symbols: \b\w+\b: count: 12 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I think it is nice to get `info` within the logs by defa… | 0content: p: symbols: \b\w+\b: count: 21 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Yes, but there is no way to know it is a comment before converti… | 0content: p: symbols: \b\w+\b: count: 57 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
[ 23.182 WXDAI ]
@0x4007
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Specification | 1 | 7.8 |
Issue | Comment | 3 | 8.122 |
Review | Comment | 5 | 7.26 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
122 `ol` and `li` counted in this comment, which… | 7.8content: p: symbols: \b\w+\b: count: 28 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 5 em: symbols: \b\w+\b: count: 17 multiplier: 0.1 score: 0 multiplier: 3 | 1 | 7.8 |
Not sure what the problem is but I figured it shouldn't take too… | 3.4content: p: symbols: \b\w+\b: count: 28 multiplier: 0.2 score: 1 multiplier: 1 | 0.3 | 1.02 |
I don't understand why there are these mistakes in the implement… | 3.5content: p: symbols: \b\w+\b: count: 29 multiplier: 0.2 score: 1 multiplier: 1 | 0.5 | 1.75 |
The first implementation just counted the total amount of words … | 6.69content: p: symbols: \b\w+\b: count: 60 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 1 | 0.8 | 5.352 |
Shouldn't the default be errors? | 0.46content: p: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.46 |
If we use a virtual DOM creator like `jsdom` or `mda… | 3.29content: p: symbols: \b\w+\b: count: 57 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 3.29 |
textContent of the top level parent element will do the right th… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
I suppose that for the statistics it might be interesting to cou… | 1.85content: p: symbols: \b\w+\b: count: 31 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.85 |
HTML comments shouldn't be included in element.textContent is my… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
[ 2.259 WXDAI ]
@EresDev
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Comment | 2 | 2.259 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
Probably because we are keeping only single new line and removin… | 0.9content: p: symbols: \b\w+\b: count: 68 multiplier: 0.1 score: 1 multiplier: 0.25 | 0.8 | 0.72 |
It doesn't appear to be a counting problem. It is a problem in c… | 1.71content: p: symbols: \b\w+\b: count: 138 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 0.25 | 0.9 | 1.539 |
|
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Task | 1 | 100 |
Issue | Comment | 5 | 11.672 |
Review | Comment | 6 | 0 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
There is also the case of nested elements that I think introduce… | 3.56content: p: symbols: \b\w+\b: count: 63 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 0.8 | 2.848 |
This also seems to be the case with `ul` and `li`… | 2content: p: symbols: \b\w+\b: count: 29 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 1 | 0.7 | 1.4 |
I think there is an issue with the unassign. It doesn't seem to … | 2content: p: symbols: \b\w+\b: count: 34 multiplier: 0.1 score: 1 multiplier: 1 | 0.1 | 0.2 |
Updates on this task: The first issue about counting issue is t… | 6.98content: p: symbols: \b\w+\b: count: 112 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 15 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 0 a: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 6.98 |
@0x4007 Yep got it, I think this would be part of https://github… | 1.22content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 multiplier: 1 | 0.2 | 0.244 |
Resolves #82 QA: https://github.com/ubiquibot/conversation-rewa… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0 score: 1 h2: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 ul: symbols: \b\w+\b: count: 1 multiplier: 0 score: 1 li: symbols: \b\w+\b: count: 3 multiplier: 0 score: 1 multiplier: 0 | 0.9 | - |
Body comes in a text form. Then it gets transformed into MD ->… | 0content: p: symbols: \b\w+\b: count: 45 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Correct, will be part of https://github.com/ubiquibot/conversati… | 0content: p: symbols: \b\w+\b: count: 13 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Will be changed in https://github.com/ubiquibot/conversation-rew… | 0content: p: symbols: \b\w+\b: count: 12 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I think it is nice to get `info` within the logs by defa… | 0content: p: symbols: \b\w+\b: count: 21 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Yes, but there is no way to know it is a comment before converti… | 0content: p: symbols: \b\w+\b: count: 57 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
[ 23.182 WXDAI ]
@0x4007
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Specification | 1 | 7.8 |
Issue | Comment | 3 | 8.122 |
Review | Comment | 5 | 7.26 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
122 `ol` and `li` counted in this comment, which… | 7.8content: p: symbols: \b\w+\b: count: 28 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 5 em: symbols: \b\w+\b: count: 17 multiplier: 0.1 score: 0 multiplier: 3 | 1 | 7.8 |
Not sure what the problem is but I figured it shouldn't take too… | 3.4content: p: symbols: \b\w+\b: count: 28 multiplier: 0.2 score: 1 multiplier: 1 | 0.3 | 1.02 |
I don't understand why there are these mistakes in the implement… | 3.5content: p: symbols: \b\w+\b: count: 29 multiplier: 0.2 score: 1 multiplier: 1 | 0.5 | 1.75 |
The first implementation just counted the total amount of words … | 6.69content: p: symbols: \b\w+\b: count: 60 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 1 | 0.8 | 5.352 |
Shouldn't the default be errors? | 0.46content: p: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.46 |
If we use a virtual DOM creator like `jsdom` or `mda… | 3.29content: p: symbols: \b\w+\b: count: 57 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 3.29 |
textContent of the top level parent element will do the right th… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
I suppose that for the statistics it might be interesting to cou… | 1.85content: p: symbols: \b\w+\b: count: 31 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.85 |
HTML comments shouldn't be included in element.textContent is my… | 0.83content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.83 |
[ 2.52 WXDAI ]
@EresDev
Contributions Overview
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Comment | 2 | 2.52 |
Conversation Incentives
Comment | Formatting | Relevance | Reward |
---|---|---|---|
Probably because we are keeping only single new line and removin… | 0.9content: p: symbols: \b\w+\b: count: 68 multiplier: 0.1 score: 1 multiplier: 0.25 | 0.9 | 0.81 |
It doesn't appear to be a counting problem. It is a problem in c… | 1.71content: p: symbols: \b\w+\b: count: 138 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 1.71 |
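The figures in the tables above are consistent with each row's reward being its formatting score multiplied by its relevance, and the bracketed total being the sum of the rows. That relation is inferred from the numbers, not stated in the thread, so treat the following as a sanity check under that assumption.

```typescript
type RewardRow = { formatting: number; relevance: number };

// Round to three decimals, matching the precision shown in the tables.
const round3 = (value: number): number => Math.round(value * 1000) / 1000;

// Assumed relation: reward = formatting score x relevance.
function rowReward(row: RewardRow): number {
  return round3(row.formatting * row.relevance);
}

function totalReward(rows: RewardRow[]): number {
  return round3(rows.reduce((sum, row) => sum + rowReward(row), 0));
}

// The two @EresDev rows from the last table above.
const eresDevRows: RewardRow[] = [
  { formatting: 0.9, relevance: 0.9 },
  { formatting: 1.71, relevance: 1 },
];

const eresDevTotal = totalReward(eresDevRows); // 2.52
```

The per-row results are 0.81 and 1.71, which sum to the 2.52 shown for @EresDev in that table.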
122 `ol` and `li` counted in this comment, which is not correct.
Originally posted by @UbiquityOS[bot] in ubiquity/pay.ubq.fi#259 (comment)