Move test fixtures into TSV plain text files #103

Open
ceefour opened this issue Jul 9, 2014 · 13 comments

@ceefour
Contributor

ceefour commented Jul 9, 2014

TestRelEx contains sentences and expected results inside the Java tests, which are then iterated.

It'd be more convenient to put these fixtures, a la FitNesse, into a spreadsheet with 3 sheets (Comparatives, Extraposition, Conjunction) that can be edited very conveniently in LibreOffice Calc. That would make editing hundreds of tests not too painful. :)

The fixtures are then loaded into JUnit tests using odftoolkit.

Depends on #98.

If accepted you can assign to me.

@linas
Member

linas commented Jul 9, 2014

If you wish to create patches that do this, that's OK. I don't think they should be used for unit testing, for multiple reasons:

  1. People already find relex too difficult to configure and install. Adding yet another dependency would make that aspect worse.

  2. Spreadsheets and large, complex systems like LibreOffice are .. I dunno .. hard to use, hard to understand, ... I'm not sure what they do. They have something to do with business intelligence ... large corporations use them. Relex doesn't have any business people using it, and I don't see what the point of integration with business systems would be. Will business people start using relex? Why?

@linas closed this as completed Jul 9, 2014
@ceefour
Contributor Author

ceefour commented Jul 9, 2014

  1. I've configured, installed from source, and tested RelEx. I know why it's difficult to build: many of the dependencies are managed manually. I did my job to Mavenize RelEx. It's sooo much easier now to build it.
  2. LibreOffice is not needed to use, build, or test RelEx. It's only used for editing the fixtures. I'll send you a spreadsheet or screenshot so you can see what it looks like.

@linas
Member

linas commented Jul 9, 2014

If you wish to provide patches, that's OK; it's unlikely I would turn them down, as long as they don't create additional dependencies for the user.

Myself, I have no plans to install or learn how to use LibreOffice or to try to remember how to use spreadsheets .. again, I think spreadsheets are way beyond the level of complexity that most people would know how to use, and I just don't think business people are going to be flocking to Relex because of them. These are very different worlds.

@ceefour
Contributor Author

ceefour commented Jul 9, 2014

This is what it would look like (it's from another project of mine):

[image: yago-rules ods - libreoffice calc_397]

Basically it's just a table of text, no fancy "spreadsheet features" if that's your concern.

It's easier and faster to change than editing test cases in code (i.e., it separates the test code from the test data):

[image: selection_398]

Note that when test data is put inside Java code, it takes up screen space, and to edit it one needs to escape "\n" and use string operators.

@linas
Member

linas commented Jul 9, 2014

I can't even begin to imagine how it can possibly be easier to edit a spreadsheet than to edit the code. That's like saying juggling three balls while standing on your head is "easier" than walking. That's just .. crazy.

Look, I'm spending a lot of time talking to you but you keep offering these wild ideas and I just don't see how they are useful in any way. Spreadsheets are just big complicated tables and are pretty much useless. There's no value-add here.

For software to be useful, it has to actually do something. If it doesn't actually do anything, then it's a game, and I'm just not interested in games; I'm interested in learning and language processing, not business software.

@ceefour
Contributor Author

ceefour commented Jul 9, 2014

It's easier to edit data in a spreadsheet the same way it's easier to edit code in an IDE.
A spreadsheet can be small or big depending on the data, just like a .java or .sch file can have 100 bytes or 100 KiB. The Java test has 876 lines, of which the actual Java code is probably about 100 lines; the remaining 700-something lines are data wrapped in Java code. If these are put in a table, it's easier and more compact.

I'll prove it to you, here's very short video of me typing RelEx test data in a sheet: http://youtu.be/Xg1hXZdT6MU

It's very easy. I just type or edit or correct, Tab, and Enter. I don't need to escape \n or worry about string delimiters. I don't have to type \" if a " comes along.
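
To illustrate the escaping point, here is a minimal Java sketch. The class name `EscapingContrast`, the method `firstLine`, and the sample sentence are all made up for this example and are not part of RelEx. It shows that a sentence containing quotes must be escaped when written as a Java literal, while the same text read from a plain-text source comes back unchanged.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper, not a RelEx class.
class EscapingContrast {
    // Inline style: the embedded quotes must be written as \" in Java source.
    static final String INLINE_SENTENCE = "She said \"hello\" to him.";

    // File style: the characters on the line are returned as they are stored.
    // A Reader is accepted so the sketch works with a file or an in-memory string.
    // Returns null if the source is empty.
    static String firstLine(Reader source) {
        try (BufferedReader in = new BufferedReader(source)) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real data file the line would simply read `She said "hello" to him.` with no backslashes; passing a `java.io.FileReader` for that file to `firstLine` would yield a string equal to `INLINE_SENTENCE`.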

I can switch between data sets easily just by changing tabs; in the Java editor I have to scroll or use the Outline to find the method which holds the data.

While typing this I have the editor side by side. The editor shows me 4 test cases in Java code. The spreadsheet displays probably 30 test cases in the same visual space. If I maximize the window then it can probably display 60 test rows at once. RelEx has 80 tests now. One can very quickly see what the tests are and append more tests.

I can make another video if you want, showing me typing the same test data inside a Java editor. It's definitely longer than this video. :)

@linas
Member

linas commented Jul 9, 2014

It's not easier to edit anything in a spreadsheet. I don't have access to any spreadsheets or spreadsheet editors. The last time I used one was 20 years ago. I don't have any use for business software and I don't have time to watch a YouTube video. This entire conversation is crazy and pointless and a waste of time. I'm done.

@ceefour
Contributor Author

ceefour commented Jul 9, 2014

I'm sorry to take your time. It's really not my intention. My intention is to show you that there is a better, faster, easier way to do some things, and I tried to explain it in writing and also to make a video demonstrating it. I'm aware that you don't like spreadsheet applications, but that doesn't mean they're not a good fit for this purpose.

Anyway, I have an alternative which I hope you might like better. Would you mind moving the test data into TSV (tab-separated) format? It's purely text, so I hope this is acceptable to you. This is what it looks like in a plain text editor, and I believe it is also easy for you and everyone else to edit.

[image: -home-ceefour-tmp-relex-test2 tsv kate_403]

If you accept you can assign to me.
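
To make the TSV proposal concrete, here is a minimal sketch of a loader in Java. The file layout is an assumption, since the screenshot is not reproduced here: each sentence sits on its own unindented line, and its expected relations follow on tab-indented lines. The names `TsvFixtures`, `Fixture`, and `parse`, and the `#` comment convention, are hypothetical and not taken from RelEx.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader for an assumed fixture layout; not the actual RelEx code.
class TsvFixtures {
    /** One test case: a sentence and the relations expected for it. */
    static final class Fixture {
        final String sentence;
        final List<String> expected = new ArrayList<>();

        Fixture(String sentence) {
            this.sentence = sentence;
        }
    }

    /** Parses lines in the assumed layout; blank lines and '#' comments are skipped. */
    static List<Fixture> parse(List<String> lines) {
        List<Fixture> fixtures = new ArrayList<>();
        Fixture current = null;
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            if (line.startsWith("\t")) {
                // A tab-indented line belongs to the most recent sentence.
                if (current == null) {
                    throw new IllegalArgumentException("relation before any sentence: " + line);
                }
                current.expected.add(line.trim());
            } else {
                // An unindented line starts a new test case.
                current = new Fixture(line.trim());
                fixtures.add(current);
            }
        }
        return fixtures;
    }
}
```

A test could feed it with `TsvFixtures.parse(java.nio.file.Files.readAllLines(path, java.nio.charset.StandardCharsets.UTF_8))`, so only the JDK is needed at build and test time.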

@githart
Member

githart commented Jul 10, 2014

LibreOffice Sheets with odftoolkit may be the better and more convenient option; the TSV in this case is ugly and cumbersome. Although there may be a case for keeping a diff-friendly format, I do not know what this case may be.

And really it's just a way to edit test data, not a build dependency. I think the issue should be back open for discussion.

@githart reopened this Jul 10, 2014
@ceefour
Contributor Author

ceefour commented Jul 10, 2014

While I personally prefer editing with LibreOffice (it's preinstalled in most distros including Ubuntu, Linux Mint, and Fedora, and is a straightforward download for Windows/OS X), I can understand Dr. Vepstas' objections, and to me editing tests in TSV is still so much better than editing them in Java code:

[image: -home-ceefour-tmp-relex-test2 tsv kate_403]

The above contains exactly the same data as below. The Java code also uses indentation, so it's just like TSV, except that it also needs boilerplate code, string quotes, + and \n and ); etc. that are not necessary in the TSV.

[image: selection_407]

If spreadsheet usage is approved, I'd probably use it this way:

[image: relex2-spreadsheet ods - libreoffice calc_409]

I'd color code green as "passing", red as "failed", yellow as "some subtests fail", and orange as "although this test passes, actually the logic is hardcoded and hacky, so please revisit it". (The coloring is up for debate; feel free to use colors easier on your eyes. If you don't like the colors then okay, no need to use them.)

Personally I feel it reduces cognitive overload since my brain doesn't need to process extraneous boilerplate stuff like rc &= test_sentence (, while at the same time providing a visual cue ("oh, this one is broken, that one passes") and spatial information (OK, I've got ~50% coverage here, not because it says 50% but just from seeing that half of the screen is colored green).

Again, it's up to you since this is your project. All I'm doing is suggesting improvements and explaining their benefits, while I also acknowledge your concerns are valid. I'm already more than happy if TSV is accepted.

BTW, if you're concerned about diffs, LibreOffice can save in FlatXML format if so desired, which is basically an uncompressed ODS file (since LibreOffice documents are technically zipped XML). I don't see why anyone would want to diff it though (it's test data, not code), so I'd suggest using the regular ODS format.

ceefour added a commit to ceefour/relex that referenced this issue Jul 10, 2014
@bgoertzel

Hmmm...

Whether use of a spreadsheet makes sense here or not is a matter of taste,
I guess.... Of course, LibreOffice is free software and very easy to
use. But I can understand not wanting to use additional software to view
the test cases.

I do see some sense in putting the test sentences in text files of some
sort, rather than in the code, though. Wrapping data in java code does
seem a bit cumbersome IMO.

There are other things in OpenCog in more need of attention than these test
cases. But I can appreciate that Hendy, as a new contributor, is looking
at low-hanging fruit...

-- Ben

Ben Goertzel, PhD
http://goertzel.org

"In an insane world, the sane man must appear to be insane". -- Capt. James
T. Kirk

"Emancipate yourself from mental slavery / None but ourselves can free our
minds" -- Robert Nesta Marley

@amebel
Contributor

amebel commented Jul 11, 2014

@ceefour Separating the code from the test data set has been discussed before here. It would be great if you could follow up on that.

With regard to the .ods file, I think it is better if the data is inside a txt file (you could name it *.test for clarity); then anyone using an editor like vim can easily work with it.

Thanks :-)

@ceefour
Contributor Author

ceefour commented Jul 11, 2014

@amebel sure, I'd implement it like I suggested in #103 (comment). Thanks :)

Another benefit of separating test cases is that if one day RelEx is ported to another architecture, these test cases can be reused as-is. Or used concurrently by both project variants.

@ceefour changed the title from "Move test fixtures into LibreOffice spreadsheet" to "Move test fixtures into TSV plain text files" Jul 11, 2014
ceefour added a commit to ceefour/relex that referenced this issue Jul 11, 2014
ceefour added a commit to ceefour/relex that referenced this issue Jul 14, 2014
Test data for RelEx and Stanford are moved into TSV files.

The test classes (`TestRelEx` and `TestStanford`), in addition to having most of their content factored out, also had unnormalized line endings, so to Git these classes look entirely replaced.

The entirety of the test data now resides in these `.tsv` files. I think the format should be self-explanatory just by looking at it.

Ant build and tests still work.

@linas I hope this is acceptable.

Fixed opencog#103.
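
For readers without the commit at hand, the following Java sketch shows one way a test class could iterate over externally loaded fixtures once the data lives outside the code. It is not the code from the commit: `FixtureRunner` and `run` are hypothetical names, and the parser is passed in as a plain function so the sketch does not depend on any real RelEx class.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical data-driven runner; an illustration, not the committed test code.
class FixtureRunner {
    /**
     * Runs every sentence through the given parser, compares the result with
     * the expected relations (order-sensitive), and returns the failure count.
     */
    static int run(Map<String, List<String>> fixtures,
                   Function<String, List<String>> parser) {
        int failures = 0;
        for (Map.Entry<String, List<String>> entry : fixtures.entrySet()) {
            List<String> actual = parser.apply(entry.getKey());
            if (!actual.equals(entry.getValue())) {
                failures++;
                System.err.println("FAILED: " + entry.getKey()
                        + "\n  expected: " + entry.getValue()
                        + "\n  actual:   " + actual);
            }
        }
        return failures;
    }
}
```

A JUnit test would then need a single assertion that `run(...)` returns 0, and adding a test case means adding lines to the data file rather than editing Java.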
5 participants