[REVIEW]: WordTokenizers.jl: Basic tools for tokenizing natural language in Julia #1956
Comments
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @leios, @ninjin it looks like you're currently assigned to review this paper 🎉. ⭐ Important ⭐ If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿 To fix this do the following two things:
For a list of things I can do to help you, just type:
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
cc @Ayushk4
Hey @will-rowe I am really sorry. I had a week to do it, but was finishing my thesis revisions during that time, then I had to move to my new position. I haven't had time to set up my stuff in my new place, so I don't know when I can do a review. At the earliest, this weekend. I am making it a priority to do then, but I understand if you want to try to find a new reviewer. Sorry again for the trouble.
@will-rowe: I should be able to handle it within the week.
Hey, trying to start the review. Sorry I'm late. Also sorry if this is a stupid question, but how do I check the boxes? Last time I did a review, I could just click on them, but this time I do not seem to have permission.
I will submit a PR fixing a number of typos in the README and paper. I have a general question on the scope of this software regarding Asian language processing, such as Japanese. I know that this is possible with TinySegmenter.jl, but is there a way to use this package with WordTokenizers?
The default tokenizer in WordTokenizers doesn't work well on some Asian languages, such as Chinese and Japanese. However, TinySegmenter.jl can be set up in the same way as the RevTok.jl example given in the README.
Ah. I was missing the With that, I think I can check all the boxes on the review!
For interest the default tokenizer TokTok has:
Which is to say a bunch of languages using the Latin (English, French, German, Czech, Vietnamese), Cyrillic (Russian, Tajik), or Persian (Farsi) alphabets. We don't yet have anything that supports the morpheme segmentation needed for languages like Japanese or Chinese. I would speculate, just based on the people involved in JuliaText, that we are likely to get support for one or more Indian languages before anything else.
@Ayushk4 has now updated the README with that example.
I don't see the updated README yet. If you are going to change it, could you use something like "ジュリアプログラミング言語が大好きです!"? It translates to "I love the Julia programming language!" and also shows how it segments all three alphabets (it lumps Hiragana words together, which might be an issue for some people).
Hi @leios. You should be able to edit the top comment in this thread - click the 3 dots and then edit to update your checklist (via markdown). Let me know if you're stuck - and thanks for the review!
@will-rowe I don't have that option, only "copy link" and "quote reply"
hmmm - did you accept the invite at the top of the thread? Link here. If you have already done this, let me know and I'll have to ask someone as I'm out of ideas!
Sorry, I should have known that was a requirement. Filled in everything now. Thanks!
Hi @ninjin - just pinging you to see how your review is going? Thanks!
@will-rowe: Ouch, sorry. On it today. Sorry for being the slowest reviewer you have on the roster and thank you for the poke. @leios: I would prefer: “日本語はひらがなとカタカナと漢字で表されます。” (“Japanese is expressed using hiragana, katakana, and kanji.”) over your example sentence. Sure, this one is way more formal, but yours somehow sounds a bit odd to me. Perhaps because it expresses something somewhat intimate in a formal way?
Repository:
Paper:
Non-mandatory (just comments, really):
All in all, a nice solid package. Well done!
@oxinabox I'm taking over for the rest of your submission. Can you edit the Zenodo data a little more, so that the title matches that of your paper?
Done, thanks. Good changes.
@oxinabox Is the metadata updated when you look at it online? It still isn't for me. Maybe this will help? https://github.com/geodynamics/best_practices/blob/master/ZenodoBestPractices.md
Ok great!
@whedon set 10.5281/zenodo.3663390 as archive
OK. 10.5281/zenodo.3663390 is the archive.
@whedon accept
Check final proof 👉 openjournals/joss-papers#1322 If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1322, then you can now move forward with accepting the submission by compiling again with the flag
@whedon accept deposit=true
🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦
🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨 Here's what you must now do:
Any issues? notify your editorial technical team...
@openjournals/dev There is something wrong with the paper DOI. This seems to keep happening to me. How can I prevent it/watch for it in the future?
@kthyng the DOI resolves for me, so it may just be a caching issue with your browser? Though sometimes it does take a few minutes or longer for it to initially resolve, which is a Crossref issue, I think.
It takes a few mins for the DOI to register, plus sometimes, the DOI resolves before the GitHub Pages build is ready (so the PDF doesn't render). Basically I recommend waiting a few mins before clicking the DOI link, and when rechecking, it sometimes helps to (re)try in incognito/private mode too to help your browser avoid caching issues.
Sometimes I retry from a different network (e.g., phone vs wifi), since the caching seems to happen in the specific network somewhere.
🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉 If you would like to include a link to your paper from your README use the following code snippets:
This is how it will look in your documentation:
We need your help! Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:
@arfon @kyleniemeyer Sorry about the unwarranted alert! When I checked, it was showing up as the orange-ish not-working page, which in the past had meant a different problem, rather than the blue-ish the-DOI-just-hasn't-resolved-yet page.
A belated congratulations to @oxinabox on your new paper! Thanks to @will-rowe for editing and to reviewers @leios and @ninjin — we really appreciate your time and expertise!
Submitting author: @oxinabox (Lyndon White)
Repository: https://github.com/JuliaText/WordTokenizers.jl/
Version: v0.5.4
Editor: @will-rowe
Reviewer: @leios, @ninjin
Archive: 10.5281/zenodo.3663390
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@leios & @ninjin, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @will-rowe know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @leios
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @ninjin
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper