Add worex tool to add melismatic word extenders #67

Draft
wants to merge 6 commits into
base: master

WolfgangDrescher
Contributor

This PR adds the functionality discussed in humdrum-tools/verovio-humdrum-viewer#657 (comment) to enable melismatic underlines at ending syllables of words with a dedicated filter for this. I used a generic name lyricsformatter so we can potentially add more features into this tool.

The name lyricsformatter is a bit verbose. Do you have a better suggestion? Maybe lf, fmtlyrics or something else?

Current usage to add melismatic underlines is:

cat test.krn | lyricsformatter --ul
cat test.krn | lyricsformatter --underline
cat test.krn | lyricsformatter --add-underline

And to remove them:

cat test.krn | lyricsformatter --xul
cat test.krn | lyricsformatter --remove-underline

Do we need a pluralized alias for these options?

You mentioned a possible new *ul interpretation. I have not implemented this.

Tool_lyricsformatter::addUnderlines and Tool_lyricsformatter::removeUnderlines only need to get passed a list with start tokens. So I hope this makes it usable in verovio iohumdrum.cpp, if needed. Should I make these methods static?

@craigsapp
Owner

Would this tool do anything other than add/remove the underline? What to call the tool and what it does would influence how the options are named as well.

Perhaps a better name than "underline" could be used. More precise would be "word extender", since it is a line that follows the ending syllable of a word. Perhaps it is better not to call it an "underline", since that typically means underlining words in standard text. In addition, the word extension line is sometimes centered like dashes rather than placed on the more typical baseline position (which is itself higher than standard prose "underlines", which would be placed below the baseline).

In that case the tool could be called worex or wordex or wordext (I like the first one since it is the shortest and easiest to say).

For example worex with no options would add underlines, while worex -r would remove them (long form: worex --remove).

There are a couple of corner cases to consider, the main one being: should the extension line extend across rests? In general I think so.

There is a difficult corner case to consider: Suppose that you have multiple sections to the music, such as three verse lines at the start of the music, two in the middle and three again at the end of the music. If the end of the third verse in the first section ends in a melisma, you would want the word extender to stop at the last note of the first section, and not continue throughout the entire middle section. In such cases, there could be a maximum size rest (such as a full measure rest) that would prevent the word extender from continuing.

This last case is what annoys me. There is a semi-official character that I use to stop word extenders explicitly for such cases (@jacekiwaszko1 can remind me what that character is, as he suggested it).

Maybe the *ul can be revised to *worex and *Xworex. *Xworex can be used to manually control the complicated case. This would be implemented in iohumdrum.cpp and allows display or suppression in a convenient manner in verovio rendering.

Another related interpretation that could be implemented is *elision and *Xelision which controls whether or not syllables containing spaces should display an elision or not. Currently spaces map to elisions as the hard-wired display style.

@WolfgangDrescher
Contributor Author

Would this tool do anything other than add/remove the underline?

My idea was that something like the *elision feature that you mentioned could also be implemented in the same tool.

For example worex with no options would add underlines, while worex -r would remove them (long form: worex --remove).

getBoolean() for options doesn't know whether the option was passed or not, right? So it can be either true or false, but not null? Given this, it would indeed make more sense to implement the feature as you suggest. An alternative syntax would be to make the --ul flag an integer: 1 → add word extenders, 0 → remove word extenders, and the default value could be -1. This is also not ideal, but it would allow other features, such as *elision, to be implemented in lyricsformatter.

There are a couple of corner cases to consider, the main one being: should the extension line extend across rests? In general I think so.

Screenshot 2023-02-05 at 12 24 42

VHV

Does it really happen in music that after a rest there are no new syllables? What text should the singer sing in this case? So I think it's not wrong to simply extend the word until the next syllable occurs in the lyrics. One question would be whether the extender line should stop under a rest and start again after it, but I don't think this is necessary.

There is a semi-official character that I use to stop word extenders explicitly for such cases (@jacekiwaszko1 can remind me what that character is, as he suggested it).

Okay, this would be good to know. So this character is currently encoded in some scores but not yet implemented in humdrum/verovio? The only solution I could come up with is using &nbsp;, but it's of course not ideal. I can add this information to the docs once everything is discussed and implemented.

Suppose that you have multiple sections to the music, such as three verse lines at the start of the music, two in the middle and three again at the end of the music.

What do you mean by section, *>A? If I understand your example correctly, the word extender should be stopped directly before all *> interpretations. But in your example the second section will have no lyrics, correct?

Maybe the *ul can be revised to *worex and *Xworex. *Xworex can be used to manually control the complicated case. This would be implemented in iohumdrum.cpp and allows display or suppression in a convenient manner in verovio rendering.

Okay. But the _ characters would always have priority over the *worex/*Xworex interpretations?

Another related interpretation that could be implemented is *elision and *Xelision which controls whether or not syllables containing spaces should display an elision or not. Currently spaces map to elisions as the hard-wired display style.

This would be implemented directly in verovio, so there is no need for an additional tool, is there?

@WolfgangDrescher WolfgangDrescher marked this pull request as draft February 5, 2023 11:59
@craigsapp
Owner

Does it really happen in music that after a rest there are no new syllables? What text should the singer sing in this case? So I think it's not wrong to simply extend the word until the next syllable occurs in the lyrics. One question would be whether the extender line should stop under a rest and start again after it, but I don't think this is necessary.

Here is more of what I was thinking of:

Screenshot 2023-02-05 at 5 26 14 AM

test data
**kern	**text
4g	Word
4g	ex-
4g	-ten-
4g	-der_
4g	.
4g	.
4r	.
4g	new
4g	word
=||	=||
4g	Word
4g	ex-
4g	-ten-
4g	-der_
4g	.
4g	.
4r	.
4r	.
4r	.
4r	.
4g	new
4g	word
=||	=||
4g	Word
4g	ex-
4g	-ten-
4g	-der_
4g	.
4g	.
8r	.
4g	new
4g	word
=||	=||
4g	Word
4g	ex-
4g	-ten-
4g	-der_
4g	.
4g	.
8r	.
8r	.
8r	.
8r	.
4g	new
4g	word
=||	=||
4g	Word
4g	ex-
4g	-ten-
4g	-der_
4g	.
4g	.
2r	.
4g	new
4g	word
=||	=||
*-	*-

Which is not really a problem. When the rest comes after the melisma and before the next note, then it is good to stop the word extender before the rest. Breaking the extender for rests in the middle would never be done, so that is not a problem either.

There is some complicated case, which I will have to think more about.

@craigsapp
Owner

getBoolean() for options doesn't know whether the option was passed or not, right? So it can be either true or false, but not null? Given this, it would indeed make more sense to implement the feature as you suggest. An alternative syntax would be to make the --ul flag an integer: 1 → add word extenders, 0 → remove word extenders, and the default value could be -1. This is also not ideal, but it would allow other features, such as *elision, to be implemented in lyricsformatter.

Boolean options are false by default and true if they are specified on the command line, so there is no separate "default" or "null" state. (This is similar to truthiness in JavaScript, where null and false are both falsy.)

If you need a three-state option, then yes, the option would have to be defined as an integer.
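A three-state option of the kind discussed here could be sketched as follows. This is only an illustration under assumptions: `ExtenderMode`, `modeFromValue`, and `parseExtenderMode` are hypothetical names, and the argument scan is hand-written rather than using the humlib options system.

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical three-state setting for word extenders.
enum class ExtenderMode { Unset, Remove, Add };

// Map the integer values from the discussion: 1 = add, 0 = remove,
// anything else (including the proposed default of -1) = unset.
ExtenderMode modeFromValue(int value) {
    if (value == 1) return ExtenderMode::Add;
    if (value == 0) return ExtenderMode::Remove;
    return ExtenderMode::Unset;
}

// Hypothetical argument scan: "--ul N" sets the value. If the option is
// absent, or N is not a number, the value stays at -1.
ExtenderMode parseExtenderMode(const std::vector<std::string>& args) {
    int value = -1;
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] != "--ul") continue;
        try {
            value = std::stoi(args[i + 1]);
        } catch (const std::exception&) {
            value = -1;
        }
    }
    return modeFromValue(value);
}
```

With this shape, a tool that bundles several lyrics features can leave existing word extenders untouched when `--ul` is not given, which a plain boolean cannot express.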

default value could be -1.

What is the behavior you are thinking "default" or "null" would do in this case?

@WolfgangDrescher
Contributor Author

Which is not really a problem. When the rest comes after the melisma and before the next note, then it is good to stop the word extender before the rest. Breaking the extender for rests in the middle would never be done, so that is not a problem either.

But verovio is already rendering the word extender like this. So there is no need to change anything for this? Or would you prefer to stop the word extender manually with the mentioned semi-official character?

What is the behavior you are thinking "default" or "null" would do in this case?

If we keep lyricsformatter to bundle other lyrics functions like *elision into it, using a boolean option would always remove the word extensions when the option is not passed on the command line (if you don't want multiple options like --ul and --xul). But since I think it's better to refactor it as you suggested, with worex and worex -r, this is irrelevant.

@WolfgangDrescher WolfgangDrescher changed the title Add lyricsformatter tool to enable melismatic underlines Add worex tool to add melismatic word extenders Feb 6, 2023
@craigsapp
Owner

I implemented *elision and *Xelision directly in verovio: rism-digital/verovio@2f1ce00

It should be fairly easy to implement *worex and *Xworex at the same time that *elision is processed in verovio...

@WolfgangDrescher
Contributor Author

In this case I think it's best to migrate this functionality into verovio iohumdrum.cpp and the worex tool should only add *worex or *Xworex tandem interpretations. Similar to what you recently implemented with deg --circle. Is this what you had in mind?

There are some cases that we should consider more carefully:

  • What should happen if *worex or *Xworex are already existing in the score? There could be an additional option to remove all of the existing interpretations.
  • What should happen if there are already underscores encoded in the **text data records?

Should I revert it back to lyricsformatter (or with a better name) and add options to enable support to add *elision and *Xelision as well? The options could be something like this: lyricsformatter --elision, lyricsformatter --worex, lyricsformatter --xelision, lyricsformatter --xworex; or more dynamic: lyricsformatter -i Xelision where -i stands for --add-interpretation (before first data record).

Currently there is a bug where _ gets added to syllables for single notes when they are followed by a null token caused by another voice. They won't be visible but verovio did show a warning: Syllable with underline extender under one single note. I could not find a good solution for this. Do you have an idea on how to resolve this? I tried with current->getDurationFromStart() but this was not helpful as it has a different duration than the assigned note. How can I get the corresponding **kern note for a **text data record?

@craigsapp
Owner

In this case I think it's best to migrate this functionality into verovio iohumdrum.cpp and the worex tool should only add *worex or *Xworex tandem interpretations. Similar to what you recently implemented with deg --circle. Is this what you had in mind?

It is subjective as to what should be a tool (or a HumdrumFileContent function) and what should be incorporated directly into verovio. Mostly it would depend on whether you need text formatting as an independent tool outside of verovio. If it is anticipated that a CLI tool would be useful, then implementing as a tool is good. If it is some notational aspect that you don't care about doing outside of verovio, then implementing in iohumdrum.cpp is better. But there is also the consideration of the size of the code. If it takes a lot of code to implement, it is probably better to implement as a tool even if it is primarily focused on notation rather than analysis or data processing. One example of this is the beaming and tuplet analysis. That is currently done in verovio since the code started out small and simple, but it is quite complex, and I will want the same code for reuse when converting to other formats such as MusicXML.

Otherwise, there is the consideration of computational efficiency. The filter system is not very optimized in terms of computational speed: At the moment it seems that embedding in verovio requires text output from the filter (the reason for this I do not remember, and it is annoying to have this limitation). I also allow manipulating the Humdrum data directly by a tool in order to pass to another tool without reparsing the data. When passing Humdrum data as a string between tools, each tool has to reparse the data.

Eventually I will be (re)writing a Humdrum-to-MusicXML tool, and when I do so, I will want to pull out a lot of functionality that is currently in iohumdrum.cpp into shared code (this will mostly be done by moving code to the HumdrumFileContent class). Implementing tools as functions in that class will allow them to be applied directly to a HumdrumFile object without needing to use the code as a tool. Tools can be primarily implemented in HumdrumFileContent, and then utilized in a tool as well by calling those functions from within the tool.

The scope of what should be a tool versus what should be implemented in HumdrumFileContent (or in verovio) depends on the complexity of the code: tools should be used for complicated code with many member functions needed to do what is needed (such as myank). Tools are also useful to implement messy code that is safely contained in a class separate from the core HumdrumFile classes. Smaller tasks that might be useful to share across multiple tools should go into the HumdrumFileContent class. But also I don't want the HumdrumFileContent class to get too bulky and unmanageable.

When the action is something related to rendering music notation, that is preferably something that goes into iohumdrum.cpp. This is somewhat related to deg --circle. The main purpose of the deg tool is to identify the scale degrees, and the data should mostly be about the semantic content of scale degrees (plus the approach and departure intervals as a bonus). The general aesthetic is to focus on semantics in the data and not on graphical formatting. The graphical formatting is best expressed as interpretations (which function similar to the scoreDef/staffDef in MEI). However, I am planning on allowing RDF reference records for deg data to allow individual scale degrees to be given graphical formatting. For example:
!!!RDF
deg: @ = circle
could be allowed to circle individual degrees, such as to emphasize degrees compared to others.

The lyricsformatter (worex) tool is pushing against the general aesthetic in that it focuses on the visual aspects rather than an analytic or data processing aspect. This is not necessarily a problem, but I have not had time to think of all of the implications of doing it one way or the other (see next posting for next questions related to this).

@craigsapp
Owner

What should happen if *worex or *Xworex are already existing in the score? There could be an additional option to remove all of the existing interpretations.

This one would be fairly easy to solve: use shed:

To force all word extensions to be removed when using the *worex system:

!!!filter: shed -x text -e "s/^worex$/Xworex/I"

What should happen if there are already underscores encoded in the **text data records?

This is one thing I have been thinking about. I would have the *Xworex interpretation automatically suppress any _ in the data when it is active. In other words, *worex/*Xworex has priority over the _ encoding in the data.

It could also be done with shed:

!!!filter: shed -x text -e "s/_$//"
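The priority rule described above (an active *Xworex suppresses any _ encoded in the data) could be sketched roughly as below. This is a minimal illustration, not the verovio implementation: `suppressExtenders` is a hypothetical helper that walks the tokens of a single **text spine as plain strings, and it only covers the suppression side (it does not add extenders while *worex is active).

```cpp
#include <string>
#include <vector>

// Walk the tokens of one **text spine from top to bottom.
// While *Xworex is active, a trailing "_" on a data token is dropped;
// *worex switches back to keeping whatever is encoded in the data.
std::vector<std::string> suppressExtenders(const std::vector<std::string>& tokens) {
    std::vector<std::string> output;
    output.reserve(tokens.size());
    bool suppress = false;
    for (const std::string& token : tokens) {
        if (token == "*Xworex") {
            suppress = true;
        } else if (token == "*worex") {
            suppress = false;
        }
        // Barlines, interpretations, and comments are passed through unchanged.
        bool isData = !token.empty() && token[0] != '*' && token[0] != '!' && token[0] != '=';
        std::string result = token;
        if (suppress && isData && result.size() > 1 && result.back() == '_') {
            result.pop_back();
        }
        output.push_back(result);
    }
    return output;
}
```

The shed filter above does the unconditional version of the same edit; the difference here is that the data is only altered in regions where *Xworex is in effect.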

@craigsapp
Owner

Should I revert it back to lyricsformatter (or with a better name) and add options to enable support to add *elision and *Xelision as well? The options could be something like this: lyricsformatter --elision, lyricsformatter --worex, lyricsformatter --xelision, lyricsformatter --xworex; or more dynamic: lyricsformatter -i Xelision where -i stands for --add-interpretation (before first data record).

I would say that dealing with *elision is clearly better to implement directly in verovio. There is no good reason to stick &nbsp; characters in the data other than for graphical rendering. It is also very simple, which also supports putting it in verovio rather than a tool.

For word extenders the choice is more subjective (so I will try implementing in verovio first before deciding). One argument for this is that the _ characters could somehow be used for melisma analysis. But against this argument, it would be better to implement a separate melisma analysis tool for doing melisma analysis.

I have done that already for TiMP https://www.tassomusic.org/analysis/melisma which is far more complicated than just looking for _ characters. Also, word extenders may be trivial and not for a melisma: there could be a series of tied notes for the syllable, and the extender just indicates a single note. Also, when implementing the TiMP melisma tool, I noticed that a two-note melisma is not a real melisma, and you might want to know the number of notes for a threshold on how you define a melisma (and there could be other factors such as the duration of the notes in the melisma, etc).
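To make the threshold idea concrete, here is a rough sketch that counts note attacks per syllable and ignores tie continuations, so a syllable held over tied notes counts as one note. It is not the TiMP tool: `Row`, `isNoteAttack`, `notesPerSyllable`, and `isMelisma` are hypothetical names, the data is reduced to (**kern, **text) string pairs, and the default threshold of 3 is an assumption.

```cpp
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<std::string, std::string>;  // (**kern token, **text token)

// True for a **kern data token that starts a new note: not a null token,
// barline, interpretation, comment, rest ("r"), or tie continuation ("_" or "]").
bool isNoteAttack(const std::string& kern) {
    if (kern.empty() || kern == ".") return false;
    if (kern[0] == '=' || kern[0] == '*' || kern[0] == '!') return false;
    return kern.find_first_of("r_]") == std::string::npos;
}

// For each syllable in order, count the note attacks from that syllable
// up to (but not including) the next syllable.
std::vector<int> notesPerSyllable(const std::vector<Row>& rows) {
    std::vector<int> counts;
    for (const Row& row : rows) {
        const std::string& text = row.second;
        bool isSyllable = !text.empty() && text != "." &&
                          text[0] != '=' && text[0] != '*' && text[0] != '!';
        if (isSyllable) counts.push_back(0);
        if (!counts.empty() && isNoteAttack(row.first)) counts.back() += 1;
    }
    return counts;
}

// Assumed definition: a melisma needs at least `threshold` note attacks.
bool isMelisma(int noteCount, int threshold = 3) {
    return noteCount >= threshold;
}
```

Other factors mentioned above, such as note durations, are left out of this sketch.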

So don't do anything yet (unless there is particular urgency).

Also, I did not mention another path to generating tools. If only a CLI interface is needed and a tool is not needed to run inside of verovio, the code can be placed in the CLI source code file rather than in a Tool_ class. An example of this is the npvi.cpp tool, which outputs analytic data only (so not too useful to use with verovio at the moment):

https://github.com/craigsapp/humlib/blob/master/cli/npvi.cpp

@craigsapp
Owner

Currently there is a bug where _ gets added to syllables for single notes when they are followed by a null token caused by another voice. They won't be visible but verovio did show a warning: Syllable with underline extender under one single note. I could not find a good solution for this. Do you have an idea on how to resolve this? I tried with current->getDurationFromStart() but this was not helpful as it has a different duration than the assigned note. How can I get the corresponding **kern note for a **text data record?

For the first part, remind me to look into it later. For the last question the answer can be found in the function which I will be adding to iohumdrum.cpp if word extender styling will end up being processed inside of verovio:

//////////////////////////////
//
// HumdrumInput::hasParallelNote -- Go backwards on the line and count
//   any note attack (or tied note) on the first staff-like spine (track)
//   found to the left.  If there is a spine split in the text and or
//   **kern data, then this algorithm needs to be refined further.
//

int HumdrumInput::hasParallelNote(hum::HTp token) {
   hum::HTp current = token;
   int track = -1;
   while (current) {
      current = current->getPreviousField();
      if (!current) {
         break;
      }
      if (current->isStaffLike()) {
         int ctrack = current->getTrack();
         if (track < 0) {
            track = ctrack;
         }
         if (track != ctrack) {
            return 0;
         }
         if (current->isNull()) {
            continue;
         }
         if (current->isNote()) {
            return 1;
         }
      }
   }
   return 0;
}

To summarize: in order to identify the note that a text syllable is attached to, you look to the left of the **text (or **silbe) spine for the first **kern spine. I actually use token->isStaffLike() rather than specifically **kern. This allows the code to also work with **mens data (and any future representation that renders as a musical staff).

There is a complication related to spine splitting (which I will deal with whenever there is an example). When there are two voices in the **kern data, it is not clear which note the syllable is attached to (or which voice). Verovio may have problems with that as well, since the syllable has to be attached to a note, which may not exist in one voice but the other. Another problem would be what to do if the **text spine splits (which I would wait for a real-world example before trying to solve).

Note that the above code has not been debugged yet, so there may be some off-by-one errors that I will check on later.
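For experimenting with the leftward search outside of humlib, the same idea can be sketched on plain data. Everything here is a stand-in: `Field`, `looksLikeNote`, and `findParallelNote` are hypothetical, a line is just a vector of fields, and the note test is a crude check for a pitch letter without a rest marker. Unlike the function above, this variant returns the index of the note (or -1) rather than a flag.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for one field (token) on a Humdrum data line.
struct Field {
    std::string text;  // token text, "." for a null token
    bool staffLike;    // true for **kern-like spines
    int track;         // spine track number
};

// Crude note test: has a pitch letter and is not a rest.
bool looksLikeNote(const std::string& text) {
    return text.find('r') == std::string::npos &&
           text.find_first_of("abcdefgABCDEFG") != std::string::npos;
}

// Search leftward from the text field at textIndex. Only the first
// staff-like track found to the left is considered. Returns the index of
// a note in that track, or -1 if there is none.
int findParallelNote(const std::vector<Field>& line, int textIndex) {
    if (textIndex < 0 || static_cast<std::size_t>(textIndex) > line.size()) return -1;
    int track = -1;
    for (int i = textIndex - 1; i >= 0; --i) {
        const Field& field = line[static_cast<std::size_t>(i)];
        if (!field.staffLike) continue;
        if (track < 0) track = field.track;
        if (field.track != track) return -1;  // moved past the owning staff
        if (field.text == ".") continue;      // null token, keep looking
        if (looksLikeNote(field.text)) return i;
    }
    return -1;
}
```

As with the original, spine splits in the **kern or **text data are not handled here.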

@craigsapp
Owner

craigsapp commented Feb 7, 2023

But verovio is already rendering the word extender like this. So there is no need to change something for this? Or would you prefer to stop the word extender manually with the mentioned semi-official character?

I would not deprecate the _ system, although I might discourage it after the *worex system is implemented. The latter system allows toggling between styles, but the first one can only be initially on, and then turned off once via shed (but textFormatter is designed to get around that problem).

Hyphens to indicate if a syllable is attached to other syllables before it or after it are important, and not removing them would cause serious problems since they cannot be automatically generated.

Word extenders do not have that sort of criticality. There is no ambiguity when adding them automatically. So in that sense they are more disposable, and easier to turn on/off with an interpretation. So the general aesthetic is to not include them in the data and instead control them through use of interpretations. However, there are cases when _ might be useful for doing basic identification of melismas (but they are not a great melisma indicator as mentioned above).

when there were slices between a syllable token and the next syllable caused by the rhythm of other voices
@WolfgangDrescher
Contributor Author

Also, I did not mention another path to generating tools. If only a CLI interface is needed and a tool is not needed to run inside of verovio, the code can be placed in the CLI source code file rather than in a Tool_ class.

Good point. But this is the big advantage of Humdrum: having a dynamic way to interact with the score, and this is the reason why I started using humdrum in the first place. To stay as flexible as possible within the browser context (JS), it's good to have such a tool available in verovio. E.g. I'm planning (a student of mine already started) to implement humdrum-tools/verovio-humdrum-viewer#778, which is crucial for one of my next project ideas.

So if it is a feature that people can benefit from using directly within the browser, I will keep making such tools available within verovio.

The worex tool in particular would, in my opinion, be nice to have as an easily configurable filter for verovio. Since the tool is small, it will probably be a good idea to keep it as it is and not port it to iohumdrum.cpp. Otherwise I would have to write a tool anyway that adds the *worex interpretation. It does not make a difference to me, but currently I think I would keep it as it is.

But just tell me if you want me to implement this into iohumdrum.cpp.

@WolfgangDrescher
Contributor Author

For the first part, remind me to look into it later. For the last question the answer can be found in the function which I will be adding to iohumdrum.cpp if word extender styling will end up being processed inside of verovio:

Thank you this helped. I thought there is a convenient method to use that I did not find so far. But I added your code to Tool_worex (returning the token instead of a boolean).

I fixed this bug now and it will only print _ to tokens that really need it. Also there was an additional bug with syllables followed by a rest, which is fixed now.

@WolfgangDrescher
Contributor Author

have done that already for TiMP https://www.tassomusic.org/analysis/melisma which is far more complicated than just looking for _ characters.

This is a nice view for displaying melisma. Thank you for sharing.

So don't do anything yet (unless there is particular urgency).

Not urgent at all. I was just going through open issues, saw that the word extender tool is still open, and thought it's a small tool that can be implemented relatively quickly.

I can continue on this when you clarify how you want it implemented.

@craigsapp
Owner

Commit rism-digital/verovio@9363775 implements *worex and *Xworex in the Humdrum-to-MEI converter.

Thinking more about the lyricsformatter tool, I think it would be useful to have. It will allow easy automated styling for word extensions. Removing them would be easy by using shed:

!!!filter: shed -x text -e "s/_$//"

But adding them is the complicated part.

There is still consideration as to what to call the tool. If the tool only manages word extensions, then the best name would be worex. If it also handles elision styling, then that would need more thought. My current thoughts are that there should be two separate tools. One reason will be that it will keep the input options simpler.

For worex there would be these options:

| Option | Meaning |
| --- | --- |
| none | Analyzes lyrics for melismas, and places an underscore after the ending syllable of a word that has a melisma (or tied group of notes). Also removes incorrect use of _ when there is not a melisma (or tied group) for the ending syllable of a word. (Does not alter any *worex or *Xworex in the file.) |
| -r | Removes all underscores after syllables. (Does not alter any *worex or *Xworex in the file.) |
| -i | Manages line extensions via the *worex/*Xworex interpretations rather than altering the data to add/remove underscore characters. When -i is used by itself, adds *worex at the start of the score before any data in the **text (or **silbe) interpretations. If there is a *Xworex interpretation before the start of the lyrics, changes that to *worex. |
| -i -r (or -ir) | Similar to -i, but uses *Xworex instead of *worex. |
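The default (no option) behavior could be sketched roughly as follows. This is not the Tool_worex code: `Row`, `isDataToken`, `isSoundingNote`, and `updateWordExtenders` are hypothetical names, the score is reduced to (**kern, **text) string pairs, and the sketch assumes one voice per staff. A word-ending syllable gets an underscore when another sounding note follows before the next syllable; otherwise any existing underscore is removed.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<std::string, std::string>;  // (**kern token, **text token)

// Data tokens are anything that is not a barline, interpretation, or comment.
bool isDataToken(const std::string& token) {
    return !token.empty() && token[0] != '=' && token[0] != '*' && token[0] != '!';
}

// A sounding note: a non-null **kern data token that is not a rest.
bool isSoundingNote(const std::string& kern) {
    return isDataToken(kern) && kern != "." && kern.find('r') == std::string::npos;
}

// Add "_" to word-ending syllables that continue over later notes (melisma
// or tied group) and remove "_" from syllables that do not.
void updateWordExtenders(std::vector<Row>& rows) {
    for (std::size_t i = 0; i < rows.size(); ++i) {
        std::string& text = rows[i].second;
        if (!isDataToken(text) || text == ".") continue;
        std::string base = text;
        if (base.back() == '_') base.pop_back();
        if (base.empty()) continue;
        bool endsWord = base.back() != '-';
        bool continues = false;
        for (std::size_t j = i + 1; j < rows.size(); ++j) {
            const std::string& next = rows[j].second;
            if (!isDataToken(next)) continue;  // skip barlines and the like
            if (next != ".") break;            // next syllable reached
            if (isSoundingNote(rows[j].first)) {
                continues = true;
                break;
            }
        }
        text = (endsWord && continues) ? base + "_" : base;
    }
}
```

Null tokens in the **kern spine (for example slices caused by the rhythm of another part) are not counted as notes, so a syllable on a single note does not receive an extender.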
