Sankey diagram alias support #6178

Open
TomasHubelbauer opened this issue Jan 10, 2025 · 2 comments
Labels
Graph: Sankey · Status: Triage (Needs to be verified, categorized, etc) · Type: Enhancement (New feature or request)

Comments

@TomasHubelbauer
Contributor

Proposal

I searched for existing issues about name aliases in Sankey diagrams and found none.

By aliases I mean the same concept as found in e.g. sequence diagrams: https://mermaid.js.org/syntax/sequenceDiagram.html#aliases

I am aware that Sankey diagrams are in beta, so the lack of this functionality is not surprising. I still want to make the case for it, because the Sankey documentation as written seems set on interpreting three-column CSV, with little to no room for authoring improvements such as aliases:

> The idea behind syntax is that a user types sankey-beta keyword first, then pastes raw CSV below and get the result.

I propose supporting an alias-to-name map placed above the CSV section. The CSV could still be pasted in as-is, with no wrangling, while names could be remapped through the alias definitions, which would live in their own section. Please see the example below.

In this scenario, the raw data can contain short names or identifiers and remain easy to paste in, while some or all of those names can be quickly remapped to more appropriate display names, without running a find-and-replace over the raw CSV and meticulously keeping everything consistent.

This would be especially beneficial whenever a Sankey diagram is regenerated: the raw data could be replaced without touching the alias map, so the same renames would not have to be repeated on the new dataset.

Example

```mermaid
sankey-beta

A = Alice
B = Bob
C = Chad

A,B,100
A,C,100
```
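To make the intended semantics concrete, here is a minimal TypeScript sketch of how the example above could be resolved. It is only an illustration: `parseSankeyWithAliases` and the `Link` type are hypothetical names, not part of Mermaid, and the sketch assumes unquoted CSV fields and alias lines of the form `alias = name`.

```typescript
// Hypothetical helper, not part of Mermaid: resolves `alias = name` lines
// against the three-column CSV rows that follow them.
type Link = { source: string; target: string; value: number };

function parseSankeyWithAliases(body: string): Link[] {
  const aliases = new Map<string, string>();
  const rows: Link[] = [];

  for (const rawLine of body.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') {
      continue; // blank lines separate the alias map from the CSV
    }

    // Assumption: an alias line contains exactly one `=` and no commas.
    const aliasMatch = /^([^=,]+)=([^=,]+)$/.exec(line);
    if (aliasMatch) {
      aliases.set(aliasMatch[1].trim(), aliasMatch[2].trim());
      continue;
    }

    // Assumption: unquoted CSV fields only (no embedded commas or quotes).
    const fields = line.split(',').map((field) => field.trim());
    rows.push({ source: fields[0], target: fields[1], value: Number(fields[2]) });
  }

  // Resolve names after reading everything, so unaliased names pass through unchanged.
  const resolve = (id: string): string => aliases.get(id) ?? id;
  return rows.map((row) => ({
    ...row,
    source: resolve(row.source),
    target: resolve(row.target),
  }));
}
```

The `body` argument is assumed to be the text after the `sankey-beta` keyword. Because names are resolved only after all lines are read, swapping in a new dataset leaves the alias lines untouched.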

Screenshots

Before:

```mermaid
sankey-beta

A,B,100
A,C,100
```

After:

```mermaid
sankey-beta

A = Alice
B = Bob
C = Chad

A,B,100
A,C,100
```
TomasHubelbauer added the labels Status: Triage (Needs to be verified, categorized, etc) and Type: Enhancement (New feature or request) on Jan 10, 2025
@TomasHubelbauer
Contributor Author

TomasHubelbauer commented Jan 10, 2025

I've been trying to figure out how to contribute this functionality. This is my attempt at extending the Jison grammar for Sankey to support it:

```diff
diff --git a/packages/mermaid/src/diagrams/sankey/parser/sankey.jison b/packages/mermaid/src/diagrams/sankey/parser/sankey.jison
index 9d66b69a4..cbdd68efb 100644
--- a/packages/mermaid/src/diagrams/sankey/parser/sankey.jison
+++ b/packages/mermaid/src/diagrams/sankey/parser/sankey.jison
@@ -13,8 +13,11 @@
 %options case-insensitive
 
 %x escaped_text
-%x csv
 
+%x map
+EQUALS =
+
+%x csv
 // as per section 6.1 of RFC 2234 [2]
 COMMA \u002C
 CR \u000D
@@ -26,12 +29,12 @@ TEXTDATA [\u0020-\u0021\u0023-\u002B\u002D-\u007E]
 
 %%
 
-<INITIAL>"sankey-beta"                         { this.pushState('csv'); return 'SANKEY'; }
-<INITIAL,csv><<EOF>>                           { return 'EOF' } // match end of file
-<INITIAL,csv>({CRLF}|{LF})                     { return 'NEWLINE' }
+<INITIAL>"sankey-beta"                         { this.pushState('opt_map'); return 'SANKEY'; }
+<INITIAL,map,csv><<EOF>>                           { return 'EOF' } // match end of file
+<INITIAL,map,csv>({CRLF}|{LF})                     { return 'NEWLINE' }
 <INITIAL,csv>{COMMA}                           { return 'COMMA' }
 <INITIAL,csv>{DQUOTE}                          { this.pushState('escaped_text'); return 'DQUOTE'; }
-<INITIAL,csv>{TEXTDATA}*                       { return 'NON_ESCAPED_TEXT' }
+<INITIAL,map,csv>{TEXTDATA}*                       { return 'NON_ESCAPED_TEXT' }
 <INITIAL,csv,escaped_text>{DQUOTE}(?!{DQUOTE}) {this.popState('escaped_text'); return 'DQUOTE'; } // unescaped DQUOTE closes string
 <INITIAL,csv,escaped_text>({TEXTDATA}|{COMMA}|{CR}|{LF}|{DQUOTE}{DQUOTE})* { return 'ESCAPED_TEXT'; }
 
@@ -41,7 +44,19 @@ TEXTDATA [\u0020-\u0021\u0023-\u002B\u002D-\u007E]
 
 %% // language grammar
 
-start: SANKEY NEWLINE csv opt_eof;
+start: SANKEY NEWLINE opt_map csv opt_eof;
+
+opt_map: map | ;
+map: entry map_tail;
+map_tail: NEWLINE map | ;
+
+entry
+  : non_escaped[alias] EQUALS non_escaped[name] {
+      const alias = $source.trim();
+      const name = $target.trim();
+      yy.addAlias(alias,name);
+    }
+  ;
 
 csv: record csv_tail;
 csv_tail: NEWLINE csv | ;
```

Plus the accompanying change in sankeyDB.ts to add addAlias.
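For reference, a rough sketch of what an `addAlias` on the database side could look like. This is not the actual `sankeyDB.ts` code: the class name `SankeyAliasDb` and the methods `displayName` and `getLinks` are hypothetical, and only the `addAlias(alias, name)` call shape is taken from the grammar action above.

```typescript
// Hypothetical stand-in for the database module; the real sankeyDB.ts may differ.
type SankeyLink = { source: string; target: string; value: number };

class SankeyAliasDb {
  private aliases = new Map<string, string>();
  private links: SankeyLink[] = [];

  // Intended to be called by the grammar action for each `alias = name` entry.
  addAlias(alias: string, name: string): void {
    this.aliases.set(alias, name);
  }

  // Stores the link with the raw identifiers exactly as they appear in the CSV.
  addLink(source: string, target: string, value: number): void {
    this.links.push({ source, target, value });
  }

  // Falls back to the raw identifier when no alias was registered for it.
  displayName(id: string): string {
    return this.aliases.get(id) ?? id;
  }

  // Resolves aliases at read time, so the order of addAlias and addLink calls does not matter.
  getLinks(): SankeyLink[] {
    return this.links.map((link) => ({
      source: this.displayName(link.source),
      target: this.displayName(link.target),
      value: link.value,
    }));
  }
}
```

Keeping the raw identifiers in storage and mapping them only when links are read is one possible design; resolving eagerly inside `addLink` would also work as long as the alias map always precedes the CSV.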

This currently reports this error:

TypeError: Cannot read properties of undefined (reading 'rules')

I have gotten actual parser errors when experimenting further with the Jison changes, but ultimately I can't figure out where I'm going wrong. My thinking:

  1. I changed <INITIAL>"sankey-beta" to go to opt_map instead of csv so the map can be parsed, or skipped if it doesn't exist
  2. I added map to all state lists for common tokens like NEWLINE, EOF, and NON_ESCAPED_TEXT
  3. The map and entry definitions look fine to me; I don't think that's where the problem is

I think I am missing a definition of the transition from the map (whether present or absent) to the csv state. I thought the sequence in start: SANKEY NEWLINE opt_map csv opt_eof; would ensure this, but I am not certain. If it did, the parser should have seen that opt_map can be empty and moved on to csv, where it should have resumed parsing as if I had not changed the grammar at all… I think.

Clearly I am missing a connection here so anyone with more Jison experience than me (AKA non-zero), please let me know where the issue is and how I can make this INI-style map structure optional in the grammar.

Also, if this is difficult to express in Jison, maybe I can choose a different syntax for the aliases, something with explicit tokens like in the sequence diagrams case?

participant A as Alice
participant J as John

So something like:

alias Solar PV as Solar Photovoltaics
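As a rough illustration of how such a keyword-based line could be recognized, here is a small TypeScript sketch. `parseAliasLine` is a hypothetical helper, not anything in Mermaid, and the `alias <id> as <name>` shape is just the syntax floated above.

```typescript
// Hypothetical: parses one `alias <id> as <display name>` line.
// Returns null for anything else, e.g. a CSV row.
function parseAliasLine(line: string): { alias: string; name: string } | null {
  // The lazy `.+?` splits at the first ` as `, so identifiers containing
  // the word "as" surrounded by spaces would be ambiguous in this sketch.
  const match = /^alias\s+(.+?)\s+as\s+(.+)$/i.exec(line.trim());
  return match ? { alias: match[1], name: match[2] } : null;
}
```

The explicit `alias` and `as` keywords make alias lines easy to tell apart from CSV rows, at the cost of reserving those words at the start of a line.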

@nirname
Contributor

nirname commented Jan 13, 2025

Great! Please open a pull request; that would make this easier to discuss.

From what I saw, it seems you are pushing the state 'opt_map' but using the state 'map' in the rule definitions. There is also no state pop.
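The state-name mismatch can be illustrated with a toy model of a start-condition stack. This is a simplified sketch, not Jison's actual implementation: `ToyLexer` and its methods are hypothetical, and it assumes the lexer looks up the active rule set by the name on top of the stack. Under that assumption, pushing a name that was never declared with `%x` leaves the lookup with nothing to return, which may be what surfaces as the reported TypeError.

```typescript
// Toy model of a lexer start-condition stack; not Jison's real code.
type Condition = { rules: string[] };

class ToyLexer {
  private stack: string[] = ['INITIAL'];
  private conditions: Record<string, Condition | undefined>;

  constructor(conditions: Record<string, Condition | undefined>) {
    this.conditions = conditions;
  }

  pushState(name: string): void {
    this.stack.push(name);
  }

  // Never pops the bottom INITIAL entry.
  popState(): void {
    if (this.stack.length > 1) {
      this.stack.pop();
    }
  }

  // Looks up the rules for whichever state is on top of the stack.
  currentRules(): string[] {
    const top = this.stack[this.stack.length - 1];
    const condition = this.conditions[top];
    if (condition === undefined) {
      throw new TypeError(`No start condition named '${top}' was declared`);
    }
    return condition.rules;
  }
}
```

In this model, pushing 'map' (a declared state), popping it once the alias section ends, and then pushing 'csv' keeps every lookup valid, whereas pushing 'opt_map' fails on the next lookup because 'opt_map' is a grammar rule name rather than a declared lexer state.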
