2d print #1155

mmatera · 2024-11-05T22:52:17Z

This is a kind of experiment. In WMA, in contrast to InputForm, which produces an "inline" representation of an expression, OutputForm produces a "2D" text-like representation. Here are some examples:

In[1]:= a/b                                                                     

        a
Out[1]= -
        b

In[2]:= Sqrt[a]                                                                 

Out[2]= Sqrt[a]

In[3]:= Integrate[f[x],x]                                                       

Out[3]= Integrate[f[x], x]

In[4]:= Sqrt[a]^q                                                               

         q/2
Out[4]= a

In[5]:= MatrixForm[Table[i*j,{i,2},{j,2}]]                                      

Out[5]//MatrixForm= 1   2

                    2   4

In this way, OutputForm output is formatted much like SymPy's pretty-printed output.

This PR provides a way to partially reproduce this behavior when an expression is wrapped in OutputForm and the variable $Use2DOutputForm is set to True.

The support is not complete, and can probably be improved by using SymPy's pretty-printing code (sympy.printing.pretty). Also, it currently does not work in Mathics-Django, because Mathics-Django is not able to print strings with line breaks.

Here are some examples in Mathics, under this branch:

In[1]:=  $Use2DOutputForm=True
Out[1]= True

In[2]:= OutputForm[a/b]
Out[2]= 
         a 
        ---
         b 

In[3]:= OutputForm[Sqrt[a]]
Out[3]= 
        __
        |a

In[4]:= Integrate[F[x],x]^2//OutputForm
Out[4]= 
                   2
         /+         
         |  F[x] dx 
        +/          

In[5]:= MatrixForm[Table[i*j,{i,2},{j,2}]]
Out[5]//MatrixForm= 1   2
                    
                    2   4

* Improving handling for Infix, Prefix and Postfix format

rocky · 2024-11-06T12:30:58Z

Character-based printing does not feel like a "core" function. Instead, it is purely a formatting function, done after boxing.

Can this be reformulated as a formatting process instead?

mmatera · 2024-11-06T12:39:12Z

@rocky, indeed, my idea when I started writing this (circa July 2023) was to put it in a module. But I have not yet been able to redefine the boxing mechanism and the package interface so that it works as an independent package. In any case, most of the code is independent enough to move to a package once the other pieces are ready.

rocky · 2024-11-06T12:45:53Z

It does not necessarily have to be in a Mathics3 module outside of the mathics core repository, but it should be outside of mathics.core and we should figure out what adjustments to the boxing mechanism we need to make in order to allow this to work.

(And yes, if we could hook into SymPy's format routines that would also be awesome.)

I fear that we are making things worse for us in the long run by violating modularity or separation of phases. This kind of thing has happened in this project in the past, and it has caused a lot of extra work that has taken a long time to address (and some of it hasn't been fully addressed even now).

rocky · 2024-11-06T12:50:58Z

In[1]:=  $Use2DOutputForm=True
Out[1]= True

In[2]:= OutputForm[a/b]
Out[2]= 
         a 
        ---
         b

wolframscript has:

     a
     -
     b

Similarly, these appear differently in wolframscript:

In[3]:= OutputForm[Sqrt[a]]
Out[3]= 
        __
        |a

In[4]:= Integrate[F[x],x]^2//OutputForm
Out[4]= 
                   2
         /+         
         |  F[x] dx 
        +/

rocky · 2024-11-06T12:59:39Z

Comparing with what SymPy produces:

>>> x, y, z, t = symbols("x y z t")
>>> x / y
x
─
y
>>> sqrt(2)
√2
>>> xx, yy, = symbols("xx, yy")
>>> xx / yy
xx
──
yy

This differs from wolframscript in that some Unicode symbols are used: ─ (a longer version of -) and √. I think that is okay if we hook into SymPy's formatting. However, as it stands, we have a third, invented formulation, and we have run into this kind of problem in the past as well.
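For reference, SymPy's pretty printer can produce either ASCII or Unicode output for the same expression through the use_unicode argument of sympy.pretty. The following is only a minimal sketch of that switch; the variable names are illustrative and nothing here is taken from the PR.

```python
import sympy

x, y = sympy.symbols("x y")

# Same expression, two character sets for the fraction bar.
ascii_fraction = sympy.pretty(x / y, use_unicode=False)
unicode_fraction = sympy.pretty(x / y, use_unicode=True)

print(ascii_fraction)
print(unicode_fraction)

# The square-root sign is likewise drawn differently in each mode.
print(sympy.pretty(sympy.sqrt(2), use_unicode=False))
print(sympy.pretty(sympy.sqrt(2), use_unicode=True))
```

With use_unicode=False the fraction bar is drawn with plain hyphens, which is the same shape as the wolframscript output for a/b shown earlier.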

mmatera · 2024-11-07T22:04:56Z

√2

Indeed, what I built here is a kind of experiment. I didn't try too hard to mimic exactly what SymPy or WMA does, apart from building a "2D" text representation, which is what OutputForm is about.
In any case, regarding modularity, we could also move the mathics/core/prettyprint.py module to the formatting folder, or even to a different project, since it is more or less independent.

rocky · 2024-11-09T20:39:10Z

mathics/format/pane_text.py

@@ -233,7 +233,7 @@ def fraction(a: Union[TextBlock, str], b: Union[TextBlock, str]) -> TextBlock:
        a = TextBlock(a)
    if isinstance(b, str):
        b = TextBlock(b)
-    width = max(b.width, a.width) + 2
+    width = max(b.width, a.width)
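
To make the effect of this width change concrete, here is a minimal sketch that works on plain strings rather than on the PR's TextBlock class. The function name, its pad parameter, and the list-of-lines return type are assumptions made for illustration; this is not the code in pane_text.py.

```python
from typing import List


def fraction(numerator: str, denominator: str, pad: int = 0) -> List[str]:
    """Stack numerator over denominator with a dashed bar (hypothetical helper).

    pad=1 mimics the old "+ 2" width; pad=0 mimics the new width.
    """
    width = max(len(numerator), len(denominator)) + 2 * pad
    return [
        numerator.center(width),    # numerator centered over the bar
        "-" * width,                # the fraction bar
        denominator.center(width),  # denominator centered under the bar
    ]


print("\n".join(fraction("a", "b", pad=1)))  # three-dash bar, as in the PR output
print("\n".join(fraction("a", "b")))         # one-dash bar, as in wolframscript
```

With pad=0 the bar is exactly as wide as the wider operand, which is the layout wolframscript prints for a/b.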


Instead of adding more original code, please look into hooking into SymPy's character-based formatting mechanism.

Otherwise, this may be another kind of thing where effort is put into creating something that is later removed, because there is something that is more likely to be more complete and that we won't have to maintain.

If it turns out that we can't use SymPy's character-based printing, then we can go down this road.

Indeed, the plan is to replace the use of mathics.format.pane_text by sympy.printing.pretty.stringpict, which is something that I already started to do.

What is not possible is simply calling sympy.pretty(expression.to_sympy()), because the translation function, which is useful for evaluation, is not useful for formatting.

For example

>>> import sympy
>>> from mathics.session import MathicsSession
>>> session = MathicsSession()
>>> expr = session.evaluate("Integrate[f[x]/g[x]^2,x]")
>>> sympy.pretty(expr.to_sympy())

produces

⌠
⎮ SympyExpression(_uGlobal`f, _uGlobal`x)[Global`f[Global`x]])
⎮ ───────────────────────────────────────────────────────────── d(_uGlobal`x)
⎮                                                             2
⎮ SympyExpression(_uGlobal`g, _uGlobal`x)[Global`g[Global`x]])
⌡

while

>>> expr = session.evaluate("Integrate[f[x]/g[x]^2,{x,a,b}]")
>>> print(sympy.pretty(expr.to_sympy()))

is not even able to identify the integrate symbol:

SympyExpression(_uSystem`Integrate, SympyExpression(_uGlobal`f, _uGlobal`x)/SympyExpression(_uGlobal`g, _uGlobal`x)**2, SympyExpression(_uSystem`List, _uGlobal`x, _uGlobal`a, _uGlobal`b))[System`Integrate[System`Times[Global`f[Global`x], System`Power[Global`g[Global`x], -2]], {Global`x,Global`a,Global`b}]])

In any case, the purpose of this PR is to:

* consider whether this 2D format is something that we would like to use in the REPL
* bring the formatting routines closer to the ones used in WL
* explore a possible design pattern (based on what we already have) to connect a formatted Mathics expression to a prettyForm object.

Indeed, the plan is to replace the use of mathics.format.pane_text by sympy.printing.pretty.stringpict, which is something that I already started to do.

What is not possible is simply calling sympy.pretty(expression.to_sympy()), because the translation function, which is useful for evaluation, is not useful for formatting.

For example

>>> import sympy
>>> from mathics.session import MathicsSession
>>> session = MathicsSession()
>>> expr = session.evaluate("Integrate[f[x]/g[x]^2,x]")
>>> sympy.pretty(expr.to_sympy())

This is clearly wrong because expr needs to be boxed first. Calls to SymPy's formatting routines are triggered by the formatting process of boxed expressions.

This is setting up a strawman or superficial argument only to be able to shoot it down.

I think the time spent adjusting the bar in a division is better spent getting to the skeleton of a possible solution. The plan that you wanted written down has you working on revising Boxing. When that is in place, we might be in an even better position to work on the boxing-to-formatting step needed in character-based printing.

This is clearly wrong because expr needs to be boxed first. Calls to SymPy's formatting routines are triggered from formatting boxes.

This is setting up a strawman or superficial argument only to be able to shoot it down.

Please do not take this the wrong way: the example was just a way to show some of the challenges in hooking up sympy.pretty: even at the level of symbols, some preprocessing must be done. Probably what is easier to hook into is sympy.printing.pretty.stringpict.prettyForm, which is quite analogous to what I put in pane_text.

sympy.pretty works using a sympy.printing.pretty.PrettyPrinter object that does something similar to what I did in mathics.format.prettyprint.
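
As a small illustration of the prettyForm building block mentioned here, the sketch below composes a stacked fraction directly from two prettyForm objects, without going through sympy.pretty or any SymPy expression. It relies only on the prettyForm constructor and its division operator; the variable names are illustrative, and which bar character appears depends on SymPy's global Unicode setting at the time of the call.

```python
from sympy.printing.pretty.stringpict import prettyForm

# Each prettyForm wraps a block of text; here, single-line atoms.
numerator = prettyForm("a")
denominator = prettyForm("b")

# Division stacks the two blocks around a horizontal bar.
stacked = numerator / denominator

# str() joins the rows of the resulting picture with newlines.
rendered = str(stacked)
print(rendered)
```

This is the level at which a formatter could plug in: it assembles text blocks, and the caller decides what strings go into them.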

I think the time spent adjusting the bar in a division is better spent getting to the skeleton of a possible solution. The plan that you wanted written down has you working on revising Boxing. When that is in place, we might be in an even better position to work on the boxing-to-formatting step needed in character-based printing.

OK, but the skeleton of the solution that I propose is already here: When an expression is wrapped by OutputForm (even if $Use2DOutputForm is set to False) the formatting process follows a sequence closer to the one (I think) WMA follows.

In any case, I wanted to put this here, because I will need it to present my case when I propose the other changes.

Please do not take this the wrong way: the example was just a way to show some of the challenges in hooking up sympy.pretty: even at the level of symbols, some preprocessing must be done.

I was aware of the challenges and the process well before you started this PR. "preprocessing" is the wrong word/concept. In the formatting process, objects like SymPy symbols get transformed to strings.

Probably what is easier to hook into is sympy.printing.pretty.stringpict.prettyForm, which is quite analogous to what I put in pane_text.

sympy.pretty works using a sympy.printing.pretty.PrettyPrinter object that does something similar to what I did in mathics.format.prettyprint.

Except more effort, time, and thought was probably put into the sympy.printing.pretty.PrettyPrinter object. It may be that it is more elucidating for you to write some code so you understand the basic concepts rather than look at someone else's code. For me though this kind of thing is more of a distraction and possibly a dangerous activity, because I get the feeling that if I don't mention something, this kind of thing will get into the code base and then we'll want to remove it later on. We have seen this kind of thing too often. Furthermore, we haven't dug out of the previous messes fully yet.

OK, but the skeleton of the solution that I propose is already here: When an expression is wrapped by OutputForm (even if $Use2DOutputForm is set to False) the formatting process follows a sequence closer to the one (I think) WMA follows.

There are probably very many situations that aren't covered and haven't been considered. And in the first few things I tried, I saw differences.

BTW, I don't like the term "2D". This is character-based output. Most output, such as SVG, MathML, and LaTeX, is 2D.

In any case, I wanted to put this here, because I will need it to present my case when I propose the other changes.

Personally, I would prefer that you discuss the changes you want to propose at a high level first. If I need detailed code to understand, then we can code this out. If you need to write some sample code for yourself, sure, do that. But unless others express interest in seeing this, these branches don't help me in a positive way. Rather, it feels negative because I see flailing about where it feels to me there shouldn't be flailing like this.

In my view, the developer docs describe in pretty good detail how the system transforms M-expressions to boxed-expressions, and then to formatted output.

OK, but for naming functions, it would be a little bit long, wouldn't it? Here I didn't want to call it "OutputForm", because everywhere else "OutputForm" is still "one-dimensional using only keyboard characters", which most of the time is the same as InputForm, but with spaces between infix operators and operands.

In any case, the question is: supposing we find and agree on an implementation of this "two-dimensional using only keyboard characters" output, is it something that we want to have available (at least optionally) in the command-line interface?

I guess what I was trying to say is that if you have to drop something in the description, dropping "character-based" is bad, in the same way that condensing "strawberry" to "straw" rather than "berry" is not helpful.
"Character2D" is not too long. But the word "character" is as important as 2D.

In any case, the question is: supposing we find and agree on an implementation of this "two-dimensional using only keyboard characters" output, is it something that we want to have available (at least optionally) in the command-line interface?

It feels to me that we are thinking about this the wrong way. To me, this is like asking if TeXForm should be in the command-line interface. And whether we should be able to return MathML formatted output in the command-line interface.

To me, the focus of the implementation should be on how things are boxed and formatted in a generic and general way, not on what is appropriate for a particular front end. The current implementation is lacking in the generic and general nature. This was even more evident when the code was in mathics.core, where 2D character-based formatting is totally inappropriate.

Here is another example from our code base. We have these whacky and complicated regular expressions for handling doctests inside of docstrings. I imagine the person that started this may have thought it cool to recreate sphinx using regular expressions. Those regular expressions use some pretty advanced and little-used tagging mechanisms. If you want to show off how clever you can code, great. But as far as handling the underlying problem in a uniform, maintainable, and comprehensible way, this code totally fails.

Let's defer character-based 2D output formatting until after we have Boxing under control. Formatting is intimately tied to Boxing. And if we have a good Boxing mechanism, I think you'll see how easily character-based 2D formatting falls out from that.
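
The separation argued for here, one boxed structure with character-based output as just one formatter among several, can be sketched as follows. Everything in this block is hypothetical: SuperscriptSketch, the formatter functions, and the FORMATTERS table are stand-ins invented for illustration and are not Mathics classes or APIs. The sketch also handles only single-line operands.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SuperscriptSketch:
    """Hypothetical stand-in for a boxed power; not a Mathics class."""
    base: str
    exponent: str


def to_latex(box: SuperscriptSketch) -> str:
    # LaTeX target: the exponent goes inside braces after a caret.
    return f"{box.base}^{{{box.exponent}}}"


def to_characters(box: SuperscriptSketch) -> str:
    # Character target: exponent one row up, shifted right past the base.
    top_row = " " * len(box.base) + box.exponent
    return top_row + "\n" + box.base


# One box type, several output targets chosen after boxing.
FORMATTERS: Dict[str, Callable[[SuperscriptSketch], str]] = {
    "latex": to_latex,
    "characters": to_characters,
}


def format_box(box: SuperscriptSketch, target: str) -> str:
    """Dispatch to the formatter for the requested target."""
    return FORMATTERS[target](box)


power = SuperscriptSketch("a", "q/2")
print(format_box(power, "latex"))
print(format_box(power, "characters"))
```

The front end only picks the target name; the box itself carries no knowledge of how it will be displayed.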

In any case, the question is: supposing we find and agree on an implementation of this "two-dimensional using only keyboard characters" output, is it something that we want to have available (at least optionally) in the command-line interface?

It feels to me that we are thinking about this the wrong way. To me, this is like asking if TeXForm should be in the command-line interface. And whether we should be able to return MathML formatted output in the command-line interface.

TeXForm is useful, and needed at least to generate the PDF documentation. MathML is used in the Django front-end. PrettyPrint is something "pretty", but implementing and maintaining it may be a waste of time. For example, we probably do not want to use it to check doctests, because writing the expected results would be awkward.

To me, the focus of the implementation should be on how things are boxed and formatted in a generic and general way, not on what is appropriate for a particular front end. The current implementation is lacking in the generic and general nature. This was even more evident when the code was in mathics.core, where 2D character-based formatting is totally inappropriate.

Yes, but at least for me, it helps me to think about why WMA implements boxing as it does. That is the only reason I wrote this and put it here.
I am now opening another PR, in which all the Character2D code is stripped out.

Here is another example from our code base. We have these whacky and complicated regular expressions for handling doctests inside of docstrings. I imagine the person that started this may have thought it cool to recreate sphinx using regular expressions. Those regular expressions use some pretty advanced and little-used tagging mechanisms. If you want to show off how clever you can code, great. But as far as handling the underlying problem in a uniform, maintainable, and comprehensible way, this code totally fails.

OK

Let's defer character-based 2D output formatting until after we have Boxing under control. Formatting is intimately tied to Boxing. And if we have a good Boxing mechanism, I think you'll see how easily character-based 2D formatting falls out from that.

Sure. In any case, if you are OK with it, I will leave this here for a while.

mathics/format/pane_text.py

mmatera added 14 commits January 5, 2023 21:11

not ready

8e42c96

second round

d990b15

tmp

2b51c37

another little step

9595b07

merge

161705b

partial

14889d7

* Improving precedence comparison.

331c9a7

* Improving handling for Infix, Prefix and Postfix format

partial

03dc038

rocky's observations

c37afe7

2d_print support

bee14ae

working

845fb99

mypy

760e04e

Merge branch 'master' into 2d_print

19c9904

removing testing function

a591742

Merge branch 'master' into 2d_print

c5faf8f

mmatera added 3 commits November 9, 2024 10:40

moving and renaming modules

1cd0bc8

fixing formatting in fractions and square roots

baa276e

Merge branch 'master' into 2d_print

f91168f

rocky reviewed Nov 9, 2024

mmatera commented Nov 9, 2024

mathics/format/pane_text.py

Merge branch 'master' into 2d_print

af8e4c2

This was referenced Nov 10, 2024

Pretty print #1162

Draft

Prettyform mmatera/sympy#1

Closed

Improving sympy.printing.pretty.stringpict API sympy/sympy#27257

Open


mmatera commented Nov 5, 2024

rocky commented Nov 6, 2024 (edited)

mmatera commented Nov 6, 2024

rocky commented Nov 6, 2024 (edited)

rocky commented Nov 6, 2024

rocky commented Nov 6, 2024 (edited)

mmatera commented Nov 7, 2024

rocky Nov 9, 2024 (edited)

mmatera Nov 9, 2024

rocky Nov 9, 2024 (edited)

mmatera Nov 9, 2024

rocky Nov 9, 2024 (edited)

mmatera Nov 10, 2024

mmatera Nov 10, 2024

rocky Nov 10, 2024

rocky Nov 10, 2024 (edited)

mmatera Nov 11, 2024
