Print matrices using unicode large delimiters (on demand) #18270

gagern · 2015-04-21T16:53:55Z

Unicode provides nice symbols which can be combined to form large delimiters. It would be nice if we could print matrices using these, instead of the same ASCII brackets on all the lines. So instead of

[1 2|3]
[4 5|6]
[7 8|9]

I'd like to see one of these:

⎛1 2│3⎞    ⎡1 2│3⎤
⎜4 5│6⎟ or ⎢4 5│6⎥
⎝7 8│9⎠    ⎣7 8│9⎦

Perhaps it's best to do this in small increments: start with a keyword argument to the str method of matrices, then later on make this the default. So the ticket here is only for on-demand support of this feature, not for its automatic use by default. For later reference, see #14733 which switched the banner to Unicode, and thereby made the choice that Sage may look broken on non-Unicode terminals.

Component: user interface

Keywords: unicode matrix

Author: Martin von Gagern

Branch/Commit: 6b6f089

Reviewer: Volker Braun

Issue created by migration from https://trac.sagemath.org/ticket/18270

The text was updated successfully, but these errors were encountered:

gagern · 2015-04-21T16:56:12Z

Branch: u/gagern/MatrixUnicodeDelimiters

gagern · 2015-04-21T16:56:46Z

New commits:

`e174830`	`Trac #18270: Introduce unicode_symbols keyword argument to Matrix.str`

gagern · 2015-04-21T16:56:46Z

Commit: e174830

videlec · 2015-04-21T17:14:13Z

comment:3

This banner is already a mess. If I ssh + screen + sage at my laboratory, then I got a ugly

�����������������������������������

And then I am not able to see anything else (but still inputting Sage commands work).

The new matrices look nice though. For the on-demand feature it would be nice to have a global flag allowing (or avoiding) unicode

$ sage --with-no-unicode

Vincent

videlec · 2015-04-21T17:14:46Z

comment:4

Why not only unicode for the argument name?

gagern · 2015-04-21T18:15:16Z

comment:5

Replying to @videlec:

Why not only unicode for the argument name?

Because in Python 2 unicode names a type as well, and I feared that at some point we might want to convert element types to unicode, thus have to call that coercion function.

gagern · 2015-04-22T00:00:31Z

comment:6

Replying to @videlec:

This banner is already a mess. If I ssh + screen + sage at my laboratory, …

Perhaps you should file that as a bug, so it can be addressed? Does using screen -U help? What does locale print on your client's terminal, inside the ssh and inside screen respectively?

For the on-demand feature

I meant “on-demand” as opposed to “automatic”, with the demand being expressed for each matrix that gets printed, i.e. by using the keyword argument from my commit. I believe you're talking about something far more automatic here.

it would be nice to have a global flag allowing (or avoiding) unicode

The canonical way, at least on Linux, would be to inspect LC_CTYPE facet of the current locale, and detect whether that refers to UTF8 or not. If one used this to automatically choose a sane default, then setting the LC_CTYPE environment variable manually would serve the same function as the switch you suggest:

$ LC_CTYPE=en_US      sage  # without unicode
$ LC_CTYPE=en_US.utf8 sage  # with unicode

Of course, using en_US in this example is just the common default. I guess you as well as I might be using a different setting there in practice. Adding a switch might still make sense to add visibility to this feature. But all of this should probably be discussed in a follow-up ticket once we get the change here accepted.

videlec · 2015-04-22T06:04:15Z

comment:7

Replying to @gagern:

Replying to @videlec:

This banner is already a mess. If I ssh + screen + sage at my laboratory, …

Perhaps you should file that as a bug, so it can be addressed? Does using screen -U help? What does locale print on your client's terminal, inside the ssh and inside screen respectively?

Indeed, screen -U solves the problem. My locale on the client is

LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
...

and on the remote is

LANG=C
LANGUAGE=fr_FR:
LC_CTYPE="C"
...

For the on-demand feature

I meant “on-demand” as opposed to “automatic”, with the demand being expressed for each matrix that gets printed, i.e. by using the keyword argument from my commit. I believe you're talking about something far more automatic here.

it would be nice to have a global flag allowing (or avoiding) unicode

The canonical way, at least on Linux, would be to inspect LC_CTYPE facet of the current locale, and detect whether that refers to UTF8 or not. If one used this to automatically choose a sane default, then setting the LC_CTYPE environment variable manually would serve the same function as the switch you suggest:
$ LC_CTYPE=en_US      sage  # without unicode
$ LC_CTYPE=en_US.utf8 sage  # with unicode
Of course, using en_US in this example is just the common default. I guess you as well as I might be using a different setting there in practice. Adding a switch might still make sense to add visibility to this feature. But all of this should probably be discussed in a follow-up ticket once we get the change here accepted.

I like your suggestion very much.

Vincent

videlec · 2015-04-22T06:51:43Z

comment:8

review comment:

I would very much prefer that you avoid replace as well as the final step which deals with the top and bottom line. You should rather define all the characters needed as variables at the begining (as you did for left_bracket, right_bracket, etc). That would help for readability and if at some point we decide that all these characters are arguments of the function.
why not curly bracket?

Vincent

gagern · 2015-04-22T07:09:27Z

comment:9

Replying to @videlec:

review comment:

I would very much prefer that you avoid replace as well as the final step which deals with the top and bottom line. You should rather define all the characters needed as variables at the begining (as you did for left_bracket, right_bracket, etc). That would help for readability and if at some point we decide that all these characters are arguments of the function.

Indeed when I started this patch, I had a line like

 top_left_bracket, mid_left_bracket, … = u"⎡⎢⎣⎤⎥⎦"

but the long names made things very hard to read. And shorter alternatives, like tlb, make things a bit hard to understand and therefore harder to maintain. Quite the opposite of “help for readability”, in my opinion. But if you insist, I can go with this approach, using the short names.

Keep in mind that the left_bracket and right_bracket strings, which have been there before, are not only representing the brackets, but also collecting the vertical lines at the ends.

Plugging the delimiting brackets in the right rows is a non-trivial affair, and would therefore in my opinion be far harder to maintain. The possibility of an arbitrary number of horizontal lines at either end complicates things considerably. I haven't been able to come up with a reasonably simple code snippet to achieve correct symbols automatically while building these strings. So unless you absolutely insist on this point, or can give a good reason on why my approach has very undesirable reprecussions, or can suggest some formulation, I'd rather stick with my approach. Or adjust it in such a way that it assembles the matrix body without any brackets, and then adds all the bracket parts in a single loop instead of replacing existing incorrect parts.

why not curly bracket?

Because I don't encounter them in my day-to-day work. Can you give me a hint as to where these might occur? For an even number of rows (except two), big curly brackets would look slightly unsymmetric, but I doubt that should be a concern.

videlec · 2015-04-22T07:37:13Z

comment:10

Replying to @gagern:

Replying to @videlec:

review comment:

I would very much prefer that you avoid replace as well as the final step which deals with the top and bottom line. You should rather define all the characters needed as variables at the begining (as you did for left_bracket, right_bracket, etc). That would help for readability and if at some point we decide that all these characters are arguments of the function.

Indeed when I started this patch, I had a line like
 top_left_bracket, mid_left_bracket, … = u"⎡⎢⎣⎤⎥⎦"
but the long names made things very hard to read. And shorter alternatives, like tlb, make things a bit hard to understand and therefore harder to maintain. Quite the opposite of “help for readability”, in my opinion. But if you insist, I can go with this approach, using the short names.

If there are comments it is fine. For example

tlb = u"⎡"    # top left bracket
...

or

# we set delimiters as a string composed of 6 characters
#   - top left bracket (tlb)
#   ...
if unicode_symbols:
    delimiters = u"⎡⎢⎣⎤⎥⎦"
else:
    delimiters = "[[[]]]"
tlb, ... = delimiters

why not curly bracket?

Because I don't encounter them in my day-to-day work. Can you give me a hint as to where these might occur? For an even number of rows (except two), big curly brackets would look slightly unsymmetric, but I doubt that should be a concern.

By "curly", I meant only using ⎧ ⎫ ⎩ ⎭ instead of your more angular version. The term was probably wrong. See

sage: m = matrix([[2,1],[0,1]])
sage: view(m)

... no angle...

gagern · 2015-04-22T10:44:04Z

comment:11

Replying to @videlec:

If there are comments it is fine. For example …

OK, I can do that. It would be 11 characters: the six you used, plus two for single-row matrices, plus three line-drawing characters. The latter could be a separate string if you prefer. I'd make it one string because I'm lazy, but that's a weak argument.

By "curly", I meant only using ⎧ ⎫ ⎩ ⎭ instead of your more angular version.

Ah, round parentheses. Personally, I much prefer these round ones as well. I had the impression that in the USA especially, the square brackets are somewhat more common. But that impression (based mostly on Wikipedia) may be wrong, and in any case the comparison with the LaTeX rendering is a strong motivations to make things consistent and satisfy my personal preference at the same time. We might of course let the user choose between these alternatives, by allowing a string like "round" or "square" instead of True for the keyword argument.

vbraun · 2015-05-03T19:12:34Z

comment:12

I've made #18357: Unicode Art.

IMHO its

British: round bracket ()
American: parentheses ()
square bracket []
curly bracket {}
I also prefer round for matrices. Could you also change it to unicode=<bool> instead of unicode_symbol (shorter and clearer).

vbraun · 2015-05-03T22:28:42Z

comment:13

Fixed:

British: round bracket ()
American: parentheses ()
square bracket []
curly brace {}

sagetrac-git · 2015-05-04T20:18:09Z

Branch pushed to git repo; I updated commit sha1. New commits:

`e4e5f4b`	`Improved unicode delimiters for matrices, allow choosing the shape`

sagetrac-git · 2015-05-04T20:18:09Z

Changed commit from e174830 to e4e5f4b

sagetrac-git · 2015-05-04T20:25:49Z

Changed commit from e4e5f4b to 206b036

sagetrac-git · 2015-05-04T20:25:49Z

Branch pushed to git repo; I updated commit sha1. New commits:

`206b036`	`Format link to trac ticket`

vbraun · 2015-05-05T00:30:56Z

Reviewer: Volker Braun

vbraun · 2015-05-05T00:30:56Z

comment:17

Looks great!

I would prefer unicodedata.lookup('RIGHT SQUARE BRACKET EXTENSION') over ⎢, pasting the unicode in makes it hard to review that the correct code point is used. I'll refactor that in #18357 if you don't mind.

vbraun · 2015-05-05T08:30:18Z

comment:18

PDF docs don't build:

! Package inputenc Error: Unicode char \u8:⎛ not set up for use with LaTeX.

See the inputenc package documentation for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.7663 \PYG{g+go}{⎛1 2\textbar{}3⎞}
                                       
? 
! Emergency stop.
 ...                                              
                                                  
l.7663 \PYG{g+go}{⎛1 2\textbar{}3⎞}

vbraun · 2015-05-05T08:35:14Z

comment:19

You just have to add a suitable DeclareUnicodeCharacter to src/doc/common/conf.py

sagetrac-git · 2015-05-05T20:01:38Z

Branch pushed to git repo; I updated commit sha1. New commits:

`6b6f089`	`TeX rendering for unicode big delimiters`

sagetrac-git · 2015-05-05T20:01:38Z

Changed commit from 206b036 to 6b6f089

gagern · 2015-05-05T20:30:34Z

comment:21

This is my attempt to make things look reasonably useful, even though the vertical spacing and placement is far from optimal. But seeing how many other unicode symbols are represented with crude ASCII work-arounds, I think the amount of effort I put into this should be at least on a similar level.

I just filed #18370 about better unicode support by switching to XeTeX or luaTeX. I have no experience with these, but I've read on several occasions that they offer far superior Unicode support. In this sense, I hope that my unicode symbol declarations will be a temporary workaround, although I have no idea just how temporary.

vbraun · 2015-05-05T20:36:13Z

comment:22

Sounds good. The main point of running latex on the docs is to validate the markup, nobody is going to print out the pdf.

vbraun · 2015-05-06T21:03:24Z

Changed branch from u/gagern/MatrixUnicodeDelimiters to 6b6f089

gagern mannequin added this to the sage-6.7 milestone Apr 21, 2015

gagern mannequin added c: user interface labels Apr 21, 2015

gagern mannequin added the s: needs review label Apr 21, 2015

videlec added s: needs work and removed s: needs review labels Apr 22, 2015

This comment has been minimized.

Sign in to view

gagern mannequin added s: needs review and removed s: needs work labels May 4, 2015

vbraun added s: positive review and removed s: needs review labels May 5, 2015

vbraun added s: needs work and removed s: positive review labels May 5, 2015

gagern mannequin added s: needs review and removed s: needs work labels May 5, 2015

vbraun added s: positive review and removed s: needs review labels May 5, 2015

vbraun removed the s: positive review label May 6, 2015

vbraun closed this as completed in 8c347e2 May 6, 2015

gagern mannequin mentioned this issue Aug 25, 2023

Unicode support in reference manual and other docs, e.g. developer manual #18370

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Print matrices using unicode large delimiters (on demand) #18270

Print matrices using unicode large delimiters (on demand) #18270

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

videlec commented Apr 21, 2015

videlec commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

gagern mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

gagern mannequin commented Apr 22, 2015

vbraun commented May 3, 2015

vbraun commented May 3, 2015

sagetrac-git mannequin commented May 4, 2015

sagetrac-git mannequin commented May 4, 2015

This comment has been minimized.

sagetrac-git mannequin commented May 4, 2015

sagetrac-git mannequin commented May 4, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

sagetrac-git mannequin commented May 5, 2015

sagetrac-git mannequin commented May 5, 2015

gagern mannequin commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 6, 2015

Print matrices using unicode large delimiters (on demand) #18270

Print matrices using unicode large delimiters (on demand) #18270

Comments

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

videlec commented Apr 21, 2015

videlec commented Apr 21, 2015

gagern mannequin commented Apr 21, 2015

gagern mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

videlec commented Apr 22, 2015

gagern mannequin commented Apr 22, 2015

videlec commented Apr 22, 2015

gagern mannequin commented Apr 22, 2015

vbraun commented May 3, 2015

vbraun commented May 3, 2015

sagetrac-git mannequin commented May 4, 2015

sagetrac-git mannequin commented May 4, 2015

This comment has been minimized.

sagetrac-git mannequin commented May 4, 2015

sagetrac-git mannequin commented May 4, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 5, 2015

sagetrac-git mannequin commented May 5, 2015

sagetrac-git mannequin commented May 5, 2015

gagern mannequin commented May 5, 2015

vbraun commented May 5, 2015

vbraun commented May 6, 2015