Math on GitHub: Following Up

[+] simonw|3 years ago|reply

Just spotted this on the GitHub blog, posted 33 minutes ago: https://github.blog/changelog/2022-06-28-fenced-block-syntax...

> Users can now delineate mathematical expressions using ```math fenced code block syntax in addition to the already supported delimiters.

[+] infogulch|3 years ago|reply

I really don't like overloading ```X as a way to render the output of running the code, because now you have no way to display a block of raw mermaid or math code which is the whole point of ``` in the first place. I would be much happier if they introduced a syntax variation to enable this feature more generally instead of overloading the purpose of backticks, something like ```!X (```!math or ```!mermaid) for "run the contents as X and render the output in the document".
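To make the proposal concrete, here is a minimal Python sketch of how a renderer might tell the two fence forms apart. The `!` prefix is the hypothetical syntax suggested in this comment, not anything GitHub supports, and `classify_fences` is a name made up for illustration.

```python
import re

TICKS = "`" * 3  # fence marker, built here so the example nests safely in Markdown

# Hypothetical convention: an info string starting with "!" means
# "run the contents and render the output"; without it, show raw source.
FENCE = re.compile(
    r"^" + TICKS + r"(?P<bang>!?)(?P<lang>[\w-]*)[ \t]*\n"
    r"(?P<body>.*?)"
    r"^" + TICKS + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def classify_fences(markdown):
    """Return a (lang, mode, body) triple for each fenced block."""
    blocks = []
    for m in FENCE.finditer(markdown):
        mode = "render" if m.group("bang") else "raw"
        blocks.append((m.group("lang"), mode, m.group("body")))
    return blocks
```

With this split, a plain `math` fence would keep its original meaning (display the source), and only the `!math` form would opt in to rendering.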

[+] xeonmc|3 years ago|reply

I wonder if the initial choice of $ and $$ syntax was to keep the syntax identical to VSCode's built-in KaTeX rendering for Markdown preview? You'd think that they could just port over the VSCode implementation directly, which already works quite well without most of the inconsistency issues in the website's MathJax counterpart, especially considering that VSCode is their own Electron product too.

[+] nschloe|3 years ago|reply

This is great news! It's not working for me quite yet, but I'm refreshing my browser every 5 minutes. Next stop: Proper inline math.

[+] red_admiral|3 years ago|reply

This sounds like a solved problem in theory, but someone has to fix the parser.

The CommonMark spec defines left- and right-flanking delimiters, and provides a reference implementation that can get this right for backticks to represent inline code. Doing the same with dollar signs, and treating the content delimited by them (like the content delimited by backticks) as not to be processed further as Markdown, should handle most use cases.

Admittedly, the phrase "an apple costs $1 but an orange is 2$ for some reason" would be incorrectly recognised as math, but only because you're switching how you place the dollar signs halfway through.

Not recognising $[a+b](c+d)$ because the brackets are also link syntax sounds like the parsing operations are done in the wrong order. The $ comes first, so it should win.

I've used jupyter-book to typeset math-in-markdown before now, and it gets this just right, showing that it's a solved problem in practice too unless I'm missing something.
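A rough Python sketch of the "math first" ordering described above. The flanking rule here is an assumption (opening `$` followed by a non-space character, closing `$` preceded by one), and `tokenize_inline` is a made-up name; it is not how GitHub or jupyter-book actually parse.

```python
import re

# Assumed flanking rule, by analogy with backtick code spans: the opening "$"
# is followed by a non-space character, the closing "$" is preceded by one.
MATH_SPAN = re.compile(r"\$(?=\S)(.+?)(?<=\S)\$")
LINK = re.compile(r"\[[^\]]*\]\([^)]*\)")

def _text_and_links(segment):
    """Tokenize a non-math segment into text and link tokens."""
    tokens, pos = [], 0
    for m in LINK.finditer(segment):
        if m.start() > pos:
            tokens.append(("text", segment[pos:m.start()]))
        tokens.append(("link", m.group(0)))
        pos = m.end()
    if pos < len(segment):
        tokens.append(("text", segment[pos:]))
    return tokens

def tokenize_inline(text):
    """Match math spans first; their content is never scanned for links."""
    tokens, pos = [], 0
    for m in MATH_SPAN.finditer(text):
        tokens.extend(_text_and_links(text[pos:m.start()]))
        tokens.append(("math", m.group(1)))
        pos = m.end()
    tokens.extend(_text_and_links(text[pos:]))
    return tokens
```

Under this sketch `$[a+b](c+d)$` comes out as a single math token because the `$` wins, while the apple/orange sentence is still picked up as math, as conceded above.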

[+] nschloe|3 years ago|reply

I agree: If you're choosing dollar signs for your math delimiters, you can't expect them to work as regular dollar signs anymore; just like backticks.

> sounds like the parsing operations are done in the wrong order.

Indeed!

> I've used jupyter-book to typeset math-in-markdown before now,

They're using $-math as well?

[+] prash_ant|3 years ago|reply

The `$...$' and `$$...$$' delimiters are used in TeX. LaTeX uses the superior delimiters `\(...\)' and `\[...\]'.

All the MathJax and KaTeX related markdown for math should use the LaTeX delimiters and avoid the TeX delimiters.

For more info see https://docs.mathjax.org/en/v2.5-latest/tex.html#tex-and-lat...

[+] chrismorgan|3 years ago|reply

\(…\) and \[…\] would not be without problems in Markdown either, as backslash is used for escaping, and there are situations where square brackets and parentheses require escaping.

As a simple example, if you want to write something in literal square brackets, you can normally just write […], but if there’s a link target with a matching name or you’re in an environment that may introduce a link target with a matching name (e.g. rustdoc will see [Foo] and try to find an item to link it to), you may choose to write \[…\] instead. You only need to escape one of the square brackets, but you may well have escaped both.
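To illustrate the clash, here is a small Python sketch of the backslash-escape step, assuming the CommonMark rule that a backslash may escape any ASCII punctuation character. The `unescape` helper is made up for illustration and ignores code spans and other contexts where escapes do not apply.

```python
import re
import string

# CommonMark allows a backslash to escape any ASCII punctuation character;
# string.punctuation is used here as that character set.
ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")

def unescape(markdown_text):
    """Resolve backslash escapes the way a Markdown inline pass would."""
    return ESCAPE.sub(r"\1", markdown_text)
```

If a step like this runs before any math pass, backslash-bracket delimiters have already been reduced to plain brackets, so a later stage cannot tell math from literal bracketed text.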

[+] chaoxu|3 years ago|reply

I understand this is useful for people who don't type math regularly, but as someone who writes a huge amount of math, `$...$` is so much less friction than the LaTeX one, `\(...\)`.

Even in LaTeX, I mix and match: `$...$` for inline math and `\[...\]` for display math, and it works well for two reasons: inline math is generally short and less prone to mistakes, so `$` saves a lot of time.

[+] idianal|3 years ago|reply

A GitHub bug I recently noticed that seems related:

Expected: When a repo's readme is named `README` (without the `.md` suffix), it is rendered as plain text. When a repo's readme is named `README.md`, it is rendered as Markdown.

Actual: When a repo's readme is named `README` (without the `.md` suffix), the presence of `$` causes parts of the file to be rendered as math. For a real-life example, see the readme in https://github.com/idianal/personal-site.

Can someone please point me to where I can submit a bug report/issue for this?

[+] tmm|3 years ago|reply

There's a second bug related to this. If you click on any file in your repo, then click the browser's back button, instead of parts of your README rendered as math, you get a red box with 'Unable to render expression'. Neat.

[+] colejohnson66|3 years ago|reply

They make it hard to find by pointing you to community support, but you can open a support (or bug) ticket here: https://support.github.com/request

[+] cdubzzz|3 years ago|reply

Good timing! https://github.blog/changelog/2022-06-28-fenced-block-syntax...

[+] nschloe|3 years ago|reply

In May, GitHub added native math support. Unfortunately, it left lots to be desired. Six weeks later, I'm taking another look. Have the major issues been fixed? (Spoiler: No.)

[+] ptsneves|3 years ago|reply

I have been using gist for my Spivak calculus problems. Often I write them down in a notebook and then transcribe them to the $<latex>$ format. Along with the other markdown features it has been enough.

Plain LaTeX has a steep learning curve and is not friendly for mobile typing. Plain text is hard to read and hinders the actual math understanding. Therefore, I use gist markdown.

[+] cpp_frog|3 years ago|reply

A few weeks ago I discovered Franklin.jl ([0], [1]), a static site generator that lets you render math easily, and I chose it because the alternatives either took too long to appear or are slow or take too many steps to set up. It has direct KaTeX support and I've been pleased with the results. There is no need for adding or tweaking things unlike Jekyll or Hugo. And KaTeX is faster than MathJax in general.

[0] https://0x0f0f0f.github.io/blog/newblog/

[1] https://franklinjl.org/

[+] throwaway71271|3 years ago|reply

this feature is super garbage: `this costs 5$ but that is 8$, what do you think` is now a math expression with all kinds of broken rendering

i had to go and fix like 20 markdown files

how someone can think this is good syntax is beyond me.

[+] nschloe|3 years ago|reply

Well, it _is_ (La)TeX syntax. With the exception that there, if you want a dollar sign, you have to type \$. But yeah, the syntax isn't made for Markdown.

[+] gbraad|3 years ago|reply

I agree with backticks/code blocks being the solution. They also do that for mermaid, so this would be consistent.

[+] bachmeier|3 years ago|reply

> GitHub’s current choice of syntax goes against the Markdown grain.

I don't see that at all. It's pretty common. Far more common IME than all other choices combined. The problems are caused by parsing the markdown first, then slapping on math support to what's left, and of course that doesn't work.

This really is an easy problem to solve (and has been for ages). Handle the math sections first and make math support opt-in. The GitLab syntax is great unless you want to do something other than have GitLab render your markdown files. That's not much of a solution.

[+] woodruffw|3 years ago|reply

> Handle the math sections first and make math support opt-in.

Unless I'm misunderstanding what you mean, this isn't possible in a backwards-compatible manner: It's perfectly reasonable for pre-existing Markdown to have content like "Bob pays $1 for bananas, Susan pays $2," which would mis-render a normal sentence as if it had inline math content.

The blog post author's proposal is the most reasonable one: it's both backwards-compatible (the triple-tick "math" group was not already defined or, if it was, a new unique identifier could be used) and doesn't produce any ambiguities with other parts of the Markdown grammar or non-semantic text. Finally, it avoids parser composition, which is a source of all kinds of nasty differential bugs.

[+] tendstofortytwo|3 years ago|reply

I wonder how pandoc[1] does this. You can convert Markdown to PDF (`pandoc test.md -o test.pdf`), it uses the same syntax as GitHub ($ signs only, no backticks) and it fares a lot better than GitHub in a few of the tests outlined in the article[2]. It's not perfect but clearly something better can be done.

[1]: https://pandoc.org/

[2]: https://nsood.in/hn-latex/test.pdf

[+] nschloe|3 years ago|reply

Probably pandoc protects whatever is inside $...$ or $$...$$. GitHub doesn't. I would be curious to know how pandoc handles the other failing cases.
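For reference, pandoc's manual describes the rule for its `tex_math_dollars` extension: the opening `$` must have a non-space character immediately to its right, and the closing `$` must have a non-space character immediately to its left and must not be followed immediately by a digit. The Python sketch below approximates that rule with a regular expression; it is an illustration under that assumption, not pandoc's actual parser, and `find_inline_math` is a made-up name.

```python
import re

# Approximation of the documented rule:
#   opening "$": non-space character immediately to its right
#   closing "$": non-space character immediately to its left,
#                and not immediately followed by a digit
# A backslash-escaped "\$" is never treated as a delimiter.
INLINE_MATH = re.compile(r"(?<!\\)\$(?=\S)((?:[^$\\]|\\.)+?)(?<=\S)\$(?!\d)")

def find_inline_math(text):
    """Return the source of each inline math span found in text."""
    return [m.group(1) for m in INLINE_MATH.finditer(text)]
```

Under this rule the price sentences quoted elsewhere in the thread ("Bob pays $1 for bananas, Susan pays $2," and "this costs 5$ but that is 8$") produce no math spans, while `$E = mc^2$` does.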

[+] tpoacher|3 years ago|reply

My 2c:

Anki got it right with

    [$] ... [/$]

    [$$] ... [/$$]

I often change the mathjax defaults to these in my own documents. Never ever had a clash.

Worst case scenario? If you need to type a literal "[$]", insert an empty span or something.

No need to mess with "backticks vs no backticks" semantics at all.
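A minimal Python sketch of why these delimiters rarely clash: they can be found with a plain pattern match, and stray dollar signs are left alone. Converting to `\(...\)` and `\[...\]` is just this sketch's choice of output, and `anki_to_latex` is a made-up name.

```python
import re

# Anki-style delimiters as quoted above; DOTALL lets a block span lines.
DISPLAY = re.compile(r"\[\$\$\](.+?)\[/\$\$\]", re.DOTALL)
INLINE = re.compile(r"\[\$\](.+?)\[/\$\]", re.DOTALL)

def anki_to_latex(text):
    """Rewrite [$]...[/$] and [$$]...[/$$] into LaTeX-style delimiters."""
    # Lambdas avoid backslash handling in re.sub replacement strings.
    text = DISPLAY.sub(lambda m: r"\[" + m.group(1).strip() + r"\]", text)
    return INLINE.sub(lambda m: r"\(" + m.group(1).strip() + r"\)", text)
```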

[+] chrismorgan|3 years ago|reply

Markdown is just generally awful because it’s not designed to be extensible. And so people make a total hash of things like this when trying to add custom inline syntax (because it’s not possible to do it compatibly), and abuse preformatted code blocks to do something other than show code. (Seriously, if you make ```mermaid … ``` turn into a diagram, how am I supposed to show syntax-highlighted Mermaid code? Or ```math … ```, same deal. In this regard, I actually prefer the $$ … $$ GitHub have used, for all its problems.) Alternatives like reStructuredText and AsciiDoc are just worlds ahead in sanity. (Markdown’s HTML foundations don’t help, either.)

Markdown is a complete dead end.

I’ve been making a lightweight markup language of my own, and I thought long and hard about this kind of thing, with the goal of making something extremely consistent and easily parseable by human and machine alike. (All popular LMLs are surprisingly hard to parse correctly, so that text editors never have fully correct syntax highlighting unless they use something like LSP-backed highlighting with the real parser.) There’s a common problem with syntax extensions needing semantic understanding before you can actually parse their bodies. I’m using two different syntaxes for the bodies of what I’m calling macros (here shown without arguments to the macros, partly because I’m still not entirely satisfied with any of the syntaxes I’ve tried for them):

  @macro-name{interpreted-macro-body}
  @macro-name`raw-macro-body`

In the case of an interpreted macro body, it will be parsed fully (supporting both block and inline formatting) and fed to the macro; in the case of a raw body, it will be fed to the macro uninterpreted, just like with the `…` monospace code syntax. (If you wanted to pass a monospace code element as the body, that’d be @macro-name{`…`}. In the general context, the monospaced code syntax `…` is basically just shorthand for @code`…`, like **bold** can be shorthand for @bold{bold}.) This would lead to the shortest possible syntax for mathematics being @m`…`, which I think is acceptable, and much more syntactically robust.

If you needed backticks inside the body, you’d currently have to use @m{…} syntax, backslash-escaping any special syntax, because I haven’t come up with any satisfactory other syntax. Allowing delimiter repetition, like @m```…```, doesn’t solve all cases, as you can’t use the delimiter at the start or end of the value. This is a problem that most LMLs that go this way seem to ignore: I think there are some things that you genuinely can’t express in reStructuredText because of this, and others have awful syntactic hacks like backslash space being special. I’m contemplating @m#`…`# and @m#{…}# with an arbitrary but matching number of hashes, like Rust’s raw strings, but it’s still not as neat, so I could end up just leaving it at “use an interpreted body and escape everything”. All up, I think this raw/interpreted body distinction should work pretty well, and is sound.
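As a rough illustration of the raw/interpreted split, here is a Python sketch that recognises the two body forms. The grammar is guessed from the two examples above and is purely hypothetical: macro arguments, nested braces, escapes, and the hash-delimited variant are all left out, and `parse_macros` is a made-up name.

```python
import re

# Hypothetical grammar: "@name`raw body`" or "@name{interpreted body}".
# Nested braces and backslash escapes are deliberately not handled.
MACRO = re.compile(
    r"@(?P<name>[A-Za-z][\w-]*)"
    r"(?:`(?P<raw>[^`]*)`|\{(?P<interp>[^{}]*)\})"
)

def parse_macros(text):
    """Return a (name, kind, body) triple for each macro call in text."""
    calls = []
    for m in MACRO.finditer(text):
        if m.group("raw") is not None:
            # Raw bodies are handed to the macro untouched.
            calls.append((m.group("name"), "raw", m.group("raw")))
        else:
            # Interpreted bodies would be parsed as markup before the macro runs.
            calls.append((m.group("name"), "interpreted", m.group("interp")))
    return calls
```

The point of the sketch is that the parser can decide how to treat a body from the delimiter alone, without knowing what the macro means.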

[+] rhn_mk1|3 years ago|reply

Markdown is a dead end, and that's why I'm using it. There's only so much syntax you can put into plain text and remain readable more or less as text, as opposed to code.

I, too, would prefer some other language for complex expressions, although that's to keep Markdown simple rather than to obtain power.

I never want to decipher LaTeX in READMEs I read, unless it's a readme for a LaTeX library.

[+] nschloe|3 years ago|reply

An interesting take, thanks for the input! As a layman (pretty much), I had always frowned a bit upon reST since I never got used to its syntax. That was probably because Markdown was already so popular when I started using it.

One little remark: You can still have highlighted math code blocks in gh's Markdown. The lang here is ```latex. Anyway, I see this might not be satisfactory.

[+] WorldMaker|3 years ago|reply

reStructuredText is still useful to look at for inspiration here. It had the concepts of extensible metadata ("field lists"), spans ("interpreted text"), and blocks ("directives"). That included things like applying metadata to spans (essentially using footnotes to provide field lists to interpreted text sections, like, but better than, Markdown's reference style for hyperlinks, which almost no one uses but which was much more common in rST).

I still sometimes wonder whether reStructuredText, had it gained better acceptance outside of just the Python community, might have had a better run at being the "default" versus Markdown's quirkier approach.

https://docutils.sourceforge.io/rst.html

[+] stevejb|3 years ago|reply

It may be worth looking at org-mode syntax. Yes, org-mode is part of Emacs, but the syntax of the file seems to meet some of your requirements, namely parseable by human and machine alike.

[+] eimrine|3 years ago|reply

Is it possible to do all kinds of mathematical research, up to the ABC conjecture, using GH only for publishing and maybe getting some help from random folks who can also read math?
