Allow line and document termination to be equivalent at | when parsing pipe table rows #316

Open
criloz opened this issue Sep 10, 2024 · 3 comments

@criloz

criloz commented Sep 10, 2024

problem

Based on the behavior of the JS implementation, this is considered a table:

| lklk  | lklkk |

https://djot.net/playground/?text=%7C+lklk++%7C+lklkk+%7C%0A&sourcepos=false

But this is not:

| lklk  | lklkk 

https://djot.net/playground/?text=%7C+lklk++%7C+lklkk+%7C%0A&sourcepos=false

I am currently writing a stream parser for djot, and the table spec makes it unnecessarily complex to do so without buffering.

The parser has to wait until the line ends before it can process the whole line as either a table row or a paragraph, and weird things can happen with text such as | lklk *|* lklkk and | lklk *|* lklkk|
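To make the buffering cost concrete, here is a minimal Python sketch of the behavior described above. It assumes a simplified rule (a row must both start and end with an unescaped pipe) and ignores inline constructs such as verbatim spans. The names `is_table_row` and `BufferingClassifier` are hypothetical and not part of any djot implementation.

```python
def is_table_row(line: str) -> bool:
    """Simplified check: a row must start AND end with an unescaped pipe.

    Assumption: inline constructs (verbatim spans, emphasis) are ignored.
    """
    text = line.strip()
    if len(text) < 2 or not text.startswith("|"):
        return False
    # Count backslashes right before the last character; odd means escaped.
    body = text[:-1]
    backslashes = len(body) - len(body.rstrip("\\"))
    return text.endswith("|") and backslashes % 2 == 0


class BufferingClassifier:
    """Takes one character at a time; can only decide at the end of a line."""

    def __init__(self) -> None:
        self.buffer: list[str] = []

    def feed(self, ch: str):
        if ch != "\n":
            self.buffer.append(ch)  # no decision possible yet, hold the text
            return None
        line = "".join(self.buffer)
        self.buffer.clear()
        return ("row" if is_table_row(line) else "para", line)
```

Every call to `feed` returns `None` until the newline arrives, so nothing can be emitted downstream before the whole line has been stored.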

solution

Wouldn't it be better to treat line termination and document termination as valid row terminators, so that a line is parsed as a table row whenever it starts with |?

It does not seem to conflict with any other rule, and it would allow writing less complex, more secure, and faster parsers.
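As a rough illustration of the proposal, the sketch below decides the kind of a line from its first non-blank character and treats end of line and end of document alike. It is only a sketch under stated assumptions: `EagerClassifier` and its event tuples are hypothetical, and block context (for example a paragraph that continues across lines) is ignored.

```python
class EagerClassifier:
    """Hypothetical classifier for the proposed rule: a line whose first
    non-blank character is a pipe is a table row, however it ends."""

    def __init__(self) -> None:
        self.kind = None   # kind of the current line once decided
        self.events = []   # emitted ("start" | "end", kind) tuples

    def feed(self, ch: str) -> None:
        if ch == "\n":
            self._end_line()
        elif self.kind is None and not ch.isspace():
            # The decision needs only this one character, no look-ahead.
            self.kind = "row" if ch == "|" else "para"
            self.events.append(("start", self.kind))

    def close(self) -> None:
        """Document end is handled exactly like a line end."""
        self._end_line()

    def _end_line(self) -> None:
        if self.kind is not None:
            self.events.append(("end", self.kind))
            self.kind = None
```

Feeding `| lklk  | lklkk` emits the start event on the first character, and either a newline or `close()` emits the matching end event, so no line text has to be held back.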

@jgm
Owner

jgm commented Sep 10, 2024

I think I may have been reserving the possibility of using the syntax

| blah blah blah
| blah blah

for something else -- cf. reST (and pandoc markdown) "line blocks." That's the only drawback I see to your proposal; it would prevent us from using this distinctive syntax for something.

@criloz
Author

criloz commented Sep 10, 2024

@jgm Ahh, I see, that is a good reason. But if it is not imperative to use the same syntax as Pandoc, a new kind of token such as |: or |> could be introduced while still keeping the parser simple. It would reduce ambiguity and the need for parallel states, buffering, or backtracking:

|> The limerick packs laughs anatomical
|> In space that is quite economical.
|>    But the good ones I've seen
|>    So seldom are clean
|> And the clean ones so seldom are comical

edit

A better token would be |>, because |: would also create an ambiguity with tables.
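A minimal sketch of how a dedicated |> token could be dispatched on the line prefix alone. The function names are hypothetical, and stripping one space after the token is an assumption chosen so that the indentation in the limerick survives.

```python
def classify_prefix(line: str) -> str:
    """Classify a line by its leading token only (hypothetical rule)."""
    text = line.lstrip()
    if text.startswith("|>"):
        return "line-block"
    if text.startswith("|"):
        # Includes lines starting with "|:", which stay on the table path,
        # so "|:" could not double as a line-block token.
        return "table-row"
    return "other"


def line_block_text(line: str) -> str:
    """Return the content after '|>', dropping one separating space."""
    rest = line.lstrip()[2:]
    return rest[1:] if rest.startswith(" ") else rest
```

One trade-off of this sketch: a table row whose first cell begins with > directly after the pipe would be classified as a line block, so a real rule might require a space after the token.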

@Omikhleia

Just for the cross-reference: #29
