
Changed lexers to use Sedlex #392

Open
wants to merge 9 commits into
base: dev-0-1-0

puripuri2100
Contributor

@puripuri2100 puripuri2100 commented Feb 24, 2023

@gfngfn
Owner

gfngfn commented Feb 26, 2023

Memo:

satysfi doc-lang.saty -o doc-lang.pdf
 ---- ---- ---- ----
  target file: 'doc-lang.pdf'
  dump file: 'doc-lang.satysfi-aux' (will be created)
  parsing 'doc-lang.saty' ...
satysfi: internal error, uncaught exception:
         Failure("TODO: MATHCHARS (\"![\", \"doc-lang.saty\", line 18, characters 19-21)")
         Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
         Called from Main__Parser.Tables.semantic_action.(fun) in file "src/parser.mly", line 1273, characters 21-101
         Called from MenhirLib.Engine.Make.reduce in file "lib/pack/menhirLib.ml", line 1416, characters 16-42
         Called from MenhirLib.Engine.Make.loop in file "lib/pack/menhirLib.ml", line 1702, characters 25-52
         Called from MenhirLib.Convert.Simplified.traditional2revised in file "lib/pack/menhirLib.ml" (inlined), line 193, characters 4-144
         Called from Main__ParserInterface.process in file "src/frontend/parserInterface.ml", line 16, characters 6-18
         Called from Main__FileDependencyResolver.register_document_file in file "src/frontend/fileDependencyResolver.ml", line 86, characters 4-118
         Called from Main__FileDependencyResolver.main in file "src/frontend/fileDependencyResolver.ml", line 150, characters 10-49
         Called from Main.build.(fun) in file "src/frontend/main.ml", line 1158, characters 17-55
         Called from Main.error_log_environment in file "src/frontend/main.ml", line 382, characters 4-16
         Called from Cmdliner_term.app.(fun) in file "cmdliner_term.ml", line 24, characters 19-24
         Called from Cmdliner_eval.run_parser in file "cmdliner_eval.ml", line 34, characters 37-44

@gfngfn
Owner

gfngfn commented Feb 26, 2023

Thank you so much for developing a lexer using Sedlex!

What motivated me to consider re-implementing the lexer with Sedlex was that it lets every character token in math formulae hold exactly one code point, by replacing %token<Range.t * string> MATHCHARS with %token<Range.t * Uchar.t> MATHCHAR.

Would you mind modifying MATHCHARS as mentioned above? (Of course, I will try it myself if you are not ready, so feel free to decline!) It will probably remedy the Failure("TODO: MATHCHARS (\"![\", ...") exception.
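
For illustration only, here is a minimal sketch of what "one code point per token" means for the `![` case in the trace above. It uses only the OCaml standard library (4.14 or later), not Sedlex, and it is not the code in this PR: the `range` type, the `token` type, and `split_mathchars` are hypothetical stand-ins, and counting columns in code points is an assumption of the sketch.

```ocaml
(* Hypothetical stand-in for Range.t: (file, line, start column, end column). *)
type range = string * int * int * int

(* Hypothetical token type mirroring %token<Range.t * Uchar.t> MATHCHAR. *)
type token = MATHCHAR of range * Uchar.t

(* Split the payload of an old-style MATHCHARS token (a UTF-8 string)
   into one MATHCHAR per code point. Columns advance by one per code
   point, starting from [col]; this is an assumption of the sketch. *)
let split_mathchars (file, line, col) (s : string) : token list =
  let len = String.length s in
  let rec go byte_pos col acc =
    if byte_pos >= len then List.rev acc
    else
      (* Decode one code point; invalid bytes decode to U+FFFD. *)
      let d = String.get_utf_8_uchar s byte_pos in
      let u = Uchar.utf_decode_uchar d in
      let tok = MATHCHAR ((file, line, col, col + 1), u) in
      go (byte_pos + Uchar.utf_decode_length d) (col + 1) (tok :: acc)
  in
  go 0 col []
```

Under these assumptions, `split_mathchars ("doc-lang.saty", 18, 19) "!["` yields two tokens, one for `!` at characters 19-20 and one for `[` at characters 20-21, instead of a single two-character MATHCHARS token.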

@puripuri2100
Contributor Author

I changed MATHCHARS of Range.t * string to MATHCHAR of Range.t * Uchar.t.

@leque
Contributor

leque commented Feb 28, 2023

Great! This PR will also fix #312. Would you mind adding parser tests for positions around multi-byte characters?

@puripuri2100
Contributor Author

I added a parser test for multi-byte characters (2c4a50d).

@puripuri2100
Contributor Author

This PR fixes #313:

$ cat e.saty
あ

$ satysfi e.saty
 ---- ---- ---- ----
  target file: 'e.pdf'
  dump file: 'e.satysfi-aux' (will be created)
  parsing 'e.saty' ...
! [Syntax Error at Lexer] at "e.saty", line 1, characters 0-1:
    illegal token 'あ' in a program area
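
The range "characters 0-1" reported above for 'あ' is consistent with positions counted in code points rather than bytes, since 'あ' (U+3042) occupies three bytes in UTF-8. The following is a small standard-library sketch (OCaml 4.14 or later) of that difference. `codepoint_column` is a hypothetical helper written for this illustration, not a function from this PR or from Sedlex.

```ocaml
(* Count the code points contained in the first [byte_offset] bytes
   of the UTF-8 string [s]. Hypothetical helper for illustration. *)
let codepoint_column (s : string) (byte_offset : int) : int =
  let stop = min byte_offset (String.length s) in
  let rec go pos n =
    if pos >= stop then n
    else
      (* Each decode consumes at least one byte, so the loop terminates. *)
      let d = String.get_utf_8_uchar s pos in
      go (pos + Uchar.utf_decode_length d) (n + 1)
  in
  go 0 0

let () =
  (* "あ" written as explicit UTF-8 bytes: U+3042 is E3 81 82. *)
  let s = "\xe3\x81\x82" in
  Printf.printf "bytes: 0-%d, code points: 0-%d\n"
    (String.length s)
    (codepoint_column s (String.length s))
```

This prints `bytes: 0-3, code points: 0-1`, so a byte-based lexer position and a code-point-based one diverge as soon as a multi-byte character appears on the line.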

@gfngfn gfngfn modified the milestones: v0.0.12, v0.1.0 Apr 6, 2024
3 participants