automatic whitespace handling #49

Open
ghazel opened this issue Jul 24, 2011 · 8 comments

@ghazel

ghazel commented Jul 24, 2011

Parslet grammars are littered with whitespace checks, making them harder to read. Leaving the checks out means valid input fails to parse. Take this JavaScript parser as an example: https://github.com/matthewd/capuchin/blob/d47f4b19eb888b6a4fc5428d3d1fdfcdb551b183/lib/capuchin/parser.rb

There is sp? everywhere. There are very few cases where whitespace is not allowed, and decorating those cases with a different operator to join the atoms seems sufficient.

So, this is a feature request for some sort of functionality like this. pyPEG has a skipws option which seems to work ok.

@kschiess
Owner

I can see why you would want this, but I am not convinced that we really need it. After all, we can process parslet atoms as if they were data, so appending whitespace to every atom will not be hard. This really belongs on the mailing list, and if you provide a patch or an implementation idea, we'll consider it more thoroughly.

@mikeyhew

I have some code that implements this: master...mikeyhew:ignore-whitespace. It changes the >> operator so that it consumes zero or more spaces between parslets, and adds << for when you don't want to allow spaces. I've been using it in this project and it has worked well so far, making the grammar more pleasant to write.

@kschiess It would be interesting to hear what you think about the general idea, as well as whether this would break anything. (I think it caused an error with the infix_expression helper already, but didn't spend much time debugging.)
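
To make the proposed operator semantics concrete, here is a minimal sketch using a toy combinator class. This is not Parslet's API and not the code from the linked branch; `Atom`, `str`, and `parses?` are hypothetical names invented for illustration. It only shows the idea of `>>` skipping optional whitespace between atoms while `<<` requires them to be adjacent.

```ruby
# Toy parser combinators (NOT Parslet's API) illustrating the proposal:
# `>>` allows optional whitespace between atoms, `<<` does not.
class Atom
  def initialize(&matcher)
    @matcher = matcher # (input, pos) -> new position, or nil on failure
  end

  def match_at(input, pos)
    @matcher.call(input, pos)
  end

  # Sequence that skips zero or more whitespace characters in between.
  def >>(other)
    Atom.new do |input, pos|
      mid = match_at(input, pos)
      next nil unless mid
      mid += 1 while mid < input.length && input[mid] =~ /\s/
      other.match_at(input, mid)
    end
  end

  # Strict sequence: the second atom must start exactly where the first ended.
  def <<(other)
    Atom.new do |input, pos|
      mid = match_at(input, pos)
      mid && other.match_at(input, mid)
    end
  end

  # True when the atom consumes the whole input.
  def parses?(input)
    match_at(input, 0) == input.length
  end
end

# Hypothetical helper: an atom that matches a literal string.
def str(text)
  Atom.new { |input, pos| input[pos, text.length] == text ? pos + text.length : nil }
end

loose  = str("a") >> str("+") >> str("b")
strict = str("a") << str("+") << str("b")

puts loose.parses?("a + b")  # whitespace between atoms is skipped
puts strict.parses?("a + b") # whitespace is not allowed here
puts strict.parses?("a+b")
```

Keeping both operators lets a grammar stay mostly whitespace-insensitive while still marking the few places where atoms must touch.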

@kschiess kschiess reopened this Nov 24, 2016
@kschiess
Owner

I'll take a look soon.

@kschiess
Owner

I like the idea that this is an option you give to the whole parse process. Perhaps we could (as an implementation) create a source that skips whitespace? I do realize this is a problem for a lot of people.

@aaronlippold

Hi, any progress on this? This would be a valuable addition. Thanks.

@kschiess
Owner

We would welcome a PR that solves this; however, we won't be able to dedicate our own time to it.

@mikeyhew

@kschiess the problem with a global option is that it restricts what you can parse. Even if your grammar is mostly whitespace-insensitive, there are still times when you need >> without whitespace in between. For example, parsing identifiers:

rule(:ident) { match['a-zA-Z'] >> match['a-zA-Z0-9'].repeat }
# how would you do this if the `Source` ignores whitespace?
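
The concern can be illustrated without Parslet at all. The sketch below assumes a whitespace-skipping source behaves as if all whitespace were removed before any atom sees the input; `skip_whitespace` and `ident?` are hypothetical helpers, not part of any library. Under that assumption, two separate words become indistinguishable from a single identifier.

```ruby
# Hypothetical stand-in for a whitespace-skipping source: it drops all
# whitespace before any atom sees the input.
def skip_whitespace(input)
  input.gsub(/\s+/, "")
end

# An identifier: one letter followed by any number of letters or digits.
def ident?(input)
  input.match?(/\A[a-zA-Z][a-zA-Z0-9]*\z/)
end

two_words = "foo bar"

puts ident?(two_words)                  # the space separates two tokens
puts ident?(skip_whitespace(two_words)) # the boundary between tokens is lost
```

This is why a per-sequence choice (or some way to switch skipping off locally) is needed rather than a single global switch.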

@kschiess
Owner

I'll merge any kind of solution that doesn't lock people into whitespace-agnostic parsers. The default should be not to ignore whitespace. But I think we can make it easy to have a choice.

@kschiess kschiess self-assigned this Jan 11, 2019