Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for formatting .yrl and .xrl files #41

Open
michalmuskala opened this issue Jul 3, 2020 · 14 comments
Open

Add support for formatting .yrl and .xrl files #41

michalmuskala opened this issue Jul 3, 2020 · 14 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@michalmuskala
Copy link
Member

michalmuskala commented Jul 3, 2020

cc @ilya-klyuchnikov

@michalmuskala michalmuskala added the enhancement New feature or request label Jul 3, 2020
@michalmuskala michalmuskala changed the title Consider parsing .yrl and .xrl files Consider formatting .yrl and .xrl files Jul 3, 2020
@awalterschulze awalterschulze self-assigned this Sep 4, 2020
@awalterschulze awalterschulze added the help wanted Extra attention is needed label Mar 31, 2021
@awalterschulze awalterschulze changed the title Consider formatting .yrl and .xrl files Add support for formatting .yrl and .xrl files Mar 31, 2021
@awalterschulze awalterschulze removed their assignment May 12, 2021
@slaykachu
Copy link
Contributor

Jumping on this task, I have some questions:

erlfmt can process arbitrary files. So am I correct by summarising the task as:

  • Manually identify the rules that shouldn't be applied to yrl/xrl.
  • Filter them in the code when file have extension yrk/xrl (maybe the ignore-list is different for each, not big deal).
  • Mute the warnings (syntax error) for non erlang-like sections (or should we plug a dedicated grammar?)
  • Process yrl/xrl files when formatting a whole directory.

@awalterschulze
Copy link
Member

awalterschulze commented May 12, 2021

Each erlang form ends with a .
In a .yrl file, these forms can be mixed and either be a normal erlang form or a yrl specific form.
Currently we warn on any forms (including .yrl specific forms) that we cannot parse.
We should keep on warning on forms we cannot parse, but we want to add the ability to parse .yrl specific forms.

Here are some things we will need to add:

  1. I think we will need to add the ability to parse .yrl specific forms into our erlfmt_parse.yrl grammar OR
    we can create a separate grammar that can only parse the specific .yrl forms and use it as a fallback when parsing normal erlang forms fails.
  2. erlfmt_recomment.erl will also need to be able to add comments back into the AST
  3. Finally we need to add the ability to erlfmt_format.erl to be able to format these new expressions.

@awalterschulze
Copy link
Member

@michalmuskala I think I saw a specification for .yrl grammars previously, but I can't find it again for some reason. Do you know if I was imagining things or if this exists somewhere?

@michalmuskala
Copy link
Member Author

It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples

I agree adding first-class support for yecc forms would be probably the "nicest" solution

@ilya-klyuchnikov
Copy link
Member

It can be done gradually. - There is the explicit "Erlang code." part of .yrl files - adding support for understanding and formatting it may be the first step.

@awalterschulze
Copy link
Member

It's in here a bit lower: https://erlang.org/doc/man/yecc.html#more-examples

I agree adding first-class support for yecc forms would be probably the "nicest" solution

Amazing thank you, I somehow missed that.

Screen Shot 2021-05-12 at 3 55 39 PM

@michalmuskala
Copy link
Member Author

It can be done gradually. - There is the explicit "Erlang code." part of .yrl files - adding support for understanding and formatting it may be the first step.

Looks like this actually mostly works today - just issues a lot of warnings. #289

@awalterschulze
Copy link
Member

Yes, this task is about also formatting the yrl specific parts without needing to warn, because with work we should be able to also parse and format it.

@awalterschulze
Copy link
Member

awalterschulze commented May 12, 2021

But technically, like Ilya maybe meant, this can also be done gradually, by for example, first only supporting declaration and only in a second pull request support rule, etc.

@slaykachu
Copy link
Contributor

Yes, this task is about also formatting the yrl specific parts without needing to warn, because with work we should be able to also parse and format it.

Until everything is supported, I think a good compromise could be to warn once that the file wasn't entirely formatted.

@slaykachu
Copy link
Contributor

slaykachu commented May 18, 2021

I agree adding first-class support for yecc forms would be probably the "nicest" solution

Then, is there a reference on how it should be formatted?
Anyway, looking at erlfmt_parsle.yrl as an exhaustive example, it seems that would could only make things worse:

  1. Nice tabulation would be broken:
    Right now:
attribute -> '-' 'if' attr_val               : build_attribute('$1', '$2', '$3').
attribute -> '-' atom                        : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val               : build_attribute('$1', '$2', '$3').

Potential worse result:

attribute -> '-' 'if' attr_val : build_attribute('$1', '$2', '$3').
attribute -> '-' atom : build_attribute('$1', '$2', no_parens).
attribute -> '-' atom attr_val : build_attribute('$1', '$2', '$3').
  1. Compact rules would take much more vertical space
    Right now:
type -> type '::' type : ?mkop2('$1', '$2', '$3').
type -> type '|' type : ?mkop2('$1', '$2', '$3').
type -> type '..' type : ?mkop2('$1', '$2', '$3').

Potential worse result:

type -> type '::' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '|' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '..' type : 
    ?mkop2('$1', '$2', '$3').

Or

type -> 
    type '::' type : ?mkop2('$1', '$2', '$3').
type -> 
    type '|' type : ?mkop2('$1', '$2', '$3').
type -> 
    type '..' type : ?mkop2('$1', '$2', '$3').

@awalterschulze
Copy link
Member

Great question and thank you for bringing examples in the ask, this points out great things to consider and is making me think.

Of the ideas here, I would prefer

type -> type '::' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '|' type : 
    ?mkop2('$1', '$2', '$3').
type -> type '..' type : 
    ?mkop2('$1', '$2', '$3').

The reason is that erlfmt does not align things anywhere else.

I have looked at how I have manually formatted a bnf for another grammar I worked on and I naturally conformed to this style sometimes. Although we had opening and closing braces for the SDT rule. So it was a bit of an easier style choice.

Trying out some other styles:

attribute ->
    '-' 'if' attr_val               
    : build_attribute('$1', '$2', '$3').
attribute
    -> '-' 'if' attr_val               
    : build_attribute('$1', '$2', '$3').
attribute -> 
    '-' 'if' attr_val :
    build_attribute('$1', '$2', '$3')
.

After that I think I still prefer your idea

attribute -> '-' 'if' attr_val :
    build_attribute('$1', '$2', '$3').

But we can discuss this for a while.
The first step is parsing and creating a first format.
Then it can be easier to iterate with tests.

But great to start thinking about this already.
Thank you for bringing this up.

@slaykachu
Copy link
Contributor

I'm a big fan of vertical alignment. Helps to see the structure / patterns (either it's repeated, or on the contrary the variations are more easily seen). It's also helpful to spot copy-paste mistake.

Some people cannot stand it, so it would be both complicated and controversial to make a pretty printer generate that.
Maybe what we could do at some point is to preserve existing alignements, without the need to flag %% format off.

@slaykachu
Copy link
Contributor

slaykachu commented Jun 14, 2021

Jumping out of this task! (Exercise is good for you). Rationale:

  • As @michalmuskala has demonstrated, that is already working for erlang parts of the files.
  • Isolating non-erlang parts and plugging corresponding ad-hoc grammars for those would be too much work for too little benefit.
  • In my book that would be actually negative benefit (a.k.a nuisance). See discussion about lost of vertical alignment.
  • One of the advantage of formatting is uniformity/consistency. Yet yrl and xrl files are very rare in most codebases.
  • I suggested to quiet or summarise the warnings for non-erlang parts. But we're rather keep them as is: more explicit, consistent and systematic. A bit verbose for users, yet handier for toolchain integration.
    • Available mitigations: exclude such files or use %% format off.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants