Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added new rule “attr-space-between” #1459

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

actengage
Copy link

@actengage actengage commented Aug 12, 2024

Consider the following the HTML:

<div id="bar"class="foo"></div>

The following error is what I get in the current version of HTMLHint.

Special characters must be escaped : [ &lt; ].
Special characters must be escaped : [ &gt; ].
Tag must be paired, no start tag: [ &lt;/div&gt; ]

The attr-space-between rule ensures HTML tags with attributes must have spaces between them. Without this change, tags with attributes without the required space results in the parser interpreting the tag as text. The new error would be:

Attribute "class" must be separated with a space

Proposed changes:

Change this Regex in htmlparser.ts:

const regTag =
      // eslint-disable-next-line no-control-regex
      /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g

To the following:

const regTag =
      // eslint-disable-next-line no-control-regex
      /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g

I also added the attr-space-between rule. The name of this rule could be subject to change, but this was the best I came up with. All tests pass with these changes, and docs were updated accordingly. Let me know what you think, and if I should make any additional changes.

This rule ensures HTML tags with attributes must have spaces between them. Without this change, tags with attributes without the required space results in the parser interpretting the tag as text.
@github-actions github-actions bot added core Relates to HTMLHint's core APIs and features test labels Aug 12, 2024
@@ -17,7 +17,7 @@
}
parse(html) {
const mapCdataTags = this._mapCdataTags;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;

Check failure

Code scanning / CodeQL

Bad HTML filtering regexp High

Comments ending with --> are matched differently from comments ending with --!>. The first is matched with capture group 2 and comments ending with --!> are matched with capture group 3.
@@ -17,7 +17,7 @@
}
parse(html) {
const mapCdataTags = this._mapCdataTags;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;

Check failure

Code scanning / CodeQL

Inefficient regular expression High

This part of the regular expression may cause exponential backtracking on strings starting with '<-' and containing many repetitions of '!'.
@@ -30,7 +30,7 @@
}
parse(html) {
const mapCdataTags = this._mapCdataTags;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;

Check failure

Code scanning / CodeQL

Bad HTML filtering regexp High

Comments ending with --> are matched differently from comments ending with --!>. The first is matched with capture group 2 and comments ending with --!> are matched with capture group 3.
@@ -30,7 +30,7 @@
}
parse(html) {
const mapCdataTags = this._mapCdataTags;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;
const regTag = /<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g;

Check failure

Code scanning / CodeQL

Inefficient regular expression High

This part of the regular expression may cause exponential backtracking on strings starting with '<-' and containing many repetitions of '!'.
@@ -54,7 +54,7 @@

const regTag =
// eslint-disable-next-line no-control-regex
/<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g
/<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g

Check failure

Code scanning / CodeQL

Bad HTML filtering regexp High

Comments ending with --> are matched differently from comments ending with --!>. The first is matched with capture group 2 and comments ending with --!> are matched with capture group 3.
@@ -54,7 +54,7 @@

const regTag =
// eslint-disable-next-line no-control-regex
/<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s+[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g
/<(?:\/([^\s>]+)\s*|!--([\s\S]*?)--|!([^>]*?)|([\w\-:]+)((?:\s*[^\s"'>\/=\x00-\x0F\x7F\x80-\x9F]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'>]*))?)*?)\s*(\/?))>/g

Check failure

Code scanning / CodeQL

Inefficient regular expression High

This part of the regular expression may cause exponential backtracking on strings starting with '<-' and containing many repetitions of '!'.
@coliff
Copy link
Member

coliff commented Aug 13, 2024

Thanks for the PR! It looks good to me at a glance and could be an helpful addition to HTMLHint.

We need @thedaviddias to review it.

@coliff coliff added keep-unstale The issue will not be marked as stale by the stale-bot rule Relates to HTMLHint's core rules labels Aug 26, 2024
@actengage
Copy link
Author

I didn't see this reply before. Thanks for looking into this. I'd love to get this approved in a new release so I don't have to use my fork in production.

@coliff
Copy link
Member

coliff commented Sep 6, 2024

I didn't see this reply before. Thanks for looking into this. I'd love to get this approved in a new release so I don't have to use my fork in production.

No problem. I'd like to see this added too but we need @thedaviddias to review it as he is the project owner but he hasn't been active here for a couple of years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Relates to HTMLHint's core APIs and features keep-unstale The issue will not be marked as stale by the stale-bot rule Relates to HTMLHint's core rules test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants