[llvm-dev] [RFC] Formalizing FileCheck Features


韩玉 via llvm-dev
Background
----------

FileCheck [0] is a cornerstone testing tool for the LLVM project.  It
has grown new features over the years to meet new needs, but these
sometimes have surprising and counter-intuitive behavior [1].  This
has become even more evident in Joel Denny's recent quest to repair
what seemed like an obvious defect [2] but which led me to the
conclusion [3] that FileCheck sorely needed a clear, intuitive
conceptual model.  And then someone to make it work that way (hi
Joel!).

Basic Conceptual Model
----------------------

FileCheck should operate on the basis of these three fundamental
concepts.

(1) Search range.  This is some substring of the input text where one
or more directives will do their pattern-matching magic.

(2) Match range.  This is a substring of a search range where a
directive (or in one case, a group of directives) has matched a
pattern.

(3) Directive groups.  These are sequences of adjacent directives that
operate in a related way on a search range.  Directives within a group
are processed in order, except as noted in the directive description.

Finally we add The Rule:  No match ranges may overlap.

(This is largely formalizing what FileCheck already does, except that
it didn't have The Rule with respect to DAG matches.  That's the bug
that Joel was originally trying to fix, until I stuck my nose into
it.)
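
To make the model concrete, here is a minimal Python sketch of ranges and The Rule. It is not FileCheck's implementation (which is C++); it assumes that search and match ranges are half-open offset pairs into the input text, and the names Range, overlaps and obeys_the_rule are invented for illustration.

```python
from itertools import combinations
from typing import NamedTuple


class Range(NamedTuple):
    """Hypothetical half-open range [start, end) of offsets into the input."""
    start: int
    end: int


def overlaps(a: Range, b: Range) -> bool:
    """True when the two ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def obeys_the_rule(match_ranges: list[Range]) -> bool:
    """The Rule: no two match ranges may overlap."""
    return not any(overlaps(a, b) for a, b in combinations(match_ranges, 2))
```

Under this reading, two match ranges that merely touch (one ends where the next begins) do not overlap.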

Directive Descriptions Based On Conceptual Model
------------------------------------------------

Given the conceptual model, all directives can be defined in terms of
it. This is possibly going overboard with the formalism but hey, we're
all compiler geeks here.

CHECK: Scans the search range for a pattern match. Fails if no match
is found.  The end of the match range becomes the start of the search
range for subsequent directives.
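
A rough sketch of the CHECK description, assuming a Python regular expression stands in for a FileCheck pattern and that ranges are (start, end) offset pairs. The function name check is hypothetical.

```python
import re


def check(text, pattern, start, end):
    """Scan text[start:end] for pattern and return the match range.

    Raises AssertionError when the search range holds no match.
    """
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK: no match for {pattern!r}")
    return m.start(), m.end()


text = "define foo\nadd\nret\n"
first = check(text, "add", 0, len(text))
# The end of the first match range is where the next directive starts looking.
second = check(text, "ret", first[1], len(text))
```

Because the second search starts at the end of the first match range, a later directive can never match text at or before an earlier match.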

CHECK-SAME: Like CHECK, plus there must be zero newlines prior to the
start of the match range.

CHECK-NEXT: Like CHECK, plus there must be exactly one newline prior
to the start of the match range.
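
The two definitions above differ only in the newline count, which the following sketch tries to show. It assumes the newlines are counted between the start of the search range and the start of the match range; the helper names are made up, and Python regular expressions stand in for FileCheck patterns.

```python
import re


def check_counting_newlines(text, pattern, start, end, wanted, name):
    """Like CHECK, then verify how many newlines precede the match."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"{name}: no match for {pattern!r}")
    newlines = text.count("\n", start, m.start())
    if newlines != wanted:
        raise AssertionError(
            f"{name}: {newlines} newline(s) before the match, wanted {wanted}")
    return m.start(), m.end()


def check_same(text, pattern, start, end):
    return check_counting_newlines(text, pattern, start, end, 0, "CHECK-SAME")


def check_next(text, pattern, start, end):
    return check_counting_newlines(text, pattern, start, end, 1, "CHECK-NEXT")
```

Note that in this form the search itself is not limited to one line: a match is found anywhere in the search range first, and the newline count is checked afterwards.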

CHECK-LABEL: All LABEL directives are processed before any other
directives.  These directives have two effects.  First, they act like
CHECK directives, but also partition the input text into disjoint
search ranges, delimited by the match ranges of the LABEL directives.
Second, they partition the remaining directives into Label Groups,
each of which operates on the corresponding search range.  For truly
pedantic formalism, we can say there are implicit LABEL directives
matching the start and end of the entire input text, thus all
non-LABEL directives are always in some Label Group.
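
As an illustration of the partitioning, here is a small sketch that runs all LABEL patterns first and returns the search ranges between their match ranges. It assumes the implicit LABELs at the start and end of the input are empty matches at offsets 0 and len(text); label_search_ranges is a hypothetical name.

```python
import re


def label_search_ranges(text, label_patterns):
    """Return the disjoint search ranges delimited by LABEL match ranges."""
    ranges = []
    cursor = 0  # end of the implicit LABEL at the start of the input
    for pattern in label_patterns:
        m = re.compile(pattern).search(text, cursor)
        if m is None:
            raise AssertionError(f"CHECK-LABEL: no match for {pattern!r}")
        ranges.append((cursor, m.start()))
        cursor = m.end()
    # The implicit LABEL at the end of the input closes the last range.
    ranges.append((cursor, len(text)))
    return ranges


text = "define @f\nadd\ndefine @g\nsub\n"
ranges = label_search_ranges(text, ["@f", "@g"])
```

With N explicit LABEL directives this yields N + 1 search ranges, one per Label Group; directives written between the first and second LABEL would operate only on the second range.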

CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
is not executed immediately; instead the next non-NOT directive is
executed first, and the start of that directive's match range becomes
the end of the NOT Group's search range.  (If the next directive is
LABEL, it has already executed and has a match range, which is already
the end of the search range.)  After the NOT Group's search range is
defined, each NOT directive scans the range for a match, and fails if
a match is found.
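
The deferred execution of a NOT Group can be sketched as follows. This is only a model of the description above: the function names are invented, Python regular expressions stand in for FileCheck patterns, and the case where no directive follows the group (the range then runs to the end of the current search range) is an assumption of the sketch.

```python
import re


def run_not_group(text, not_patterns, start, end):
    """Fail if any NOT pattern matches inside text[start:end]."""
    for pattern in not_patterns:
        if re.compile(pattern).search(text, start, end):
            raise AssertionError(f"CHECK-NOT: found {pattern!r}")


def check_after_not_group(text, not_patterns, next_pattern, start, end):
    """Run the following directive first, then the deferred NOT Group."""
    if next_pattern is None:
        # Assumption: with nothing after the group, scan to the range's end.
        run_not_group(text, not_patterns, start, end)
        return end, end
    m = re.compile(next_pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK: no match for {next_pattern!r}")
    # The start of the next directive's match range ends the NOT search range.
    run_not_group(text, not_patterns, start, m.start())
    return m.start(), m.end()
```

For the input "load\nstore\nret\ncall\n", a NOT on "call" followed by a CHECK on "ret" passes, because "call" only appears after the start of the "ret" match.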

CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
is not executed immediately; instead the next non-DAG directive is
executed first, and the start of that directive's match range becomes
the end of the DAG Group's search range.  If the next directive is
CHECK-NOT, the end of the DAG Group's search range is
unaffected. (This might or might not be FileCheck's historical
behavior; I didn't check.)  After the DAG Group's search range is
defined, each DAG directive scans the range for a match, and fails if
a match is not found.  Per The Rule, match ranges for DAG directives
may not overlap. (This is not historical FileCheck behavior, and the
bug Joel Denny wanted to fix.)  After all DAG directives run, the
match range for the entire DAG Group extends from the start of the
earliest match to the end of the latest match.  The end of that match
range becomes the start of the search range for subsequent directives.
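
The matching inside a DAG Group could look roughly like the sketch below. It takes the group's search range as given (how the end is chosen is not modeled), and it assumes that a candidate match overlapping an earlier DAG match is skipped by resuming the search after the range that was in the way. That retry policy and the name run_dag_group are assumptions for illustration, not a statement about FileCheck's code.

```python
import re


def run_dag_group(text, patterns, start, end):
    """Match every DAG pattern in text[start:end] without overlaps.

    Returns the group's match range: earliest start to latest end.
    """
    taken = []  # match ranges already claimed by earlier DAG directives
    for pattern in patterns:
        regex = re.compile(pattern)
        pos = start
        while True:
            m = regex.search(text, pos, end)
            if m is None:
                raise AssertionError(
                    f"CHECK-DAG: no non-overlapping match for {pattern!r}")
            clash = next((t for t in taken
                          if m.start() < t[1] and t[0] < m.end()), None)
            if clash is None:
                taken.append((m.start(), m.end()))
                break
            pos = clash[1]  # retry after the range that was in the way
    if not taken:
        return start, start
    return min(s for s, _ in taken), max(e for _, e in taken)
```

With The Rule enforced this way, two identical DAG patterns need two separate occurrences in the input, regardless of the order in which the directives are written.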

Observations
------------

A CHECK-NOT still separates surrounding CHECK-DAG directives into
disjoint groups, and does not permit matches from the two groups to
overlap. DAG was originally implemented to detect and diagnose an
overlap, but this worked only for the first DAG after a NOT. This can
lead to counter-intuitive behavior and potentially makes certain kinds
of matches impossible.

Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
behavior, but it's unlikely to be useful.  Putting SAME or NEXT as the
first directive in a file likewise has defined behavior, matching
precisely the first or second line (respectively) of the input text.


References
----------
[0] https://llvm.org/docs/CommandGuide/FileCheck.html
[1] https://www.youtube.com/watch?v=4rhW8knj0L8
[2] https://lists.llvm.org/pipermail/llvm-dev/2018-May/123010.html
[3] https://reviews.llvm.org/D47106

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev

On 05/24/2018 08:46 AM, via llvm-dev wrote:

> Background
> ----------
>
> FileCheck [0] is a cornerstone testing tool for the LLVM project.  It
> has grown new features over the years to meet new needs, but these
> sometimes have surprising and counter-intuitive behavior [1].  This
> has become even more evident in Joel Denny's recent quest to repair
> what seemed like an obvious defect [2] but which led me to the
> conclusion [3] that FileCheck sorely needed a clear, intuitive
> conceptual model.

Thanks for writing this up. I definitely think that it will be good to
add this to FileCheck's documentation.

> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> is not executed immediately; instead the next non-NOT directive is
> executed first, and the start of that directive's match range becomes
> the end of the NOT Group's search range.

Both here, and for CHECK-DAG, we should say something about reaching the
end of the input.

 -Hal


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev


> -----Original Message-----
> From: Hal Finkel [mailto:[hidden email]]
> Sent: Thursday, May 24, 2018 10:33 AM
> To: Robinson, Paul; [hidden email]
> Subject: Re: [llvm-dev] [RFC] Formalizing FileCheck Features
>
>
> On 05/24/2018 08:46 AM, via llvm-dev wrote:
> >
> > CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> > is not executed immediately; instead the next non-NOT directive is
> > executed first, and the start of that directive's match range becomes
> > the end of the NOT Group's search range.
>
> Both here, and for CHECK-DAG, we should say something about reaching the
> end of the input.
>
>  -Hal

It seemed intuitive to me that a range can't extend past the end of the
input, and under CHECK-LABEL I did say there are implicit directives
matching the start and end of the input; but it does no harm to add some
words about that to DAG and NOT.
--paulr



Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Hi Paul,

On Thu, May 24, 2018 at 9:46 AM, <[hidden email]> wrote:
> Background
> ----------
>
> FileCheck [0] is a cornerstone testing tool for the LLVM project.  It
> has grown new features over the years to meet new needs, but these
> sometimes have surprising and counter-intuitive behavior [1].  This
> has become even more evident in Joel Denny's recent quest to repair
> what seemed like an obvious defect [2] but which led me to the
> conclusion [3] that FileCheck sorely needed a clear, intuitive
> conceptual model.

Agreed.  Thanks for doing this.

 
> And then someone to make it work that way (hi
> Joel!).

Sure, I can help with the implementation given that I'm running into these issues a lot in my own work.  As I'm a bit too close to the FileCheck implementation at this point, I would suggest that someone else write the initial specification-based tests to find the deviations from the description we arrive at.  Paul, you're the obvious person for that one.  I can of course work on further implementation-based testing.

I also recommend we make changes toward the new specification incrementally.


> Finally we add The Rule:  No match ranges may overlap.
>
> (This is largely formalizing what FileCheck already does, except that
> it didn't have The Rule with respect to DAG matches.  That's the bug
> that Joel was originally trying to fix, until I stuck my nose into
> it.)
 
I agree with The Rule.  I haven't found any real use case yet that needs to violate that rule.


> Directive Descriptions Based On Conceptual Model
> ------------------------------------------------
>
> Given the conceptual model, all directives can be defined in terms of
> it. This is possibly going overboard with the formalism but hey, we're
> all compiler geeks here.

I think it's great.
 

> CHECK: Scans the search range for a pattern match. Fails if no match
> is found.  The end of the match range becomes the start of the search
> range for subsequent directives.
>
> CHECK-SAME: Like CHECK, plus there must be zero newlines prior to the
> start of the match range.

... within the search range.

Should it be possible for CHECK-SAME match range to include newlines?
 

> CHECK-NEXT: Like CHECK, plus there must be exactly one newline prior
> to the start of the match range.

... within the search range.

Your choice to talk about the match range rather than the search range for CHECK-SAME and CHECK-NEXT implies you like the current behavior that extends the search range beyond these match range restrictions and then complains if the match range restrictions aren't met.  For example, CHECK-SAME searches past the newline and then complains if the match range starts after the newline.  Is that what you prefer?

I'd note that, in the case of CHECK-NEXT, that choice can restrict what CHECK-NEXT can match.  That is, it will complain about a match on the previous line rather than skip it and look on the next line.
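
The difference between the two choices for CHECK-NEXT can be sketched as below. Both functions are hypothetical: the "wide" one searches the whole range and then complains about the newline count (the behavior described above as current), while the "narrow" one restricts the search range to the next line before matching.

```python
import re


def check_next_wide(text, pattern, start, end):
    """Search the whole range, then complain unless one newline precedes."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError("CHECK-NEXT: no match")
    newlines = text.count("\n", start, m.start())
    if newlines != 1:
        raise AssertionError(
            f"CHECK-NEXT: match is {newlines} newline(s) away, wanted 1")
    return m.start(), m.end()


def check_next_narrow(text, pattern, start, end):
    """Restrict the search range to the line after the one holding start."""
    newline = text.find("\n", start, end)
    if newline == -1:
        raise AssertionError("CHECK-NEXT: there is no next line")
    line_start = newline + 1
    next_newline = text.find("\n", line_start, end)
    line_end = end if next_newline == -1 else next_newline
    m = re.compile(pattern).search(text, line_start, line_end)
    if m is None:
        raise AssertionError("CHECK-NEXT: no match on the next line")
    return m.start(), m.end()
```

For the input "foo bar\nbar\n" with a previous match on "foo", the wide variant stops at the "bar" on the first line and reports an error, while the narrow variant skips it and matches the "bar" on the second line.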
 

> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> is not executed immediately; instead the next non-NOT directive is
> executed first, and the start of that directive's match range becomes
> the end of the NOT Group's search range.

Based on the following, that wording is not quite right when a DAG group follows, so there should probably be some note about that here.

 
>   (If the next directive is
> LABEL, it has already executed and has a match range, which is already
> the end of the search range.)  After the NOT Group's search range is
> defined, each NOT directive scans the range for a match, and fails if
> a match is found.
>
> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
> is not executed immediately; instead the next non-DAG directive is
> executed first, and the start of that directive's match range becomes
> the end of the DAG Group's search range.

That's definitely a change from the current behavior.  Currently, the DAG group finds its own end based on the farthest match.
 
>   If the next directive is
> CHECK-NOT, the end of the DAG Group's search range is
> unaffected.

Does "unaffected" mean that it's as if there's no following directive?  So the search range would end at the next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?
 
> (This might or might not be FileCheck's historical
> behavior; I didn't check.)  After the DAG Group's search range is
> defined, each DAG directive scans the range for a match, and fails if
> a match is not found.  Per The Rule, match ranges for DAG directives
> may not overlap. (This is not historical FileCheck behavior, and the
> bug Joel Denny wanted to fix.)  After all DAG directives run, the
> match range for the entire DAG Group extends from the start of the
> earliest match to the end of the latest match.  The end of that match
> range becomes the start of the search range for subsequent directives.

That last sentence contradicts the first few sentences: the subsequent directive has already been matched.

One point not addressed here is the start of the DAG group's search range.  Currently, if the DAG group is preceded by a NOT group preceded by a DAG group, the last DAG group's search range starts at the start of the first DAG group's match range.  Any matches in the first DAG group's match range produce a reordering error.  This is somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned earlier: the search ranges permit invalid match ranges and then complain about them in an effort to diagnose mistakes.  However, that restricts what can be matched.
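
A minimal sketch of that trade-off for a DAG directive that follows DAG, NOT. Both functions and their names are hypothetical models of the two options, not FileCheck code: the first starts searching at the start of the first DAG group's match range and diagnoses a match found inside it, the second starts searching after that range.

```python
import re


def dag_after_not_diagnosing(text, pattern, first_group, end):
    """Search from the first group's start; error if the match lands in it."""
    group_start, group_end = first_group
    m = re.compile(pattern).search(text, group_start, end)
    if m is None:
        raise AssertionError("CHECK-DAG: no match")
    if m.start() < group_end:
        raise AssertionError("CHECK-DAG: match found before the CHECK-NOT")
    return m.start(), m.end()


def dag_after_not_skipping(text, pattern, first_group, end):
    """Search only after the first group's match range."""
    _, group_end = first_group
    m = re.compile(pattern).search(text, group_end, end)
    if m is None:
        raise AssertionError("CHECK-DAG: no match")
    return m.start(), m.end()
```

For the input "a b a", where the first DAG group has matched "a b" (offsets 0 to 3), a later DAG on "a" fails in the diagnosing variant but matches the final "a" in the skipping variant.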

I'm not claiming that either behavior is best.  It's not clear to me.  The best use of DAG-NOT-DAG is very confusing to me.  An effort to prescribe the right semantics to it needs to be informed by real use cases, in my opinion.
 

> Observations
> ------------
>
> A CHECK-NOT still separates surrounding CHECK-DAG directives into
> disjoint groups, and does not permit matches from the two groups to
> overlap. DAG was originally implemented to detect and diagnose an
> overlap, but this worked only for the first DAG after a NOT. This can
> lead to counter-intuitive behavior and potentially makes certain kinds
> of matches impossible.

I definitely agree it shouldn't be just the first DAG.  The reordering detection should happen for all consecutive DAGs after the NOT or none of them.
 

> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
> behavior, but it's unlikely to be useful.

I believe they had predictable behavior before (their search ranges started at the end of the match range for the entire CHECK-DAG), but it's different with the above description (they define the end of the search range for the preceding CHECK-DAG group).

Thanks.

Joel
 

Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Awesome, thanks for helping to improve the conceptual model here, I agree that this is really important - particularly for a tool like FileCheck.

One observation: if certain combinations of directives have dubious or surprising behavior, it is perfectly fine for FileCheck to reject them.
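
Rejecting a combination could be as simple as a pass over the directive kinds before any matching happens. The sketch below is purely illustrative: which combinations to reject is not decided in this thread, so the single rule here (CHECK-SAME or CHECK-NEXT directly after CHECK-DAG, which the RFC calls unlikely to be useful) and the function name are assumptions.

```python
def dubious_sequences(directives):
    """Return (index, message) pairs for directive orders this sketch rejects."""
    problems = []
    for i in range(1, len(directives)):
        previous, current = directives[i - 1], directives[i]
        # Hypothetical policy: SAME/NEXT right after a DAG group is rejected.
        if previous == "CHECK-DAG" and current in ("CHECK-SAME", "CHECK-NEXT"):
            problems.append((i, f"{current} directly after CHECK-DAG"))
    return problems
```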

-Chris


> On May 24, 2018, at 6:46 AM, via llvm-dev <[hidden email]> wrote:
>
> Background
> ----------
>
> FileCheck [0] is a cornerstone testing tool for the LLVM project.  It
> has grown new features over the years to meet new needs, but these
> sometimes have surprising and counter-intuitive behavior [1].  This
> has become even more evident in Joel Denny's recent quest to repair
> what seemed like an obvious defect [2] but which led me to the
> conclusion [3] that FileCheck sorely needed a clear, intuitive
> conceptual model.  And then someone to make it work that way (hi
> Joel!).
>
> Basic Conceptual Model
> ----------------------
>
> FileCheck should operate on the basis of these three fundamental
> concepts.
>
> (1) Search range.  This is some substring of the input text where one
> or more directives will do their pattern-matching magic.
>
> (2) Match range.  This is a substring of a search range where a
> directive (or in one case, a group of directives) has matched a
> pattern.
>
> (3) Directive groups.  These are sequences of adjacent directives that
> operate in a related way on a search range.  Directives within a group
> are processed in order, except as noted in the directive description.
>
> Finally we add The Rule:  No match ranges may overlap.
>
> (This is largely formalizing what FileCheck already does, except that
> it didn't have The Rule with respect to DAG matches.  That's the bug
> that Joel was originally trying to fix, until I stuck my nose into
> it.)
>
> Directive Descriptions Based On Conceptual Model
> ------------------------------------------------
>
> Given the conceptual model, all directives can be defined in terms of
> it. This is possibly going overboard with the formalism but hey, we're
> all compiler geeks here.
>
> CHECK: Scans the search range for a pattern match. Fails if no match
> is found.  The end of the match range becomes the start of the search
> range for subsequent directives.
>
> CHECK-SAME: Like CHECK, plus there must be zero newlines prior to the
> start of the match range.
>
> CHECK-NEXT: Like CHECK, plus there must be exactly one newline prior
> to the start of the match range.
>
> CHECK-LABEL: All LABEL directives are processed before any other
> directives.  These directives have two effects.  First, they act like
> CHECK directives, but also partition the input text into disjoint
> search ranges, delimited by the match ranges of the LABEL directives.
> Second, they partition the remaining directives into Label Groups,
> each of which operates on the corresponding search range.  For truly
> pedantic formalism, we can say there are implicit LABEL directives
> matching the start and end of the entire input text, thus all
> non-LABEL directives are always in some Label Group.
>
> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
> is not executed immediately; instead the next non-NOT directive is
> executed first, and the start of that directive's match range becomes
> the end of the NOT Group's search range.  (If the next directive is
> LABEL, it has already executed and has a match range, which is already
> the end of the search range.)  After the NOT Group's search range is
> defined, each NOT directive scans the range for a match, and fails if
> a match is found.
>
> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
> is not executed immediately; instead the next non-DAG directive is
> executed first, and the start of that directive's match range becomes
> the end of the DAG Group's search range.  If the next directive is
> CHECK-NOT, the end of the DAG Group's search range is
> unaffected. (This might or might not be FileCheck's historical
> behavior; I didn't check.)  After the DAG Group's search range is
> defined, each DAG directive scans the range for a match, and fails if
> a match is not found.  Per The Rule, match ranges for DAG directives
> may not overlap. (This is not historical FileCheck behavior, and the
> bug Joel Denny wanted to fix.)  After all DAG directives run, the
> match range for the entire DAG Group extends from the start of the
> earliest match to the end of the latest match.  The end of that match
> range becomes the start of the search range for subsequent directives.
>
> Observations
> ------------
>
> A CHECK-NOT still separates surrounding CHECK-DAG directives into
> disjoint groups, and does not permit matches from the two groups to
> overlap. DAG was originally implemented to detect and diagnose an
> overlap, but this worked only for the first DAG after a NOT. This can
> lead to counter-intuitive behavior and potentially makes certain kinds
> of matches impossible.
>
> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
> behavior, but it's unlikely to be useful.  Putting SAME or NEXT as the
> first directive in a file likewise has defined behavior, matching
> precisely the first or second line (respectively) of the input text.
>
>
> References
> ----------
> [0] https://llvm.org/docs/CommandGuide/FileCheck.html
> [1] https://www.youtube.com/watch?v=4rhW8knj0L8
> [2] https://lists.llvm.org/pipermail/llvm-dev/2018-May/123010.html
> [3] https://reviews.llvm.org/D47106
>

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
In reply to this post by 韩玉 via llvm-dev
Thanks Joel and Chris, comments inline.

>> CHECK: Scans the search range for a pattern match. Fails if no match
>> is found.  The end of the match range becomes the start of the search
>> range for subsequent directives.
>>
>> CHECK-SAME: Like CHECK, plus there must be zero newlines prior to the
>> start of the match range.
>
> ... within the search range.

Yes, thanks.

> Should it be possible for CHECK-SAME match range to include newlines?
 
It is possible to write a regex that matches newlines.  Doing that in
CHECK-SAME seems a bit odd but I don't think it's worth trying to forbid
it.

>> CHECK-NEXT: Like CHECK, plus there must be exactly one newline prior
>> to the start of the match range.
>
> ... within the search range.

Again yes.

> Your choice to talk about the match range rather than the search range
> for CHECK-SAME and CHECK-NEXT implies you like the current behavior
> that extends the search range beyond these match range restrictions and
> then complains if the match range restrictions aren't met.  For example,
> CHECK-SAME searches past the newline and then complains if the match
> range starts after the newline.  Is that what you prefer?
>
> I'd note that, in the case of CHECK-NEXT, that choice can restrict what
> CHECK-NEXT can match.  That is, it will complain about a match on the
> previous line rather than skip it and look on the next line.

Ah, so we could define CHECK-NEXT as: move the start of the search
range past the first newline, then behaves as CHECK-SAME?

But, appending {{.*$}} to the previous pattern should have the same
effect if you have a CHECK-NEXT that runs into that problem.  And I
do think it's valuable for SAME and NEXT to tell you they found
matches but not on the line you asked for.  So I'd prefer to leave
these defined as they are.
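
To make the trade-off concrete, here is a minimal Python sketch of the CHECK, CHECK-SAME and CHECK-NEXT definitions quoted above. It is an illustration only, not FileCheck's implementation: the helper names `check`, `check_same` and `check_next` are hypothetical, Python's `re` module stands in for FileCheck's regex engine, and `re.MULTILINE` is assumed so that `$` matches at the end of a line.

```python
import re

def check(text, start, pattern):
    """CHECK: first match at or after `start`; returns (match_begin, match_end)."""
    m = re.compile(pattern, re.MULTILINE).search(text, start)
    if m is None:
        raise AssertionError(f"CHECK: no match for {pattern!r}")
    return m.start(), m.end()

def _check_newlines(text, start, pattern, wanted, name):
    # Match exactly like CHECK, then count the newlines that lie between the
    # start of the search range and the start of the match range.
    begin, end = check(text, start, pattern)
    found = text.count("\n", start, begin)
    if found != wanted:
        raise AssertionError(
            f"{name}: {found} newline(s) before the match, expected {wanted}")
    return begin, end

def check_same(text, start, pattern):
    return _check_newlines(text, start, pattern, 0, "CHECK-SAME")

def check_next(text, start, pattern):
    return _check_newlines(text, start, pattern, 1, "CHECK-NEXT")

text = "x = a + a\ny = a\n"

# CHECK: x = a  followed by  CHECK-NEXT: a
# The first "a" after the CHECK match is still on the first line, so the
# CHECK-NEXT model complains instead of skipping ahead to the next line.
_, end = check(text, 0, r"x = a")
try:
    check_next(text, end, r"a")
    next_result = "matched"
except AssertionError as err:
    next_result = str(err)

# Workaround from the message: append .*$ to the previous pattern so its
# match range swallows the rest of the line.
_, end = check(text, 0, r"x = a.*$")
fixed = check_next(text, end, r"a")
```

In this model the failing CHECK-NEXT still reports that it found a match, just not on the expected line, which is the diagnostic behavior argued for above.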

>> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
>> is not executed immediately; instead the next non-NOT directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the NOT Group's search range.
>
> Based on the following, that wording is not quite right when a DAG
> group follows, so there should probably be some note about that here.

So, "the next non-NOT directive or DAG group is executed ... the start
of that directive or group's match range ..." ?
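
A small Python sketch of the deferred execution described here, under stated assumptions: the function name `check_not_then` is hypothetical, Python's `re` stands in for FileCheck's matcher, and only the case of a single plain CHECK following the NOT group is modeled (for a following DAG group, the start of the group's match range would play the same role).

```python
import re

def check_not_then(text, start, not_patterns, next_pattern):
    """Model of a NOT group followed by one plain CHECK directive.

    The following directive runs first; the start of its match range then
    becomes the end of the NOT group's search range.
    """
    nxt = re.compile(next_pattern).search(text, start)
    if nxt is None:
        raise AssertionError(f"CHECK: no match for {next_pattern!r}")
    for pat in not_patterns:
        # Each NOT directive scans only [start, nxt.start()).
        bad = re.compile(pat).search(text, start, nxt.start())
        if bad is not None:
            raise AssertionError(
                f"CHECK-NOT: {pat!r} found at offset {bad.start()}")
    return nxt.span()

text = "call foo\nnop\ncall bar\nspill\n"
start = re.search("call foo", text).end()

# "spill" only appears after "call bar", outside the NOT search range.
ok_span = check_not_then(text, start, ["spill"], "call bar")

# "nop" sits between the two calls, so this NOT group fails.
try:
    check_not_then(text, start, ["nop"], "call bar")
    nop_result = "passed"
except AssertionError as err:
    nop_result = str(err)
```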
 

>>  (If the next directive is
>> LABEL, it has already executed and has a match range, which is already
>> the end of the search range.)  After the NOT Group's search range is
>> defined, each NOT directive scans the range for a match, and fails if
>> a match is found.
>>
>> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
>> is not executed immediately; instead the next non-DAG directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the DAG Group's search range.
>
> That's definitely a change from the current behavior.  Currently, the
> DAG group finds its own end based on the farthest match.

Oh good catch.  Copy-thinko from the NOT description.  NOT is the only
kind of directive that has deferred execution.
 
>>  If the next directive is
>> CHECK-NOT, the end of the DAG Group's search range is
>> unaffected.
>
> Unaffected means that it's as if there's no following directive?  So
> next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's
> a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?

If DAG doesn't have deferred execution then the end of the search range
is the next (explicit or implicit) CHECK-LABEL point, end of story.
 
>>  After all DAG directives run, the
>> match range for the entire DAG Group extends from the start of the
>> earliest match to the end of the latest match.  The end of that match
>> range becomes the start of the search range for subsequent directives.
>
> That last sentence contradicts the first few sentences: the subsequent
> directive has already been matched.

Right, fixing the previous bug means this sentence says the right thing.
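
As an illustration of the corrected DAG description, the following Python sketch models a DAG group that executes immediately, applies The Rule to its own matches, and reports the group's match range. The names `overlaps` and `match_dag_group` are hypothetical, the `end` argument stands in for the next (explicit or implicit) CHECK-LABEL point, and the retry strategy after an overlap is an assumption of this sketch rather than a statement about FileCheck's code.

```python
import re

def overlaps(a, b):
    """True if half-open ranges a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def match_dag_group(text, start, end, patterns):
    """Match each DAG pattern in [start, end) without overlapping earlier matches.

    Assumes `patterns` is non-empty.  Returns (per-directive match ranges,
    match range of the whole group).
    """
    spans = []
    for pat in patterns:
        rx = re.compile(pat)
        pos = start
        while True:
            m = rx.search(text, pos, end)
            if m is None:
                raise AssertionError(
                    f"CHECK-DAG: no non-overlapping match for {pat!r}")
            clashes = [s for s in spans if overlaps(m.span(), s)]
            if not clashes:
                spans.append(m.span())
                break
            # Retry after the earlier match that this candidate collided with.
            pos = max(s[1] for s in clashes)
    group = (min(s[0] for s in spans), max(s[1] for s in spans))
    return spans, group

text = "store a\nstore b\n"

# Order in the test file need not follow order in the input.
any_order, any_group = match_dag_group(text, 0, len(text), ["store b", "store"])

# Two identical patterns must land on two different pieces of text.
twice, twice_group = match_dag_group(text, 0, len(text), ["store", "store"])
```

The end of the returned group range is what a subsequent directive would use as the start of its search range.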

> One point not addressed here is the start of the DAG group's search
> range.  Currently, if the DAG group is preceded by a NOT group
> preceded by a DAG group, the last DAG group's search range starts at
> the start of the first DAG group's match range.  Any matches in the
> first DAG group's match range produces a reordering error.  This is
> somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned
> earlier: the search ranges permit invalid match ranges and then
> complain about them in an effort to diagnose mistakes.  However, that
> restricts what can be matched.
>
> I'm not claiming that either behavior is best.  It's not clear to me.
> The best use of DAG-NOT-DAG is very confusing to me.  An effort to
> prescribe the right semantics to it needs to be informed by real use
> cases, in my opinion.

I did some email archaeology, and found this exchange on llvm-dev between
myself and Michael Liao (original DAG implementor) 13 Mar 2016:

pr> Commentary in FileCheck itself can easily be interpreted to mean the
pr> intent was that -NOT would scan the region between the points defined
pr> by the last match of the preceding DAG group (which the code gets
pr> right) and the first match of the following DAG group (which the code
pr> does not get right). But the commentary is not really that clear.

ml> That's the intention of the original design. CHECK-NOT never occurs
ml> before we find the start point (the start of file by default) and end
ml> point (the end of file by default.) All other points are through other
ml> CHECKs, including CHECK-DAG but excluding CHECK-NOT.  So that, if you
ml> use CHECK-NOT, you need to be aware of how that range is defined. As
ml> CHECK-DAG pattern matches a group of pattern in any order, the match
ml> point of that group of CHECK-DAG (a consecutive CHECK-DAGs without any
ml> other CHECKs interleaved) is always the point where one of that group
ml> is matched. If one CHECK-DAG is separated by any other CHECKs
ml> (including CHECK-NOT) from preceding CHECK-DAGs, it is not in the
ml> preceding group of CHECK-DAG. That's way how we could check the order
ml> where a group of patterns should never occur before another group of
ml> patterns.

So, I believe my specification for the interaction between DAG and NOT
does match the original intent.  Regarding the diagnostic aid, it does
make some sequences really hard to match, and I don't have a general
idea how to fix that (versus {{.*$}} for the similar NEXT situation).
It's also a reasonable continuation of the behavior of plain CHECK, in
that a second CHECK doesn't search the prior text to complain about
ordering issues.
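
The point about plain CHECK can be seen in a few lines of Python. This is a sketch with a hypothetical helper name (`run_checks`) and Python's `re` in place of FileCheck's matcher: each match's end becomes the next search start, so text before it is never examined again.

```python
import re

def run_checks(text, patterns):
    """Run a sequence of plain CHECK patterns; return their match ranges."""
    spans = []
    pos = 0
    for pat in patterns:
        m = re.compile(pat).search(text, pos)
        if m is None:
            raise AssertionError(
                f"CHECK: no match for {pat!r} after offset {pos}")
        spans.append(m.span())
        pos = m.end()  # end of match range = start of next search range
    return spans

text = "store b\nstore a\n"

in_order = run_checks(text, ["store b", "store a"])

# Reversed directives: the second CHECK simply finds nothing after the
# first match; it does not look backwards and report a reordering.
try:
    run_checks(text, ["store a", "store b"])
    out_of_order = "matched"
except AssertionError as err:
    out_of_order = str(err)
```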

SAME and NEXT are, I think, a different category; that has to do with
line-breaks that are not explicitly described by user-written patterns,
and my own experience is that it's helpful to be told that something
matches but isn't on the line I expected.

So, I don't have a definitive answer for changing DAG-NOT-DAG, but
intuitively the spec makes sense to me and my inclination is to think
the diagnostic isn't hugely valuable.

>> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
>> behavior, but it's unlikely to be useful.
>
> I believe they had predictable behavior before (their search ranges
> started at the end of the match range for the entire CHECK-DAG), but
> it's different with the above description (they define the end of the
> search range for the preceding CHECK-DAG group).

You're right, it was predictable before, and I am fixing the bug where
the directive after DAG gets executed first so the range isn't affected.

Taking Chris Lattner's point into consideration, we might want to say
SAME or NEXT after a DAG should be an error.  But we could also leave
that for a later round.

--paulr

P.S. I am away next week but expect to keep an eye on the lists.


Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Hi Paul,

On Fri, May 25, 2018 at 10:40 AM, <[hidden email]> wrote:
>> Should it be possible for CHECK-SAME match range to include newlines?
>
> It is possible to write a regex that matches newlines.  Doing that in
> CHECK-SAME seems a bit odd but I don't think it's worth trying to forbid
> it.

OK, so SAME has the sense of matching *starting* on the same line rather than *within* the same line.  Seems fine.

>> I'd note that, in the case of CHECK-NEXT, that choice can restrict what
>> CHECK-NEXT can match.  That is, it will complain about a match on the
>> previous line rather than skip it and look on the next line.
>
> Ah, so we could define CHECK-NEXT as: move the start of the search
> range past the first newline, then behaves as CHECK-SAME?

Right.

> But, appending {{.*$}} to the previous pattern should have the same
> effect if you have a CHECK-NEXT that runs into that problem.

So the current behavior is more flexible even if less intuitive at first glance (to me, at least).  It's also more consistent with the way search ranges work in general.

I think this subtlety and this tip should be mentioned in the user documentation. Also, because sometimes the previous directive isn't nearby or could be one of many directives due to multiple check prefixes, the docs should also offer this formula:

CHECK-SAME: {{.*}}
CHECK-NEXT: your pattern

> And I do think it's valuable for SAME and NEXT to tell you they found
> matches but not on the line you asked for.  So I'd prefer to leave
> these defined as they are.

Agreed.

>>> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
>>> is not executed immediately; instead the next non-NOT directive is
>>> executed first, and the start of that directive's match range becomes
>>> the end of the NOT Group's search range.
>>
>> Based on the following, that wording is not quite right when a DAG
>> group follows, so there should probably be some note about that here.
>
> So, "the next non-NOT directive or DAG group is executed ... the start
> of that directive or group's match range ..." ?

Sounds good.
 
>>>  (If the next directive is
>>> LABEL, it has already executed and has a match range, which is already
>>> the end of the search range.)  After the NOT Group's search range is
>>> defined, each NOT directive scans the range for a match, and fails if
>>> a match is found.
>>>
>>> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
>>> is not executed immediately; instead the next non-DAG directive is
>>> executed first, and the start of that directive's match range becomes
>>> the end of the DAG Group's search range.
>>
>> That's definitely a change from the current behavior.  Currently, the
>> DAG group finds its own end based on the farthest match.
>
> Oh good catch.  Copy-thinko from the NOT description.  NOT is the only
> kind of directive that has deferred execution.
>
>>>  If the next directive is
>>> CHECK-NOT, the end of the DAG Group's search range is
>>> unaffected.
>>
>> Unaffected means that it's as if there's no following directive?  So
>> next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's
>> a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?
>
> If DAG doesn't have deferred execution then the end of the search range
> is the next (explicit or implicit) CHECK-LABEL point, end of story.
>
>>>  After all DAG directives run, the
>>> match range for the entire DAG Group extends from the start of the
>>> earliest match to the end of the latest match.  The end of that match
>>> range becomes the start of the search range for subsequent directives.
>>
>> That last sentence contradicts the first few sentences: the subsequent
>> directive has already been matched.
>
> Right, fixing the previous bug means this sentence says the right thing.

Yep, I agree it's fixed.
 

>> One point not addressed here is the start of the DAG group's search
>> range.  Currently, if the DAG group is preceded by a NOT group
>> preceded by a DAG group, the last DAG group's search range starts at
>> the start of the first DAG group's match range.  Any matches in the
>> first DAG group's match range produces a reordering error.  This is
>> somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned
>> earlier: the search ranges permit invalid match ranges and then
>> complain about them in an effort to diagnose mistakes.  However, that
>> restricts what can be matched.
>>
>> I'm not claiming that either behavior is best.  It's not clear to me.
>> The best use of DAG-NOT-DAG is very confusing to me.  An effort to
>> prescribe the right semantics to it needs to be informed by real use
>> cases, in my opinion.
>
> I did some email archaeology, and found this exchange on llvm-dev between
> myself and Michael Liao (original DAG implementor) 13 Mar 2016:
>
> pr> Commentary in FileCheck itself can easily be interpreted to mean the
> pr> intent was that -NOT would scan the region between the points defined
> pr> by the last match of the preceding DAG group (which the code gets
> pr> right) and the first match of the following DAG group (which the code
> pr> does not get right). But the commentary is not really that clear.
>
> ml> That's the intention of the original design. CHECK-NOT never occurs
> ml> before we find the start point (the start of file by default) and end
> ml> point (the end of file by default.) All other points are through other
> ml> CHECKs, including CHECK-DAG but excluding CHECK-NOT.  So that, if you
> ml> use CHECK-NOT, you need to be aware of how that range is defined. As
> ml> CHECK-DAG pattern matches a group of pattern in any order, the match
> ml> point of that group of CHECK-DAG (a consecutive CHECK-DAGs without any
> ml> other CHECKs interleaved) is always the point where one of that group
> ml> is matched. If one CHECK-DAG is separated by any other CHECKs
> ml> (including CHECK-NOT) from preceding CHECK-DAGs, it is not in the
> ml> preceding group of CHECK-DAG. That's way how we could check the order
> ml> where a group of patterns should never occur before another group of
> ml> patterns.

Thanks for digging that up.
 
> So, I believe my specification for the interaction between DAG and NOT
> does match the original intent.

I can't argue there.

> Regarding the diagnostic aid, it does
> make some sequences really hard to match,

Theoretically, I agree.  But do you know of a real use case where it's a problem?

> and I don't have a general
> idea how to fix that (versus {{.*$}} for the similar NEXT situation).

Me neither.

> It's also a reasonable continuation of the behavior of plain CHECK, in
> that a second CHECK doesn't search the prior text to complain about
> ordering issues.

Good point.

The main difference I see is that DAG is specifically about unordered text (and it might vary from run to run in the parallel programs I'm thinking of), so the chances of accidental reordering might be higher than with plain CHECK.

> SAME and NEXT are, I think, a different category; that has to do with
> line-breaks that are not explicitly described by user-written patterns,
> and my own experience is that it's helpful to be told that something
> matches but isn't on the line I expected.

Agreed.

> So, I don't have a definitive answer for changing DAG-NOT-DAG, but
> intuitively the spec makes sense to me and my inclination is to think
> the diagnostic isn't hugely valuable.

You might be right. Again, I find it hard to think of solid arguments about DAG-NOT-DAG because it seems like such an unlikely use case.

You mentioned Chris Lattner's point.  DAG-NOT-DAG was the first thing that came to my mind.

DAG-NOT-DAG is a weird case where (1) you want two or more consecutive but non-overlapping DAG groups, and (2) you want to exclude certain patterns in between.  Strangely, with existing directives, you cannot accomplish #1 without #2, right?  Why do those go together?  It feels like a use case that arose from an accident in a language specification and not from a real need.

Well, maybe the best approach is just to go with a clear specification (as you have now) and hope for the best. 


>>> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
>>> behavior, but it's unlikely to be useful.
>>
>> I believe they had predictable behavior before (their search ranges
>> started at the end of the match range for the entire CHECK-DAG), but
>> it's different with the above description (they define the end of the
>> search range for the preceding CHECK-DAG group).
>
> You're right, it was predictable before, and I am fixing the bug where
> the directive after DAG gets executed first so the range isn't affected.

Makes sense, so your specification keeps the old behavior.

> Taking Chris Lattner's point into consideration, we might want to say
> SAME or NEXT after a DAG should be an error.  But we could also leave
> that for a later round.

With your specification, I think the meaning of those cases is clear and potentially useful.  The only potential problem I see is that people who haven't studied your specification carefully might think SAME and NEXT constrain the end of the search range of the DAG group.  It might be worthwhile to emphasize in the docs that, no, really, DAG does not work that way.

Actually, I wish there were a way to do that for the sake of matching unordered text on a single line.  SAME after DAGs is as close as I can get to that.  Maybe we need a CHECK-DAG-SAME.

Speaking of wish lists, I've been thinking it would be nice to have some way to apply a NOT pattern among a range of matches:

CHECK-NOT-PUSH: pattern
...
CHECK-NOT-POP:

For example, with a pattern of {{.}} and DAGs in between PUSH and POP, I can check for an unordered set of strings while rejecting any other text among them. (Now that's a use case for DAG plus NOT that seems very clear to me.)

Like normal NOT, PUSH's action would be deferred until the next directive or group.  At that point, it would push the specified NOT pattern along with the next non-NOT directive's match range end as its search range start. POP would pop and apply those using the previous non-NOT directive's match range start as its search range end.  The Rule would apply to its matches.  PUSH and POP would be like normal NOT in terms of their effect on neighboring directives: each would terminate any preceding DAG group, and, because there's no match in a successful run, each would have no effect on any neighboring directive's search range.  PUSH and POP with no directives in between other than those in the NOT family would be an error.

Your formal specification of FileCheck makes it straightforward to describe this behavior precisely.
 

> --paulr
>
> P.S. I am away next week but expect to keep an eye on the lists.


Sure.  Have fun.  No rush.

Thanks.

Joel



Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev

A few replies, then I'll post a revised spec v2 which ought to incorporate all the other feedback.  If I missed something, give a shout.

 

> Actually, I wish there were a way to do that [constrain DAG to a single line] for the sake of matching unordered text on a single line.  SAME after DAGs is as close as I can get to that.  Maybe we need a CHECK-DAG-SAME.

Hmmm. You know, there were cases where people wrote tests that tried to use combo suffixes like that, but of course those directives were not actually executed (because they were treated like a SAME with prefix CHECK-DAG (or whatever), rather than CHECK with two suffixes).  That's why some while ago I added checks within FileCheck to try to detect duplicate suffixes and complain.  There were, I don't remember, a couple dozen or so cases.

After we get the spec done and The Rule implemented, if you want to take a run at some combo prefixes that had a use-case for you, that could be interesting.

> Speaking of wish lists, I've been thinking it would be nice to have some way to apply a NOT pattern among a range of matches:
>
> CHECK-NOT-PUSH: pattern

 

Well, there is the `--implicit-check-not` option, which applies to the entire input text; it looks like you want it just for a subrange, though?  If you aren't talking about DAGs, then repeating a CHECK-NOT between the other directives would work although it's pretty tedious (voice of experience) and easy to mess up (voice of experience).  If you have an example where CHECK-DAG-NOT would actually be useful, the formalism I'm going for does seem like it would help.
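
For the non-DAG case, the tedious "repeat a CHECK-NOT between the other directives" approach can be modeled as below. This is only a Python sketch with a hypothetical helper name (`checks_with_not_between`) and Python's `re` as the matcher; it does not model `--implicit-check-not` itself or the proposed PUSH/POP directives.

```python
import re

def checks_with_not_between(text, check_patterns, not_pattern):
    """Plain CHECKs in sequence, with one NOT pattern applied to every gap
    between the end of one match and the start of the next."""
    forbidden = re.compile(not_pattern)
    spans = []
    pos = 0
    for pat in check_patterns:
        m = re.compile(pat).search(text, pos)
        if m is None:
            raise AssertionError(f"CHECK: no match for {pat!r}")
        if spans:
            # Equivalent to writing the same CHECK-NOT before this directive.
            bad = forbidden.search(text, pos, m.start())
            if bad is not None:
                raise AssertionError(
                    f"CHECK-NOT: {not_pattern!r} found at offset {bad.start()}")
        spans.append(m.span())
        pos = m.end()
    return spans

directives = ["begin", "load", "store", "end"]
clean = "begin\nload\nstore\nend\n"
noisy = "begin\nload\nwarning: spill\nstore\nend\n"

clean_spans = checks_with_not_between(clean, directives, "warning")

try:
    checks_with_not_between(noisy, directives, "warning")
    noisy_result = "passed"
except AssertionError as err:
    noisy_result = str(err)
```

Writing the loop once is what a subrange-scoped NOT would buy; in a test file the equivalent today is one CHECK-NOT line per gap.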

 

--paulr

 

From: Joel E. Denny [mailto:[hidden email]]
Sent: Saturday, May 26, 2018 12:11 PM
To: Robinson, Paul
Cc: [hidden email]
Subject: Re: [RFC] Formalizing FileCheck Features

 

Hi Paul,

 

On Fri, May 25, 2018 at 10:40 AM, <[hidden email]> wrote:

> Should it be possible for CHECK-SAME match range to include newlines?
 
It is possible to write a regex that matches newlines.  Doing that in
CHECK-SAME seems a bit odd but I don't think it's worth trying to forbid
it.

 

OK, so SAME has the sense of matching *starting* on the same line rather than *within* the same line.  Seems fine.

 

> I'd note that, in the case of CHECK-NEXT, that choice can restrict what
> CHECK-NEXT can match.  That is, it will complain about a match on the
> previous line rather than skip it and look on the next line.

Ah, so we could define CHECK-NEXT as: move the start of the search
range past the first newline, then behaves as CHECK-SAME?

 

Right. 

 

But, appending {{.*$}} to the previous pattern should have the same
effect if you have a CHECK-NEXT that runs into that problem.

 

So the current behavior is more flexible even if less intuitive at first glance (to me, at least).  It's also more consistent with the way search ranges work in general.

 

I think this subtlety and this tip should be mentioned in the user documentation. Also, because sometimes the previous directive isn't nearby or could be one of many directives due to multiple check prefixes, the docs should also offer this formula: 

 

CHECK-SAME: {{.*}}

CHECK-NEXT: your pattern

 

And I
do think it's valuable for SAME and NEXT to tell you they found
matches but not on the line you asked for. So I'd prefer to leave these defined as they are.

 

Agreed.

 

>> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
>> is not executed immediately; instead the next non-NOT directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the NOT Group's search range.
>
> Based on the following, that wording is not quite right when a DAG
> group follows, so there should probably be some note about that here.

So, "the next non-NOT directive or DAG group is executed ... the start
of that directive or group's match range ..." ?

 

Sounds good.

 

>>  (If the next directive is
>> LABEL, it has already executed and has a match range, which is already
>> the end of the search range.)  After the NOT Group's search range is
>> defined, each NOT directive scans the range for a match, and fails if
>> a match is found.
>>
>> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
>> is not executed immediately; instead the next non-DAG directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the DAG Group's search range.
>
> That's definitely a change from the current behavior.  Currently, the
> DAG group finds its own end based on the farthest match.

Oh good catch.  Copy-thinko from the NOT description.  NOT is the only
kind of directive that has deferred execution.
 
>>  If the next directive is
>> CHECK-NOT, the end of the DAG Group's search range is
>> unaffected.
>
> Unaffected means that it's as if there's no following directive?  So
> next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's
> a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?

If DAG doesn't have deferred execution then the end of the search range
is the next (explicit or implicit) CHECK-LABEL point, end of story.

 

>>  After all DAG directives run, the
>> match range for the entire DAG Group extends from the start of the
>> earliest match to the end of the latest match.  The end of that match
>> range becomes the start of the search range for subsequent directives.
>
> That last sentence contradicts the first few sentences: the subsequent
> directive has already been matched.

Right, fixing the previous bug means this sentence says the right thing.

 

Yep, I agree it's fixed.

 


> One point not addressed here is the start of the DAG group's search
> range.  Currently, if the DAG group is preceded by a NOT group
> preceded by a DAG group, the last DAG group's search range starts at
> the start of the first DAG group's match range.  Any matches in the
> first DAG group's match range produces a reordering error.  This is
> somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned
> earlier: the search ranges permit invalid match ranges and then
> complain about them in an effort to diagnose mistakes.  However, that
> restricts what can be matched.
>
> I'm not claiming that either behavior is best.  It's not clear to me.
> The best use of DAG-NOT-DAG is very confusing to me.  An effort to
> prescribe the right semantics to it needs to be informed by real use
> cases, in my opinion.

I did some email archaeology, and found this exchange on llvm-dev between
myself and Michael Liao (original DAG implementor) 13 Mar 2016:

pr> Commentary in FileCheck itself can easily be interpreted to mean the
pr> intent was that –NOT would scan the region between the points defined
pr> by the last match of the preceding DAG group (which the code gets
pr> right) and the first match of the following DAG group (which the code
pr> does not get right). But the commentary is not really that clear.

ml> That's the intention of the original design. CHECK-NOT never occurs
ml> before we find the start point (the start of file by default) and end
ml> point (the end of file by default.) All other points are through other
ml> CHECKs, including CHECK-DAG but excluding CHECK-NOT.  So that, if you
ml> use CHECK-NOT, you need to be aware of how that range is defined. As
ml> CHECK-DAG pattern matches a group of pattern in any order, the match
ml> point of that group of CHECK-DAG (a consecutive CHECK-DAGs without any
ml> other CHECKs interleaved) is always the point where one of that pgroup
ml> is matched. If one CHECK-DAG is separated by any other CHECKs
ml> (including CHECK-NOT) from preceding CHECK-DAGs, it is not in the
ml> preceding group of CHECK-DAG. That's way how we could check the order
ml> where a group of patterns should never occur before another group of
ml> patterns.

 

Thanks for digging that up.

 

So, I believe my specification for the interaction between DAG and NOT
does match the original intent.

 

I can't argue there.

 

  Regarding the diagnostic aid, it does
make some sequences really hard to match,

 

Theoretically, I agree.  But do you know of a real use case where it's a problem?

 

and I don't have a general

idea how to fix that (versus {{.*$}} for the similar NEXT situation).

 

Me neither.

 

It's also a reasonable continuation of the behavior of plain CHECK, in
that a second CHECK doesn't search the prior text to complain about
ordering issues.

 

Good point.

 

The main difference I see is that DAG is specifically about unordered text (and it might vary from run to run in the parallel programs I'm thinking of), so the chances of accidental reordering might be higher than with plain CHECK.

 


SAME and NEXT are, I think, a different category; that has to do with
line-breaks that are not explicitly described by user-written patterns,
and my own experience is that it's helpful to be told that something
matches but isn't on the line I expected.

 

Agreed.

 


So, I don't have a definitive answer for changing DAG-NOT-DAG, but
intuitively the spec makes sense to me and my inclination is to think
the diagnostic isn't hugely valuable.

 

You might be right. Again, I find it hard to think of solid arguments about DAG-NOT-DAG because it seems like such an unlikely use case.

 

You mentioned Chris Lattner's point.  DAG-NOT-DAG was the first thing that came to my mind.

 

DAG-NOT-DAG is a weird case where (1) you want two or more consecutive but non-overlapping DAG groups, and (2) you want to exclude certain patterns in between.  Strangely, with existing directives, you cannot accomplish #1 without #2, right?  Why do those go together?  It feels like a use case that arose from an accident in a language specification and not from a real need.

 

Well, maybe the best approach is just to go with a clear specification (as you have now) and hope for the best. 

 


>> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
>> behavior, but it's unlikely to be useful.
>
> I believe they had predictable behavior before (their search ranges
> started at the end of the match range for the entire CHECK-DAG), but
> it's different with the above description (they define the end of the
> search range for the preceding CHECK-DAG group).

You're right, it was predictable before, and I am fixing the bug where
the directive after DAG gets executed first so the range isn't affected.

 

Makes sense, so your specification keeps the old behavior.

 

Taking Chris Lattner's point into consideration, we might want to say
SAME or NEXT after a DAG should be an error.  But we could also leave
that for a later round.

 

With your specification, I think the meaning of those cases is clear and potentially useful.  The only potential problem I see is that people who haven't studied your specification carefully might think SAME and NEXT constrain the end of the search range of the DAG group.  It might be worthwhile to emphasize in the docs that, no, really, DAG does not work that way.

 

Actually, I wish there were a way to do that for the sake of matching unordered text on a single line.  SAME after DAGs is as close as I can get to that.  Maybe we need a CHECK-DAG-SAME.

 

Speaking of wish lists, I've been thinking it would be nice to have some way to apply a NOT pattern among a range of matches:

 

CHECK-NOT-PUSH: pattern

...

CHECK-NOT-POP:

 

For example, with a pattern of {{.}} and DAGs in between PUSH and POP, I can check for an unordered set of strings while rejecting any other text among them. (Now that's a use case for DAG plus NOT that seems very clear to me.)

 

Like normal NOT, PUSH's action would be deferred until the next directive or group.  At that point, it would push the specified NOT pattern along with the next non-NOT directive's match range end as its search range start. POP would pop and apply those using the previous non-NOT directive's match range start as its search range end.  The Rule would apply to its matches.  PUSH and POP would be like normal NOT in terms of their effect on neighboring directives: each would terminate any preceding DAG group, and, because there's no match in a successful run, each would have no effect on any neighboring directive's search range.  PUSH and POP with no directives in between other than those in the NOT family would be an error.

 

Your formal specification of FileCheck makes it straightforward to describe this behavior precisely.

 


--paulr

P.S. I am away next week but expect to keep an eye on the lists.

 

Sure.  Have fun.  No rush.

 

Thanks.

 

Joel

 


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Spec for the model, version 2.  If this survives I'll start on
amendments to the FileCheck doc.
--paulr


Basic Conceptual Model
----------------------

FileCheck should operate on the basis of these three fundamental
concepts.

(1) Search range.  This is some substring of the input text where one
or more directives will do their pattern-matching magic.

(2) Match range.  This is a substring of a search range where a
directive (or in one case, a group of directives) has matched a
pattern.

(3) Directive groups.  These are sequences of adjacent directives that
operate in a related way on a search range.  Directives within a group
are processed in order, except as noted in the directive description.

Finally we add The Rule:  No match ranges may overlap.
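
As a rough illustration of the concepts above, here is a minimal Python sketch. It assumes search and match ranges can be modeled as half-open character offsets into the input text. The names `Range`, `overlaps`, and `obeys_the_rule` are made up for this example and are not part of FileCheck.

```python
from itertools import combinations
from typing import NamedTuple


class Range(NamedTuple):
    """A half-open [start, end) span of character offsets (hypothetical model)."""
    start: int
    end: int


def overlaps(a: Range, b: Range) -> bool:
    """True if two non-empty ranges share at least one character."""
    return a.start < b.end and b.start < a.end


def obeys_the_rule(match_ranges: list[Range]) -> bool:
    """The Rule: no two match ranges may overlap."""
    return not any(overlaps(a, b) for a, b in combinations(match_ranges, 2))
```

Under this model, two ranges that merely touch (one ends where the other starts) do not overlap, which is what lets the end of one match serve as the start of the next search range.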


Directive Descriptions Based On Conceptual Model
------------------------------------------------

Given the conceptual model, all directives can be defined in terms of
it.

CHECK: Scans the search range for a pattern match. Fails if no match
is found.  The end of the match range becomes the start of the search
range for subsequent directives.
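
A minimal sketch of the CHECK description, assuming Python regular expressions stand in for FileCheck patterns (they are not the same syntax). The function name `check` and its parameters are hypothetical.

```python
import re


def check(text, pattern, start, end):
    """CHECK: find the first match inside text[start:end].

    Returns the end of the match range, which becomes the start of the
    search range for whatever directive comes next.
    """
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK: no match for {pattern!r}")
    return m.end()


text = "foo bar foo baz"
pos = check(text, "foo", 0, len(text))    # matches the first "foo"
pos = check(text, "foo", pos, len(text))  # can only see the second "foo"
```

The second call cannot match the first "foo" again, because its search range starts where the first match range ended.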

CHECK-SAME: Like CHECK, plus there must be zero newlines within the
search range prior to the start of the match range.

CHECK-NEXT: Like CHECK, plus there must be exactly one newline within
the search range prior to the start of the match range.

Note: This definition means CHECK-NEXT will fail if the pattern
occurs both on the line where the search range starts, and on the
(expected) next line.  This can be avoided by putting a
`CHECK-SAME: {{.*}}` before the CHECK-NEXT.  We could also avoid
this by defining the CHECK-NEXT search range to be just the following
line of text.  We define CHECK-NEXT the way we do because it seems
valuable to diagnose mismatches that are simply on the wrong line,
and the problematic case is rare.
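
The SAME and NEXT definitions, and the corner case in the Note, can be sketched as follows. This is an illustration under assumptions, not FileCheck's code: Python regexes stand in for FileCheck patterns (so `.*` plays the role of `{{.*}}`), and all function names are hypothetical.

```python
import re


def check_with_newlines(text, pattern, start, end, required, name):
    """Like CHECK, plus a constraint on newlines before the match start."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"{name}: no match for {pattern!r}")
    found = text.count("\n", start, m.start())
    if found != required:
        raise AssertionError(
            f"{name}: match is {found} newline(s) past the search start, "
            f"expected {required}")
    return m.end()


def check_same(text, pattern, start, end):
    return check_with_newlines(text, pattern, start, end, 0, "CHECK-SAME")


def check_next(text, pattern, start, end):
    return check_with_newlines(text, pattern, start, end, 1, "CHECK-NEXT")


text = "add x, add y\nadd z\n"
after_check = 5  # end of a prior CHECK match on "add x"

# check_next(text, "add", after_check, len(text)) would fail here: the
# first "add" in the search range is still on the starting line.
pos = check_same(text, ".*", after_check, len(text))  # eat rest of the line
pos = check_next(text, "add", pos, len(text))         # now finds "add z"
```

The failure in the commented-out call is the diagnostic the Note argues is worth keeping: the pattern was found, just not on the expected line.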

CHECK-LABEL: All LABEL directives are processed before any other
directives.  These directives have three effects.  First, they act like
CHECK directives. Second, they partition the input text into disjoint
search ranges, delimited by the match ranges of the LABEL directives.
Third, they partition the remaining directives into Label Groups,
each of which operates on the corresponding search range.  For truly
pedantic formalism, we can say there are implicit LABEL directives
matching the start and end of the entire input text, thus all
non-LABEL directives are always in some Label Group and there is
really nothing special about the end of the input text.
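
The partitioning effect of LABEL can be sketched like this. It is a simplified model under assumptions: Python regexes stand in for FileCheck patterns, the implicit LABELs at the start and end of the input are modeled as empty match ranges, and `label_search_ranges` is a made-up name.

```python
import re


def label_search_ranges(text, label_patterns):
    """Match LABEL patterns in order, then derive the search ranges between them.

    Returns (label match ranges, search ranges); there is one more search
    range than there are explicit labels.
    """
    pos = 0
    matches = []
    for pattern in label_patterns:
        m = re.compile(pattern).search(text, pos)
        if m is None:
            raise AssertionError(f"CHECK-LABEL: no match for {pattern!r}")
        matches.append((m.start(), m.end()))
        pos = m.end()
    # Implicit LABELs at the very start and very end of the input.
    bounds = [(0, 0)] + matches + [(len(text), len(text))]
    search_ranges = [(prev_end, next_start)
                     for (_, prev_end), (next_start, _) in zip(bounds, bounds[1:])]
    return matches, search_ranges


text = "define @f\n  ret 1\ndefine @g\n  ret 2\n"
labels, ranges = label_search_ranges(text, ["@f", "@g"])
```

Each Label Group of non-LABEL directives would then run entirely inside its own entry of `ranges`.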

CHECK-NOT: A sequence of one or more consecutive NOT directives forms
a NOT Group. The group is not executed immediately; instead the next
non-NOT directive (or DAG Group, if the next directive is DAG) is
executed first, and the start of that directive's (or group's)
match range becomes the end of the NOT Group's search range.  (If the
next directive is LABEL, it has already executed and has a match range,
which is already the end of the search range.  If the NOT is the last
directive, the search range extends to the end of the input.)  After
the NOT Group's search range is defined, each NOT directive in the
group scans the range for a match, and fails if a match is found.
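
The deferred execution of a NOT Group can be sketched as below, for the simple case where the group is followed by a plain CHECK. Python regexes stand in for FileCheck patterns, and the function names are hypothetical.

```python
import re


def run_not_group(text, not_patterns, start, end):
    """Fail if any NOT pattern matches inside text[start:end]."""
    for pattern in not_patterns:
        if re.compile(pattern).search(text, start, end) is not None:
            raise AssertionError(f"CHECK-NOT: found a match for {pattern!r}")


def not_group_then_check(text, not_patterns, next_pattern, start, end):
    """Run the following CHECK first, then the NOT Group up to its match start."""
    m = re.compile(next_pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK: no match for {next_pattern!r}")
    # The start of the next directive's match range ends the NOT search range.
    run_not_group(text, not_patterns, start, m.start())
    return m.end()


text = "load a\nret\ncall f\n"
# "call" appears only after "ret", outside the NOT Group's search range.
pos = not_group_then_check(text, ["call"], "ret", 0, len(text))
```

Because the NOT search range stops at the start of the "ret" match, the later "call" does not cause a failure.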

CHECK-DAG: A sequence of one or more consecutive DAG directives forms
a DAG Group. The search range for the group extends from the end of
the previous match (or start of the input, if there is no previous
directive) to the start of the next LABEL match, or to the end of the
input if there is no later LABEL.  Each directive in the DAG group
scans the search range of the group looking for a pattern match. A
directive fails if no match is found. Per The Rule, match ranges for
the individual DAG directives in a group may not overlap.  After all
DAG directives run, the match range for the entire DAG Group extends
from the start of the earliest match to the end of the latest match.  
The end of that match range becomes the start of the search range for
subsequent directives.
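
A minimal sketch of a DAG Group, assuming Python regexes in place of FileCheck patterns. The name `run_dag_group` is hypothetical, and the way it skips an overlapping candidate (resume searching at the end of the clashing match) is one simple reading of The Rule, not necessarily the only one.

```python
import re


def run_dag_group(text, patterns, start, end):
    """Match every pattern somewhere in text[start:end] without overlaps.

    Returns the group's match range: start of the earliest match and
    end of the latest match.
    """
    taken = []  # match ranges already claimed by earlier group members
    for pattern in patterns:
        regex = re.compile(pattern)
        pos = start
        while True:
            m = regex.search(text, pos, end)
            if m is None:
                raise AssertionError(f"CHECK-DAG: no match for {pattern!r}")
            clash_ends = [e for (s, e) in taken if m.start() < e and s < m.end()]
            if not clash_ends:
                taken.append((m.start(), m.end()))
                break
            # Per The Rule, skip past the overlapping match and keep looking.
            pos = max(clash_ends)
    return min(s for s, _ in taken), max(e for _, e in taken)


text = "b a a\n"
group_range = run_dag_group(text, ["a", "a", "b"], 0, len(text))
```

Here the two identical "a" patterns must claim two different occurrences, and the group's match range spans from the "b" to the second "a", regardless of directive order.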

Observations
------------

A CHECK-NOT surrounded by CHECK-DAG directives separates the DAGs into
disjoint groups, and does not permit matches from the two groups to
overlap. DAG was originally implemented to detect and diagnose an
overlap in this situation, but the implementation worked only for the
first DAG after a NOT. This can lead to counter-intuitive behavior and
potentially makes certain kinds of matches impossible.
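
To make the DAG-NOT-DAG case concrete, here is a sketch of the behavior the spec describes (not the older implementation). It assumes Python regexes in place of FileCheck patterns, no LABELs, and hypothetical helper names.

```python
import re


def dag_group(text, patterns, start, end):
    """Non-overlapping DAG matching; returns the group's match range."""
    taken = []
    for pattern in patterns:
        pos = start
        while True:
            m = re.compile(pattern).search(text, pos, end)
            if m is None:
                raise AssertionError(f"CHECK-DAG: no match for {pattern!r}")
            clash_ends = [e for (s, e) in taken if m.start() < e and s < m.end()]
            if not clash_ends:
                taken.append(m.span())
                break
            pos = max(clash_ends)
    return min(s for s, _ in taken), max(e for _, e in taken)


def dag_not_dag(text, first, nots, second):
    """First DAG Group, then the second, then the NOTs in the gap between."""
    _, first_end = dag_group(text, first, 0, len(text))
    # The second group starts searching where the first group's range ended.
    second_start, second_end = dag_group(text, second, first_end, len(text))
    for pattern in nots:
        if re.compile(pattern).search(text, first_end, second_start):
            raise AssertionError(f"CHECK-NOT: found a match for {pattern!r}")
    return second_end


# "x" sits inside the second group's match range, so the NOT does not see it.
end = dag_not_dag("b a c x d", ["a", "b"], ["x"], ["c", "d"])
```

With the input "b a x c d" the same directives would fail, because "x" then falls between the two groups' match ranges.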


Technically, putting CHECK-SAME or CHECK-NEXT after CHECK-DAG has
defined behavior, but it's unlikely to be useful, so FileCheck rejects
that kind of sequence.  Similarly, putting SAME or NEXT as the
first directive in a file has defined behavior (matching
precisely the first or second line respectively of the input text);
however this is far more likely to be a mistake than to be useful, so
FileCheck rejects this too.


Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Hi Paul,

I've inlined some minor suggestions and questions.

On Thu, Jun 14, 2018 at 4:38 PM, <[hidden email]> wrote:
Spec for the model, version 2.  If this survives I'll start on
amendments to the FileCheck doc.
--paulr


Basic Conceptual Model
----------------------

FileCheck should operate on the basis of these three fundamental
concepts.

"should operate" -> "operates"
 

(1) Search range.  This is some substring of the input text where one
or more directives will do their pattern-matching magic.

(2) Match range.  This is a substring of a search range where a
directive (or in one case, a group of directives) has matched a
pattern.

(3) Directive groups.  These are sequences of adjacent directives that
operate in a related way on a search range.  Directives within a group
are processed in order, except as noted in the directive description.

Is there an exception?
 

Finally we add The Rule:  No match ranges may overlap.


Directive Descriptions Based On Conceptual Model
------------------------------------------------

Given the conceptual model, all directives can be defined in terms of
it.

CHECK: Scans the search range for a pattern match. Fails if no match
is found.  The end of the match range becomes the start of the search
range for subsequent directives.

CHECK-SAME: Like CHECK, plus there must be zero newlines within the
search range prior to the start of the match range.

CHECK-NEXT: Like CHECK, plus there must be exactly one newline within
the search range prior to the start of the match range.

Note: This definition means CHECK-NEXT will fail if the pattern
occurs both on the line where the search range starts, and on the
(expected) next line.

The first occurrence is sufficient for a failure.  Perhaps: "and on the" -> "even if it also occurs on the"
 
  This can be avoided by putting a
`CHECK-SAME: {{.*}}` before the CHECK-NEXT.  We could also avoid

To make it clear to a naive reader that you're not describing a second workaround available to users: "We could also avoid" -> "We could have implemented FileCheck to avoid"
 
this by defining the CHECK-NEXT search range to be just the following
line of text.  We define CHECK-NEXT the way we do because it seems
valuable to diagnose mismatches that are simply on the wrong line,
and the problematic case is rare.

By the way, do you think it would be helpful for the diagnostic to suggest the CHECK-SAME trick?

CHECK-LABEL: All LABEL directives are processed before any other
directives.  These directives have three effects.  First, they act like
CHECK directives. Second, they partition the input text into disjoint
search ranges, delimited by the match ranges of the LABEL directives.
Third, they partition the remaining directives into Label Groups,
each of which operates on the corresponding search range.  For truly
pedantic formalism, we can say there are implicit LABEL directives
matching the start and end of the entire input text, thus all
non-LABEL directives are always in some Label Group and there is
really nothing special about the end of the input text.

CHECK-NOT: A sequence of one or more consecutive NOT directives forms
a NOT Group. The group is not executed immediately; instead the next
non-NOT directive (or DAG Group, if the next directive is DAG) is
executed first, and the start of that directive's (or group's)
match range becomes the end of the NOT Group's search range.  (If the
next directive is LABEL, it has already executed and has a match range,
which is already the end of the search range.  If the NOT is the last
directive, the search range extends to the end of the input.)  After
the NOT Group's search range is defined, each NOT directive in the
group scans the range for a match, and fails if a match is found.

CHECK-DAG: A sequence of one or more consecutive DAG directives forms
a DAG Group. The search range for the group extends from the end of
the previous match (or start of the input, if there is no previous
directive) to the start of the next LABEL match, or to the end of the
input if there is no later LABEL.

It reads to me like LABEL is relevant to the end but not the start.  You might replace "(or start of the input" with "(possibly a LABEL or start of the input".
 
On the other hand, in most of your directive descriptions (see CHECK, CHECK-NEXT, and CHECK-SAME), you don't define the directive's own search range.  Instead, you define how that directive impacts the start of the next search range.

The only difference here is that you have an entire group of directives with the same search range.  As FileCheck grows new directives, perhaps a more maintainable way to describe the search ranges for NOT groups and DAG groups is as follows:

"The search range for every member of the group is the search range that any single CHECK directive would have if it were to replace the entire group."

  Each directive in the DAG group
scans the search range of the group looking for a pattern match. A
directive fails if no match is found. Per The Rule, match ranges for
the individual DAG directives in a group may not overlap.

The last sentence is ambiguous.  It could mean you'll get a diagnostic if they do overlap.  Perhaps say "Per The Rule, each group member skips past any match whose range overlaps the range of an earlier group member's match."

 
  After all
DAG directives run, the match range for the entire DAG Group extends
from the start of the earliest match to the end of the latest match. 
The end of that match range becomes the start of the search range for
subsequent directives.

Observations
------------

A CHECK-NOT surrounded by CHECK-DAG directives separates the DAGs into

"A CHECK-NOT" -> "One or more CHECK-NOTs"
 
disjoint groups, and does not permit matches from the two groups to
overlap. DAG was originally implemented to detect and diagnose an
overlap in this situation, but the implementation worked only for the
first DAG after a NOT. This can lead to counter-intuitive behavior and
potentially makes certain kinds of matches impossible.

By the way, I have a patch that fixes the search ranges for DAG-NOT-DAG to match your formal description here.  I need to polish up the commit log, and then I'll post it for review.  It applies after my other patches because it was easier to implement that way.

Thanks.

Joel
 


Technically, putting CHECK-SAME or CHECK-NEXT after CHECK-DAG has
defined behavior, but it's unlikely to be useful, so FileCheck rejects
that kind of sequence.  Similarly, putting SAME or NEXT as the
first directive in a file likewise has defined behavior (matching
precisely the first or second line respectively of the input text);
however this is far more likely to be a mistake than to be useful, so
again FileCheck rejects this.




Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
Hi Paul,

On Thu, Jun 14, 2018 at 4:29 PM, <[hidden email]> wrote:

Speaking of wish lists, I've been thinking it would be nice to have some way to apply a NOT pattern among a range of matches:

 

CHECK-NOT-PUSH: pattern

 

Well, there is the `--implicit-check-not` option, which applies to the entire input text; it looks like you want it just for a subrange, though?


Right.
 

Agreed.
 

Yes, but I'd prefer a more general construct that also works without DAG.  That's why I suggested CHECK-NOT-PUSH and POP.  Jessica Paquette described a use case that I thought suggested she could benefit from that too, but it's possible I misunderstood her:

 

Seems to help with either approach.

Thanks.

Joel
 

 

--paulr

 

From: Joel E. Denny [mailto:[hidden email]]
Sent: Saturday, May 26, 2018 12:11 PM
To: Robinson, Paul
Cc: [hidden email]
Subject: Re: [RFC] Formalizing FileCheck Features

 

Hi Paul,

 

On Fri, May 25, 2018 at 10:40 AM, <[hidden email]> wrote:

> Should it be possible for CHECK-SAME match range to include newlines?
 
It is possible to write a regex that matches newlines.  Doing that in
CHECK-SAME seems a bit odd but I don't think it's worth trying to forbid
it.

 

OK, so SAME has the sense of matching *starting* on the same line rather than *within* the same line.  Seems fine.

 

> I'd note that, in the case of CHECK-NEXT, that choice can restrict what
> CHECK-NEXT can match.  That is, it will complain about a match on the
> previous line rather than skip it and look on the next line.

Ah, so we could define CHECK-NEXT as: move the start of the search
range past the first newline, then behave as CHECK-SAME?

 

Right. 

 

But, appending {{.*$}} to the previous pattern should have the same
effect if you have a CHECK-NEXT that runs into that problem.

 

So the current behavior is more flexible even if less intuitive at first glance (to me, at least).  It's also more consistent with the way search ranges work in general.
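
For comparison, the two CHECK-NEXT definitions discussed here can be sketched side by side. This is only an illustration: Python regexes stand in for FileCheck patterns, and both function names are hypothetical.

```python
import re


def check_next_current(text, pattern, start, end):
    """Spec definition: exactly one newline before the match start."""
    m = re.compile(pattern).search(text, start, end)
    if m is None:
        raise AssertionError(f"CHECK-NEXT: no match for {pattern!r}")
    if text.count("\n", start, m.start()) != 1:
        raise AssertionError("CHECK-NEXT: match is not on the next line")
    return m.end()


def check_next_alternative(text, pattern, start, end):
    """Alternative: skip past the first newline, then behave as CHECK-SAME."""
    newline = text.find("\n", start, end)
    if newline == -1:
        raise AssertionError("CHECK-NEXT: there is no next line")
    line_start = newline + 1
    m = re.compile(pattern).search(text, line_start, end)
    if m is None:
        raise AssertionError(f"CHECK-NEXT: no match for {pattern!r}")
    if text.count("\n", line_start, m.start()) != 0:
        raise AssertionError("CHECK-NEXT: match is not on the next line")
    return m.end()


text = "mov r0, r1\nmov r2, r3\n"
after_check = 6  # end of a prior CHECK match on "mov r0"
pos = check_next_alternative(text, r"r\d", after_check, len(text))
```

With the same arguments, `check_next_current` fails because "r1" is still on the starting line. That is the restriction being discussed, and also the diagnostic the current definition provides.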

 

I think this subtlety and this tip should be mentioned in the user documentation. Also, because sometimes the previous directive isn't nearby or could be one of many directives due to multiple check prefixes, the docs should also offer this formula: 

 

CHECK-SAME: {{.*}}

CHECK-NEXT: your pattern

 

And I
do think it's valuable for SAME and NEXT to tell you they found
matches but not on the line you asked for. So I'd prefer to leave these defined as they are.

 

Agreed.

 

>> CHECK-NOT: A sequence of NOT directives forms a NOT Group. The group
>> is not executed immediately; instead the next non-NOT directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the NOT Group's search range.
>
> Based on the following, that wording is not quite right when a DAG
> group follows, so there should probably be some note about that here.

So, "the next non-NOT directive or DAG group is executed ... the start
of that directive or group's match range ..." ?

 

Sounds good.

 

>>  (If the next directive is
>> LABEL, it has already executed and has a match range, which is already
>> the end of the search range.)  After the NOT Group's search range is
>> defined, each NOT directive scans the range for a match, and fails if
>> a match is found.
>>
>> CHECK-DAG: A sequence of DAG directives forms a DAG Group. The group
>> is not executed immediately; instead the next non-DAG directive is
>> executed first, and the start of that directive's match range becomes
>> the end of the DAG Group's search range.
>
> That's definitely a change from the current behavior.  Currently, the
> DAG group finds its own end based on the farthest match.

Oh good catch.  Copy-thinko from the NOT description.  NOT is the only
kind of directive that has deferred execution.
 
>>  If the next directive is
>> CHECK-NOT, the end of the DAG Group's search range is
>> unaffected.
>
> Unaffected means that it's as if there's no following directive?  So
> next CHECK-LABEL (possibly the implicit one at EOF)?  What if there's
> a CHECK, CHECK-NEXT, or CHECK-SAME after all the DAGs and NOTs?

If DAG doesn't have deferred execution then the end of the search range
is the next (explicit or implicit) CHECK-LABEL point, end of story.

 

>>  After all DAG directives run, the
>> match range for the entire DAG Group extends from the start of the
>> earliest match to the end of the latest match.  The end of that match
>> range becomes the start of the search range for subsequent directives.
>
> That last sentence contradicts the first few sentences: the subsequent
> directive has already been matched.

Right, fixing the previous bug means this sentence says the right thing.

 

Yep, I agree it's fixed.

 


> One point not addressed here is the start of the DAG group's search
> range.  Currently, if the DAG group is preceded by a NOT group
> preceded by a DAG group, the last DAG group's search range starts at
> the start of the first DAG group's match range.  Any match in the
> first DAG group's match range produces a reordering error.  This is
> somewhat similar to the CHECK-SAME and CHECK-NEXT behavior I mentioned
> earlier: the search ranges permit invalid match ranges and then
> complain about them in an effort to diagnose mistakes.  However, that
> restricts what can be matched.
>
> I'm not claiming that either behavior is best.  It's not clear to me.
> The best use of DAG-NOT-DAG is very confusing to me.  An effort to
> prescribe the right semantics to it needs to be informed by real use
> cases, in my opinion.

I did some email archaeology, and found this exchange on llvm-dev between
myself and Michael Liao (original DAG implementor) 13 Mar 2016:

pr> Commentary in FileCheck itself can easily be interpreted to mean the
pr> intent was that -NOT would scan the region between the points defined
pr> by the last match of the preceding DAG group (which the code gets
pr> right) and the first match of the following DAG group (which the code
pr> does not get right). But the commentary is not really that clear.

ml> That's the intention of the original design. CHECK-NOT never occurs
ml> before we find the start point (the start of file by default) and end
ml> point (the end of file by default.) All other points are through other
ml> CHECKs, including CHECK-DAG but excluding CHECK-NOT.  So that, if you
ml> use CHECK-NOT, you need to be aware of how that range is defined. As
ml> CHECK-DAG pattern matches a group of patterns in any order, the match
ml> point of that group of CHECK-DAGs (consecutive CHECK-DAGs without any
ml> other CHECKs interleaved) is always the point where one of that group
ml> is matched. If one CHECK-DAG is separated by any other CHECKs
ml> (including CHECK-NOT) from preceding CHECK-DAGs, it is not in the
ml> preceding group of CHECK-DAGs. That's how we could check the order
ml> where a group of patterns should never occur before another group of
ml> patterns.

 

Thanks for digging that up.

 

So, I believe my specification for the interaction between DAG and NOT
does match the original intent.

 

I can't argue there.

 

  Regarding the diagnostic aid, it does
make some sequences really hard to match,

 

Theoretically, I agree.  But do you know of a real use case where it's a problem?

 

and I don't have a general
idea how to fix that (versus {{.*$}} for the similar NEXT situation).

 

Me neither.

 

It's also a reasonable continuation of the behavior of plain CHECK, in
that a second CHECK doesn't search the prior text to complain about
ordering issues.

 

Good point.

 

The main difference I see is that DAG is specifically about unordered text (and it might vary from run to run in the parallel programs I'm thinking of), so the chances of accidental reordering might be higher than with plain CHECK.

 


SAME and NEXT are, I think, a different category; that has to do with
line-breaks that are not explicitly described by user-written patterns,
and my own experience is that it's helpful to be told that something
matches but isn't on the line I expected.

 

Agreed.

 


So, I don't have a definitive answer for changing DAG-NOT-DAG, but
intuitively the spec makes sense to me and my inclination is to think
the diagnostic isn't hugely valuable.

 

You might be right. Again, I find it hard to think of solid arguments about DAG-NOT-DAG because it seems like such an unlikely use case.

 

You mentioned Chris Lattner's point.  DAG-NOT-DAG was the first thing that came to my mind.

 

DAG-NOT-DAG is a weird case where (1) you want two or more consecutive but non-overlapping DAG groups, and (2) you want to exclude certain patterns in between.  Strangely, with existing directives, you cannot accomplish #1 without #2, right?  Why do those go together?  It feels like a use case that arose from an accident in a language specification and not from a real need.

 

Well, maybe the best approach is just to go with a clear specification (as you have now) and hope for the best. 

 


>> Putting CHECK-SAME and CHECK-NEXT after CHECK-DAG now has defined
>> behavior, but it's unlikely to be useful.
>
> I believe they had predictable behavior before (their search ranges
> started at the end of the match range for the entire CHECK-DAG), but
> it's different with the above description (they define the end of the
> search range for the preceding CHECK-DAG group).

You're right, it was predictable before, and I am fixing the bug where
the directive after DAG gets executed first so the range isn't affected.

 

Makes sense, so your specification keeps the old behavior.

 

Taking Chris Lattner's point into consideration, we might want to say
SAME or NEXT after a DAG should be an error.  But we could also leave
that for a later round.

 

With your specification, I think the meaning of those cases is clear and potentially useful.  The only potential problem I see is that people who haven't studied your specification carefully might think SAME and NEXT constrain the end of the search range of the DAG group.  It might be worthwhile to emphasize in the docs that, no, really, DAG does not work that way.

 

Actually, I wish there were a way to do that for the sake of matching unordered text on a single line.  SAME after DAGs is as close as I can get to that.  Maybe we need a CHECK-DAG-SAME.

 

Speaking of wish lists, I've been thinking it would be nice to have some way to apply a NOT pattern among a range of matches:

 

CHECK-NOT-PUSH: pattern

...

CHECK-NOT-POP:

 

For example, with a pattern of {{.}} and DAGs in between PUSH and POP, I can check for an unordered set of strings while rejecting any other text among them. (Now that's a use case for DAG plus NOT that seems very clear to me.)

 

Like normal NOT, PUSH's action would be deferred until the next directive or group.  At that point, it would push the specified NOT pattern along with the next non-NOT directive's match range end as its search range start. POP would pop and apply those using the previous non-NOT directive's match range start as its search range end.  The Rule would apply to its matches.  PUSH and POP would be like normal NOT in terms of their effect on neighboring directives: each would terminate any preceding DAG group, and, because there's no match in a successful run, each would have no effect on any neighboring directive's search range.  PUSH and POP with no directives in between other than those in the NOT family would be an error.

 

Your formal specification of FileCheck makes it straightforward to describe this behavior precisely.

 


--paulr

P.S. I am away next week but expect to keep an eye on the lists.

 

Sure.  Have fun.  No rush.

 

Thanks.

 

Joel

 




Re: [llvm-dev] [RFC] Formalizing FileCheck Features

韩玉 via llvm-dev
On Tue, Jun 19, 2018 at 4:51 PM, Joel E. Denny <[hidden email]> wrote:
The only difference here is that you have an entire group of directives with the same search range.  As FileCheck grows new directives, perhaps a more maintainable way to describe the search ranges for NOT groups and DAG groups is as follows:

"The search range for every member of the group is the search range that any single CHECK directive would have if it were to replace the entire group."

Oops.  That's only valid for DAG groups.

Joel
