Introduce compile-time bounded for loops and if statements inside of `entries` table property #1261

jafingerhut · 2023-07-23T00:32:41Z

Personnel

Owner: @jafingerhut
Supporters: @thantry

Design

Document: Powerpoint slide deck in Google drive: https://docs.google.com/presentation/d/1Izp0lW0TRPBwEdrZkr-YU0NpEUWwQOBC/edit?usp=sharing&ouid=113165450274035133953&rtpof=true&sd=true

Implementation

p4-spec:
p4c: Only the beginnings of proposed changes to the language grammar in this PR: P4 spec issue 1261 add bounded loops p4c#4120

Process

LDWG discussed: Discussed first 11 slides of Powerpoint deck on 2023-Aug-07 meeting.
LDWG approved:
Merged into p4-spec:
Merged into p4c:

=======================================

Related issues:

Discussion on this issue in the LDWG led to a preference for having two different designs for loops that are done at packet processing time (perhaps unrolled by the compiler), vs. loops that are guaranteed to be done at compile time, e.g. for creating multiple entries in a table's entries property, but also more generally such things as table keys, table actions, multiple control or parser definitions, multiple top-level definitions of tables, actions, local variables, or extern instance declarations within the top level of a control, etc.

The issue of execution loops is tracked in issue #1325

Compile-time / macro loops and conditionals are tracked in issue #1326

The text was updated successfully, but these errors were encountered:

jafingerhut · 2023-07-23T01:35:39Z

Rationale for introducing this feature: There are some who would like to use long lists of entries or const entries within a P4 program, with a repetitive structure to some of those entries.

Of course it is possible to write a simple program in a general purpose programming language such as C++, Java, Go, Python, etc. that prints out a list of the desired entries, and #include that into a P4 program. The proposers of this feature know this, and yet would still prefer if it were possible to do something like the following straw-man-syntax example proposes:

    action my_drop() {
        mark_to_drop(stdmeta);
    }
    action set_port(bit<9> port) {
        stdmeta.egress_spec = port;
    }
    table ing_filter {
        key = {
            stdmeta.ingress_port : ternary;
            hdr.ipv4.ttl : ternary;
            hdr.ipv4.protocol : ternary;
        }
        actions = {
            my_drop;
            set_port;
            NoAction;
        }
        entries = {
            (_, 0, _): set_port(CPU_PORT);
#ifdef NEW_FEATURE
            for (p in (list<bit<8>>) {3, 18, 19, 28, 42, 52}) {
                // Note: Within a for loop body, the loop variable is
                // a local compile-time known value.
                (_, _, p) : my_drop();
                // There can be additional entries here, and all of
                // them within the { } of the for loop are generated
                // once per loop iteration.
            }
            for (bit<8> p = 128; p <= 255; p = p + 1) {
                for (bit<9> port = 0; port <= MAX_PORT_ID; port = port + 1) {
                    (port, _, p) : my_drop();
                }
            }
            for (bit<8> p = 80; p <= 120; p = p + 1) {
                if (p != 99) {
                    (_, _, p) : my_drop();
                }
            }
#endif // NEW_FEATURE
        }
        const default_action = NoAction();
    }

Complete P4 program can be found here: https://github.com/jafingerhut/p4-guide/blob/master/table-entries-with-for-loops/example1.p4

2023-Aug-02 update: From discussion later in this issue, I would like to focus on only two use cases for bounded loops in P4 at this time:

for loop context (a) - inside entries table property

This only to make sense to me for such a loop to be implemented via compile-time evaluation of the loop and/or if statements to produce a finite set of entries.

Inside the body of a loop in context (a), all occurrences of the loop variable identifier are treated as local compile-time known values, equal to the value that the loop variable takes during that iteration of the loop. The only things that can be within such a for loop are: nested for loops, if statements, or table entries.

for loop context (b) - inside the apply block within a control, inside of an action body, or inside a state definition within a parser

Loops in context (b) could be implemented either by compile-time unrolling of the bounded loop, or some targets could implement them at run-time in a way similar to how a general purpose CPU often does so -- the resulting behavior would be the same regardless of the implementation.

Inside the body of a loop in context (b), all occurrences of loop variable identifiers that are newly introduced by the loop are "read only" values, i.e. they are treated similarly to a direction in parameter, so it is a compile-time error to attempt to modify its value in any way.

Any statement that can appear where a context (b) for loop appears, may also appear within the body of such a for loop, e.g. if the for loop appears within the apply block of a control, you can apply tables, apply sub-controls, call actions or extern functions, have if or switch statements, execute return or exit statements (which cause the for loop to be exited early), have nested for loops, etc.

(This is the end of text specific to for loop context (b)).

I propose that only for loops are defined, not while, repeat ... until, or do ... while as they appear in some other imperative languages, because for seem easier to specify and write in a way that a compiler can verify the number of iterations is bounded at compile time. If someone has proposed implementation mechanisms to determine that other forms of loops are compile-time bounded in their number of iterations, we can consider those, too.

I also propose that we either:

do not introduce continue or break statements at this time, or
discuss exactly how to implement these in for a back end that unrolls for loops in context (b) at compile time. It seems to require either if statements nested as deeply as the number of loop iterations, or calling a sub-control and using return, or a new goto statement.

I propose that the two possible forms that the for loop iteration be:

(1) for (<type_specification> <loop_variable_identifier> in <tuple_expression>) <loop_body>

It is a compile-time error if you attempt modify the value of <loop_variable_identifier> in the loop body in any way, e.g. by assigning it a value, or using it as an out or inout parameter in any function/method/sub-control/sub-parser call, in the <loop_body>.

(2) for (<assignment_statements>; <boolean condition>; <assignment_statements>) <loop_body>

where <assignment_statements> can be a comma-separated list of multiple assignment statements, as in C or C++.

New variables whose scope is local to the for loop can be introduced in the first <assignment_statements>, with type specification, e.g. for (bit<9> i = 5; i < 10; i = i + 1) <loop_body>. We need some kind of check at compile time to ensure that the loop has a finite number of iterations. One approach would be that there is at least one variable initialized in the first <assignment_statements>, and mentioned in the <boolean_condition>, and assigned to in the second <assignment_statements>, and the variable is NOT assigned to, nor used as an out or inout parameter in the <loop_body>, such that it is straightforward to guarantee that the number of iterations is bounded.

TODO: One possible restriction would be that all loop variables in the parentheses after for are new local variables, local to this for loop, and are out of scope outside of the for loop. That seems to simplify questions of what happens if it is a variable declared in an enclosing scope, e.g. we don't need to specify any final value for the loop variable(s), because they cannot be accessed after the loop has finished execution.

jafingerhut · 2023-07-23T01:38:04Z

@thantry Please take a look at the example above, and comment if you were hoping for something else.

My intent with this issue is NOT to introduce compile-time bounded for loops into arbitrary places in a P4 program. It is ONLY to enable them within the entries list of a table definition. If you were thinking of adding them in other places of a P4 program, now would be a good time to mention that. I suspect it might be significantly more effort to get for loops added to the language in more arbitrary places than is proposed in the example above.

jafingerhut · 2023-07-23T01:56:45Z

@thantry: I assume you want else, and else if in addition to a simple if as shown in the example?

Do you want the ability to write a switch statement in there, too?

Do you want the ability to have other local variables declared within for/if/switch?

Assignment statements with arbitrary P4 expressions on the right-hand side?

thantry · 2023-07-23T03:15:39Z

Thanks Andy!Yes, the only usecase of this feature would be to generate constant table entries from use of variables that are compile time evaluatable.Use of conditionals on such compile time evaluatables would be a great feature to have.All of (switch, if, if/else) on such variables would contribute to making the program more readable, since I suspect front end compiler passes would just unroll and flatten it to the final array.The intent is to be able to write compact code to generate such constant entries.ThanksHariSent from my iPhoneOn Jul 22, 2023, at 6:56 PM, Andy Fingerhut ***@***.***> wrote: @thantry: I assume you want else, and else if in addition to a simple if as shown in the example? Do you want the ability to write a switch statement in there, too? Do you want the ability to have other local variables declared within for/if/switch? Assignment statements with arbitrary P4 expressions on the right-hand side? —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>

jnfoster · 2023-07-23T10:19:37Z

I suspect it might be significantly more effort to get for loops added to the language in more arbitrary places than is proposed in the example above.

Why do you suspect it would be harder? It could well be that working out the design in the context of a specific feature like table entries is more work than just adding general, bounded loops.

jafingerhut · 2023-07-23T14:28:15Z

Why do you suspect it would be harder? It could well be that working out the design in the context of a specific feature like table entries is more work than just adding general, bounded loops.

My suspicion stems from the amount of discussion required in the language design work group to approve it, not from anything semantically difficult in the language.

e.g. suppose someone says that they want for loops to be able to cause an array of similar parser states to be generated at compile time? An array of similar transitions in a transition select statement? An array of repetitive fields in a header or struct definition? An array of repetitive cases in a switch statement? All of those are, I believe, unique cases for adding new syntax to different parts of the grammar, each perhaps with their own unique grammar ambiguities that are possible to introduce, their own corner cases to discuss, etc.

If we limit the proposal to fewer of those situations, then there is less to think about in terms of corner cases, and less time to discuss it to get to consensus (is my personal guess). Trying to cover all of those cases, and look for more, cannot possibly take less time.

I do not know if all of these places where for loops could be, you consider part of a general purpose implementation or not, but just from a few minutes of thinking about possibilities:

inside of control apply blocks is the no-brainer desire, and I am definitely sympathetic to that one.
inside of table properties: entries, key, actions?
inside of top level of control: around local variables definitions, table definitions, action definitions, extern instantiations
inside of top level of parser: around local variable definitions, extern instantiations, parser state definitions
for loops inside of the parameter lists of actions, parsers, controls, and packages, for repetitive parameters?
for loops inside of the parameter lists of calls to actions, parsers, controls, and/or package instantiations?
for loops inside of the lists of type parameters, e.g. parser p1<for (i in {1,2,3,4,5,6}) TYPE_PARAM_#{i}> ... ?
inside of struct, header, and union definitions, for repetitive fields?
inside of switch statements, for repetitive cases?
inside of transition select statements?

My belief is that most of the different bullet items above cause unique extensions to the language grammar, and unique enhancements required in the p4c implementation.

jafingerhut · 2023-07-23T18:11:09Z

Another situation where I suspect it would be harder to generalize: Suppose we target the ability to have a for loop around a table definition, to create an array of similar tables. It seems like to make that useful, we would either need (a) to introduce arrays of tables, or (b) introduce a mechanism like table base_name_##{loop_variable} or some other similar kind of syntax to create single lexical names that contain the values of loop variables, or expressions containing loop variables.

My belief is that the loops around table entries is quite useful without introducing other new language features like those above.

jnfoster · 2023-07-24T14:03:57Z

How about the following: start with adding for loops as statements. That addresses a bunch of the cases above, though not all. Then we can proceed to think about types, declarations, properties, etc. (I'm also fine if we start with entries since @thantry especially wants it for a real use case; but one could argue that starting with statements is more intuitive and natural, so it might be easier to work out the design.)

jnfoster · 2023-07-24T14:06:02Z

My meta-point is that I want to discourage us from doing "bolt-on," piecemeal, ad hoc design. It makes the language more complex, and it may actually be more work for us. It is tempting to believe that designing a limited feature may be easier pull off than the general case but I believe the total-cost-of-ownership of the piecemeal approach is actually higher. So the tempting belief is actually false.

mihaibudiu · 2023-07-24T17:24:11Z

There is also the alternative of using a more powerful preprocesor. We discussed augmenting the existing preprocessor with a #loop macro. But we could also use some template engine like [Jinja](https://en.wikipedia.org/wiki/Jinja_(template_engine)

jafingerhut · 2023-07-24T19:12:36Z

How about the following: start with adding for loops as statements. That addresses a bunch of the cases above, though not all.

Note: I generated that list of possible places someone might want for loops to avoid repetitive code not because I think they are all equally useful or important to work on, but in response to the suggestion that we add "general, bounded loops". Frankly, to me the idea of introducing for loops to generate repetitive lists of type parameters seems like it would never be used by anyone, because those lists of type parameters are usually less than 10 of them.

If we start with adding for loops as statements, that seems to me to cover these use cases in my list:

inside of control apply blocks is the no-brainer desire, and I am definitely sympathetic to that one.

It does not cover any of the others, as far as I can see. In particular, it does not cover creating repetitive table entries inside of an entries table property. I agree that the syntax and semantics can and should be made identical, or as similar as possible, between those two syntactic "places", but as far as adding these to the P4_16 language goes, we have to add those in two different places in the grammar in the spec and p4c implementation, unless I am missing a way to update the grammar in one "place" and have it add for loops to both of them.

The other "syntactic places" in my list above seem similar to me, in that adding the ability to use for loops in each of them requires yet another addition to the grammar. I don't want to make a mountain out of a molehill, but every such syntactic change seems to me likely to lead to another set of discussions and carefully looking for corner cases from whoever would implement this in p4c.

If you are content with starting by only adding for loops to control apply bodies and the entries table property for now, that is fine by me. I'd rather not bite off any more than that, e.g. for loops around table and action definitions, unless someone else volunteers to champion that and write the relevant portions of the specs and implementation.

jafingerhut · 2023-08-03T02:58:07Z

@jnfoster @jonathan-dilorenzo If there is time at the next P4 LDWG meeting, I would like to discuss this issue at least briefly, to get feedback on the proposal, especially on any restrictions of use that people think are good to include. This comment proposes some restrictions that I would personally be happy with, but others may have suggestions, too: #1261 (comment)

jnfoster · 2023-08-03T09:22:24Z

Sounds good. We will add it to the agenda.
-N

jafingerhut · 2023-08-07T21:22:58Z

Enhance this proposal to include the for (hdr in hdr_stack_variable_name) proposal here: #84

jafingerhut · 2023-08-07T21:25:52Z

AI Andy: From 2023-Aug-07, there were some who approved of seeing a written up specification PR for these ideas. Perhaps one of the main concerns voiced was that it would be nice for developers if they had a way in P4_16 to visibly know when something expands into something huge, vs. when it cannot. I am not sure whether the keyword for is enough of a signal of that in this proposal, or not?

Another concern raised was that it seemed odd to one commenter to leave open the option for a target device to choose whether to unroll loops, or not.

rcgoodfellow · 2024-05-13T20:07:05Z

The driving use case that started this discussion was a need for compile-time metaprogramming where loop iterations emit syntactic elements that are included in the program. We're now talking about general purpose bounded loops. These are very different things, that I think should be considered as different language features. If we're to take on the former in full generality, then I think we're talking about doing something in the existing macro machinery, or introducing a more powerful macro system. Having the same language element be a metaprogramming thing in some cases, but be a general purpose bounded loop facility in others would be highly confusing.

jnfoster mentioned this issue Aug 7, 2023

[Proposal] Structural iteration #84

Closed

jnfoster mentioned this issue Nov 11, 2023

Metaprogramming for Loops #907

Closed

9 tasks

asl mentioned this issue Mar 25, 2024

[WIP] Add example implementation for bounded foreach-style loops p4lang/p4c#4558

Closed

ChrisDodd mentioned this issue Apr 8, 2024

Changes for for loops p4lang/p4c#4562

Merged

jafingerhut mentioned this issue Nov 3, 2024

Add compile-time / macros that can do loops and conditionals #1326

Open

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce compile-time bounded for loops and if statements inside of `entries` table property #1261

Introduce compile-time bounded for loops and if statements inside of `entries` table property #1261

jafingerhut commented Jul 23, 2023 •

edited

Loading

jafingerhut commented Jul 23, 2023 •

edited

Loading

jafingerhut commented Jul 23, 2023

jafingerhut commented Jul 23, 2023

thantry commented Jul 23, 2023 via email

jnfoster commented Jul 23, 2023

jafingerhut commented Jul 23, 2023 •

edited

Loading

jafingerhut commented Jul 23, 2023

jnfoster commented Jul 24, 2023

jnfoster commented Jul 24, 2023

mihaibudiu commented Jul 24, 2023

jafingerhut commented Jul 24, 2023

jafingerhut commented Aug 3, 2023

jnfoster commented Aug 3, 2023

jafingerhut commented Aug 7, 2023

jafingerhut commented Aug 7, 2023

rcgoodfellow commented May 13, 2024

Introduce compile-time bounded for loops and if statements inside of entries table property #1261

Introduce compile-time bounded for loops and if statements inside of entries table property #1261

Comments

jafingerhut commented Jul 23, 2023 • edited Loading

Personnel

Design

Implementation

Process

jafingerhut commented Jul 23, 2023 • edited Loading

jafingerhut commented Jul 23, 2023

jafingerhut commented Jul 23, 2023

thantry commented Jul 23, 2023 via email

jnfoster commented Jul 23, 2023

jafingerhut commented Jul 23, 2023 • edited Loading

jafingerhut commented Jul 23, 2023

jnfoster commented Jul 24, 2023

jnfoster commented Jul 24, 2023

mihaibudiu commented Jul 24, 2023

jafingerhut commented Jul 24, 2023

jafingerhut commented Aug 3, 2023

jnfoster commented Aug 3, 2023

jafingerhut commented Aug 7, 2023

jafingerhut commented Aug 7, 2023

rcgoodfellow commented May 13, 2024

Introduce compile-time bounded for loops and if statements inside of `entries` table property #1261

Introduce compile-time bounded for loops and if statements inside of `entries` table property #1261

jafingerhut commented Jul 23, 2023 •

edited

Loading

jafingerhut commented Jul 23, 2023 •

edited

Loading

jafingerhut commented Jul 23, 2023 •

edited

Loading