pppd: implement net-init, net-pre-up and net-down. #367
This aims to replace both #341 and #342 with the strategy discussed in #341. This is the initial step: it provides a consolidated up/down in main.c, which ALL protocols (e.g., MPLSCP, should it ever get set up) can use to up/down the interface, and which will then execute net-init, net-pre-up and net-down as appropriate. Pushed now to leverage the build environments to at least compile-test the non-Linux code in the meantime.
31fb4d9
to
bf9329e
Compare
Ordering: net-init was executed as the first script, as per above, just after the "Using interface ppp1" message. After this, authentication completed. At this point IPCP and IPV6CP were initiated. IPV6CP completed first (but it could just as well have been IPCP). This correctly triggered net-pre-up, after which the normal IPv6 configuration and the ipv6-up script were run (not waited for). IPCP then came up and executed its usual ip-pre-up (after the IPs had actually been configured, so the interface really was live by the time the script executed). The set_ifup call did NOT trigger net-pre-up again, as expected. After this the ip-up script was also triggered and not waited for. Upon ^C both IPCP and IPV6CP were terminated, and upon termination of the latter net-down was triggered. There is one debate here: should ip-down and ipv6-down execute PRIOR to net-down? (None of them are waited for, so I don't think such a guarantee makes sense.) As it stands, net-down will trigger PRIOR to the "last proto down" executing its script. To "fix" this I'd need to move the execution of ip-down and ipv6-down to before the call to set_ifdown, but still after deconfiguring the protocol from the OS.
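The ordering described here (net-pre-up fires once, when the first protocol brings the interface up; net-down fires once, when the last protocol releases it) can be sketched with a plain counter. This is a minimal illustration and not the code from the PR: `struct ifstate`, `proto_up`, `proto_down` and the `script_fn` callback are hypothetical names, and where exactly net-down sits relative to the per-protocol down scripts is the open question raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type standing in for pppd's script runner. */
typedef void (*script_fn)(const char *script);

struct ifstate {
    int up_count;   /* number of protocols currently holding the interface up */
    script_fn run;  /* called with the name of the script to execute, may be NULL */
};

/* Called by a protocol once it has negotiated and wants the interface up.
 * Returns 1 if this call moved the interface from down to up, else 0. */
static int proto_up(struct ifstate *s)
{
    assert(s != NULL);
    if (s->up_count++ > 0)
        return 0;              /* already up: net-pre-up must not run again */
    if (s->run)
        s->run("net-pre-up");  /* runs while the interface is still down */
    /* ...the OS-specific "set IFF_UP" call would go here... */
    return 1;
}

/* Called by a protocol when it terminates.
 * Returns 1 if this call moved the interface from up to down, else 0. */
static int proto_down(struct ifstate *s)
{
    assert(s != NULL);
    if (s->up_count == 0)
        return 0;              /* nothing was up */
    if (--s->up_count > 0)
        return 0;              /* another protocol still needs the interface */
    /* ...the OS-specific "clear IFF_UP" call would go here... */
    if (s->run)
        s->run("net-down");
    return 1;
}
```

With two protocols, the first `proto_up` call reports the transition and the second does not; on teardown only the last `proto_down` call reports it.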
pppd/main.c
Outdated
 * use set_ipup("IPCP") ...
 */
int
set_ifup(const char* name)
I am not a big fan of the new static variable up_protos[]. The protocols are already a part of the protocols[] (struct protoent). Seems like there should be a possibility of querying that structure to figure out if it is set in state up or not.
Also, I would imagine that NET_PRE_UP would be executed like IP_PRE_UP, which means all configured network protocols have marked themselves as done (either UP or FAIL) but the interface is not yet configured with IFF_UP.
A script could possibly query the configuration of the interface being brought up to find the values it needs if it isn't passed into the net-pre-up script (as ip-pre-up would pass in the IP information, or the IPv6 information).
array of ints vs array of pointers? I was also wondering how to avoid up_protos, but alas, couldn't think of any. Even considered a bitmask ... do you have a better suggestion?
Yes, NET_PRE_UP gets executed like IP_PRE_UP, but since we cannot determine the order in which protocols configure, we can make no guarantees w.r.t. configured status (some protocol - probably exactly one - may already be configured, but it could be either ipv6cp or ipcp or theoretically some other), only that the interface will not yet be in IFF_UP state, but about to transition into that.
Yes, a script can either query the details using say "ip" or by using the environment variables (as set by script_setenv calls scattered through pppd - but again, no guarantees can be made since probably only one of ipv6cp or ipcp will have completed at this stage).
Eliminated the sys-* netif_set_up and netif_set_down functions in preference of exposing a single netif_set_state() from each.
I think I would look at main.c and see how it processes the protocol structure in get_input(). If you could add an int field in there that holds the state, this new function would look something like this in pseudo code:
for (protocol entry : protocols) {
    // or add a phase field to each protocol entry (i.e. mass-update the files)
    // and test instead: if (entry.phase != PHASE_NETWORK)
    if (entry.id == PPP_EAP || entry.id == PPP_LCP || entry.id == ..)
        continue;
    if (!entry.up)
        return -1;
}
// all NETWORK protocols are UP here.
setifstate(unit, 1);
Generally, adding the extra information to the protocols structure could help the readability of the application. PHASE_NETWORK would be up when all NETWORK protocols have been negotiated.
Similarly, with a flag to say the protocol is an AUTH protocol, you could choose to only pass the information to that protocol instead of checking EAP and LCP. See RFC1661.
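The suggestion above could be sketched in C roughly as follows. Everything here is an assumption for illustration: pppd's real protocol table has no `phase` or `state` field, and `struct proto_entry`, `network_ready` and the enum names are hypothetical. The sketch also adopts the "done means UP or FAIL" reading from earlier in the review, so a protocol that failed negotiation does not block the interface as long as at least one network protocol is up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase tag; not a field of pppd's real protocol table. */
enum proto_phase { PROTO_PHASE_LINK, PROTO_PHASE_AUTH, PROTO_PHASE_NETWORK };

/* Hypothetical per-protocol negotiation result. */
enum proto_state { PROTO_PENDING, PROTO_UP, PROTO_FAILED };

struct proto_entry {
    unsigned short id;        /* PPP protocol number, e.g. 0x8021 for IPCP */
    enum proto_phase phase;
    enum proto_state state;
};

/* Returns 1 when every network-phase protocol is done (up or failed) and
 * at least one of them is up; returns 0 otherwise. */
static int network_ready(const struct proto_entry *tab, size_t n)
{
    size_t i;
    int any_up = 0;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tab[i].phase != PROTO_PHASE_NETWORK)
            continue;         /* LCP, EAP and friends are skipped by phase */
        if (tab[i].state == PROTO_PENDING)
            return 0;         /* still negotiating: not ready yet */
        if (tab[i].state == PROTO_UP)
            any_up = 1;
    }
    return any_up;
}
```

A caller would only set the interface state to up once `network_ready` returns 1, replacing the hard-coded protocol-number comparisons with a test on the phase field.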
Trying to run NET_PRE_UP before all protocols have completed negotiation runs afoul of the problem that not all information can be extracted. Passing it to the script may be difficult, but a script can be made to query the state of the interface and extract the IPv6, IPv4 and MPLS information. If you execute NET_PRE_UP before all network protocols have been negotiated, this information would be lacking at that point.
Also, the current protocol structure doesn't state whether a protocol is intended to run during a particular phase. That could possibly be added in a separate change so that, when you iterate through the structure, you don't have to compare against a set of hard-coded protocol numbers (e.g. see get_input() in main.c).
This would be helpful here too, where you would iterate through the protocols and only set the interface up when all the protocols have come up. Unfortunately, this probably requires exposing yet another callback to the protocol code to say: yes, you are up, so that scripts like ip-up and ipv6-up get executed after the interface has been configured in state up (IFF_UP).
@enaess thinking on this I agree this would be nice.
However, and this is a big issue, both ip-pre-up and ip-up are called from the same function in ipcp.c; this is a sequential step. So this would require a fairly major redesign.
I can certainly add a notify() and/or net_{pre_,}up_hook() mechanism for you, that's quite trivial.
Do we know of many systems using ip-pre-up? My recommendation would be that protocol-specific pre-ups remain, but that they be moved prior to interface configuration. This does of course imply that net-pre-up will likely be called after one protocol-specific pre-up, which I think is not ideal, unless all protocols adhere to:
- Ensure iface is UP.
- Invoke protocol-specific pre-up (currently only ip, not ipv6), providing "will be used" details.
- Configure interface for protocol.
- Invoke up (currently ip and ipv6).
Alternatively (very major rework referred to above) we can somehow permit all protocols to do something like the following:
1. Upon starting pppd, prior to proceeding indicate that the protocol (if to be negotiated) needs to finish negotiation prior to interface up.
2. If negotiation fails remove from the "desired" set.
3. If negotiation succeeds, configure the interface as required.
4. Invoke pre-up. (This can potentially move to pre-up callback in 5 in order to only invoke it after net-pre-up I reckon).
5. Call if_setup(), additionally providing two optional call-backs. 1 to invoke when ready to bring interface up, just prior to actually doing so. 1 to invoke just after bringing the interface up.
All functionality in the current _up functions after the call to if_setup would thus move into the latter callback. I don't see a specific need for the first callback, but I'd rather someone look at it than look for it.
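The two-callback idea for if_setup() could look roughly like the sketch below. It is an illustration under assumptions, not the PR's API: `if_setup_sketch`, `struct iface`, `if_hook` and the tracing hooks are hypothetical names, and the `up` flag merely stands in for IFF_UP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook signature; ctx is whatever the protocol wants back. */
typedef void (*if_hook)(void *ctx);

struct iface {
    int up;  /* stand-in for the IFF_UP flag */
};

/* Brings the interface up if needed. before_up runs just prior to the
 * transition (and only if there is one); after_up always runs once the
 * interface is up, so per-protocol "up" work can live there. */
static void if_setup_sketch(struct iface *ifc, if_hook before_up,
                            if_hook after_up, void *ctx)
{
    assert(ifc != NULL);
    if (!ifc->up) {
        if (before_up)
            before_up(ctx);  /* e.g. a protocol-specific pre-up script */
        ifc->up = 1;         /* the OS-specific call would go here */
    }
    if (after_up)
        after_up(ctx);       /* e.g. ip-up or ipv6-up */
}

/* Example hooks that record the order in which they were called. */
struct trace {
    char seq[8];
    size_t n;
};

static void trace_before(void *ctx)
{
    struct trace *t = ctx;
    if (t->n < sizeof t->seq - 1)
        t->seq[t->n++] = 'B';
}

static void trace_after(void *ctx)
{
    struct trace *t = ctx;
    if (t->n < sizeof t->seq - 1)
        t->seq[t->n++] = 'A';
}
```

The first protocol to call `if_setup_sketch` sees both hooks fire in order; a second protocol calling it while the interface is already up only gets its `after_up` hook.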
One problem I possibly foresee is ondemand ... but I'd have to take a much closer look. As I see it there would then be four net-* scripts:
- net-init - as per current.
- net-pre-neg (need a better name, net-ready came to mind but that implies the iface is ready to bring up, which is not the case) - invoked post authentication but prior to initiating IPCP and IPV6CP (and others).
- net-pre-up - invoked just prior to the pre-up callbacks (such that it's possible to get ordering net-pre-up followed by proto-pre-up).
- net-down (as current).
Would this be more suitable? What if we WANT to bring up the interface when the first protocol is ready instead of when ALL protocols are ready? My opinion is that the current setup kinda works for me in that once the first protocol is good to go the interface can come up, so if IPv6 goes back and forth, failing to negotiate, why should it delay IPv4 or the other way around?
@enaess could you please provide some feedback here?
I think the general outline of the above seems okay. However, I would want to solicit feedback from @paulusmack to make sure he is okay with a change like that and what you are trying to do. The other part of this is how "risky" is a change like that, meaning will it break or become fragile, e.g. is ip-pre-up / net-pre-up deterministic in all cases like ip is up, but ip6 failed negotiation, the test-coverage, etc.
@paulusmack @enaess may I please request some direction here?
pppd/main.c
Outdated
int i = 0;
const char** t;

while (up_protos && up_protos[i] && up_protos[i] != name)
Why not use a strcmp() here instead of comparing the memory address?
no real reason other than a compiler shouldn't duplicate the string and as such a pointer comparison is good enough. I've never seen that it gets duplicated but I've only ever worked with gcc in the last 19 years and some MSVC prior to that. Can convert if that's preferred, doesn't make a significant difference, and since others will probably wonder the same thing it's probably worth the change from a sanity perspective and just in case some compiler does actually duplicate the strings?
@enaess are you happy with the explanation or should I switch to strcmp?
I am not fond of pointer comparisons as they imply a deep understanding of the code and what you are trying to do. It is for the same reason that Java compares strings using String.equals() rather than ==; they mean two completely different things.
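The difference being debated can be shown in a few lines of C. This is a small illustration with hypothetical helper names (`same_pointer`, `same_name`); it does not reproduce the PR's loop. Two strings with identical contents compare equal with strcmp() even when they live in different storage, whereas a pointer comparison only succeeds when both names are literally the same object.

```c
#include <assert.h>
#include <string.h>

/* Identity comparison: true only if both pointers refer to the same storage. */
static int same_pointer(const char *a, const char *b)
{
    return a == b;
}

/* Content comparison: true if both strings hold the same characters. */
static int same_name(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

If a protocol name is ever copied into a buffer (for example, built at run time by a plugin), `same_name` still matches it against the stored name while `same_pointer` does not, which is the fragility the review comment is pointing at.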
b810041
to
6c56f53
Compare
pppd/main.c
Outdated
run_net_script(PPP_PATH_NET_PREUP, 1);
if (if_indextoname(ifindex, iftmpname) && strcmp(iftmpname, ifname)) {
    info("Detected interface name change from %s to %s.", ifname, iftmpname);
    strcpy(ifname, iftmpname);
Need to call script_setenv here for IFNAME? What do I pass for iskey?
The only case where set_ifunit is called with iskey=0 is under multilink, in which case pppd is joining a bundle; I suspect (but don't KNOW) that the bundle interface will already be UP and thus set_ifup will never be called.
We've only ever used multilink in a handful of cases, and those use-cases are long-gone now so don't have an environment where I can scope and check this.
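The rename detection in the snippet under review can be sketched with a bounded copy instead of strcpy(). This is a minimal sketch under assumptions: `update_ifname` and `refresh_ifname` are hypothetical helpers, and what to do about the IFNAME environment variable (the script_setenv question above) is deliberately left to the caller. `if_indextoname()` and `IF_NAMESIZE` are the standard declarations from `<net/if.h>`.

```c
#include <assert.h>
#include <string.h>
#include <net/if.h>  /* IF_NAMESIZE, if_indextoname() */

/* Copies `current` into `ifname` if the two differ.
 * Returns 1 if the name changed, 0 if it was already the same,
 * and -1 (leaving ifname untouched) if `current` does not fit. */
static int update_ifname(char *ifname, size_t len, const char *current)
{
    size_t n;

    assert(ifname != NULL && current != NULL);
    if (strcmp(ifname, current) == 0)
        return 0;
    n = strlen(current);
    if (n >= len)
        return -1;
    memcpy(ifname, current, n + 1);  /* n + 1 also copies the terminator */
    return 1;
}

/* Asks the kernel for the current name of `ifindex` and applies it.
 * Returns as update_ifname(), or -1 if the index no longer resolves. */
static int refresh_ifname(unsigned int ifindex, char *ifname, size_t len)
{
    char tmp[IF_NAMESIZE];

    if (if_indextoname(ifindex, tmp) == NULL)
        return -1;
    return update_ifname(ifname, len, tmp);
}
```

A caller would run `refresh_ifname` after the net-pre-up script returns and, on a return value of 1, log the change and update whatever script environment is decided on.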
@EasyNetDev: What do you think?
@Neustradamus I don't see this PR as a reason to hold up a release. While some good collaboration happened here, the idea of a new net-pre-up and net-up callback to improve on the current problems related to ip-pre-up and absence of ipv6-pre-up can be implemented as a separate PR later.
Hi, I would really appreciate if we can get this merged, what would need to happen and what can I do from my side to assist in this regard?
Copied from PR #111
OK.
Right. I was not aware of this. Does that actually happen that way? Because the various functions simply mark the interface up/down in the case of pppd, not a specific protocol/address here.
I can only test on Linux unfortunately. I looked at the code, and as far as I could determine, the code which was removed was 100% identical. I could have missed something. The primary aim here was to have a consolidated if up/down interface, which then did all the common work, around dispatching to the OS specific functions. Not to change behaviour, but to enable adding OS and protocol-independent up/down work to interfaces without having to do this in multiple places.
OK, so the question here is if it's possible to UP and/or DOWN the IPv4 vs IPv6 net0 interfaces (and even net0:1 independently of net0)? Based on what you stated, it sounds like a "yes it can", but can you please confirm? From what I recall the pppd code most certainly did not do that. And in fact, there was quite a bit of confusing stuff around upping and downing interfaces, and it looked like it's possible for IPv6 to tear down, and bring down the entire interface (ie, ifdown) without actually tearing down LCP as a whole, leaving pppd in a situation where it could consider IPv4 and MPLS etc to be "up", but due to the interface being down it's actually down. If you're right (I'm not saying you're wrong, just looking to understand better) about the way solaris handled this then I'm way off base and will probably need to rethink this a bit.
I think this touches on some of the discussion and issue points. So what I desperately want is an init script to execute right at the start of pppd (net-init as implemented in that PR), then scripts to execute just before upping the interface (protocol independent, net-preup, after auth, but prior to setting the interface up), and then to execute when tearing down (less critical, but we do have related ifb interfaces used for shaping ingress traffic from the ppp interfaces which need to be torn down too).
Wasn't my intention to break anything, and I did take as much care as I could to maintain backwards compatibility. I do agree with your pet peeve and I can only state that was not my intention. You reference 'unit' numbers here specifically, I do recall yanking certain parameters which I think could have been unit numbers, they were yanked from functions I was modifying where to the best of my ability to determine they were not being used anyway. You reference "odd non-public branches" that may have used them, of this I was not aware. I also didn't spot specific other places in the (fairly large) codebase where they could be removed, but I could well have missed stuff, obviously, which is why we have reviews of code before merging :).
OK, so add this to the per protocol structure is what I'm getting? This may also allow the one requested additional feature called net-up to be invoked once all protocols have been brought up. The one other requested change was to only actually up the interface once all desired protocols have been negotiated and brought up, but as stated, that would require changes to most sub protocols which can obviously be done. Specifically, sub protocols would need to signal "configuration done, ready to be upped, please call this callback when interface is up". This can be done, and I'm happy to put in the effort for the in-tree protocols. I would, however, recommend that this be configurable behaviour. If we're going that far, I'd further recommend that any and all other pre-up scripts be nuked, since already they have no guarantee that the interface is actually still down. Alternatively, we need to follow the above methodology and enforce that. This is perhaps a bigger change than what should be merged at this point though, and could cause further delays to an overdue ppp release.
Agreed about the consistency, will gladly make this change. The way I understood it was that locally the interface will still reflect UP, but I do agree with you, it will already no longer be operational if the remote side tears down, so conceding to rather execute post down. This was a debate in my head and honestly it doesn't particularly matter that much.
@carlsonj replying here rather so we can split the release discussion from this, I think they should be separate at this point.
ACK. Will re-look at this code.
ACK. I'll re-look. My suggestion is then to proceed with a new ppp release without this patch. Work does need to be done here, and we'll keep using this patch on our systems for the time being.
Can it? Let's say pppd is launched by xl2tpd or pppoe-server: you expect the system administrator to somehow figure out which unit number is going to be used and wrap pppd? I'm guessing that the "connect" option can be abused here, but at this stage (to the best of my knowledge) the unit number still won't be available. The disconnect script also doesn't run reliably due to "script is not run if the modem has already hung up". The init script seems to be more aimed at initializing serial lines; either way, from a quick look these scripts only execute when the ttyname option is given such that devnam is not the empty string. In fact:
Jan 2 12:05:11 kerberos pppd[9457]: In file /etc/ppp/options.lns: unrecognized option 'init'
This is when incoming over xl2tpd, pppd invoked by same.
This is the net-init part of same. Yes it's important. We need the unit number for initialization even prior to authentication. Mostly this ensures that ppp "state files" as we keep them were properly cleaned up during previous runs of pppd but there are other uses for us too.
There is both an auth-up (auth success) and auth-down (auth failure). Guessing the latter can be used to implement some form of source blacklisting, i.e., for pppoe add the source mac into ebtables or similar to prevent further abuse, or to an ignore list for pppoe-server (which is a non-existent feature currently, but can be added, with the caveat that an attacker at this level can trivially bypass a mac block list, and can even utilize that to create a denial-of-service to other legitimate users). More useful for things like pptp and l2tpd where we can do a source IP block, but not useful in our primary case where we have a PPPoE to PPPoL2TP bridge as the LAC.
I created a PR for exactly that originally. It was shot down for various reasons, primarily that ip-pre-up is already unreliable (except perhaps for solaris), and adding ipv6-pre-up adds to the confusion. Two primary suggestions that came from that:
Note that the current patch series executes one NCP-pre-up prior to net-pre-up, prior to upping the interface. Following this discussion, this will break at least Solaris. This was plainly my oversight. The above should deal better with both Linux and Solaris type methodologies. The documentation for the scripts involved would then be:
- net-init: executes the moment the ppp interface has been created. At this point the interface will be down and unconfigured, and authentication will not yet have commenced. You can use this script
- auth-up/down: as per current. It should be noted these are executed asynchronously, which precludes using them to perform work between auth and the NCPs being brought up.
- net-pre-up: executes once the first NCP has completed negotiating. This is guaranteed to execute whilst the interface is still down, will be executed synchronously and should thus perform its work in as short a time-frame as possible. No protocol specific information will be available at this point. You can safely rename the interface from this script, for example, based on authentication information.
- *-pre-up (currently only ip-pre-up): will be executed once the NCP has completed, prior to configuring the interface and, for operating systems such as Solaris that use per-protocol interfaces, bringing up the per-protocol interface. For all other operating systems the interface will already be UP, but not yet configured for the specific protocol. The to be configured details will be available
- *-up (I believe just ip-up and ipv6-up): will be executed once the interface has been configured for the given NCP.
There was quite a large discussion around this. Essentially, pre-up cannot (in the Linux world at least) be guaranteed to execute whilst the interface is still down.
The consolidation was primarily done to simplify the net-* scripts for me, and because I thought it made sense to push this kind of work through a central controller function.
I still think it does, but plainly it's not quite as simple as I thought.
For solaris where there is a per-protocol interface this makes sense, for Linux it most certainly doesn't make that much sense, since the interface has a single global state, eg:
Refer above. auth-down executes on authentication failure; once a peer has been authenticated, the ppp is authenticated and authentication won't happen again.
I disagree. The patch series does not take ondemand into consideration at all currently. This point is valid, and if there is demand for hook scripts here I could work that into the system. ondemand itself I suspect makes less and less sense overall, except perhaps for things like vpn connections (client side). Eg, have the pppd sit there with a route pointed at the remote network, and bring up the vpn whenever it's needed, this to me makes some sense, however, ncp-pre-up in above should deal with those use-cases. I think. Haven't thought about this much at all.
No it doesn't. In solaris world this statement may be true, but certainly not in Linux world. As it stands ip-pre-up has zero guarantees on linux with regards to being executed with the interface still being down.
OK. Not sure what I missed here then, but I'll re-look.
As I understand, so that we can have a guarantee that the interface is still down when executing ip-pre-up. With the information around Solaris now I had to think wider about this, and I suspect the above strategy should be suitable all around? The change in behaviour from a system administrator point of view would then be restricted to ncp-pre-up being executed prior to actually configuring the interface (which makes sense on Linux, not so much on Solaris - perhaps we need to vary behaviour here based on OS behaviour?)
I call it net-, you call it if-, guess these are both aimed at the same. I don't care about if-up personally but yea, I guess this can be expanded to (keeping with my net- rather than if-, naming is irrelevant IMHO): net-init - executed prior to negotiation (even auth).
You would. Personally I have no need for an NCP-specific pre-up in the Linux world; however, in the Solaris world (without thinking too hard about this) it would make a lot of sense since each protocol really has its own interface.
Just to be clear, not all suggestions in this thread came from me. My aim was "simply" to get a net-init, net-pre-up and net-down. Consolidation was done in order to simplify that (i.e., no need to signal net-pre-up from hordes of different locations), and the consolidation, as I understand it, is what broke things overall for Solaris. I'd still like to action that, and put as few things as possible in sys-*.c so that the behaviour of pppd remains as close as viably possible between different operating systems (and to make porting between operating systems easier; not sure if *-bsd is currently supported, for example, or whether that's even viable). Currently we only have Linux and Solaris.
Agreed. So whether ncp-pre-up runs pre or post configuration could be a configurable option, such that the default is to stick with the current (broken on Linux) behaviour, but allowing for the alternative behaviour. Or conversely we could have different behaviour hard-wired in pppd based on the OS design. But that's still an incompatible change for Linux at least, so I think we should run this as an option and highly recommend the option being used on Linux, but not currently make that the default; just warn that you should consider its use (once for every pppd run if one of the ncp-pre-up scripts is executed, unless the no- variant is explicitly set such that the system administrator has opted in to the potentially broken behaviour). In a further future version we could then start warning that the default behaviour will change at some point, and then change it eventually.
Sorry for these long messages, but I feel it's important to not leave stones unturned, and to make sure that everything is well understood. Would the above strategy be adequate for you to keep existing behaviour as close as possible, whilst still introducing the new behaviour (which enables improvements for Linux sysadmins)?
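The opt-in with a once-per-run warning described above might be modelled like this. It is a sketch under assumptions only: no such option exists in pppd as far as this thread shows, and `enum preup_mode`, `struct preup_policy` and `preup_before_config` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tri-state: the administrator has said nothing, has asked
 * for the existing behaviour (the "no-" variant), or has opted in to
 * running ncp-pre-up before the interface is configured. */
enum preup_mode { PREUP_UNSET, PREUP_AFTER_CONFIG, PREUP_BEFORE_CONFIG };

struct preup_policy {
    enum preup_mode mode;
    int warned;  /* set once the advisory has been emitted for this run */
};

/* Returns 1 if an ncp-pre-up script should run before configuration.
 * Emits the advisory at most once per run, and only when the
 * administrator has not made an explicit choice either way. */
static int preup_before_config(struct preup_policy *p, FILE *log)
{
    assert(p != NULL);
    if (p->mode == PREUP_UNSET && !p->warned) {
        p->warned = 1;
        if (log)
            fprintf(log, "pre-up runs after configuration by default; "
                         "consider opting in to the new ordering\n");
    }
    return p->mode == PREUP_BEFORE_CONFIG;
}
```

Keeping the default at the existing behaviour means an unset option and the explicit "no-" variant return the same answer; they differ only in whether the advisory is printed.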
On 1/2/23 06:15, Jaco Kroon wrote:
> ACK. I'll re-look. My suggestion is then to proceed with a new ppp
> release without this patch. Work does need to be done here, and we'll
> keep using this patch on our systems for the time being.
OK.
>> 1. pppd is initially launched. No need for a special script here,
>> as the launcher can just do whatever is needed.
> Can it? Let's say pppd is launched by xl2tpd or pppoe-server, you
> expect the system administrator to somehow figure out which unit number
> is /going/ to be used and wrapping pppd? Guessing that the "connect"
If you're doing something that requires a known interface name, why
wouldn't you use the existing 'ifname' option on pppd? I don't see why
guesswork would be needed. (And, no, if those other tools lack needed
features, I don't think that's _necessarily_ a compelling argument to
stack more on top of pppd, though it could be if it's somehow easier or
more widely useful.)
But, ok, there could be a plausible usage case here where you want the
script to run after pppd has allocated a network interface to use, but
before "anything else" has been done, including negotiating IP addresses
or configuring them on the interface. I'm not actually *seeing* what
that case might be, but, sure, that doesn't concern me as much as all of
the rest of this.
Suppose, though, we could find a clean way to fix the (apparent) Linux
problem with ip-pre-up such that it actually was guaranteed to run and
finish that script before pppd ever set the IFF_UP flag and after IPCP
negotiation completed.
Doing so would have the nice side-effect of making Linux pppd match what
the Linux pppd man page says.
Would that be sufficiently usable for what you're trying to do? The only
difference (and I think it's a pretty important one) is that ip-pre-up
runs late enough that it is *NOT* in the direct line of any
timing-sensitive PPP negotiation machinery. So, yes, the addresses would
be configured if you looked, but no IFF_UP flag (in this scenario) would
be set.
> option can be abused here, but at this stage (to the best of my
> knowledge) the unit number still won't be available. The disconnect
The old 'unit N' option is still available. May not be what you really
want, though.
> script also doesn't run reliably due to "script is not run if the modem
> has already hung up". init script seems to be more aimed at initializing
> serial lines, either way, from a quick look these scripts only execute
> when ttyname option is given such that devnam is not the empty string.
> In fact:
Correct. 'init' and 'connect' are about initializing the other side (the
serial connection) and 'disconnect' is the serial-side tear-down script.
Neither is what you want if you're trying to mess around with network
interfaces.
> Jan 2 12:05:11 kerberos pppd[9457]: In file /etc/ppp/options.lns:
> unrecognized option 'init'
> Jan 2 12:05:52 cerberus pppd[24775]: In file /etc/ppp/options.lns:
> unrecognized option 'connect'
> Jan 2 12:06:32 cerberus pppd[8513]: In file /etc/ppp/options.lns:
> unrecognized option 'disconnect'
I'll freely admit to knowing nothing about what xl2tpd is doing. At a
guess, that *looks* like the standard pppd code has been modified and/or
a plugin is removing those options. But I can't be sure. In any event, I
don't think it's relevant to the issue at hand.
> This is when incoming over xl2tpd, pppd invoked by same.
>> 2. pppd completes negotiating LCP but has not yet done AUTH (if
>> any). We don't have this now. Not sure if it's important. It
>> could be added.
> This is the net-init part of same. Yes it's important. We need the unit
> number for initialization even prior to authentication. Mostly this
> ensures that ppp "state files" as we keep them were properly cleaned up
> during previous runs of pppd but there are other uses for us too.
I'm not entirely sure I understand those requirements. What are the
"state files?" Is this a reference to the optional lock files used by
pppd on ttys? The '.pid' file used to communicate with the external
world? Or something else?
This matters because it likely has implications for exactly *when* the
script could possibly be run. It may be the case that it needs to run
before pppd is ever started or that it could interact badly with pppd
shutdown.
It would be nice to have some detail here, as pppd already does its
*own* management of state and I've never had great luck layering one
manager atop another.
>> 3. pppd completes authentication of the peer and has not yet done
>> any of the NCPs. This is auth-up.
> There is both an auth-up (auth success) and auth-down (auth failure).
I might have been a little too terse in this description. These scripts
are used when we're authenticating the peer. They are *NOT* general
purpose wrappers around authentication in general. In other words, for
the typical user with the typical sort of Internet connection scenario,
the peer (the ISP) will insist on authentication but the local instance
will *NOT* be configured to insist on authentication. And, of course,
for that to succeed, the local instance needs to be configured with
credentials and/or plug-ins that supply the authentication information
demanded by the peer.
And, because in that typical configuration we're not demanding
authentication data from the ISP (often with the "noauth" option to be
sure), this means that neither auth-up nor auth-down scripts will be
used at all.
These are sometimes referred to as "client" and "server" roles, although
that terminology is (at least to me) incorrect. PPP is symmetric. Each
side can decide (based on local administrative settings) whether and how
to authenticate the other side. They're peers, not a client/server
relationship. But because people often set up a particular usage
scenario, they tend to think it's the "only" way things work.
This means that those scripts probably don't do what I'm _guessing_ you
want.
Guessing the latter can be used to implement some form of source
blacklisting, i.e., for pppoe add the source mac into ebtables or similar
to prevent further abuse, or to an ignore list for pppoe-server (which
is a non-existent feature currently, but can be added, with the caveat
that an attacker at this level can trivially bypass a mac block list,
and can even utilize that to create a denial-of-service to other
legitimate users). More useful for things like pptp and l2tpd where we can
do a source IP block, but not useful in our primary case where we have a
PPPoE to PPPoL2TP bridge as the LAC.
I don't understand. It doesn't appear to be related to auth-up or
auth-down in any way. In particular, we don't run anything on
authentication failure.
The auth-up script is invoked if (and only if) we demand authentication
from the peer, we get that required data, and we successfully complete
authentication (i.e., we've validated the peer's identity). The
auth-down script is run when (and only when) a link that previously ran
auth-up is shut down or the PPP AUTH layer is otherwise torn down (as
with LCP renegotiation).
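A rough model of those rules, offered only as a sketch: the class and method names below are hypothetical and this is not pppd code. It records which script would be launched for each event, assuming `require_peer_auth` stands for "the local side demands authentication from the peer".

```python
class AuthScripts:
    """Toy model of when auth-up and auth-down are launched (not pppd code)."""

    def __init__(self, require_peer_auth):
        # True when the local side demands authentication from the peer,
        # i.e. "noauth" is not in effect.
        self.require_peer_auth = require_peer_auth
        self.auth_up_ran = False
        self.launched = []

    def peer_authenticated(self):
        # auth-up runs only if we demanded authentication and the peer passed.
        if self.require_peer_auth:
            self.auth_up_ran = True
            self.launched.append("auth-up")

    def peer_failed_auth(self):
        # Nothing is run on authentication failure.
        pass

    def auth_layer_down(self):
        # auth-down runs only for a link that previously ran auth-up.
        if self.auth_up_ran:
            self.auth_up_ran = False
            self.launched.append("auth-down")
```

In the typical "noauth" dial-out setup, `launched` stays empty for the whole session, which is the point being made above.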
I don't see how those scripts could be used as you're describing. In
general, PPP doesn't know about PPPoE/PPTP/L2TP details such as
addresses on the underlying encapsulation. Nor, I think, should it.
This part sounds very confusing to me, but maybe it has no impact here.
4. pppd completes negotiating an NCP but has not yet marked the
network interface as "up" for traffic so no data is flowing yet.
This is ip-pre-up. (Sadly, we have this for IPv4 only. Someone
might want to add ipv6-pre-up.)
I created a PR for exactly that originally. It was shot down for various
reasons, primarily that ip-pre-up is already unreliable (except perhaps
for solaris), and adding ipv6-pre-up adds to the confusion. Two primary
suggestions that came from that:
1. Delay "upping" the interface until all NCPs have completed, that way
ipv6-pre-up would make more sense too. And net-init, net-pre-up,
I would definitely agree with that. Delaying the setting of the IFF_UP
flag so that ip-pre-up is actually reliable (and so that it conforms
with the specification in the man page) on Linux sounds like a valuable
thing to do.
net-up and net-down could be further enhanced possibly. The question
is, does net-pre-up then execute as the first pre-up script, or the
last one in such a case. Would require callbacks into NCPs to notify
NCPs that interface has now been brought up.
You've lost me again. Dunno what the "callbacks" do, where they come
from, or why they're needed.
To me, if pppd is managing the interface flags, then pppd can offer
script invocations that happen before or after flags are set in
particular ways. Those scripts are free to do whatever they need to do.
And, for consistency, pppd can (and does) wait for "pre-up" script to
complete before actually setting the IFF_UP flag.
But I don't understand "callback" in that context. What would the script
be doing that would require information to be sent from the script to
the pppd process? Is it anything other than success-vs-failure?
2. (my personal preference) Modify behaviour of pre-up scripts to
provide details as to what /will/ be configured (with no guarantee
as to iface up/down state), passing the information as arguments to
the script or environment variables (which I believe is already the
case). Thus:
2.1. Execute net-pre-up;
2.2. bring up iface for operating systems that don't support
per-protocol interface state (based on Solaris input);
2.3. Set env variables;
2.4. execute ncp-pre-up;
2.5. configure interface;
2.6. bring up protocol specific interfaces for operating systems
that support that; and
2.7. execute ncp-up.
I'm still missing:
a. What net-pre-up could possibly do that ncp-pre-up (is that a generic
term for "ip-pre-up"?) cannot. The only distinction it seems is that
"net-pre-up" is running before IPCP option negotiation takes place.
Is that it? Why? What can't you do after IPCP negotiation is done?
b. What the ordering really means. Where does IPCP negotiation happen
in that list?
c. How is this compatible? It seems to me that doing 2.2 before 2.4
means that Solaris is broken. The guarantee for "ip-pre-up" is that
it runs *before* the IFF_UP flag is set. So you can do critical
things such as install packet filters. But in this new scenario,
that doesn't work, because the IFF_UP flag was already set at 2.2.
d. Why would there be "no guarantee" on interface up/down state? That
makes no sense to me. The *whole point* of having ip-pre-up and
ip-up is that the interface state is known so you can write
sensible scripts. The state is in the pppd man page. Why would we
remove that?
e. Why there are so many points of control?
net-init - executes the moment the ppp interface has been created, at
this point the interface will be down, unconfigured and prior to
authentication commencing. You can use this script
... sentence seems cut off here. You can use this script how?
Agreed that this one is plausible.
auth-up/down - as per current. It should be noted these are executed
asynchronously and thus precludes use for performing work between auth
and NCPs being brought up.
Yes; those are async scripts. Normally, they're used for audit logging.
It is *possible* to interject a synchronous script between completion of
the AUTH layer and the start of the NCP layer(s). I'm not sure if it's a
good idea, so I'd really like to understand the intended usage much better.
In particular, such a script would need to be carefully written because
pppd would effectively hang while the script was off doing whatever it
does. This means that the peer (unaware of the situation) would end up
starting its NCPs, sending Configure-Request messages as one does
following authentication, and timing out. In theory, one could possibly
buffer a few messages in order to speed things along if the script is
quick, but things can get really dicey there -- both with PPP behavior
and known peer implementation bugs.
net-pre-up - executes once the first NCP has completed negotiating. This
is guaranteed to execute whilst the interface is still down, will be
executed synchronously and should thus perform its work in as short a
time-frame as possible. No protocol specific information will be
available at this point. You can safely rename the interface from this
script, for example, based on authentication information.
Or we could just fix ip-pre-up on Linux. Right?
Not sure what interface renaming is about, but sure, that could be done.
Seems easier to use 'ifname' option than hijacking in a script, but ok.
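For what a rename from a pre-up script might look like, here is a minimal sketch. It assumes the script sees the `IFNAME` and `PEERNAME` environment variables that pppd exports to its scripts, and that Linux limits interface names to 15 characters; the function name and the `vpn-` prefix are made up for illustration.

```python
import re

IFNAMSIZ = 16  # Linux limit, including the trailing NUL byte


def rename_argv(env, prefix="vpn-"):
    """Return the `ip link` argv that renames $IFNAME after $PEERNAME, or None."""
    ifname = env.get("IFNAME")
    peer = env.get("PEERNAME")
    if not ifname or not peer:
        return None  # peer did not authenticate, so there is no name to use
    # Keep only characters that are safe in an interface name.
    safe = re.sub(r"[^A-Za-z0-9_-]", "", peer)
    if not safe:
        return None
    new_name = (prefix + safe)[:IFNAMSIZ - 1]
    return ["ip", "link", "set", "dev", ifname, "name", new_name]
```

A script would pass `os.environ` and hand the result to `subprocess.run`. The rename only works while the interface is still down, which is why the timing of the script matters.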
*-pre-up (currently only ip-pre-up): Will be executed once the NCP has
completed, prior to /configuring/ the interface and, for operating
systems such as Solaris that use per-protocol interfaces, bringing up
the per-protocol interface. For all other operating systems the
interface will already be UP, but not yet configured for the specific
protocol. The to be configured details will be available
That's not what the existing pppd documentation says, though. It says
that ip-pre-up is run before the interface is marked "up."
That's the whole reason for being. That's why it exists apart from "ip-up."
/etc/ppp/ip-pre-up
A program or script which is executed just before the
ppp network interface is brought up. It is executed
with the same parameters as the ip-up script (below).
At this point the interface exists and has IP
addresses assigned but is still down. This can be used
to add firewall rules before any IP traffic can pass
through the interface. Pppd will wait for this
script to finish before bringing the interface up, so
this script should run quickly.
There was quite a large discussion around this. Essentially, pre-up
cannot (In Linux world at least) be guaranteed to execute whilst the
interface is still down. The consolidation was primarily done to
Sure it can.
All that we have to do is introduce the "all NCPs stable" condition that
we talked about before.
The required change is then just this, regardless of OS:
- "ip-pre-up" runs synchronously after the IP addresses have been
negotiated and set on the network interfaces, but IFF_UP has *NOT*
yet been set.
- pppd waits until all configured NCPs reach either OPENED or STOPPED
state.
- Once reaching that bar, all of the network interfaces that have been
negotiated are marked IFF_UP, and then any ip-up or ipv6-up scripts
are run.
I think that works everywhere, including both Linux and Solaris, and
avoids all this extra complication. The "difference" from what we have
today versus what you're proposing is a microscopic change on Solaris in
the timing of just "ip-up" -- instead of waiting for IPCP and the IPv4
LIF alone, it ends up waiting for all NCPs to settle and all LIFs to be set.
In a properly-functioning system, that's at worst a few milliseconds. In
a badly-functioning one, who cares?
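That ordering can be modelled in a few lines. This is only a sketch of the proposal, not pppd code: the `Link` class, its method names and the event strings are invented here, and the model assumes one shared IFF_UP flag as on Linux. Note that "ipv6-pre-up" does not exist today; it shows up in the log only because the model treats every NCP the same way.

```python
class Link:
    """Toy model of the proposed ordering: IFF_UP waits for every NCP to settle."""

    def __init__(self, ncps):
        # ncps maps an NCP name to its script prefix,
        # e.g. {"IPCP": "ip", "IPV6CP": "ipv6"}
        self.prefix = dict(ncps)
        self.state = {name: "REQ-SENT" for name in ncps}
        self.iff_up = False
        self.log = []

    def ncp_opened(self, name):
        self.state[name] = "OPENED"
        self.log.append(f"{name}: addresses set")
        # The pre-up script runs synchronously while the interface is down.
        self.log.append(f"{self.prefix[name]}-pre-up (IFF_UP={self.iff_up})")
        self._maybe_up()

    def ncp_stopped(self, name):
        self.state[name] = "STOPPED"
        self._maybe_up()

    def _maybe_up(self):
        settled = all(s in ("OPENED", "STOPPED") for s in self.state.values())
        opened = [n for n, s in self.state.items() if s == "OPENED"]
        if settled and opened and not self.iff_up:
            self.iff_up = True
            self.log.append("set IFF_UP")
            for name in opened:
                self.log.append(f"{self.prefix[name]}-up")
```

Whichever NCP finishes first, every pre-up entry in `log` is recorded with `IFF_UP=False`, and the up scripts appear only after "set IFF_UP".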
The implication would be that if the user has enabled IPv6 and if the
peer doesn't implement (or allow) IPv6 and it *ALSO* doesn't follow the
RFCs so that it just discards the IPV6CP messages (rather than sending
LCP Protocol-Reject as it must), then we could have a situation where
ip-up is delayed arbitrarily as IPV6CP times out and eventually gives up.
With the default parameters, that would take 30 seconds to resolve
itself. The user could work around that by either (A) setting "noipv6"
as already described in the man page for dealing with bug-ridden peers
or (B) yelling at the owner of the peer machine to get a better system.
Maybe both.
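The 30-second figure falls out of the retransmission settings. A quick check, assuming the defaults given in the pppd man page for `ipv6cp-max-configure` (10 transmissions) and `ipv6cp-restart` (3 seconds); the helper name is made up, and the numbers should be adjusted if a build or options file overrides them:

```python
def ncp_give_up_seconds(max_configure=10, restart_interval=3):
    """Worst-case time before an unanswered NCP stops sending Configure-Requests."""
    return max_configure * restart_interval
```

With those defaults the function returns 30, so lowering either option shortens the worst-case delay before ip-up can run.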
simplify the net-* scripts for me, and because I thought it made sense
to push this kind of work through a central controller function. I still
think it does, but plainly it's not quite as simple as I thought.
5. pppd has marked the network interface "up" and traffic can now
flow. This is ip-up and ipv6-up (and others where supported).
6. pppd has lost a network interface that was "up." This is ip-down
and ipv6-down (and others).
Yes.
For solaris where there is a per-protocol interface this makes sense,
for Linux it most certainly doesn't make that much sense, since the
interface has a single global state, eg:
But it still makes sense there.
The current behavior on Linux is that the IFF_UP flag is removed when
the *last* NCP goes away. That's what the static if_is_up and if6_is_up
variables in sys-linux.c are about. (The if_is_up one is used
confusingly as a counter, but it's really just a boolean.)
This means that today:
1. You run the corresponding ip-down or ipv6-down script after deleting
the addresses and *possibly* turning off the IFF_UP bit (if you're
the last one out).
2. In the script, you can't trust the IFF_UP flag, particularly on
Linux. But you *CAN* trust that the interface is no longer passing
packets with that protocol, because we've shut it down in the
kernel and removed the addresses.
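A simplified model of that behavior, written from the description above rather than copied from sys-linux.c; the class and method names are hypothetical. It shows why a down script cannot infer the IFF_UP state from its own protocol going away:

```python
class LinuxIfState:
    """Toy model: one shared IFF_UP flag, cleared when the last NCP leaves."""

    def __init__(self):
        self.up_protocols = set()
        self.iff_up = False

    def proto_up(self, proto):
        self.up_protocols.add(proto)
        self.iff_up = True  # the first NCP to arrive sets the flag

    def proto_down(self, proto):
        # Addresses for this protocol are removed first (not modelled here).
        self.up_protocols.discard(proto)
        if not self.up_protocols:
            self.iff_up = False  # only the last one out clears IFF_UP
        # Value of IFF_UP that the matching *-down script would observe.
        return self.iff_up
```

With both protocols up, the first `proto_down` call returns True and the second returns False, so the same script sees a different flag depending on teardown order.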
Refer above. auth-down executes on authentication failure; once a peer
has been authenticated, the ppp is authenticated and authentication won't
happen again.
No, it doesn't really work like that.
"net-preup" already exists. That's point (4). Just use ip-pre-up.
Tearing down also exists. That's point (6). So I'm not exactly sure
what's being fixed here.
No it doesn't. In the Solaris world this statement may be true, but
certainly not in the Linux world. As it stands ip-pre-up has zero guarantees
on Linux with regards to being executed with the interface still being down.
I think it's better to fix the bug rather than add more complexity.
As I understand, so that we can have a guarantee that the interface is
still down when executing ip-pre-up. With the information around Solaris
now I had to think wider about this, and I suspect the above strategy
should be suitable all around? The /change/ in behaviour from a system
administrator point of view would then be restricted to ncp-pre-up being
executed prior to actually configuring the interface (which makes sense
on Linux, not so much on Solaris - perhaps we need to vary behaviour
here based on OS behaviour?)
We can certainly have scripts that only fire on a given platform, if
that's needed.
But to the extent possible, I *strongly* prefer that we design the
interfaces in a platform agnostic manner.
I guess I wouldn't mind seeing "if-pre-up", "if-up", and "if-down"
scripts that are invoked on Linux only when the whole interface is
marked up and down by pppd. Not entirely sure if that fixes your
problem, but it sounds like something that sys-linux.c could do
without hurting others.
I call it net-, you call it if-, guess these are both aimed at the same.
I don't care about if-up personally but yea, I guess this can be
expanded to (keeping with my net- rather than if-, naming is irrelevant
IMHO):
I don't care about the naming.
I was just trying to disambiguate it from your "net-" stuff, which
*appears* to come from some other subsystem. (And which also has
external dependencies and expectations that are a bit opaque to me right
now.)
net-init - executed prior to negotiation (even auth).
net-pre-up - execute prior to bringing interface up. Permits renaming
(using eg "ip li set dev pppX name pppX-jkroon").
net-up - executed after bringing interface up (on solaris I guess this
will make little sense since it doesn't have a single state, but the
interface could still be RUNNING at this point, just not
protocol-specific UP?)
Why wouldn't it make sense? The design is really quite straightforward
and is in the man page.
"ip-pre-up" is run before the interface is marked IFF_UP.
"ip-up" is run after.
It gives you the option of doing things that can *ONLY* be done when the
interface is down -- such as configuring security-related packet filters
-- before the "up" flag is set. Setting filters after going IFF_UP is a
Bad Thing regardless of platform: it means the bad packets you're trying
to filter away can sneak through on a timing hole.
The IFF_RUNNING flag is orthogonal. That flag is set by the system
itself when the underlying layers are ready to accept packets and
cleared when those layers are down. It's basically an indication of
hardware or lower-layer protocol status. The IFF_UP flag is an
administrative flag: it says "the administrator would like to use this
interface now."
(For Ethernet, IFF_RUNNING is usually set using the hardware carrier
flag. For PPP, it usually marks when LCP itself is in state OPENED, and
LCP itself rides over some serial channel that usually has its own
up/down indication.)
Those two concepts are distinct. You can have an interface where the
administrator has set an address and marked the interface "up," but
someone has tripped on the wire and pulled it from the wall. That's
IFF_UP but ~IFF_RUNNING.
You can also have an interface where the wire is attached, but the
administrator hasn't configured the interface for use or just doesn't
*want* it to be used. That's IFF_RUNNING but ~IFF_UP.
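The four combinations can be spelled out with the flag bits. The values below are the Linux ones (IFF_UP is 0x1, IFF_RUNNING is 0x40); the `describe` helper is just an illustration:

```python
IFF_UP = 0x1        # administrative: "the administrator wants this interface used"
IFF_RUNNING = 0x40  # operational: lower layers are ready to carry packets


def describe(flags):
    """Translate the two flag bits into the situations described above."""
    up = bool(flags & IFF_UP)
    running = bool(flags & IFF_RUNNING)
    if up and running:
        return "configured and carrying traffic"
    if up:
        return "configured, but the lower layer is down (cable pulled)"
    if running:
        return "lower layer ready, but not administratively enabled"
    return "down and not ready"
```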
In this case, pppd is acting as the ersatz "administrator." It's doing
the things that the admin would normally do via "ifconfig" (or similar).
I suggested disconnecting the administrative part of pppd as another
option, but I guess that doesn't match whatever it is you're doing.
It's also possible that what you're really proposing is introducing
a new "phase" for pppd, and I think that'd need a little more
discussion. This would be a phase after NCP that represents "all
NCPs resolved" (as described above), and possibly on Linux only,
would be a place where the interface is finally marked "up." That
seems like a more radical change than what's proposed here, but at
least seems consistent with the existing design. Or, in a simpler
model, we could just have a new pppd option that disables the
modification of the IFF_UP flag by pppd, leaving that part to
external scripts (or programs) to resolve. Then you can do whatever
you want in your "if-up" and set the flag at your leisure. You might
still need the "all NCPs" script to make that work.
You would. /personally/ I have no need for a NCP specific pre-up in the
Linux world, however, in Solaris world (without thinking too hard about
this) it would make a lot of sense since each protocol really has its
own interface.
Given all of the above, I'm not sure I understand that position. I think
that if you want "net-pre-up" then "ip-pre-up" *WITHOUT BUGS* is almost
exactly what you want. And has the huge benefit of not introducing more
complex functionality that we have to support from now until forever.
If it's not exactly what you want, then I'd still like to see a use-case
where net-pre-up works but ip-pre-up (again assuming the Linux bug is
fixed) does not.
Otherwise, removing the existing scripts would also be a pretty big
and incompatible change. I can't see how something like that would
ever reach consensus without a fork. Even then, I think it'd be a
long term disaster.
Just to be clear, not all suggestions in this thread came from me. My
aim was "simply" to get a net-init, net-pre-up and net-down.
Consolidation was done in order to simplify that (ie, no need to signal
net-pre-up from hordes of different locations), and the consolidation I
believe is what broke things overall for Solaris as I understand. I'd
still like to action that, and put as few things as possible in sys-*.c
such that the behaviour of pppd remains as close as viably possible
between different operating systems (and to make porting between
operating systems easier, not sure if *-bsd is currently supported, for
example, or whether that's even viable).
Currently we only have Linux and Solaris.
We used to support those other systems, but yanked support as users went
away. No idea what the state of that is now.
Agreed. So the ncp-pre-up being pre or post configuration could be a
configurable option such that the default is to stick with current
(broken on Linux) behaviour, but allow for the alternative behaviour. Or
conversely we could have different behaviour hard wired in pppd based on
the OS design. But that's still an incompatible change for Linux at
least, so I think we should run this as an option and /highly recommend/
the option being used on Linux then but not currently make that default,
just warn that you should consider its use (once off for every pppd run
if one of the ncp-pre-up scripts is executed, unless the no- variant is
explicitly set such that the system administrator has indicated to
opt-in to the potentially broken behaviour), in a further future version
we could then start warning that the default behaviour will change at
some point, and then change it eventually.
I'm not really with you there.
I think having a new option that modifies how the scripts are used would
add an extra level of complexity and failure-prone-ness that would be
highly undesirable. It would end up with users randomly setting this
"make it work right" flag everywhere, likely without understanding any
of the implications. Pretty much the way people set "noauth" today or
"novj".
The "interface configuration" we're talking about is just adding the IP
address. But not setting the IFF_UP flag. How does that matter? The
system should ignore any IP addresses until the IFF_UP flag is set. If
it doesn't do that today, then it's *really* fundamentally broken.
Based on my limited testing, Linux isn't *that* bad.
Sorry for these long messages, but I feel it's important to not leave
stones unturned, and to make sure that everything is well understood.
Would the above strategy be adequate for you to keep existing behaviour
as close as possible, whilst still introducing the new behaviour (which
enables improvements for Linux sysadmins then)?
Agreed; it's important to hash it all out. This is 30 year old code, and
it would be good to avoid doing things that make life hard for our heirs.
--
James Carlson 42.703N 71.076W FN42lq08
Without quoting (because the interface confuses me here and it's getting absurdly hard to sort it out):
If we can fix ip-pre-up, that solves our requirement for net/if-pre-up. Having looked at the code I do not think this is a trivial endeavour though. Possible but not easy.
We still need/want a net/if-init.
And some form of -down script (doesn't really matter if we use ip-down or ipv6-down, but we generally can only take action on the last one ... how to figure that out?). Thus for me net/if-down makes sense to execute once all NCPs have been brought down.
ip-* only executes if IPCP is being negotiated, so it's conceivable that in the medium to long term ip-pre-up may no longer be usable. We certainly are trying to get to a point where we can ONLY have IPv6 on interfaces where possible, and tunneling IPv4 EDGES via IPv6 (most likely candidate currently is 464 XLAT or similar). I thus reason that a net/if-pre-up makes sense anyway.
I'm not sure net/if-up makes sense, for that matter even ip-up and ipv6-up make little sense to me as everything we want to achieve should happen in pre-up. I guess there are possible absurd things like "enable bgp peers" that should only happen once the protocol is actually up though (crazy example, since I can't think of anything better). This is protocol specific though, and I can't envision any need for net/if-up.
So the issue on Linux happens with a sequence like:
- LCP comes up.
- AUTH phase happens (in neither, one or both directions, doesn't matter).
- NCPs start (let's say IPCP and IPV6CP).
- IPV6CP completes, and brings interface UP. Only IPv6 LL at this stage. Other handling happens outside of pppd by way of RAs etc ...
At this stage the interface is UP and starts passing IPv6 packets. There is no ipv6-pre-up either (adding this makes a lot of sense for Solaris, and my PR which has been closed can be used for this). In Linux this would suffer the same issue as ip-pre-up currently. One of the two would execute with the interface being down, but not both. And if another NCP is also being negotiated, it could be neither.
Now IPCP completes, and pppd does this:
- Configure the interface's address.
- At this point packets can already be passed (Linux, not Solaris).
- ip-pre-up executes.
- Interface is "brought up" (ie, counter incremented).
- ip-up executes.
Not too worried about teardown. Hoping this brings some clarity as to the perceived problem.
I'm thinking that documentation on Linux should be updated to reflect reality, as that is MUCH, MUCH simpler than fixing the overall problem (which I believe would require modification to ALL NCPs - some of which may not even be in-tree).
I'm thinking introducing interface/network level init, pre-up and down makes sense since these are protocol (NCP) agnostic.
I'm thinking having an option to explicitly execute any NCP specific -pre-up PRIOR to actually configuring also makes sense for me.
Currently ip-pre-up permits for interface renaming. Which is where the idea to permit this from net-pre-up came from. And there was quite a nasty other patch floating around to enable ifname to use % codes to include things like usernames etc into the interface name. The primary motivation being to set the interface name based on the authenticated username, eg, if user jkroon dials in over l2tp/ipsec, the interface can be renamed to something like vpn-jkroon rather than pppX. This can only be done after authentication.
We don't have a specific need for this to be honest, this forms part of our state files. The state files I reference are basically short key=value pair text files we keep on a per-interface basis, for example:
Except for in net-init and net-down these are append-only files. In net-init it's blanked, in net-down it's removed. This allows us to generate lists of connected users, and the mechanism they're using to connect. Information about LACs and calling station IDs are also added to this file. Yes, most/all of this information is also available radius side, but it's still convenient from a system administrator perspective to have this information "locally" on the LNS.
In terms of where we come from, we're an ISP, so using options like "unit" and "ifname" in the options files is precluded since it would cause problems, and we can't afford to have a per-user options file, not to mention that you can't reliably from pppoe-server nor xl2tpd determine which options file would be required until after PPP auth has happened.
Regarding having the option to "modify" behaviour - I do concede that this is nasty, but I don't see a better mechanism. The choices with respect to ip-pre-up (and any other NCP specific pre-up) are as follows:
I'm not convinced of any of this, and as per yourself, this just doesn't sit quite right.
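To make the state-file lifecycle described in this comment concrete, here is a small sketch of the three operations as stated: blank the file in net-init, append key=value lines while the link is up, and remove the file in net-down. The function names, the `.state` suffix and the directory layout are all hypothetical; the real files are not shown in this thread.

```python
import os


def state_path(state_dir, ifname):
    return os.path.join(state_dir, ifname + ".state")


def net_init(state_dir, ifname):
    # Blank the file so leftovers from an earlier pppd run on this unit vanish.
    with open(state_path(state_dir, ifname), "w"):
        pass


def record(state_dir, ifname, key, value):
    # Every other script only ever appends.
    with open(state_path(state_dir, ifname), "a") as f:
        f.write(f"{key}={value}\n")


def read_state(state_dir, ifname):
    with open(state_path(state_dir, ifname)) as f:
        return dict(line.rstrip("\n").split("=", 1) for line in f if "=" in line)


def net_down(state_dir, ifname):
    # Remove the file; a missing file is fine (e.g. net-init never ran).
    try:
        os.remove(state_path(state_dir, ifname))
    except FileNotFoundError:
        pass
```

The blanking step is what needs the unit number before authentication: it is the only point at which a stale file from a crashed earlier run gets cleaned up.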
On 1/3/23 17:29, Jaco Kroon wrote:
Without quoting (because the interface confuses me here and it's getting
absurdly hard to sort it out):
Agreed on that. Not sure what was so wrong with plain text email that we
have to go through this. I guess any existing wheel can be reinvented in
a substantially less than round form.
If we can fix ip-pre-up, that solves our requirement for net/if-pre-up.
Having looked at the code I do not think this is a trivial endeavour
though. Possible but not easy.
If I get a spare moment, I'll look into it. It's not "trivial," but I do
think it's a lot more durable.
As I'm sure you know, one of the big problems with an older code base is
the accumulation of little hacks everywhere -- technical debt. That's my
main concern; that we avoid bolting little things on the side over and
over for each project that comes along. They inevitably require slightly
different things and have no built-in knowledge of each other.
We still need/want a net/if-init.
And some form of -down script (doesn't really matter if we use ip-down
or ipv6-down, but we generally can only take action on the last one ...
how to figure that out, thus for me net/if-down makes sense to execute
once all NCPs have been brought down.
A solution that detects all NCPs up would naturally be able to detect
the down event as well, though I don't really understand this specific
concern. Maybe what you're asking for is script-run-on-IFF_UP-cleared as
some kind of always-final "net-down". That's possible; just not sure how
useful.
As a system architectural matter, I think we're kinda doing all of this
wrong solely because of legacy issues. In a modern system, anyone who
cares about interface up/down can just monitor a routing socket and get
all those events delivered straight from the kernel, free of charge. No
hokey scripts or other nonsense required.
Clearly, that doesn't work for detecting the "I haven't marked it up,
but I will shortly" event and synchronously blocking the flag change
until some action is complete. That case does require something like
ip-pre-up or an internal callback hook for a plug-in. (At least until
someone dreams up a way for a routing socket to allow for interposition
on requests.)
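As an illustration of the routing-socket approach on Linux, the sketch below decodes rtnetlink link messages and reports the IFF_UP bit. It assumes Linux's NETLINK_ROUTE family and the `nlmsghdr`/`ifinfomsg` layouts from the kernel headers; `parse_link_events` and `watch_links` are hypothetical helpers. It only observes changes after the fact, which is exactly the limitation noted above.

```python
import socket
import struct

RTM_NEWLINK, RTM_DELLINK = 16, 17
RTMGRP_LINK = 1
IFF_UP = 0x1
NLMSGHDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BBHiII")  # family, pad, type, index, flags, change


def parse_link_events(data):
    """Yield (msg_type, ifindex, is_up) for each link message in a datagram."""
    offset = 0
    while offset + NLMSGHDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(data, offset)
        if length < NLMSGHDR.size or offset + length > len(data):
            break  # malformed or truncated message; stop parsing
        is_link = msg_type in (RTM_NEWLINK, RTM_DELLINK)
        if is_link and length >= NLMSGHDR.size + IFINFOMSG.size:
            fields = IFINFOMSG.unpack_from(data, offset + NLMSGHDR.size)
            index, flags = fields[3], fields[4]
            yield msg_type, index, bool(flags & IFF_UP)
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned


def watch_links():
    """Block forever, printing link state changes delivered by the kernel."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as s:
        s.bind((0, RTMGRP_LINK))  # (pid, multicast groups)
        while True:
            for _type, index, is_up in parse_link_events(s.recv(65535)):
                print(index, "up" if is_up else "down")
```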
I don't know what your application is doing, but if I were writing it,
that's what I would do. After all, an Ethernet interface also can
provide network service but has NONE of these scripts. WWED ("what would
Ethernet do") is to me a fair way to sort out proposed PPP features.
ip-* only executes if IPCP is being negotiated, so it's conceivable that
in the medium to long term ip-pre-up may no longer be usable. We
certainly are trying to get to a point where we can ONLY have IPv6 on
interfaces where possible, and tunneling IPv4 EDGES via IPv6 (most
likely candidate currently is 464 XLAT or similar). I thus reason that a
net/if-pre-up makes sense anyway.
I have a lot less sympathy for that one. What on Earth is anyone to do
in such a script? "This interface is about to go up, but only for some
mystery protocol" is not a coherent event.
I can understand "authentication has completed." I can even understand
wanting a synchronous call on authentication completion, though I think
it's dangerous in general. I'm sort of baffled about "some NCP I'm not
naming" as an event.
Interface is about to go up for IPv4? Add filters and maybe some proxies
or tunnels. Up for IPv6? Add filters and tunnels. What's the "generic"
form of that?
At worst, the current design forces you to deliver two scripts,
ip-pre-up and ipv6-pre-up (assuming we add that, and assuming the bugs
are fixed), that have identical contents and idempotent behavior. So
what? I guess someone could argue that it "doesn't scale." But that also
doesn't matter when we add a new network layer protocol about once per
human generation.
I'm not sure net/if-up makes sense, for that matter even ip-up and
ipv6-up makes little sense to me as everything we want to achieve should
happen in pre-up. I guess there are possible absurd things like "enable
bgp peers" that should only happen once the protocol is actually up
though (crazy example, since I can't think of anything better). This is
protocol specific though, and I can't envision any need for net/if-up.
The original reason for "ip-up" was to trigger local sync processes. A
common one was to kick off a fetchmail process or a netnews send/receive.
Granted, as I said above, I think it's a historical mistake. There
should instead have been a separate cron-like service that allows
you to schedule things to run based on interface availability -- with
the daemon reading from a routing socket rather than setting timers.
Doing it just for the special case of PPP is suspicious at best.
So the issue on Linux happens with a sequence like:
LCP comes up.
AUTH phase happens (in neither, one or both directions, doesn't matter).
NCPs start (let's say IPCP and IPV6CP).
IPV6CP completes, and brings interface UP. Only IPv6 LL at this stage.
Other handling happens outside of pppd by way of RAs etc ...
/at this stage the interface is UP and starts passing IPv6 packets./
There is no ipv6-pre-up either (adding this makes a lot of sense for
Solaris, and my PR which has been closed can be used for this). In Linux
this would suffer the same issue as ip-pre-up currently. One of the two
would execute with the interface being down, but not both. And if
another NCP is also being negotiated, it could be neither.
Are you referring to the current code or how it *should* operate?
Personally, I think the pppd daemon should at least operate according to
what is documented in the man page. That it does not on Linux is a bug,
not a feature. And certainly not something we need to retread in every
single message on this topic.
Now IPCP completes, and pppd does this:
Configure the interface's address.
At this point packets can already be passed (Linux, not Solaris).
ip-pre-up executes.
Interface is "brought up" (ie, counter incremented).
ip-up executes.
Again, this appears to be just a restatement of the existing bug. I'm
not sure how this helps or makes anything any clearer.
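For readers following along, the sequence being described can be reduced to a toy trace. This is a sketch of the behavior as reported in this thread, not an extract from pppd; the function name and event strings are invented:

```python
def current_linux_trace(completion_order):
    """Replay the reported ordering: each NCP raises the shared interface itself."""
    iff_up = False
    trace = []
    for ncp in completion_order:
        if ncp == "IPCP":
            trace.append("configure IPv4 address")
            # ip-pre-up is documented to run while the interface is down.
            trace.append(f"ip-pre-up sees IFF_UP={iff_up}")
            iff_up = True
            trace.append("ip-up")
        elif ncp == "IPV6CP":
            trace.append("configure IPv6 link-local address")
            iff_up = True  # no ipv6-pre-up exists; the interface goes straight up
            trace.append("ipv6-up")
    return trace
```

Calling it with IPV6CP first yields "ip-pre-up sees IFF_UP=True", while IPCP first yields "IFF_UP=False". The man page guarantee therefore holds only when IPCP happens to win the race.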
Not too worried about teardown. Hoping this brings some clarity as to
the perceived problem.
I'm thinking that documentation on Linux should be updated to reflect
reality, as that is MUCH, MUCH simpler than fixing the overall problem
(which I believe would require modification to ALL NCPs - some of which
may not even be in-tree).
Definitely, absolutely not.
This would be terribly wrong. The whole point of ip-pre-up is that it
runs before the interface is marked "up." The whole point. As in: just
delete all the code instead or make it "ifndef LINUX" if the plan would
be to update the documentation. There is absolutely no excuse for a
feature that "sometimes, sort of, kinda, maybe" works. It's an
attractive nuisance.
The question of outside-of-tree NCPs is an interesting one. I don't know
if there's (say) a Linux MPLS that has its own "sifup" function that
torques the IFF_UP flag on "the interface." Heck, I don't know if Linux
MPLS uses the same interface structure as regular network protocols --
if it does, then I think that'd be pretty wild.
In any event, plug-ins have *always* been an architectural issue with
pppd because there's very little constraint on the API. You can do
arbitrary things in a plug-in that drag pppd outside of the existing
documented behavior and/or out of alignment with the RFCs. So, having
effectively a flag day for them doesn't sound out of bounds to me. More
importantly: it matters much more to me that stock pppd works the way
it's designed to work and less so that any hypothetical plug-in stays
functional.
I suspect we can accomplish either a compilation error or a run-time
crash for any such unmodified NCP plug-in.
I'm thinking introducing interface/network level init, pre-up and down
makes sense since these are protocol (NCP) agnostic.
The init and down one seem somewhat innocuous. I think they may well not
be nearly as widely useful as you're expecting (certainly less useful
than the existing ip-up; likely only for your project), but at least
they're not actively harmful.
I'm having a little trouble with a generic pre-up, as I just don't know
what it means. Yeah, it's not too hard to implement (once the rest of
the infrastructure is added), but I'm struggling with the utility of it.
It seems like you're trying to future-proof something for a world where
there's neither IPv4 nor IPv6. Or something like that. But without even
a clear use-case for what actions might be taken in those scripts.
I'm thinking having an option to explicitly execute any NCP specific
-pre-up PRIOR to actually configuring also makes sense for me.
What does "actually configuring" mean in this context?
Does it mean "adding the IPCP negotiated IP address to the interface?"
If so, then once again I must ask "why?" What can you accomplish before
the address is set on the interface that cannot be accomplished after it
is set?
If you want to be in the middle of IPCP negotiation, approving or
changing addresses, then that's a different discussion.
Currently ip-pre-up permits interface renaming. Which is where the
idea to permit this from net-pre-up came from. And there was quite a
nasty other patch floating around to enable ifname to use % codes to
include things like usernames etc into the interface name. The primary
motivation being to set the interface name based on the authenticated
username, eg, if user jkroon dials in over l2tp/ipsec, the interface can
be renamed to something like vpn-jkroon rather than pppX. This can only
be done after authentication. We don't have a specific need for this to
be honest, this forms part of our state files.
Ah, ok. That (finally) makes some sense. That's the sort of thing I
wanted: information about what you're actually trying to do.
For what it's worth, we already have that feature in the built-in
authentication.
/etc/ppp/chap-secrets:
***@***.*** * "joe's secret" * -- ifname ppp-joeuser
I don't know about the RADIUS or Diameter plug-ins. They should have
features like that. It'd be a shame if they didn't as it's a
long-standing feature in pppd.
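To make the renaming use case concrete (vpn-jkroon rather than pppX), here is a minimal sketch of deriving an interface name from an authenticated username. The helper name `make_ifname` and the sanitising rule are assumptions for illustration, not pppd code; the 16-byte limit is Linux's IFNAMSIZ (including the terminating NUL).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Linux limits interface names to IFNAMSIZ (16) bytes, NUL included. */
#define SKETCH_IFNAMSIZ 16

/*
 * Hypothetical helper, not part of pppd: build "<prefix><user>" in out,
 * truncating to fit and replacing every character that is not
 * alphanumeric, '-' or '_' with '_'.
 * Returns 0 on success, -1 if user is empty.
 */
int make_ifname(char out[SKETCH_IFNAMSIZ], const char *prefix, const char *user)
{
    size_t n;

    assert(out != NULL && prefix != NULL && user != NULL);
    if (user[0] == '\0')
        return -1;

    /* snprintf truncates to the buffer size and always NUL-terminates. */
    snprintf(out, SKETCH_IFNAMSIZ, "%s%s", prefix, user);

    for (n = 0; out[n] != '\0'; n++) {
        unsigned char c = (unsigned char)out[n];
        if (!isalnum(c) && c != '-' && c != '_')
            out[n] = '_';
    }
    return 0;
}
```

Truncation means two long usernames can collide on the same name, so anything real would need a uniqueness check before renaming.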
The state files I reference are basically short key=value pair text
files we keep on a per-interface basis, for example:
|direction=in protocol=l2tp ***@***.*** |
Except for in net-init and net-down these are append-only files. In
net-init it's blanked, in net-down it's removed. This allows us to
generate lists of connected users, and the mechanism they're using to
connect. Information about LACs and calling station IDs are also added
to this file. Yes, most/all of this information is also available radius
side, but it's still convenient from a system administrator perspective
to have this information "locally" on the LNS.
OK. So *not* pppd files. That was my point of confusion.
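For readers following along, the state-file lifecycle described above (blanked in net-init, appended to by later scripts, removed in net-down) can be sketched as three small operations. This is only an illustration in C of what the scripts are said to do; the function names are hypothetical and nothing here is taken from the actual scripts or from pppd.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* net-init step: create the per-interface state file, or blank it. */
int state_reset(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "w");            /* "w" truncates an existing file */
    if (f == NULL)
        return -1;
    return fclose(f) == 0 ? 0 : -1;
}

/* Any later script: append one key=value line, never rewrite. */
int state_append(const char *path, const char *key, const char *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && key != NULL && value != NULL);
    f = fopen(path, "a");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%s=%s\n", key, value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* net-down step: remove the file so the user no longer shows as connected. */
int state_remove(const char *path)
{
    assert(path != NULL);
    return unlink(path) == 0 ? 0 : -1;
}
```

Because the reset and the first append come from different scripts, this only stays consistent if the resetting script is guaranteed to finish first, which is the synchronous-execution question raised in this thread.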
Since I really haven't spent a great deal of time with L2TP, one area
that seems a little confusing to me is that this seems to be relying on
PPP authentication data rather than SCCRP Host Name and Challenge.
But ok.
In this case, since your system is the one that is demanding
authentication, the existing auth-up and auth-down scripts do almost
exactly what you want. auth-up is invoked when the user has been
authenticated by PPP, and auth-down is invoked when the link goes down
after having done auth-up. (And pppd has some logic to make sure that we
never overlap auth-up and auth-down, in case the scripts run for a "long
time.")
It sounds like your concern there is that these scripts aren't
synchronous -- that we don't block the state machine progress based on
script termination. But I don't think that's a serious issue given the
usage case. I would just write the authentication data to one file via
auth-up, and then write the addressing data to another file with ip-up.
The user interface that displays "logged in users" can read both and
correlate based on interface name.
If it's just a matter of convenience, that seems more than sufficient.
If it's something with tighter constraints, then how do those fit in?
Perhaps what you really want here is a custom plug-in to do all the
special actions you need (and perhaps an API or two to enable them)
rather than a slate of new general-purpose mechanisms in pppd.
In terms of where we come from, we're an ISP, so using options like
"unit" and "ifname" in the options files is precluded since it would
cause problems, and we can't afford to have a per-user options file, not
to mention that you can't reliably from pppoe-server nor xl2tpd
determine which options file would be required until after PPP auth has
happened.
There's already a feature to do that in the built-in pppd authentication
mechanism -- it's the "extra_options" mechanism in auth.c. No special
options files needed.
Regarding having the option to "modify" behaviour - I do concede that
this is nasty, but I don't see a better mechanism. The choices with
respect to ip-pre-up (and any other NCP specific pre-up) are as follows:
1. Leave broken. But at least document it.
2. Try to fix it, probably requiring all NCPs to be modified.
3. Leave broken by default, document it and give an option for changed
behaviour (and warn if it's not set). Eg, preup-prior-to-conf and
preup-after-conf (naming is too long, but you get the idea, issue a
warning if neither option is set ... in Linux).
I'm not convinced of any of this, and as per yourself, this just doesn't
sit quite right.
4. Break these things up to reduce the dependencies:
a. Fix the bugs (places where pppd doesn't conform to its existing
documentation and/or where the behavior or the documentation are
incomplete or incoherent).
b. Create a plug-in that provides the L2TP dial-in access server glue,
potentially adding some new callback hooks to make possible whatever
it needs to do.
--
James Carlson 42.703N 71.076W FN42lq08 ***@***.***>
|
Thank you, that feedback was very clear and concise, much appreciated. Fully agreed around technical debt, I've spotted a fair amount of that in ppp project just trying to implement this. Hoping to SOLVE some of this for the future, not add to it, so your input goes a long way for that.
Regarding comparing ethernet to ppp ... I don't think that's always a fair comparison, ethernet is designed as a multi-access network, not point-to-point, and whilst many of the issues may be similar, this is not always the case, especially when you consider ISP type use. If you look at dhcpcd for example you will note it also has a very comprehensive hook script mechanism that can be used to take a number of actions, and for us these actions are very different when compared to ppp. So even on a client or "access" side, we generally treat this very differently.
I was not aware of extra_options, to the best of what I can determine looking at auth.c this only applies to local secrets files.
/* Extra options to apply, from the secrets file entry for the peer. */
Yes, to a degree I think net-pre-up is more future-proof than ip-pre-up as it does not rely on IPCP being run. It also makes it more clear that it will execute regardless of NCPs being negotiated. This world is probably further off than I'd like to see, but I figured whilst I'm messing/tampering with it I might just as well.
I personally do believe that net-pre-up vs ncp-pre-up makes some sense. And I do believe both have their uses. For example, it makes more sense for me to do NCP agnostic work (like setting up tc on the ppp interface) from a network-agnostic script, and then to possibly amend or add to it from ncp-pre-up PRIOR to the interface coming up. Frankly, in this specific use-case we have nothing protocol specific going on, however in some other cases we do have some ipv4 vs ipv6 specific things going on.
The ipv6 stuff currently, due to lack of ipv6-pre-up, just happens from ipv6-up, and we use procfs to not auto-configure an interface by default, but then on the ppp interface, if we'd like it, enable it from ipv6-up, so this isn't a major blockage currently. Based on the fact that this has not been requested previously, I suspect that either IPv6 penetration isn't what we'd like it to be yet, or very few people care enough (or have figured out workarounds like we have). Towards this end, no, I do not believe that it's one or the other, it really should be both.
I do agree I'd prefer to have the interface configured for the NCP prior to invoking the ncp-pre-up, but if we can't guarantee that this can happen, then I'd like the OPTION of having it execute PRIOR to configuring the interface with IP addresses, but with the environment variables already set. As we've established, behaviour on solaris is sane, linux is not. I'd rather have the issue documented at the very least in the man page, and workarounds provided if the issue can't be properly fixed. With an option to pick which quirky behaviour the system administrator prefers, and I see three options (based on your input too):
1. Delay interface UP until all NCPs have completed (you stated you will investigate options, there are downsides to this w.r.t. buggy remotes resulting in long, sometimes indefinite negotiation, and simply stating the remote side should fix isn't always viable/practical).
2. Execute pre-up prior to configuring (adding addresses) to interface, with no guarantee that interface is still down.
3. Execute pre-up after configuring, with no guarantee that interface is still down.
I'll happily concede that 1 should be the default, but I'd like the option of behaviour 2 or 3 too.
I'd like to have the options that interface comes up even if only one NCP completes, in case of remote bugs (being on the ISP side it's hard to guarantee what stupidity we face from the remote side, we've seen some ... interesting things).
We are pushing to a world where we can deploy routers that support 464XLAT such that we can move to a world where we only have IPv4 where absolutely required, so these routers will NACK IPCP, and then NAT46 IPv4 into the well-known 64:ff9b::/96, which the core ISP network can then NAT64 again on some dedicated host(s) which are connected to the world with IPv4. This is still a bit off, but yes, I'm at the same time trying to future-proof for this. IPv4 is fast becoming a scarce resource, and at the current IP lease rates this isn't something we're looking forward to, and thus looking to mitigate early.
With regards to l2tp and pppoe other "outer" protocols, we can trust the l2tp host (FNO controlled, and if we can't trust them, then we've got substantially bigger problems than rogue clients), but we can't trust the pppoe nor the ppp side of things. In the case of the FNO bridging PPPoE to L2TP the SCCRP authenticates the LAC, not client, so the LAC (which bridges between PPPoE and L2TP) is authenticated at this stage, but ppp ends up authenticating the final client. VPN use typically only authenticates in ppp anyway, and SCCRP doesn't happen here (PPP/L2TP/IPSec for example).
I've not actually used MPLSCP on ppp yet, so I honestly don't know what's all involved there, or even how it works, but we will have to look at this later in the year for another project. |
On 1/4/23 10:43, Jaco Kroon wrote:
Thank you, that feedback was very clear and concise, much appreciated.
Fully agreed around technical debt, I've spotted a fair amount of that
in ppp project just trying to implement this. Hoping to SOLVE some of
this for the future, not add to it, so your input goes a long way for
that. Regarding comparing ethernet to ppp ... I don't think that's
always a fair comparison, ethernet is designed as a multi-access
network, not point-to-point, and whilst many of the issues may be
similar, this is not always the case, especially when you consider ISP
type use. If you look at dhcpcd for example you will note it also has a
very comprehensive hook script mechanism that can be used to take a
number of actions, and for us these actions are very different when
compared to ppp. So even on a client or "access" side, we generally
treat this very differently.
DHCP is an interesting comparison because DHCP can be used over PPP with
both IPv4 and IPv6. The fact that we negotiate DHCP-like things within
some (but certainly not all) of the NCPs is more of a bug than a
feature, but is definitely water under the bridge. In particular, the
DNS weirdness in IPCP was delivered to the WG as a fait accompli by a
Particular Large Vendor, and I think predates my time as chair. In any
event, it wasn't at all a pleasant surprise. It stinks and is a terrible
mistake we have to drag along with us "forever."
(For those following along: DHCP isn't just an address leasing
mechanism, but can be used with already-addressed interfaces to provide
all sorts of application layer data. It -- and the BOOTP subset -- have
a long history with serial links prior to PPP, where BOOTP over SLIP was
once somewhat common and served exactly the same purpose.)
I was not aware of extra_options, to the best of what I can determine
looking at auth.c this only applies to local secrets files.
/* Extra options to apply, from the secrets file entry for the peer. */
static struct wordlist *extra_options;
Eh, probably so. Oh well. In any event, that was the intended design
here: a given identity can unlock a set of configuration values. It has
the downside that it happens after authentication is complete, so you
can't set some of the basic link parameters on a per-user basis, but you
can definitely tweak all of the network layer things you might need.
You'd probably have to talk to someone involved with RADIUS/Diameter to
find out how (or "if") such a thing would be usable there.
Worst case, one could augment the AAA plug-in so that it also reads a
file or database to map from authenticated user ID to a set of pppd
options to inject via extra_options.
In any event, it wasn't intended as *exclusively* for authentication via
built-in file mechanisms.
Yes, to a degree I think net-pre-up is more future-proof than ip-pre-up
as it does not rely on IPCP being run. It also makes it more clear
that it will execute regardless of NCPs being negotiated. This world is
probably further off than I'd like to see, but I figured whilst I'm
messing/tampering with it I might just as well.
If we had the Linux bug fixed and you hooked both ip-pre-up and (an
added) ipv6-pre-up, I suspect you'd have a solution that substantially
outlasts at least my lifespan. And probably the marketspan of customers
initiating PPP links as well. So that doesn't quite do it for me. (But
see below.)
I /personally/ do believe that net-pre-up vs ncp-pre-up makes some
sense. And I do believe both have their uses. For example, it makes more
sense for me to do NCP agnostic work (like setting up tc on the ppp
interface) from a network-agnostic script, and then to possibly amend or
add to it from ncp-pre-up PRIOR to the interface coming up. Frankly, in
this specific use-case we have nothing protocol specific going on,
however in some other cases we do have some ipv4 vs ipv6 specific things
going on. The ipv6 stuff currently due to lack of ipv6-pre-up just
happens from ipv6-up, and we use procfs to not auto-configure an
interface by default, but then on the ppp interface if we'd like it
enable it from ipv6-up, so this isn't a major blockage currently. Based
on the fact that this has not been requested previously, I suspect that
either IPv6 penetration isn't what we'd like it to be yet, or very few
people care enough (or have figured out workarounds like we have).
Towards this end, no, I do not believe that it's one or the other, it
really should be both.
tc might be a viable explanation of such an agnostic script. That's what
I was trying to push towards: a usage case, so it's possible to document
why such a feature is there and how anyone might use it somewhere and
(crucially) what it could depend on. Many things seem to turn out poorly
if there's no usage case because we end up getting all the boundary
conditions wrong.
Note that the existing "set_filters()" function is invoked after
authentication is complete and before the network layers start
negotiating, so it's all pretty well staked out.
Though it's also the case that for scaling to very large numbers of
users, you may well not want to have a bunch of scripts tying everything
together. But I guess that's your issue to navigate.
I do agree I'd prefer to have the interface configured for the NCP prior
to invoking the ncp-pre-up, but if we can't guarantee that this can
happen, then I'd like the OPTION of having it execute PRIOR to
configuring the interface with IP addresses, but with the environment
variables already set. As we've established, behaviour on solaris is
sane, linux is not. I'd rather have the issue documented at the very
least in the man page, and workarounds provided if the issue can't be
properly fixed. With an option to pick which quirky behaviour the system
administrator prefers, and I see three options (based on your input too):
1. Delay interface UP until all NCPs have completed (you stated you
will investigate options, there are downsides to this w.r.t. buggy
remotes resulting in long, sometimes indefinite negotiation, and
simply stating the remote side should fix isn't always
viable/practical).
It's not indefinite. The existing pppd FSM parameters (and the design of
PPP itself) prevent that.
2. Execute pre-up prior to configuring (adding addresses) to interface,
with no guarantee that interface is still down.
3. Execute pre-up after configuring, with no guarantee that interface
is still down.
It sounds like you are still describing bugs, so I think I need to work
through some examples here.
First of all, although the underlying protocol design allows it, and any
good definition should allow it, the current pppd implementation does
not actually support any "passive" NCPs. (Only LCP can be marked
"passive," allowing the peer to initiate the link first.) Once we get to
continue_networks(), all of the NCPs are given the "Open" event at once
and they all start negotiating. This is then bounded overall by the
restart timer and the maximum Configure-Request count.
So, a good but complicated starting example starts with these:
1. LCP negotiates and goes to Opened state.
2. CHAP negotiates and goes to Opened state.
3. CCP+MPPE negotiates and goes to Opened state.
At this point, we'll issue the "Open" event to all of the regular NCPs,
and they start negotiating in parallel. The interface is affirmatively
*DOWN* and nothing is yet configured. Now, we *could* invoke a
net-pre-up script here, but I'm wobbly on that unless the excuse is just
to invoke 'tc' commands on Linux. In which case ... eh ... ok. Hope you
didn't actually need IP addresses for that.
In the current code (WITHOUT a fix), the next step gets ambiguous
because of a Linux bug. Specifically, we could end up doing this
specific sequence:
4x. IPV6CP finishes and marks interface UP
5x. IPCP finishes, sets addresses, and causes ip-pre-up to run (BAD!
BUG! STOP!).
I am once again arguing that this is straight-up wrong. It's a problem.
It must be fixed. We just can't live like that. It's not something to
document and pretend never happened. It's not a reason to introduce a
new "fixed" interface and let the existing code rot in place as though
it hadn't been thought about.
Assuming a fix, the continuation looks like this:
4. NCPs negotiate with some ending in Opened state and some in
Starting state (following LCP Protocol-Reject and a lower-down
event). The NCPs that happen to negotiate addresses set
them on the interface as part of this process, but AGAIN the
PPP network interface is still DOWN.
5. After all NCPs settle into one of those two states as above,
then invoke ip-pre-up and ipv6-pre-up, as indicated, and wait for
them to terminate synchronously. Note: interface is still DOWN.
6. Now set IFF_UP in a platform-dependent manner.
7. Finally, invoke ip-up and ipv6-up.
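The gate implied by steps 4 through 7 can be sketched as a pair of predicates: nothing past step 4 happens until every NCP has stopped negotiating, and the interface is only worth bringing up if at least one of them opened. The enum and function names below are assumptions made for this sketch and are deliberately simpler than pppd's real FSM states.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified per-NCP outcome (hypothetical; not pppd's fsm.h states).
 * NCP_STARTING stands for the "gave up" case in the text: Starting state
 * after an LCP Protocol-Reject and a lower-down event.
 */
enum ncp_state { NCP_NEGOTIATING, NCP_OPENED, NCP_STARTING };

/* Step 5 precondition: 1 once no NCP is still negotiating, else 0. */
int all_ncps_settled(const enum ncp_state *s, size_t n)
{
    size_t i;

    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (s[i] == NCP_NEGOTIATING)
            return 0;
    return 1;
}

/* 1 if at least one NCP reached Opened, else 0. */
int any_ncp_opened(const enum ncp_state *s, size_t n)
{
    size_t i;

    assert(s != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (s[i] == NCP_OPENED)
            return 1;
    return 0;
}

/*
 * Steps 5-7 (run the pre-up scripts synchronously, set IFF_UP, run the
 * up scripts) may start only when this returns 1.
 */
int may_bring_interface_up(const enum ncp_state *s, size_t n)
{
    return all_ncps_settled(s, n) && any_ncp_opened(s, n);
}
```

The point of the single gate is that IFF_UP is set in exactly one place, so no individual NCP can mark the interface up while another NCP's pre-up script has yet to run.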
That's the normal (albeit a little complicated) route. The only
interesting part comes if someone decides to start up an NCP in the
middle of a connection. The underlying protocol allows this, and pppd
also supports it, though the sequence itself is made tricky because of
the lack of "silent" mode. This case looks like:
4-7. The protocols come up normally and we do as above.
8. The peer sends IPCP Terminate-Request. This causes the IP address
to be removed, but if IPv6 is still running, nothing else happens.
The interface is still up. The IPCP state machine is left as
Stopped. Only IPv4 is effectively "down."
9. Some time passes.
10. The peer sends IPCP Configure-Request. We renegotiate IPCP and add
the address back on. At this point, the interface is still up and
we (somewhat bogusly for Linux only) invoke ip-pre-up again and
then ip-up. Even with the fixes described so far from either of us,
this is still broken.
That's the one case where I think we would need to do something special
for Linux. Is this the case you're worried about? If so, we could
possibly do something like check whether ip-pre-up exists and, if so,
then temporarily take the interface down, set the address, run the
script, and bring it back up. That'll cause a short outage in IPv6, but
probably nobody really cares.
(If we had "silent" mode for IPCP or IPV6CP, we could start the protocol
off in Stopped state and allow it to come up later. The existing FSM
implementation supports this, but there's no option to enable it because
nobody has ever needed it.)
I'll happily concede that 1 should be the default, but I'd like the
option of behaviour 2 or 3 too.
I don't want (3) as stated. That's clearly a bug. Either the interface
is down for the protocol or the "pre-up" can't be used.
I can see how (2) might be a hack-around for the specific IFF_UP problem
on Linux, particularly for the re-negotiation case described as (10)
above. I don't particularly like this, though, especially as it has
unknown side-effects with existing scripts. It's entirely possible that
existing scripts expect the address to be on the interface so that
"route add" can be used with a next-hop address. (Yeah, on some systems
you don't and shouldn't need a next-hop address for point-to-point. But
legacy stuff and/or confused admins ...)
I'd very much rather that (regardless of whether the bug is fixed or
not) we have good hooks for a plug-in that wants to do something weird
during the address-setting process. Hooks like that are nearly free in
terms of maintenance cost. And that would enable you and others doing
similar things to make progress without having to integrate a change now.
I'd like to have the options that interface comes up even if only one
NCP completes, in case of remote bugs (being on the ISP side it's hard
to guarantee what stupidity we face from the remote side, we've seen
some ... interesting things).
We have those options today. See the "*-max-configure" and "*-restart"
options. They do *exactly* that. We give up on a protocol and pretend
that the peer has rejected it if the peer has bugs. We even do it *by
default*, because the options have default limits on them.
I don't see how you can do any better than that. There are few truly
timely and robust ways to discover that a given peer is clueless.
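As a rough worked example of that bound: a peer that never answers an NCP's Configure-Requests can stall that NCP for approximately the restart interval multiplied by the maximum number of Configure-Requests. The helper below is a hypothetical illustration, and the figures in the note after it (a 3 second restart interval and 10 Configure-Requests) are the defaults as I understand them from the pppd(8) man page; check them against your version.

```c
#include <assert.h>

/*
 * Approximate upper bound, in seconds, on how long a completely silent
 * peer can hold up one NCP: one restart interval per Configure-Request.
 * Hypothetical helper for illustration; not part of pppd.
 */
unsigned ncp_giveup_seconds(unsigned restart_secs, unsigned max_configure)
{
    assert(restart_secs > 0 && max_configure > 0);
    return restart_secs * max_configure;
}
```

With the assumed defaults, `ncp_giveup_seconds(3, 10)` gives 30 seconds, and lowering either option (for example ipcp-restart or ipcp-max-configure) shrinks the worst case proportionally.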
I've not actually used MPLSCP on ppp yet, so I honestly don't know
what's all involved there, or even how it works, but we will have to
look at this later in the year for another project.
Bit of a red herring. I was casting about for what network layer
protocols other than IPv4 and IPv6 you could possibly be caring about
that could possibly be involved somehow in the setting of the IFF_UP
bit, and wasn't sure who your end users might be (yours aren't
necessarily the same as mine).
All of the other network layers that an ordinary user might care about
(IPX, AppleTalk) are long since dead and buried. That leaves MPLS, which
is really used as a datagram replacement for ATM or Frame Relay in
backbone applications. No ordinary home user really ought to be asking
for MPLS, though it might be possible for some kinds of business links.
(Basically, the same places where they'd buy a dedicated circuit.)
But if we did have it, my point was that it's unclear to me how it'd
actually be hooked in. It's more L2-ish than L3-ish, so I'd think it'd
be really weird if it showed up as part of "ppp0" or that it would care
about IFF_UP. At least that'd be weird to me; I don't know what
Linusians think about the issue. It should maybe look like whatever an
InfiniBand underlying interface looks like.
And if you meant something like IPv9 (since the numbers through at least
8 have already been proposed and shot), well, nah. We're having a tough
enough time getting IPv6 deployed after 28+ years since IPng and CLNP
got hashed out in Toronto. I mean, I don't want to be the one who says
"640k ought to be enough," but I wouldn't hold my breath expecting
something else. (Yes, there are admittedly potential wildcards. The
alphabet-related one certainly comes to mind, as do a few national
governments. But I'll just keep my eyes closed and trust that deployment
is very much harder than any of them might think.)
--
James Carlson 42.703N 71.076W FN42lq08 ***@***.***>
|
640KB :p.
For tc commands, yes, we definitely DO NOT require any form of IP information here. At least, for what we use it for :).
Your "later renegotiate" reasoning looks solid. But I disagree, once the interface is up it should remain up. It's trivial to disable ipv6 temporarily on a per interface basis via sysctl (net.ipv6.conf.${interface}.disable_ipv6), but I don't see similar for ipv4. And once disabled, to re-enable, experience dictates that a number of other sysctls for the same iface also need to be properly reset. I've not dug too deep into this. This is a really, really sticky case, but not one I'm too worried about.
I still do not like there being any form of longer delay bringing the interface up than absolutely required. I really feel strongly that (on Linux at least) this should be configurable behaviour, or at the very least, if no ncp-pre-up scripts are present, should default to not delaying bringing the interface up. Since pre-up's are handled from the NCPs this gets tricky again and would require co-operation from NCPs to indicate which has it.
For downing the interface temporarily that's a bad idea IMHO, and again, if this is the way you want to go, sure, but I'd definitely want a configuration option to avoid this behaviour.
Given the discussion, would it be OK for me to so long amend the net-* scripts (would you prefer if-*?) to provide for network agnostic net-init (executes prior to auth even, the moment LCP has been established), net-pre-up (auth completed, either when first NCP signals ready, or prior to NCP being initiated), net-down (after LCP has shut down, prior to deleting the interface from the OS).
Can this be done currently by way of plugin rather than coding this into the core? Then users looking for this can just load this as a plugin rather, eg, netscripts or ifscripts? |
On 1/6/23 02:38, Jaco Kroon wrote:
640KB :p.
For tc commands, yes, we definitely DO NOT require any form of IP
information here. At least, for what we use it for :).
Your "later renegotiate" reasoning looks solid. But I disagree, once the
interface is up it should remain up. It's trivial to disable ipv6
temporarily on a per interface basis via sysctl
(net.ipv6.conf.${interface}.disable_ipv6), but I don't see similar for
ipv4. And once disabled, to re-enable experience dictates that a number
of other sysctl's for same iface also needs to be properly reset. I've
not dug too deep into this. This is a really, really sticky case, but
not one I'm too worried about.
Agreed; it would be messy at best. Turning the bit off may well have
undesirable side-effects. I'm not too sure what to do about that one. I
don't think there are any really good answers for it.
I still do not like there being any form of longer delay bringing the
interface up than absolutely required. I really feel strongly that (on
Linux at least) this should be configurable behaviour, or at the very
least, if no ncp-pre-up scripts are present, should default to not
delaying bringing the interface up. Since pre-up's are handled from the
NCPs this gets tricky again and would require co-operation from NCPs to
indicate which has it.
Conditioning the behavior (setting the IFF_UP flag) on the existence of
a pre-up script makes sense to me. It's not really observable in any
meaningful way without that script in place.
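A minimal sketch of that conditioning, assuming the check is simply "is any pre-up script present and executable": the function names are hypothetical, and while /etc/ppp/ip-pre-up is the usual location of the existing script, an ipv6-pre-up path would be an assumption since that script does not exist yet.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* 1 if path exists and is executable by this process, else 0. */
int script_is_runnable(const char *path)
{
    assert(path != NULL);
    return access(path, X_OK) == 0;
}

/*
 * Hypothetical policy helper, not pppd code: delay setting IFF_UP until
 * all NCPs settle only if at least one pre-up script is installed.
 * Without any such script the delay would not be observable anyway.
 */
int should_delay_ifup(const char *const *paths, size_t n)
{
    size_t i;

    assert(paths != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (script_is_runnable(paths[i]))
            return 1;
    return 0;
}
```

A caller would pass the list of candidate pre-up script paths once, after authentication and before the NCPs are opened, so the decision cannot change halfway through negotiation.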
I'm not really sure that such a wait does, in fact, result in any
concerning delay, though. For properly-functioning implementations, we
still have to go through LCP+AUTH+NCP to get to an 'up' interface, no
matter what we do. Of those bits, the frequent outlier is the
authentication portion, not the NCP, as it often proxies into some
external AAA mechanism.
(Yes, I know about the bad case of 3GPP, where they acknowledge all
authentication data without bothering to check any of it, and then do
the expensive AAA stuff during NCP. The less said about that, the better.)
Even with a poorly-behaved (but still conformant) implementation, we're
talking about _maybe_ a couple of seconds worth of skew. It's only the
hypothetical really, really bad implementation that completely ignores
the Configure-Request messages for an NCP and forces us into timing out
completely that causes a noteworthy delay.
I'd rather document that as "here are the risks associated with pre-up
if your peer stinks, and your options for a work-around." (With the
work-around being either don't use pre-up, tune the existing timeouts
and retries downwards, or just disable the faulty protocols and drive on.)
For downing the interface temporarily that's a bad idea IMHO, and again,
if this is the way you want to go, sure, but I'd definitely want a
configuration option to avoid this behaviour.
Agreed. I was just trying to think out loud about the larger problem to
see where any sort of "fix" for it might break down.
I think I'm ok with saying that if pre-up exists, then we just don't
allow protocols to be renegotiated like that without taking LCP down. I
think that if you're using pre-up, you're *presumably* doing it for
some Really Good Reason and not just on spec. As such, it should do what
it advertises to do and not something arbitrarily different.
Given the discussion, would it be OK for me to so long amend the net-*
scripts (would you prefer if-*?) to provide for network agnostic
net-init (executes prior to auth even, the moment LCP has been
established), net-pre-up (auth completed, either when first NCP signals
ready, or prior to NCP being initiated), net-down (after LCP has shut
down, prior to deleting the interface from the OS).
Those will need to be nailed down a bit more.
I'm still fuzzy on the purpose of net-init as described here. You'd have
no available information there about the identity of the peer, so the
whole bit about renaming the interface or issuing per-user 'tc' settings
seems out the window. It's not clear you could do much in that script,
but I _guess_ the purpose is to do "rm -f" on a pile of left-over
external state that might exist in this particular implementation ... ?
If that's it, then I'd wonder whether there's a better answer somewhere.
Is net-init also synchronous like the pre-up script and does it thus
block LCP from executing? It seems so.
The "net-pre-up" needs to be defined more crisply. If it happens right
after auth + CCP/ECP, runs synchronously, and blocks the starting of
NCPs, then I think I understand it. If it's anywhere else, then I'm much
more wobbly on how to describe it for a user.
"After the first NCP" seems like a bad definition to me. It means that
it ends up with an unclear relationship with ip-pre-up. It's just
confusing that way. In that case, it seems like you're trying to build a
union of ipv4-or-ipv6-pre-up, and I think that's either pointless or
just completely dependent on Linux issues and a fear of bad peers. As
far as I can tell, the existing design serves the purpose, and there's
no good reason I can see to force the union. Even if it means that you
personally are forced to deliver two identical pre-up scripts and merge
the events on the other end, I don't think the added bits in pppd for
that part are worthwhile.
If that's what you really want, then I think that belongs in a plug-in
instead.
It sounds like you might intend "net-down" to be synchronous as well.
I'm not sure that's a great idea, because "down" scripts have a long
history of hanging and causing resource leaks. I think the existing down
scripts do as much as we can reasonably do here, though, for symmetry
with "net-pre-up," I guess there could be a "net-down" that (presumably)
is invoked after the other ip*-down scripts. I just think it should be
async and guarantee nothing about the network interface.
And, again, if you need to do something peculiar that would be hazardous
for general use, I think a plug-in is the right way to go about that.
Can this be done currently by way of plugin rather than coding this into
the core? Then users looking for this can just load this as a plugin
rather, eg, netscripts or ifscripts?
I was mostly suggesting that for your particular L2TP application, you
should probably explore the option of creating a plug-in that does what
you want rather than trying to wire everything together with a bunch o'
scripts (and thus requiring yet-more baroque scripting options). I don't
think such a plug-in would be generically useful, but I guess that's up
to you.
We can easily create something like:
/* Hook for a plugin to know when IP protocol is about to come up */
void (*ip_pre_up_hook)(u_int32_t our_addr, u_int32_t his_addr) = NULL;
... and then you can go to town on this state without affecting existing
pppd users or adding any baked-in mechanisms that we'll need to continue
supporting "forever" even when this L2TP application no longer exists.
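A plugin built against such a hook might look roughly like the sketch below. It is standalone for illustration only: the hook pointer is defined locally as a stand-in for the one pppd would export, `my_ip_pre_up` and `fire_ip_pre_up` are hypothetical names, and `uint32_t` is used in place of `u_int32_t` for portability. `plugin_init()` is the entry point pppd calls when it loads a plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the proposed hook. In a real build this pointer would be
 * defined inside pppd; it is defined here only so the sketch compiles alone. */
void (*ip_pre_up_hook)(uint32_t our_addr, uint32_t his_addr) = NULL;

/* State the hypothetical plugin keeps about the most recent pre-up event. */
static uint32_t last_our_addr;
static uint32_t last_his_addr;
static int pre_up_calls;

/* Hypothetical plugin callback: record the negotiated addresses. A real
 * plugin would do its L2TP-specific setup here instead. */
static void my_ip_pre_up(uint32_t our_addr, uint32_t his_addr)
{
    last_our_addr = our_addr;
    last_his_addr = his_addr;
    pre_up_calls++;
}

/* Entry point pppd calls when the plugin is loaded: install the callback. */
void plugin_init(void)
{
    ip_pre_up_hook = my_ip_pre_up;
}

/* Stand-in for the place in pppd that would fire the hook just before the
 * IP protocol comes up. Hypothetical; not an existing pppd function. */
static void fire_ip_pre_up(uint32_t our_addr, uint32_t his_addr)
{
    assert(our_addr != 0 || his_addr != 0);  /* nothing negotiated otherwise */
    if (ip_pre_up_hook != NULL)
        ip_pre_up_hook(our_addr, his_addr);
}
```

Because the hook defaults to NULL, pppd users who load no plugin would see no change in behaviour.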
--
James Carlson 42.703N 71.076W FN42lq08 ***@***.***>
|
I hear you regarding plugin vs external scripts. From a system administrator's perspective it is simpler to just modify shell scripts, they are a lot simpler to debug, and if something goes wrong with them the probability of impacting or even crashing pppd is much lower. We prefer doing as little as possible from inside the pppd binary for this reason. It's also lower maintenance overhead to maintain notification scripts than to keep non-public plugins in sync with the pppd project (the available scripts have, to the best of my personal knowledge, not changed in decades). I suspect I'm potentially one of the first pushing for that.
net-init - purpose: to notify that a new ppp negotiation has started. Basically, a new "unit" has been created. Most people will not find this useful since, quite frankly, as you say, there isn't much information available here. It's pre-auth, pre-everything. I don't recall how I implemented it in the associated patch, so I'm not sure if it's actually pre or post LCP established. For our use-case this doesn't particularly matter. Our use-case is to reset out-of-pppd data we associate with the "unit", where the unit number is the primary key (if you think in terms of relational data). Other possible use-cases I can imagine relate to detection of attacks and blacklisting, i.e. if we get a number of initiations but auth never happens then potentially we're looking at some form of brute-force or denial of service attack. Anyhow, I agree this is probably fuzzy at best and difficult to define well in terms of alternative envisioned use-cases. IIRC this happens the moment we have a unit number, even before LCP is established, but I'd have to re-check. Earlier == better here.
net-pre-up - purpose: to initialize networking prior to bringing up the network interface. As it's implemented right now, it comes up after the first NCP, which as you say has very unclear semantics: can ip-pre-up rely on net-pre-up having executed or not? 
So that's just not happy. Since we use net-pre-up instead of ip-pre-up, for us that didn't matter. I'd prefer to keep delaying this as far as possible, to the first NCP actually bringing the interface up, but yes, that's tricky because it means all plugins will have to signal that the NCP is now ready to configure the interface, then execute its own pre-up, configure, and then signal for bringing the interface up. Another question is: if the interface isn't actually going to come up, why execute net-pre-up? The simpler methodology is to just execute this once the NCP negotiation is about to start, so it's really net-pre-ncp rather than net-pre-up. The stuff we intend to perform here is essentially house-keeping again: setting up tc against the interface and some rudimentary firewalling updates (from here we simply assume that either ipv4 or ipv6 or both may be negotiated and set up for both accordingly; it would be nice if we could separate ipv4 and ipv6 into their own -pre-up, but that doesn't work currently). Renaming the interface can happen here (not in init). I'm OK if this happens when we move into the NCP negotiation phase (sorry if my naming of phases isn't 100%), just after AUTH has finished (or, in the case of no auth, directly after LCP has been established).
net-down - purpose: clean up out-of-pppd data, including undoing tc (the related ifb at least, since when pppX gets destroyed any tc data directly associated with it will auto clean-up) and possible firewall updates. This would execute AFTER LCP has been torn down, and by implication AFTER all NCPs have been brought down (including their -down scripts having been executed). I hear you regarding the risks, please refer below.
net-up - we don't care about this, but if you want to keep with the NCP parallels it may be worthwhile to also consider it, to be executed once the interface is UP, whether that is delayed until all NCPs are "up" or the moment when the interface comes up ... hard to define. 
For Solaris this would (should) be once all NCPs are up, which in my opinion is where it makes most sense. For Linux ... this is the sticky part of the entire discussion we've been circling the whole time. Frankly, to avoid this complication, let's just NOT implement this until it's asked for. But I think if this gets implemented it should be "after all NCPs have established or failed", in other words "when all NCPs have completed their negotiations". Re-negotiations may also mess with this. As such, this is an over-complication we (in my opinion) should not try to solve now, or ideally ever, only if someone asks for it.
You mentioned buffering of packets. I believe the Linux kernel already handles this to a degree (specifically, there is a file descriptor from which packets are read, which implies the kernel is buffering them, or rather, buffering at least one, but probably more). So again, at worst, this may look like an "overly long latency issue" to the protocol, and unless it's exceptionally long, as in the order of seconds, it should be able to recover. I think these scripts executing timeously is the responsibility of the system administrator.
If the distinction is made between the interface state and the protocol state, then separate net-* or if-* (please indicate your preference) scripts compared to ncp-* scripts are perfectly sensible. Perhaps more so on Linux than Solaris. 
ncp-init may also in some cases make sense if a system administrator wants some way to indicate whether or not an NCP may proceed based on authentication credentials or something. Or perhaps net-pre-up should have some way of setting additional options based on whatever auth data may or may not be available. This way one could also use a private radius tag to provide data from radius which this script can then translate into pppd options. Just spitballing here, since the radattr plugin already writes attributes to a file: e.g., based on data from radius, allow ipv4 or ipv6 to be negotiated, or even MPLS ... or associate the interface with a specific VRF in an MPLS world. I don't see a standard radius tag for this based on a VERY QUICK google, so some other mechanism may be required. Yes, there is an external MPLSCP plugin floating around which Gentoo has been including for a while now. I'm not sure of the source.
I'd highly prefer all scripts to be synchronous, especially init and pre-up; down is less of an issue. With regards to these causing problems, a suggestion perhaps: we can always adjust the log entries for executing scripts to indicate whether they're being executed synchronously or asynchronously, as well as how long they actually took (in milliseconds), and for any synchronous script issue a warning if it takes longer than time X. We could even have a config option to wait at most time X for synchronous scripts before sending them a SIGTERM, and then treating them as async, or simply forgetting about them. This has other risks I'm not sure I like, but frankly, sorting the scripts to not block should remain the responsibility of the system administrator of those scripts, and pppd's responsibility should end with warning in the man page about the risks. 
Something like this:
WARNING: This script is executed synchronously and, if it doesn't complete in a timely manner, could cause protocol failures for the associated ppp protocol, and by implication the connection.
As a general rule, then, I guess the protocol-independent rules for external scripts are:
-init - executed when protocol negotiation is initiated (ip-init when IPCP is started, ipv6-init when IPV6CP is started, net-init when the ppp interface is created). sync.
-pre-up - executed prior to bringing the protocol interface up (see below regarding Linux). sync.
-up - executed after bringing the protocol interface up. async.
-down - executed after bringing the protocol down. sync/async/don't care, whatever the current status quo is.
Regarding Linux: -pre-up can be safely executed prior to configuration in most cases, as long as the scripts don't depend on the configuration being present. Towards this end there are three strategies:
I honestly don't like any of these options in full. Each has advantages and disadvantages. I personally think this should be configurable to indicate "which broken behaviour you'd prefer if you insist on using a protocol specific pre-up", something like:
pre-up-behaviour (Linux specific), one of the following:
delayup (default) - delay bringing the ppp interface up until all NCPs with pre-up scripts have completed negotiation. It has to be noted that in the (rare) case where an NCP is (re-)negotiated at a later stage, the interface is not actually guaranteed to be down during execution of pre-up scripts.
delayconfig - delay configuration of the interface until after the pre-up script has executed. Environment variables containing the relevant information will still be set up prior to execution of the script.
historic - (would appreciate a better name here) the interface is configured prior to pre-up. For the first pre-up there is a reasonable chance that the interface may still be down, but it cannot be guaranteed.
One could elaborate, or link to a page explaining the issue.
At this stage, it may also be a good thing to bring the script names into the NCP-specific protocol structure so that consistent cross-protocol behaviour can be enforced, but this may also be reasoning too far ahead. I'd have to dig a bit deeper, but for the moment I think we should just try to finalize the desired behaviour. |
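The timing and timeout ideas raised in the comment above (record how long a synchronous script took, and send SIGTERM after a maximum wait) could be sketched roughly as follows. This is not code from the PR: `run_script_sync`, `struct script_result` and the 10 ms polling interval are hypothetical choices, and pppd has its own script-running machinery that a real patch would extend instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

struct script_result {
    int  exit_status;  /* exit code, or -1 if the script did not exit normally */
    long elapsed_ms;   /* wall-clock time spent waiting, for the log entry */
    int  timed_out;    /* 1 if max_wait_ms expired and SIGTERM was sent */
};

static long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Run argv[0] and wait for it, but for no longer than max_wait_ms.
 * Returns 0 if the script was started, -1 if fork() failed. */
int run_script_sync(char *const argv[], long max_wait_ms,
                    struct script_result *res)
{
    const struct timespec nap = { 0, 10 * 1000000L };  /* 10 ms poll */
    long start = now_ms();
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && res != NULL);
    res->exit_status = -1;
    res->elapsed_ms = 0;
    res->timed_out = 0;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);               /* exec failed */
    }

    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid) {           /* script finished */
            if (WIFEXITED(status))
                res->exit_status = WEXITSTATUS(status);
            break;
        }
        if (r < 0 && errno != EINTR)
            break;                /* unexpected error; stop waiting */
        if (now_ms() - start >= max_wait_ms) {
            /* Give up waiting: ask the script to stop and carry on. A real
             * daemon would still reap the child later (e.g. on SIGCHLD). */
            kill(pid, SIGTERM);
            res->timed_out = 1;
            break;
        }
        nanosleep(&nap, NULL);
    }
    res->elapsed_ms = now_ms() - start;
    return 0;
}
```

A caller could log `elapsed_ms` for every script and emit a warning whenever it exceeds some threshold, which covers the "warn if longer than time X" part without changing the blocking semantics.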
@jkroonza: New commits have been added, maybe you can look :) |
On this PR? Or on master? I can see there's merge conflicts, either way, I don't think this is going to be ready for 2.5.0 ... |
@jkroonza: 2.5.0 is now ready if nothing new ^^ |
What I've seen and tested I'm happy with. I'll re-work this code once 2.5.0 is released, hopefully for 2.5.1 but I suspect not, otherwise I'll just use it as a patch of my own until 2.6.0. For now I've discovered at least one flaw in this patch series that's annoying, to say the least, but for 99% of our use-cases it works. If fixed to work as per the discussion in this thread it'll solve the other 1% too, for us at least. It will, however, require more work and testing than what should be done prior to 2.5.0 at this stage. Other pending work on rp-pppoe (server specifically) is also quite a bit more urgent for us, and I've got a handle on an xl2tpd bug too that's probably the top priority of the three. So any spare time that I manage to carve out currently will go to that. Most of the voice (asterisk) R&D that's ongoing currently should hopefully be completed in the next three weeks, then I can swing back to networking for a stint. |
Hello @jkroonza, have you looked for your PR? |
Hi, Still need to loop back to this ... hopefully in the coming week. |
Hey, I re-read this and other relevant discussions. I suspect I'll need to dig a bit deeper into the FSM implementation. There is definite work to be done here, and I'm happy to attempt it, but I'm going to need to sleep a bit on this. Offhand though, regarding the fsm_callbacks struct: is it OK to remove, say, the up(fsm*) function in favour of calls like configure(fsm*), pre_up_script() and up_script()? The script calls really just provide a path (or potentially ncp_scripts should be split into its own structure). Renaming and splitting here serves two purposes:
{net/if}-pre-up would then execute first, guaranteed to complete, at which point other pre-ups can execute in parallel. This would mean that the FSM would need to become aware of scripts for various network protocols (and I'd put down into this too, so we can wait for all of those prior to pppd itself terminating). Since this work is happening anyway, we might just as well move this into the FSM so we can enforce the behaviour of network scripts: ensure -down doesn't get called prior to -up completing, and that re-negotiation can't start (or complete?) until previous -down scripts have finished.
Additional (global) configuration options:
no-delay-iff-up - (Linux only) By default on Linux, bringing the interface into an up state is delayed until all network control protocols that have pre-up scripts have completed (either successfully or failed) and all of these scripts (where applicable) have been executed.
no-down-if-for-preup - (Linux only) On Linux, if a protocol gets negotiated after an interface has been brought up and said network protocol has a pre-up script, then in order to maintain the constraint that the interface will be "still down" when the pre-up script is run, the interface will be flagged down prior to configuring the protocol (including executing the pre-up), and then the interface will be re-enabled prior to finally executing the up script.
I'm not aware of any implementations that start negotiation after initial negotiations have completed, so perhaps the second option here should specify behaviour for "late" negotiations (I have no idea for a name for such an option) with options such as:
Of these, only only-if-no-pre-up and down-iface-for-pre-up are IMHO really sensible, but it may be that a sysadmin prefers to execute the script with the interface down, yet really doesn't want to bring the interface down at all once it's up. Overall this is quite a major change, which has impact primarily in the Linux world and, if I do this right, almost none for Solaris. @carlsonj @Neustradamus - opinions, objections, enlightenments, pitfalls or any other comments prior to me descending down this rabbit hole again? |
@enaess, @paulusmack, @EasyNetDev: What do you think about it and the last @jkroonza comment? |
Really would appreciate a bit of direction here before commencing work. |
I would prefer not to change the FSM callbacks since they correspond to events and actions defined in the relevant RFCs. The "up" action generally means that the two ends have reached agreement on how the link is to be configured. Then there is the question of how to do that configuring in practice in a race-free manner, which I assume is what we're discussing here. It seems to me that we have a higher-level state machine for which the "up" and "down" actions for the individual protocols become input events. Maybe expressing that state machine more formally might help with solving the races. |
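One way to read the suggestion above is a small interface-level state machine that sits on top of the per-protocol FSMs and takes their "up" and "down" actions as input events. The sketch below is only an illustration of that idea, not pppd code: `struct if_fsm`, `if_fsm_event` and the `NCP_BIT_*` constants are hypothetical names, and the counters stand in for the real work of bringing the interface up or down.

```c
#include <assert.h>
#include <stddef.h>

enum if_state  { IF_DOWN, IF_UP };
enum ncp_event { NCP_UP, NCP_DOWN };

/* One bit per network control protocol (hypothetical numbering). */
#define NCP_BIT_IPCP   (1u << 0)
#define NCP_BIT_IPV6CP (1u << 1)

struct if_fsm {
    enum if_state state;
    unsigned      up_mask;       /* NCPs that are currently up */
    int           ifup_calls;    /* times the interface was brought up */
    int           ifdown_calls;  /* times the interface was taken down */
};

/* Feed one per-protocol event into the interface-level machine.
 * Returns 1 if the interface state changed, 0 otherwise. */
int if_fsm_event(struct if_fsm *f, unsigned ncp_bit, enum ncp_event ev)
{
    unsigned before;

    assert(f != NULL && ncp_bit != 0);
    before = f->up_mask;

    if (ev == NCP_UP)
        f->up_mask |= ncp_bit;
    else
        f->up_mask &= ~ncp_bit;

    if (before == 0 && f->up_mask != 0) {
        /* First NCP up: this is where the interface would be brought up. */
        f->state = IF_UP;
        f->ifup_calls++;
        return 1;
    }
    if (before != 0 && f->up_mask == 0) {
        /* Last NCP down: this is where the interface would be taken down. */
        f->state = IF_DOWN;
        f->ifdown_calls++;
        return 1;
    }
    return 0;  /* some other NCP still holds the interface up, or a no-op */
}
```

Keeping the transition logic in one function makes the ordering questions from this thread explicit: any script that must run before the interface goes up, or after it goes down, has exactly one place to hook in.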
May I then simply implement a net-pre-up that executes (blocking) once auth is done but PRIOR to protocol negotiations starting? And a net-down (non-blocking) which executes just prior to taking LCP down? Since ip-pre-up is then broken by design (and based on the feedback cannot be fixed), may I recommend that a big fat warning be added to the documentation? Something like:
WARNING: Please note that on systems where a single interface carries multiple protocols (Linux), ip-pre-up is NOT actually guaranteed to execute prior to the interface moving into an up state. Although IP information won't be known, you should consider using net-pre-up instead. Alternatively, disable other NCPs such that IPv4 is the only negotiated protocol, which will also guarantee that ip-pre-up is called prior to the interface going into an UP state. |
That all sounds fine to me. |
0d12ca7 to 7f65867
Sorry for the delay. Moving to pppd 2.5.0 is getting some priority now. Try as I might, I could not completely axe the net-init mechanism. So please consider the motivation:
We have incoming ppp over ethernet (pppoe-server), l2tp (xl2tpd) as well as pptpd (some things just won't die). None of the daemons provides, without some tweaking, a mechanism to run stuff prior to exec'ing pppd short of basically wrapping pppd, and certainly the unit number won't be available at that stage. So we really just need to make sure that the moment the unit number is allocated, we clean the slate and verify that our external (from the perspective of pppd) management state trackers are properly reset back into a sane state (in the case where pppd is forcefully killed or segfaults and doesn't clean up properly, this is the only sane spot to perform the cleanup).
net-pre-up also permits interface renaming (there was an attempt to implement very complex code in pppd to perform interface renaming; this way the renaming can be handled externally to pppd using whatever script mechanism the operator prefers). Any variables exported during the auth phase are also available here.
net-down should execute after all else has been torn down, and ip-down and ipv6-down have been initiated (but possibly not completed).
Documentation pending, but review of the code in the meantime please. |
7f65867 to a0ad71a
@paulusmack, @carlsonj, @enaess, @EasyNetDev: Can you look at this PR? |
The commit description is not adequate. I want to see there a brief outline of what these new scripts are for and what problem is solved by having them, and also what other problems might still exist that this doesn't solve. I want people in the future to be able to understand the motivation for these changes and the design decisions behind them without needing access to github.com. |
net-init executes as a blocking script directly after the unit number becomes available. This can be used to initialise aspects related to the ppp connection that live outside of the ppp connection. It can also be used to clean up in the (in the author's opinion extremely unlikely) case where a previous pppd crashed and net-down didn't execute in order to clean up.
net-pre-up executes as a blocking script after auth, prior to NCPs being negotiated. Unlike ip-pre-up this is guaranteed to execute prior to the interface being brought up, and can be used in an NCP-agnostic manner to pre-initialise aspects of the interface for which it still needs to be down (amongst others, it's recommended that firewall changes happen here).
net-down executes in a non-blocking manner just prior to pppd terminating and can be used to clean up actions from previous scripts.
You will notice that I mention ip-pre-up doesn't guarantee that the interface will still be down. This is because in a Linux world all protocols run on the same interface, compared to Solaris where I'm informed each protocol runs on its own sub-interface, each of which has its own operational state. The man page for pppd has also been adjusted to indicate as much.
Signed-off-by: Jaco Kroon <[email protected]>
a0ad71a to 3906398
That should be better now. Whilst not my original strategy, this does address all our concerns, so I don't have further concerns around this, but if there are any from your side I'll be happy to add to the commit message. |
@paulusmack: The @jkroonza commit description is good now? |
@EasyNetDev: What do you think about the @jkroonza merged PR? |
This no longer differentiates between which sub-protocols bring an
interface up or down; it merely tracks which sub-protocols have brought up
an interface, and which have not yet taken it down.
There is one caveat here: I use pointer comparison on these names, so
they absolutely cannot be anything other than a compile-time constant,
and I'm not sure whether at least -O1 is required, i.e. whether
the compiler will merge multiple identical "STRINGS" into a single
instance in memory.
I'm unable to test the solaris code, but I don't see any reason why this
should not work.
Signed-off-by: Jaco Kroon [email protected]
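On the pointer-comparison caveat in the commit message above: the C standard leaves it unspecified whether identical string literals share storage, so comparing addresses is only reliable when every caller passes the same named constant. A minimal sketch of the tracking idea that sidesteps the question by comparing contents with `strcmp` follows. It is an illustration under assumed names (`proto_ifup`, `proto_ifdown`, `PROTO_IPCP`, `PROTO_IPV6CP`, `MAX_PROTOS`), not the code in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One shared constant per protocol name (hypothetical), so callers that do
 * want pointer identity at least all refer to the same object. */
static const char PROTO_IPCP[]   = "IPCP";
static const char PROTO_IPV6CP[] = "IPV6CP";

#define MAX_PROTOS 8

static const char *up_by[MAX_PROTOS];  /* protocols holding the interface up */
static int up_count;

/* Look a protocol up by content rather than by address. */
static int find_proto(const char *name)
{
    int i;

    for (i = 0; i < up_count; i++)
        if (strcmp(up_by[i], name) == 0)
            return i;
    return -1;
}

/* Record that `name` wants the interface up.
 * Returns 1 if this is the first protocol, i.e. the caller should actually
 * bring the interface up; 0 otherwise. */
int proto_ifup(const char *name)
{
    assert(name != NULL);
    if (find_proto(name) >= 0 || up_count == MAX_PROTOS)
        return 0;
    up_by[up_count++] = name;
    return up_count == 1;
}

/* Record that `name` no longer needs the interface.
 * Returns 1 if it was the last protocol, i.e. the caller should actually
 * take the interface down; 0 otherwise. */
int proto_ifdown(const char *name)
{
    int i;

    assert(name != NULL);
    i = find_proto(name);
    if (i < 0)
        return 0;
    up_by[i] = up_by[--up_count];  /* swap-remove; order does not matter */
    return up_count == 0;
}
```

With only a handful of protocols the cost of `strcmp` is negligible, and the result no longer depends on optimisation level or on how the compiler lays out literals.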