Network Management #24
Comments
+1 for networkd. |
I don't think we can completely discount this issue as it also can affect us in the initramfs (like DO on CL). |
I think that's workable by doing something hacky like using some |
I understand that this has not been a needed option so far, but with some use cases (like kubevirt), taking down a node is costly and unwanted. Networkd vs NM also seems to have long-term implications. Besides the shared maintenance burden, we need to change and evolve as requirements change. |
This is a tricky discussion without taking into account layers higher level in the stack. Last I heard, OpenStack (at least RDO/RHOS) do not support NetworkManager (and since this is RHEL7, hence require legacy initscripts). I'm not sure what the status of things is in the Kubernetes world in general - I imagine there are some things taking a hard dependency on NM, but I don't know.
I think it's an open question though whether such vlans are truly managed by NM - the way I think of it is you have "host networking" and "kubernetes networking" with the latter being a distinct layer. |
OpenStack is planning to support NetworkManager, as initscripts are being deprecated on the Fedora side over time (no added features and only major bug fixes). The scope of the NM project is much wider in terms of what it manages and what it plans to manage. I don't think there is interest on the networkd side to handle this, and even if there is, the gap is wider. You also need to keep in mind that the Fedora devs' attention is on NM, not networkd, which means that Container Linux would need to maintain and test a completely different network stack, which is less stable by nature. |
Hi, an NM guy here. We welcome discussing this topic. Thank you for raising the issue. But the thread sounds like it is pressing for a quick decision. We just recently started discussing the gaps for adopting NetworkManager. Unsurprisingly, gaps were identified. All we can assure at this point is that these gaps can and will be filled. But I'd rather not have you take my word for it. Why not take the next months to address issues and provide a proof of concept? It seems better to make a decision after showing that NetworkManager actually works well in your environment. Anyway, if you really want to force a decision now, then at least discuss it in terms of where NetworkManager can be in the near future. For example, claiming that it's "harder to write config files for" does not seem a good argument. It's not clear what the exact issues are, or how hard it is to fix them. I'd even claim that this issue is not severe and NetworkManager will easily improve in this regard (e.g. https://bugzilla.gnome.org/show_bug.cgi?id=772414 ). In my opinion, networkd is a great piece of software and does what it does well. Both NetworkManager and networkd tick checkboxes in the feature matrix. If you decide purely based on that, networkd is very attractive: it works well today, why would you even change? Yaniv already elaborated on the larger picture of why. Let me add why we think NetworkManager should be used. Our stance is that NetworkManager provides an API on Linux for configuring networking. Having multiple APIs is a burden. Look at how cockpit does not support networkd (cockpit-project/cockpit#7987). Can you imagine the effort to add a suitable API to networkd and leverage it from cockpit? And think further: integrating this API with the rest of the Linux ecosystem. It's really about integrating components on Linux. If you use NetworkManager today, you can use GUI, CLI, files-based configuration, ansible, cockpit. 
They all integrate with each other because they use the same underlying API. In the future, we aim for management systems (e.g. OpenStack and libvirt) to also use the same API. For example, I don't mind when systemd-timesyncd competes with chrony. Client-side NTP is simple from a configuration point of view. Not so a networking API. Integrating with either networkd or NetworkManager is a large effort. And it's almost infeasible to find a powerful, common abstraction that targets multiple networking APIs. Sure, networkd or something else could take the role of standard API. But NetworkManager is a de facto standard API today, it is quite suitable today, and it integrates with many components today. It will be simpler to fix the shortcomings of NetworkManager and focus on providing a suitable API than to bend the rest of the ecosystem towards an API that does not even exist yet. Granted, it's impossible for a generalist to be the perfect solution for all scenarios. But that is what both the Linux kernel and systemd do very successfully. They don't only target a server, desktop, or embedded environment. Being a generalist is key to their success. Regardless of whether NetworkManager will be adopted, NetworkManager will continue improving at becoming a generally suitable API. It's not that a decision needs to be made today. As NetworkManager keeps getting better, the door for adopting it is only opening further. TL;DR:
|
That's a fair statement. The true issue is more around passing a single configuration that can use hierarchical templates to produce the proper output for different clouds. If nmstate adds stacking templates, that would get us past the minimum requirements for the configuration file. There is still a bit of an issue with running the configuration while in the initramfs though. I don't think that's a killer, but is pretty limiting as we've moved to The "it requires python" thing is also a problem, but nmstate is open to porting to a compiled language and there is a workaround for using the python version as a "binary" while the port occurs.
Agreed.
I believe we attempted to do that in the last meeting we had. Keep in mind, it's not that "NetworkManager is wrong". Most of us use it on a daily basis and it's great! No matter what the decision is, we should still get together over missing functionality to help widen NetworkManager's use cases. |
Regarding the configuration: networkd's config is pretty great for our use case. There's literally nothing I'd want changed in terms of the configuration format. A big theme with networkd (and systemd for that matter) configuration is that it's fully decomposable into as many files as you'd like and every part of the configuration supports that. There are no special cases; this makes the configuration very clean. "Layering" configs is a fully supported case and first class citizen. Supporting the networkd style configuration is somewhat antithetical to nmstate's goals from my understanding (please correct me if I'm misunderstanding). nmstate wants to have the running state of an interface and the configuration for that interface be expressed in the same manner (which is pretty cool). Describing the state of an interface is different than describing rules for what the state should be. In networkd all the config files are parsed into a bunch of rules that get applied to the interface when interfaces appear to networkd. Networkd doesn't know what the state of the interface should be until the interface appears and the rules are applied. nmstate has a config file that describes exactly what the state should be but that needs to be generated somehow from a more generic source. In the end, the desired state of the network is a function of some rules and an interface to apply them to. There's a few main components:
networkd excels at #1 and #2 and combines 3 and 4 into one and the same (afaik networkd doesn't try to determine the ideal state first, it just applies rules until they are all applied). nmstate + nm does a great job at #4. The nmstate template proposal would tackle #3 (correct me if I'm wrong) but templates aren't as expressive or easily tweakable as networkd's configs. Looking at the proposal, it looks like #3 would also be manual; we should make that automatic. Misc thought experiment: what if there was something that took a networkd-like set of configs and generated an nmstate config when an interface appeared? Another difference (not saying this is better or worse, purely different) is that networkd does nothing automatically; everything is very explicit. This means you may need to do manual tweaking yourself but also means there are no surprises. I personally like this and think it goes well with the explicit, declarative nature of CL today, but could be convinced a little magic here and there is ok. NetworkManager is great and supports a great many things but that also makes me a little nervous. More flexibility and features are not necessarily better in this case. Each feature someone uses is another chance of breakage, which is very painful when you're running automatic updates. Related question: does nm have any stability promises for configs? networkd doesn't make any promises to my knowledge, I'm just curious if NM does. We also don't want users logging into their machines. The whole idea behind CL and FCOS is you have an Ignition config that defines what your machine should be, and then you don't touch it. The dbus API isn't helpful in this case since changing the config isn't useful. From what I understand (granted I don't understand kubernetes well) the host networking and the networking k8s sets up are non-overlapping, so we should just let each do its own thing. That is, k8s shouldn't care what the host's networking stack is. 
@EdDev you talked about kubevirt wanting to make changes, can you elaborate on that? As someone who doesn't use nm and has now been looking into it a lot, one thing that jumps out at me is that since its (arguably) primary use case is interactive configuration, the docs on how to configure it by just writing config files don't seem to be as good (comparatively). There are not a lot of examples for setting up nm without using nmcli or a gui tool. (Or maybe I'm just bad at finding them). Finally, the discussion seems to be "Can we get NetworkManager/nmstate to do enough to fill the role of networkd" and not "which is a better fit for FCOS". If we'd have to shoehorn things into nmstate or NetworkManager, we shouldn't use them. If we could incorporate the things we like from networkd that would be great, but I worry that that'd be a large undertaking to do right. Sorry if that was a bit ramble-y. I had a lot of thoughts. |
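To make the "desired state is a function of some rules and an interface" idea concrete, here is a minimal Python sketch. It is not networkd's or nmstate's code: the `UNITS` table, its field names and `desired_state` are hypothetical stand-ins. It assumes networkd's documented behaviour that the first matching `.network` file (in file-name order) wins and that drop-ins are layered on top of it.

```python
from fnmatch import fnmatch

# Hypothetical stand-in for a directory of .network files:
# (file name, interface-name pattern, base settings, ordered drop-in overrides)
UNITS = [
    ("99-default.network", "*", {"dhcp": True}, []),
    ("50-eth.network", "eth*", {"dhcp": True, "mtu": 1500}, [{"mtu": 9000}]),
    ("10-eth1.network", "eth1", {"dhcp": False, "address": "10.0.0.2/24"}, []),
]

def desired_state(ifname, units):
    """Compute the settings for an interface at the moment it appears."""
    # Files are considered in name order; only the first match is used.
    for _name, pattern, base, dropins in sorted(units, key=lambda u: u[0]):
        if fnmatch(ifname, pattern):
            state = dict(base)
            # Drop-ins override individual keys of the matched file.
            for dropin in dropins:
                state.update(dropin)
            return state
    return None  # no rule matches: leave the interface alone

eth0_state = desired_state("eth0", UNITS)
eth1_state = desired_state("eth1", UNITS)
wlan0_state = desired_state("wlan0", UNITS)
```

The output of `desired_state` is the kind of per-interface document that a generator could hand to something state-based, which is roughly the thought experiment above.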
I love NetworkManager for workstations, but I always uninstall it on servers. It has caused a fair amount of downtime for me in the past due to its dynamic nature/autoconfiguring unknown interfaces. Most recently, it suddenly felt responsible for configuring VLAN interfaces added by OpenStack Neutron, breaking an OpenStack cluster. Config management is pretty hard, too. Configuring something like bonded interfaces isn't straight-forward at all. Yes, some of this can be disabled, but to me it feels much more complex and unpredictable than networkd. |
Right, I thought a template can be handy to cover the rules part without compromising the declarative state nature.
nmstate actually does not need the full state, it will take the requested config state and merge it with the current state to create a new full state. So you could define in the config that you are interested in setting the mtu of an interface, by specifying the interface name, type and the mtu itself and nmstate will internally read the full current state of that interface and overwrite the mtu with the one from the config that was input.
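A rough Python sketch of that merge, for illustration only. `merge_state` and the field names are hypothetical and are not nmstate's schema or API; the point is just that a partial request is overlaid on the current state to produce a full one.

```python
import copy

def merge_state(current, desired):
    """Return a new dict: `current` with the keys from `desired` overlaid recursively."""
    merged = copy.deepcopy(current)
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_state(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Illustrative current state of one interface (field names are assumptions).
current = {
    "name": "eth0",
    "type": "ethernet",
    "state": "up",
    "mtu": 1500,
    "ipv4": {"enabled": True, "dhcp": True},
}
# The request only names the interface and the one value to change.
desired = {"name": "eth0", "type": "ethernet", "mtu": 9000}

full = merge_state(current, desired)
```

Everything not mentioned in the request (here the `ipv4` block and `state`) is carried over from the current state unchanged.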
KubeVirt's purpose is to run VMs as applications on a k8s cluster, and it is an example of applications that are more strongly bound to the environment they run on (OS, HW) and more sensitive when moved from one node to another. Because of that, changes to node networking settings (create a new bridge, change mtu, add a bond, replace/add an interface or vlan, etc) should preferably occur without disturbing much (or at all) the applications that run on that node. As an example, adding a VLAN for a secondary network for the pods to consume should not take down the node with all its pods.
I think you are referring to the ifcfg files. NM is consuming these config files and while NM is operational, it replaces the initscripts. I think that the discussion is more towards: short term vs long term solution and supporting new requirements from the node. It will be useful to provide an example scenario that incorporates most of the requirements you look for and consider a must. Then an estimation can be made to see the cost or provide alternatives for considerations. |
We have been working with the NetworkManager devs on improving the server use cases in recent Fedora releases. But keep in mind that the use case is initscripts running alongside NetworkManager in OpenStack; I'm not sure that networkd running alongside NetworkManager/initscripts would be any better.
I would appreciate operators' use cases for what kinds of smartness they use after the defaults are applied, where decomposition is useful.
That doesn't mean that these advanced use cases can be deprioritized. We are planning to provide a CRD that will lock support to the policies we expose at the cluster level; the intention is not to let customers SSH in and do everything. You should consider that maintaining something outside the Fedora ecosystem will have maintenance costs that might be greater than improving nm, and if nm is not used to solve these use cases, then networkd would need to be able to do that.
In the base installation that is true, but in that case ignition should probably only care about a single NIC or bond needed for K8s networking and scope mentioned here is bigger even for the basic use cases. You need to consider a possible near future where ignition does only care about a single NIC or bond needed for K8s networking and the rest is done after the cluster is up via smart cluster policies, providing a day 2 management for the non-base use cases. |
Yeah; see e.g. this bug. |
Note that all OpenShift installations, by default, run NetworkManager. That includes OpenShift Online (literally hundreds of nodes supporting tens of thousands of users). That is a huge number of systems that have proven that NetworkManager is capable of supporting OpenShift, Kubernetes, and container use-cases without interference or problems. In my 3+ years working with OpenShift and Kubernetes, I have only a few times (in 2015) had to debug an issue that involved NetworkManager. In other words, NetworkManager is already widely deployed in mission-critical server environments and seems to work fairly well there for container-based use-cases. |
@cgwalters not sure it's that one, for whatever reason we do not (and haven't) had problems with OpenShift and NM, and we extensively use veth interfaces there. NM does not touch them (though it does recognize them and show them in its CLI). |
If we do go the nm + nmstate route will this be an issue? Or rather, can nmstate ensure that nm only does things configured through nmstate?
They can, but I'd want them to be decomposable and "overlay"-able.
I'm curious how this handles unset things. I.e. if the running config has some option set and you want to unset it (particularly in cases "0 values" like 0 or empty string are distinct from nil/null/None values). Ignition also does config merging and can't really unset values from an appended config. I don't know if nmstate's config has places where this would come into play yet, but it's something to think about in the future. This is kinda off topic, I'm just curious how you handle that since I've worked a lot on Ignition with its similar problem. Also, if you're not making changes to a running machine and only loading the config on boot (as we think you should be running FCOS) then the config being loaded is the full state, yes?
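To illustrate the problem, here is one way a merge could keep "set to a zero value" distinct from "unset". This is a hypothetical Python sketch, not how nmstate or Ignition actually handle it; the `REMOVE` marker and `overlay` function are invented for the example.

```python
REMOVE = object()  # hypothetical marker meaning "delete this key from the result"

def overlay(base, patch):
    """Overlay `patch` on `base`; zero values are kept, REMOVE deletes the key."""
    result = dict(base)
    for key, value in patch.items():
        if value is REMOVE:
            result.pop(key, None)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = overlay(result[key], value)
        else:
            result[key] = value  # 0, "" and None are ordinary values here
    return result

running = {
    "mtu": 9000,
    "description": "uplink",
    "ipv4": {"dhcp": True, "dns": ["10.0.0.1"]},
}
patch = {"mtu": 0, "description": "", "ipv4": {"dns": REMOVE}}

merged = overlay(running, patch)
```

A format that has no such marker (plain JSON or YAML merged key by key) can only overwrite, which is the limitation described above for appended Ignition configs.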
This sounds like leaving kubevirt to manage its own things would be fine, right?
I don't understand what you mean here.
I don't like the term requirements; it implies a bare minimum functionality. I don't want to throw out the flexibility of networkd (specifically its configuration). We don't have data on how existing CL users use networkd and I don't want to move to something and have a bunch of users lose functionality they were using. The CL user base is mostly only vocal when things break; for better or worse they don't really participate in the development process or really give much feedback (unless we break them, then they do). We can't give you a hard list of requirements other than "it should support most things we support today in a similarly flexible and elegant configuration". I also worry that if we do find a way to generate a list of requirements, we'll build something to meet just those requirements and not solve more general cases.
Shipping initscripts or support for initscripts is antithetical to CL and FCOS. We ship a minimal, up to date distro. Initscripts (alongside systemd) are neither. I think (hope?) it's fair to say we're not gunna support initscripts. IMO if we end up shipping initscripts, we have failed. Also I'm assuming if we go the nmstate + nm route then that will be the only way to configure the network on the host. We should absolutely not ship the ifcfg plugin. Is this a fair assumption?
On CL we have a base network config that OEM-specific bits get layered on (and we only have to override the differing bits). Users can then further override that. We don't know to what extent users do override that. Yes, you could do this in other ways but I don't think any other ways are as clean as how we do it today with networkd. There is value in that cleanliness. What I really don't want is some system where there are a bunch of special cases or restrictions (e.g. only 1 template file allowed, restrictions on what fields can be overridden, etc).
I'm going to strongly disagree on this one. FCOS is not just for k8s/clusters. We specifically call out single node as a primary use case. I'm not going to cripple FCOS and/or Ignition so that they are not flexible enough to configure multiple NICs. I will fight tooth and nail for this. Ignition should be able to lay down the config for any networking you want the host to handle.
It's not so much maintenance cost as likelihood of bugs. We carry two networkd patches on top of upstream networkd (see coreos/systemd#103), both of which are trivial. One of the selling points of CL is that it is minimal and (to the best of our ability) clean. We try not to ship things that pull in a lot of dependencies or do 1000 things we don't care about and 1 we do. I don't see this minimalism being valued in this discussion and want to make it clear that's one of the things I think made CL successful. I don't dislike NetworkManager and nmstate; I just don't think they're a good fit for FCOS. I also think networkd is easier to grok than nm, because the config is very regular (and well documented), it's minimal, and it only has one way of being configured. It's also just a smaller project that does less. I want to reiterate that not shipping things we don't use is a feature. We're already replacing many simple components of CL with more complex and featureful components from FAH (e.g. dual partitions with ostree, torcx with package layers, etc). There are going to be a lot of new things for migrating users to learn, and unless there are clear benefits for the change it's just more pain from the user perspective. Users don't so much care about the maintenance cost. This is something we spent a while considering when talking about whether to use ostree or dual partitions as well. Another concern is the number of components (that all need to talk to each other). We'd need NetworkManager, nmstate and whatever nmstate template renderer tool gets created. We'd have to render templates when interfaces appear. This means we'd also need weird udev rules or some other component to track them. This feels like reimplementing networkd out of a bunch of other components. It also seems like a great place to encounter race conditions. I don't particularly like generating configuration during the boot process. |
@ajeddeloh Note that NM also values size and minimal dependencies. Non-core device types (wifi, wwan, team, bluetooth, adsl, ppp, wimax, ifcfg-rh, ifnet, keyfile, etc) and anything that has external deps are optional plugins. NetworkManager has consistently worked to reduce dependencies, spin optional things into non-core plugins, and streamline the binaries and libraries. That said, NM does link to more things than networkd does, and some of those could perhaps be moved to optional plugins. @thom311 would have more info on that. |
systemd-networkd can be restarted. This works with some caveats. Essentially, it'll apply the new configuration to any interfaces it finds. This means that just updating the config for some interface and restarting does the right thing. https://www.freedesktop.org/software/systemd/man/systemd-networkd.html#Description has some more details.
All networkd configuration files can be considered stable. Unfortunately networkd files are not mentioned at all in https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/. I filed systemd/systemd#9850 to track this and add them. |
@ajeddeloh, there is little fundamental difference between NetworkManager's profiles and what networkd does. The only differences as I see are:
Agree, nmstate follows a different approach, which may or may not be suitable. But NetworkManager's profiles and networkd's configuration are fundamentally similar.
Yes
You don't need a user to log in and access it via D-Bus. Just prepare (predeploy, generate) the profiles, and NetworkManager will do it automatically. Although you say "networkd does nothing automatically; everything is very explicit", I don't see a large difference there. Both services start, and automatically configure networking according to the configured profiles (or .link files).
Currently NetworkManager profiles are not composable/overlayable like they are with networkd, but there is no huge objection from the NM team to adding that capability. The keyfiles (.ini-file format which NM uses by default) can easily be composed from snippets and composability could be added to NM if this functionality is a deal-breaker. However, NetworkManager can be configured via files and D-Bus API. It is hard (impossible?) to come up with a usable, simple, and powerful D-Bus API, that lets you modify profiles which are assembled from multiple locations. Currently, the D-Bus API is just an "update-entire-profile" call and NetworkManager replaces the entire file on disk. If NetworkManager gets composable keyfiles, these profiles probably cannot be sensibly modified via D-Bus. So, read-only overlays are easily possible, but they take away another important feature. Since networkd doesn't have a D-Bus API for writing profiles anyway, it doesn't have this "limitation". Maybe it would be better to approach this from the angle of which problem composable profiles solve, instead of what your current solution to that problem is. Yes, the general problem is clear. But how much do you use this? Do you use it for some properties in particular? Is this more relevant for some properties than for others? If you for example just use it to adjust the MTU you can already configure the default value for ethernet.mtu property in /usr/lib/NetworkManager/conf.d snippets. Regarding nmstate. I see nmstate as a higher layer API on top of NetworkManager. That aims to make some things easier to do. But it cannot escape limitations that NetworkManager itself has. For example, nmstate can generate profiles for NetworkManager (templated?), not unlike a generator in systemd. And of course, you can generate profiles via any other means, aside using nmstate. If that is useful to solve the same problem as overlay configuration files, then fine. 
On the other hand, if overlay-able profiles are deemed the best solution, NetworkManager can add support for them (despite the shortcomings). We also value minimal solutions and simplicity (yes, really!). I personally think that CL would need nothing except some pre-deployed profiles and configuration snippets, and the rest should be handled by NetworkManager. If nmstate turns out to be beneficial in this picture, that's very fine with me. But that remains to be seen. NetworkManager's dependency chain is not really too big when installed without optional plugins (such as Wi-Fi or PPP, which may be unnecessary in CL). Since initrd was mentioned above: IMO, NetworkManager in initrd is a must-have feature, regardless of CL. Lubomir is working on that right now. One thing that could still be improved in NetworkManager is its memory footprint. I don't see low-hanging fruit there either. But it's a valid criticism, and we should focus more on improving there.
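As a sketch of what "composing a keyfile from snippets" could look like if done by an external generator before NetworkManager reads the result: the Python below layers two ini-style snippets with the standard `configparser` module. This is an assumption-laden illustration, not NetworkManager functionality; `compose`, `BASE` and `OVERLAY` are made-up names, and the profile contents are only an example.

```python
import configparser

# Vendor-provided base profile (illustrative contents).
BASE = """
[connection]
id=uplink
type=ethernet

[ethernet]
mtu=1500

[ipv4]
method=auto
"""

# A user or platform snippet that overrides a single property.
OVERLAY = """
[ethernet]
mtu=9000
"""

def compose(*snippets):
    """Read snippets in order; later snippets override earlier keys."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    for text in snippets:
        parser.read_string(text)
    return parser

profile = compose(BASE, OVERLAY)
mtu = profile["ethernet"]["mtu"]
method = profile["ipv4"]["method"]
```

The composed result would then be written out as a single read-only profile, which matches the point above that such overlays could not be sensibly modified through the D-Bus "update-entire-profile" call.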
NetworkManager will not add features for the sole purpose to check an item in a requirements list. It will add them, if they make sense in the larger picture, and are beneficial to use-cases we want to support. And we want NetworkManager to support a wide range of use-cases, in particular server and CL use-cases. And we do so for years already.
Agree! The preferred configuration format is keyfile; see the manuals nm-settings-keyfile and nm-settings. Granted, file-based configuration is where systemd and networkd excel. NetworkManager's keyfile format and its documentation should improve further. However, it is a first-class citizen for NetworkManager as well. It's just not the only one. |
KubeVirt VMs are only consumers of the pods resources which in turn would require the node to be able to provide it to the pod and keep the expected SLA.
Yes, but on the other side of this we do have customer requirements that nm + nmstate were created to solve, and a roadmap to enable a lot of advanced application capabilities using these tools.
Yes. I was only trying to highlight the comparison of usage patterns/expectations.
So this would include SR-IOV, DPDK, OVS bridges, Contrail vRouter, Infiniband, VPP and so on? |
As a general note, fedora-coreos explicitly does not try to tackle all possible workloads and environments. There will always be custom kernel modules, complex network controllers and pet nodes we are not catering to. I appreciated @thom311's details. I don't have many real technical points against networkd and I don't have much experience with NM as a distro maintainer. I'm interested in tracking whether the evolution of NM and nmstate will end up in a declarative yet non-monolithic model similar to networkd, which I think would be a sweet intersection point for our specific use case. From a design point of view, I have some fears (but I lack specific knowledge) as neither seems to be designed with any internal-serialized-state/external-user-configuration separation, and they seem in general not aware of the vendor/user/runtime split and layering (i.e. /usr+/etc+/run). Both are ingredients which allowed us to evolve and update the distribution without forcing frequent manual interventions/re-provisionings. |
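For readers unfamiliar with the /usr+/etc+/run layering mentioned above, here is a small Python sketch of the systemd-style lookup, where a file in a higher-priority directory replaces a same-named file from a lower one. `effective_files` and `demo` are hypothetical helpers; the demo builds the three directories under a temporary root instead of touching the real system.

```python
import tempfile
from pathlib import Path

def effective_files(dirs, suffix=".network"):
    """`dirs` is ordered lowest to highest priority; same-named files are replaced."""
    chosen = {}
    for directory in dirs:
        for path in sorted(Path(directory).glob("*" + suffix)):
            chosen[path.name] = path  # a later (higher-priority) dir wins
    return [chosen[name] for name in sorted(chosen)]

def demo():
    with tempfile.TemporaryDirectory() as root:
        vendor = Path(root, "usr/lib/systemd/network")   # shipped by the distro
        runtime = Path(root, "run/systemd/network")      # generated at boot
        admin = Path(root, "etc/systemd/network")        # written by the user
        for directory in (vendor, runtime, admin):
            directory.mkdir(parents=True)
        (vendor / "50-default.network").write_text("vendor\n")
        (vendor / "80-extra.network").write_text("vendor\n")
        (admin / "50-default.network").write_text("admin\n")
        files = effective_files([vendor, runtime, admin])
        return [(p.name, p.read_text().strip()) for p in files]

result = demo()
print(result)
```

Because the vendor copy under /usr is never edited in place, the distribution can update it freely while user overrides in /etc keep winning by file name.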
Quick note of context: This is a discussion for FCOS, not CL. We cannot switch CL without massive breakage.
How big is it? CL hasn't historically been too concerned about memory footprint (torcx even unpacks docker et. al. into a tmpfs) so I don't think it'll be too much of an issue unless it's leaky
In a way, the initramfs is the case we care about least since it's not user configurable. As long as the network in the initramfs works well enough that Ignition can run, it's invisible to the user. In fact, NetworkManager in the initramfs could actually help solve the DigitalOcean use case (they don't use DHCP, they reimplemented it over HTTP).
I'm not sure I follow 100%. When you make a change over DBus does it (currently) change the config files to match?
As @lucab said, we don't plan to support absolutely everything.
What does the NetworkManager release cycle look like? I.e. how soon can we expect to see the changes in a stable release?
I thought I saw something about ethernet getting auto-connected? Could have been from some plugin's docs though; ignore me if that's the case. Basically, if you run NM with no config at all, does it have any implicit configuration it uses? If it does have an implicit config, can you force it not to? |
yes
There is an RPM package called NetworkManager-config-server that installs a simple NetworkManager config snippet to disable automatic configuration of ethernet devices. |
But as said, we will need these capabilities for advanced applications, like those running in VMs. Aiming for the simple case makes sense, and we also want to support the FCOS assumptions (like having that config as policy that applies to nodes in a non-specific way). FCOS is a platform and it needs to enable the wider user stories that we are trying to solve in a layered way for these aspects. |
For KubeVirt it should not make much of a difference how the network is set up - if it's NM, nmstate, or networkd. In the end there will be a set of interfaces with different roles (classically categorized as control plane(s) and data plane(s)). Now, from my perspective it's a little tricky ATM. I understand why networkd is preferred over NM for this use-case (and technically I like this). But considering the pain that not decently maintained software can cause, I also understand why people favor NM. There are gaps on both sides right now, but it's not only about now - as we know that there will be requirements coming in future, once FCOS sees more adoption. Thus IMHO it should be taken into account if NM or networkd can deal with an incoming stream of requirements and bugs. Just my 2ct. |
This is a very late (sorry!) followup to a meeting we had with the NM folks to see whether we could be on a converging path. NM recently landed an initrd-generator: https://developer.gnome.org/NetworkManager/stable/nm-initrd-generator.html. This is the first step to allow us to replace the dracut-based networking in the initramfs with a monolithic solution, which is coherent and shared with how the network in the real rootfs works. Upstream is targeting this at F30 (@lkundrak may have more of a status update on this). Upstream NM confirmed that it should already be possible to support runtime-only (i.e. not persisted to FS) configurations. I didn't dig more specifically into the details of this, but investigation may be needed in order to add support for this in coreos-metadata. This is effectively #111. Regarding carrying over DHCP leases from the initramfs: nowadays NM supports multiple DHCP backends (I think the current Fedora default is based on the networkd library). As long as NM with a consistent backend is used in both initramfs and rootfs, a lease can be carried over (or dropped, by deleting the runtime lease file) without issues. Regarding file-system split, there are hook-scripts shipped by packages under Regarding internal NM support for merging configuration snippets (i.e. like networkd config), @thom311 confirmed this is likely not happening in the near future. In particular, this is very hard due to fundamental NM design constraints and the writable dbus API. This would limit and reshape a bit the way we ship distro-wide defaults, but it looks like it can't be easily changed in the foreseeable future. From the CoreOS side, we decided to make the upcoming Ignition schema more flexible and not bound to networkd. As such we are dropping the Finally, Anaconda also likes to fiddle with legacy "ifcfg-rh" network configuration, which may interfere with both distro defaults and user configs. As we are moving away from Anaconda image building, this shouldn't be a concern going forward. 
For future wish items, we briefly touched how profiles declare their own matching rules. Currently it is mostly based on interface names, but upstream is keen on adding new matching parameters. That should allow us to more comfortably allow cloud-platform-specific bits. |
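As a rough illustration of the interface-name matching mentioned above, here is a minimal sketch of an NM keyfile profile. The profile name, interface name, and addresses are placeholders, and the runtime-only path in the first comment is an assumption about where a non-persisted profile could be placed, not something confirmed in this thread.

```ini
# Hypothetical path: /run/NetworkManager/system-connections/eth0-static.nmconnection
# (a persisted profile would live under /etc/NetworkManager/system-connections/)
[connection]
id=eth0-static
type=ethernet
# The profile only applies to the interface with this name.
interface-name=eth0

[ipv4]
method=manual
# Format: address/prefix,gateway (documentation-range placeholder values)
address1=192.0.2.10/24,192.0.2.1

[ipv6]
method=auto
```

Keyfiles generally need to be owned by root and not readable by other users, otherwise NM may ignore them.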
@thom311 do you have tickets to reference here for the few RFE items above on NM side? I think you mentioned in Brno that you had action items on your plate, but I couldn't find any on gitlab now. |
nm-initrd-generator is currently not shipped in the F30 NM packages. |
It looks like they rolled that back because of Anaconda; I don't think that impacts FCOS. Can we include the generator in FCOS anyway? |
@lucab sorry for the late reply. To my understanding, we identified 3 main issues.
All of these should be ready in the upcoming 1.20.0 release. NetworkManager 1.20.0 is not yet released, but that will happen soon and it will land in rawhide quickly. |
From reading this issue, it seems NetworkManager has been decided on, but from a user's perspective, trying to get NetworkManager working (and still failing) has been extremely painful and time-consuming. I have a small request: please provide good docs on setting up connection profiles, as the information currently available on the internet seems pretty sparse. For a bit of awareness of the problem I faced (I don't expect it to be solved here though):
Beware, the above will cause connectivity issues if NetworkManager ever crashes or you restart it. It's probably better to put it in a separate unit. |
@SerialVelocity what are you running in initrd to set up networking? When NetworkManager starts and finds an interface pre-configured, it assumes that somebody else configured the device and won't autoactivate a profile on it, because that would be destructive. That is also the case when you set up networking in initrd before NM starts. One possible solution here would be to define an API for how NetworkManager takes over externally configured devices. Optimally, this API would not be NM-specific, so you could run an arbitrary tool in initrd (that honors the API) and pass the configured devices to any other network configuration tool in the real boot. In practice, defining and implementing such an API complicates everything tremendously, and we already lack contributors and testers to implement it. So the suggested solution is instead to also run NetworkManager in initrd. NetworkManager knows how to configure the device and hand it over to itself. That way we concentrate our efforts on having one combination work well. In this GitHub issue, that point is also discussed as a roadblock: running NetworkManager in initrd. From NetworkManager's side, this should be working, but it requires configuring the initrd accordingly (upstream dracut also has support for that, and Fedora 31 also runs NetworkManager in initrd). Yes, the |
@thom311 Thanks for the explanation! Another possible solution is a flag that forces the connection to match a certain interface no matter what or just always match if But ok, the NetworkManager in initrd is meant to fix this case, good to know! Thanks again for the info.
Yes, it made me very sad having to do this. 😭 |
OK, right. Seems like this is the default now in f31? Just going to cross-link this here: coreos/fedora-coreos-config#200 (comment). Note this is only to get the rebase to f31 going. With that in place, we should be able to switch to NM in the initrd more easily when we're ready. (Though offhand as mentioned in that link, I wasn't able to get it working in testing, but I didn't dig very deeply.) |
An update on two points about networkd that were raised earlier in the discussion:
In systemd-244 networkd supports reloading of configuration (through a dbus command or by In systemd-243 a little tool called This generator approach shows that it is fairly easy to write "importers" for external config. |
This discussion has been open for a long time. In practice we have shipped Fedora CoreOS out of preview with NetworkManager as the de-facto networking configuration implementation on FCOS. This is even more true with the recent move to use NetworkManager in the initramfs, which brings us closer to the default networking implementation in the initramfs for the rest of Fedora (as of Fedora 31). I'm going to close this out now. Thanks for the discussion all! |
systemd-networkd was removed because NetworkManager was chosen as the de-facto networking configuration implementation, as discussed in coreos/fedora-coreos-tracker#24, but removing it entirely restricts end users' choices for their use cases. Since systemd-networkd can live alongside NetworkManager, this PR adds it back in.
Fedora uses NetworkManager for handling network configuration. Container Linux uses networkd. We need to decide on one. We don't want to carry both, since that's just twice the maintenance and chance of breakage without much benefit.
NetworkManager is advantageous because it has wider adoption, especially within the Fedora (and Red Hat) ecosystem. It is (to my knowledge) generally more stable than networkd. Unfortunately, it's also harder to write config files for. The nmstate project will help significantly (it makes the configuration more declarative), but it still lacks the flexibility of networkd's configuration. nmstate would need to be rewritten in some compiled language (i.e. not Python) for inclusion in FCOS.
networkd has a configuration format that lends itself nicely to Container Linux today. The ability to "layer" configs works well for having a default that can be overridden for cloud-specific changes and user-specified changes. This is especially powerful when combined with its matching rules. Its configuration is very similar to systemd's in general. nmstate has a proposal for templates which would help, but they still aren't as flexible as networkd's configuration. Unfortunately, networkd tends to suffer regressions and isn't as actively maintained as the core of systemd or NetworkManager. It cannot handle config file changes without restarting the service, but that isn't an issue with FCOS since the nodes shouldn't be configured after first boot.
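A minimal sketch of the layering described above, assuming a hypothetical distro default and a user-provided override. File names, interface names, and addresses are placeholders; networkd applies the first `.network` file, in lexical order of file name, whose `[Match]` section matches an interface.

```ini
# /usr/lib/systemd/network/90-default.network (hypothetical distro default)
# Catch-all: bring up any Ethernet-style interface with DHCP.
[Match]
Name=en* eth*

[Network]
DHCP=yes

# /etc/systemd/network/10-static.network (hypothetical user override)
# Sorts before 90-default.network, so it wins for eth0.
[Match]
Name=eth0

[Network]
Address=192.0.2.10/24
Gateway=192.0.2.1
```

A file in `/etc/systemd/network` with the same name as one in `/usr/lib/systemd/network` replaces it entirely, and drop-in snippets under a `<name>.network.d/` directory can adjust individual settings.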
Finally, networkd has fewer dependencies than NetworkManager (considering we're already shipping systemd), especially since Fedora enables most features. We could change this and repackage it for FCOS, stripping out unneeded features, but that'd be another custom package to carry and maintain.
We don't have any visibility into how existing CL or FAH users are using networkd or NetworkManager (respectively). This makes it hard to determine what requirements we have for network configuration.
In my opinion, networkd is a better fit for FCOS, even if it is more regression-happy than we'd like. I'm also perhaps a bit biased coming from my CL background.