ERROR: produced an unexpected new value: Root resource was present, but now absent. #333

Closed
natemellendorf opened this issue Aug 3, 2024 · 7 comments


natemellendorf commented Aug 3, 2024

Good evening, everyone.

We're in the process of deploying multiple Fortinet NGFW instances in AWS.
Things have been going smoothly, but we've hit a snag and we're not sure what to make of it.

I'll provide debugs and details below, but let me know if more information is needed.

Setup

We have multiple fortinet providers configured, each having a unique alias.
We have our firewall configuration tied to a single terraform module, which we duplicate and pass each provider into.

module "use1-az6-a" {
  source = "../../modules/tfm_fortinet_config"
  providers = {
    fortios = fortios.use1-az6-a
  }
  policy_config = local.fortinet_policies
}

module "use1-az6-b" {
  source = "../../modules/tfm_fortinet_config"
  providers = {
    fortios = fortios.use1-az6-b
  }
  policy_config = local.fortinet_policies
}
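
For context, the aliased providers passed into these modules might be declared roughly as follows. This is only a sketch and is not taken from the original configuration: the variable names `fortios_hosts` and `fortios_tokens` are hypothetical, and the `vdom` value is simply the one that shows up in the trace later in this issue.

```hcl
# Root configuration. Variable names here are hypothetical placeholders.
variable "fortios_hosts" {
  type = map(string) # provider alias => firewall hostname
}

variable "fortios_tokens" {
  type      = map(string) # provider alias => API token
  sensitive = true
}

# One provider block per firewall, distinguished by alias.
provider "fortios" {
  alias    = "use1-az6-a"
  hostname = var.fortios_hosts["use1-az6-a"]
  token    = var.fortios_tokens["use1-az6-a"]
  vdom     = "FG-traffic"
}

provider "fortios" {
  alias    = "use1-az6-b"
  hostname = var.fortios_hosts["use1-az6-b"]
  token    = var.fortios_tokens["use1-az6-b"]
  vdom     = "FG-traffic"
}

# Inside modules/tfm_fortinet_config: declare the provider requirement so the
# "fortios" key in the module's providers map resolves to this source.
terraform {
  required_providers {
    fortios = {
      source = "fortinetdev/fortios"
    }
  }
}
```

With this layout, each module instance talks to exactly one firewall, which matches the per-alias provider address shown in the error message.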

Example logic in the module, which creates all the addresses that are passed in:

resource "fortios_firewall_address" "this" {
  for_each      = var.policy_config["addresses"]
  name          = each.value.name
  color         = try(each.value.color, 0)
  fqdn          = try(each.value.fqdn, null)
  subnet        = try(each.value.subnet, null)
  country       = try(each.value.country, null)
  wildcard_fqdn = try(each.value.wildcard_fqdn, null)
  obj_type      = try(each.value.obj_type, null)
  sub_type      = try(each.value.sub_type, null)
  type          = try(each.value.type, null)
}
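
To make the `try(...)` fallbacks above concrete, here is a sketch of the input shape the resource would accept. The entries are hypothetical (the real `local.fortinet_policies` was not posted), and it assumes the module's `policy_config` variable is loosely typed (for example `any`) so entries may carry different attributes.

```hcl
locals {
  fortinet_policies = {
    addresses = {
      # Hypothetical entries. Attributes that are left out fall back to the
      # defaults given in the try() calls (null, or 0 for color).
      "example-subnet" = {
        name   = "example-subnet"
        type   = "ipmask"
        subnet = "10.0.0.0 255.255.255.0"
      }
      "example-fqdn" = {
        name = "example-fqdn"
        type = "fqdn"
        fqdn = "example.com"
      }
    }
  }
}
```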

Issue

When running a terraform plan, everything looks great. I'll see my firewalls each needing to have their policies applied.
However, when running terraform apply, we get inconsistent results. Sometimes everything applies smoothly; other times, the apply fails on one of our firewalls. When it fails, this is the error message we get:

│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.use1-az6-b.fortios_firewall_address.this["CVIAWS020"], provider "provider[\"registry.terraform.io/fortinetdev/fortios\"].use1-az6-b" produced an unexpected new value: Root resource was present, but now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

Debugs

debug.json

Observation

I noticed that when these failures occur, I see this in the trace:
\"matched_count\":2,\n

I see a similar log for other resources that create successfully, but they log:
\"matched_count\":1,\n

As these firewalls have little to no firewall configuration on them when we apply terraform, I find this behavior puzzling.

When I log in to the firewall that produced the error message, I find that the resource that threw the error was in fact created.

Running terraform apply again then produces the typical 500 server error, because the resource already exists on the firewall (or at least I believe that's why).

This error occurs randomly: it can happen on any one of the firewalls we're configuring, and on any resource, not just CVIAWS020. However, in my testing, it seems to only happen to firewall addresses (objects).

{
    "@level": "info",
    "@message": "2024/08/02 23:28:33 FOS-fortios reading response: {\n  \"http_method\":\"GET\",\n  \"size\":40,\n  \"limit_reached\":false,\n  \"matched_count\":2,\n  \"next_idx\":16,\n  \"revision\":\"0b97f06be46d75f9885025723106e92a\",\n  \"cli_error\":[\n  ],\n  \"status\":\"error\",\n  \"http_status\":404,\n  \"vdom\":\"FG-traffic\",\n  \"path\":\"firewall\",\n  \"name\":\"address\",\n  \"mkey\":\"CVIAWS020\",\n  \"serial\":\"FGTAWSHTYM1GEG90\",\n  \"version\":\"v7.4.4\",\n  \"build\":2662\n}",
    "@module": "provider.terraform-provider-fortios_v1.20.0",
    "@timestamp": "2024-08-02T23:28:33.331674Z",
    "timestamp": "2024-08-02T23:28:33.331Z"
}
{
    "@level": "info",
    "@message": "2024/08/02 23:28:33 [WARN] resource (CVIAWS020) not found, removing from state",
    "@module": "provider.terraform-provider-fortios_v1.20.0",
    "@timestamp": "2024-08-02T23:28:33.331719Z",
    "timestamp": "2024-08-02T23:28:33.331Z"
}
@natemellendorf
Author

natemellendorf commented Aug 6, 2024

For what it's worth, after some debugging, I noticed that after a successful POST to the firewall address endpoint, the provider immediately performs a GET against the returned resource ID. Since the POST itself returns a status for the initial request (success/failure), the extra GET after a reported success felt redundant. That said, I may be misunderstanding the purpose of that extra GET and why it's performed on all firewall address actions except DELETE.

I've pulled the provider locally, removed that extra GET, and verified that my firewall addresses, address groups, services, service groups, and policies are building correctly.

This is just a data point I've collected. I'm still not sure why the firewall would return a 404 or a matched_count of 2 on newly created firewall addresses. It almost feels like some kind of eventual consistency issue, though that's just a guess.

@MaxxLiu22

Hi @natemellendorf ,

Thank you for raising this issue and providing this valuable information. It is quite strange; it appears that the FOS return shows two objects existing in the current URL path, but the content is missing, resulting in a 404 error. As you mentioned, if this issue occurs randomly across different resources, it may be related to the logic of how the backend handles requests.

I wonder if it is possible for you to provide your var.policy_config file and hide sensitive information since I can't reproduce this issue on my end when creating 20 firewall addresses. Alternatively, we could enable the debug function on FOS to see what is happening on the backend.

diagnose debug application httpsd -1
diagnose debug enable

Thanks,
Maxx

@natemellendorf
Author

@MaxxLiu22

Thanks for taking a look and responding to my issue.

Your observation and subsequent concern is where I landed too. I enabled those debug commands on one of the six Fortinet NGFWs two days ago, and they didn’t reveal much for me.

I’ll start over, enable them again, and run the apply until the firewall produces the error. I’ll also supply a full working example of my terraform configuration with sensitive info redacted.

I should have this for you tomorrow.

Thanks again,
Nate

@rdkls

rdkls commented Nov 20, 2024

I have a simple reproduction here:

terraform {
  required_providers {
    fortimanager = {
      source  = "fortinetdev/fortimanager"
      version = "1.12.1"
    }
  }
}

provider "fortimanager" {
  hostname = "x.x.x.x"
  username = "admin"
  password = "password"
}

locals {
  fqdns_intune = split("\n", file("urls-intune.txt"))
}

resource "fortimanager_object_firewall_address" "intune_fqdn" {
  for_each = toset(local.fqdns_intune)
  name     = "Intune ${each.value}"
  type     = "fqdn"
  fqdn     = each.value
  comment  = "Intune ${each.value}"
}

Where urls-intune.txt is just

*.manage.microsoft.com
manage.microsoft.com
*.delivery.mp.microsoft.com
*.prod.do.dsp.mp.microsoft.com
*.update.microsoft.com
*.windowsupdate.com

(actual list is much longer but truncated to this to test, and produces same error)

diag logs I'm seeing on the fortimanager

Request [/usr/local/apache2/bin/httpd: 8030: 2292
]: {
    "client": "\/usr\/local\/apache2\/bin\/httpd:8030",
    "id": 2292,
    "method": "add",
    "params": [
        {
            "data": {
                "comment": "Intune *.windowsupdate.com",
                "fqdn": "*.windowsupdate.com",
                "name": "Intune *.windowsupdate.com",
                "type": "fqdn"
            },
            "url": "\/pm\/config\/adom\/root\/obj\/firewall\/address"
        }
    ],
    "session": "xxx",
    "src": "x.x.x.x"
}

Request [/usr/local/apache2/bin/httpd: 8030: 2298
]: {
    "client": "\/usr\/local\/apache2\/bin\/httpd:8030",
    "id": 2298,
    "method": "get",
    "params": [
        {
            "data": null,
            "url": "\/pm\/config\/adom\/root\/obj\/firewall\/address\/Intune\/*.windowsupdate.com"
        }
    ],
    "session": "xxx",
    "src": "x.x.x.x",
    "verbose": 1
}

As Nate mentioned, you can see the GET there.

@rdkls

rdkls commented Nov 20, 2024

btw I found this issue isn't present in version 1.8.0

@MaxxLiu22

Hi @rdkls ,

Thank you for bringing this issue to my attention. I was able to reproduce it, and since this issue is related to the Terraform FMG provider, I have opened a separate case to track it. The root cause appears to be the presence of a space in the address name. During the GET request, Terraform may not handle this special character properly, which prevents it from locating the object it just created. I have reported the issue to the development team for resolution.

Thanks,
Maxx
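
Given the diagnosis above, one possible interim workaround for the FortiManager reproduction is to keep spaces out of the object name. The sketch below is untested against the affected provider version and rests on the assumption that a space-free name avoids the failing lookup. The FortiManager log earlier in the thread shows the `add` request carrying the name `Intune *.windowsupdate.com`, while the follow-up `get` URL ends in `address\/Intune\/*.windowsupdate.com`. Whether the `*` in wildcard entries causes any separate problem is not established in this thread.

```hcl
locals {
  # Same input as in the reproduction: one FQDN per line.
  fqdns_intune = split("\n", file("urls-intune.txt"))
}

resource "fortimanager_object_firewall_address" "intune_fqdn" {
  for_each = toset(local.fqdns_intune)

  # Build the name as before, then replace every space with an underscore
  # so the object name used in the follow-up GET contains no space.
  name = replace("Intune ${each.value}", " ", "_")

  type    = "fqdn"
  fqdn    = each.value
  comment = "Intune ${each.value}"
}
```

Only `name` changes here; the comment field can keep its space because it is not part of the object's URL path. Renaming an existing object this way will likely cause Terraform to plan a replacement for it.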

@MaxxLiu22

Hi @natemellendorf, I am proceeding to close this case. If you have any further questions, please feel free to reopen it or create a new case.
