
Bootstrap/Kick issues with realms #347

Open

gwilton opened this issue Mar 22, 2014 · 3 comments

Comments


gwilton commented Mar 22, 2014

Hi,

Over this past week I have been trying to integrate with Ironfan 6.0.x and have run into numerous problems along the way. At the moment I can successfully launch instances to EC2, but every other piece of functionality seems to be failing. Here are the problems I am having; maybe someone here can help.

Dependencies

My Gemfile looks like this. The ironfan_homebase still seems to be pinned to Ironfan 4, so I did what I could here to integrate with Ironfan 6.

source "http://rubygems.org"

#
# Chef
#
gem 'ironfan',         "= 6.0.6"

gem 'berkshelf',       "= 1.4.2"     # FIXME: pins chef to the 10.16 branch.
gem 'faraday',         "= 0.8.9"     # Latest faraday was causing problems with ridley, so pin it down
gem 'parseconfig'

gem 'spiceweasel'
gem 'chef-rewind'
gem "knife-ec2", "~> 0.6.4"

#
# Test drivers
#

group :test do
  gem 'rake'
  gem 'bundler',       "~> 1"
  gem 'rspec',         "~> 2.5"
  gem 'redcarpet',   "~> 2"
  gem 'cucumber',      "~> 1.1"
  gem 'foodcritic'
end

#
# Development
#

group :development do
  gem 'yard',          "~> 0.6"
  gem 'jeweler'

  gem 'ruby_gntp'

  # FIXME: Commented out until guard-chef stops breaking bundle update
  # gem 'guard',         "~> 1"
  # gem 'guard-process', "~> 1"
  # gem 'guard-chef',    :git => 'git://github.com/infochimps-forks/guard-chef.git'
  # gem 'guard-cucumber'
end

group :support do
  gem 'pry'  # useful in debugging
end

Clusters & Realms Definition

It seems like defining clusters under ironfan_homebase/clusters is going away and everything is now done in ironfan_homebase/realms. I was able to put something together with the following documentation: https://github.com/infochimps-labs/ironfan/blob/master/NOTES-REALM.md.

I created a realm, 'ironfan_homebase/realms/q1.rb', that looks like this:

Ironfan.realm(:q1) do  

  cluster :control do
    cloud(:ec2) do
      permanent           false
      availability_zones ['us-east-1a']
      flavor              'm1.large'
      backing             'ebs'
      image_name          'ironfan-precise'
      bootstrap_distro    'ubuntu12.04-ironfan'
      chef_client_script  'client.rb'
      mount_ephemerals
    end

    environment           :qa


    role                  :systemwide,    :first
    cloud(:ec2).security_group :systemwide
    role                  :ssh
    cloud(:ec2).security_group(:ssh).authorize_port_range 22..22
    role                  :set_hostname

    recipe                'log_integration::logrotate'

    role                  :volumes
    role                  :package_set,   :last
    role                  :minidash,      :last

    role                  :org_base
    role                  :org_users
    role                  :org_final,     :last

    role                  :tuning,        :last

    facet :worker do
      instances           1
    end

    facet :app do
      instances           1
      cloud(:ec2).flavor        'm1.large'
      recipe              'volumes::build_raid', :first

      # FIXME: This works around https://github.com/infochimps-labs/ironfan/issues/209
      cloud(:ec2).mount_ephemerals(:mountable => false, :in_raid => "md0")
      raid_group(:md0) do
        device            '/dev/md0'
        mount_point       '/raid0'
        level             0
        sub_volumes       [:ephemeral0, :ephemeral1]
      end
    end

    cluster_role.override_attributes({
      })
  end


end

Launching EC2 instance

I am able to launch the instance in EC2 without a problem, but the moment I try to bootstrap it I get an ERROR. I have played around with many different cluster/realm definitions; the only time I get a different result is when the cluster is named "sandbox" (I know, very strange; see below).

$ knife cluster launch q1-control-worker-0
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, control cluster, worker facet, servers 0
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
  +---------------------+-------+-------------+----------+------------+-----+-------+
  | Name                | Chef? | State       | Flavor   | AZ         | Env | Realm |
  +---------------------+-------+-------------+----------+------------+-----+-------+
  | q1-control-worker-0 | no    | not running | m1.large | us-east-1a | qa  | q1    |
  +---------------------+-------+-------------+----------+------------+-----+-------+
Syncing to chef
Preparing shared resources:
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster control
  q1-control:       creating key pair for q1-control
  control:          creating security groups
  q1-control:         creating q1-control security group
  q1-control-app:     creating q1-control-app security group
  q1-control-worker:      creating q1-control-worker security group
  control:          ensuring security group permissions
  q1-control:         ensuring access from q1-control to q1-control
  ssh:                ensuring tcp access from 0.0.0.0/0 to 22..22

Launching computers
  +---------------------+-------+-------------+----------+------------+-----+-------+
  | Name                | Chef? | State       | Flavor   | AZ         | Env | Realm |
  +---------------------+-------+-------------+----------+------------+-----+-------+
  | q1-control-worker-0 | no    | not running | m1.large | us-east-1a | qa  | q1    |
  +---------------------+-------+-------------+----------+------------+-----+-------+
  q1-control-worker-0:  creating cloud machine
  i-b0bc4891:       waiting for machine to be ready
  i-b0bc4891:       tagging with {"cluster"=>"control", "facet"=>"worker", "index"=>0, "name"=>"q1-control-worker-0", "Name"=>"q1-control-worker-0", "creator"=>"wilton"}
  vol-94fab2e2:     tagging with {"cluster"=>"control", "facet"=>"worker", "index"=>0, "name"=>"q1-control-worker-0-root", "Name"=>"q1-control-worker-0-root", "creator"=>"wilton", "server"=>"q1-control-worker-0", "mount_point"=>"/", "device"=>"/dev/sda1"}
  q1-control-worker-0:  setting termination flag false
  q1-control-worker-0:  syncing EBS volumes
  q1-control-worker-0:  trying ssh
All computers launched correctly
Applying aggregations:
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster control
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP     | Private IP    | Created On |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+
  | q1-control-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-b0bc4891 | 50.19.196.221 | 10.225.25.224 | 2014-03-22 |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+

Bootstrapping instance

When cluster name is "control"

$ knife cluster bootstrap q1-control-worker-0
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, control cluster, worker facet, servers 0
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+-----------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP     | Private IP    | Created On | relevant? |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+-----------+
  | q1-control-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-b0bc4891 | 50.19.196.221 | 10.225.25.224 | 2014-03-22 | true      |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+-----------+
Preparing shared resources:
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster control
  control:          ensuring security group permissions
  q1-control:         ensuring access from q1-control to q1-control
  ssh:                ensuring tcp access from 0.0.0.0/0 to 22..22

Running bootstrap on q1-control-worker-0...

Bootstrapping the node redoes its initial setup -- only do this on an aborted launch.
Are you absolutely certain that you want to perform this action? (Type 'Yes' to confirm) Yes

WARNING: Error running #<Ironfan::Broker::Computer(server=#<Ironfan::Dsl::Server(name="0", components=c{  }, run_list_items=c{ role[systemwide], role[ssh], role[nfs_client], role[set_hostname], log_integration::logrotate, role[volumes], role[package_set], role[minidash], role[org_base], role[org_users], role[org_final], role[tuning], role[q1-control-cluster], role[q1-control-worker-facet] }, clouds=c{ ec2 }, volumes=c{  }, security_groups=c{  }, environment=:qa, realm_name="q1", cluster_role=#<Ironfan::Dsl::Role>, facet_role=#<Ironfan::Dsl::Role>, cluster_names={:control=>:control}, cluster_name="control", facet_name="worker")>, resources=c{ client, node, machine, security_group__systemwide, security_group__ssh, security_group__nfs_client }, drives=c{ root, ephemeral0, ephemeral1 }, providers=c{ chef, iaas })>:
WARNING: undefined method `ssh_identity_file' for #<Ironfan::Dsl::Ec2:0x007f9cd4733958>
ERROR: undefined method `ssh_identity_file' for #<Ironfan::Dsl::Ec2:0x007f9cd4733958> (NoMethodError)
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan/broker/computer.rb:209:in `ssh_identity_file'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/ironfan_knife_common.rb:171:in `bootstrapper'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/ironfan_knife_common.rb:181:in `run_bootstrap'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/cluster_bootstrap.rb:62:in `block in perform_execution'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:114:in `block (3 levels) in parallel'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:123:in `safely'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:113:in `block (2 levels) in parallel'
ERROR: /Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan/broker/computer.rb:209:in `ssh_identity_file'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/ironfan_knife_common.rb:171:in `bootstrapper'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/ironfan_knife_common.rb:181:in `run_bootstrap'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/chef/knife/cluster_bootstrap.rb:62:in `block in perform_execution'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:114:in `block (3 levels) in parallel'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:123:in `safely'
/Users/wilton/.rvm/gems/ruby-1.9.3-p392@ironchef/gems/ironfan-6.0.6/lib/ironfan.rb:113:in `block (2 levels) in parallel'
Applying aggregations:
  control:          Loading chef
  control:          Loading ec2
  control:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster control

Finished! Current state:
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP     | Private IP    | Created On |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+
  | q1-control-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-b0bc4891 | 50.19.196.221 | 10.225.25.224 | 2014-03-22 |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+---------------+---------------+------------+

When cluster name is "sandbox"

$ knife cluster bootstrap q1-sandbox-worker-0
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP   | Private IP   | Created On | relevant? |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
  | q1-sandbox-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-56b64277 | 54.82.90.20 | 10.96.197.99 | 2014-03-22 | true      |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
Preparing shared resources:
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster sandbox
  sandbox:          ensuring security group permissions
  q1-sandbox:         ensuring access from q1-sandbox to q1-sandbox
  ssh:                ensuring tcp access from 0.0.0.0/0 to 22..22

Running bootstrap on q1-sandbox-worker-0...

Bootstrapping the node redoes its initial setup -- only do this on an aborted launch.
Are you absolutely certain that you want to perform this action? (Type 'Yes' to confirm) Yes

  q1-sandbox-worker-0:  Running bootstrap
Bootstrapping Chef on ec2-54-82-90-20.compute-1.amazonaws.com
Failed to authenticate ubuntu - trying password auth

When cluster name is "sandbox", with -i

When the cluster is named "sandbox", I get prompted for a password, so it seems the ssh key is not being set properly. I use the -i option and provide the generated key under "ironfan_homebase/knife/credentials/ec2_keys/" to get past this. The system begins to bootstrap but never completes successfully: I get a console prompt asking me to enter the chef_server_url, which I can't do. I do see that chef-client is installed on the instance.

$ knife cluster bootstrap q1-sandbox-worker-0 -i knife/credentials/ec2_keys/q1-sandbox.pem 
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP   | Private IP   | Created On | relevant? |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
  | q1-sandbox-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-3dae5a1c | 54.82.77.79 | 10.65.144.70 | 2014-03-22 | true      |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+-----------+
Preparing shared resources:
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
Loaded information for 2 computer(s) in cluster sandbox
  sandbox:          ensuring security group permissions
  q1-sandbox:         ensuring access from q1-sandbox to q1-sandbox
  ssh:                ensuring tcp access from 0.0.0.0/0 to 22..22

Running bootstrap on q1-sandbox-worker-0...

Bootstrapping the node redoes its initial setup -- only do this on an aborted launch.
Are you absolutely certain that you want to perform this action? (Type 'Yes' to confirm) Yes

  q1-sandbox-worker-0:  Running bootstrap
Bootstrapping Chef on ec2-54-82-77-79.compute-1.amazonaws.com
ec2-54-82-77-79.compute-1.amazonaws.com deb http://apt.opscode.com/ precise-0.10 main
ec2-54-82-77-79.compute-1.amazonaws.com gpg: directory `/local/home/ubuntu/.gnupg' created
ec2-54-82-77-79.compute-1.amazonaws.com gpg: new configuration file `/local/home/ubuntu/.gnupg/gpg.conf' created
ec2-54-82-77-79.compute-1.amazonaws.com gpg: 
ec2-54-82-77-79.compute-1.amazonaws.com WARNING: options in `/local/home/ubuntu/.gnupg/gpg.conf' are not yet active during this run
ec2-54-82-77-79.compute-1.amazonaws.com gpg: keyring `/local/home/ubuntu/.gnupg/secring.gpg' created
ec2-54-82-77-79.compute-1.amazonaws.com gpg: keyring `/local/home/ubuntu/.gnupg/pubring.gpg' created
ec2-54-82-77-79.compute-1.amazonaws.com gpg: requesting key 83EF826A from hkp server keys.gnupg.net
ec2-54-82-77-79.compute-1.amazonaws.com gpg: /local/home/ubuntu/.gnupg/trustdb.gpg: trustdb created
ec2-54-82-77-79.compute-1.amazonaws.com gpg: key 83EF826A: public key "Opscode Packages <[email protected]>" imported
ec2-54-82-77-79.compute-1.amazonaws.com gpg: Total number processed: 1
ec2-54-82-77-79.compute-1.amazonaws.com gpg: 
ec2-54-82-77-79.compute-1.amazonaws.com               imported: 1
ec2-54-82-77-79.compute-1.amazonaws.com Sat Mar 22 17:17:26 UTC 2014 
ec2-54-82-77-79.compute-1.amazonaws.com 
ec2-54-82-77-79.compute-1.amazonaws.com **** 
ec2-54-82-77-79.compute-1.amazonaws.com **** apt update:
ec2-54-82-77-79.compute-1.amazonaws.com ****
ec2-54-82-77-79.compute-1.amazonaws.com Preconfiguring packages ...
ec2-54-82-77-79.compute-1.amazonaws.com 

ec2-54-82-77-79.compute-1.amazonaws.com Package configuration                   



      ┌───────────────────────┤ Configuring chef ├───────────────────────┐      
      │  This is the full URI that clients will use to connect to the    │      
      │  server.                                                         │      
      │  .                                                               │      
      │  This will be used in /etc/chef/client.rb as 'chef_server_url'.  │      
      │                                                                  │      
      │ URL of Chef Server (e.g., http://chef.example.com:4000):         │      
      │                                                                  │      
      │ ________________________________________________________________ │      
      │                                                                  │      
      │                              <Ok>                                │      
      │                                                                  │      
      └──────────────────────────────────────────────────────────────────┘      

knife cluster kick

So let's assume that somehow the bootstrap above at least installed the chef-client, and try a kick. I follow the same method as bootstrapping and provide the -i option. That fails, and I am not sure why it is trying to use my local username 'wilton' for the kick; that should be ubuntu. Ignoring that, I provide the -x option as well.

$ knife cluster kick q1-sandbox-worker-0 -i knife/credentials/ec2_keys/q1-sandbox.pem 
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP   | Private IP   | Created On |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
  | q1-sandbox-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-3dae5a1c | 54.82.77.79 | 10.65.144.70 | 2014-03-22 |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
WARNING: Failed to connect to  -- Net::SSH::AuthenticationFailed: Authentication failed for user [email protected]@ec2-54-82-77-79.compute-1.amazonaws.com
$ knife cluster kick q1-sandbox-worker-0 -i knife/credentials/ec2_keys/q1-sandbox.pem -x ubuntu
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
  sandbox:          Loading chef
  sandbox:          Loading ec2
  sandbox:          Reconciling DSL and provider information
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
  | Name                | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP   | Private IP   | Created On |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
  | q1-sandbox-worker-0 | yes   | running | m1.large | us-east-1a | qa  | q1    | i-3dae5a1c | 54.82.77.79 | 10.65.144.70 | 2014-03-22 |
  +---------------------+-------+---------+----------+------------+-----+-------+------------+-------------+--------------+------------+
q1-sandbox-worker-0 ****
q1-sandbox-worker-0 
q1-sandbox-worker-0 starting chef-client-nonce service
q1-sandbox-worker-0 
q1-sandbox-worker-0 ****
q1-sandbox-worker-0 
q1-sandbox-worker-0 [2014-03-22T17:31:54+00:00] INFO: *** Chef 10.16.4 ***
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: Run List is [role[systemwide], role[ssh], role[set_hostname], recipe[log_integration::logrotate], role[volumes], role[org_base], role[org_users], role[package_set], role[org_final], role[tuning], role[q1-sandbox-cluster], role[q1-sandbox-worker-facet]]
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: Run List expands to [apt::update_immediately, build-essential, motd, ntp, route53::default, route53::set_hostname, log_integration::logrotate, xfs, volumes::mount, volumes::resize, package_set, tuning::default]
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: HTTP Request Returned 404 Not Found: No routes match the request: //reports/nodes/q1-sandbox-worker-0/runs
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: Starting Chef Run for q1-sandbox-worker-0
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: Running start handlers
q1-sandbox-worker-0 [2014-03-22T17:31:55+00:00] INFO: Start handlers complete.
q1-sandbox-worker-0 [2014-03-22T17:31:56+00:00] INFO: Loading cookbooks [apt, build-essential, log_integration, motd, ntp, package_set, route53, silverware, tuning, volumes, xfs]

knife cluster ssh

Same issue trying to use cluster ssh: I have to provide both the -i and -x options to get a successful authentication.

$ knife cluster ssh q1-sandbox-worker-0 -i knife/credentials/ec2_keys/q1-sandbox.pem uptime
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
sandbox: Loading chef
sandbox: Loading ec2
sandbox: Reconciling DSL and provider information
WARNING: Failed to connect to -- Net::SSH::AuthenticationFailed: Authentication failed for user wilton@ec2-54-82-77-79.compute-1.amazonaws.com@ec2-54-82-77-79.compute-1.amazonaws.com

$ knife cluster ssh q1-sandbox-worker-0 -i knife/credentials/ec2_keys/q1-sandbox.pem -x ubuntu uptime
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in q1 realm, sandbox cluster, worker facet, servers 0
sandbox: Loading chef
sandbox: Loading ec2
sandbox: Reconciling DSL and provider information
q1-sandbox-worker-0 17:34:05 up 18 min, 1 user, load average: 0.15, 0.15, 0.14
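
A stopgap I am considering is pinning the ssh user and identity file in .chef/knife.rb so I don't have to pass -x and -i every time. I have not verified that Ironfan's knife wrappers actually honor these settings, and the path below is just my homebase layout:

# .chef/knife.rb (excerpt) -- untested stopgap; adjust the path to your homebase
knife[:ssh_user]      = 'ubuntu'
knife[:identity_file] = File.expand_path('../knife/credentials/ec2_keys/q1-sandbox.pem', File.dirname(__FILE__))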

Conclusion

I know this is a lot of information. I have been trying to get this working for a while. I am not familiar enough with the Ironfan internals to dig too deep, but I will give it a shot this weekend. I am at the point where, if I can't figure this out, I will have to go back to using Ironfan 3/4 with Chef 0.10.x.

I think there are a few issues here:

  1. I'm not sure why I can't bootstrap a facet unless the cluster name is "sandbox". This is really weird.
  2. I think point 1 is tied to a bigger problem around ssh keys for realm clusters. I see a lot of commits over the last few days around keys in general.
  3. Once I get the systems to bootstrap, I think there is an issue with the "ubuntu12.04-ironfan" bootstrap script; it should not prompt for chef_server_url (see the sketch after this list).

These are the three things I will be looking into this weekend to try to fix. I would really appreciate some help here. Thank you!
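
On point 3, as far as I can tell that dialog is the apt chef package's own debconf question ("This will be used in /etc/chef/client.rb as 'chef_server_url'"), so it should never come up if the bootstrap writes /etc/chef/client.rb itself rather than letting the package configure it interactively. For reference, the sort of minimal client.rb I would expect the bootstrap to drop (values here are just placeholders):

# /etc/chef/client.rb -- sketch only, placeholder values
log_level              :info
log_location           STDOUT
node_name              'q1-control-worker-0'
chef_server_url        'https://chef.example.com:443'
validation_client_name 'chef-validator'
validation_key         '/etc/chef/validation.pem'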

Contributor

aseever commented Mar 22, 2014

Thanks for the details! Sorry it hasn't gone smoothly this week. It might be a few days before we have a solution but we'll dig in to see if we can ascertain what might have gone wrong.

Author

gwilton commented Mar 22, 2014

Hey man, thanks for the quick follow-up. I enjoy using Ironfan and will do what I can to help; if I figure anything out I will let you know. If anyone is running a working homebase with Ironfan 6, it would be great to see how the Gemfile and knife.rb look, what Ruby version is being used, and maybe even a gem list. An example of a fully working realm configuration would also help. Thanks.

P.S.
Every knife cluster command says...
"no realm-specific Gemfile found. using default Gemfile."
What does this mean?

Author

gwilton commented Mar 23, 2014

I was able to gather more information, at least for the bootstrapping issue. The reason I am able to bootstrap a cluster named "sandbox" while nothing else works is that I have an EC2 key pair named "sandbox". You will see in the resource data object below that during the bootstrap process the realm looks for an identity_file named #{cluster}.pem when it should be #{realm}-#{cluster}.pem. If #{cluster}.pem happens to exist, you will not get an ERROR, but you still have to provide the proper identity_file and username with the -i and -x options.

$ knife cluster bootstrap qa-sandbox-app-0
no realm-specific Gemfile found. using default Gemfile.
Inventorying servers in qa realm, sandbox cluster, app facet, servers 0
  sandbox:           Loading chef
  sandbox:           Loading ec2
  sandbox:           Reconciling DSL and provider information
  +------------------+-------+---------+----------+------------+-----+-------+------------+--------------+---------------+------------+-----------+
  | Name             | Chef? | State   | Flavor   | AZ         | Env | Realm | MachineID  | Public IP    | Private IP    | Created On | relevant? |
  +------------------+-------+---------+----------+------------+-----+-------+------------+--------------+---------------+------------+-----------+
  | qa-sandbox-app-0 | yes   | running | m1.large | us-east-1a | qa  | qa    | i-7a2fd85b | 54.80.213.11 | 10.214.21.194 | 2014-03-23 | true      |
  +------------------+-------+---------+----------+------------+-----+-------+------------+--------------+---------------+------------+-----------+
Preparing shared resources:
  sandbox:           Loading chef
  sandbox:           Loading ec2
  sandbox:           Reconciling DSL and provider information
Loaded information for 3 computer(s) in cluster sandbox
  sandbox:           ensuring security group permissions
  qa-sandbox:          ensuring access from qa-sandbox to qa-sandbox
  ssh:                 ensuring tcp access from 0.0.0.0/0 to 22..22

Running bootstrap on qa-sandbox-app-0...

Bootstrapping the node redoes its initial setup -- only do this on an aborted launch.
Are you absolutely certain that you want to perform this action? (Type 'Yes' to confirm) Yes

  qa-sandbox-app-0:    Running bootstrap
Bootstrapping Chef on ec2-54-80-213-11.compute-1.amazonaws.com
Failed to authenticate ubuntu - trying password auth
Enter your password:

WARNING: Error running [#<Ironfan::Broker::Computer(server=#<Ironfan::Dsl::Server(name="0", components=c{ }, run_list_items=c{ role[systemwide], role[ssh], role[set_hostname], role[volumes], role[package_set], role[org_base], role[org_users], role[org_final], role[tuning], volumes::build_raid, role[app], role[qa-sandbox-cluster], role[qa-sandbox-app-facet] }, clouds=c{ ec2 }, volumes=c{ ephemeral0, ephemeral1, md0 }, security_groups=c{ }, environment=:qa, realm_name="qa", cluster_role=#Ironfan::Dsl::Role, facet_role=#Ironfan::Dsl::Role, cluster_names={:sandbox=>:sandbox}, cluster_name="sandbox", facet_name="app")>, resources=c{ client, node, machine, keypair, security_group__systemwide, security_group__ssh }, drives=c{ ephemeral0, ephemeral1, md0, root }, providers=c{ chef, iaas })>, {:ssh_user=>"ubuntu", :distro=>"ubuntu12.04-ironchef", :template_file=>false, :run_list=>["role[systemwide]", "volumes::build_raid", "role[ssh]", "role[set_hostname]", "role[volumes]", "role[org_base]", "role[org_users]", "role[app]", "role[package_set]", "role[org_final]", "role[tuning]", "role[qa-sandbox-cluster]", "role[qa-sandbox-app-facet]"], :first_boot_attributes=>{}, :host_key_verify=>true, :verbosity=>0, :color=>true, :editor=>nil, :format=>"summary", :bootstrap_runs_chef_client=>true, :cloud=>true, :dry_run=>false, :config_file=>"/Users/wilton/Documents/github/ironfan_homebase/.chef/knife.rb", :computer=>#<Ironfan::Broker::Computer(server=#<Ironfan::Dsl::Server(name="0", components=c{ }, run_list_items=c{ role[systemwide], role[ssh], role[set_hostname], role[volumes], role[package_set], role[org_base], role[org_users], role[org_final], role[tuning], volumes::build_raid, role[app], role[qa-sandbox-cluster], role[qa-sandbox-app-facet] }, clouds=c{ ec2 }, volumes=c{ ephemeral0, ephemeral1, md0 }, security_groups=c{ }, environment=:qa, realm_name="qa", cluster_role=#Ironfan::Dsl::Role, facet_role=#Ironfan::Dsl::Role, cluster_names={:sandbox=>:sandbox}, cluster_name="sandbox", facet_name="app")>, resources=c{ client, node, machine, keypair, security_group__systemwide, security_group__ssh }, drives=c{ ephemeral0, ephemeral1, md0, root }, providers=c{ chef, iaas })>, :server=>#<Ironfan::Dsl::Server(name="0", components=c{ }, run_list_items=c{ role[systemwide], role[ssh], role[set_hostname], role[volumes], role[package_set], role[org_base], role[org_users], role[org_final], role[tuning], volumes::build_raid, role[app], role[qa-sandbox-cluster], role[qa-sandbox-app-facet] }, clouds=c{ ec2 }, volumes=c{ ephemeral0, ephemeral1, md0 }, security_groups=c{ }, environment=:qa, realm_name="qa", cluster_role=#Ironfan::Dsl::Role, facet_role=#Ironfan::Dsl::Role, cluster_names={:sandbox=>:sandbox}, cluster_name="sandbox", facet_name="app")>, :attribute=>nil, :identity_file=>"/Users/wilton/Documents/github/ironfan_homebase/knife/credentials/ec2_keys/sandbox.pem", :use_sudo=>true, :chef_node_name=>"qa-sandbox-app-0", :client_key=>
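
In other words, a rough illustration of the mismatch (paths reflect my homebase layout):

# what `knife cluster launch` created vs. what bootstrap resolved as :identity_file in the dump above
realm_name, cluster_name = 'qa', 'sandbox'
expected_key = "knife/credentials/ec2_keys/#{realm_name}-#{cluster_name}.pem"  # qa-sandbox.pem, written at launch
resolved_key = "knife/credentials/ec2_keys/#{cluster_name}.pem"                # sandbox.pem, what bootstrap asks for
puts "bootstrap resolves #{resolved_key}, but launch wrote #{expected_key}"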
