Category Archives: openstack

Difference between neutron LBaaS v1 and LBaaS v2 ?

LBaaS v2 is not a new topic anymore; most customers are switching from LBaaS v1 to LBaaS v2. I have written blog posts in the past about configuring both; in case you missed them, they are located at LBaaSv1 , LBaaSv2

Still, in Red Hat OpenStack there is no HA functionality for the load balancer itself. This means that if your load balancer service is running on a controller node in an HA setup and that node goes down, we have to fix things manually. There are some articles on the internet that make LBaaS HA work using workarounds, but I have never tried them.

In this post I am going to show the improvements of LBaaS v2 over LBaaS v1. I will also shed some light on the Octavia project, which can provide HA capabilities for the load balancing service; basically it is used for Elastic Load Balancing.

Let’s start with a comparison of LBaaS v2 and LBaaS v1.

LBaaS v1 provides capabilities like :

  • L4 Load balancing
  • Session persistence, including cookie-based
  • Cookie insertion
  • Driver interface for 3rd parties.

Basic flow of a request in LBaaS v1 :

Request —> VIP —> Pool [Optional Health Monitor] —> Members [Backend instances]

[Image: LBaaS v1 request flow]

Missing features :

  • L7 content switching [important feature]
  • Multiple TCP ports per load balancer
  • TLS Termination at load balancer to avoid the load on instances.
  • Load balancer running inside instances.

LBaaS v2 was introduced in the Kilo release. At that time it did not have features like L7 content switching, pool sharing, and single-call load balancer creation [creating the load balancer in a single API call]. L7 and single-call create were added in Liberty, and the pool sharing feature was introduced in Mitaka.

Basic flow of a request in LBaaS v2 :

Request —> VIP —> Listeners —> Pool [Optional Health Monitor] —> Members [Backend instances]

[Image: LBaaS v2 request flow]

Let’s see what components/changes have been made in the newer version to make the missing features available :

  1. L7 Content switching

Why we require this feature :

A layer 7 load balancer consists of a listener that accepts requests on behalf of a number of back-end pools and distributes those requests based on policies that use application data to determine which pools should service any given request. This allows for the application infrastructure to be specifically tuned/optimized to serve specific types of content. For example, one group of back-end servers (pool) can be tuned to serve only images, another for execution of server-side scripting languages like PHP and ASP, and another for static content such as HTML, CSS, and JavaScript.

This feature is introduced by adding an additional component, the “listener”, to the LBaaS v2 architecture. We can create policies and then attach rules to a policy to get L7 load balancing; a minimal command sketch follows. A very informative article about L7 content switching is available at link ; it covers a lot of practical scenarios.
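A rough sketch of the CLI workflow on a release that ships L7 support (the pool, listener and policy names here are made up, and the exact flag spellings can differ between neutron client versions):

~~~
# Policy on an existing listener: redirect matching requests to a dedicated pool
# (other actions are REJECT and REDIRECT_TO_URL)
neutron lbaas-l7policy-create --name images-policy \
    --action REDIRECT_TO_POOL --redirect-pool images_pool \
    --listener listener1

# Rule attached to the policy: match any request whose path starts with /images
neutron lbaas-l7rule-create --type PATH --compare-type STARTS_WITH \
    --value /images images-policy
~~~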

2. Multiple TCP ports per load balancer

In LBaaS v1 we could have only one TCP port, like 80 or 443, at the load balancer associated with the VIP (Virtual IP); we couldn’t have two ports/protocols associated with the VIP, which means you could load balance either HTTP traffic or HTTPS, not both. This limit has been lifted in LBaaS v2, as we can now have multiple ports associated with a single VIP.

It can be done with pool sharing or without pool sharing; a command sketch follows the figures below.

With pool sharing :

[Image: multiple listeners with pool sharing]

Without Pool Sharing :

[Image: multiple listeners without pool sharing]
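A minimal sketch of how multiple ports on one VIP look on the CLI, assuming a load balancer named lb1 already exists (the names are made up; the commands mirror the ones used in the LBaaS v2 configuration post later on):

~~~
# Two listeners on the same VIP: HTTP on 80 and HTTPS pass-through on 443
neutron lbaas-listener-create --loadbalancer lb1 --protocol HTTP --protocol-port 80 --name lb1-http
neutron lbaas-listener-create --loadbalancer lb1 --protocol HTTPS --protocol-port 443 --name lb1-https

# Without pool sharing (pre-Mitaka) each listener needs its own pool
neutron lbaas-pool-create --lb-algorithm ROUND_ROBIN --listener lb1-http --protocol HTTP --name pool-http
neutron lbaas-pool-create --lb-algorithm ROUND_ROBIN --listener lb1-https --protocol HTTPS --name pool-https
~~~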

3. TLS Termination at load balancer to avoid the load on instances.

We can have TLS termination at the load balancer level instead of terminating at the backend servers. It reduces the load on the backend servers, and it also makes L7 content switching possible because the load balancer sees the decrypted traffic. Barbican containers are used to hold the certificates for termination at the load balancer level.
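A rough outline of the workflow, assuming the old python-barbicanclient CLI; the secret/container names are made up and the exact flag names (especially the TLS container reference on the listener) vary between releases, so treat this as a sketch rather than exact syntax:

~~~
# Store the certificate and key as Barbican secrets, then group them in a
# certificate container
barbican secret store --name lb-cert --payload-content-type='text/plain' --payload="$(cat server.crt)"
barbican secret store --name lb-key --payload-content-type='text/plain' --payload="$(cat server.key)"
barbican secret container create --name tls_container --type certificate \
    --secret "certificate=<cert-secret-ref>" --secret "private_key=<key-secret-ref>"

# Create a TERMINATED_HTTPS listener that references the container
neutron lbaas-listener-create --loadbalancer lb1 --protocol TERMINATED_HTTPS \
    --protocol-port 443 --default-tls-container-ref <container-ref> --name lb1-tls
~~~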

4. Load balancer running inside instances.

I have not seen this implemented without Octavia, which uses “amphora” instances to run the load balancer.

Important : both load balancer versions can’t be run simultaneously.

As promised at the beginning of the article, let’s see what capabilities “Octavia” adds to LBaaS v2.

Here is the architecture of Octavia :

[Image: Octavia architecture]

The Octavia API lacks an authentication facility, hence it accepts API calls from neutron instead of exposing its API directly.

As I mentioned earlier, in the case of Octavia the load balancer runs inside nova instances, hence it needs to communicate with components like nova and neutron to spawn the instances in which the load balancer [haproxy] runs. Okay, what about the other pieces required to spawn an instance :

  • Create amphora disk image using OpenStack diskimage-builder.
  • Create a Nova flavor for the amphorae.
  • Add amphora disk image to glance.
  • Tag the above glance disk image with ‘amphora’.

But now the amphora instance becomes a single point of failure, and its capacity to handle load is limited. From Mitaka onwards we can run a single load balancer replicated across two instances in active/passive mode, exchanging heartbeats using VRRP. If one instance goes down, the other starts serving the load balancer service.

So what’s the major advantage of Octavia? Here comes the term Elastic Load Balancing (ELB). Currently a VIP is associated with a single load balancer (a 1:1 relation), but in the case of ELB the relation between the VIP and load balancers is 1:N; the VIP distributes the incoming traffic over a pool of “amphora” instances.

In ELB, traffic is distributed at two levels :

  1. VIP to pool of amphora instances.
  2. amphora instances to back-end instances.

We can also use Heat orchestration with Ceilometer alarm functionality to manage the number of instances in the ‘amphora’ pool.

Combining the power of a “pool of amphora instances” and “failover”, we can have a robust N+1 topology in which, if any VM from the pool of amphora instances fails, it gets replaced by a standby VM.

 

I hope this article sheds some light on the jargon of the neutron LBaaS world. :-)

How to make auto-scaling work for nova with heat and ceilometer ?

I had been trying to test this feature for a very long time but never got a chance to dig into it. Today I got an opportunity to work on it. I prepared a packstack OSP 7 [Kilo] setup and referred to the wonderful official Red Hat documentation [1] to make this work.

In this article I am going to cover only the scale-up scenario.

Step 1 : While installing packstack we need to set the below options to “y” so that the required components are installed.

# egrep "HEAT|CEILOMETER" /root/answer.txt | grep INSTALL
CONFIG_CEILOMETER_INSTALL=y
CONFIG_HEAT_INSTALL=y
CONFIG_HEAT_CLOUDWATCH_INSTALL=y
CONFIG_HEAT_CFN_INSTALL=y

If you have already deployed a packstack setup, no need to worry; just enable these options in the answer.txt file which was used for creating the existing setup and run the packstack installation command again.

Step 2 : I created three templates to make this work.

cirros.yaml – Contains the definition for spawning an instance. A user-data script keeps the CPU busy so that the cpu_util alarm gets triggered.

environment.yaml – Environment file that maps the OS::Nova::Server::Cirros resource type to the cirros.yaml template.

sample.yaml – Contains the main logic for scaling up.

# cat cirros.yaml
heat_template_version: 2014-10-16
description: A simple server.
resources:
  server:
    type: OS::Nova::Server
    properties:
      #block_device_mapping:
      #  - device_name: vda
      #    delete_on_termination: true
      #    volume_id: { get_resource: volume }
      image: cirros
      flavor: m1.tiny
      networks:
        - network: internal1
      user_data_format: RAW
      user_data: |
        #!/bin/sh
        while [ 1 ] ; do echo $((13**99)) 1>/dev/null 2>&1; done

# cat environment.yaml
resource_registry:
  "OS::Nova::Server::Cirros": "cirros.yaml"

# cat sample.yaml
heat_template_version: 2014-10-16
description: A simple auto scaling group.
resources:
  scale_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      cooldown: 60
      desired_capacity: 1
      max_size: 3
      min_size: 1
      resource:
        type: OS::Nova::Server::Cirros
  scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: scale_group }
      cooldown: 60
      scaling_adjustment: +1
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 20
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      comparison_operator: gt

 

Shedding some light on the sample.yaml file: initially I am spawning only one instance and scaling up to a maximum of 3 instances. The ceilometer threshold is set to 20 (percent CPU utilization).

Step 3 : Modify the ceilometer sampling interval for cpu_util in the “/etc/ceilometer/pipeline.yaml” file. I changed this value from the default of 10 minutes to 1 minute.

- name: cpu_source
  interval: 60
  meters:
      - "cpu"
  sinks:
      - cpu_sink

Restart the OpenStack services after making this change.
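On a packstack node the quickest way I know is the openstack-service wrapper from the openstack-utils package (assuming it is installed); restarting just the ceilometer services is enough for this particular change:

~~~
# Restart only the ceilometer services so the new 60s interval is picked up
openstack-service restart ceilometer
# alternatively, restart the units directly:
# systemctl restart 'openstack-ceilometer-*'
~~~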

Step 4 : Let’s create a stack now.

[root@allinone7 VIKRANT(keystone_admin)]# heat stack-create teststack1 -f sample.yaml -e environment.yaml
+————————————–+————+——————–+———————-+
| id                                   | stack_name | stack_status       | creation_time        |
+————————————–+————+——————–+———————-+
| 0f163366-c599-4fd5-a797-86cf40f05150 | teststack1 | CREATE_IN_PROGRESS | 2016-10-10T12:02:37Z |
+————————————–+————+——————–+———————-+

The instance is spawned successfully and the alarm is created once the heat stack creation completes.

[root@allinone7 VIKRANT(keystone_admin)]# nova list
+————————————–+——————————————————-+——–+————+————-+———————–+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks              |
+————————————–+——————————————————-+——–+————+————-+———————–+
| 845abae0-9834-443b-82ec-d55bce2243ab | te-yvfr-ws5tn26msbub-zpeebwwwa67w-server-pxu6pqcssmmb | ACTIVE | -          | Running     | internal1=10.10.10.53 |
+————————————–+——————————————————-+——–+————+————-+———————–+

[root@allinone7 VIKRANT(keystone_admin)]# ceilometer alarm-list
+————————————–+—————————————-+——————-+———-+———+————+——————————–+——————+
| Alarm ID                             | Name                                   | State             | Severity | Enabled | Continuous | Alarm condition                | Time constraints |
+————————————–+—————————————-+——————-+———-+———+————+——————————–+——————+
| 7746e457-9114-4cc6-8408-16b14322e937 | teststack1-cpu_alarm_high-sctookginoqz | insufficient data | low      | True    | True       | cpu_util > 20.0 during 1 x 60s | None             |
+————————————–+—————————————-+——————-+———-+———+————+——————————–+——————+

Checking the events in heat-engine.log file.

~~~
2016-10-10 12:02:37.499 22212 INFO heat.engine.stack [-] Stack CREATE IN_PROGRESS (teststack1): Stack CREATE started
2016-10-10 12:02:37.510 22212 INFO heat.engine.resource [-] creating AutoScalingResourceGroup “scale_group” Stack “teststack1” [0f163366-c599-4fd5-a797-86cf40f05150]
2016-10-10 12:02:37.558 22215 INFO heat.engine.service [req-681ddfb8-3ca6-4ecb-a8af-f35ceb358138 f6a950be30fd41488cf85b907dfa41b5 41294ddb9af747c8b46dc258c3fa61e1] Creating stack teststack1-scale_group-ujt3ixg3yvfr
2016-10-10 12:02:37.572 22215 INFO heat.engine.resource [req-681ddfb8-3ca6-4ecb-a8af-f35ceb358138 f6a950be30fd41488cf85b907dfa41b5 41294ddb9af747c8b46dc258c3fa61e1] Validating TemplateResource “ws5tn26msbub”
2016-10-10 12:02:37.585 22215 INFO heat.engine.resource [req-681ddfb8-3ca6-4ecb-a8af-f35ceb358138 f6a950be30fd41488cf85b907dfa41b5 41294ddb9af747c8b46dc258c3fa61e1] Validating Server “server”
2016-10-10 12:02:37.639 22215 INFO heat.engine.stack [-] Stack CREATE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr): Stack CREATE started
2016-10-10 12:02:37.650 22215 INFO heat.engine.resource [-] creating TemplateResource “ws5tn26msbub” Stack “teststack1-scale_group-ujt3ixg3yvfr” [0c311ad5-cb76-4956-b038-ab2e44721cf1]
2016-10-10 12:02:37.699 22214 INFO heat.engine.service [req-681ddfb8-3ca6-4ecb-a8af-f35ceb358138 f6a950be30fd41488cf85b907dfa41b5 41294ddb9af747c8b46dc258c3fa61e1] Creating stack teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w
2016-10-10 12:02:37.712 22214 INFO heat.engine.resource [req-681ddfb8-3ca6-4ecb-a8af-f35ceb358138 f6a950be30fd41488cf85b907dfa41b5 41294ddb9af747c8b46dc258c3fa61e1] Validating Server “server”
2016-10-10 12:02:38.004 22214 INFO heat.engine.stack [-] Stack CREATE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w): Stack CREATE started
2016-10-10 12:02:38.022 22214 INFO heat.engine.resource [-] creating Server “server” Stack “teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w” [11dbdc5d-dc67-489b-9738-7ee6984c286e]
2016-10-10 12:02:42.965 22213 INFO heat.engine.service [req-e6410d2a-6f85-404d-a675-897c8a254241 – -] Service 13d36b70-a2f6-4fec-8d2a-c904a2f9c461 is updated
2016-10-10 12:02:42.969 22214 INFO heat.engine.service [req-531481c7-5fd1-4c25-837c-172b2b7c9423 – -] Service 71fb5520-7064-4cee-9123-74f6d7b86955 is updated
2016-10-10 12:02:42.970 22215 INFO heat.engine.service [req-6fe46418-bf3d-4555-a77c-8c800a414ba8 – -] Service f0706340-54f8-42f1-a647-c77513aef3a5 is updated
2016-10-10 12:02:42.971 22212 INFO heat.engine.service [req-82f5974f-0e77-4b60-ac5e-f3c849812fe1 – -] Service 083acd77-cb7f-45fc-80f0-9d41eaf2a37d is updated
2016-10-10 12:02:53.228 22214 INFO heat.engine.stack [-] Stack CREATE COMPLETE (teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w): Stack CREATE completed successfully
2016-10-10 12:02:53.549 22215 INFO heat.engine.stack [-] Stack CREATE COMPLETE (teststack1-scale_group-ujt3ixg3yvfr): Stack CREATE completed successfully
2016-10-10 12:02:53.960 22212 INFO heat.engine.resource [-] creating AutoScalingPolicy “scaleup_policy” Stack “teststack1” [0f163366-c599-4fd5-a797-86cf40f05150]
2016-10-10 12:02:55.152 22212 INFO heat.engine.resource [-] creating CeilometerAlarm “cpu_alarm_high” Stack “teststack1” [0f163366-c599-4fd5-a797-86cf40f05150]
2016-10-10 12:02:56.379 22212 INFO heat.engine.stack [-] Stack CREATE COMPLETE (teststack1): Stack CREATE completed successfully
~~~

Step 5 : Once the alarm is triggered, it will initiate the creation of one more instance.

[root@allinone7 VIKRANT(keystone_admin)]# ceilometer alarm-history 7746e457-9114-4cc6-8408-16b14322e937
+——————+—————————-+———————————————————————-+
| Type             | Timestamp                  | Detail                                                               |
+——————+—————————-+———————————————————————-+
| state transition | 2016-10-10T12:04:48.492000 | state: alarm                                                         |
| creation         | 2016-10-10T12:02:55.247000 | name: teststack1-cpu_alarm_high-sctookginoqz                         |
|                  |                            | description: Alarm when cpu_util is gt a avg of 20.0 over 60 seconds |
|                  |                            | type: threshold                                                      |
|                  |                            | rule: cpu_util > 20.0 during 1 x 60s                                 |
|                  |                            | time_constraints: None                                               |
+——————+—————————-+———————————————————————-+

Log from ceilometer log file.

~~~
From : /var/log/ceilometer/alarm-evaluator.log

2016-10-10 12:04:48.488 16550 INFO ceilometer.alarm.evaluator [-] alarm 7746e457-9114-4cc6-8408-16b14322e937 transitioning to alarm because Transition to alarm due to 1 samples outside threshold, most recent: 97.05
~~~

Step 6 : In the heat-engine.log file, we can see that the triggered alarm has started the scaleup_policy and the stack moved into the “UPDATE IN_PROGRESS” state. We see two scale-up events because two more instances get spawned; remember that we set the maximum number of instances to 3. The first instance was deployed during stack creation, and the remaining two are triggered by the alarm: at the first alarm the second instance was spawned, and since utilization stayed above the threshold for the next minute, the third instance was triggered as well.

~~~

2016-10-10 12:04:48.641 22213 INFO heat.engine.resources.openstack.heat.scaling_policy [-] Alarm scaleup_policy, new state alarm
2016-10-10 12:04:48.680 22213 INFO heat.engine.resources.openstack.heat.scaling_policy [-] scaleup_policy Alarm, adjusting Group scale_group with id teststack1-scale_group-ujt3ixg3yvfr by 1
2016-10-10 12:04:48.802 22215 INFO heat.engine.stack [-] Stack UPDATE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr): Stack UPDATE started
2016-10-10 12:04:48.858 22215 INFO heat.engine.resource [-] updating TemplateResource “ws5tn26msbub” [11dbdc5d-dc67-489b-9738-7ee6984c286e] Stack “teststack1-scale_group-ujt3ixg3yvfr” [0c311ad5-cb76-4956-b038-ab2e44721cf1]
2016-10-10 12:04:48.919 22214 INFO heat.engine.service [req-ddf93f69-5fdc-4218-a427-aae312f4a02d – 41294ddb9af747c8b46dc258c3fa61e1] Updating stack teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w
2016-10-10 12:04:48.922 22214 INFO heat.engine.resource [req-ddf93f69-5fdc-4218-a427-aae312f4a02d – 41294ddb9af747c8b46dc258c3fa61e1] Validating Server “server”
2016-10-10 12:04:49.317 22214 INFO heat.engine.stack [-] Stack UPDATE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w): Stack UPDATE started
2016-10-10 12:04:49.346 22215 INFO heat.engine.resource [-] creating TemplateResource “mmm6uxmlf3om” Stack “teststack1-scale_group-ujt3ixg3yvfr” [0c311ad5-cb76-4956-b038-ab2e44721cf1]
2016-10-10 12:04:49.366 22214 INFO heat.engine.update [-] Resource server for stack teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w updated
2016-10-10 12:04:49.405 22212 INFO heat.engine.service [req-ddf93f69-5fdc-4218-a427-aae312f4a02d – 41294ddb9af747c8b46dc258c3fa61e1] Creating stack teststack1-scale_group-ujt3ixg3yvfr-mmm6uxmlf3om-m5idcplscfcx
2016-10-10 12:04:49.419 22212 INFO heat.engine.resource [req-ddf93f69-5fdc-4218-a427-aae312f4a02d – 41294ddb9af747c8b46dc258c3fa61e1] Validating Server “server”
2016-10-10 12:04:49.879 22212 INFO heat.engine.stack [-] Stack CREATE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr-mmm6uxmlf3om-m5idcplscfcx): Stack CREATE started
2016-10-10 12:04:49.889 22212 INFO heat.engine.resource [-] creating Server “server” Stack “teststack1-scale_group-ujt3ixg3yvfr-mmm6uxmlf3om-m5idcplscfcx” [36c613d1-b89f-4409-b965-521b1ae2cbf3]
2016-10-10 12:04:50.406 22214 INFO heat.engine.stack [-] Stack DELETE IN_PROGRESS (teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w): Stack DELETE started
2016-10-10 12:04:50.443 22214 INFO heat.engine.stack [-] Stack DELETE COMPLETE (teststack1-scale_group-ujt3ixg3yvfr-ws5tn26msbub-zpeebwwwa67w): Stack DELETE completed successfully
2016-10-10 12:04:50.930 22215 INFO heat.engine.update [-] Resource ws5tn26msbub for stack teststack1-scale_group-ujt3ixg3yvfr updated
2016-10-10 12:05:07.865 22212 INFO heat.engine.stack [-] Stack CREATE COMPLETE (teststack1-scale_group-ujt3ixg3yvfr-mmm6uxmlf3om-m5idcplscfcx): Stack CREATE completed successfully

~~~

Step 7 : We can look at the event list of the created stack for a better understanding.

[root@allinone7 VIKRANT(keystone_admin)]# heat event-list 0f163366-c599-4fd5-a797-86cf40f05150
+—————-+————————————–+———————————————————————————————————————————-+——————–+———————-+
| resource_name  | id                                   | resource_status_reason                                                                                                           | resource_status    | event_time           |
+—————-+————————————–+———————————————————————————————————————————-+——————–+———————-+
| teststack1     | 6ddf5a0c-c345-43ad-8c20-54d67cf8e2a6 | Stack CREATE started                                                                                                             | CREATE_IN_PROGRESS | 2016-10-10T12:02:37Z |
| scale_group    | 528ed942-551d-482b-95ee-ab72a6f59280 | state changed                                                                                                                    | CREATE_IN_PROGRESS | 2016-10-10T12:02:37Z |
| scale_group    | 9d7cf5f4-027f-4c97-92f2-86d208a4be77 | state changed                                                                                                                    | CREATE_COMPLETE    | 2016-10-10T12:02:53Z |
| scaleup_policy | a78e9577-1251-4221-a1c7-9da4636550b7 | state changed                                                                                                                    | CREATE_IN_PROGRESS | 2016-10-10T12:02:53Z |
| scaleup_policy | cb690cd5-5243-47f0-8f9f-2d88ca13780f | state changed                                                                                                                    | CREATE_COMPLETE    | 2016-10-10T12:02:55Z |
| cpu_alarm_high | 9addbccf-cc18-410a-b1f6-401b56b09065 | state changed                                                                                                                    | CREATE_IN_PROGRESS | 2016-10-10T12:02:55Z |
| cpu_alarm_high | ed9a5f49-d4ea-4f68-af9e-355d2e1b9113 | state changed                                                                                                                    | CREATE_COMPLETE    | 2016-10-10T12:02:56Z |
| teststack1     | 14be65fc-1b33-478e-9f81-413b694c8312 | Stack CREATE completed successfully                                                                                              | CREATE_COMPLETE    | 2016-10-10T12:02:56Z |
| scaleup_policy | e65de9b1-6854-4f27-8256-f5f9a13890df | alarm state changed from insufficient data to alarm (Transition to alarm due to 1 samples outside threshold, most recent: 97.05) | SIGNAL_COMPLETE    | 2016-10-10T12:05:09Z |
| scaleup_policy | a499bfef-1824-4ef3-8c7f-e86cf14e11d6 | alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 95.7083333333)      | SIGNAL_COMPLETE    | 2016-10-10T12:07:14Z |
| scaleup_policy | 2a801848-bf9f-41e0-acac-e526d60f5791 | alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 95.0833333333)      | SIGNAL_COMPLETE    | 2016-10-10T12:08:55Z |
| scaleup_policy | f57fda03-2017-4408-b4b9-f302a1fad430 | alarm state changed from alarm to alarm (Remaining as alarm due to 1 samples outside threshold, most recent: 95.1444444444)      | SIGNAL_COMPLETE    | 2016-10-10T12:10:55Z |
+—————-+————————————–+———————————————————————————————————————————-+——————–+———————-+

We can see three instances running.

[root@allinone7 VIKRANT(keystone_admin)]# nova list
+————————————–+——————————————————-+——–+————+————-+———————–+
| ID                                   | Name                                                  | Status | Task State | Power State | Networks              |
+————————————–+——————————————————-+——–+————+————-+———————–+
| 041345cc-4ebf-429c-ab2b-ef0f757bfeaa | te-yvfr-mmm6uxmlf3om-m5idcplscfcx-server-hxaqqmxzv4jp | ACTIVE | -          | Running     | internal1=10.10.10.54 |
| bebbd5a0-e0b2-40b4-8810-978b86626267 | te-yvfr-r7vn2e5c34b6-by4oq22vnxbo-server-ktblt3evhvd6 | ACTIVE | -          | Running     | internal1=10.10.10.55 |
| 845abae0-9834-443b-82ec-d55bce2243ab | te-yvfr-ws5tn26msbub-zpeebwwwa67w-server-pxu6pqcssmmb | ACTIVE | -          | Running     | internal1=10.10.10.53 |
+————————————–+——————————————————-+——–+————+————-+———————–+

 

[1] https://access.redhat.com/documentation/en/red-hat-enterprise-linux-openstack-platform/7/single/auto-scaling-for-compute/#example_auto_scaling_based_on_cpu_usage

How to configure lbaasv2 in openstack Kilo packstack setup ?

In this article I am going to show the configuration of LBaaS v2 on an OpenStack Kilo packstack setup. By default the LBaaS v1 configuration is present; we have to modify some files to make LBaaS v2 work.

First of all, I suggest you refer to the presentation below to understand the difference between LBaaS v1 and LBaaS v2. Most importantly, slide number 9.

https://www.openstack.org/assets/Uploads/LBaaS.v2.Liberty.and.Beyond.pdf

Step 1 : Ensure that the OpenStack packstack setup was installed with LBaaS enabled.

~~~

grep LBAAS /root/answer.txt
CONFIG_LBAAS_INSTALL=y

~~~

Step 2 : Make the below changes. Before making any change I suggest you take a backup of the configuration files.

a) Changes made in /etc/neutron/neutron.conf 

~~~

diff /etc/neutron/neutron.conf /var/tmp/LBAAS_BACKUP/neutron.conf
79,80c79
< #service_plugins =neutron.services.loadbalancer.plugin.LoadBalancerPlugin,neutron.services.l3_router.l3_router_plugin.L3RouterPlugin
< service_plugins = neutron_lbaas.services.loadbalancer.plugin.LoadBalancerPluginv2,neutron.services.l3_router.l3_router_plugin.L3RouterPlugin

> service_plugins =neutron.services.loadbalancer.plugin.LoadBalancerPlugin,neutron.services.l3_router.l3_router_plugin.L3RouterPlugin

~~~

b) Changes made in /etc/neutron/neutron_lbaas.conf 

~~~

diff /etc/neutron/neutron_lbaas.conf /var/tmp/LBAAS_BACKUP/neutron_lbaas.conf
53,54c53
< #service_provider=LOADBALANCER:Haproxy:neutron_lbaas.services.loadbalancer.drivers.haproxy.plugin_driver.HaproxyOnHostPluginDriver:default
< service_provider = LOADBALANCERV2:Haproxy:neutron_lbaas.drivers.haproxy.plugin_driver.HaproxyOnHostPluginDriver:default

> service_provider=LOADBALANCER:Haproxy:neutron_lbaas.services.loadbalancer.drivers.haproxy.plugin_driver.HaproxyOnHostPluginDriver:default

~~~

c) Changes made in /etc/neutron/lbaas_agent.ini

~~~

diff /etc/neutron/lbaas_agent.ini /var/tmp/LBAAS_BACKUP/lbaas_agent.ini
31,32c31
< #device_driver = neutron.services.loadbalancer.drivers.haproxy.namespace_driver.HaproxyNSDriver
< device_driver = neutron_lbaas.drivers.haproxy.namespace_driver.HaproxyNSDriver

> device_driver = neutron.services.loadbalancer.drivers.haproxy.namespace_driver.HaproxyNSDriver

~~~

Step 3 : Run the below commands to activate the LBaaS v2 agent.

# neutron-db-manage --service lbaas upgrade head
# systemctl disable neutron-lbaas-agent.service
# systemctl stop neutron-lbaas-agent.service
# systemctl restart neutron-server.service
# systemctl enable neutron-lbaasv2-agent.service
# systemctl start neutron-lbaasv2-agent.service

Verify that lbaasv2 agent is running.
ps -ef | grep 'neutron-lbaasv2'  |grep -v grep
neutron  24609     1  0 06:01 ?        00:00:14 /usr/bin/python2 /usr/bin/neutron-lbaasv2-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /usr/share/neutron/neutron-lbaas-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/lbaas_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-lbaasv2-agent --log-file /var/log/neutron/lbaas-agent.log

Step 4 : Creating loadbalancer using LbaaSv2.

a) Create loadbalancer.

[root@allinone-7 ~(keystone_admin)]# neutron lbaas-loadbalancer-create --name Snet_test_1 9bed29a5-8cb3-436a-89fc-6ca6a8467c03
Created a new loadbalancer:
+———————+————————————–+
| Field               | Value                                |
+———————+————————————–+
| admin_state_up      | True                                 |
| description         |                                      |
| id                  | f0513999-9b07-48c4-b8b8-645b322a0e78 |
| listeners           |                                      |
| name                | Snet_test_1                          |
| operating_status    | OFFLINE                              |
| provider            | haproxy                              |
| provisioning_status | PENDING_CREATE                       |
| tenant_id           | 90686d89a72143179f7608cb9b6d0898     |
| vip_address         | 10.10.1.9                            |
| vip_port_id         | 6d95724a-1232-45ba-8992-7ffc1983b2b9 |
| vip_subnet_id       | 9bed29a5-8cb3-436a-89fc-6ca6a8467c03 |
+———————+————————————–+

b) Creating listener.

[root@allinone-7 ~(keystone_admin)]# neutron lbaas-listener-create --loadbalancer 9455e883-2fb2-49d8-8468-2b24003de808 --protocol TCP --protocol-port 80 --name Snet_test_1_80
Created a new listener:
+————————–+————————————————+
| Field                    | Value                                          |
+————————–+————————————————+
| admin_state_up           | True                                           |
| connection_limit         | -1                                             |
| default_pool_id          |                                                |
| default_tls_container_id |                                                |
| description              |                                                |
| id                       | 78bc2864-b962-4483-a287-80afe45ec6ec           |
| loadbalancers            | {"id": "f0513999-9b07-48c4-b8b8-645b322a0e78"} |
| name                     | Snet_test_1_80                                 |
| protocol                 | TCP                                            |
| protocol_port            | 80                                             |
| sni_container_ids        |                                                |
| tenant_id                | 90686d89a72143179f7608cb9b6d0898               |
+————————–+————————————————+

c) Creating pool in listener.

[root@allinone-7 ~(keystone_admin)]# neutron lbaas-pool-create --lb-algorithm ROUND_ROBIN --listener Snet_test_1_80 --protocol TCP --name Snet_test_1_pool80
Created a new pool:
+———————+————————————————+
| Field               | Value                                          |
+———————+————————————————+
| admin_state_up      | True                                           |
| description         |                                                |
| healthmonitor_id    |                                                |
| id                  | 48d9b744-c7d5-41c0-873e-5d477a1f7853           |
| lb_algorithm        | ROUND_ROBIN                                    |
| listeners           | {"id": "78bc2864-b962-4483-a287-80afe45ec6ec"} |
| members             |                                                |
| name                | Snet_test_1_pool80                             |
| protocol            | TCP                                            |
| session_persistence |                                                |
| tenant_id           | 90686d89a72143179f7608cb9b6d0898               |
+———————+————————————————+

d) Creating members using below commands.

~~~
# neutron lbaas-member-create --subnet 9bed29a5-8cb3-436a-89fc-6ca6a8467c03 --address 10.10.1.5 --protocol-port 80 Snet_test_1_pool80
# neutron lbaas-member-create --subnet 9bed29a5-8cb3-436a-89fc-6ca6a8467c03 --address 10.10.1.6 --protocol-port 80 Snet_test_1_pool80
~~~

e) It’s working fine in a round-robin manner. I have used only the private range here; I am running curl from inside the namespace, hence I am able to reach the private address.

~~~
[root@allinone-7 ~(keystone_admin)]# ip netns exec qdhcp-049b58b3-716f-4445-ae24-32a23f8523dd bash
[root@allinone-7 ~(keystone_admin)]# for i in {1..5} ; do curl  10.10.1.9 ; done
web2
web1
web2
web1
web2
~~~

f) Even when using the public IP I am able to access them. Let’s come out of the namespace and verify the same by accessing the public IP.

~~~
[root@allinone-7 ~(keystone_admin)]# exit
[root@allinone-7 ~(keystone_admin)]# for i in {1..5} ; do curl  192.168.122.4 ; done
web1
web2
web1
web2
web1
~~~

Troubleshooting Tips :

  • Make sure the httpd service is running in the instances.
  • Make sure iptables is not blocking the httpd traffic.
  • Make sure the SELinux context is correct on the created index.html file.
  • If you are not getting a response from curl using the load balancer IP, check whether you get a response using the instance IP directly (a quick namespace check is sketched below).
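For the last point, a quick way to test the members and the VIP from the network node is to curl them from inside one of the neutron namespaces (the namespace ID and addresses below are placeholders for the values in your setup):

~~~
# Test each back-end member directly, then the VIP itself
ip netns exec qdhcp-<network-id> curl http://<member-ip-1>
ip netns exec qdhcp-<network-id> curl http://<member-ip-2>
ip netns exec qdhcp-<network-id> curl http://<vip-address>
~~~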

Step by step configuring openstack Neutron LBaaS in packstack setup ?

In this article, I am going to show the procedure for creating an LBaaS v1 load balancer in a packstack setup using two instances.

First of all, I didn’t find any image with the HTTP package in it, hence I created my own Fedora 22 image with httpd and the cloud packages [cloud-utils, cloud-init] installed.

If you don’t install the cloud packages, you will face issues while spawning the instances; for example, routes will not be configured in the instance and eventually you will not be able to reach it.

Step 1 : Download a Fedora 22 ISO and launch a KVM guest using that ISO. Install httpd and the cloud packages in it.

Step 2 : Power off the KVM guest and locate the qcow2 file corresponding to it using the below command.

# virsh domblklist myimage

Here myimage is the KVM guest name.

Step 3 : Reset the image so that it becomes clean for use in the OpenStack environment.

# virt-sysprep -d myimage

Step 4 : Use the qcow2 path found in Step 2 to compress the qcow2 image.

# ls -lsh /home/vaggarwa/VirtualMachines/fedora-unknown.qcow2
1.8G -rw------- 1 qemu qemu 8.1G Mar 25 11:56 /home/vaggarwa/VirtualMachines/fedora-unknown.qcow2

# virt-sparsify --compress /home/vaggarwa/VirtualMachines/fedora-unknown.qcow2 fedora22.qcow2

# ll -lsh fedora22.qcow2
662M -rw-r--r-- 1 root root 664M Mar 25 11:59 fedora22.qcow2

Notice the difference before and after compression. Upload this image to glance (a sketch follows).
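A minimal sketch of the upload, assuming the Kilo-era glance v1 CLI (the image name is arbitrary):

~~~
# Upload the compressed qcow2 to glance
glance image-create --name fedora22-http --disk-format qcow2 \
    --container-format bare --is-public True --file fedora22.qcow2
~~~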

Step 5 : Spawn two instances, web1 and web2. While spawning them I am injecting index.html files containing web1 and web2 respectively.

# nova boot --flavor m1.custom1 --security-groups lbsg --image c3dedff2-f0a9-4aa1-baa9-9cdc08860f6d --file /var/www/html/index.html=/root/index1.html --nic net-id=9ec24eff-f470-4d4e-8c23-9eeb41dfe749 web1

# nova boot --flavor m1.custom1 --security-groups lbsg --image c3dedff2-f0a9-4aa1-baa9-9cdc08860f6d --file /var/www/html/index.html=/root/index2.html --nic net-id=9ec24eff-f470-4d4e-8c23-9eeb41dfe749 web2

Note : I have created a new security group lbsg to allow HTTP/HTTPS traffic

Step 6 : Once the instances are spawned, you need to log in to each instance and fix the SELinux context of the index.html file. If you want, you can disable SELinux in Step 1 itself to avoid this step.

# ip netns exec qdhcp-9ec24eff-f470-4d4e-8c23-9eeb41dfe749 ssh root@10.10.1.17

# restorecon -Rv /var/www/html/index.html

Step 7 : Create a pool which distributes the traffic in a ROUND_ROBIN manner.

# neutron lb-pool-create --name lb1 --lb-method ROUND_ROBIN --protocol HTTP --subnet 26316551-44d7-4326-b011-a519b556eda2

Note : This pool and instances are spawned using internal network.

Step 8 : Add the two instances as members of the pool.

# neutron lb-member-create --address 10.10.1.17 --protocol-port 80 lb1

# neutron lb-member-create --address 10.10.1.18 --protocol-port 80 lb1

Step 9 : Create a virtual IP on the internal network. A port gets created corresponding to the virtual IP; we will be attaching the floating IP to that port (see the sketch after the command below).

# neutron lb-vip-create --name lb1-vip --protocol-port 80 --protocol HTTP --subnet 26316551-44d7-4326-b011-a519b556eda2 lb1
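If a floating IP is not allocated yet, the VIP port can be located and a floating IP created roughly like this (the network name and VIP address are placeholders); the association itself is done in the next step:

~~~
# Find the port created for the VIP, then allocate a floating IP from the
# external network
neutron port-list | grep <vip-address>
neutron floatingip-create <external-network>
~~~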

Step 10 : Attach the floating IP to the newly created VIP port.

# neutron floatingip-associate 09bdbe29-fa85-4110-8dd2-50d274412d8e 25b892cb-44c3-49e2-88b3-0aec7ec8a026

Step 11 : LBaaS also creates a new namespace.

# ip netns list
qlbaas-b8daa41a-3e2a-408e-862b-20d3c52b1764
qrouter-5f7f711c-be0a-4dd0-ba96-191ef760cef7
qdhcp-9ec24eff-f470-4d4e-8c23-9eeb41dfe749

# ip netns exec qlbaas-b8daa41a-3e2a-408e-862b-20d3c52b1764 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
23: tap25b892cb-44: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether fa:16:3e:ae:0b:2a brd ff:ff:ff:ff:ff:ff
inet 10.10.1.19/24 brd 10.10.1.255 scope global tap25b892cb-44
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:feae:b2a/64 scope link
valid_lft forever preferred_lft forever

Step 12 : In my case the floating IP was 192.168.122.3. I ran curl against that IP, and it confirmed that responses come from both pool members in a ROUND_ROBIN manner.

# for i in {1..5} ; do curl  192.168.122.3 ; done

web1
web2
web1
web2
web1

Flat Provider network with OVS

In this article, I am going to show the configuration of a flat provider network. It helps to avoid NAT, which in turn improves performance. Most importantly, the compute node can reach the external world directly, skipping the network node.

I have referred to the below link for the configuration and for understanding the setup.

http://docs.openstack.org/liberty/networking-guide/scenario-provider-ovs.html

I am showing the setup from packstack all-in-one.

Step 1 : As we are not going to use any tenant network here, I left that blank. flat is mentioned in type_drivers as my external network is of flat type. If you are using a VLAN provider network, you can adjust it accordingly.

egrep -v "^(#|$)" /etc/neutron/plugin.ini
[ml2]
type_drivers = flat
tenant_network_types =
mechanism_drivers =openvswitch
[ml2_type_flat]
flat_networks = external
[ml2_type_vlan]
[ml2_type_gre]
[ml2_type_vxlan]
[securitygroup]
enable_security_group = True

I will be creating a network with the physical network name external, hence I mentioned the same in flat_networks. Comment out the default vxlan settings.

Step 2 : Our ML2 plugin file is configured; now it’s the turn of the openvswitch configuration file.

As I will be creating a network with the physical network name external, I mentioned the same in bridge_mappings. br-ex is the external bridge to which the physical port (interface) is assigned (see the sketch after the configuration below). I have disabled tunneling.

egrep -v "^(#|$)" /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
[ovs]
enable_tunneling = False
integration_bridge = br-int
tunnel_bridge = br-tun
local_ip =192.168.122.163
bridge_mappings = external:br-ex
[agent]
polling_interval = 2
tunnel_types =vxlan
vxlan_udp_port =4789
l2_population = False
arp_responder = False
enable_distributed_routing = False
[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
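If br-ex does not exist yet, it can be created and the physical interface attached with ovs-vsctl; a minimal sketch (eth1 is an assumption, use the external-facing NIC of your node):

~~~
# Create the external bridge and plug the physical NIC into it
ovs-vsctl add-br br-ex
ovs-vsctl add-port br-ex eth1
# Verify the bridge layout
ovs-vsctl show
~~~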

Step 3 : Creating external network.

[root@allinone7 ~(keystone_admin)]# neutron net-create external1 --shared --provider:physical_network external --provider:network_type flat
Created a new network:
+—————————+————————————–+
| Field                     | Value                                |
+—————————+————————————–+
| admin_state_up            | True                                 |
| id                        | 6960a06c-5352-419f-8455-80c4d43dedf8 |
| name                      | external1                            |
| provider:network_type     | flat                                 |
| provider:physical_network | external                             |
| provider:segmentation_id  |                                      |
| router:external           | False                                |
| shared                    | True                                 |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | a525deb290124433b80996d4f90b42ba     |
+—————————+————————————–+

As I am using a flat network type, I mentioned the same for network_type; if your external network is a VLAN provider network, you need to add one more parameter, the segmentation ID (a sketch follows). It’s important to use the same physical_network name which you used in the Step 1 and Step 2 configuration files.
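For reference, the VLAN flavour of the same command would look roughly like this (segmentation ID 100 is an assumption, and the [ml2_type_vlan] section would also need a matching network_vlan_ranges entry):

~~~
neutron net-create external1 --shared \
    --provider:physical_network external \
    --provider:network_type vlan \
    --provider:segmentation_id 100
~~~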

Step 4 : Creating subnet. My external network is 192.168.122.0/24
[root@allinone7 ~(keystone_admin)]# neutron net-list
+————————————–+———–+———+
| id                                   | name      | subnets |
+————————————–+———–+———+
| 6960a06c-5352-419f-8455-80c4d43dedf8 | external1 |         |
+————————————–+———–+———+

[root@allinone7 ~(keystone_admin)]# neutron subnet-create external1 192.168.122.0/24 --name external1-subnet --gateway 192.168.122.1
Created a new subnet:
+——————-+——————————————————+
| Field             | Value                                                |
+——————-+——————————————————+
| allocation_pools  | {"start": "192.168.122.2", "end": "192.168.122.254"} |
| cidr              | 192.168.122.0/24                                     |
| dns_nameservers   |                                                      |
| enable_dhcp       | True                                                 |
| gateway_ip        | 192.168.122.1                                        |
| host_routes       |                                                      |
| id                | 38ac41fd-edc7-4ad7-a7fa-1a06000fc4c7                 |
| ip_version        | 4                                                    |
| ipv6_address_mode |                                                      |
| ipv6_ra_mode      |                                                      |
| name              | external1-subnet                                     |
| network_id        | 6960a06c-5352-419f-8455-80c4d43dedf8                 |
| tenant_id         | a525deb290124433b80996d4f90b42ba                     |
+——————-+——————————————————+
[root@allinone7 ~(keystone_admin)]# neutron net-list
+————————————–+———–+——————————————————-+
| id                                   | name      | subnets                                               |
+————————————–+———–+——————————————————-+
| 6960a06c-5352-419f-8455-80c4d43dedf8 | external1 | 38ac41fd-edc7-4ad7-a7fa-1a06000fc4c7 192.168.122.0/24 |
+————————————–+———–+——————————————————-+

Step 5 : Spawn the instance using “external” network directly.

[root@allinone7 ~(keystone_admin)]# nova list
+————————————–+—————-+——–+————+————-+————————-+
| ID                                   | Name           | Status | Task State | Power State | Networks                |
+————————————–+—————-+——–+————+————-+————————-+
| 36934762-5769-4ac1-955e-fb475b8f6a76 | test-instance1 | ACTIVE | -          | Running     | external1=192.168.122.4 |
+————————————–+—————-+——–+————+————-+————————-+

You will be able to connect to this instance directly.

How to integrate Keystone packstack with AD ?

In this article, I am going to show the integration of keystone with Active Directory. In the case of packstack, keystone runs under apache by default; I have written an article on this before. I am going to use the same setup to configure keystone with AD.

I have referred to a Red Hat article to configure keystone with AD. In that article, the suggested steps are for running keystone without httpd, but there is not much difference in the steps. You just need to restart the apache service instead of keystone to bring the changes into effect.

Step 1 : I configured a Windows AD setup, which is very easy; after installation just run the “dcpromo.exe” command to configure the AD.

Step 2 : After configuring the AD, as suggested in the Red Hat article, I created the service user and group using Windows PowerShell. If you are facing an issue while setting the password, use the GUI; that will be much easier.

Step 3 : Time to make the changes on openstack side.

Again, follow the steps for the v3 API, glance and keystone provided in the article; just restart the httpd service in place of keystone.

Step 4 : Below is my domain-specific keystone configuration file. Note : I am not using any certificate, hence I modified some of the options in the file, like the port number from 636 to 389 and ldaps to ldap.

[root@allinone domains(keystone_admin)]# cat /etc/keystone/domains/keystone.ganesh.conf
[ldap]
url =  ldap://192.168.122.133:389
user = CN=svc-ldap,CN=Users,DC=ganesh,DC=com
password                 = User@123
suffix                   = DC=ganesh,DC=com
user_tree_dn             = CN=Users,DC=ganesh,DC=com
user_objectclass         = person
user_filter = (memberOf=cn=grp-openstack,CN=Users,DC=ganesh,DC=com)
user_id_attribute        = cn
user_name_attribute      = cn
user_mail_attribute      = mail
user_pass_attribute      =
user_enabled_attribute   = userAccountControl
user_enabled_mask        = 2
user_enabled_default     = 512
user_attribute_ignore    = password,tenant_id,tenants
user_allow_create        = False
user_allow_update        = False
user_allow_delete        = False

[identity]
driver = keystone.identity.backends.ldap.Identity

Step 5 : Restart the httpd service and create a domain matching the NetBIOS name of the AD; in my case it’s GANESH (a sketch follows).
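Creating the domain is a single Keystone v3 call; a minimal sketch (the description text is arbitrary):

~~~
# The domain name must match the AD NetBIOS name
openstack domain create --description "ganesh.com Active Directory" GANESH
~~~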

Step 6 : Verify that you are able to list the users present in domain.

[root@allinone domains(keystone_admin)]# openstack user list --domain GANESH
+——————————————————————+———-+
| ID                                                               | Name     |
+——————————————————————+———-+
| a557f06c03960d3b3de7d670774c1c329efe9f33e17c5aa894f0207ec78766e6 | svc-ldap |
+——————————————————————+———-+

Step 7 : I created one test user “user1” in AD and then issued the command again in the OpenStack setup, and I can see the new user showing up in the below output.

[root@allinone domains(keystone_admin)]# openstack user list --domain GANESH
+——————————————————————+———-+
| ID                                                               | Name     |
+——————————————————————+———-+
| a557f06c03960d3b3de7d670774c1c329efe9f33e17c5aa894f0207ec78766e6 | svc-ldap |
| f71c9fb8479994f287978a2b25f5796a80871b472de07bdee7794806e0902d7e | user1    |
+——————————————————————+———-+

Just in case someone is curious about the calls which go to the LDAP server from the packstack setup:

The below calls can be seen when collecting a tcpdump in the background (a sketch of the capture command follows) and then running the user list.
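A minimal sketch of the capture, assuming eth0 is the interface facing the AD server:

~~~
# Capture LDAP traffic to the AD server in the background, then run the query
tcpdump -i eth0 -w /tmp/ldap.pcap host 192.168.122.133 and port 389 &
openstack user list --domain GANESH
~~~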

[root@allinone domains(keystone_admin)]# openstack user list --domain GANESH

tshark -tad -n -r /tmp/ldap.pcap -Y ldap
Running as user "root" and group "root". This could be dangerous.
6 2016-03-13 04:25:45 192.168.122.50 -> 192.168.122.133 LDAP 125 bindRequest(1) “CN=svc-ldap,CN=Users,DC=ganesh,DC=com” simple
7 2016-03-13 04:25:45 192.168.122.133 -> 192.168.122.50 LDAP 88 bindResponse(1) success
9 2016-03-13 04:25:45 192.168.122.50 -> 192.168.122.133 LDAP 232 searchRequest(2) “CN=Users,DC=ganesh,DC=com” singleLevel
10 2016-03-13 04:25:45 192.168.122.133 -> 192.168.122.50 LDAP 332 searchResEntry(2) "CN=svc-ldap,CN=Users,DC=ganesh,DC=com"  | searchResEntry(2) "CN=user1,CN=Users,DC=ganesh,DC=com"  | searchResDone(2) success  [2 results]
11 2016-03-13 04:25:45 192.168.122.50 -> 192.168.122.133 LDAP 73 unbindRequest(3)

Step 8 : List all the present domains and roles, then add the user to a project and assign a role to it.

[root@allinone domains(keystone_admin)]# openstack domain list
+———————————-+———+———+———————————————————————-+
| ID                               | Name    | Enabled | Description                                                          |
+———————————-+———+———+———————————————————————-+
| d313e92c985b456295c254e827bbbd1b | GANESH  | True    |                                                                      |
| db1b4320ec764bdfb45106cdeadc754c | heat    | True    | Contains users and projects created by heat                          |
| default                          | Default | True    | Owns users and tenants (i.e. projects) available on Identity API v2. |
+———————————-+———+———+———————————————————————-+

[root@allinone domains(keystone_admin)]# openstack role list
+———————————-+——————+
| ID                               | Name             |
+———————————-+——————+
| 5ca3a634c2b649dd9e2033509fb561cc | heat_stack_user  |
| 65f8c50174af4818997d94f0bfeb5183 | ResellerAdmin    |
| 68a199b73276438a8466f51a03cd2980 | admin            |
| 8c574229aa654937a5a53d3ced333c08 | heat_stack_owner |
| 9a408ea418884fee94e10bfc8019a6f3 | SwiftOperator    |
| 9fe2ff9ee4384b1894a90878d3e92bab | _member_         |
+———————————-+——————+

[root@allinone domains(keystone_admin)]# openstack role add --project demo --user f71c9fb8479994f287978a2b25f5796a80871b472de07bdee7794806e0902d7e _member_

 

 

Reference :

I found very good information comparing keystone v2 and keystone v3 at the link below.

[1] http://www.madorn.com/keystone-v3-api.html#.VuUiC5SbRIt

Various nova instance migration techniques.

In this article, I am going to list the various nova instance migration techniques. I have used my packstack all-in-one setup and two extra compute nodes for these tests. I am using local storage for the instance disks.

  • Offline storage migration : Downtime required.

As my instances' ephemeral disks are configured on local storage, the first migration which comes to mind is the offline (cold) migration :

[root@allinone6 ~(keystone_admin)]# nova migrate test-instance1 --poll

The above command does not give the option to specify the destination host on which we want to run the instance; the scheduler will choose the destination host for you.

Once the migration is completed successfully, you will see the instance running (ACTIVE) on the other compute node.

I have seen the below states of the instance during the migration (the intermediate VERIFY_RESIZE state needs to be confirmed, as sketched after the state flow).

ACTIVE –> RESIZE –> VERIFY_RESIZE –> ACTIVE
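While the instance sits in VERIFY_RESIZE it has to be confirmed (or reverted) before it settles back to ACTIVE:

~~~
# Confirm the cold migration; nova resize-revert would move it back instead
nova resize-confirm test-instance1
~~~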

If I check the instance action list I can see that it has performed both the migrate and resize operations.

[root@allinone6 ~(keystone_admin)]# nova instance-action-list test-instance1
+—————+——————————————+———+—————————-+
| Action        | Request_ID                               | Message | Start_Time                 |
+—————+——————————————+———+—————————-+
| create        | req-93d78dbe-8914-46b9-9605-0e9ff7ed76e8 | -       | 2016-03-06T02:13:57.000000 |
| migrate       | req-f0c214d7-d5ed-4633-a147-056dad6611a2 | -       | 2016-03-07T05:04:26.000000 |
| confirmResize | req-6d97c3cf-a509-4e6c-a016-457569ca46b3 | -       | 2016-03-07T05:05:14.000000 |
+—————+——————————————+———+—————————-+

Even though it’s performing a resize, the flavor remains the same. The only reason I can find for the resize entry is that migrate and resize share the same code path.

Below are the nova-api.log entries from the controller node.

[root@allinone6 ~(keystone_admin)]# grep 'req-f0c214d7-d5ed-4633-a147-056dad6611a2' /var/log/nova/nova-api.log
2016-03-07 00:04:26.198 3819 DEBUG nova.api.openstack.wsgi [req-f0c214d7-d5ed-4633-a147-056dad6611a2 None] Action: ‘action’, calling method: <bound method AdminActionsController._migrate of <nova.api.openstack.compute.contrib.admin_actions.AdminActionsController object at 0x3aff8d0>>, body: {“migrate”: null} _process_stack /usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:934
2016-03-07 00:04:26.228 3819 DEBUG nova.compute.api [req-f0c214d7-d5ed-4633-a147-056dad6611a2 None] [instance: c8cb4bcc-2b4e-4478-9a7c-61f5170fb177] flavor_id is None. Assuming migration. resize /usr/lib/python2.7/site-packages/nova/compute/api.py:2559
2016-03-07 00:04:26.229 3819 DEBUG nova.compute.api [req-f0c214d7-d5ed-4633-a147-056dad6611a2 None] [instance: c8cb4bcc-2b4e-4478-9a7c-61f5170fb177] Old instance type m1.tiny,  new instance type m1.tiny resize /usr/lib/python2.7/site-packages/nova/compute/api.py:2578
2016-03-07 00:04:26.305 3819 INFO oslo.messaging._drivers.impl_rabbit [req-f0c214d7-d5ed-4633-a147-056dad6611a2 ] Connecting to AMQP server on 192.168.122.234:5672
2016-03-07 00:04:26.315 3819 INFO oslo.messaging._drivers.impl_rabbit [req-f0c214d7-d5ed-4633-a147-056dad6611a2 ] Connected to AMQP server on 192.168.122.234:5672
2016-03-07 00:04:26.563 3819 INFO nova.osapi_compute.wsgi.server [req-f0c214d7-d5ed-4633-a147-056dad6611a2 None] 192.168.122.234 “POST /v2/618cb39791784d7fb7a80d17eb99b306/servers/c8cb4bcc-2b4e-4478-9a7c-61f5170fb177/action HTTP/1.1” status: 202 len: 209 time: 0.4026990

It’s an offline operation; we can confirm the same using the uptime of the instance.

[root@allinone6 ~(keystone_admin)]# ip netns exec qdhcp-0b9572fb-29fe-4705-b50c-74aa00acb983 ssh cirros@10.10.3.15
cirros@10.10.3.15’s password:
$ uptime
22:05:31 up 0 min,  1 users,  load average: 0.12, 0.04, 0.01

  • Evacuating the instance from failed compute node.

This makes most sense when using shared storage.

The instance was running on compute26 and I shut down that node; the instance remains in the ACTIVE state but I am not able to ping it. Actually the instance is down.
[root@allinone6 ~(keystone_admin)]# nova service-list
+—-+——————+———–+———-+———+——-+—————————-+—————–+
| Id | Binary           | Host      | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+—-+——————+———–+———-+———+——-+—————————-+—————–+
| 1  | nova-consoleauth | allinone6 | internal | enabled | up    | 2016-03-07T05:20:44.000000 | -               |
| 2  | nova-scheduler   | allinone6 | internal | enabled | up    | 2016-03-07T05:20:44.000000 | -               |
| 3  | nova-conductor   | allinone6 | internal | enabled | up    | 2016-03-07T05:20:44.000000 | -               |
| 5  | nova-compute     | allinone6 | nova     | enabled | up    | 2016-03-07T05:20:39.000000 | -               |
| 6  | nova-cert        | allinone6 | internal | enabled | up    | 2016-03-07T05:20:44.000000 | -               |
| 7  | nova-compute     | compute26 | nova     | enabled | down  | 2016-03-07T05:19:19.000000 | -               |
| 8  | nova-compute     | compute16 | nova     | enabled | up    | 2016-03-07T05:20:44.000000 | -               |
+—-+——————+———–+———-+———+——-+—————————-+—————–+

Start the evacuation of instance(s) from the failed node using the below command; in this case we can specify the destination compute node.

[root@compute16 ~(keystone_admin)]# nova host-evacuate --target_host allinone6 compute26
+————————————–+——————-+—————+
| Server UUID                          | Evacuate Accepted | Error Message |
+————————————–+——————-+—————+
| c8cb4bcc-2b4e-4478-9a7c-61f5170fb177 | True              |               |
+————————————–+——————-+—————+

Below are the states of the instance which I noticed in the nova list output.

ACTIVE –> REBUILD –> ACTIVE

In the below command output, we can see that the evacuate action has been recorded.

[root@allinone6 ~(keystone_admin)]# nova instance-action-list test-instance1
+—————+——————————————+———+—————————-+
| Action        | Request_ID                               | Message | Start_Time                 |
+—————+——————————————+———+—————————-+
| create        | req-93d78dbe-8914-46b9-9605-0e9ff7ed76e8 | -       | 2016-03-06T02:13:57.000000 |
| migrate       | req-f0c214d7-d5ed-4633-a147-056dad6611a2 | -       | 2016-03-07T05:04:26.000000 |
| confirmResize | req-6d97c3cf-a509-4e6c-a016-457569ca46b3 | -       | 2016-03-07T05:05:14.000000 |
| migrate       | req-43e92f8e-04d6-4379-98c1-8ce72094766f | -       | 2016-03-07T05:17:09.000000 |
| confirmResize | req-4a11404d-e448-4692-86f0-063a0dfd2d4a | -       | 2016-03-07T05:17:42.000000 |
| evacuate      | req-1b22f414-36c1-487e-9560-359e2ecd2800 | -       | 2016-03-07T05:22:18.000000 |
+—————+——————————————+———+—————————-+

We can see the below nova-api.log entries corresponding to the “evacuate” operation.

[root@allinone6 ~(keystone_admin)]# grep 'req-1b22f414-36c1-487e-9560-359e2ecd2800' /var/log/nova/nova-api.log
2016-03-07 00:22:18.162 3819 DEBUG nova.api.openstack.wsgi [req-1b22f414-36c1-487e-9560-359e2ecd2800 None] Action: ‘action’, calling method: <bound method Controller._evacuate of <nova.api.openstack.compute.contrib.evacuate.Controller object at 0x3b01f10>>, body: {“evacuate”: {“host”: “allinone6”, “onSharedStorage”: false}} _process_stack /usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:934
2016-03-07 00:22:18.209 3819 DEBUG nova.compute.api [req-1b22f414-36c1-487e-9560-359e2ecd2800 None] [instance: c8cb4bcc-2b4e-4478-9a7c-61f5170fb177] vm evacuation scheduled evacuate /usr/lib/python2.7/site-packages/nova/compute/api.py:3258
2016-03-07 00:22:18.219 3819 DEBUG nova.servicegroup.drivers.db [req-1b22f414-36c1-487e-9560-359e2ecd2800 None] Seems service is down. Last heartbeat was 2016-03-07 05:19:19. Elapsed time is 179.219082 is_up /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:75
2016-03-07 00:22:18.270 3819 INFO nova.osapi_compute.wsgi.server [req-1b22f414-36c1-487e-9560-359e2ecd2800 None] 192.168.122.41 “POST /v2/618cb39791784d7fb7a80d17eb99b306/servers/c8cb4bcc-2b4e-4478-9a7c-61f5170fb177/action HTTP/1.1” status: 200 len: 225 time: 0.1504130

  • Live-migration [block migration]

In this case we are doing a live migration of the instance despite not having shared storage configured for it. The instance disk is copied from the source to the destination compute node during the migration.

ACTIVE –> MIGRATING –> ACTIVE

Below is the command for the live-migration.

[root@allinone6 ~(keystone_admin)]# nova live-migration --block-migrate test-instance1 compute16

In the instance-action-list output, you will not see any new entry for the live migration.

[root@allinone6 ~(keystone_admin)]# nova instance-action-list test-instance1
+——–+——————————————+———+—————————-+
| Action | Request_ID                               | Message | Start_Time                 |
+——–+——————————————+———+—————————-+
| create | req-93d78dbe-8914-46b9-9605-0e9ff7ed76e8 | -       | 2016-03-06T02:13:57.000000 |
+——–+——————————————+———+—————————-+

You can see the below logs in nova-api.log file while doing the live migration.

From : /var/log/nova/nova-api.log

~~~
2016-03-06 23:55:42.463 3820 INFO nova.osapi_compute.wsgi.server [req-22ca9202-8a90-4998-a0d2-6705e7bbfa71 None] 192.168.122.234 “GET /v2/618cb39791784d7fb7a80d17eb99b306/servers/c8cb4bcc-2b4e-4478-9a7c-61f5170fb177 HTTP/1.1” status: 200 len: 1903 time: 0.1395051
2016-03-06 23:55:42.468 3818 DEBUG nova.api.openstack.wsgi [req-6fbe4aa6-2e40-403d-b573-18c6f50669f5 None] Action: ‘action’, calling method: <bound method AdminActionsController._migrate_live of <nova.api.openstack.compute.contrib.admin_actions.AdminActionsController object at 0x3aff8d0>>, body: {“os-migrateLive”: {“disk_over_commit”: false, “block_migration”: true, “host”: “compute16”}} _process_stack /u
sr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py:934
2016-03-06 23:55:42.505 3818 DEBUG nova.compute.api [req-6fbe4aa6-2e40-403d-b573-18c6f50669f5 None] [instance: c8cb4bcc-2b4e-4478-9a7c-61f5170fb177] Going to try to live migrate instance to compute16 live_migrate /usr/lib/python2.7/site-packages/nova/compute/api.py:3234
2016-03-06 23:55:42.765 3818 INFO nova.osapi_compute.wsgi.server [req-6fbe4aa6-2e40-403d-b573-18c6f50669f5 None] 192.168.122.234 “POST /v2/618cb39791784d7fb7a80d17eb99b306/servers/c8cb4bcc-2b4e-4478-9a7c-61f5170fb177/action HTTP/1.1” status: 202 len: 209 time: 0.2986290
2016-03-06 23:55:49.325 3819 DEBUG keystoneclient.session [-] REQ: curl -i -X GET http://192.168.122.234:35357/v3/auth/tokens -H “X-Subject-Token: TOKEN_REDACTED” -H “User-Agent: python-keystoneclient” -H “Accept: application/json” -H “X-Auth-Token: TOKEN_REDACTED” _http_log_request /usr/lib/python2.7/site-packages/keystoneclient/session.py:155
~~~