Profiling Ansible memory usage per task

Reblogged from Sai's Blog:

In this post, let's look at how I profiled memory usage per task in a piece of software that uses Ansible for configuration. While the post covers getting memory profiling working for any Ansible playbook, it also goes deeper into the TripleO example.

TripleO, Config-Download and Ansible

In OpenStack Queens, TripleO defaults to using an agent called os-collect-config that runs on each overcloud node. This agent periodically polls the undercloud Heat API for software configuration changes that need to be applied to the node. The os-collect-config agent runs os-refresh-config and os-apply-config as needed whenever new software configuration changes are detected.

This is a "pull" style model: each node polls the Heat API, pulls changes, and applies them locally. With the Ansible-based config-download, director/TripleO has switched to a "push" style model. Heat is still used to create the stack (and all of the Red Hat…
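
The original post covers the details; as a pointer, per-task memory profiling can be wired up with the cgroup_memory_recap callback plugin available in recent Ansible releases. Below is a minimal sketch only: the ansible_profile cgroup name, the root user/group, and playbook.yml are illustrative assumptions.

# cat ansible.cfg
[defaults]
callback_whitelist = cgroup_memory_recap

[callback_cgroup_memory_recap]
max_mem_file = /sys/fs/cgroup/memory/ansible_profile/memory.max_usage_in_bytes
cur_mem_file = /sys/fs/cgroup/memory/ansible_profile/memory.usage_in_bytes

# cgcreate -a root:root -t root:root -g memory:ansible_profile
# cgexec -g memory:ansible_profile ansible-playbook playbook.yml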


Performance Monitoring of Red Hat Satellite 6 using satellite-performance

Continuous time-series metric collection of Satellite and all Capsules is essential when running Satellite at scale.
This post helps you configure and monitor these metrics using satellite-performance.
1) Tools:
  • Collectd – Daemon to collect system performance statistics (a minimal collectd-to-Carbon config sketch follows this list)
    • Collects CPU, memory, disk, network, per-process stats (regex), PostgreSQL, MongoDB, turbostat, Qpid, Foreman, DynFlow, Passenger, Puppet, Tomcat, collectd itself, etc.
  • Graphite/Carbon
    • Carbon receives metrics and flushes them to Whisper database files
    • Graphite is the webapp frontend to Carbon
  • Grafana – Visualizes metrics from multiple backends
    • Dashboards are saved in JSON and customized by Ansible during deployment
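
To illustrate how collectd feeds Carbon, here is a minimal write_graphite plugin sketch; the hostname and metric prefix are illustrative assumptions, not satellite-performance defaults.

# cat /etc/collectd.d/write_graphite.conf
LoadPlugin write_graphite
<Plugin write_graphite>
  <Node "carbon">
    Host "graphite.example.com"   # assumed monitoring host
    Port "2003"                   # Carbon plaintext listener
    Protocol "tcp"
    Prefix "satellite."
  </Node>
</Plugin>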

2) Architecture

[Architecture diagram: collectd on Satellite and Capsules → Carbon/Graphite → Grafana]

3) How do I configure performance monitoring?

Archit has written a nice blog post covering the configuration.

Description of metrics collected in satperf:
http://arcolife.github.io/blog/2016/10/05/monitoring-in-satperf-metrics-collection

4) Example Graphs
4.1) Passenger memory (Foreman) [graph]
4.2) PostgreSQL DB (Candlepin & Foreman) [graph]
4.3) Candlepin DB [graph]
4.4) Puppet registrations [graph]
4.5) DynFlow memory [graph]
Thanks to Archit and Jhutar for providing inputs and help!

Red Hat Satellite 6.2 Considerations for Large Scale Deployments

Red Hat Satellite is a complete system management product that allows system administrators to manage the full life cycle of Red Hat deployments across physical, virtual, and private clouds. Red Hat Satellite delivers system provisioning, configuration management, software management, and subscription management, all while maintaining high scalability and security. Satellite 6.2 is the third major release of the next-generation Satellite, with a raft of improvements that continue to narrow the functionality gaps relative to Satellite 5 in many critical areas of the product. This blog provides basic guidelines and considerations for tuning Red Hat Satellite 6.2 and Capsule servers for large-scale deployments.

1) Increase the open files limit for Apache with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/httpd.service.d/limits.conf

[Service]

LimitNOFILE=1000000

# systemctl daemon-reload

# katello-service restart
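
To verify the override took effect after the restart, systemd can report the effective limit; the output below is what one would expect given the drop-in above. The same check applies to the qpidd and qdrouterd overrides in the sections that follow.

# systemctl show httpd -p LimitNOFILE
LimitNOFILE=1000000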

2) Increase the open files limit for Qpid with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/qpidd.service.d/limits.conf

[Service]

LimitNOFILE=1000000

# systemctl daemon-reload

# katello-service restart

3) Increase PostgreSQL shared_buffers

When registering content hosts at scale to the Satellite server, shared_buffers needs to be set appropriately in postgresql.conf. Recommended: 256 MB.
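
For example (a sketch; /var/lib/pgsql/data/postgresql.conf is the RHEL default path, adjust if your data directory differs):

# grep shared_buffers /var/lib/pgsql/data/postgresql.conf
shared_buffers = 256MB
# systemctl restart postgresql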

4) Increase PostgreSQL max_connections

When registering content hosts at scale, it is recommended to increase the max_connections setting (100 by default) according to your needs and hardware profile. For example, you might need to set the value to 200 when you are registering 200 content hosts in parallel.
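
A sketch of the corresponding change, assuming the same default postgresql.conf path:

# grep max_connections /var/lib/pgsql/data/postgresql.conf
max_connections = 200
# systemctl restart postgresql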

5) Storage planning for Qpid

When you use katello-agent extensively, plan storage capacity for /var/lib/qpidd in advance. Currently, in Satellite 6.2, /var/lib/qpidd requires 2 MB of disk space per content host; for example, 10,000 content hosts need roughly 20 GB.

6) Increase the open files limit for the Qpid Dispatch Router with systemd on the Satellite and Capsule servers

# cat /etc/systemd/system/qdrouterd.service.d/limits.conf

[Service]

LimitNOFILE=1000000

# systemctl daemon-reload

# katello-service restart

Special thanks to Jan Jutar and Archit Sarma for their help in getting the scale numbers.

Starting MongoDB on CentOS with NUMA disabled

I have been noticing this warning every time I run MongoDB on CentOS/RHEL or any other NUMA machine.

Error:
Sun Dec 20 06:26:16.832 [initandlisten] ** WARNING: You are running on a NUMA machine.
Sun Dec 20 06:26:16.832 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Sun Dec 20 06:26:16.832 [initandlisten] ** numactl --interleave=all mongod [other options]
Sun Dec 20 06:26:16.832 [initandlisten]

To avoid performance issues, it is recommended to run MongoDB with memory interleaved across all NUMA nodes.

I did not find any other way to solve this. Kill the existing mongod and restart MongoDB with the command below.

numactl --interleave=all runuser -s /bin/bash mongodb -c "/usr/bin/mongod --dbpath /var/lib/mongodb"
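
To make this persistent on a systemd-based setup, one option is a drop-in that wraps mongod with numactl. This is a sketch only: the mongod.service unit name and the /usr/bin/mongod path are assumptions that match stock MongoDB packages. The empty ExecStart= line clears the packaged command before adding the override, the same drop-in pattern used for httpd and qpidd above.

# cat /etc/systemd/system/mongod.service.d/numa.conf
[Service]
ExecStart=
ExecStart=/usr/bin/numactl --interleave=all /usr/bin/mongod --dbpath /var/lib/mongodb

# systemctl daemon-reload
# systemctl restart mongod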


Qcow2 image creation with preallocation options to boost IO performance

When I started evaluating KVM IO performance, I leveraged qcow2 preallocation capabilities, which helped me boost IO performance. It is always good to preallocate an image for better performance, for example when attaching disks to instances in OpenStack.

Earlier, if we wanted to preallocate a qcow2 image, we had to do it manually using fallocate as shown below.

qemu-img create -f qcow2 -o preallocation=metadata /tmp/test.qcow2 8G
fallocate -l 8591507456 /tmp/test.qcow2

Now we have these very useful options built in.

preallocation=falloc

qemu-img create -f qcow2 /tmp/test.qcow2 -o preallocation=falloc 1G
Formatting '/tmp/test.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation='falloc' lazy_refcounts=off refcount_bits=16

"falloc" mode preallocates space for the image by calling posix_fallocate().

preallocation=full

qemu-img create -f qcow2 /tmp/test.qcow2 -o preallocation=full 1G
Formatting '/tmp/test.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation='full' lazy_refcounts=off refcount_bits=16

"full" mode preallocates space for the image by writing zeros to the underlying storage, similar to dd.
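
Since "full" writes every byte while "falloc" only reserves blocks, the difference shows up directly in creation time. The commands below are illustrative; actual timings depend on your storage:

# time qemu-img create -f qcow2 -o preallocation=falloc /tmp/falloc.qcow2 8G
# time qemu-img create -f qcow2 -o preallocation=full /tmp/full.qcow2 8G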

Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1232570

Thanks to my IO guru Stefan Hajnoczi for clarifying some of the IO internals.