Last time, I talked about how much I like using Ganglia for monitoring a large number of distributed servers.

One issue I ran into, which is barely covered in the documentation, is how to set this up if you cannot use multicast.  Multicast is the default method that ganglia uses to discover nodes.  This is great, since it means that auto-discovery works… kinda… The issue is that most cloud providers block multicast.  This is a good thing: can you imagine having to share a room with the guy who can’t stop screaming through the bull-horn every 2 milliseconds?  So, if I want to use ganglia in EC2, the Amazon cloud, how do I go about doing that?

To get around this issue, you need to configure ganglia in unicast mode.  This is the mysterious part: what exactly is it, where do I set it, and how do I have multiple clusters in unicast mode all report to the same web UI?  Most of the tutorials I read alluded to the fact that you could have multiple clusters set up in ganglia, and most speculated (some even correctly) about how to do it, but none really implemented it.  So, here is how you can disable multicast in ganglia and enable unicast with multiple clusters instead.

First, to get started with this, there are a couple of ganglia components that you really need to be familiar with.

gmetad

gmetad is the ‘server’ side of ganglia.  It is responsible for taking the data from the remote collectors and stuffing it into the backend database (ganglia uses rrdtool).  You’ll have one of these bad boys running for each web UI you have set up.

Configuration

First of all, take a look at the full, default config file.  It’s got a lot of great comments in there and really helps to explain everything from soup to nuts.  That being said, here’s what I used (and my comments) to get me up and running.

By default, this is configured in /etc/gmetad.conf:

    # Each 'cluster' is its own data-source
    # I have two clusters, so, 2 data-sources
    # ... plus my local host
    data_source "Local" localhost
    data_source "ClusterA" localhost:8650
    data_source "ClusterB" localhost:8655

    # I have modified this from the default rrdtool
    # storage config for my purposes, I want to
    # store 3 full years of datapoints. Sure there
    # is a storage requirement, but that's what I need.
    RRAs "RRA:AVERAGE:0.5:1:6307199" "RRA:AVERAGE:0.5:4:1576799" "RRA:AVERAGE:0.5:40:52704"

Essentially, the above sets up the Local data source plus two clusters, ClusterA and ClusterB.  The data for those two comes from localhost:8650 and localhost:8655 respectively (don’t worry, I’ll explain that bit below…).  The other thing for me is that I need to keep 3 full years of real datapoints.  (rrdtool is designed to ‘aggregate’ your data after some time.  If you don’t adjust it, you lose resolution to the aggregation, which can be frustrating.)
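To sanity-check those RRA numbers, here is a minimal Python sketch of the arithmetic.  Each entry has the form RRA:CF:xff:steps:rows, so one archive covers steps × rows × the base step in seconds.  The 15-second base step is an assumption on my part (check the step your gmetad actually uses), and the helper name retention_days is made up for this example.

```python
# Rough retention estimate for the RRA definitions in gmetad.conf.
# ASSUMPTION: the RRD base step is 15 seconds; change STEP_SECONDS if yours differs.
STEP_SECONDS = 15

RRAS = [
    "RRA:AVERAGE:0.5:1:6307199",
    "RRA:AVERAGE:0.5:4:1576799",
    "RRA:AVERAGE:0.5:40:52704",
]


def retention_days(rra: str, step_seconds: int = STEP_SECONDS) -> float:
    """Return the number of days one RRA covers: steps-per-row * rows * base step."""
    _tag, _cf, _xff, steps, rows = rra.split(":")
    return int(steps) * int(rows) * step_seconds / 86400


if __name__ == "__main__":
    for rra in RRAS:
        print(f"{rra}: {retention_days(rra):.1f} days")
```

Under that 15-second assumption, the first two archives each cover roughly 1095 days (three years) and the third covers 366 days.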

gmond

gmond is a data-collector.  It will, essentially, collect data from a host and send it … somewhere.  Let’s discuss where.

Before we address the multiple clusters piece, here’s how you disable multicast.  The default config file will contain three sections that you really care about:

The things we need to change are:

set the cluster -> name

comment out the udp_send_channel -> mcast_join parameter

comment out the udp_recv_channel -> mcast_join parameter

comment out the udp_recv_channel -> bind parameter

    /* If a cluster attribute is specified, then all gmond hosts are wrapped inside
     * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
     * NOT be wrapped inside of a <CLUSTER> tag. */
    cluster {
      name = "unspecified"
      owner = "unspecified"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "unspecified"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      # Comment this out for unicast
      # mcast_join = 239.2.11.71
      port = 8649
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      # Comment this out for unicast
      # mcast_join = 239.2.11.71
      port = 8649
      # Comment this out for unicast
      # bind = 239.2.11.71
    }

So, in order to convert this to unicast, you just comment out the lines marked above, add a host entry to udp_send_channel that points at the receiving gmond (as in the configs below), and set the port to some available port… that simple!
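If you are not sure whether a port is actually available on the server, a quick check like the following Python sketch can help.  It tries to bind both a UDP socket (for the udp_send_channel / udp_recv_channel side) and a TCP socket (for the tcp_accept_channel side).  The function name port_is_free is hypothetical, and this is only a point-in-time check, not a reservation.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if both a UDP and a TCP socket can currently bind to host:port."""
    for kind in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
        with socket.socket(socket.AF_INET, kind) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Something else already holds this port (or we lack permission).
                return False
    return True
```

For example, port_is_free(8650) and port_is_free(8655) should both return True before you start the extra gmond instances described below.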

So, I have 3 clusters: Local, ClusterA and ClusterB.  To get this working with unicast (unicast meaning that I talk to one specific endpoint), I need to have a separate gmond running on my server for EACH cluster.

So, on the ganglia server, I have 3 gmond config files:

Local:

    /*
     * The cluster attributes specified will be used as part of the <CLUSTER>
     * tag that will wrap all hosts collected by this instance.
     */
    cluster {
      name = "Local"
      owner = "Scottie"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "GangliaServer"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      host = localhost
      port = 8649
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      port = 8649
    }

    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8649
    }

Remember the ‘data-sources’ from your gmetad.conf file? Well, if you look up, you’ll see that the data-source for the ‘Local’ cluster was ‘localhost’, which gmetad polls on the default port, 8649.  Essentially, gmetad will talk to this gmond on localhost:8649 to receive data.  Now, the remainder of your gmond.conf file is important: it dictates all of the monitoring that the gmond instance will do.  Only change the sections that I have listed above.
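You can mimic what gmetad does here by hand: connect to the gmond’s tcp_accept_channel port and read the XML it sends back.  The Python sketch below does that and lists the hosts found under each cluster.  The function names are hypothetical, and the parsing assumes the dump uses CLUSTER and HOST elements with a NAME attribute, so treat it as a debugging aid rather than a reference client.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host: str = "localhost", port: int = 8649, timeout: float = 5.0) -> bytes:
    """Connect to a gmond tcp_accept_channel and read the XML dump until EOF."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the remote side closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes: bytes) -> dict:
    """Map each CLUSTER NAME in the dump to the list of HOST NAMEs it contains."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): [host.get("NAME") for host in cluster.iter("HOST")]
        for cluster in root.iter("CLUSTER")
    }
```

Calling hosts_by_cluster(fetch_gmond_xml("localhost", 8649)) on the ganglia server should show the ‘Local’ cluster with the server itself listed once its gmond is running.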

Now for the two remaining clusters:

ClusterA:

    /*
     * The cluster attributes specified will be used as part of the <CLUSTER>
     * tag that will wrap all hosts collected by this instance.
     */
    cluster {
      name = "ClusterA"
      owner = "Scottie"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "GangliaServer"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      host = localhost
      port = 8650
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      port = 8650
    }

    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8650
    }

ClusterB:

    /*
     * The cluster attributes specified will be used as part of the <CLUSTER>
     * tag that will wrap all hosts collected by this instance.
     */
    cluster {
      name = "ClusterB"
      owner = "Scottie"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "GangliaServer"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      host = localhost
      port = 8655
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      port = 8655
    }

    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8655
    }
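Since the three server-side files differ only in the cluster name and the port, you could generate the sections instead of hand-editing them.  Here is a minimal Python sketch of that idea.  TEMPLATE, CLUSTERS, render_fragment and data_source_line are all names I made up for illustration, and the output only covers the cluster and channel sections: the rest of your gmond.conf (the monitoring part) still has to come from the default file.

```python
# Hypothetical helper: render the cluster/channel sections for one gmond instance.
TEMPLATE = """\
cluster {{
  name = "{name}"
  owner = "{owner}"
  latlong = "unspecified"
  url = "unspecified"
}}

udp_send_channel {{
  host = {send_host}
  port = {port}
}}

udp_recv_channel {{
  port = {port}
}}

tcp_accept_channel {{
  port = {port}
}}
"""

# Cluster name -> port, matching the values used in this post.
CLUSTERS = {"Local": 8649, "ClusterA": 8650, "ClusterB": 8655}


def render_fragment(name: str, port: int, send_host: str = "localhost",
                    owner: str = "Scottie") -> str:
    """Return the unicast sections of gmond.conf for one cluster."""
    return TEMPLATE.format(name=name, port=port, send_host=send_host, owner=owner)


def data_source_line(name: str, port: int, host: str = "localhost") -> str:
    """Return the matching data_source line for gmetad.conf."""
    return f'data_source "{name}" {host}:{port}'


if __name__ == "__main__":
    for cluster_name, cluster_port in CLUSTERS.items():
        print(data_source_line(cluster_name, cluster_port))
        print(render_fragment(cluster_name, cluster_port))
```

Passing a different send_host (for example the name of the ganglia server) gives the variant that a remote cluster member would use.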

Now that we’ve got our ‘server’ set up to receive data for each of our clusters, we need to configure the actual hosts that are part of each cluster to forward data in.  Essentially, this is going to be the same ‘gmond’ configuration, but it will forward data to the ‘gmond’ that we just set up on the server.

Let’s say we have three hosts:

Grumpy (our local server)

Sleepy (Cluster A)

Doc (Cluster B)

Now, let’s configure their gmonds to talk to our server (Grumpy) and start saving off our data.  First of all, Grumpy is already configured and running, so if you connected to the ganglia interface at this point (and your gmetad is running), you should see ‘Grumpy’ showing up in the ‘Local’ cluster.

On each of the other hosts, you start from the matching cluster’s config and change only the host field (udp_send_channel -> host) to the name or IP address of your ganglia ‘server’.  On Sleepy (Cluster A):

    /*
     * The cluster attributes specified will be used as part of the <CLUSTER>
     * tag that will wrap all hosts collected by this instance.
     */
    cluster {
      name = "ClusterA"
      owner = "Scottie"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "GangliaServer"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      host = grumpy
      port = 8650
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      port = 8650
    }

    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8650
    }

On Doc (Cluster B), you make the same change (udp_send_channel -> host):

    /*
     * The cluster attributes specified will be used as part of the <CLUSTER>
     * tag that will wrap all hosts collected by this instance.
     */
    cluster {
      name = "ClusterB"
      owner = "Scottie"
      latlong = "unspecified"
      url = "unspecified"
    }

    /* The host section describes attributes of the host, like the location */
    host {
      location = "GangliaServer"
    }

    /* Feel free to specify as many udp_send_channels as you like. Gmond
       used to only support having a single channel */
    udp_send_channel {
      host = grumpy
      port = 8655
    }

    /* You can specify as many udp_recv_channels as you like as well. */
    udp_recv_channel {
      port = 8655
    }

    /* You can specify as many tcp_accept_channels as you like to share
       an xml description of the state of the cluster */
    tcp_accept_channel {
      port = 8655
    }

Once you start the gmond process on each server, wait a few moments and they will appear in the ganglia interface. Simple as that!