Monday, October 12, 2009

How to make ejabberd cluster setup a bit easier

Even though there is decent official documentation, and there are many excellent posts on setting up an ejabberd cluster, the process can still be confusing, as it was for me. It is possible, though, to avoid some of the guesswork and make things more "automated". The following explanation uses the directory structure created by the standard ejabberd binary installer:

1. Install ejabberd on a single node
2. Locate ejabberdctl.cfg in the conf directory of your installation and adjust the Erlang node name on the last line:

ERLANG_NODE=ejabberd@`hostname -f`  
 
3. Run ejabberd:
 your_ejabberd_dir/bin/ejabberdctl start
4. Make sure it runs:
 your_ejabberd_dir/bin/ejabberdctl status 
 By now you're done with the first node.
For the second and subsequent nodes:
5. Synchronize the node's Erlang cookie with the one on the first node (see "Clustering setup" in the ejabberd documentation for how it's done);
6. Repeat steps 1 to 4;
7. Run:
 your_ejabberd_dir/bin/ejabberdctl debug 
At the Erlang shell prompt (press Enter when asked "press any key"), type:
FirstNode = 'ejabberd@first', %% replace with the name of your first node (the ERLANG_NODE value from step 2 above)
mnesia:stop(),
mnesia:delete_schema([node()]),
mnesia:start(),
mnesia:change_config(extra_db_nodes, [FirstNode]),
mnesia:change_table_copy_type(schema, node(), disc_copies). 
 

The above script replaces steps 2 and 3 of the official ejabberd clustering setup document. Its advantage is that you don't have to manually work out the Mnesia location or the proper syntax of the command suggested there.
8. End the debug session by pressing Ctrl-c, Ctrl-c;
9. Continue with step 4 of the official ejabberd clustering setup document.
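
A quick check that can be run in the debug session of step 7, before ending it in step 8, to see whether the join took effect. This is only a sketch using standard Mnesia calls, and it assumes the commands from step 7 completed without errors:

```erlang
%% All nodes Mnesia knows about for this schema;
%% the first node should appear in this list.
mnesia:system_info(db_nodes).

%% Nodes that are currently connected and running Mnesia.
mnesia:system_info(running_db_nodes).

%% Storage type of the local schema table; after the last
%% command of step 7 this is expected to be disc_copies.
mnesia:table_info(schema, storage_type).
```

If the first node is missing from the first two results, the two nodes probably could not connect; the Erlang cookie and the node name are the first things I would check.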
 
The piece of code above could probably be useful elsewhere, for example as part of the ejabberd admin interface. Imagine having "Join cluster" and "Leave cluster" buttons somewhere on a Nodes page. It could also save some manual work when you want a bunch of ejabberd nodes to join an existing cluster: you could wrap something similar to the code above into a single function and make an rpc call to each of these nodes. All of this would obviously require a bit more work, such as checking whether the running node is already part of the cluster.
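
Here is a rough sketch of what such a wrapper might look like. The module name cluster_join and the functions join/1, is_member/1 and join_all/2 are made up for illustration and are not part of ejabberd. The sketch also assumes the module is compiled and on the code path of every node you call, and that all nodes already share the same Erlang cookie. Like the shell session above, it deletes the local Mnesia schema of the joining node.

```erlang
-module(cluster_join).
-export([join/1, is_member/1, join_all/2]).

%% Hypothetical helper: true if the local Mnesia schema
%% already lists Node among its db nodes.
is_member(Node) ->
    lists:member(Node, mnesia:system_info(db_nodes)).

%% Join the local node to the Mnesia cluster of FirstNode.
%% Same sequence as the shell session, plus basic checks.
%% WARNING: deletes the local Mnesia schema.
join(FirstNode) when FirstNode =:= node() ->
    {error, cannot_join_self};
join(FirstNode) ->
    case is_member(FirstNode) of
        true ->
            {error, already_member};
        false ->
            stopped = mnesia:stop(),
            ok = mnesia:delete_schema([node()]),
            ok = mnesia:start(),
            connect(FirstNode)
    end.

%% Connect to FirstNode and keep a disc copy of the schema locally.
connect(FirstNode) ->
    case mnesia:change_config(extra_db_nodes, [FirstNode]) of
        {ok, [_ | _]} ->
            case mnesia:change_table_copy_type(schema, node(), disc_copies) of
                {atomic, ok} -> ok;
                {aborted, CopyReason} -> {error, CopyReason}
            end;
        {ok, []} ->
            %% No node answered: wrong name, wrong cookie, or node down.
            {error, {cannot_connect, FirstNode}};
        {error, ConfigReason} ->
            {error, ConfigReason}
    end.

%% Ask each node in Nodes to join FirstNode's cluster via rpc.
%% Returns a list of {Node, Result} pairs.
join_all(FirstNode, Nodes) ->
    [{N, rpc:call(N, ?MODULE, join, [FirstNode])} || N <- Nodes].
```

From any node that can reach the others, something like cluster_join:join_all('ejabberd@first', ['ejabberd@second', 'ejabberd@third']) would then return one result per node (the node names here are placeholders). A "Leave cluster" counterpart is left out on purpose, since it needs more care than this sketch can give.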