Sunday, November 21, 2010

gen_client: running BOSH and new configuration options

As a result of efforts to make gen_client run over BOSH, there are now 2 new groups of options:

  • connection - specifies the type of connection and its parameters. Available types are tcp, ssl and bosh.
  • auth - specifies the type of authentication and its parameters. There are currently 2 types, reflecting exmpp authentication: basic (or legacy auth) and sasl.
    Note: this version of gen_client will not work with exmpp versions lower than 0.9.5.

    Example: to log in through BOSH using SASL DIGEST-MD5 (the first argument is the JID, and the bosh tuple carries the URL of the BOSH service):

    gen_client:start("", [{connection, {bosh, ""}}, {auth, [{password, "my_pwd"}, {sasl, "DIGEST-MD5"}]}]).

    Yes, it's that simple. The default connection type is tcp, and the default auth type is SASL PLAIN, so code that uses previous versions of gen_client shouldn't break.

    Note: as of now (v. 0.9.5) exmpp supports only SASL PLAIN, SASL ANONYMOUS and SASL DIGEST-MD5. Also, BOSH will only work with SASL, but not with basic authentication.

    Wednesday, August 11, 2010

    The gen_client API overview

    I've been using gen_client in a few real projects, which helped to weed out many bugs, shape the design and evaluate features. I feel that it's now ready for public use, so I would like to briefly explain some concepts used by gen_client and show how they can make XMPP client programming easier.

    Starting the client.

    start(Jid, Host, Port, Password) 
    start(Jid, Host, Port, Password, Options)
    start(Account, Domain, Host, Port, Password, Options)
    start(Account, Domain, Resource, Host, Port, Password, Options)

    The calls above create a client session and return an {ok, ClientRef} tuple if everything went well. ClientRef (the Pid of the gen_server process associated with the client session) can then be used to make gen_client API calls. For example:

    {ok, Client} = gen_client:start("", "", 5222, "test",
    [{debug, true}, {presence, {true, "I'm online."}}, {reconnect, 15000}]),
    gen_client:add_plugin(Client, disco_plugin, [test_disco, []]).

    Options describe different aspects of the client, such as (default values go first):

    • {debug, false | true} - printing out debug info;
    • {presence, {true, Msg} | false} - should the client send a presence, and if yes, specify the presence message;
    • {reconnect, Timeout} - should the client reconnect after losing the connection, and the timeout for reconnection;
    • {log_in, true | false} - should the client log in automatically; useful when you want to choose between logging in and registration;
    • more to come... 


    add_handler(Client, Handler)
    add_handler(Client, Handler, Priority)
    remove_handler(Client, HandlerKey)

    A handler is a callback function that handles incoming stanzas. Handlers can be added to (or removed from) the client session at will. The add_handler/2,3 calls return a key that can be used to remove the handler later, if needed. Each handler can be assigned a priority at the time it's added to the client. When a stanza is received, it is passed through the chain of handlers according to the handlers' priority and/or output. For example, a handler can interrupt the chain of subsequent handler calls by returning stop.
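
    The chain behaviour described above can be modelled in a few lines of plain Erlang. This is only a sketch of the idea, not gen_client's implementation: the module name, the {Priority, Fun} representation, the one-argument handler funs and the "lower number runs first" ordering are all assumptions made for illustration.

```erlang
-module(handler_chain_sketch).
-export([run_chain/2, demo/0]).

%% Handlers is a list of {Priority, Fun} pairs (hypothetical representation).
%% Assumption: a lower number means the handler runs earlier.
%% Each Fun takes a stanza and returns `stop` to interrupt the chain,
%% or any other value to let the next handler run.
run_chain(Handlers, Stanza) ->
    Sorted = lists:keysort(1, Handlers),
    apply_handlers([F || {_Priority, F} <- Sorted], Stanza, []).

%% Collects the handlers' return values in call order.
apply_handlers([], _Stanza, Acc) ->
    lists:reverse(Acc);
apply_handlers([F | Rest], Stanza, Acc) ->
    case F(Stanza) of
        stop  -> lists:reverse([stop | Acc]);
        Other -> apply_handlers(Rest, Stanza, [Other | Acc])
    end.

demo() ->
    Log   = fun(_S) -> logged end,
    Guard = fun({presence, _}) -> stop; (_) -> passed end,
    Last  = fun(_S) -> reached end,
    Handlers = [{30, Last}, {10, Log}, {20, Guard}],
    {run_chain(Handlers, {presence, "unavailable"}),
     run_chain(Handlers, {message, "hi"})}.
```

    In demo/0 the presence stanza never reaches the last handler because the guard returns stop, while the message stanza passes through all three.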


    add_plugin(Client, Plugin, Args)
    add_plugin(Client, Plugin, Args, Priority)
    remove_plugin(Client, PluginKey) 

    Theoretically handlers should be sufficient for processing of any incoming stanzas.
    However, in many cases handling stanzas involves a fair amount of repetitive code. For example, responses to discovery requests (disco_info and disco_items) must have certain headers, workflow sequences that use ad-hoc commands need to keep state, etc. Plugins are meant to encapsulate such common processing blocks, letting developers focus on specifics.
    To be a plugin, a module has to implement the gen_client_plugin behaviour, namely the init/1, terminate/1 and handle/3 functions. The idea is that handle/3 contains the bulk of the boilerplate code, while plugin users customize it by passing arguments to either init/1 or handle/3.
    If the explanation above sounds somewhat obscure, looking at the code of disco_plugin and the test example in test_gen_client:test/0 will hopefully make things a bit clearer. The other available plugin is adhoc_plugin, which I am planning to talk about in more detail in following posts.

    Blocking and non-blocking requests

    XMPP messaging is asynchronous, and that's great. However, quite often your code needs to get an immediate response before the flow can continue. Examples are discovery, ping and command requests, pubsub retrieval requests and many others. Of course, it's possible to allocate a callback for the expected response, but this makes the code harder to write and understand.
    gen_client supports both asynchronous and synchronous requests. The former is simply a wrapper around exmpp:send_packet/2:

    send_packet(ClientRef, Packet)

    where ClientRef is a client session reference created by one of gen_client:start functions (see above).

    The synchronous request is a bit more interesting. Here are the definitions:

    send_sync_packet(Client, Packet, Timeout)
    send_sync_packet(Client, Packet, Trigger, Timeout)

    Here's how it works:

    The calling process sends the packet and the timeout value to the client session process. The client session process creates a temporary handler that "looks" for a "matching" incoming message, then sends the packet to the server and waits for the response to arrive within the specified timeout.
    How do we describe "matching"? By defining a "trigger" function that tests an incoming message against some condition. For requests, "matching" means that the response message has the same id attribute as the request. So send_sync_packet/3 does just that: it creates a function that looks at the id attribute of each incoming message and, if it equals the request id, signals the client session that the response has arrived.
    So, as we can see, send_sync_packet/3 can be treated as "send iq and wait for response", which makes it somewhat close to the sendIQ function in Strophe.js.
    If your "matching criteria" differ from simple id matching, you can use send_sync_packet/4, which lets you define an arbitrary "trigger" function.
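
    To make the trigger idea concrete, here is a minimal stand-alone sketch of "wait until a trigger function accepts an incoming message, or give up after a timeout". It does not use gen_client or exmpp: stanzas are modelled as hypothetical {stanza, Id, Body} tuples delivered to the calling process's mailbox, the module and function names are made up, and non-matching messages are simply dropped here (in gen_client they would still go to the other handlers).

```erlang
-module(sync_wait_sketch).
-export([wait_for/2, id_trigger/1, demo/0]).

%% Default-style trigger: true when the incoming stanza carries the request id.
%% The {stanza, Id, Body} shape is an assumption made for this sketch.
id_trigger(RequestId) ->
    fun({stanza, Id, _Body}) -> Id =:= RequestId;
       (_Other) -> false
    end.

%% Blocks until a message accepted by Trigger arrives ({ok, Msg}),
%% or until Timeout milliseconds have passed (timeout).
wait_for(Trigger, Timeout) ->
    Deadline = erlang:monotonic_time(millisecond) + Timeout,
    wait_loop(Trigger, Deadline).

wait_loop(Trigger, Deadline) ->
    Left = max(0, Deadline - erlang:monotonic_time(millisecond)),
    receive
        Msg ->
            case Trigger(Msg) of
                true  -> {ok, Msg};
                %% Sketch only: a non-matching message is discarded.
                false -> wait_loop(Trigger, Deadline)
            end
    after Left ->
        timeout
    end.

demo() ->
    self() ! {stanza, "other", presence},
    self() ! {stanza, "req-1", result},
    Hit  = wait_for(id_trigger("req-1"), 1000),
    Miss = wait_for(id_trigger("req-2"), 50),
    {Hit, Miss}.
```

    Passing a different fun instead of id_trigger/1 corresponds to the send_sync_packet/4 case, where the caller supplies its own matching condition.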

    Important: even though the calling process is blocked, the active handlers still work, because they are called in a separate (exmpp controlling) process. In other words, while your main process waits for the response to a particular request, other kinds of incoming messages are still handled in parallel.

    That's all for now. Make sure to check out the code, and please let me know what you think.

    Saturday, July 24, 2010

    The gen_client v0.9: new design and features

    The gen_client underwent a lot of changes over the past few months.  The most significant are:
    • Plugin framework added;
    • Startup options introduced. The ones that work now are:
    1. "presence" - specifies whether presence should be sent after logging in;
    2. "reconnect" - auto-reconnect;
    3. "script" - arbitrary function to call right after the connection was established; the default is exmpp:login/1.
    Other options coming soon: mechanisms (e.g. BASIC, PLAIN etc.), connection types (e.g. TCP, BOSH, SSL).
    • Dynamic handlers implementation was switched to gen_event;
    • The code can now be built with faxien, in addition to Emakefile.
    I have to mention that the API has changed almost entirely, just in case you used previous versions.
    Check out the example of using the API in test_gen_client:test/0. There are 2 Jabber accounts that I opened in order to make it easier to try it live. Here is what happens:
    1. The first account signs in, announces the presence and adds "disco" plugin with user-defined content (implemented in test_disco module).
    2. The second  account signs in, announces presence and sends two synchronous requests to the first account to retrieve "disco_info" and "disco_items".
    test_gen_client:test() output
    Below is the code for test/0:

    test() ->
     {ok, Client1} = gen_client:start("", "", 5222, "test", 
                      [{debug, true}, {presence, {true, "I'm online."}}, {reconnect, 15000}]), 
     gen_client:add_plugin(Client1, disco_plugin, [test_disco, []]),
      %% Log in with 2nd client and send discovery request to 1st client
     {ok, Client2} = gen_client:start("", "", 5222, "test", 
                      [{debug, true}, {presence, {true, "I'm online."}}, {reconnect, 15000}]),
     %% We want to know what resource the first client was assigned, as disco requests should be sent to a particular resource
     Jid1 = gen_client:get_client_jid(Client1),
     %% We need to convert it to string to comply with exmpp_client_disco calls 
     Jid1Str = exmpp_jid:to_list(Jid1),
     {ok, Info} = gen_client:send_sync_packet(Client2, exmpp_client_disco:info(Jid1Str), 10000),
     io:format("Disco info from gen_client:~p~n", [gen_client:get_xml(Info)]),
     {ok, Items} = gen_client:send_sync_packet(Client2, exmpp_client_disco:items(Jid1Str), 10000),
     io:format("Disco items from gen_client:~p~n", [gen_client:get_xml(Items)]).

    The gen_client code can be obtained here.

    Coming soon: adhoc commands plugin with examples. 

    Thursday, April 15, 2010

    The end of cross-domain hassle for BOSH

    Exciting news for anyone who is using BOSH with ejabberd: since release 2.1.3 you don't have to proxy your http-bind link anymore. Imagine: no nginx configuration, no Tape, just point exactly to where your http-bind link is. This may look like rather insignificant news, but I believe it will bring XMPP development to a new level of acceptance.
    I first learned about it from Jack Moffitt's site. When I had a chance, I installed ejabberd 2.1.3 on AWS and fired up a test web page that logs in using Strophe right from my local box. The habit of crafting the nginx configuration first had grown so strong over time that I didn't actually expect it to work. Fortunately, ejabberd proved me wrong: it works, and it is much better and faster without proxying (which is logical to expect).
    So if you read a tutorial on how to set up BOSH with ejabberd, you can now skip the whole part about nginx, cross-domain limitations and proxying. In your Strophe code, instead of doing, for example:
    var BOSH_SERVICE = "/http-bind";
    you do:
    var BOSH_SERVICE = "http://your_ejabberd_server:5280/http-bind";

    The consequences are many; the most important, I think, is how easy it becomes to embed BOSH anywhere.

    Saturday, April 10, 2010

    Quick fix for digest authentication

    On the rare occasions when I can't find Erlang code for things that have long been available in other languages, my pride in being an Erlang programmer takes it very personally. One of these times came recently, when I had to call a web service protected by digest authentication from my code. I couldn't find it anywhere, and I probably spent more time trying to google the solution than I did writing the code, which is available here. Be forewarned that it was tested with only one particular web service, and it by no means tries to implement the full spec. It was basically written by reading Wikipedia. The hex/1 function was ripped from the ejabberd code base, which implements the digest check on the server side. Enjoy!
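
    For readers who just want to see the shape of the computation, here is a minimal sketch of the client-side digest response for the qop=auth case, as described in RFC 2617. It is not the code linked above: the module and function names are made up for illustration, hex/1 is a local helper rather than ejabberd's, and parsing the server's challenge and building the Authorization header are left out.

```erlang
-module(digest_sketch).
-export([response/8, hex/1, demo/0]).

%% Lowercase hex encoding of a binary (local helper, not ejabberd's hex/1).
hex(Bin) ->
    lists:flatten([io_lib:format("~2.16.0b", [B]) || <<B>> <= Bin]).

%% MD5 over the parts joined with ":", returned as a hex string.
md5_hex(Parts) ->
    hex(erlang:md5(lists:join(":", Parts))).

%% Digest response for qop=auth:
%%   HA1      = MD5(user:realm:password)
%%   HA2      = MD5(method:uri)
%%   response = MD5(HA1:nonce:nc:cnonce:auth:HA2)
response(User, Realm, Password, Method, Uri, Nonce, Nc, CNonce) ->
    HA1 = md5_hex([User, Realm, Password]),
    HA2 = md5_hex([Method, Uri]),
    md5_hex([HA1, Nonce, Nc, CNonce, "auth", HA2]).

%% The worked example from RFC 2617, section 3.5.
demo() ->
    response("Mufasa", "testrealm@host.com", "Circle Of Life",
             "GET", "/dir/index.html",
             "dcd98b7102dd2f0e8b11d0f600bfb0c093", "00000001", "0a4f113b").
```

    With the RFC's sample values, demo/0 yields 6629fae49393a05397450978507c4ef1, the response value given in that example.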

    Thursday, February 11, 2010

    gen_client example walk-through

    In this post I will show and comment on the code for the XMPP bot that solves the problem of temporary subscriptions. Please see the previous post for an explanation of the problem.

    The code uses the gen_client behaviour, an extension of the exmpp library. You can find the source code for gen_client here.

    First, we start the client process, using one of the variations of gen_client:start (line 19). The last two parameters are a module and its parameters. The module must implement the gen_client behaviour, and that's how you create your own client logic.
    This example uses the dummy_client module, which we could consider a "next to minimal" implementation of gen_client: all it does is send "available" presence upon start (run/2 callback) and "unavailable" presence upon termination (stop/1 callback). How useful is this? The answer: we can add functionality to the callback module at any time during run-time, using either add_handler/2 or variations of send_sync_packet with attached triggers.

    Let's look at the code for some examples:

    Lines 20 through 34: we are dynamically adding a handler for monitoring JIDs going offline. Once the JID sends "unavailable" presence, the handler cancels all JID's subscriptions.
    As long as tidy_bot is on duty, there will be no mess anymore!

    Lines 37 through 52: we also want to clean up the mess that may have been created while tidy_bot wasn't online. So we obtain a list of subscriptions using a synchronized request (line 118), and for every such subscription a presence probe is sent to its owner (line 73). Subscriptions whose owners do not respond to the probe are canceled. The probe request is a synchronized request with a trigger function.
    Note how we get the responses to both synchronized requests in the same process.
    Because sync calls are a distinctive feature of gen_client, let me explain them in some more detail:

    The send_sync_packet/4 variation uses a "trigger function" (specified by the 3rd parameter), which takes each incoming stanza and returns true if this stanza carries a relevant response. This triggers the end of processing for the synchronized request, and the calling process receives {ok, IncomingStanza}. If, on the other hand, there was no "triggering" stanza during the timeout period (specified by the 4th parameter of send_sync_packet/4), the calling process receives the timeout atom.
    In our case, the trigger function expects to see "available" presence for the JID we've sent the probe to (line 57). If this happens within the timeout interval (4 seconds, as defined by this send_sync_packet call), the calling process receives {ok, PresenceMessage}; otherwise it receives timeout.
    Note that because of the way the trigger function in this particular case was constructed, the calling process doesn't even need to analyze the message itself.

    The simpler variation, send_sync_packet/3, does not define a trigger function. In this case, the default "trigger function" is used. It just tries to match the identifiers of incoming and outgoing stanzas. So send_sync_packet/3 is most appropriate for sending IQ stanzas, where the response almost always matches the request by identifier.

    There is also a case of an asynchronous request (line 88). In this particular case, it's a "fire and forget" approach, i.e. we don't want to analyze the response. If we did, we could do it by assigning a handler function, as you've already seen.

    It's important to note that synchronous requests only block the calling process, but not the handling of incoming stanzas. So, taking our example, the handler we have set up for monitoring "unavailable" presence will still be operational while the probe requests get sent. Moreover, every handler call spawns a separate process, so incoming stanzas don't wait for prior ones to be processed.

    A little more about handlers and triggers. gen_client internally applies each of the added handlers to incoming stanzas. Triggers implicitly add specialized handlers that "wrap" the trigger function into an appropriate call. After a handler has been applied to a stanza, gen_client decides either to keep the handler for subsequent processing or to dispose of it. Trigger-based handlers always get disposed of, since their purpose is to serve a single synchronous request. As for other handlers, the rule is that a handler is applied to all incoming stanzas until it returns the stop atom, at which point it is disposed of. It is also possible to remove a handler explicitly (gen_client:remove_handler/2), if you saved the handler reference returned by add_handler/2.

    I want to take the opportunity to thank Jean-Lou Dupont for his excellent erlang syntax highlighter, which I'm using below. 

    %% Author: bokner
    %% Created: Feb 3, 2010
    %% Description: Monitors and clears temporary pubsub subscriptions.
    %% Include files
    %% Exported Functions
    %% API Functions
    tidy_subscriptions(Jid, Password, Host, Port, PubSub) ->
     {ok, Session, _C} = gen_client:start(Jid, Host, Port, Password, dummy_client, ["On tidy duty"]),
     %% Handler: when a JID other than the bot's own goes offline, cancel its subscriptions
     JidOfflineHandler =
      fun(#received_packet{packet_type = presence, type_attr = "unavailable", from = PeerJid}, #client_state{jid = BotJid} = _State) when BotJid /= PeerJid ->
         {Node, Domain, _Resource} = PeerJid,
         case exmpp_jid:bare_compare(BotJid, exmpp_jid:make(Node, Domain)) of
          false ->
           io:format("~p gone offline~n", [PeerJid]),
           unsubscribe_from_all_nodes(Session, PeerJid, PubSub);
          _Other ->
           ok %% assumed return value: the original line is missing
         end;
        (_Other, _Session) ->
         ok %% assumed return value: the original line is missing
      end,
     gen_client:add_handler(Session, JidOfflineHandler),
     %% Get subscriptions
     process_subscriptions( %% assumed call, inferred from process_subscriptions/3 below
      Session, PubSub,
      fun(SubscriptionList) ->
         lists:foreach(fun(S) ->
                   spawn( %% assumed: the inner fun is run in its own process
                   fun() ->
                      unsubscribe_temporary(Session, PubSub,
                                 exmpp_xml:get_attribute(S, "jid", undefined),
                                 exmpp_xml:get_attribute(S, "node", undefined),
                                 exmpp_xml:get_attribute(S, "subid", undefined)
                   ) end)
               end, SubscriptionList)
      end).

    unsubscribe_temporary(Session, PubSub, Jid, Node, _Subid) ->
     %% Prepare handler for presence: the trigger returns true only for
     %% "available" presence coming from the JID we probed
     ProbeSuccessfull = fun(#received_packet{from = FullJid, packet_type = presence, type_attr = "available"}, _State) ->
                 {Acc, Domain, Resource} = FullJid,
                 case exmpp_jid:parse(Jid) of
                  {jid, Jid, Acc, Domain, Resource} ->
                   io:format("probe matches for ~p~n", [FullJid]),
                   true;
                  _NoMatch ->
                   io:format("probe doesn't match for ~p, ~p~n", [Jid, FullJid]),
                   false
                 end;
                (_NonPresence, _State) ->
                 false
               end,
     %% Send presence probe
     io:format("Sending probe to ~p:~n", [Jid]),
     ProbeResult = gen_client:send_sync_packet(Session, exmpp_stanza:set_recipient(
                           exmpp_presence:probe(), Jid), ProbeSuccessfull, 4000),
     io:format("result of probe for ~p:~n~p~n", [Jid, ProbeResult]),
     case ProbeResult of
      timeout ->
       %% No "available" presence within 4 seconds: the owner is gone
       unsubscribe_from_node(Session, Jid, Node, PubSub);
      {ok, #received_packet{type_attr = Type}} ->
       io:format("Probe:~p:~p~n", [Jid, Type]);
      Other ->
       io:format("Unexpected:~p~n", [Other])
     end.

    unsubscribe_from_node(Session, Jid, Node, PubSub) ->
     io:format("Unsubscribing ~p from ~p...", [Jid, Node]),
     %% Asynchronous, "fire and forget" request
     gen_client:send_packet(Session, exmpp_client_pubsub:unsubscribe(Jid, PubSub, Node)).

    unsubscribe_from_all_nodes(Session, {Acc, Domain, Resource} = Jid, PubSub) ->
     io:format("Unsubscribing ~p~n", [Jid]),
     process_subscriptions( %% assumed call, inferred from process_subscriptions/3 below
      Session, PubSub,
      fun(SList) ->
         lists:foreach(fun(S) ->
                   spawn( %% assumed: the inner fun is run in its own process
                   fun() ->
                      JidAttr = exmpp_xml:get_attribute(S, "jid", undefined),
                      case JidAttr == exmpp_jid:to_binary(Acc, Domain, Resource) of
                       true ->
                        unsubscribe_from_node(Session, JidAttr, exmpp_xml:get_attribute(S, "node", undefined), PubSub);
                       false ->
                        ok %% assumed return value: the original line is missing
                      end
                   end)
               end, SList)
      end).

    process_subscriptions(Session, PubSub, Fun) ->
     {ok, SubscriptionPacket} = gen_client:send_sync_packet(Session, exmpp_client_pubsub:get_subscriptions(PubSub), 5000),
     %%io:format("Subscriptions:~p~n", [SubscriptionPacket]),
     Payload = exmpp_iq:get_payload(exmpp_iq:xmlel_to_iq(SubscriptionPacket#received_packet.raw_packet)),
     %% Assumed: the <subscription/> children are handed to Fun as a list
     Fun(exmpp_xml:get_elements(
       exmpp_xml:get_element(Payload, "subscriptions"),
       "subscription")).

    Wednesday, February 10, 2010

    gen_client in action: fixing temporary subscriptions

    In this post I'd like to discuss one particular pubsub challenge, namely temporary subscriptions. I will also use this discussion as an opportunity to show some gen_client code that was written in order to deal with this challenge.

    Let's assume we have a web-based XMPP weather service available for free public access, so everyone can come to our site, choose places and watch weather data updated in real time. This is what pubsub is designed for: data streams are publishers and clients are subscribers; each data stream publishes to its respective node, and the clients subscribe to the nodes they are interested in. Now comes the interesting part: we expect a significant number of clients to use our service, so we don't want to create an account for each client. Instead, clients will be automatically signed in with a shared account and assigned a random resource name. XEP-0060 allows subscriptions based on full JIDs, so each client will still have its own subscriptions. However, using random resource names implies temporary subscriptions, because the moment a client signs out, the resource name it was using becomes unusable, and so do its subscriptions.

    So how do we deal with temporary subscriptions? XEP-0060 (1.13rc13, p. 12.4) describes how it should be: once a subscriber goes offline, the temporary subscription gets canceled. Unfortunately, it looks like ejabberd (v. 2.1.2 at the time of writing) doesn't have this working yet; at least I was unable to configure temporary subscriptions the way XEP-0060 suggests. This means that once a resource is gone, its subscriptions are still hanging around. The bad (very bad) thing about it, not to mention excessive memory consumption, is that ejabberd will push data meant for these orphaned subscriptions to the resources of the same account that are still online. This will be an absolute mess: even if your client code is smart enough to filter out foreign subscriptions (possibly by matching subscription identifiers), enormous traffic will be generated pretty soon. Remember, we have a single shared account for all our clients. Conclusion: we absolutely have to find a way to get rid of orphaned subscriptions.

    Here is the source code of a bot (using gen_client) that supports temporary subscriptions by monitoring resources and getting rid of subscriptions the moment the resource they belong to goes offline. Additionally, it cleans up such subscriptions on startup. I'm planning to do a code walk-through and explain the gen_client capabilities shown in the code in one of the posts following shortly.

    This is, of course, a temporary solution and should only be used if your XMPP server doesn't support temporary subscriptions.

    I still have a feeling that there might be an easier solution, but I haven't found any practical cases of using temporary subscriptions, so if anyone has related experience, please come forward and share it here.

    Thursday, January 28, 2010

    gen_client behaviour for building XMPP clients in Erlang.

    The gen_client project aims to provide a structured way to write XMPP client code in Erlang. The framework relies heavily on exmpp, but also borrows some ideas from my favorite Strophe JavaScript library. The objective of the project is to create a set of generic behaviours and let a client developer "fill in the blanks", i.e. implement callback functions in pretty much the same fashion as code based on OTP/Erlang behaviours is written.
    Why not just use exmpp, one might ask? Sure you can. However, going from basic examples to decently capable client code is not so easy in exmpp. The motivation behind gen_client is to make coding XMPP in Erlang as effortless as Strophe does for JavaScript.
    One example of what I mean by "effortless" is how exmpp controls the handling of the incoming stream. By default, sending and receiving stanzas happens in a single process. Clearly, this is not very useful unless your XMPP client is happy with a "question-answer" flow (as in the echo_client.erl example from the exmpp distribution), as opposed to an asynchronous flow.
    Of course, exmpp has a means to assign a separate process for handling the incoming stream (exmpp_session:set_controlling_process/2). However, it would be nice to have this as the default, which is what gen_client does.
    Continuing with this, sometimes you may need synchronous handling. For instance, you may have to search through the whole tree of pubsub nodes, do some calculations and send the results elsewhere. While this kind of task can be coded using callbacks, that makes the code much harder to deal with compared to a sequential style. With gen_client, you can choose between asynchronous (gen_client:send_packet) and synchronous (gen_client:send_sync_packet) requests.

    And of course, each incoming stanza is handled by gen_client in a spawned process, so your client can do many things at once - we are using Erlang for a reason, right?

    To get started with gen_client, write a module that implements the gen_client behaviour. And off you go:
    gen_client:start(Username,  Domain,  Host,  Port,  Password, Module, [ModuleArgs]).

    Summary of features that are already there:

    • Simultaneous handling of multiple incoming stanzas;
    • Synchronous and asynchronous requests;
    • Attaching IQ/presence/message handlers at runtime (somewhat similar to Strophe's addHandler style);
    • Support for ad-hoc commands (XEP-0050) and service discovery (XEP-0030);
    • Compatibility with exmpp and hence ability to reuse its codebase.

    Documentation and examples will follow, time permitting. This is a work in progress, mostly experimental, so please use it with caution. The usual disclaimers apply. Please share your thoughts and ask questions, if any. This code is being used in real projects, so I appreciate any feedback from you as a means of moving gen_client toward production quality.