Sunday, May 12, 2013

The State of Mobile VoIP, part 3: SIP clients

At this point in time (May 2013) there is an abundance of SIP clients for Android. 'Twas not so a short few years ago.

First, in early 2009 I believe, came sipdroid. I think it was the API additions of the Cupcake release that allowed the software to stream audio to/from the speaker/microphone. sipdroid was a simple client that allowed making basic SIP-to-SIP audio-only calls.

Later, as front-facing cameras became more prevalent, sipdroid added videoconferencing. I think it was the first app with video but I can't say for certain. They were certainly proud of the addition.

Other than this capability, sipdroid is fairly basic. It only allows 2 accounts and it doesn't seem to be possible to dial out of the second account. The second account appears to only be there for receiving calls.

sipdroid has some support for setting up a pbxes.org account, but that's not much of an addition.

The other major SIP client is csipsimple. It is a lot more sophisticated. Its major feature addition is encryption: it can protect the media with SRTP or ZRTP, or protect just the signalling with TLS. Additionally, csipsimple supports multiple accounts. Right from the dialler, you can pick whichever account you wish to use to make the call. It also comes with an account set-up wizard for major providers, and it has both STUN and ICE support. Video conferencing is possible on csipsimple but I don't know the details. The maintainer - Regis Montoya - is very responsive and a generally cool guy.

It used to be that the version of csipsimple in the Android market was useless. It wouldn't make calls, and you had to download the latest nightly version to get anything that worked. However, the market version has been pretty good for a while now, and I think they even just uploaded the blessed 1.0 version.

There is a separate download for the extra CODECs. I would recommend them if the people you talk to use them also, but I personally think the regular CODECs are fine.

The last client I should mention is of course the built-in SIP stack that arrived in Gingerbread (Android 2.3). It is quite basic. It does not integrate with the dialler. All it can do is call other SIP targets. If you've ever seen "Internet call" in the Contacts application, this is what it's for. You enter something like sip:someone@someplace.com and then you can call it. It only supports G.711 (ulaw and alaw). It doesn't support STUN or ICE, although occasionally it will work anyway (I assume when your network "fixes up" the packets for you). Frankly, I place it roughly in the category of "fun toy", like walkie talkies when you were a kid, but I wouldn't rely on it.

The hands-down favorite is csipsimple. It just does everything, and if it doesn't, you can re-configure it. Since it has matured, I've actually been using it for most of my calling.

Which brings us to the next important topic...

Next: SIP providers

Thursday, May 2, 2013

The State of Mobile VoIP, part 2: SIP

I'm going to start this series discussing the granddaddy of videoconferencing - SIP. SIP stands for Session Initiation Protocol, which only does call setup, but the acronym has come to refer to the entire protocol family. The family includes SDP - Session Description Protocol - which describes the media formats (G.711, Speex, etc.) that each client supports - and RTP - Real-time Transport Protocol - which carries the media itself. RTP normally runs on top of UDP and adds a small header with a payload type, a sequence number and a timestamp, so the receiver can reorder packets and play them back at the right time.
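To make the RTP part a bit more concrete, here is a rough Python sketch of the 12-byte fixed RTP header (version, payload type, sequence number, timestamp, and a stream identifier called the SSRC). The function names are my own, and it assumes the simplest case: no padding, no header extension and no contributing sources.

```python
import struct

RTP_VERSION = 2  # the only version in common use


def build_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no padding, extension or CSRCs)."""
    first_byte = RTP_VERSION << 6
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def parse_rtp_header(packet):
    """Unpack the fixed header from the start of an RTP packet."""
    first_byte, second_byte, sequence, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12]
    )
    return {
        "version": first_byte >> 6,
        "marker": bool(second_byte >> 7),
        "payload_type": second_byte & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

For G.711 ulaw the payload type is 0, and with 20 ms packets at 8kHz the timestamp goes up by 160 per packet while the sequence number goes up by 1.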

SIP was designed by data comm guys so it "fits" quite well in the Internet world. It is extensible, unlike the proprietary counterparts. It is fairly simple (despite its reputation). In fact, the packets are ASCII text. SIP typically runs over UDP port 5060.
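Since the packets really are just text, you can put one together with ordinary string handling. The following is a minimal sketch, not a working SIP stack: the function names, addresses and identifiers are made up for illustration, and a real INVITE would normally carry an SDP body rather than an empty one.

```python
def build_invite(caller, callee, local_ip, local_port, call_id, branch, tag):
    """Build a bare-bones SIP INVITE with no body, as a text string."""
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={tag}",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_ip}:{local_port}>",
        "Content-Length: 0",
    ]
    # Lines end in CRLF; an empty line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_message(text):
    """Split a SIP message into its start line and a dict of headers."""
    head, _, _body = text.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers
```

Sending it is then a matter of encoding the string as ASCII and writing it to a UDP socket pointed at port 5060 on the other end.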

Because SIP uses UDP, and because it was designed before NAT became prevalent, SIP has to play games in order to work through a NAT router. When SIP sends out its initiation packet, it includes the address and port on the originating host that media packets should be sent back to. A NAT router rewrites the source address and port of outgoing packets, so the values the host knows about are not the ones the outside world sees. The SIP stack must therefore first figure out which address and port the outside world sees.

There are various protocols for discovering this. Commonly used is STUN - Simple Traversal of UDP through NATs (since renamed Session Traversal Utilities for NAT). First, the stack will send a packet originating from the media port to an external server, typically on port 3478. This server will reply with the address and port that it saw the packet come from, i.e. the address and port that NAT switched the packet to.

When the SIP stack gets this reply, it can put the externally reachable port into the SIP initiation packet. This will allow the called host to send media packets back to the originating host.
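Here is roughly what that exchange looks like in Python. Treat it as a sketch under a few assumptions: it follows the message layout of the newer STUN specification (RFC 5389), it only handles IPv4, the function names are mine, and the server address is whatever (host, port) pair you choose to pass in.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_MAPPED_ADDRESS = 0x0001
ATTR_XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, body length (0), cookie, transaction ID."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_binding_response(data, transaction_id):
    """Return the (ip, port) that the server saw our request come from."""
    if len(data) < 20:
        raise ValueError("response too short")
    msg_type, length, _cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or data[8:20] != transaction_id:
        raise ValueError("not a matching Binding Success Response")
    pos, end = 20, min(20 + length, len(data))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        is_address = attr_type in (ATTR_MAPPED_ADDRESS, ATTR_XOR_MAPPED_ADDRESS)
        if is_address and len(value) >= 8 and value[1] == 0x01:  # IPv4
            port, addr = struct.unpack("!HI", value[2:8])
            if attr_type == ATTR_XOR_MAPPED_ADDRESS:
                # The XOR variant hides the values from over-eager NAT boxes.
                port ^= MAGIC_COOKIE >> 16
                addr ^= MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no IPv4 mapped address in response")


def discover_mapping(sock, server, timeout=2.0):
    """Ask `server`, a (host, port) tuple, how it sees packets from `sock`."""
    transaction_id = os.urandom(12)
    sock.settimeout(timeout)
    sock.sendto(build_binding_request(transaction_id), server)
    data, _ = sock.recvfrom(2048)
    return parse_binding_response(data, transaction_id)
```

The important detail is that `sock` has to be the same UDP socket, bound to the same local port, that will later carry the media. Otherwise the NAT router may hand out a different external port and the answer is useless.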

[ pictures! ]

As mentioned before, a SIP packet will contain an SDP payload. This message lists all the codecs that the caller supports. The called host will reply with the common codec that it prefers, if it accepts the call.
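The codec selection can be sketched in a few lines of Python. This is a simplification with names of my own choosing: it only looks at the audio line of the offer, it knows just three of the static RTP payload types (0 is G.711 ulaw, 3 is GSM, 8 is G.711 alaw), and it assumes the called side simply picks its favourite codec from the ones on offer.

```python
# A few of the well-known static RTP payload type numbers.
STATIC_PAYLOAD_TYPES = {0: "PCMU/8000", 3: "GSM/8000", 8: "PCMA/8000"}


def offered_codecs(sdp):
    """List the audio codecs named in an SDP offer, in the caller's order."""
    payload_types = []
    names = dict(STATIC_PAYLOAD_TYPES)
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # m=audio <port> <transport> <payload type> <payload type> ...
            payload_types = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <codec name>/<clock rate>
            pt, name = line[len("a=rtpmap:"):].split(" ", 1)
            names[int(pt)] = name.strip()
    return [names[pt] for pt in payload_types if pt in names]


def choose_codec(offer_sdp, callee_preferences):
    """Return the callee's most-preferred codec that the caller offered."""
    offered = {codec.lower() for codec in offered_codecs(offer_sdp)}
    for codec in callee_preferences:
        if codec.lower() in offered:
            return codec
    return None  # nothing in common, so the call cannot be accepted
```

So if the offer contains "m=audio 49170 RTP/AVP 0 8 97" plus "a=rtpmap:97 speex/8000", and the called side prefers Speex over ulaw, the answer comes back naming Speex.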

Once the call is established, packets travel directly between the two endstations. This is important, and different from protocols like Skype and Redphone where the packets first travel to a central server and then out to the recipient.

SIP has a variety of ways that it can encode the sound of voice into data. The classic way is to just do what the phone company does: G.711, or 8 bit logarithmic samples at 8kHz, which works out to 64 kbit/s. More advanced methods can produce much better quality at a lower bitrate. Many of them work by modelling the human vocal tract (a technique known as Linear Predictive Coding). These include the granddaddy GSM, and newer codecs such as the CELP-based Speex, and SILK.
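For the curious, the "8 bit logarithmic sample" idea fits in a few lines. The sketch below is a Python version of the widely used G.711 ulaw conversion for signed 16-bit samples; the function names are mine, and it is meant to show the idea rather than to be fast.

```python
BIAS = 0x84   # 132, added so that every magnitude has a leading bit to find
CLIP = 32635  # largest magnitude that still fits after adding the bias

SAMPLE_RATE = 8000                # samples per second
BITRATE = SAMPLE_RATE * 8         # 8 bits per sample: 64000 bit/s


def linear_to_ulaw(sample):
    """Compress one signed 16-bit sample into one ulaw byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The exponent is the position of the highest set bit, which is what
    # makes the scale logarithmic: 3 bits of exponent, 4 bits of mantissa.
    exponent = magnitude.bit_length() - 8
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # The byte is sent inverted on the wire.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte):
    """Expand one ulaw byte back into an approximate 16-bit sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude
```

Quiet samples keep fine detail and loud ones are stored coarsely, which is a reasonable trade for speech: a sample of 1000 comes back as 988, while silence survives exactly.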

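Calls may be encrypted in one of at least two ways - SRTP and ZRTP. SRTP encrypts the media packets themselves, typically with AES. It does not use SSL, although its keys are often exchanged in the SDP, in which case the signalling needs to be protected with TLS. ZRTP instead negotiates the keys with a Diffie-Hellman exchange carried in the media stream itself, and the two callers can read a short authentication string to each other to rule out a man in the middle. No surprise it was invented by Phil Zimmermann, the creator of PGP.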

Next time: SIP clients for Android.