My adventures with the Lync Edge AV service

Standard

A call came in last week from a customer that’s using Lync 2013 on premises and Exchange Online. “We can’t reach our voicemail anymore”. Lync to Lync calls, PSTN to Lync calls, none could be forwarded to the Office 365 UM services. Funny enough, if I wanted to leave them a voicemail (being a federated partner), I managed to do so without any problem.

That did make sense in a way, since I’m being forwarded by Office 365 to contact the UM servers directly when using Lync. Then again, this would not solve the internal issues.

First discovery I made was that the Edge server in unable to resolve any external DNS queries. Having some firewall changes lately, I blamed it on the firewall and waited for that to be tested again. Indeed, something was preventing the Edge from sending DNS queries to the internet. That’s fixed now, but still – same issue. Additionally, it was not affecting only communications with the Office 365 UM service, all external communications that required the usage of the AV service failed.

Needless to say, throughout the entire process – No errors on the Edge server. Management store replication is ok, certificates are ok, server is patched, restarted – It’s all dandy and happy in Edge kingdom.

I insisted on rechecking all the firewall rules again – all seemed to be in place. I used James Cussen‘s great tool to test Edge connectivity – All results were successful.

After examining the UCCAPI log files of the clients and tracing the Edge server’s logs – everything was still ‘working’ as far as the Edge server was concerned. We could see the SIP traffic working perfectly (plus, we had IM and presence functioning), and the sessions would only drop as soon as the other party “picked up” the call.

This is where things are getting a little interesting. Back to Networking 101, if I’m testing a TCP connection – I will only accept the session as “Successful” if the handshake is completed:

TCP Handshake

This means that if the service is not responding, I will not get the server’s ACK, and the connection will time out.

When using UDP, it’s a different story:

UDP

So “testing” a UDP service might be a little tricky…

This had me suspicious about the AV service, being the one in charge of our RTP traffic.

With no other options left, I started tracing the actual UDP sessions.
Here’s how it looks when the AV service is not cooperating:

lync.exe 172.25.20.99 AV.Edge.com TURN TURN:TURN:Allocate Request {UDP:59, IPv4:58}
lync.exe 172.25.20.99 AV.Edge.com TCP TCP:Flags=......S., SrcPort=10963, DstPort=HTTPS(443),
lync.exe AV.Edge.com 172.25.20.99 TCP TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=10963,
lync.exe 172.25.20.99 AV.Edge.com TCP TCP:Flags=......S., SrcPort=10851, DstPort=HTTPS(443),
lync.exe AV.Edge.com 172.25.20.99 TCP TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=10851,
lync.exe 172.25.20.99 AV.Edge.com TURN TURN:TURN:Allocate Request {UDP:59, IPv4:58}
lync.exe 172.25.20.99 AV.Edge.com TURN TURN:TURN:Allocate Request {UDP:59, IPv4:58}
lync.exe 172.25.20.99 AV.Edge.com TCP TCP:Flags=......S., SrcPort=10963, DstPort=HTTPS(443),
lync.exe AV.Edge.com 172.25.20.99 TCP TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=10963,

Digging a little dipper into the TURN Allocate Request, we can see all the right details:

Frame: Number = 194, Captured Frame Length = 232, MediaType = WiFi
+ WiFi: [Unencrypted Data] .T....., (I)
+ LLC: Unnumbered(U) Frame, Command Frame, SSAP = SNAP(Sub-Network Access Protocol), DSAP = SNAP(Sub-Network Access Protocol)
+ Snap: EtherType = Internet IP (IPv4), OrgCode = XEROX CORPORATION
+ Ipv4: Src = 172.25.20.99, Dest = AV.Edge.com, Next Protocol = UDP, Packet ID = 22808, Total IP Length = 168
+ Udp: SrcPort = 8588, DstPort = 3478, Length = 148
+ TURN: TURN:Allocate Request

This is where I should be getting back a TURN:Allocate Response from the application. Yet, no reply.

Tried stopping the Edge AV service – it said “Stopping” for 30 minutes but never stopped, even when using the -Force switch. Trying to kill the process and the task was unsuccessful either.
This is where I tried to remove the Lync Edge components from “Programs and Features”. This failed as well, saying there was a problem with the “Lync Server Media Relay Driver” on the Local Area Connection interface.
Immediately went to “Network Connections” and what do you know?! This is what I see:

Media Relay Driver

I uninstalled it, ran Bootstrapper again, and retried the connection. The result was clear:

lync.exe 172.25.20.99 AV.Edge.com TURN TURN:TURN:Allocate Request {UDP:40, IPv4:39}
lync.exe 172.25.20.99 AV.Edge.com TURN TURN:TURN:Allocate Request {UDP:45, IPv4:39}
lync.exe 172.25.20.99 AV.Edge.com TLS TLS:TLS Rec Layer-1 HandShake: Client Hello. {TLS:47, SSLVersionSelector:46, TCP:44, IPv4:39}
lync.exe AV.Edge.com 172.25.20.99 TURN TURN:Control message, TURN:Allocate Error Response {TCP:41, IPv4:39}
lync.exe AV.Edge.com 172.25.20.99 TURN TURN:TURN:Allocate Error Response {UDP:45, IPv4:39}
lync.exe AV.Edge.com 172.25.20.99 TURN TURN:TURN:Allocate Response {UDP:40, IPv4:39}
lync.exe AV.Edge.com 172.25.20.99 TLS TLS:TLS Rec Layer-1 HandShake: Server Hello. Server Hello Done. {TLS:47, SSLVersionSelector:46, TCP:44, IPv4:39}

Almost Immediatley you can see that the application is responding and we can get both the TURN:Allocate Response and the TLS sessions complete.

Remember this next time you’re having issues with the Lync Edge AV service.

Advertisements

6 thoughts on “My adventures with the Lync Edge AV service

  1. soder

    Hi y0av,

    I dont really see the explanation why the screnshot showed any issue here? Isnt the Lync Server Media relay service supposed to be installed and checked on a Lync edge server or what?

    • Hi soder,

      The Lync Server Media relay service should be installed and checked on the edge server, and should respond as well.
      The first screenshot shows how it looks like when it’s not responding, the second shows how it looks when it’s responding.
      In this case, the solution was to reinstall the Lync Server Media relay service.

      • soder

        Sorry, I wasnt clear enough in my previous post: I referred to the last screenshot which shows the Windows Network interface card properties with the Lync Server Media Relay driver enabled (red ink square highlighting it in the “This connection uses the following items”. What is not clear for me is that this component should be CHECKED or UNCHECKED, based on the description in your blogpost it is not evident. I know it should be checked, but for a novice reader the purpose of that screenshot is not evident.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s