Colin Coller's Blog

March 8, 2006

WebException with SecureChannelFailure

Finally some .NET content:

I've written a Windows service that runs on several Windows XP tablet PCs, and a Web service that's hosted on several Windows Server 2003 servers. Both services are written in C# with .NET 1.1 SP1. The Windows service calls methods on the Web service using HTTPS. Cisco SSL blades sit between the tablets and the servers: the tablets speak HTTPS to the blades, and the blades speak HTTP to the servers.

Most of the time everything works properly. Sometimes, though, communication between the Windows service and the Web service breaks down, and every call results in a WebException with SecureChannelFailure:

02/16/2006 12:06:36
Type : System.Net.WebException, System, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Message : The underlying connection was closed: Could not establish secure channel for SSL/TLS.
Source : System.Web.Services
Help link : 
Status : SecureChannelFailure
Response : 
TargetSite : System.Net.WebResponse GetWebResponse(System.Net.WebRequest)
Stack Trace :    
   at System.Web.Services.Protocols.WebClientProtocol.GetWebResponse(WebRequest request)
   at System.Web.Services.Protocols.HttpWebClientProtocol.GetWebResponse(WebRequest request)
   at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
   at
	... 
	Inner Exception
	---------------
	Type : System.ComponentModel.Win32Exception, System, Version=1.0.5000.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
	Message : The message or signature supplied for verification has been altered
	Source : System
	Help link : 
	NativeErrorCode : -2146893041
	ErrorCode : -2147467259
	TargetSite : Int32 EndRead(System.IAsyncResult)
	Stack Trace :    at System.Net.TlsStream.EndRead(IAsyncResult asyncResult)
	   at System.Net.Connection.ReadCallback(IAsyncResult asyncResult)

The native error code (-2146893041, or 0x8009030F as an unsigned HRESULT) corresponds to SEC_E_MESSAGE_ALTERED.
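For anyone matching these numbers against the Windows headers: the exception reports error codes as signed 32-bit integers, while the SSPI constants are usually listed in hex. A minimal sketch of the conversion follows, written in Python for brevity rather than the C# the services use. The helper name `to_hresult` is made up for illustration.

```python
def to_hresult(code: int) -> str:
    """Format a signed 32-bit error code as an unsigned hex HRESULT."""
    # Masking with 0xFFFFFFFF reinterprets the two's-complement value as unsigned.
    return f"0x{code & 0xFFFFFFFF:08X}"


# The two codes from the inner exception above.
native_error_code = -2146893041
error_code = -2147467259

print(to_hresult(native_error_code))  # 0x8009030F (SEC_E_MESSAGE_ALTERED)
print(to_hresult(error_code))         # 0x80004005 (E_FAIL)
```

The second value is the generic E_FAIL, so only the native error code carries useful information here.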

We've tried the following things to troubleshoot the problem:

  1. Restart the Windows service.
    No effect.
  2. Restart Windows on the tablet.
    No effect.
  3. Check to see if a proxy server is configured.
    There is no proxy server configured.
  4. Access the Web service URL from the tablet PC with Internet Explorer.
    We can successfully access the Web service using Internet Explorer. There are no delays, no certificate warnings or errors, and no other obvious errors that would explain the exception. The server certificate details appear to be correct.
  5. Access the Web service URL from the tablet PC with another .NET application.
    We can successfully access the Web service using both sockets and WebRequest. There are no delays or errors.
  6. Research the problem.
    There's nothing on SecureChannelFailure in the KB. There are very few posts about it on Google News, and the few people we've been able to contact about their posts have told us "I never figured it out, I just added retries" or "I never figured it out, it went away in production". We haven't found anything helpful.
  7. Disable HTTP keep-alive.
    This is a common cure for Web service ailments. We overrode GetWebRequest on the proxy class and set HttpWebRequest.KeepAlive to false. No effect.
  8. Override certificate policy and log non-zero problem codes.
    This is a common cure for certificate-related SSL ailments. We created a class whose ICertificatePolicy.CheckValidationResult method logged any non-zero problem codes. Nothing is ever logged.
  9. Synchronize all Web service calls on a shared object.
    The idea was to rule out concurrency issues. No effect.
  10. Disable TLS and SSL2 in Internet Explorer, and explicitly set ServicePointManager.SecurityProtocol to SecurityProtocolType.Ssl3.
    The idea was to rule out protocol mismatch, however unlikely. No effect.
  11. Trace network traffic between tablet and server.
    When the Windows service makes an unsuccessful Web service request, the following occurs:
    • The tablet sends encrypted data to the SSL blade.
    • The SSL blade decrypts some data and sends it to the application server. The decrypted data is the beginning of a normal Web service request.
    • [Something happens, but we can't tell what, because the data is encrypted and we don't have access to the SSL blade.]
    • The SSL blade closes the connection to the application server, which attempts to respond to the request, fails because the request has been truncated, and returns HTTP 400.
    • The SSL blade sends an encrypted SSL alert to the tablet and then closes that connection. The tablet throws the exception.
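Step 6 mentions that some people worked around the failure by adding retries. For reference, a minimal sketch of that kind of wrapper follows, in Python rather than the C# of the actual service. The function name, its parameters, and the choice of `ConnectionError` as the retryable exception are all assumptions for illustration. In the real service the equivalent would catch a WebException whose Status is SecureChannelFailure.

```python
import time


def call_with_retries(call, attempts=3, delay_seconds=0.0,
                      retry_on=(ConnectionError,)):
    """Invoke call() up to `attempts` times, retrying on the given exceptions.

    Returns the first successful result, or re-raises the last failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on as error:
            last_error = error
            # Pause between attempts, but not after the final one.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

Since every call fails once a tablet gets into the bad state, retries alone would probably not help in this case. They only mask failures that are intermittent per request.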

Making things more confusing:

  • While restarting the Windows service has no effect, stopping the service, overwriting its executable with an identical backup copy, and starting it again temporarily resolves the issue. In fact, it works so reliably that my client is seriously considering writing a batch file that automates this process. I call it voodoo.bat.
  • While restarting Windows on the tablet has no effect, if we take an image of the old tablet and apply it to a new tablet with identical hardware, the old tablet won't be able to connect but the new one will.
  • We're only experiencing the problem in production, not in any of our other environments.
  • Most of the tablets work properly, and some that don't work can be made to work again not by changing the contents of the request, but by voodoo.
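Before trusting the "identical backup copy" part of the voodoo, it may be worth confirming that the two executables really are byte-for-byte identical. A minimal sketch in Python follows. The helper names are made up, and the paths in the usage note are placeholders, not the real install locations.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_identical(path_a, path_b):
    """True if both files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `files_identical(r"C:\path\to\Service.exe", r"C:\path\to\Service.exe.bak")` (hypothetical paths) on a failing tablet before overwriting would show whether the file contents change at all. If they match, whatever the overwrite fixes lies outside the file's contents.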

Making things painful:

  • Everything is encrypted, so we can't tell if the data is valid or not.
  • The SSL blades are production hardware, so changing their logging level or putting them into diagnostic mode requires change requests, arranging for someone to work after hours, red tape, and hassles.
  • Even if we can isolate the problem to the tablet or the SSL blades, we may never know the root cause of the problem or find a solution.

If anyone reading this has any experience calling methods on Web services using HTTPS, or if anyone is experiencing the same problem, feel free to post experiences or suggestions here. I'll update this post as things progress.

Colin

07:35 PM | Colin

TrackBacks

# Hello World Hello World!  My name is Colin Coller, and I'm a Solution Developer (SD) at Avanade Canada. ...

01:19 PM | colinco
