Problem

Pipelines that read from or write to Microsoft SQL Server fail with the following error:

Code Block
Error: "Socket is closed". ClientConnectionId:<ID>

Symptom

Pipelines that are configured to run on Dataproc and that read from or write to SQL Server fail at run time with a "Socket is closed" exception. The complete error message is as follows:

Code Block
java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "Socket is closed". ClientConnectionId:<ID>
	at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:171) ~[hadoop-mapreduce-client-core-2.8.5.jar:na]

This happens because Dataproc uses the Conscrypt SSL provider by default, and that provider has a bug when creating the SSL context.
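To check whether a failed run matches this symptom, the pipeline log can be searched for the two marker strings from the error above. The following is a minimal sketch only: the function name `matches_socket_closed_symptom` and the `MARKERS` tuple are hypothetical helpers, not part of any product API, and the marker strings are copied from the error message in this article.

```python
# Marker strings copied from the error message in this article.
MARKERS = (
    "com.microsoft.sqlserver.jdbc.SQLServerException",
    'Error: "Socket is closed"',
)


def matches_socket_closed_symptom(log_text):
    """Return True if log_text contains every marker of the SQL Server SSL failure.

    Hypothetical helper for illustration; it only performs substring checks.
    """
    return all(marker in log_text for marker in MARKERS)
```

A match suggests, but does not prove, that the failure is the Conscrypt issue described here; other causes can also close the socket during the SSL handshake.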

Solution

To fix the issue, disable Conscrypt when the Dataproc cluster is created. This can be done by setting the following runtime argument for the pipeline:

Code Block
system.profile.properties.dataproc:dataproc.conscrypt.provider.enable false
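When runtime arguments are assembled programmatically rather than typed into the UI, the same key and value can be merged into an existing argument map. This is a sketch under assumptions: the helper name `with_conscrypt_disabled` and the example argument `input.path` are hypothetical, and only the property name and the value `false` come from this article. How the resulting map is submitted to the pipeline is not shown.

```python
import json

# Property name taken from this article; the value "false" disables Conscrypt.
CONSCRYPT_KEY = "system.profile.properties.dataproc:dataproc.conscrypt.provider.enable"


def with_conscrypt_disabled(runtime_args=None):
    """Return a copy of runtime_args with the Conscrypt provider disabled.

    Hypothetical helper: the caller's dictionary is not modified.
    """
    merged = dict(runtime_args or {})
    merged[CONSCRYPT_KEY] = "false"
    return merged


if __name__ == "__main__":
    # "input.path" is a made-up argument used only to show that existing keys survive.
    print(json.dumps(with_conscrypt_disabled({"input.path": "/tmp/in"}), indent=2))
```

The value is kept as the string `"false"` rather than a boolean, on the assumption that runtime arguments are passed as string key-value pairs.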

The following screenshot shows how to set this runtime argument for a pipeline in the UI:

[Screenshot: setting the runtime argument in the pipeline UI]
