Problem
Pipelines that read from or write to Microsoft SQL Server fail with the following error:
Code Block |
---|
Error: "Socket is closed". ClientConnectionId:<ID> |
Symptom
Pipelines configured to run on Dataproc that either read from or write to SQL Server fail at run time with a "Socket is closed" exception. The complete error message is as follows:
Code Block |
---|
java.lang.RuntimeException: java.lang.RuntimeException: com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using Secure Sockets Layer (SSL) encryption. Error: "Socket is closed". ClientConnectionId:<ID>
at org.apache.hadoop.mapreduce.lib.db.DBInputFormat.setConf(DBInputFormat.java:171) ~[hadoop-mapreduce-client-core-2.8.5.jar:na] |
This happens because Dataproc uses the Conscrypt SSL provider by default, and that provider has a bug that surfaces when an SSL context is created.
Solution
To fix the issue, disable Conscrypt when the Dataproc cluster is created. This can be done by setting the following runtime argument for the pipeline:
Code Block |
---|
system.profile.properties.dataproc:dataproc.conscrypt.provider.enable false |
The following screenshot shows how to set this for a pipeline using the UI:
[Screenshot: setting the runtime argument for a pipeline in the UI]
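If pipelines are started from a script rather than the UI, the same runtime argument can be merged into whatever argument map the script already sends. The following is a minimal Python sketch, assuming runtime arguments are passed as a JSON map of string keys to string values. The helper name `with_conscrypt_disabled` and the `input.path` argument are hypothetical and used only for illustration; only the Conscrypt key and its `false` value come from this article.

```python
import json

# Runtime argument from the Solution section above.
CONSCRYPT_KEY = "system.profile.properties.dataproc:dataproc.conscrypt.provider.enable"


def with_conscrypt_disabled(runtime_args=None):
    """Return a copy of runtime_args with the Conscrypt provider disabled.

    Hypothetical helper: it leaves the caller's dictionary untouched and
    sets the value as the string "false", matching the key/value form
    used for runtime arguments.
    """
    merged = dict(runtime_args or {})
    merged[CONSCRYPT_KEY] = "false"
    return merged


if __name__ == "__main__":
    # "input.path" is a made-up argument standing in for existing settings.
    args = with_conscrypt_disabled({"input.path": "gs://example-bucket/in"})
    print(json.dumps(args, indent=2))
```

The helper copies the map instead of modifying it in place, so existing argument sets can be reused for pipelines that do not connect to SQL Server.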