Error Message "Could Not Connect to the Leading JobManager" Is Displayed When a Command Is Executed on the Flink Client
Symptom
When a Flink cluster is being created, the yarn-session.sh command hangs for a while and then displays the following error message:
2018-09-20 22:51:16,842 | WARN | [main] | Unable to get ClusterClient status from Application Client | org.apache.flink.yarn.YarnClusterClient (YarnClusterClient.java:253)
org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:861)
    at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
    at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:516)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:717)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:514)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:511)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:511)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:79)
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:856)
    ... 10 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
Possible Causes
SSL communication encryption is enabled for Flink, but the SSL certificate files (keystore and truststore) are not configured correctly.
Solution
For MRS 2.x or earlier, perform the following operations:
Method 1:
Disable Flink SSL communication encryption by setting the following parameter in the client configuration file conf/flink-conf.yaml:
security.ssl.internal.enabled: false
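Before and after editing the file, it can help to confirm which value the client will actually read. The following is a minimal sketch, not part of the product tooling: flink_conf_get is a hypothetical helper name, and it assumes the configuration file uses flat "key: value" lines like the one above. If a key appears more than once, the sketch prints the last occurrence.

```shell
#!/bin/sh
# Hypothetical helper (not a Flink or MRS command): print the value of one key
# from a flink-conf.yaml file. Assumes flat "key: value" lines.
# Usage: flink_conf_get <conf-file> <key>
flink_conf_get() {
    awk -v key="$2" '
        # Ignore comment lines.
        /^[ \t]*#/ { next }
        {
            # Split the line at the first colon into key and value.
            pos = index($0, ":")
            if (pos == 0) next
            k = substr($0, 1, pos - 1)
            v = substr($0, pos + 1)
            # Trim leading and trailing blanks from both parts.
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            # Remember the value; a later occurrence overwrites an earlier one.
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$1"
}
```

For example, running flink_conf_get conf/flink-conf.yaml security.ssl.internal.enabled from the Flink client directory prints the configured value, or nothing if the key is not set in the file.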
Method 2:
Keep Flink SSL communication encryption enabled (retain the default value of security.ssl.internal.enabled) and configure SSL as follows:
- If the keystore or truststore file path is a relative path, the path must be directly accessible from the Flink client directory in which the command is executed.
security.ssl.internal.keystore: ssl/flink.keystore
security.ssl.internal.truststore: ssl/flink.truststore
Add the -t option to the Flink yarn-session.sh command to transfer the keystore and truststore files to each execution node:
yarn-session.sh -t ssl/ 2
- If the keystore or truststore file path is an absolute path, the keystore and truststore files must exist at that absolute path on the Flink client and on all nodes.
security.ssl.internal.keystore: /opt/client/Flink/flink/conf/flink.keystore
security.ssl.internal.truststore: /opt/client/Flink/flink/conf/flink.truststore
For MRS 3.x or later, perform the following operations:
Method 1:
Disable Flink SSL communication encryption by setting the following parameter in the client configuration file conf/flink-conf.yaml:
security.ssl.enabled: false
Method 2:
Keep Flink SSL communication encryption enabled (retain the default value of security.ssl.enabled) and configure SSL as follows:
- If the keystore or truststore file path is a relative path, the path must be directly accessible from the Flink client directory in which the command is executed.
security.ssl.keystore: ssl/flink.keystore
security.ssl.truststore: ssl/flink.truststore
Add the -t option to the Flink yarn-session.sh command to transfer the keystore and truststore files to each execution node:
yarn-session.sh -t ssl/ 2
- If the keystore or truststore file path is an absolute path, the keystore and truststore files must exist at that absolute path on the Flink client and on all nodes.
security.ssl.keystore: /opt/client/Flink/flink/conf/flink.keystore
security.ssl.truststore: /opt/client/Flink/flink/conf/flink.truststore
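Because the error is caused by keystore or truststore files that cannot be found or read, a quick existence check on the configured paths can narrow the problem down before yarn-session.sh is rerun. The following is a rough sketch under stated assumptions: check_ssl_stores is a hypothetical helper name, the configuration file is assumed to contain flat "key: value" lines starting at the beginning of the line, and relative paths are resolved against the directory passed as the third argument (the directory from which yarn-session.sh will be run). It checks only the node on which it is executed and does not validate the certificate contents.

```shell
#!/bin/sh
# Hypothetical helper (not a Flink or MRS command): check that the keystore and
# truststore named in a flink-conf.yaml file exist on this node.
# Usage: check_ssl_stores <conf-file> <key-prefix> [base-dir]
#   key-prefix: security.ssl.internal (MRS 2.x or earlier)
#               security.ssl          (MRS 3.x or later)
#   base-dir:   directory against which relative paths are resolved (default: .)
check_ssl_stores() {
    conf_file=$1
    prefix=$2
    base_dir=${3:-.}
    status=0
    for name in keystore truststore; do
        # Read the path configured for "<prefix>.<name>".
        path=$(awk -v key="$prefix.$name" '
            index($0, key) == 1 {
                rest = substr($0, length(key) + 1)
                # Accept the line only if the key is followed directly by a colon,
                # so that longer keys sharing this prefix are not matched.
                if (rest ~ /^[ \t]*:/) {
                    sub(/^[ \t]*:[ \t]*/, "", rest)
                    sub(/[ \t]+$/, "", rest)
                    value = rest
                }
            }
            END { print value }
        ' "$conf_file")
        if [ -z "$path" ]; then
            echo "MISSING KEY: $prefix.$name"
            status=1
            continue
        fi
        # Absolute paths are used as-is; relative paths are resolved against base_dir.
        case $path in
            /*) full=$path ;;
            *)  full=$base_dir/$path ;;
        esac
        if [ -f "$full" ]; then
            echo "OK: $prefix.$name -> $full"
        else
            echo "NOT FOUND: $prefix.$name -> $full"
            status=1
        fi
    done
    return $status
}
```

For relative paths, the check is meaningful when base-dir is the Flink client directory in which the command will be executed. For absolute paths, the same check would need to be repeated on every node, because the files must exist at that path on all of them.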