The Elasticsearch-Hadoop (ES-Hadoop) connector combines the massive data storage and in-depth processing capabilities of Hadoop with the real-time search and analysis capabilities of Elasticsearch. It lets you gain insight into big data quickly and work more effectively within the Hadoop ecosystem.
This topic uses ES-Hadoop on MRS as an example to describe how to connect to a CSS cluster. Other applications that need to use the Elasticsearch cluster can be configured in a similar way. Before you start, ensure that the client can reach the Elasticsearch cluster over the network.
This topic uses a private IP address as an example to describe how to access a cluster. The cluster access address varies with the network configurations used. For details, see Network Configuration.
If the cluster has only one node, the IP address and port number of that node are displayed, for example, 10.62.179.32:9200. If the cluster has multiple nodes, a comma-separated list is displayed, for example, 10.62.179.32:9200,10.62.179.33:9200. The list contains the addresses of all nodes if all of them are data nodes, or only the addresses of the client nodes if some of the nodes are client nodes.
curl -X GET http://<host>:<port>
curl -X GET http://<host>:<port> -u <user>:<password>
curl -X GET https://<host>:<port> -u <user>:<password> -ik
| Variable | Description |
|---|---|
| <host> | IP address of a node in the cluster. If the cluster contains multiple nodes, there are multiple IP addresses. You can use any of them. |
| <port> | Port number for accessing a cluster node. Generally, the port number is 9200. |
| <user> | Username for accessing the cluster. |
| <password> | Password of the user. If the password contains special characters, enclose the username and password in single quotation marks, for example, curl -u 'user:password!' "http://<host>:<port>". |
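The three curl variants above differ only in the scheme, the credentials, and the -ik flag. As a minimal sketch (not part of CSS or ES-Hadoop), that choice can be wrapped in one shell function. The function name es_check, the mode labels plain, auth, and https, and the DRY_RUN switch are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: es_check, the mode labels, and DRY_RUN are illustrative names,
# not part of curl, CSS, or ES-Hadoop.
es_check() {
  local mode="$1" host="$2" port="$3" user="${4:-}" pass="${5:-}"
  # Rebuild the positional parameters as the curl argument list.
  case "$mode" in
    plain) set -- -X GET "http://${host}:${port}" ;;
    auth)  set -- -X GET "http://${host}:${port}" -u "${user}:${pass}" ;;
    https) set -- -X GET "https://${host}:${port}" -u "${user}:${pass}" -ik ;;
    *)     echo "unknown mode: ${mode}" >&2; return 1 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$@"   # print one argument per line instead of calling curl
  else
    curl "$@"
  fi
}

# Example call (single quotes keep special characters such as ! literal):
# es_check https 10.62.179.32 9200 'user' 'password!'
```

Setting DRY_RUN=1 prints the arguments that would be passed to curl, which is a convenient way to check quoting before contacting the cluster.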
hadoop fs -mkdir /tmp/hadoop-es
hadoop fs -put elasticsearch-hadoop-x.x.x.jar /tmp/hadoop-es
hadoop fs -put commons-httpclient-3.1.jar /tmp/hadoop-es
Enter beeline or hive to go to the execution page and run the following commands:
add jar hdfs:///tmp/hadoop-es/commons-httpclient-3.1.jar;
add jar hdfs:///tmp/hadoop-es/elasticsearch-hadoop-x.x.x.jar;
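The HDFS upload commands and the add jar statements must name the same directory and the same JAR files. The following sketch generates both from one list so that they stay in sync. The helper name print_jar_commands is hypothetical, and it only prints text for review rather than running hadoop or Hive.

```shell
#!/usr/bin/env bash
# Sketch only: print_jar_commands is a hypothetical helper. It prints the
# commands and statements for review; it does not run hadoop or Hive itself.
print_jar_commands() {
  local hdfs_dir="$1"; shift
  local jar
  printf 'hadoop fs -mkdir %s\n' "$hdfs_dir"
  for jar in "$@"; do
    printf 'hadoop fs -put %s %s\n' "$jar" "$hdfs_dir"
  done
  for jar in "$@"; do
    printf 'add jar hdfs://%s/%s;\n' "$hdfs_dir" "$jar"
  done
}

# x.x.x is the version placeholder used in the commands above.
print_jar_commands /tmp/hadoop-es commons-httpclient-3.1.jar elasticsearch-hadoop-x.x.x.jar
```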
CREATE EXTERNAL TABLE IF NOT EXISTS student (id BIGINT, name STRING, addr STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes' = 'xxx.xxx.xxx.xxx:9200',
  'es.port' = '9200',
  'es.net.ssl' = 'false',
  'es.nodes.wan.only' = 'false',
  'es.nodes.discovery' = 'false',
  'es.input.use.sliced.partitions' = 'false',
  'es.resource' = 'student/_doc'
);
CREATE EXTERNAL TABLE IF NOT EXISTS student (id BIGINT, name STRING, addr STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes' = 'xxx.xxx.xxx.xxx:9200',
  'es.port' = '9200',
  'es.net.ssl' = 'false',
  'es.nodes.wan.only' = 'false',
  'es.nodes.discovery' = 'false',
  'es.input.use.sliced.partitions' = 'false',
  'es.nodes.client.only' = 'true',
  'es.resource' = 'student/_doc',
  'es.net.http.auth.user' = 'username',
  'es.net.http.auth.pass' = 'password'
);
chown -R omm truststore.jks
CREATE EXTERNAL TABLE IF NOT EXISTS student (id BIGINT, name STRING, addr STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes' = 'https://xxx.xxx.xxx.xxx:9200',
  'es.port' = '9200',
  'es.net.ssl' = 'true',
  'es.net.ssl.truststore.location' = 'certFilePath',
  'es.net.ssl.truststore.pass' = 'certPassword',
  'es.nodes.wan.only' = 'false',
  'es.nodes.discovery' = 'false',
  'es.nodes.client.only' = 'true',
  'es.input.use.sliced.partitions' = 'false',
  'es.resource' = 'student/_doc',
  'es.net.http.auth.user' = 'username',
  'es.net.http.auth.pass' = 'password'
);
| Parameter | Default Value | Description |
|---|---|---|
| es.nodes | localhost | Address for accessing the CSS cluster. You can check the private network address in the cluster list. |
| es.port | 9200 | Port number for accessing a cluster. Generally, the port number is 9200. |
| es.nodes.wan.only | false | Whether to connect only through the addresses declared in es.nodes, as for a cluster reached over a WAN or in a restricted network. If this parameter is set to true, node sniffing is disabled. |
| es.nodes.discovery | true | Whether to discover the other nodes in the Elasticsearch cluster. If this parameter is set to false, only the nodes listed in es.nodes are used. |
| es.input.use.sliced.partitions | true | Whether to use slices. Its value can be true or false. NOTE: If this parameter is set to true, the index prefetch time may be significantly prolonged, and may even be much longer than the data query time. You are advised to set this parameter to false to improve query efficiency. |
| es.resource | NA | Index and type to read from and write to, for example, student/_doc. |
| es.net.http.auth.user | NA | Username for accessing the cluster. Set this parameter only if the security mode is enabled. |
| es.net.http.auth.pass | NA | Password of the user. Set this parameter only if the security mode is enabled. |
| es.net.ssl | false | Whether to enable SSL. If SSL is enabled, you need to configure the security certificate information. |
| es.net.ssl.truststore.location | NA | Path of the .jks certificate file, for example, file:///tmp/truststore.jks. |
| es.nodes.client.only | false | Whether es.nodes contains the IP addresses of dedicated client nodes (that is, whether client nodes were enabled when the Elasticsearch cluster was created). If it does, set this parameter to true. Otherwise, an error is reported indicating that the data node cannot be found. |
| es.net.ssl.truststore.pass | NA | Password of the .jks certificate file. |
For details about ES-Hadoop configuration items, see the official configuration description.
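The CREATE TABLE statements for the three access modes differ only in a handful of properties. The sketch below prints the statement for a given mode so that the differences are explicit. The function name es_table_ddl and the mode labels plain, auth, and https are hypothetical, and the student table and its columns are taken from the examples in this topic. The sketch leaves out es.nodes.client.only, which must be added by hand if es.nodes points to client nodes.

```shell
#!/usr/bin/env bash
# Sketch only: es_table_ddl and its mode labels (plain, auth, https) are
# illustrative names. Property keys are the ones listed in the table above.
es_table_ddl() {
  local nodes="$1" mode="$2" user="${3:-}" pass="${4:-}"
  local store="${5:-}" store_pass="${6:-}"
  local ssl="false"
  if [ "$mode" = "https" ]; then ssl="true"; fi

  printf '%s\n' \
    "CREATE EXTERNAL TABLE IF NOT EXISTS student (id BIGINT, name STRING, addr STRING)" \
    "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'" \
    "TBLPROPERTIES ("
  # printf reuses the format string for each key/value pair.
  printf "  '%s' = '%s',\n" \
    es.nodes "$nodes" es.port 9200 es.net.ssl "$ssl" \
    es.nodes.wan.only false es.nodes.discovery false \
    es.input.use.sliced.partitions false
  if [ "$mode" != "plain" ]; then
    # Credentials are needed only when the security mode is enabled.
    printf "  '%s' = '%s',\n" \
      es.net.http.auth.user "$user" es.net.http.auth.pass "$pass"
  fi
  if [ "$mode" = "https" ]; then
    # Truststore settings are needed only when SSL is enabled.
    printf "  '%s' = '%s',\n" \
      es.net.ssl.truststore.location "$store" \
      es.net.ssl.truststore.pass "$store_pass"
  fi
  printf "  '%s' = '%s'\n);\n" es.resource student/_doc
}

# Non-security mode, with the address placeholder used in this topic.
es_table_ddl 'xxx.xxx.xxx.xxx:9200' plain
```

Values are inserted between single quotation marks without escaping, so a password that itself contains a single quotation mark would need to be handled separately.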
INSERT INTO TABLE student VALUES (1, "Lucy", "address1"), (2, "Lily", "address2");
select * from student;
The query result is as follows:
+-------------+---------------+---------------+
| student.id  | student.name  | student.addr  |
+-------------+---------------+---------------+
| 1           | Lucy          | address1      |
| 2           | Lily          | address2      |
+-------------+---------------+---------------+
2 rows selected (0.116 seconds)
GET /student/_search
Figure 1 Kibana query result
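The Kibana query returns the documents themselves. To confirm from a shell that the rows written through Hive reached the index, the Elasticsearch _count API can be queried with curl instead. The sketch below assumes a non-security cluster and uses a hypothetical helper, extract_count, to read the count field without extra tools. The sample response is made up to match the shape of a _count reply.

```shell
#!/usr/bin/env bash
# Sketch only: extract_count is a hypothetical helper that pulls the "count"
# field out of an Elasticsearch _count response read from standard input.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Sample response in the shape returned by GET /student/_count (values made up).
sample='{"count":2,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}'
printf '%s\n' "$sample" | extract_count   # prints 2

# Against a real non-security cluster (placeholders as in the variable table):
# curl -s -X GET "http://<host>:<port>/student/_count" | extract_count
```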

To access a security-mode Elasticsearch cluster that uses HTTPS, a security certificate must be loaded. Perform the following steps to obtain the security certificate and upload it to the client:
keytool -import -alias newname -keystore ./truststore.jks -file ./CloudSearchService.cer
keytool -import -alias newname -keystore .\truststore.jks -file .\CloudSearchService.cer
In the preceding command, newname indicates the user-defined certificate name.
After this command is executed, you will be prompted to set the certificate password and confirm the password. Securely store the password. It will be used for accessing the cluster.
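The es.net.ssl.truststore.location property expects the truststore path in file:// form, as in the file:///tmp/truststore.jks example in the parameter table. As a small sketch, the hypothetical helper truststore_uri below converts a local path into that form. The commented keytool -list call at the end is one way to confirm that the alias was imported.

```shell
#!/usr/bin/env bash
# Sketch only: truststore_uri is a hypothetical helper that turns a local path
# into the file:// form used by es.net.ssl.truststore.location.
truststore_uri() {
  local path="${1#./}"   # drop a leading "./" if present
  case "$path" in
    /*) printf 'file://%s\n' "$path" ;;               # already absolute
    *)  printf 'file://%s/%s\n' "$(pwd)" "$path" ;;   # resolve against the current directory
  esac
}

truststore_uri /tmp/truststore.jks   # prints file:///tmp/truststore.jks

# To list the imported entry (keytool prompts for the keystore password):
# keytool -list -keystore ./truststore.jks -alias newname
```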