With CSS, you can use self-built Logstash to ingest data into OpenSearch for efficient search and exploration. Supported data formats include JSON and CSV.
Logstash is an open-source, server-side data processing pipeline that ingests data from multiple sources simultaneously, processes and transforms the data, and then sends it to OpenSearch. For more information about Logstash, visit https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html
Depending on where Logstash is deployed, there are two data ingestion scenarios:
Figure 1 illustrates the data ingestion process when Logstash is deployed on an external network.
Figure 1 Data ingestion process when Logstash is deployed on an external network

ssh -g -L <local port of the jump host>:<private network address of the node>:<port of the node> -N -f root@<private IP address of the jump host>
For example, if port 9200 on the jump host is accessible from the public network, the private network address and port number of the node are 192.168.0.81 and 9200, respectively, and the private IP address of the jump host is 192.168.0.227, run the following command to set up the port mapping:
ssh -g -L 9200:192.168.0.81:9200 -N -f root@192.168.0.227
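The mapping between the placeholders and the example values can be sketched with shell variables; this is only an illustration (the variable names JUMP_PORT, NODE_ADDR, and JUMP_IP are hypothetical), and the command is printed for review rather than executed:

```shell
# Hypothetical variable names; substitute your own environment's values.
JUMP_PORT=9200                 # local port on the jump host, open to the public network
NODE_ADDR="192.168.0.81:9200"  # private network address and port of the cluster node
JUMP_IP="192.168.0.227"        # private IP address of the jump host
# Print the assembled port-mapping command so it can be checked before running.
echo "ssh -g -L ${JUMP_PORT}:${NODE_ADDR} -N -f root@${JUMP_IP}"
```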
For example, suppose the data file access_20181029_log needs to be ingested. It is stored in the /tmp/access_log/ path (create the access_log folder if it does not already exist) and contains the following data:
| All | Heap used for segments | | 18.6403 | MB |
| All | Heap used for doc values | | 0.119289 | MB |
| All | Heap used for terms | | 17.4095 | MB |
| All | Heap used for norms | | 0.0767822 | MB |
| All | Heap used for points | | 0.225246 | MB |
| All | Heap used for stored fields | | 0.809448 | MB |
| All | Segment count | | 101 | |
| All | Min Throughput | index-append | 66232.6 | docs/s |
| All | Median Throughput | index-append | 66735.3 | docs/s |
| All | Max Throughput | index-append | 67745.6 | docs/s |
| All | 50th percentile latency | index-append | 510.261 | ms |
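The sample file is pipe-delimited. If you later want to add processing to the filter block, it helps to know how the columns split; the following sketch (not part of the procedure, and the variable names are arbitrary) uses awk to extract the metric name and value from one such line:

```shell
# Sketch: splitting one pipe-delimited sample line into fields.
# With -F'|', field 1 is empty (text before the first "|"), field 2 is the scope,
# field 3 the metric name, field 4 the task, field 5 the value, field 6 the unit.
line='| All | Heap used for segments | | 18.6403 | MB |'
metric=$(echo "$line" | awk -F'|' '{gsub(/^ +| +$/, "", $3); print $3}')
value=$(echo "$line" | awk -F'|' '{gsub(/^ +| +$/, "", $5); print $5}')
echo "$metric = $value"
```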
cd /<Logstash installation directory>/
vi logstash-simple.conf
input {
    Location of data
}
filter {
    Related data processing
}
output {
    elasticsearch {
        hosts => "<EIP of the jump host>:<port on the jump host that is accessible from the external network>"
    }
}
Consider the data files in the /tmp/access_log/ path mentioned in step 4 as an example. Assume that data ingestion starts from the first line of the file, the filter condition is left unspecified (no data processing is performed), the public IP address and port number of the jump host are 192.168.0.227 and 9200, respectively, and the name of the destination index is myindex. Edit the configuration file as follows, then enter :wq to save the changes and exit.
input {
    file {
      path => "/tmp/access_log/*"
      start_position => "beginning"
    }
}
filter {}
output {
    elasticsearch {
      hosts => "192.168.0.227:9200"
      index => "myindex"
    }
}
If a license error is reported, set ilm_enabled to false to try to rectify the error.
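For instance, the output block of the example above could be adjusted as follows (a sketch; ilm_enabled is an option of the Logstash elasticsearch output plugin):

```
output {
    elasticsearch {
      hosts => "192.168.0.227:9200"
      index => "myindex"
      ilm_enabled => false    # disable index lifecycle management to avoid the license check
    }
}
```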
If the cluster has the security mode enabled, you need to download a certificate first.
Consider the data files in the /tmp/access_log/ path mentioned in step 4 as an example. Assume that data ingestion starts from the first line of the file, the filter condition is left unspecified (no data processing is performed), the public IP address and port number of the jump host are 192.168.0.227 and 9200, respectively, the name of the destination index is myindex, and the certificate is stored in /logstash/config/CloudSearchService.cer. Edit the configuration file as follows, then enter :wq to save the changes and exit.
input {
    file {
      path => "/tmp/access_log/*"
      start_position => "beginning"
    }
}
filter {}
output {
    elasticsearch {
      hosts => ["https://192.168.0.227:9200"]
      index => "myindex"
      user => "admin"    # Username for accessing the security-mode cluster
      password => "******"    # Password for accessing the security-mode cluster
      cacert => "/logstash/config/CloudSearchService.cer"
      manage_template => false
      ilm_enabled => false
      ssl => true
      ssl_certificate_verification => false
    }
}
./bin/logstash -f logstash-simple.conf
This command must be executed in the directory where the logstash-simple.conf file is located. For example, if the logstash-simple.conf file is stored in /root/logstash-7.1.1/, navigate to this directory before executing the command.
Run the following command to search for data. Check the search results. If they are consistent with the ingested data, data ingestion has been successful.
GET myindex/_search
Figure 2 illustrates the data ingestion process when Logstash is deployed on an ECS that resides in the same VPC as the destination cluster.
Figure 2 Data ingestion process when Logstash is deployed on an ECS

For example, the file access_20181029_log is stored in the /tmp/access_log/ path of the ECS and contains the following data:
| All | Heap used for segments | | 18.6403 | MB |
| All | Heap used for doc values | | 0.119289 | MB |
| All | Heap used for terms | | 17.4095 | MB |
| All | Heap used for norms | | 0.0767822 | MB |
| All | Heap used for points | | 0.225246 | MB |
| All | Heap used for stored fields | | 0.809448 | MB |
| All | Segment count | | 101 | |
| All | Min Throughput | index-append | 66232.6 | docs/s |
| All | Median Throughput | index-append | 66735.3 | docs/s |
| All | Max Throughput | index-append | 67745.6 | docs/s |
| All | 50th percentile latency | index-append | 510.261 | ms |
cd /<Logstash installation directory>/
vi logstash-simple.conf
Enter the following content in logstash-simple.conf:
input {
    Location of data
}
filter {
    Related data processing
}
output {
    elasticsearch {
        hosts => "<private network address and port number of the node>"
    }
}
If the cluster contains multiple nodes, you are advised to set hosts to the private network addresses and port numbers of all nodes in the cluster, so that ingestion can continue if a node fails. Use commas (,) to separate the entries. The following is an example:
hosts => ["192.168.0.81:9200","192.168.0.24:9200"]
If the cluster contains only one node, the format is as follows:
hosts => "192.168.0.81:9200"
Consider the data files in the /tmp/access_log/ path mentioned in step 2 as an example. Assume that data ingestion starts from the first line of the file, the filter condition is left unspecified (no data processing is performed), the private network address and port number of the node in the destination cluster are 192.168.0.81 and 9200, respectively, and the name of the destination index is myindex. Edit the configuration file as follows, then enter :wq to save the changes and exit.
input {
    file {
      path => "/tmp/access_log/*"
      start_position => "beginning"
    }
}
filter {}
output {
    elasticsearch {
      hosts => "192.168.0.81:9200"
      index => "myindex"
    }
}
If the cluster has the security mode enabled, you need to download a certificate first.
Consider the data files in the /tmp/access_log/ path mentioned in step 2 as an example. Assume that data ingestion starts from the first line of the file, the filter condition is left unspecified (no data processing is performed), the private network address and port number of the node are 192.168.0.227 and 9200, respectively, the name of the destination index is myindex, and the certificate is stored in /logstash/config/CloudSearchService.cer. Edit the configuration file as follows, then enter :wq to save the changes and exit.
input {
    file {
      path => "/tmp/access_log/*"
      start_position => "beginning"
    }
}
filter {}
output {
    elasticsearch {
      hosts => ["https://192.168.0.227:9200"]
      index => "myindex"
      user => "admin"    # Username for accessing the security-mode cluster
      password => "******"    # Password for accessing the security-mode cluster
      cacert => "/logstash/config/CloudSearchService.cer"
      manage_template => false
      ilm_enabled => false
      ssl => true
      ssl_certificate_verification => false
    }
}
./bin/logstash -f logstash-simple.conf
Run the following command to search for data. Check the search results. If they are consistent with the ingested data, data ingestion has been successful.
GET myindex/_search
To access a security-mode OpenSearch cluster that uses HTTPS, a security certificate must be loaded. To obtain this security certificate (CloudSearchService.cer), follow these steps: