Advanced
Тема интерфейса

Quickly Creating a ClickHouse Cluster and Performing Statistical Analysis

ClickHouse offers easy-to-use, flexible, and stable hosting services in the cloud. A data warehouse can be created in minutes for massive real-time data query and analysis, improving the overall efficiency of data value mining. By leveraging the massively parallel processing (MPP) architecture, ClickHouse can query data several times faster than conventional data warehouses.

Background Information

Assume that there is a student score table and you need to use ClickHouse to perform the following operations:

  • Create the user information table demo_t.
  • Add the user gender and subject to the user information.
  • Query basic user information by user ID.
  • Delete the user information table after the service ends.
    Table 1 Score table

    user_id

    name

    sex

    subject

    score

    time

    10000

    A

    1

    Chinese

    89

    2023-07-01 09:00:00

    10001

    B

    0

    Math

    132

    2023-07-01 09:00:00

    10002

    C

    0

    Math

    90

    2023-07-02 09:00:00

    10003

    D

    0

    English

    120

    2023-07-01 14:00:00

    10004

    E

    1

    Chinese

    101

    2023-07-01 09:00:00

    10005

    F

    1

    Chinese

    110

    2023-07-01 09:00:00

    Table 2 Description

    Parameter

    Description

    10000

    User ID, which uniquely identifies a user.

    2023-07-01 09:00:00

    Data import time

    A

    Student name

    1

    Gender female (0 indicates male).

    Chinese

    Discipline

    89

    Score

Step 1: Buying a ClickHouse Cluster

  1. Log in to the CloudTable console.
  2. Select a region in the upper left corner.
  3. Click Cluster Management.
  4. Click Buy Cluster in the upper right corner of the Cluster Management page and set related parameters. For details about how to configure ports for security group rules, see ClickHouse security group rules.
  5. Click Buy Now. On the displayed page, confirm the specifications and click Finish.
  6. Return to the cluster list to view the cluster creation progress. If the cluster status is Running, the cluster is successfully created. For details, see "Using ClickHouse > Creating a ClickHouse Cluster" in the CloudTable Service User Guide.
    Table 3 ClickHouse security group rules

    Direction

    Action

    Port/Range

    Type

    Destination/Source Address

    Description

    Outbound

    Allow

    All

    IPv4/IPv6

    0.0.0.0/0

    Permit in the outbound direction

    Inbound

    Allow

    8123

    Security group of the CloudTable ClickHouse cluster

    ClickHouse HTTP port number

    Allow

    9000

    ClickHouse TCP port number

    Allow

    8443

    ClickHouse HTTPS port number

    Allow

    9440

    Secure TCP security port of ClickHouse

    Allow

    2181

    ZooKeeper client connection monitoring port

Step 2: Downloading the ClickHouse Client and Verification File.

  1. Log in to the CloudTable console.
  2. Select a region in the upper left corner.
  3. Click Help in the navigation pane.
  4. Choose Download the ClickHouse Client under Helpful Links on the right of the help page to download the client installation package.
  5. Click Download Client Verification File to download the verification file.

Step 3: Preparing an ECS

  1. Purchase an ECS and log in to the ECS console.
  2. Select a region in the upper left corner.
  3. In the service list on the left, choose Computing > Elastic Cloud Server. The Elastic Cloud Server page is displayed.
  4. Click Buy ECS in the upper right corner. The parameter configuration page is displayed.
  5. Configure ECS parameters, including basic settings, instance, OS, storage replica, network, security group, public access, ECS management, advanced settings, and quantity.
  6. Check the configurations, select the agreement, and click Submit. After the ECS is created, it will be started by default.

    For details, see User Guide > "Getting Started" > "Creating an ECS".

    Note

    To ensure successful connection of the cluster to the VPC, the security group configurations must align with those of the ECS.

Step 4: Adding a Security Group

Add the IP address of the local host to the ECS security group.

  1. Obtain the IP address of the local host. Press Win+R. The Run dialog box is displayed.
  2. Enter cmd in the text box and click OK. The cmd window is displayed.
  3. Enter ipconfig in the command window and press Enter to query the IP address of the local host.
  4. Log in to the ECS console.
  5. On the ECS list page, click the ECS name. On the Basic Information tab page, click the Security Group tab. On the displayed page, click Inbound Rules.
  6. Click Add Rule in the upper right corner of the page.
  7. Enter the local IP address obtained in 3 as the source IP address. Click OK. The security group is added.

Step 5: Installing and Verifying the ClickHouse Client

You can manually install the client on an ECS.

  1. Use the SSH login tool (such as PuTTY) to log in to the Linux ECS through the EIP.

    For details about how to log in to the ECS, see "Remotely Logging In to a Linux ECS (Using an SSH Password)" in Logging In to a Linux ECS of the Elastic Cloud Server User Guide.

  2. Upload the client downloaded in 2 to the Linux ECS.
  3. Install the client and connect to the cluster.
    1. Use the SSH login tool to remotely log in to the Linux ECS through the EIP.

      For details, see Login Using an SSH Password in the Elastic Cloud Server User Guide.

    2. Go to the root directory of the SSH login tool.
      cd /
    3. Create a folder in the root directory.
      mkdir Folder name
    4. Go to the directory of the created folder.
      cd /Folder name/
    5. Place the client in the directory.
    6. Decompress the client package.
      tar -zxf Client package name
    7. Decompress the client verification file to the same directory as the client.
      1. Decompress the client verification file.
        cd <Path for storing the client verification file >
        tar xzvf Client_sha256.tar.gz
      2. Obtain the client verification code.
        sha256sum ClickHouse_Client_23.3.tar.gz
      3. Check the verification code in the client verification file and compare it with the client verification code. A match indicates no tampering, while a mismatch suggests tampering.
        less ClickHouse_Client_23.3.tar.gz.sha256
    8. Load the .so file.
      sh install.sh
    9. Go to the bin directory.
      cd bin/

      Grant the 700 permission to the directory.

      chmod 700 clickhouse
    10. Connect to the port of the ClickHouse cluster.

      Use the following command to connect to a normal cluster.

      ./clickhouse client --host Private IP address of the cluster --port 9000 --user admin --password Password

      For details about the security cluster connection commands, see "Using ClickHouse > Connecting to a ClickHouse Cluster > ClickHouse Security Channel Encryption" in the CloudTable Service User Guide.

      ./clickhouse client --host Private IP address of the cluster --port 9440 --user admin --password Password --secure --config-file /root/config.xml
      Note
      • Private IP Address: cluster access address on the cluster details page. Replace it with the access address of the cluster you purchased.
      • Password: the password set when you purchase the cluster. If there are special characters, use backslashes (\) to escape them. If the password is enclosed in single quotation marks ('), the special characters do not need to be escaped.

Step 6: Inserting Data

In the command window in Step 5, run the following commands to create a data table using the ClickHouse cluster and query the table data.

  1. Create a database.
    create database DB_demo;
  2. Use the database.
    use DB_demo;
  3. Create a table.
    create table DB_demo_t(user_id Int32,name String,sex Tinyint ,subject String,score Int32,time datetime)engine=TinyLog;
  4. Insert data.
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10000','A','1','Chinese','89','2023-07-01 09:00:00');
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10001','B','0','Math','132','2023-07-01 09:00:00');
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10002','C','0','Math','90','2023-07-02 09:00:00');
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10003','D','0','English','120','2023-07-01 14:00:00');
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10004','E','1','Chinese','101','2023-07-01 09:00:00');
    insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10005','F','1','Chinese','110','2023-07-01 09:00:00');
  5. Query the data.
    • Query the imported data.
      host-172-16-13-95 :) select * from DB_demo_t;
      SELECT *
      FROM DB_demo_t
      Query id: 4e119f77-0592-4131-bbe2-31f42bc069a1
      ┌─user_id─┬─name─┬─sex─┬─subject─┬─score─┬────────────────time─┐
      │ 10000 │ A │ 1 │ Chinese │ 89 │ 2023-07-01 09:00:00 │
      │ 10001 │ B │ 0 │ Math │ 132 │ 2023-07-01 09:00:00 │
      │ 10002 │ C │ 0 │ Math │ 90 │ 2023-07-02 09:00:00 │
      │ 10003 │ D │ 0 │ English │ 120 │ 2023-07-01 14:00:00 │
      │ 10004 │ E │ 1 │ Chinese │ 101 │ 2023-07-01 09:00:00 │
      │ 10005 │ F │ 1 │ Chinese │ 110 │ 2023-07-01 09:00:00 │
      └─────────┴──────┴─────┴────────┴───────┴─────────────────────┘
      6 rows in set. Elapsed: 0.004 sec.
  6. Delete data.
    • Delete the table.
      drop table DB_demo_t;
    • Delete the database.
      drop database DB_demo;