What Is Data Balancing?

While Doris is running, the frontend (FE) node continuously monitors the load on each disk through the metadata. When a data imbalance arises, the FE node redistributes data from overloaded disks to underutilized ones until the load is even across all disks.
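The redistribution described above can be pictured with a small simulation. This is only an illustrative sketch, not Doris's actual scheduler: the function name `rebalance`, the `tolerance_gb` and `chunk_gb` parameters, and the greedy "fullest to emptiest" rule are all assumptions made for the example.

```python
def rebalance(disk_used_gb, tolerance_gb=1.0, chunk_gb=1.0):
    """Greedy sketch: move data from the fullest disk to the emptiest one
    until the gap between them is within tolerance_gb.

    disk_used_gb: mapping of disk name -> used space in GB (hypothetical input).
    Returns the final usage and the list of (source, target, amount) moves.
    """
    if tolerance_gb <= 0 or chunk_gb <= 0:
        raise ValueError("tolerance_gb and chunk_gb must be positive")

    used = dict(disk_used_gb)  # work on a copy
    moves = []
    while True:
        fullest = max(used, key=used.get)
        emptiest = min(used, key=used.get)
        gap = used[fullest] - used[emptiest]
        if gap <= tolerance_gb:
            break  # equilibrium reached
        # Never move more than half the gap, or the two disks would swap roles.
        amount = min(chunk_gb, gap / 2)
        used[fullest] -= amount
        used[emptiest] += amount
        moves.append((fullest, emptiest, amount))
    return used, moves


final, moves = rebalance({"disk1": 90, "disk2": 10, "disk3": 50}, chunk_gb=10.0)
print(final)       # usage per disk after balancing
print(len(moves))  # number of data movements that were needed
```

The sketch also hints at why balancing is costly: every iteration is a data movement, so a cluster whose load keeps changing keeps generating moves.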

What Factors Influence Data Balancing?

  • Frequent data writing can lead to constant changes in the load experienced by each node and disk, thereby initiating data balancing.
  • Moreover, if new data is written while data balancing is in progress, it can disrupt the balance that is being established.
  • Disk load calculations are influenced by the presence of junk files. Therefore, mass deletion of these files can lead to disk load imbalances.
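The effect of junk files on disk load can be shown with a short calculation. The sketch below assumes, for illustration only, that disk load is the fraction of capacity occupied by live data plus junk files; the helper names and the sample figures are hypothetical and do not reflect how Doris computes load internally.

```python
def usage_spread(disks):
    """Return the gap between the most and least loaded disk.

    disks: mapping of disk name -> (live_gb, junk_gb, capacity_gb).
    Load is assumed to be (live + junk) / capacity.
    """
    fractions = [(live + junk) / cap for live, junk, cap in disks.values()]
    return max(fractions) - min(fractions)


def purge_junk(disks):
    """Return a copy of disks with all junk files removed."""
    return {name: (live, 0.0, cap) for name, (live, junk, cap) in disks.items()}


# Two disks that look balanced only because junk files are counted.
disks = {
    "disk1": (40.0, 40.0, 100.0),  # 40 GB live, 40 GB junk
    "disk2": (70.0, 10.0, 100.0),  # 70 GB live, 10 GB junk
}

print(round(usage_spread(disks), 2))              # 0.0: balanced with junk
print(round(usage_spread(purge_junk(disks)), 2))  # 0.3: imbalanced after deletion
```

In this example both disks sit at 80% usage before the purge, yet differ by 30 percentage points afterwards, which is the kind of shift that would start a new round of balancing.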

Symptom

If data disk usage in Doris goes unchecked, a disk can fill up completely, which blocks further writes. To prevent this, manage Doris operations so that no data disk reaches full capacity.
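One way to catch a disk before it fills up is a periodic usage check on the host. The following is a minimal sketch using Python's standard `shutil.disk_usage`; the function name `disks_near_full`, the 85% threshold, and the idea of running it against data directories are assumptions for illustration, not a Doris feature.

```python
import shutil


def disks_near_full(paths, threshold=0.85):
    """Return {path: used_fraction} for every path at or above threshold.

    paths: directories to check, for example the data directories of a node
    (which directories to pass is up to the operator).
    """
    at_risk = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        if usage.total == 0:
            continue  # skip pseudo file systems that report no capacity
        used_fraction = usage.used / usage.total
        if used_fraction >= threshold:
            at_risk[path] = used_fraction
    return at_risk


# Check the current directory; replace with the real data paths.
for path, fraction in disks_near_full(["."]).items():
    print(f"{path} is {fraction:.0%} full")
```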

Cause Analysis

Disk imbalance often stems from heavy data writing combined with the deletion of large numbers of junk files.

Solutions

  • To reduce frequent shifts in disk load, write data in scheduled batches rather than continuously.
  • Additionally, manually removing junk files and shortening the junk file expiration period can help. Once the data is balanced and the volume of junk files is minimized, the original junk file expiration period can be reinstated. For guidance on adjusting the recycle bin duration, see Setting the Recycle Bin Duration.