Function
This API is used to submit jobs to a queue using SQL statements.
The supported job types are DDL, DCL, IMPORT, QUERY, and INSERT. The IMPORT job provides the same function as that described in Importing Data (Deprecated); the two differ only in implementation.
Additionally, you can use other APIs to query and manage jobs. For details, see the following sections:
- Querying Job Status
- Querying Job Details
- Querying Job Execution Results (Method 2) (Deprecated)
- Exporting Query Results
- Querying All Jobs
- Canceling a Job (Recommended)
This API is synchronous if job_type in the response message is DCL.
URI
- URI format
POST /v1.0/{project_id}/jobs/submit-job
- Parameter descriptions
Table 1 URI parameter

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Definition: Project ID, which is used for resource isolation. For how to obtain a project ID, see Obtaining a Project ID. Example: 48cc2c48765f481480c7db940d6409d1. Constraints: None. Range: The value can contain 1 to 64 characters. Only letters and digits are allowed. Default Value: None |
Request Parameters
Table 2 Request parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| sql | Yes | String | Definition: SQL statement to be executed. Constraints: None. Range: None. Default Value: None |
| currentdb | No | String | Definition: Database where the SQL statement is executed. This parameter does not need to be configured during database creation. Constraints: None. Range: The value cannot contain only digits or start with an underscore (_). Only digits, letters, and underscores (_) are allowed. Default Value: None |
| queue_name | No | String | Definition: Name of the queue to which the job is to be submitted. Constraints: None. Range: The name cannot contain only digits or start with an underscore (_). Only digits, letters, and underscores (_) are allowed. Default Value: None |
| conf | No | Array of strings | Definition: Configuration items for the job, each set as a "key = value" string. Constraints: None. Range: For the supported configuration items, see Table 3. Default Value: None |
| tags | No | Array of objects | Definition: Job tags. For details, see Table 4. Constraints: None. Range: None. Default Value: None |
| engine_type | No | String | Definition: Type of the engine that executes the job. Constraints: None. Range: spark. Default Value: spark |
Table 3 Configuration parameters

| Parameter | Description |
|---|---|
| spark.sql.files.maxRecordsPerFile | Definition: Maximum number of records to write into a single file. If the value is zero or negative, there is no limit. Constraints: None. Range: None. Default Value: 0 |
| spark.sql.autoBroadcastJoinThreshold | Definition: Maximum size, in bytes, of a table that is broadcast to all worker nodes when a join is executed. Set this parameter to -1 to disable broadcasting. Constraints: Currently, statistics are supported only for Hive metastore tables where the ANALYZE TABLE COMPUTE STATISTICS noscan command has been run, and for file-based data source tables whose statistics are computed directly on the data files. Range: None. Default Value: 209715200 |
| spark.sql.shuffle.partitions | Definition: Default number of partitions to use when shuffling data for joins or aggregations. Constraints: None. Range: None. Default Value: 200 |
| spark.sql.dynamicPartitionOverwrite.enabled | Definition: Whether to delete all partitions that meet the conditions before overwriting them. If set to false, all matching partitions are deleted before data is overwritten; if set to true, only the partitions into which data is written at runtime are overwritten. Constraints: None. Range: true, false. Default Value: false |
| spark.sql.files.maxPartitionBytes | Definition: Maximum number of bytes to pack into a single partition when reading files. Constraints: None. Range: None. Default Value: 134217728 |
| spark.sql.badRecordsPath | Definition: Path for storing bad records. Constraints: None. Range: None. Default Value: None |
| dli.sql.sqlasync.enabled | Definition: Whether DDL and DCL statements are executed asynchronously. The value true indicates asynchronous execution. Constraints: None. Range: true, false. Default Value: true |
| dli.sql.job.timeout | Definition: Job running timeout interval, in seconds. If the timeout interval expires, the job is canceled. Constraints: None. Range: None. Default Value: None |
| spark.sql.optimizer.dynamicPartitionPruning.enabled | Definition: Whether to enable dynamic partition pruning. Dynamic partition pruning reduces the amount of data that needs to be scanned and improves query performance when SQL queries are executed. Constraints: None. Range: true, false. Default Value: true |
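For illustration, here is what a conf array combining several of these items could look like. The keys come from Table 3; the values are arbitrary sample settings, not recommendations.

```python
# Each conf entry is a single "key = value" string.
conf = [
    "spark.sql.shuffle.partitions = 200",   # shuffle parallelism for joins/aggregations
    "dli.sql.job.timeout = 300",            # cancel the job if it runs longer than 300 seconds
    "dli.sql.sqlasync.enabled = true",      # execute DDL/DCL statements asynchronously
]
```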
Table 4 tags parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| key | Yes | String | Definition: Tag key. Constraints: None. Range: A tag key can contain up to 128 characters, cannot start or end with a space, and cannot start with _sys_. Only letters, digits, spaces, and the following special characters are allowed: _.:+-@. Default Value: None |
| value | Yes | String | Definition: Tag value. Constraints: None. Range: A tag value can contain up to 255 characters. Only letters, digits, spaces, and the following special characters are allowed: _.:+-@. Default Value: None |
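Putting the URI, request parameters, conf items, and tags together, the following is a minimal client sketch in Python. The endpoint host, token value, and use of the X-Auth-Token header are placeholders for illustration; substitute your region's DLI endpoint and a valid IAM token.

```python
import requests

# Placeholders: replace with your region's DLI endpoint, project ID, and IAM token.
ENDPOINT = "https://dli.example.com"
PROJECT_ID = "48cc2c48765f481480c7db940d6409d1"
TOKEN = "<your-IAM-token>"


def submit_sql_job(sql, currentdb=None, queue_name=None, conf=None, tags=None):
    """Submit a SQL job via POST /v1.0/{project_id}/jobs/submit-job."""
    body = {"sql": sql}  # "sql" is the only mandatory body field
    if currentdb:
        body["currentdb"] = currentdb
    if queue_name:
        body["queue_name"] = queue_name
    if conf:
        body["conf"] = conf    # list of "key = value" strings (Table 3)
    if tags:
        body["tags"] = tags    # list of {"key": ..., "value": ...} objects (Table 4)
    resp = requests.post(
        f"{ENDPOINT}/v1.0/{PROJECT_ID}/jobs/submit-job",
        json=body,
        headers={"X-Auth-Token": TOKEN},
    )
    resp.raise_for_status()
    return resp.json()


result = submit_sql_job(
    "desc table1",
    currentdb="db1",
    queue_name="default",
    conf=["spark.sql.shuffle.partitions = 200"],
    tags=[{"key": "workspace", "value": "space1"}, {"key": "jobName", "value": "name1"}],
)
print(result["job_id"], result["job_type"])
```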
Response Parameters
Table 5 Response parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| is_success | Yes | Boolean | Definition: Whether the request is successfully executed. The value true indicates success. Range: None |
| message | Yes | String | Definition: System prompt. If the execution succeeds, this parameter may be left blank. Range: None |
| job_id | Yes | String | Definition: ID of the job returned after a job is generated and submitted using SQL statements. The job ID can be used to query the job status and results. Range: None |
| job_type | Yes | String | Definition: Job type. Range: DDL, DCL, IMPORT, EXPORT, QUERY, and INSERT |
| schema | No | Array of Map | Definition: If the statement type is DDL, the column names and types of the result are returned. Range: None |
| rows | No | Array of objects | Definition: When the statement type is DDL and dli.sql.sqlasync.enabled is set to false, the execution results are returned directly, but only up to 1,000 rows. If there are more than 1,000 rows, obtain the results asynchronously: when submitting the job, set dli.sql.sqlasync.enabled to true, and then obtain the results from the job bucket configured for DLI. The path of the results in the job bucket can be obtained from result_path in the return value of the ShowSqlJobStatus API. The full result data is automatically exported to the job bucket. Range: None |
| job_mode | No | String | Definition: Job execution mode. Range: async, sync |
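As noted under Function, the call returns results inline only when execution is synchronous; otherwise the job_id is what you keep for follow-up queries. A minimal sketch of that branching, reusing the submit_sql_job helper sketched above:

```python
result = submit_sql_job("desc table1", currentdb="db1", queue_name="default")

if not result["is_success"]:
    raise RuntimeError(result.get("message") or "job submission failed")

if result.get("job_mode") == "sync":
    # Synchronous execution: schema and up to 1,000 result rows come back inline.
    print("schema:", result.get("schema"))
    print("rows:", result.get("rows"))
else:
    # Asynchronous execution: keep job_id and fetch status/results later
    # (see "Querying Job Status" and "Exporting Query Results").
    print("submitted job:", result["job_id"])
```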
Example Request
Submit a SQL job. The job execution database and queue are db1 and default, respectively. Then, add the tags workspace=space1 and jobName=name1 for the job.
{"currentdb": "db1","sql": "desc table1","queue_name": "default","conf": ["dli.sql.shuffle.partitions = 200"],"tags": [{"key": "workspace","value": "space1"},{"key": "jobName","value": "name1"}]}
Example Response
{"is_success": true,"message": "","job_id": "8ecb0777-9c70-4529-9935-29ea0946039c","job_type": "DDL","job_mode":"sync","schema": [{"col_name": "string"},{"data_type": "string"},{"comment": "string"}],"rows": [["c1","int",null],["c2","string",null]]}
Status Codes
Table 6 describes status codes.
Table 6 Status codes

| Status Code | Description |
|---|---|
| 200 | Submitted successfully. |
| 400 | Request error. |
| 500 | Internal server error. |
Error Codes
If an error occurs when this API is called, the system does not return a result similar to the preceding example. Instead, it returns an error code and an error message. For details, see Error Codes.