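Apache Drill is mainly used for querying, so the statements that matter most are SELECT and the table-creation statements. Drill's SELECT support follows the SQL standard closely, so this post focuses on the CREATE statements.
SQL statements available in Drill:
System configuration statements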
ALTER SESSION SET
ALTER SYSTEM SET
ALTER SYSTEM RESET
ALTER SYSTEM RESET ALL
RESET
SET
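As a minimal sketch of how these statements differ in scope, the following uses the `store.format` option (the same option used in the JSON table example in this post). The values are only illustrative. To my understanding, SET and RESET are session-level shorthands for ALTER SESSION SET and ALTER SESSION RESET in newer Drill releases; check the documentation for your version.

```sql
-- Session scope: affects only the current connection
ALTER SESSION SET `store.format` = 'json';

-- Shorthand for the same kind of session-level change
SET `store.format` = 'parquet';

-- Drop the session override and fall back to the system-level value
RESET `store.format`;

-- System scope: changes the value for all sessions
ALTER SYSTEM SET `store.format` = 'parquet';

-- Restore one system option, or every system option, to its default
ALTER SYSTEM RESET `store.format`;
ALTER SYSTEM RESET ALL;
```

The ALTER SYSTEM statements affect every user of the cluster, so they are best tried on a test cluster first.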
Query and table-creation statements
USE schema_name;
SELECT
CREATE TABLE AS(CTAS)
CREATE VIEW
DROP TABLE
DROP VIEW
Querying system information
SHOW SCHEMAS(select * from INFORMATION_SCHEMA.`SCHEMATA`)
SHOW DATABASES(select * from INFORMATION_SCHEMA.`
SHOW TABLES(select * from INFORMATION_SCHEMA.`TABLES`)
SHOW FILES
DESCRIBE
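A short sketch of how these commands might be combined in one session. It assumes the dfs.tmp workspace and the sengtest05 view that appear in the examples in this post; adjust the names to your own environment.

```sql
SHOW SCHEMAS;            -- list the available schemas (storage plugins and workspaces)
USE dfs.tmp;             -- make dfs.tmp the default schema for this session
SHOW TABLES;             -- list tables/views registered in the current schema
SHOW FILES;              -- list files and directories in the current workspace
SHOW FILES IN dfs.tmp;   -- the same listing, with the workspace named explicitly
DESCRIBE `sengtest05`;   -- show the column names and types of a view
```

For file system workspaces, SHOW TABLES typically lists only views, while the data directories themselves show up under SHOW FILES.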
Query execution plans
EXPLAIN PLAN FOR
EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR
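The two forms can be sketched as follows, reusing the CSV path from the CTAS examples in this post. The first form returns the physical plan; the second stops at the logical plan.

```sql
-- Physical plan: the operators Drill would actually run
EXPLAIN PLAN FOR
SELECT columns[0], columns[1]
FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv`
LIMIT 3;

-- Logical plan only, without choosing physical operators
EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR
SELECT columns[0], columns[1]
FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv`
LIMIT 3;
```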
Example queries against system tables
-- List tables
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.`TABLES`
ORDER BY TABLE_NAME DESC;
-- List columns
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Orders' AND TABLE_SCHEMA = 'HiveTest.SalesDB' AND COLUMN_NAME LIKE '%Total';
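Two more system-table queries in the same style, as a hedged sketch. The column names (SCHEMA_NAME, IS_MUTABLE, VIEW_DEFINITION) and the 'YES' value are what I understand Drill's INFORMATION_SCHEMA to expose; if they differ in your version, a plain select * on the table shows the actual columns.

```sql
-- List schemas that accept writes (candidates for CREATE TABLE AS)
SELECT SCHEMA_NAME, IS_MUTABLE
FROM INFORMATION_SCHEMA.`SCHEMATA`
WHERE IS_MUTABLE = 'YES';

-- List views together with their defining SQL
SELECT TABLE_SCHEMA, TABLE_NAME, VIEW_DEFINITION
FROM INFORMATION_SCHEMA.`VIEWS`;
```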
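Examples of table-creation statements:
Note that Drill tables can only be created on dfs (a file system storage plugin), and each table corresponds to one directory, e.g. dfs.tmp.sengtest01
-- Regular table (Parquet format by default)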
CREATE TABLE dfs.tmp.`sengtest01`(id,name) AS
SELECT columns[0], columns[1] FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv` LIMIT 3;
DROP TABLE dfs.tmp.`sengtest01`;
-- Partitioned table
CREATE TABLE dfs.tmp.`sengtest02`(id,name) PARTITION BY (id) AS
SELECT columns[0], columns[1] FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv` LIMIT 10000;
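The benefit of a partitioned table shows up when a query filters on the partition column, because Drill can then skip files that cannot match (partition pruning). A minimal sketch: the value '1001' is made up, and it is written as a string on the assumption that columns[0] read from a CSV file is text. As far as I know, PARTITION BY in CTAS applies to Parquet output only.

```sql
-- Filter on the partition column so that non-matching files can be pruned
SELECT id, name
FROM dfs.tmp.`sengtest02`
WHERE id = '1001';

-- Inspect the plan to see which files would be scanned
EXPLAIN PLAN FOR
SELECT id, name
FROM dfs.tmp.`sengtest02`
WHERE id = '1001';
```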
-- Table in JSON format
ALTER SESSION SET `store.format`='json';
CREATE TABLE dfs.tmp.`sengtest03`(id,name) AS
SELECT columns[0], columns[1] FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv` LIMIT 10000;
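Because `store.format` is changed at session level, every later CTAS in the same session also writes JSON until the option is changed back. A small sketch of restoring and checking it; the sys.options query uses select * because the column layout of that table varies between Drill versions.

```sql
-- Switch the session back to Parquet output
ALTER SESSION SET `store.format` = 'parquet';

-- Check the value currently in effect
SELECT * FROM sys.options WHERE name = 'store.format';
```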
-- View
CREATE VIEW dfs.tmp.`sengtest05`(id,name) AS
SELECT columns[0], columns[1] FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv` LIMIT 100;
Physically, a view is just a description file (a .view.drill file in the workspace directory). Here is an example:
[seng@sengtest tmp]$ more sengtest05.view.drill
{
"name" : "sengtest05",
"sql" : "SELECT `columns`[0], `columns`[1]\nFROM `hdfs`.`/BASEDATA/MASTERDATA/table1.csv`\nFETCH NEXT 100 ROWS ONLY",
"fields" : [ {
"name" : "id",
"type" : "ANY",
"isNullable" : true
}, {
"name" : "name",
"type" : "ANY",
"isNullable" : true
} ],
"workspaceSchemaPath" : [ ]
}
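Both fields in the file above have type ANY because the view selects raw CSV columns without a cast. A hedged sketch of a typed variant follows. The view name sengtest06 is hypothetical, and the casts assume the first column holds integers and the CSV has no header row; a non-numeric value would make the INTEGER cast fail at query time.

```sql
-- Give the view concrete column types by casting in its definition
CREATE OR REPLACE VIEW dfs.tmp.`sengtest06` AS
SELECT CAST(columns[0] AS INTEGER)      AS id,
       CAST(columns[1] AS VARCHAR(100)) AS name
FROM hdfs.`/BASEDATA/MASTERDATA/table1.csv`;

-- Query the view like a table
SELECT id, name FROM dfs.tmp.`sengtest06` LIMIT 5;

-- Remove the view (this deletes only the .view.drill file, not the data)
DROP VIEW dfs.tmp.`sengtest06`;
```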