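In Hive, you can read external files by defining an external table over them with CREATE EXTERNAL TABLE, adding a LOCATION 'file_path' clause that points to the directory containing the files, and using the ROW FORMAT and STORED AS clauses to specify the data format: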
- CSV format:
CREATE EXTERNAL TABLE table_name ( column1 datatype, column2 datatype, ... ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
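- JSON format (the OpenX JsonSerDe used here is a third-party library, so its JAR must be available to Hive):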
CREATE EXTERNAL TABLE table_name ( column1 datatype, column2 datatype, ... ) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' WITH SERDEPROPERTIES ( "serialization.format" = "1" ) STORED AS TEXTFILE;
- Parquet format:
CREATE EXTERNAL TABLE table_name ( column1 datatype, column2 datatype, ... ) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' WITH SERDEPROPERTIES ( "serialization.format" = "1" ) STORED AS PARQUET;
- ORC format:
CREATE EXTERNAL TABLE table_name ( column1 datatype, column2 datatype, ... ) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' WITH SERDEPROPERTIES ( "serialization.format" = "1" ) STORED AS ORC;
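Replace table_name, column1, column2, datatype, and so on with the actual table name, column names, and data types. Also add a LOCATION clause pointing to the directory that holds your files, and adjust the parameters in SERDEPROPERTIES as needed.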
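
As a rough sketch of the CSV case with the placeholders filled in: the table name sales_csv, its three columns, and the path /data/sales_csv are hypothetical and chosen only for illustration. The example also assumes each file begins with one header row, which the skip.header.line.count table property tells Hive to skip. Drop that property if your files have no header.

```sql
-- Hypothetical table, columns, and HDFS path, for illustration only.
-- LOCATION must name a directory; Hive reads every file inside it.
-- 'skip.header.line.count' = '1' skips the first line of each file.
CREATE EXTERNAL TABLE IF NOT EXISTS sales_csv (
  order_id INT,
  customer STRING,
  amount   DOUBLE
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/sales_csv'
TBLPROPERTIES ('skip.header.line.count' = '1');

-- Once the table is defined, the files can be queried like any other table.
SELECT customer, SUM(amount) AS total_amount
FROM sales_csv
GROUP BY customer;
```

Because the table is EXTERNAL, dropping it removes only the metadata and leaves the files under the LOCATION directory in place.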