Suppose you want to load data into an already populated Hive table and you want to apply data as update of existing records and insert of new records. Neither Hive nor Impala query language includes MERGE command (or UPSERT known from Teradata). Also it is a good practice to ovewrite a whole table partition at once. Here is the command which performs what you would achieve in RDBMS using MERGE or UPSERT command.
Assumptions:
-- on Hive
set hive.exec.dynamic.partition.mode=true
set hive.exec.dynamic.partition.mode=nonstrict;
-- on Hive
INSERT OVERWRITE TABLE MY_TARGET_TABLE PARTITION (PARTITION_COLUMN)
SELECT
COALESCE(SRC.COLUMN_1, TGT.COLUMN_1) AS COLUMN_1,
COALESCE(SRC.COLUMN_2, TGT.COLUMN_2) AS COLUMN_2,
COALESCE(SRC.COLUMN_3, TGT.COLUMN_3) AS COLUMN_3,
COALESCE(SRC.PARTITION_COLUMN, TGT.PARTITION_COLUMN) AS PARTITION_COLUMN
FROM MY_SOURCE_TABLE SRC
FULL OUTER JOIN MY_TARGET_TABLE TGT
ON TGT.KEY_COLUMN_1 = SRC.KEY_COLUMN_1
AND TGT.KEY_COLUMN_2 = SRC.KEY_COLUMN_2
WHERE coalesce(SRC.COLUMN_3, TGT.COLUMN_3) = 'something' /* filter condition */
;
On Hive you must set properties:
set hive.exec.dynamic.partition.mode=true
set hive.exec.dynamic.partition.mode=nonstrict