Hive allows you to emit all the elements of an array into multiple rows using the
explode UDTF, but there is no easy way to explode multiple arrays at the same time.
Say you have a table my_table which contains two array columns of the same size, for example an ordered list of multiple values, possibly of different types. You then need to explode both arrays together, so that each output row contains the values found at the same position in the two arrays.
This can be easily solved with Brickhouse's numeric_range UDTF. It emits integers over a specified numeric range, and takes 1, 2 or 3 arguments. With one argument n, it emits the integers from 0 to n - 1. With two, it emits from the first argument up to the second argument minus 1. With three, it uses the third argument as the increment.
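The three argument forms described above follow the same conventions as Python's built-in range, so a small Python model can make them concrete. This is only a sketch of the semantics as stated in the text, not Brickhouse code; how the real UDTF treats edge cases such as negative increments is not covered here and may differ.

```python
def numeric_range(*args):
    """Python model of the range semantics described above (not Brickhouse code)."""
    if not 1 <= len(args) <= 3:
        raise ValueError("numeric_range takes 1, 2 or 3 arguments")
    # Python's range uses the same conventions: the start defaults to 0,
    # the end value is exclusive, and the optional third argument is the step.
    return list(range(*args))


print(numeric_range(3))         # [0, 1, 2]
print(numeric_range(2, 5))      # [2, 3, 4]
print(numeric_range(0, 10, 4))  # [0, 4, 8]
```

In Hive each of these integers would be emitted as its own row rather than collected into a list.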
The array_index UDF simply returns an array's value at the i-th index. It is needed because Hive's bracket [ ] operators currently support only constant values (see https://issues.apache.org/jira/browse/HIVE-1955). Brickhouse also contains a map_index UDF to return the value of a map for a particular key.
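Putting the two pieces together, the idea is to generate one index per array element and then look up both arrays at that index. The following Python model sketches that row-by-row behaviour. The column names id, a1 and a2 and the sample data are hypothetical, chosen only for illustration, and the code mimics the described approach rather than reproducing an actual Hive query.

```python
def explode_together(rows):
    """Yield one output row per array position, pairing a1[n] with a2[n]."""
    for row in rows:
        # One index per element, playing the role of numeric_range(size(a1)).
        for n in range(len(row["a1"])):
            # Index both arrays with the same non-constant n,
            # playing the role of array_index.
            yield (row["id"], row["a1"][n], row["a2"][n])


# Hypothetical stand-in for my_table: two same-sized arrays per row.
my_table = [
    {"id": "x", "a1": ["a", "b"], "a2": [1, 2]},
    {"id": "y", "a1": ["c"], "a2": [3]},
]

result = list(explode_together(my_table))
print(result)  # [('x', 'a', 1), ('x', 'b', 2), ('y', 'c', 3)]
```

Because the index is taken from the size of the first array, the model relies on both arrays being the same size, as the text assumes.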