Questions tagged [hive]

Questions about Hive and HiveQL.

Apache Hive is a data warehouse tool for working with large data sets.

Main Site: https://hive.apache.org

34 questions
5 votes, 1 answer

Cumulative sum using HiveQL

I have a table in Hive which looks like:

col1  col2
b     1
b     2
a     3
b     2
c     4
c     5

How do I, with HiveQL, group the col1 elements together, sum them up, sort by the sum, as well as create a…
klx123
2 votes, 1 answer

Joining the WHERE clause to the SELECT statement of two different tables

How can I properly structure the query below to make it work? I would like Ref_CD = MBR_ID_TYPE_ID to be part of my select statement at the beginning of it.

select MBR_ID_TYP_ID || '-' || ( select ref_desc from ref where…
Ina gurey
1 vote, 1 answer

SQL filter only if each unique value has more than N records

Here is my sample SQL statement:

SELECT DAY, name, value
FROM my_table
WHERE DAY = '${date}'
GROUP BY DAY, name, value
ORDER BY name ASC

For example, there are 3 unique names in the 'name' column: Alice, Bob, Clark. Alice has 5…
TJCLK
1 vote, 0 answers

Hive query with RANK() in a WHERE clause

I want to do this, to get all the rows for a given a,b combo that have the highest value of c:

SELECT a, b, c
FROM x
WHERE RANK() OVER (PARTITION BY a,b ORDER BY c DESC) = 1

It fails, saying invalid column reference 'c': (possible column names are:…
PhilHibbs
0 votes, 0 answers

Sum of integers between two rows in Hive

Suppose I have a Hive table that looks like this:

a   b
41  77
8   32
31  76

I would like to have another column, say c, which applies a function that loops through all integers between columns a and b. For example, c could contain the sum of…