Questions tagged [hiveql]

Variant of SQL used in the Apache Hive data warehouse infrastructure. Use this tag for questions related to the Hive Query Language including HiveQL syntax and HiveQL-specific functions.

1
vote
2 answers
20 views

Changing dd/mm/yyyy/ hh/mm/ss format to yyyymm in Hive

I'm using Hive at the moment. I have a column (column A) of strings which is in the following format 11/9/2009 0:00:00. I'd like to extract the yyyymm. i.e. I'd like the above string to be 200909. I'...
0
votes
1 answer
20 views

How Can I Run Sequential Temp Tables & Final SELECT Query

I'm used to BigQuery where I can run temp tables with the 'WITH' clause and then join those temp tables with a final query. However, I am now using a Hive db via DataGrip where I cannot run sequential ...
1
vote
1 answer
13 views

Hive explain plan: where to see a full table scan?

How can I tell from Hive EXPLAIN output whether there is a full table scan?
1
vote
1 answer
27 views

How to “filter” records in Hive table?

Imagine a table with id, status and modified_date. One id can have more than one record in the table. I need to get only the row for each id that has the current status together with the modified_date when ...
0
votes
0 answers
12 views

How to save Hive data in S3 in unencrypted format

I am trying to load unencrypted data from another source into Hive and save it to another location in S3 using EMR. However, the problem is that while saving the data in S3, it's storing as part ...
0
votes
1 answer
16 views

How to Create DDL in HIVE and save it as a file in your directory

Currently, I use the following code to show the DDL of tables in HIVE: Show create table cus_data I'm trying to write the results of that statement to a file in a given location on my command line. ...
1
vote
1 answer
13 views

only keep distinct rows when doing collect_set over a moving windowing function in hive

Let's say I have a Hive table that has 3 columns: merchant_id, week_id, acc_id. My goal is to collect the unique customers in the previous 4 weeks for each week and I am using a moving window to do this. ...
0
votes
0 answers
25 views

Optimizing Query based on Inserting into multiple columns into target table from one column of a source table

I am trying to insert values from one column of a table into multiple columns of another table based on conditions. I have prepared the query, but it's getting stuck at 80% of the MapReduce phase. I am ...
0
votes
0 answers
15 views

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

I am getting a weird issue while importing data into a partitioned table. Below is the query. INSERT OVERWRITE table db.table1 partition(date) select A.*,A.date from db.table2 A; Sometimes the query ...
0
votes
0 answers
6 views

Hive database size check for specific period

I am using Ambari Sandbox. I want to check the Hive database size for a specific time interval. I know the command below, which gives the entire size: dfs -du -s -h /path/to/table output I have ...
0
votes
0 answers
11 views

Distribute by a Range of IDs into multiple Reducers and Sort by the IDs for efficient Access through Hive

All, I am trying to find out the most efficient way of storing data into my Hive table which enables the query engine to make the best use of bloom filters and storage index. This table has billions ...
1
vote
2 answers
17 views

Get particular values from a string in hive

I have a hive table with two columns both are strings name details "john" , {"addr":"NY","phone":"1234"} "john" , {"addr":"CA", "phone":"7145"} "mary" , {"addr":"BOS","phone":"1234"} Is ...
0
votes
1 answer
11 views

HiveSQL Alter table renames column and does not move data

I have a table with hundreds of columns. I would like to move a column within this table to a new position. For example, my table (named "sample_table") looks like: var1, var2, var3, var4 w, x, y, ...
1
vote
2 answers
32 views

Explode on multiple columns in Hive

I'm trying to explode records in multiple columns in Hive. For example, if my dataset looks like this - COL_01 COL_02 COL_03 1 A, B X, Y, Z 2 D, E, F V, W I want this as ...
-2
votes
1 answer
34 views

How to truncate data and drop all partitions from a Hive table using Spark

How can I delete all data and drop all partitions from a Hive table, using Spark 2.3.0? truncate table my_table; // Deletes all data, but keeps partitions in metastore alter table my_table drop ...
0
votes
2 answers
31 views

Query taking time on production

We have this query where we are trying to identify customers with multiple credit option indicators. We have to reflect this query's output in our report and share it with business users. We have to run this ...
1
vote
1 answer
45 views

Need guidance in re-writing this query

We have this query, which we run to generate the calendar week data. This query hits the same view twice and perhaps creates a Cartesian product due to the absence of a join ON clause. Is there any way to ...
1
vote
1 answer
43 views

How to iterate over columns in the same row in HIVE table

I have a requirement like below: I've got a HIVE table containing below fields: Table: USER_PRODUCT user_id, product1_id, product2_id, product3_id, ... , product10_id Here, the actual item for ...
1
vote
1 answer
26 views

Changing the partition spec of a hive table and move data

I have an external Hive table employee which is partitioned by extract_timestamp (yyyy-mm-dd hh:mm:ss) as below. empid empname extract_time 1 abc 2019-05-17 00:00:00 2 def ...
0
votes
2 answers
35 views

Is there any built-in function in Hive that calculates the intersection of two lists in a Hive table?

I have a Hive table that has 3 columns: ["merchants_index", "weeks_index", "customer_index"]. The final goal is to calculate the percentage of repeat customers for each merchant in each week. By ...
2
votes
0 answers
26 views

Force HiveServer2 to run MapReduce job

I am using hive-1.1.0. Submitting queries to HiveServer2 via Beeline which are read-only and contain no predicates will cause HiveServer2 to try to read the data from HDFS itself without spawning a ...
-3
votes
0 answers
24 views

One to Many Join between two tables using SQL or HIVE

I have two tables, one with a sku column and the second with a location column. I want to map each sku to every location. Performing a cross join is taking a huge amount of compute time. Is there an alternate ...
0
votes
0 answers
10 views

Calendar table in HiveQL to aggregate data by week?

I am looking for a way to aggregate tickets week over week from a data source within Hive. Not being very familiar with the syntax and what hive offers, I am curious if there is any built-in calendar ...
1
vote
2 answers
35 views

How to add dynamic column with static value in hive

I have structure of table like below: 0: jdbc:hive2://vw118287.ds.dev.accenture.com> desc sample2; Getting log thread is interrupted, since query is done! +-------------+------------+----------+--+...
2
votes
4 answers
80 views

remove extra zeros from string

I would like to write a regular expression to remove extra zeros from a string. REGEXP_REPLACE(REGEXP_REPLACE("Input_String","^0+", ''),'0+$','') fails: if input_string = 120 then output_string = 12 instead ...
-2
votes
0 answers
11 views

HDFS Vs Partitioning in Hive

Replication=8 No Of Blocks=4 Each block is replicated 8 times. DIR-1-----[1,2,3] DIR-2-----[1,2,3,4] DIR-3-----[1,2,3,5] DIR-4-----[1,2,3,6] DIR-5-----[1,7,3] DIR-6-----[1,9,3] DIR-7-----[1,2,3,6,5] ...
0
votes
1 answer
19 views

Problem with implementing hivemall regression function

Hello, when attempting to use Hivemall's regression toolkit I run into errors when building the feature representation. I've been following this guide https://hivemall.incubator.apache.org/...
1
vote
1 answer
24 views

Concat String columns in hive

I need to concat 3 columns from my table say a,b,c. If the length of the columns is greater than 0 then I have to concat all 3 columns and store it as another column d in the below format. 1:a2:b3:c ...
0
votes
0 answers
19 views

Spark access hive table with JsonSerde

I have a hive table create table json_tab ( c1 string, c2 int, c3 array<struct<c4:string, c5:int>> ) partitioned by (c6 bigint) row format serde 'org.apache.hive.hcatalog....
1
vote
0 answers
42 views

Joining two CTEs gives different result than if CTEs were tables

I am creating two tables, the first has a serial number, and the second table is a filtered version of the first table and includes the original serial number as well as additional columns. I then do ...
1
vote
2 answers
31 views

SQL Query to Select Min and Max Values For Each Day Over a Period

I would like to select all the rows that contain either min or max datetime values for each equipment_id, for every day included in the period. The code below selects the min and max datetime values ...
2
votes
1 answer
49 views

How to create external tables from parquet files in s3 using hive 1.2?

I have created an external table in Qubole(Hive) which reads parquet(compressed: snappy) files from s3, but on performing a SELECT * table_name I am getting null values for all columns except the ...
1
vote
4 answers
41 views

How to join hive tables based on condition of the joining column

We have a hive table like below: num value 123 A 456 B 789 C 101 D The joining table is: num Symbols 123 ASC 456001 JEN 456002 JEN 456003 JEN ...
0
votes
0 answers
13 views

How to exclude multiple strings with rlike

Is there any way to filter out multiple strings in a single rlike query? For example, I want the query result to be without 'apple' and 'pear' sample.name sample.text a I have an ...
1
vote
2 answers
37 views

How to deal with a semicolon “;” in a string comparison?

I am writing a Hive query and I need to compare a PIN (col) to a string value. The pin is encrypted and the encrypted value contains special characters. I need to select all the rows with PIN ...
0
votes
1 answer
39 views

Is there an optimised way to write a SQL query to find the difference between two data-sets?

Following is the query and sample data-set (actual data-set is huge and residing in HDFS) I am trying to find out the diff in data-set 1 with following query. Is there any better way to achieve this ...
0
votes
0 answers
53 views

left join using on clause with error:“both left and right aliases encountered in join” and using where clause but filter null values

I am using this code: create table table3 as select a.*,b.* from table1 a left join table2 b on a.id=b.id where a.date>=b.date and a.age<b.age However, table1 has ...
0
votes
1 answer
38 views

How to fetch data from a database in Hive using the Hive CLI

I have a final table in Hive with 3 columns: Date, Time, A. The Date column contains multiple days and the Time column contains 24 hours of data, like this: Date Time A 2019-...
2
votes
2 answers
49 views

How to get the date 20 days before a date in YYYYMMDD format

How do I get the date 20 days before a date in the format YYYYMMDD? The function date_sub() does not seem to work. For example, get the date 20 days before '20180912' in Hive. I am using date_sub() in joining two tables by date....
0
votes
0 answers
25 views

Can I export data from Hive into a CSV on my local machine as a Python script?

I am trying to copy the data from hive table into a CSV file on my local machine using a Python script without using Pandas. I tried connecting to Hive and can run select queries and read data from ...
1
vote
1 answer
23 views

Hive query with case statement

I am trying to use a field in my data called priority in order to drive a numerical value for the DATE_ADD function. Essentially, the priority determines how many days before the issue is out of SLA. ...
0
votes
1 answer
18 views

Pass Hive parameters to EMR Step

I am trying to use EMR to run a query on an EXTERNAL table partitioned by date, where the dt partition has the format YYYYmmdd i.e: 20190121. CREATE EXTERNAL TABLE `my_schema`.`tracking_table`( `id`...
0
votes
2 answers
42 views

How to convert calculated formula from salesforce to SQL

We have one Opportunity table in salesforce and this table has one calculated column called as "Is_XYZ". Calculated formula for "Is_XYZ" column is - calculatedFormula: IF( AND( OR( ...
0
votes
1 answer
18 views

Avoid .deflate file in hive

I am executing some queries in hive and the end result file is in .deflate format. The tables created in queries already have textFile format mentioned. I already tried setting set hive.exec....
1
vote
2 answers
64 views

Getting last day of previous quarter

When I run SELECT MAKEDATE(YEAR(CURDATE()), 1) + INTERVAL QUARTER(CURDATE())-1 QUARTER - INTERVAL 1 DAY here, it works as intended. However, in Hive I get an error that I'm missing a closing ...
1
vote
1 answer
39 views

How to get the percentage difference during group by stage of hive SQL if possible?

I have a query which does grouping by day and name and count the number of rows for that group. Below is the data which I get after group by phase. name day count A 2019-01-01 120 ...
3
votes
0 answers
40 views

Hue distinct users in rolling 7 day count

I've spent 3 days researching this and trying to figure it out but with no luck. Right now I'm considering just loading the data into a new table one day at a time (would take too long and really don'...
0
votes
0 answers
24 views

HIVE: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

When I execute the Hive CLI in Ubuntu, I am getting the following error message: hadoopgudi@hadoopgudi-VirtualBox:~$ hive SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:...
-1
votes
0 answers
19 views

Is it possible to create a partitioned table in Hive without the non-partitioned table (base table)?

How do I create a partitioned table in Hive without using the non-partitioned table (base table)?
2
votes
2 answers
23 views

Alicloud SQL-Hive, SQL to exclude number only and alphabet only and single chinese word only

I have a column like below **col1** 1244 a888d ahahd 我 我是 19mon The output I would like to have is **col1** a888d 我是 19mon I was trying to use syntax below to exclude number only and alphabet ...