


#1 2014-11-04 20:04:38

bwgriffith
Member
7 posts

bwgriffith said:

Demo2_UseCase_HiveELT - MapR Sandbox error

Tags: [error, java]

I'm trying to work through the HiveELT demo on the MapR sandbox. The load-data and create-tables demos ran fine. However, whenever the client needs to run a MapReduce job as part of a Hive job, it fails.
Regular MapReduce demos also work fine, and I can even run the generated HiveQL statement in Hue, so this seems to affect only Hive-based MapReduce submitted through the client. The error portion of the log is below; any help would be appreciated!

java.io.IOException: cannot find dir = maprfs://maprdemo:7222/user/talend/data/usecase1/in/orders/orders.txt in pathToPartitionInfo: [maprfs:/user/talend/data/usecase1/in/orders, maprfs:/user/talend/data/usecase1/in/customers]
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:298)
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:260)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:104)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:409)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1060)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1052)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:173)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:934)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:885)
[ERROR]: org.apache.hadoop.hive.ql.exec.Task - Job Submission failed with exception 'java.io.IOException(cannot find dir = maprfs://maprdemo:7222/user/talend/data/usecase1/in/orders/orders.txt in pathToPartitionInfo: [maprfs:/user/talend/data/usecase1/in/orders, maprfs:/user/talend/data/usecase1/in/customers])'


#2 2014-11-04 21:00:15

willm
Member
477 posts

willm said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

Your error message seems to indicate that your load file is missing from HDFS. Can you confirm that the file is there, either by browsing HDFS in the web UI or from a console session over PuTTY?
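
For the console route, here is a minimal sketch of the check, assuming the `hadoop` command-line client is on the PATH of the machine where it runs. The helper names `build_ls_command` and `path_is_listed` are made up for illustration; the only real command used is `hadoop fs -ls`.

```python
import subprocess


def build_ls_command(path):
    """Return the argument list that lists `path` with the Hadoop file system shell."""
    return ["hadoop", "fs", "-ls", path]


def path_is_listed(path):
    """Run the listing and report whether the command exited with status 0.

    Assumption: the `hadoop` client is installed and on PATH.
    """
    completed = subprocess.run(
        build_ls_command(path), capture_output=True, text=True
    )
    return completed.returncode == 0
```

Calling `path_is_listed("/user/talend/data/usecase1/in/orders/orders.txt")` on the sandbox would then tell you whether the file named in the error can be listed.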


#3 2014-11-04 21:15:11

bwgriffith
Member
7 posts

bwgriffith said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

I verified that the files exist, both through Hue and from the console. The HiveQL generated by the HiveMap task also runs successfully in Hue, which confirms the data is there. But the job still fails on the last task. I've made no changes to the VM, and I ran the prerequisite jobs to load the data and create the metadata.


#4 2014-11-04 22:04:05

jandry
Talend Team


jandry said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

Can you post the query and the create table statement?  And a screenshot of the process?


#5 2014-11-04 22:09:19

bwgriffith
Member
7 posts

bwgriffith said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

I'm not sure where to grab the create table statement, since a prerequisite job created the tables without problems. I've attached a screenshot of the job that loads the data, showing where it fails.
The query that works in Hue is below (customers and orders are the tables whose files are referenced in the error):

SELECT
  customers.customernumber, customers.customername, customers.streetaddress, customers.city, customers.zip, customers.state,
  SUM(orders.amount), COUNT(orders.amount), MIN(orders.amount), MAX(orders.amount), AVG(orders.amount)
FROM
  customers JOIN orders ON (orders.customernumber = customers.customernumber)
GROUP BY customers.customernumber, customers.customername, customers.streetaddress, customers.city, customers.zip, customers.state
mini_Talend_Error.jpg_20141104-2209.jpg

Last edited by bwgriffith (2014-11-04 22:10:16)


#6 2014-11-05 01:22:10

jandry
Talend Team


jandry said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

OK, so this may be a simple path problem. If you look at the job that created the tables and check where the orders.txt file is being created, I think that location is different from the path in your error message.
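
To make the comparison concrete, here is a small sketch that lines up the two kinds of path printed in the error message from post #1. It is not Hive's lookup code, only an illustration, and the names `is_under` and `compare` are invented for it. It checks each directory listed in pathToPartitionInfo against the failing path twice: once as a full URI string and once using only the path component.

```python
from urllib.parse import urlsplit

# Values copied verbatim from the error message in post #1.
FAILING_PATH = "maprfs://maprdemo:7222/user/talend/data/usecase1/in/orders/orders.txt"
PARTITION_DIRS = [
    "maprfs:/user/talend/data/usecase1/in/orders",
    "maprfs:/user/talend/data/usecase1/in/customers",
]


def is_under(child, parent):
    """True if `child` equals `parent` or sits below it (plain string comparison)."""
    parent = parent.rstrip("/")
    return child == parent or child.startswith(parent + "/")


def compare(path, directories):
    """Return (directory, full_uri_match, path_only_match) for each directory."""
    rows = []
    for directory in directories:
        # Comparison on the whole URI, including scheme and host:port.
        full_uri_match = is_under(path, directory)
        # Comparison on the path component only, ignoring scheme and host:port.
        path_only_match = is_under(urlsplit(path).path, urlsplit(directory).path)
        rows.append((directory, full_uri_match, path_only_match))
    return rows


for directory, full_uri_match, path_only_match in compare(FAILING_PATH, PARTITION_DIRS):
    print(directory, "full URI:", full_uri_match, "path only:", path_only_match)
```

With the values from the log, the orders directory contains the file when only the path component is compared, but not when the full strings are compared, because the failing path carries the `maprdemo:7222` host and port while the registered directories do not. Whether that difference is what actually trips the job is an assumption worth checking, not a confirmed diagnosis.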


#7 2014-11-05 02:37:33

bwgriffith
Member
7 posts

bwgriffith said:

Re: Demo2_UseCase_HiveELT - MapR Sandbox error

Everything checks out. I can browse and query the orders and customers tables in Hue, and I can run the query posted above. The error occurs only in a Hive task that executes a MapReduce job. Creating and dropping tables works fine, so I can connect to Hive.
I'm not sure whether it's a reference issue, but I downloaded this VM from Talend and made no changes before running these tasks.

