Problem Scenario 16 : You have been given the following MySQL database details as well as
other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the assignment below.
1. Create a table in hive as below.
create table departments_hive(department_id int, department_name string);
2. Now import data from the mysql table departments into this hive table. Please make sure that
the data is visible using the hive command below: select * from departments_hive
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
Step 1 : Create the hive table as specified.
hive
show tables;
create table departments_hive(department_id int, department_name string);
Step 2 : The important point here is that when a table is created without specifying a field
delimiter, hive uses its default delimiter, ^A (\001). Hence, while importing the data we have
to provide the matching delimiter.
sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username=retail_dba \
--password=cloudera \
--table departments \
--hive-home /user/hive/warehouse \
--hive-import \
--hive-overwrite \
--hive-table departments_hive \
--fields-terminated-by '\001'
Step 3 : Check the data in the directory.
hdfs dfs -ls /user/hive/warehouse/departments_hive
hdfs dfs -cat /user/hive/warehouse/departments_hive/part*
Check the data in the hive table.
select * from departments_hive;
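To illustrate why the delimiter matters, here is a minimal Scala sketch of how a ^A-delimited record splits back into the two columns of departments_hive. The sample rows and the object name are hypothetical and are not taken from the actual retail_db data; the sketch only assumes that each line of a part file holds the fields separated by the \001 character.

```scala
object HiveDelimiterSketch {
  // Hive's default field delimiter is Ctrl-A, the character with code 1 (octal \001).
  val delimiter: Char = '\u0001'

  // Hypothetical sample rows, standing in for lines of a part file.
  val rawLines: Seq[String] = Seq(
    s"2${delimiter}Fitness",
    s"3${delimiter}Footwear"
  )

  // Split each line on the delimiter to recover (department_id, department_name).
  val parsed: Seq[(Int, String)] = rawLines.map { line =>
    val fields = line.split(delimiter)
    (fields(0).toInt, fields(1))
  }

  def main(args: Array[String]): Unit = parsed.foreach(println)
}
```

If the files were written with a different delimiter (for example a comma), the same split would return a single field per line, which is why a mismatched import typically shows up in hive as NULL columns.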
Problem Scenario 92 : You have been given a Spark Scala application, which is bundled in
a jar named hadoopexam.jar.
Your application class name is com.hadoopexam.MyTask
You want the application, when submitted, to launch its driver on one of the cluster
nodes.
Please complete the following command to submit the application.
spark-submit XXX --master yarn \
YYY $SPARK_HOME/lib/hadoopexam.jar 10
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution
XXX : --class com.hadoopexam.MyTask
YYY : --deploy-mode cluster
Problem Scenario 43 : You have been given following code snippet.
val grouped = sc.parallelize(Seq(((1,"two"), List((3,4), (5,6)))))
val flattened = grouped.flatMap {A =>
groupValues.map { value => B }
}
You need to generate following output.
Hence replace A and B
Array((1,two,3,4),(1,two,5,6))
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
A : case (key, groupValues)
B : (key._1, key._2, value._1, value._2)
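With A and B substituted, the snippet can be tried without a cluster. The sketch below is an approximation that uses a local Seq in place of the RDD (so there is no sc.parallelize and no collect); the object name is illustrative, and flatMap and map behave the same way on the local collection for this purpose.

```scala
object FlattenGroupedSketch {
  // Same shape as the scenario's data, held in a local Seq instead of an RDD.
  val grouped = Seq(((1, "two"), List((3, 4), (5, 6))))

  // A is the pattern match on (key, groupValues);
  // B builds one flat 4-tuple per value in the group.
  val flattened = grouped.flatMap { case (key, groupValues) =>
    groupValues.map { value => (key._1, key._2, value._1, value._2) }
  }

  def main(args: Array[String]): Unit = println(flattened)
}
```

Each inner pair is combined with its key, so one grouped record with two values yields two flat records: (1,two,3,4) and (1,two,5,6).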
Problem Scenario 61 : You have been given below code snippet.
val a = sc.parallelize(List("dog", "salmon", "salmon", "rat", "elephant"), 3)
val b = a.keyBy(_.length)
val c = sc.parallelize(List("dog","cat","gnu","salmon","rabbit","turkey","wolf","bear","bee"), 3)
val d = c.keyBy(_.length)
operation1
Write a correct code snippet for operation1 which will produce the desired output, shown below.
Array[(Int, (String, Option[String]))] = Array((6,(salmon,Some(salmon))),
(6,(salmon,Some(rabbit))),
(6,(salmon,Some(turkey))), (6,(salmon,Some(salmon))), (6,(salmon,Some(rabbit))),
(6,(salmon,Some(turkey))), (3,(dog,Some(dog))), (3,(dog,Some(cat))),
(3,(dog,Some(gnu))), (3,(dog,Some(bee))), (3,(rat,Some(dog))), (3,(rat,Some(cat))),
(3,(rat,Some(gnu))), (3,(rat,Some(bee))), (8,(elephant,None)))
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
b.leftOuterJoin(d).collect
leftOuterJoin [Pair] : Performs a left outer join using two key-value RDDs. Please note
that the keys must be generally comparable to make this work.
keyBy : Constructs two-component tuples (key-value pairs) by applying a function on each
data item. The result of the function becomes the key and the original data item becomes
the value of the newly created tuples.
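The semantics of keyBy and leftOuterJoin described above can be sketched on plain Scala collections. This is an approximation for illustration, not Spark's implementation: the helper functions keyBy and leftOuterJoin below are hypothetical local stand-ins for the RDD methods, and the element order of the result may differ from what a partitioned RDD returns.

```scala
object LeftOuterJoinSketch {
  val a = List("dog", "salmon", "salmon", "rat", "elephant")
  val c = List("dog", "cat", "gnu", "salmon", "rabbit", "turkey", "wolf", "bear", "bee")

  // keyBy: pair each item with the key computed from it.
  def keyBy[K, V](items: Seq[V])(f: V => K): Seq[(K, V)] =
    items.map(v => (f(v), v))

  // leftOuterJoin: keep every left pair; emit one row per matching right value
  // wrapped in Some, or a single row with None when the key has no match.
  def leftOuterJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, Option[W]))] = {
    val rightByKey: Map[K, Seq[W]] =
      right.groupBy(_._1).map { case (k, pairs) => (k, pairs.map(_._2)) }
    left.flatMap { case (k, v) =>
      val matches = rightByKey.getOrElse(k, Seq.empty[W])
      if (matches.isEmpty) Seq((k, (v, Option.empty[W])))
      else matches.map(w => (k, (v, Option(w))))
    }
  }

  val b = keyBy(a)(_.length)
  val d = keyBy(c)(_.length)
  val joined = leftOuterJoin(b, d)

  def main(args: Array[String]): Unit = joined.foreach(println)
}
```

Here "elephant" has length 8 and no word in c shares that key, so it appears once with None, while each occurrence of "salmon" is paired with all three six-letter words from c.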
Problem Scenario 94 : You have to run your Spark application on yarn with each executor
having 20GB of memory, and the number of executors should be 50. Please replace XXX, YYY, ZZZ
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
--class com.hadoopexam.MyTask \
XXX \
--deploy-mode cluster \ # can be client for client mode
YYY \
ZZZ \
/path/to/hadoopexam.jar \
1000
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution
XXX : --master yarn
YYY : --executor-memory 20G
ZZZ : --num-executors 50
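To see how the three answers fit into the full command, the following Scala sketch assembles the spark-submit argument list from its parts. The helper submitArgs and the object name are hypothetical and exist only for illustration; the flags themselves are the standard spark-submit options used in this scenario.

```scala
object SubmitCommandSketch {
  // Illustrative helper: builds the spark-submit argument list for a YARN cluster-mode job.
  def submitArgs(mainClass: String, executorMemory: String, numExecutors: Int,
                 jar: String, appArgs: Seq[String]): Seq[String] =
    Seq(
      "./bin/spark-submit",
      "--class", mainClass,
      "--master", "yarn",                        // XXX
      "--deploy-mode", "cluster",
      "--executor-memory", executorMemory,       // YYY
      "--num-executors", numExecutors.toString,  // ZZZ
      jar
    ) ++ appArgs

  // Values taken from the scenario: 20G per executor, 50 executors, one application argument.
  val scenario94: Seq[String] =
    submitArgs("com.hadoopexam.MyTask", "20G", 50, "/path/to/hadoopexam.jar", Seq("1000"))

  def main(args: Array[String]): Unit = println(scenario94.mkString(" "))
}
```

Options must come before the application jar; anything after the jar path (here 1000) is passed to the application itself rather than to spark-submit.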