Big Data Integration and Processing Quiz 7 Answer




Quiz 7 - More on Spark



Q1) Which part of Spark is in charge of creating RDDs?
  • Storage
  • Local CPU
  • Driver Program
  • Spark Executor
  • Worker Node



Q2) How does lazy evaluation work in Spark?
  • Transformations are not executed until the action stage.
  • Actions are queued and executed at a certain threshold.
  • Actions are not executed until the transformation stage.
  • Transformations are queued and executed at a certain threshold.
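The idea of deferring work until an action asks for a result can be sketched in plain Python. This is only a toy analogy, not Spark's API: `LazyDataset`, `double`, and the sample data are hypothetical names made up for illustration.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records transformations, runs them only on an action."""

    def __init__(self, data, ops=None):
        self._data = list(data)
        self._ops = ops or []  # pending transformations, not yet executed

    def map(self, fn):
        # Transformation: return a new dataset with one more recorded step.
        return LazyDataset(self._data, self._ops + [("map", fn)])

    def filter(self, pred):
        # Transformation: also only recorded, never run here.
        return LazyDataset(self._data, self._ops + [("filter", pred)])

    def collect(self):
        # Action: only now is the recorded pipeline actually executed.
        out = self._data
        for kind, fn in self._ops:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out


calls = []


def double(x):
    calls.append(x)  # side effect lets us observe when the work happens
    return x * 2


pending = LazyDataset([1, 2, 3]).map(double).filter(lambda x: x > 2)
calls_before_action = len(calls)  # nothing has run yet
result = pending.collect()        # the pipeline runs here
```

Checking `calls` before and after `collect()` shows that `double` is never invoked while the pipeline is merely being described.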



Q3) According to the lecture, what is a consequence of lazy evaluation?
  • There are no consequences.
  • Hiccups within the system during queue execution.
  • Errors sometimes do not show up until the action stage.
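Why a bug can stay hidden until late in a lazy pipeline can be seen with Python's built-in `map`, which is itself lazy. This is an analogy under that assumption rather than Spark code, and the `records` list is invented sample data.

```python
records = ["1", "2", "oops", "4"]

# Building the pipeline succeeds even though "oops" cannot be parsed,
# because map() only records the work to do.
parsed = map(int, records)
defined_without_error = True

try:
    total = sum(parsed)  # consuming the iterator plays the role of the action
    error_message = None
except ValueError as exc:
    # The bad record is only discovered once the results are demanded.
    error_message = str(exc)
```

The faulty line is the `map` call, yet the exception is raised from `sum`, which is the same kind of delay one can run into when debugging lazily evaluated jobs.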



Q4) What is a wide transformation?
  • The name for the most used transformations.
  • Transformations that take a lot of nodes to complete.
  • A transformation that requires data shuffling across node partitions.
  • A transformation that takes longer than a narrow transformation.
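The difference between a per-partition step and one that needs a shuffle can be sketched with lists standing in for partitions. Everything here is a simplified assumption: the partition contents, `shuffle_by_key`, and the toy partitioner are hypothetical, and real Spark moves data between machines rather than between Python lists.

```python
from collections import defaultdict

# Two "partitions" of key/value pairs, as if held on two different nodes.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
]

# Narrow step: each output partition depends on exactly one input partition.
narrow = [[(k, v * 10) for k, v in part] for part in partitions]


def shuffle_by_key(parts, num_out):
    """Wide step, part 1: move every record to the partition that owns its key."""
    out = [[] for _ in range(num_out)]
    for part in parts:
        for k, v in part:
            out[ord(k[0]) % num_out].append((k, v))  # deterministic toy partitioner
    return out


def reduce_partition(part):
    """Wide step, part 2: sum values per key once all of a key's records are co-located."""
    totals = defaultdict(int)
    for k, v in part:
        totals[k] += v
    return dict(totals)


reduced = [reduce_partition(p) for p in shuffle_by_key(narrow, 2)]
```

The per-key sum cannot be computed partition by partition, because the records for key `"a"` start out in both input partitions; they have to be brought together first.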



Q5) Where is the data from each worker node sent after the collect function is called?
  • Spark SQL
  • Spark Context
  • Spark Streaming
  • Other Worker Nodes
  • None; Stays in the Same Node



Q6) What are DataFrames?
  • A type of narrow transformation.
  • A column-like data format that can be read by Spark SQL.
  • A special type of data node that contains a framework to manipulate SQL.



Q7) Can RDDs be converted into DataFrames directly without manipulation?
  • Yes
  • No: lines have to be converted into rows.
  • No: RDDs need to be made relational first.
  • No: RDDs cannot be converted into DataFrames.
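What turning raw text lines into structured rows might look like can be sketched without Spark. The `Person` schema, the `to_row` helper, and the sample lines are all hypothetical; in PySpark the equivalent step would produce `Row` objects, which is not shown here.

```python
from typing import NamedTuple


class Person(NamedTuple):  # hypothetical schema, for illustration only
    name: str
    age: int


lines = ["Alice,34", "Bob,45"]  # unstructured text, one record per line


def to_row(line):
    """Split a raw line and attach field names and types to its parts."""
    name, age = line.split(",")
    return Person(name=name.strip(), age=int(age))


rows = [to_row(line) for line in lines]

# Once the data has named, typed columns, column-style queries become possible.
names_over_40 = [r.name for r in rows if r.age > 40]
```

A plain string carries no column names or types, so some per-line conversion of this kind is needed before the data can be queried by field.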



Q8) What are the functions of Spark SQL mentioned in the lecture? (Choose 3)
  • Connect to a variety of databases.
  • Better worker node interpolation.
  • Enables relational queries on Spark.
  • Better ability to manipulate big data.
  • Deploy business intelligence tools over Spark.
  • Efficient data manipulation using an SQL-like structure.



Q9) What is a triplet in GraphX?
  • A type of data to contain vertex info.
  • A type of data to contain edge info.
  • A type of data to contain both edge and vertex info.
  • A type of data to contain the information on connections between vertices and edges.
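In GraphX, an edge triplet joins an edge with the attributes of its source and destination vertices. A rough plain-Python picture of that join follows; the `Triplet` type and the tiny graph are hypothetical and only mimic the shape of the data, not the GraphX API.

```python
from typing import NamedTuple


class Triplet(NamedTuple):  # hypothetical stand-in for a GraphX edge triplet
    src_attr: str
    edge_attr: str
    dst_attr: str


# Vertex table: id -> attribute. Edge table: (source id, destination id, attribute).
vertices = {1: "alice", 2: "bob", 3: "carol"}
edges = [(1, 2, "follows"), (2, 3, "mentors")]

# Join each edge with the attributes of the two vertices it connects.
triplets = [Triplet(vertices[src], attr, vertices[dst]) for src, dst, attr in edges]

descriptions = [f"{t.src_attr} {t.edge_attr} {t.dst_attr}" for t in triplets]
```

Each element of `triplets` carries everything needed to describe one relationship without a further lookup in the vertex table.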











