Building Batch Data Pipelines on GCP
All Quiz Answers
EL, ELT, ETL Quiz 1
Q1) Which of the following is the ideal use case for Extract and Load (EL)?
- Scheduled periodic loads of log files (e.g. once a day)
_________________________________________________________________________________
Executing Spark on Cloud Dataproc Quiz 2
Q1) Which of the following statements are true about Cloud Dataproc?
- Lets you run Spark and Hadoop clusters with minimal administration
- Helps you create job-specific clusters without HDFS
Q2) Match each of the terms with what they do when setting up clusters in Cloud Dataproc:
Term                          Definition
__ 1. Zone                    A. Costs less but may not always be available
__ 2. Standard Cluster mode   B. Determines the Google data center where compute nodes will be
__ 3. Preemptible             C. Provides 1 master and N workers
Answer
- 1. Zone: B
- 2. Standard Cluster mode: C
- 3. Preemptible: A
Q3) Cloud Dataproc provides the ability for Spark programs to separate compute & storage by:
- Reading and writing data directly from/to Cloud Storage
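In practice, separating compute from storage usually means pointing a Spark job at gs:// paths instead of hdfs:// paths. Below is a minimal sketch of that path rewrite in plain Python. The function name to_gcs_uri, the bucket name, and the sample paths are hypothetical, and the sketch only illustrates the URI change, not a full migration.

```python
from urllib.parse import urlparse


def to_gcs_uri(path: str, bucket: str) -> str:
    """Rewrite an hdfs:// path to a gs:// URI in `bucket` (hypothetical helper)."""
    parsed = urlparse(path)
    if parsed.scheme == "gs":
        # Already a Cloud Storage URI; leave it untouched.
        return path
    if parsed.scheme != "hdfs":
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    # Drop the namenode host:port and keep only the file path.
    return f"gs://{bucket}/{parsed.path.lstrip('/')}"


# Hypothetical input path and bucket name, for illustration only.
example = to_gcs_uri("hdfs://namenode:8020/logs/2020/part-0000", "my-bucket")
print(example)
```

Once the data lives in Cloud Storage, the cluster no longer has to stay up to keep the data available, which is what makes job-specific clusters without HDFS practical.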
_________________________________________________________________________________
Cloud Data Fusion and Cloud Composer Quiz 3
Q1) Cloud Data Fusion is the ideal solution when you need:
- to build visual pipelines
_________________________________________________________________________________
Data Processing with Cloud Dataflow Quiz 4
Q1) Which of the following statements are true?
- Dataflow executes Apache Beam pipelines
- Dataflow transforms support both batch and streaming pipelines
Q2) Match each of the Dataflow terms with what they do in the life of a Dataflow job:
Term                Definition
__ 1. Transform     A. Output endpoint for your pipeline
__ 2. PCollection   B. A data processing operation or step in your pipeline
__ 3. Sink          C. A set of data in your pipeline
Answer
- 1. Transform: B
- 2. PCollection: C
- 3. Sink: A
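
To make the three terms concrete, here is a toy model in plain Python. It is not the Apache Beam API: run_pipeline, drop_empty, and to_upper are hypothetical names, and a real Dataflow job would use Beam's own pipeline objects. The sketch only shows how the roles relate: a set of data (PCollection) passes through processing steps (transforms) and ends at an output endpoint (sink).

```python
from typing import Callable, Iterable, List

Records = Iterable[str]


def run_pipeline(
    source: Records,
    transforms: List[Callable[[Records], Records]],
    sink: Callable[[List[str]], None],
) -> None:
    """Toy pipeline runner (hypothetical, not the Beam API)."""
    # The "PCollection": the set of data currently flowing through the pipeline.
    pcollection = list(source)
    for transform in transforms:
        # Each "transform" is one processing step that yields a new data set.
        pcollection = list(transform(pcollection))
    # The "sink": the output endpoint that receives the final data set.
    sink(pcollection)


def drop_empty(records: Records) -> Records:
    return (r for r in records if r)


def to_upper(records: Records) -> Records:
    return (r.upper() for r in records)


output: List[str] = []
run_pipeline(["alpha", "", "beta"], [drop_empty, to_upper], output.extend)
print(output)
```

Here the in-memory list stands in for the sink; in a real pipeline the sink would typically be a file, table, or topic.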