Big Data Integration and Processing Quiz 3 Answer

Quiz 3 - Information Integration




Q1) What is the main problem with big data information integration?
  • Many sources
  • Mediated Schema
  • Pay-as-you-go model
  • Probabilistic Schema Mapping



Q2) What are two possible solutions for "big data" information integration mentioned in the lecture? (Choose 2)
  • Mediated Schema
  • Attribute Grouping
  • Pay-as-you-go Model
  • Customer Transactions
  • Probabilistic Schema Mapping



Q3) What are mediated schemas?
  • Schemas created from customer info.
  • A type of probabilistic schema mapping.
  • Schemas created entirely from attribute grouping.
  • Schemas created by integrating two or more schemas.



Q4) In attribute grouping, how would one evaluate if two attributes should go together? (Choose 2)
  • Integrated Views
  • Candidate Designs
  • Customer Interaction
  • Similarity of Attributes
  • Probability of Two Attributes Co-occurring
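Two of the options above, similarity of attributes and the probability of two attributes co-occurring, are quantities that can be computed directly from a collection of source schemas. The following is a minimal sketch of both measures, not the lecture's method: the schemas, the attribute names, and the use of string similarity from `difflib` as a stand-in for "similarity" are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical source schemas: each set holds the attribute names one source exposes.
schemas = [
    {"name", "phone", "address"},
    {"name", "phone", "email"},
    {"name", "home_phone", "address"},
    {"title", "year"},
]


def cooccurrence_probability(a, b, schemas):
    """Estimate P(b is present | a is present) across the given schemas."""
    with_a = [s for s in schemas if a in s]
    if not with_a:
        return 0.0
    return sum(1 for s in with_a if b in s) / len(with_a)


def name_similarity(a, b):
    """String similarity of two attribute names, in [0, 1] (an assumed proxy)."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    print(cooccurrence_probability("name", "phone", schemas))
    print(name_similarity("phone", "home_phone"))
```

Under these assumptions, attributes that rarely appear in the same source but have similar names (such as `phone` and `home_phone`) are candidates for being the same attribute, while attributes that usually appear together are candidates for being distinct parts of the same group.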



Q5) What is a data item?
  • The real worth of a data value.
  • Data found in a mediated schema.
  • Data found in a customer transaction.
  • Data that represents an aspect of a real-world entity.



Q6) What is data fusion?
  • Another term for customer analytics.
  • Extracting the true value of a data item.
  • Extracting true sources from a data source.
  • Extracting a global value from a data source.



Q7) What is a potential problem of having too many data sources as mentioned in lecture?
  • Too many data values.
  • Schema mapping becomes impossible.
  • Too much data processing required for compression.
  • None, the problem is not a problem when using big data methodologies.



Q8) What do we mean when we say "the true value of a data item"?
  • Another term for data fusion.
  • Data created from statistical estimations.
  • Extrapolated data from a data item that represents the worth of that item.



Q9) What is a potential method to deal with too many data sources as mentioned in lecture?
  • None, the more the better.
  • Take fewer samples per tick.
  • Compare and weigh each source by its trustworthiness.
  • Randomly select a sample of sources to represent the various data sources.
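One of the options above mentions weighing each source by its trustworthiness. As a rough illustration of what that could look like when several sources disagree about a data item, here is a minimal sketch of trust-weighted voting. The source names, trust scores, and the `fuse` helper are hypothetical, and real data fusion systems typically estimate trust rather than take it as given.

```python
from collections import defaultdict


def fuse(claims, trust):
    """Pick the value whose supporting sources have the highest total trust.

    claims: maps source name -> value that source reports for one data item.
    trust:  maps source name -> assumed trust score (unknown sources count as 0).
    """
    if not claims:
        raise ValueError("no claims to fuse")
    scores = defaultdict(float)
    for source, value in claims.items():
        scores[value] += trust.get(source, 0.0)
    return max(scores, key=scores.get)


# Hypothetical example: two low-trust sources outvote one high-trust source
# by count, but not by total trust.
claims = {"site_a": "x", "site_b": "y", "site_c": "y"}
trust = {"site_a": 0.9, "site_b": 0.3, "site_c": 0.4}

if __name__ == "__main__":
    print(fuse(claims, trust))
```

With these made-up scores, a plain majority vote would pick `y`, while the weighted vote picks `x` because `site_a` alone carries more trust than the other two sources combined.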









