Which of the following statements regarding importing streaming data from InfoSphere Streams into Hadoop is TRUE?
A. InfoSphere Streams can both read from and write data to HDFS
B. The Streams Big Data toolkit operators that interface with HDFS use Apache Flume to integrate with Hadoop
C. Streams applications never need to be concerned with making the data schemas consistent with those on Hadoop
D. Big SQL can be used to preprocess the data as it flows through InfoSphere Streams before the data lands in HDFS
Which of the following is TRUE about storing an Apache Spark object in serialized form?
A. It is advised to use Java serialization over Kryo serialization
B. Storing the object in serialized form will lead to faster access times
C. Storing the object in serialized form will lead to slower access times
D. All of the above
When we create a new table in Hive, which clause can be used in HiveSQL to indicate the storage file format?
A. SAVE AS
B. MAKE AS
C. FORMAT AS
D. STORED AS
Extracting structured data from various databases into a "sandbox" location without writing code can be performed using which tool included with BigInsights?
A. Flume
B. Data Click
C. DataStage
D. Big SQL Load
What is the primary purpose of Flume in the Hadoop ecosystem?
A. To stream data from Hadoop
B. To move static files from the local file system into HDFS
C. To import data from a relational database or data warehouse into HDFS
D. To capture log data as it is written to log files and move it into HDFS
When creating a configuration file for a Flume agent, which of the following must be configured?
A. An interceptor
B. A database configuration file
C. A source, a sink, and a channel
D. All of the above
Consider the following Solr query:
curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"
What is the term that is being searched?
A. indent
B. json
C. gettingstarted
D. foundation
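As a rough illustration of how a URL like the one in this question breaks into its parts, here is a minimal Python sketch. It assumes the query-string parameters are separated by `&` characters and uses only the standard library `urllib.parse`; the variable names are illustrative and not part of Solr.

```python
from urllib.parse import urlsplit, parse_qs

# The select URL from the question, with "&" between parameters (assumed).
url = "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"

parts = urlsplit(url)

# The path is /solr/<collection>/<handler>; index 2 is the collection name.
collection = parts.path.split("/")[2]

# parse_qs maps each parameter name to a list of its values.
params = parse_qs(parts.query)

print("collection:", collection)
for name, values in params.items():
    print(name, "=", values[0])
```

In this breakdown, `wt` and `indent` only control how the response is formatted, while `q` carries the query itself.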
Which is a benefit of row-oriented table design?
A. When writing a new row, if all of the row data is supplied at the same time the entire row can be written with a single disk seek
B. When columns of a single row are required at the same time, the entire row can be retrieved with a single disk seek regardless of row size
C. When new values of a column are supplied for all rows at once, that column data can be written efficiently and replace old column data without touching any other columns for the rows
D. When an aggregate needs to be computed over many rows but only a notably smaller subset of all columns of data, reading that smaller subset of data can be faster than reading all data
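The trade-offs described in these options can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: plain Python lists stand in for on-disk layouts, the table and its values are made up, and nothing here measures real disk seeks.

```python
# Row-oriented layout: all fields of one record sit together.
row_store = [
    (1, "alice", 30),
    (2, "bob", 25),
]

# Column-oriented layout: all values of one column sit together.
column_store = {
    "id": [1, 2],
    "name": ["alice", "bob"],
    "age": [30, 25],
}

new_row = (3, "carol", 41)

# Row layout: a complete new row is written in one place.
row_store.append(new_row)

# Column layout: the same row touches every column separately.
for column, value in zip(column_store, new_row):
    column_store[column].append(value)

# Aggregating one column: the row layout walks every full record,
# while the column layout reads only the "age" values.
avg_age_rows = sum(record[2] for record in row_store) / len(row_store)
avg_age_cols = sum(column_store["age"]) / len(column_store["age"])

print(avg_age_rows, avg_age_cols)
```

Both layouts hold the same data and produce the same average; they differ only in how many separate places must be touched for a whole-row write versus a single-column scan.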
What Redaction feature needs to be selected when manually redacting a form through the Optim Review Tool?
A. Text Redaction
B. Image Redaction
C. Region Redaction
D. Redact by Information Type
Which Big SQL statement can be used to store a single row into an HBase table?
A. LOAD HIVE DATA
B. INSERT INTO HBASE
C. CREATE HBASE TABLE
D. LOAD USING ... INTO HBASE TABLE