Which statement is true about evaluation metrics for machine learning algorithms?
A. A random classifier has an AUC (area under the ROC curve) of 0.5
B. Using only one evaluation metric is sufficient
C. The F-score is always equal to precision
D. Recall of 1 (100%) is always a good result
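
Note: the AUC of a classifier whose scores carry no information can be checked empirically. The following is a minimal sketch, assuming synthetic labels and scores; the helper name `auc`, the sample size, and the seed are illustrative choices, not part of the question.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs where the positive is scored higher;
    # tied pairs count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
labels = [rng.randint(0, 1) for _ in range(2000)]
random_scores = [rng.random() for _ in labels]  # scores unrelated to the labels
perfect_scores = [float(y) for y in labels]     # scores that separate the classes

auc_random = auc(labels, random_scores)
auc_perfect = auc(labels, perfect_scores)
print(round(auc_random, 2), auc_perfect)
```

With uninformative scores the value should land close to 0.5, while perfectly separating scores give 1.0.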
Which is a preferred approach for simplifying the data transformation steps in machine learning model management and maintenance?
A. Implement data transformation, feature extraction, feature engineering, and imputation algorithms in one single pipeline.
B. Do not apply any data transformation or feature extraction or feature engineering steps.
C. Leverage only deep learning algorithms.
D. Apply a limited number of data transformation steps from a pre-defined catalog of possible operations independent of the machine learning use case.
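
Note: the idea of keeping all transformation steps in one pipeline can be sketched in a few lines. This is a hypothetical, minimal illustration; the classes `Pipeline`, `MeanImputer`, and `MinMaxScaler` are written here for the example and are not any library's API.

```python
from statistics import mean

class MeanImputer:
    """Hypothetical step: replaces None with the mean seen during fitting."""
    def fit_transform(self, column):
        self.fill = mean(v for v in column if v is not None)
        return self.transform(column)

    def transform(self, column):
        return [self.fill if v is None else v for v in column]

class MinMaxScaler:
    """Hypothetical step: rescales values using the min and max seen during fitting."""
    def fit_transform(self, column):
        self.lo, self.hi = min(column), max(column)
        return self.transform(column)

    def transform(self, column):
        return [(v - self.lo) / (self.hi - self.lo) for v in column]

class Pipeline:
    """Hypothetical container: applies the steps in a fixed order."""
    def __init__(self, steps):
        self.steps = steps

    def fit_transform(self, column):
        for step in self.steps:
            column = step.fit_transform(column)
        return column

    def transform(self, column):
        for step in self.steps:
            column = step.transform(column)
        return column

pipe = Pipeline([MeanImputer(), MinMaxScaler()])
train = pipe.fit_transform([1.0, None, 3.0, 5.0])  # fit on training data
new = pipe.transform([None, 2.0])                  # reuse the fitted steps
print(train, new)
```

Because the fitted steps live in one object, new data passes through exactly the same imputation and scaling as the training data.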
What is an example of a supervised machine learning algorithm that can be applied to a continuous numeric response variable?
A. linear regression
B. k-means
C. local outlier factor (LOF)
D. naive Bayes
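
Note: fitting a model to a continuous numeric response can be illustrated with ordinary least squares on one feature. This is a minimal sketch, assuming made-up data; the function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training data: each x comes with a known continuous response y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict the response for an unseen x
print(slope, intercept, prediction)
```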
What is a class of machine learning problems where the algorithm is given feedback in the form of positive or negative reward in a dynamic environment?
A. reinforcement learning
B. feedback-based optimization
C. dynamic programming
D. reward learning
Given two multidimensional arrays of the same data type, A and B, which two Python NumPy statements give the matrix product of the two matrices? (Choose two.)
A. A @ B
B. A x B
C. A * B
D. np.matprod(A,B)
E. np.dot(A,B)
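
Note: the difference between the operators can be checked on small 2-D arrays. This is a minimal sketch, assuming two 2x2 integer arrays chosen for illustration; for arrays with more than two dimensions, `@` and `np.dot` follow different rules.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matmul_result = A @ B      # matrix product via the @ operator
dot_result = np.dot(A, B)  # matrix product for 2-D arrays
elementwise = A * B        # element-wise product, not a matrix product

print(matmul_result)
print(elementwise)
```

`A x B` is not valid Python syntax, and NumPy has no function named `np.matprod`.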
Which IBM Watson Machine Learning deployment method offers the ultimate flexibility in deploying a machine learning model?
A. Watson Machine Learning Python client
B. Watson Machine Learning FORTRAN client
C. Watson Studio Project
D. Watson Machine Learning REST API
An ML application is deployed using Kubernetes, and its output depends on data that is constantly stored in the model. If the system needs to scale based on available CPUs, which feature should be enabled?
A. persistent storage
B. vertical pod autoscaling
C. horizontal pod autoscaling
D. node self-registration mode
What is meant by the curse of dimensionality?
A. The number of available algorithms for a given task is high.
B. The number of available data sources for a given task is high.
C. The data sparsity becomes more severe as the number of features is increased.
D. The data sparsity becomes more severe as the number of samples is increased.
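
Note: the effect of adding features on data sparsity can be made concrete. The following is a minimal sketch, assuming 1000 uniformly random samples and a grid with two bins per feature; the function name `occupied_fraction` and all sizes are illustrative.

```python
import random

def occupied_fraction(n_samples, n_features, rng):
    """Fraction of grid cells (2 bins per feature) holding at least one sample."""
    cells = set()
    for _ in range(n_samples):
        # Each coordinate is uniform on [0, 1), so its bin index is 0 or 1.
        cell = tuple(int(rng.random() * 2) for _ in range(n_features))
        cells.add(cell)
    return len(cells) / 2 ** n_features

rng = random.Random(0)  # fixed seed, chosen arbitrarily
occupied = {d: occupied_fraction(1000, d, rng) for d in (2, 10, 20)}
for d, fraction in occupied.items():
    print(d, fraction)
```

With the sample count held fixed, the share of the feature space that contains any data shrinks rapidly as features are added.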
If the distribution of the height of American men is approximately normal, with a mean of 69 inches and a standard deviation of 2.5 inches, then roughly 68 percent of American men have heights between __________ and __________.
A. 64 inches and 74 inches
B. 66.5 inches and 69 inches
C. 71.5 inches and 76.5 inches
D. 66.5 inches and 71.5 inches
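
Note: the "roughly 68 percent" figure corresponds to the interval within one standard deviation of the mean. A minimal sketch using the mean and standard deviation stated in the question:

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=2.5)  # parameters from the question

# One standard deviation on either side of the mean.
lower = heights.mean - heights.stdev
upper = heights.mean + heights.stdev

# Probability mass between the two bounds.
share = heights.cdf(upper) - heights.cdf(lower)
print(lower, upper, round(share, 4))
```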
Given the following sentence:
The dog jumps over a fence.
What would a vectorized version after common English stopword removal look like?
A. ['dog', 'fence', 'run']
B. ['fence', 'jumps']
C. ['dog', 'fence', 'jumps']
D. ['a', 'dog', 'fence', 'jumps', 'over', 'the']
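
Note: stopword removal followed by building a sorted vocabulary can be sketched as below. This is a minimal illustration; the `STOPWORDS` set is a small hand-picked assumption, not the list of any particular library, and real stopword lists are much longer.

```python
import re

# Small illustrative stopword list (an assumption for this example).
STOPWORDS = {"a", "an", "the", "over", "and", "of", "to", "in"}

def vocabulary(sentence):
    """Lowercase, tokenize on letters, drop stopwords, return sorted unique terms."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sorted({t for t in tokens if t not in STOPWORDS})

vocab = vocabulary("The dog jumps over a fence.")
print(vocab)  # ['dog', 'fence', 'jumps']
```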