Machine Learning - Skill Assignment Quiz

Q1. You are part of a data science team that is working for a national fast-food chain. You create a simple report that shows a trend: Customers who visit the store more often and buy smaller meals spend more than customers who visit less frequently and buy larger meals. What is the most likely diagram that your team created?

  •  multiclass classification diagram
  •  linear regression and scatter plots
  •  pivot table
  •  K-means cluster diagram

Q2. You work for an organization that sells a spam filtering service to large companies. Your organization wants to transition its product to use machine learning. It currently uses a list of 250,000 keywords. If a message contains more than a few of these keywords, then it is identified as spam. What would be one advantage of transitioning to machine learning?

  •  The product would look for new patterns in spam messages.
  •  The product could go through the keyword list much more quickly.
  •  The product could have a much longer keyword list.
  •  The product could find spam messages using far fewer keywords.

Q3. You work for a music streaming service and want to use supervised machine learning to classify music into different genres. Your service has collected thousands of songs in each genre, and you used this as your training data. Now you pull out a small random subset of all the songs in your service. What is this subset called?

  •  data cluster
  •  supervised set
  •  big data
  •  test data

Q4. In traditional computer programming, you input commands. What do you input with machine learning?

  •  patterns
  •  programs
  •  rules
  •  data

Q5. Your company wants to predict whether existing automotive insurance customers are more likely to buy homeowners insurance. It created a model to better predict the best customers to contact about homeowners insurance, and the model had a low variance but high bias. What does that say about the data model?

  •  It was consistently wrong.
  •  It was inconsistently wrong.
  •  It was consistently right.
  •  It was equally right and wrong.

Q6. You want to identify global weather patterns that may have been affected by climate change. To do so, you want to use machine learning algorithms to find patterns that would otherwise be imperceptible to a human meteorologist. What is the best place to start?

  •  Find labeled data of sunny days so that the machine will learn to identify bad weather.
  •  Use unsupervised learning to have the machine look for anomalies in a massive weather database.
  •  Create a training set of unusual patterns and ask the machine learning algorithms to classify them.
  •  Create a training set of normal weather and have the machine look for similar patterns.

Q7. You work in a data science team that wants to improve the accuracy of its K-nearest neighbor result by running it on top of a naive Bayes result. What is this an example of?

  •  regression
  •  boosting
  •  bagging
  •  stacking

Q8. ____ looks at the relationship between predictors and your outcome.

  •  Regression analysis
  •  K-means clustering
  •  Big data
  •  Unsupervised learning

Q9. What is an example of a commercial application for a machine learning system?

  •  a data entry system
  •  a data warehouse system
  •  a massive data repository
  •  a product recommendation system

Q10. You work for a power company that owns hundreds of thousands of electric meters. These meters are connected to the internet and transmit energy usage data in real time. Your supervisor asks you to direct a project to use machine learning to analyze this usage data. Why are machine learning algorithms ideal in this scenario?

  •  The algorithms would help the meters access the internet.
  •  The algorithms will improve the wireless connectivity.
  •  The algorithms would help your organization see patterns of the data.
  •  By using machine learning algorithms, you are creating an IoT device.

Q11. To predict a quantity value, use ___.

  •  regression
  •  clustering
  •  classification
  •  dimensionality reduction

Q12. Why is naive Bayes called naive?

  •  It naively assumes that you will have no data.
  •  It does not even try to create accurate predictions.
  •  It naively assumes that the predictors are independent from one another.
  •  It naively assumes that all the predictors depend on one another.

Q13. How is machine learning related to artificial intelligence?

  •  Artificial intelligence focuses on classification, while machine learning is about clustering data.
  •  Machine learning is a type of artificial intelligence that relies on learning through data.
  •  Artificial intelligence is a form of unsupervised machine learning.
  •  Machine learning and artificial intelligence are the same thing.

Q14. How do machine learning algorithms make more precise predictions?

  •  The algorithms are typically run on more powerful servers.
  •  The algorithms are better at seeing patterns in the data.
  •  Machine learning servers can host larger databases.
  •  The algorithms can run on unstructured data.

Q15. Your university wants to use machine learning algorithms to help sort through incoming student applications. An administrator asks if the admissions decisions might be biased against any particular group, such as women. What would be the best answer?

  •  Machine learning algorithms are based on math and statistics, and so by definition will be unbiased.
  •  There is no way to identify bias in the data.
  •  Machine learning algorithms are powerful enough to eliminate bias from the data.
  •  All human-created data is biased, and data scientists need to account for that.

Q16. What is stacking?

  •  The predictions of one model become the inputs to another.
  •  You use different versions of machine learning algorithms.
  •  You use several machine learning algorithms to boost your results.
  •  You stack your training set and testing set together.

Q17. You want to create a supervised machine learning system that identifies pictures of kittens on social media. To do this, you have collected more than 100,000 images of kittens. What is this collection of images called?

  •  training data
  •  linear regression
  •  big data
  •  test data

Q18. You are working on a project that involves clustering together images of different dogs. You take an image and identify it as your centroid image. What type of machine learning algorithm are you using?

  •  centroid reinforcement
  •  K-nearest neighbor
  •  binary classification
  •  K-means clustering
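
As a rough illustration of the centroid idea mentioned in Q18, the sketch below runs a few K-means iterations on one-dimensional points. The function name `kmeans_1d`, the sample points, and the starting centroids are all made up for illustration; clustering real images would operate on feature vectors rather than single numbers.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain K-means on 1-D points; `centroids` holds the initial guesses."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = sum(members) / len(members)
    return centroids, clusters


# Hypothetical data: two loose groups, with two of the points picked as starting centroids.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 2.0])
```

Even though both starting centroids sit in the same group here, the repeated assign-and-update loop moves one of them toward the other group.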

Q19. Your company wants you to build an internal email text prediction model to reduce the time that employees spend writing emails. What should you do?

  •  Include training email data from all employees.
  •  Include training email data from new employees.
  •  Include training email data from seasoned employees.
  •  Include training email data from employees who write the majority of internal emails.

Q20. Your organization allows people to create online professional profiles. A key feature is the ability to create clusters of people who are professionally connected to one another. What type of machine learning method is used to create these clusters?

  •  unsupervised machine learning
  •  binary classification
  •  supervised machine learning
  •  reinforcement learning

Q21. Random forest is a modified and improved version of which earlier technique?

  •  aggregated trees
  •  boosted trees
  •  bagged trees
  •  stacked trees

Q22. Self-organizing maps are specialized neural networks for which type of machine learning?

  •  semi-supervised learning
  •  supervised learning
  •  reinforcement learning
  •  unsupervised learning

Q23. Which statement about K-means clustering is true?

  •  In K-means clustering, the initial centroids are sometimes randomly selected.
  •  K-means clustering is often used in supervised machine learning.
  •  The number of clusters is always randomly selected.
  •  To be accurate, you want your centroids outside of the cluster.

Q24. You created a machine learning system that interacts with its environment and responds to errors and rewards. What type of machine learning system is it?

  •  supervised learning
  •  semi-supervised learning
  •  reinforcement learning
  •  unsupervised learning

Q25. Your data science team must build a binary classifier, and the number one criterion is the fastest possible scoring at deployment. It may even be deployed in real time. Which technique will produce a model that will likely be fastest for the deployment team to use on new cases?

  •  random forest
  •  logistic regression
  •  KNN
  •  deep neural network

Q26. Your machine learning system is attempting to describe a hidden structure from unlabeled data. How would you describe this machine learning method?

  •  supervised learning
  •  unsupervised learning
  •  reinforcement learning
  •  semi-unsupervised learning

Q27. You are using K-nearest neighbor and you have a K of 1. What are you likely to see when you train the model?

  •  high variance and low bias
  •  low bias and low variance
  •  low variance and high bias
  •  high bias and high variance

Q28. Your data science team wants to use the K-nearest neighbor classification algorithm. Someone on your team wants to use a K of 25. What are the challenges of this approach?

  •  Higher K values will produce noisy data.
  •  Higher K values lower the bias but increase the variance.
  •  Higher K values need a larger training set.
  •  Higher K values lower the variance but increase the bias.

Q29. You work for a hospital that is tracking the community spread of a virus. The hospital created a smartwatch application that uploads body temperature data from hundreds of thousands of participants. What is the best technique to analyze the data?

  •  Use reinforcement learning to reward the system when a new person participates.
  •  Use unsupervised machine learning to cluster together people based on patterns the machine discovers.
  •  Use supervised machine learning to sort people by demographic data.
  •  Use supervised machine learning to classify people by body temperature.

Q30. Someone on your data science team recommends that you use decision trees, naive Bayes and K-nearest neighbor, all at the same time, on the same training data, and then average the results. What is this an example of?

  •  regression analysis
  •  unsupervised learning
  •  high-variance modeling
  •  ensemble modeling

Q31. Someone on your data science team recommends that you use decision trees, naive Bayes and K-nearest neighbor, all at the same time, on the same training data, and then average the results. What is this an example of?

  •  machine learning algorithm
  •  training set
  •  big data test set
  •  data cluster

Q32. You work for a website that enables customers to see all images of themselves on the internet by uploading one self-photo. Your data model uses 5 characteristics to match people to their photo: color, eye, gender, eyeglasses and facial hair. Your customers have been complaining that they get tens of thousands of photos without them. What is the problem?

  •  You are overfitting the model to the data
  •  You need a smaller training set
  •  You are underfitting the model to the data
  •  You need a larger training set

Q33. Your supervisor asks you to create a machine learning system that will help your human resources department classify job applicants into well-defined groups. What type of system are you more likely to recommend?

  •  an unsupervised machine learning system that clusters together the best candidates.
  •  you would not recommend a machine learning system for this type of project.
  •  a deep learning artificial neural network that relies on petabytes of employment data.
  •  a supervised machine learning system that classifies applicants into existing groups.

Q34. You and your data science team have 1 TB of example data. What do you typically do with that data?

  •  You use it as your training set.
  •  You label it big data.
  •  You split it into a training set and test set.
  •  You use it as your test set.
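
For readers unfamiliar with the terms in Q34, the following is a minimal sketch of dividing example data into a training set and a test set. The function name `train_test_split`, the 20% test fraction, and the fixed seed are arbitrary choices for illustration, not values taken from the quiz.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples, then hold out a fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-in for a real data set: 100 numbered examples.
examples = list(range(100))
train_set, test_set = train_test_split(examples)
```

The model is fitted on `train_set` only, and `test_set` is kept aside to check how well it handles examples it has not seen.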

Q35. Your data science team is working on a machine learning product that can act as an artificial opponent in video games. The team is using a machine learning algorithm that focuses on rewards: If the machine does some things well, then it improves the quality of the outcome. How would you describe this type of machine learning algorithm?

  •  semi-supervised machine learning
  •  supervised machine learning
  •  unsupervised machine learning
  •  reinforcement learning

Q36. Compared to the variance of the Maximum Likelihood Estimate (MLE), the variance of the Maximum A Posteriori (MAP) estimate is ___

  •  Higher
  •  same
  •  Lower
  •  it could be any of the above

Q37. ___ refers to a model that can neither model the training data nor generalize to new data.

  •  good fitting
  •  overfitting
  •  underfitting
  •  all of the above

Q38. You work for a website that helps match people up for lunch dates. The website boasts that it uses more than 500 predictors to find customers the perfect date, but many customers complain that they get very few matches. What is a likely problem with your model?

  •  Your training set is too large.
  •  You are underfitting the model to the data.
  •  You are overfitting the model to the data.
  •  Your machine is creating inaccurate clusters.

Q39. The new dataset you have just scraped seems to exhibit lots of missing values. What action will help you minimize that problem?

  •  Wise fill-in of controlled random values
  •  Replace missing values with averaging across all samples
  •  Remove defective samples
  •  Imputation

Q40. Which loss function would fit best for categorical (discrete) supervised learning?

  •  Kullback-Leibler (KL) loss
  •  Binary Crossentropy
  •  Mean Squared Error (MSE)
  •  Any L2 loss

Q41. You want to create a machine learning algorithm to identify food recipes on the web. To do this, you create an algorithm that looks at different conditional probabilities. So if the post includes the word flour, it has a slightly stronger probability of being a recipe. If it contains both flour and sugar, it is even more likely a recipe. What type of algorithm are you using?

  •  naive Bayes classifier
  •  K-nearest neighbor
  •  multiclass classification
  •  decision tree
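
The conditional-probability reasoning described in Q41 can be sketched in a few lines. All numbers below (the prior and the per-word probabilities) are invented for illustration, and the sketch only scores the words that are present, which is a simplification. It combines the words by multiplying their per-class probabilities, which treats the words as independent of one another given the class.

```python
# Invented probabilities, for illustration only.
PRIOR = {"recipe": 0.3, "other": 0.7}
WORD_GIVEN_CLASS = {
    "recipe": {"flour": 0.6, "sugar": 0.5},
    "other": {"flour": 0.1, "sugar": 0.2},
}


def posterior_recipe(words):
    """P(recipe | words), treating each word as independent given the class."""
    scores = {}
    for label, prior in PRIOR.items():
        score = prior
        for word in words:
            score *= WORD_GIVEN_CLASS[label][word]
        scores[label] = score
    # Normalize so the two class scores sum to 1.
    return scores["recipe"] / sum(scores.values())


p_none = posterior_recipe([])
p_flour = posterior_recipe(["flour"])
p_both = posterior_recipe(["flour", "sugar"])
```

Under these assumed numbers, the probability of "recipe" rises when flour appears and rises again when sugar appears as well.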

Q42. What is Q-learning reinforcement learning?

  •  supervised machine learning with rewards
  •  a type of unsupervised learning that relies heavily on a well-established model
  •  a type of reinforcement learning where accuracy degrades over time
  •  a type of reinforcement learning that focuses on rewards

Q43. Your machine learning system is using labeled examples to try to predict future data, compare that data to the predicted result, and then adjust the model. What is the best description of this machine learning method?

  •  unsupervised learning
  •  semi-supervised learning
  •  supervised learning
  •  semi-reinforcement learning

Q44. You are working with your machine learning algorithm on something called class predictor probability. What algorithm are you most likely using?

  •  multiclass binary classification
  •  naive Bayes
  •  unsupervised classification
  •  decision tree analysis

Q45. What is one of the most effective ways to correct for underfitting your model to the data?

  •  Create training clusters
  •  Remove predictors
  •  Use reinforcement learning
  •  Add more predictors

Q46. What is the difference between unstructured and structured data?

  •  Unstructured data is always text.
  •  Unstructured data is much easier to store.
  •  Structured data has clearly defined data types.
  •  Structured data is much more popular.

Q47. What is ensemble modeling?

  •  when you create an ensemble of your training and test data set
  •  when you create an ensemble of different servers to run the algorithms
  •  when you find the one best algorithm for your ensemble
  •  when you use several ensembles of machine learning algorithms

Q48. When is a decision tree most commonly used?

  •  with big data products
  •  for supervised machine learning binary classification challenges
  •  to find the best data cluster
  •  to determine "Q" in Q-learning reinforcement learning

Q49. Averaging the output of multiple decision trees helps to:

  •  Increase variance
  •  Increase bias
  •  Decrease variance
  •  Decrease bias
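
The effect of averaging can be explored with a small simulation. This is only a sketch under a strong assumption: each "tree" is replaced by a stand-in that returns a true value of 10.0 plus independent Gaussian noise. The names `noisy_prediction` and `averaged_prediction`, the noise level, and the ensemble size of 25 are all hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # seeded so the simulation is repeatable


def noisy_prediction():
    """Stand-in for one tree: the true value 10.0 plus independent noise."""
    return 10.0 + rng.gauss(0.0, 2.0)


def averaged_prediction(n_models):
    """Average the outputs of n_models independent noisy predictors."""
    return sum(noisy_prediction() for _ in range(n_models)) / n_models


single = [noisy_prediction() for _ in range(2000)]
averaged = [averaged_prediction(25) for _ in range(2000)]

# Spread of the predictions around their mean in each setting.
var_single = statistics.variance(single)
var_averaged = statistics.variance(averaged)
```

Comparing `var_single` with `var_averaged` shows how much the spread changes, while both sets of predictions stay centered on the same value. Trees trained on the same data are correlated with one another, so the effect in practice is weaker than in this idealized setup.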

Q50. In the context of calculus, what is df/dx?

  •  the prediction function
  •  the derivative of f of x
  •  the derivative of x
  •  equivalent to f divided by x
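
As a numerical companion to the notation in Q50, the sketch below approximates df/dx with a central difference. The helper name `derivative`, the step size `h`, and the example function are assumptions chosen for illustration.

```python
def derivative(f, x, h=1e-6):
    """Central-difference approximation of df/dx at the point x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x * x


# The analytic derivative of x**2 is 2*x, so the slope at x = 3 should be close to 6.
slope_at_3 = derivative(square, 3.0)
```

The result measures how fast `f` changes as `x` changes, which is a different quantity from the ratio of `f` to `x`.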

Q51. In 2013, Google's DeepMind project created a machine learning algorithm that could play an old-style Atari video game, Pong. The algorithm taught the machine how to play by creating a series of rewards. Each time the machine successfully returned the ball, the machine got a reward; each time the opponent missed the ball, the machine got a reward. How would you describe this type of machine learning algorithm?

  •  big data machine learning.
  •  Good Old-Fashioned Artificial Intelligence (GOFAI).
  •  reinforcement learning.
  •  supervised learning.
