You are designing an AI solution that will provide feedback to teachers who train students over the Internet. The students will be in classrooms located in remote areas. The solution will capture video and audio data of the students in the classrooms.
You need to recommend Azure Cognitive Services for the AI solution to meet the following requirements:
Alert teachers if a student seems angry or distracted.
Identify each student in the classrooms for attendance purposes.
Allow the teachers to log the text of conversations between themselves and the students.
Which Cognitive Services should you recommend?
A. Computer Vision, Text Analytics, and Face API
B. Video Indexer, Face API, and Text Analytics
C. Computer Vision, Speech to Text, and Text Analytics
D. Text Analytics, QnA Maker, and Computer Vision
E. Video Indexer, Speech to Text, and Face API
Correct Answer: E
Explanation:
Azure Video Indexer is a cloud application built on Azure Media Analytics, Azure Search, and Cognitive Services (such as the Face API, Microsoft Translator, the Computer Vision API, and Custom Speech Service). It lets you extract insights from the captured classroom footage using the Video Indexer video and audio models.
Face API enables you to search, identify, and match faces in your private repository of up to 1 million people, which covers identifying each student for attendance.
The Face API now integrates emotion recognition, returning for each face in the image a confidence score across a set of emotions such as anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. These emotions are understood to be cross-culturally and universally communicated with particular facial expressions. The anger score is what allows the solution to alert a teacher when a student seems angry.
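The alerting requirement can be illustrated with a minimal Python sketch. It assumes the per-face emotion confidences have already been retrieved and collected into a plain dictionary keyed by student; the data shape, the `students_to_flag` helper, and the 0.6 threshold are all hypothetical and not part of any Azure SDK. It only covers the "angry" case, since the explanation above does not describe a "distracted" score.

```python
from typing import Dict, List

# Hypothetical cut-off; a real deployment would tune this on classroom data.
ANGER_THRESHOLD = 0.6


def students_to_flag(
    emotions_by_student: Dict[str, Dict[str, float]],
    threshold: float = ANGER_THRESHOLD,
) -> List[str]:
    """Return the students whose anger confidence meets the threshold."""
    flagged = []
    for student, scores in emotions_by_student.items():
        # A missing "anger" key is treated as zero confidence.
        if scores.get("anger", 0.0) >= threshold:
            flagged.append(student)
    return sorted(flagged)


# Illustrative input: emotion confidences per identified student (made-up values).
sample = {
    "student_a": {"anger": 0.82, "neutral": 0.10, "happiness": 0.08},
    "student_b": {"anger": 0.05, "neutral": 0.90, "happiness": 0.05},
}

print(students_to_flag(sample))
```

With the sample data, only `student_a` crosses the threshold and would trigger an alert to the teacher.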
Speech-to-text from Azure Speech Services, also known as speech recognition, enables real-time transcription of audio streams into text that your applications, tools, or devices can consume, display, and act on as command input. This meets the requirement to log the text of conversations between teachers and students. The service is powered by the same recognition technology that Microsoft uses for Cortana and Office products, and it works seamlessly with translation and text-to-speech.
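Once audio has been transcribed, logging the conversation is mostly bookkeeping. The following Python sketch assumes the transcription step has already produced text fragments with a speaker label and a time offset; the `Utterance` and `ConversationLog` types are hypothetical helpers for illustration and do not call the Speech service.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str           # e.g. "teacher" or a student name
    offset_seconds: float  # position of the utterance in the recording
    text: str              # transcribed text


@dataclass
class ConversationLog:
    utterances: List[Utterance] = field(default_factory=list)

    def add(self, speaker: str, offset_seconds: float, text: str) -> None:
        """Store a transcribed fragment, skipping empty or whitespace-only text."""
        cleaned = text.strip()
        if cleaned:
            self.utterances.append(Utterance(speaker, offset_seconds, cleaned))

    def render(self) -> str:
        """Return the log as plain text, ordered by time offset."""
        ordered = sorted(self.utterances, key=lambda u: u.offset_seconds)
        return "\n".join(
            f"[{u.offset_seconds:.1f}s] {u.speaker}: {u.text}" for u in ordered
        )


log = ConversationLog()
log.add("student", 3.5, "  Good morning!  ")
log.add("teacher", 1.0, "Good morning.")
log.add("student", 4.0, "   ")  # silence or noise transcribed as blank; ignored
print(log.render())
```

Keeping the time offset alongside each fragment lets the log be reordered or aligned with the video later.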
Incorrect Answers:
Neither Computer Vision nor QnA Maker is required.
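The elimination of the other options can be checked mechanically. The Python sketch below uses a deliberately simplified capability map that is an assumption made for illustration (one or two labels per service, based only on the explanation above), then tests which answer options cover all three requirements.

```python
# Simplified, assumed capability labels; real services do more than this.
CAPABILITIES = {
    "Video Indexer": {"video_insights"},
    "Face API": {"identify_students", "detect_emotion"},
    "Speech to Text": {"transcribe_speech"},
    "Computer Vision": {"image_tagging"},
    "Text Analytics": {"text_sentiment"},
    "QnA Maker": {"question_answering"},
}

# The three requirements from the question, as capability labels.
REQUIRED = {"detect_emotion", "identify_students", "transcribe_speech"}

OPTIONS = {
    "A": ["Computer Vision", "Text Analytics", "Face API"],
    "B": ["Video Indexer", "Face API", "Text Analytics"],
    "C": ["Computer Vision", "Speech to Text", "Text Analytics"],
    "D": ["Text Analytics", "QnA Maker", "Computer Vision"],
    "E": ["Video Indexer", "Speech to Text", "Face API"],
}


def missing_requirements(services):
    """Return the required capabilities that the given services do not provide."""
    provided = set()
    for name in services:
        provided |= CAPABILITIES[name]
    return REQUIRED - provided


covering = [label for label, services in OPTIONS.items()
            if not missing_requirements(services)]
print(covering)
```

Under this mapping, options A and B lack transcription, C lacks face identification and emotion detection, and D lacks all three, which leaves E.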
References:
https://docs.microsoft.com/en-us/azure/media-services/video-indexer/video-indexer-overview
https://azure.microsoft.com/en-us/services/cognitive-services/face/
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-to-text