CFA Level I Quantitative Methods practice questions: Text Analytics and ML (Learning Module 11, Introduction to Big Data Techniques)

Question 1:

Text analytics is appropriate for application to:

A. large, structured datasets.

B. public but not private information.

C. identifying possible short-term indicators of coming trends.

Explanation:

C is correct. Text analytics is an application of NLP. Models that use NLP analysis can incorporate non-traditional information about what people are saying (their preferences, opinions, likes, or dislikes) in an attempt to identify trends and short-term indicators, for example about a company, a stock, or an economic event. These indicators are then used to forecast coming trends that may affect investment performance.
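To make the idea concrete, here is a minimal sketch of one very simple form of text analytics: counting positive and negative words in short posts to get a rough sentiment signal. The word lists, the posts, and the function name `sentiment_score` are all hypothetical and chosen only for illustration; real NLP models are far more sophisticated.

```python
import re
from collections import Counter

# Hypothetical word lists, invented for this illustration only.
POSITIVE = {"love", "great", "strong", "beat"}
NEGATIVE = {"hate", "weak", "miss", "recall"}

def sentiment_score(text):
    """Count of positive words minus count of negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    return pos - neg

# Made-up posts standing in for unstructured text such as social media.
posts = [
    "Love the new phone, great battery",
    "Earnings beat expectations, strong quarter",
    "Another product recall, I hate this brand",
]

scores = [sentiment_score(p) for p in posts]
net = sum(scores)  # a crude aggregate signal across all posts
print(scores, net)
```

Tracked over time, an aggregate score like `net` is the kind of short-term indicator the explanation refers to, although a word count this crude would not be reliable on its own.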

Question 2:

Which of the following statements is true in the use of ML?

A. Some techniques are termed “black box” due to data biases.

B. Human judgment is not needed because algorithms continuously learn from data.

C. Training data can be learned too precisely, resulting in inaccurate predictions when used with different datasets.

Explanation:

C is correct. Overfitting occurs when the ML model learns the input and target dataset too precisely. In this case, the model has been “overtrained” on the data and treats noise in the data as true parameters. An overfitted ML model may be too complex, and it cannot accurately predict outcomes when used with a different dataset.

A and B are incorrect. ML still requires human judgment, both to understand the underlying data and to select appropriate data analysis techniques. ML techniques can appear opaque, or “black box”, because they are not explicitly programmed, not because of data biases: the results they produce may not be fully understood or explained.
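The overfitting effect can be shown with a small toy example. The sketch below assumes a made-up dataset where the true relationship is y = 2x plus noise, and compares a simple fitted line with a "memorizer" that stores every training target exactly. The data, the noise pattern, and the function names (`fit_line`, `fit_memorizer`, `mse`) are hypothetical and are not taken from the CFA curriculum.

```python
# Toy illustration of overfitting; all data and names here are hypothetical.

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (a simple model)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def fit_memorizer(xs, ys):
    """Predict the stored target of the closest training x (learns the noise too)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# True signal is y = 2x; the two datasets carry different noise.
xs = [float(i) for i in range(8)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(8)]
train_ys = [2.0 * x + e for x, e in zip(xs, noise)]
test_ys = [2.0 * x - e for x, e in zip(xs, noise)]  # a "different dataset"

line = fit_line(xs, train_ys)
memo = fit_memorizer(xs, train_ys)

line_train, line_test = mse(line, xs, train_ys), mse(line, xs, test_ys)
memo_train, memo_test = mse(memo, xs, train_ys), mse(memo, xs, test_ys)

print(f"line:      train MSE {line_train:.2f}, test MSE {line_test:.2f}")
print(f"memorizer: train MSE {memo_train:.2f}, test MSE {memo_test:.2f}")
```

The memorizer has zero error on the training data because it has learned the noise as if it were signal, yet its error on the second dataset is much larger than that of the simple line. This is the pattern described in answer C.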