【AI】Unigram Distribution (commonly used in NLP)
unigram distribution
Key word: Evaluation, Loss Function
Usage: non-contextual probability of finding a specific word form in a corpus.
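A minimal sketch of estimating a unigram distribution from a tokenized corpus, assuming whitespace tokenization and plain relative frequencies with no smoothing. The function name `unigram_distribution` and the toy corpus are illustrative, not taken from the post.

```python
from collections import Counter


def unigram_distribution(tokens):
    """Return {word: relative frequency} for a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}  # empty corpus: no distribution to report
    return {word: count / total for word, count in counts.items()}


# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat".split()
dist = unigram_distribution(corpus)
print(dist["the"])  # "the" occurs 2 times out of 6 tokens
```

Each probability ignores the surrounding words, which is what "non-contextual" means here.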
【AI】GBDT (Gradient Boosting Decision Tree) Explained
GBDT
Advantage: GBDT achieves strong predictive performance.
It is insensitive to feature scale, handles missing features automatically, can be used for feature selection, and gives notably good results.
Key word: Regressor
Usage: A regressor to forecast different labels or values
Great Explanation
GBDT Algorithms: Principles - Develop Paper
See chapter 3 for the formulation
Adobe Acrobat
Core idea
Whether weak learners could be m...
【AI】ARIMA Model Explained
ARIMA
Key word: Model, Time Series
Usage: A statistical analysis model that uses time series data to either better understand the data set or to better predict future trends.
Background
Stationary and differencing
A stationary time series is one whose properties do not depend on the time at which the series is observed.
Stationary time series will not have any predict...
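A small sketch of differencing, the operation used to make a trending series closer to stationary. The helper `difference` and the sample series are assumptions for illustration; this is not a full ARIMA fit.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# A series with a linear trend (illustrative data).
trend = [2, 4, 6, 8, 10]
first_diff = difference(trend)
print(first_diff)  # [2, 2, 2, 2]
```

After one round of differencing the linear trend is gone and the values no longer depend on the time index.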
【AI】Meaning of Term-Vector
Term-vector
It means that each word forms a separate dimension:
For a model containing only three words you would get:
dict = { dog, cat, lion }
Document 1
“cat cat” → (0,2,0)
Document 2
“cat cat cat” → (0,3,0)
Document 3
“lion cat” → (0,1,1)
Document 4
“cat lion” → (0,1,1)
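The example above can be reproduced with a short sketch, assuming whitespace tokenization and the dictionary order dog, cat, lion. The function name `term_vector` is illustrative.

```python
def term_vector(document, vocabulary):
    """Count how often each vocabulary term occurs in the document."""
    tokens = document.split()
    return tuple(tokens.count(term) for term in vocabulary)


vocabulary = ["dog", "cat", "lion"]
documents = ["cat cat", "cat cat cat", "lion cat", "cat lion"]
vectors = [term_vector(doc, vocabulary) for doc in documents]

for doc, vec in zip(documents, vectors):
    print(repr(doc), "->", vec)
```

Documents 3 and 4 map to the same vector because a term vector records counts only and discards word order.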
【AI】Softmax函数
Softmax
Advantage: Maps the original output vector to values in [0,1] that sum to 1.
Key word: Activation Function
Usage: Normalize the output of a network to a probability distribution over predicted output classes. Used in multiclass classification problems.
Form
\[\begin{equation}
P(y=j|x) = \frac{e^{x^Tw_j}}{\sum^K_{k=1}e^{x^Tw_k}}
\end{equation}\]...
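A minimal sketch of the formula above in plain Python, assuming `x` is a feature vector and `weights` holds one weight vector $w_k$ per class. Subtracting the maximum score before exponentiating is a common numerical-stability step and does not change the result; the names and sample numbers are illustrative.

```python
import math


def softmax(scores):
    """Turn a list of real scores into probabilities that sum to 1."""
    shift = max(scores)  # stability: avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(x, weights):
    """P(y = j | x) for every class j, with scores x^T w_j."""
    scores = [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in weights]
    return softmax(scores)


# Illustrative input: 2 features, 3 classes.
probs = class_probabilities([1.0, 2.0], [[0.5, 0.1], [0.2, 0.3], [-0.4, 0.0]])
print(probs)
```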
【Ad】Recommendation System Overview (Recommender Systems Explained)
Recommendation Algorithms
Advantage: Efficient and effective.
Usage: Recommendation Systems
Collaborative Filtering
User-based filtering
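Recommend to a user the items that users similar to them are interested in.
Basic steps:
Compute the similarity between users
Score the users according to their similarity
Find the items that other users in the similar-user set are interested in but that have never appeared in this user's list, and recommend them to this user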
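The three steps can be sketched as follows, assuming each user's interests are stored as a dict of item ratings and that cosine similarity is the chosen similarity measure. The function names, the neighbour count `k`, and the sample ratings are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, ratings, k=2):
    """Recommend items the k most similar users rated but target has not seen."""
    # Step 1: similarity between the target user and every other user.
    sims = {
        user: cosine_similarity(ratings[target], items)
        for user, items in ratings.items()
        if user != target
    }
    # Step 2: rank users by similarity and keep the top k.
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    # Step 3: score unseen items by similarity-weighted ratings.
    scores = {}
    for user in neighbours:
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sims[user] * rating
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative data only.
ratings = {
    "alice": {"A": 5, "B": 3},
    "bob": {"A": 5, "B": 3, "C": 4},
    "carol": {"D": 5},
}
print(recommend("alice", ratings, k=1))
```

Here bob is alice's nearest neighbour, so the item he rated that she has not seen is the one recommended.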
Similarity measures
Cosine similarity
Given two vectors $A,B$, the cosine similarity is given by their dot product and their vector lengths
\[\begin{equation}
\...
136 articles in total, 17 pages.