Home

[AI] Wide&Deep Item-based content retrieval (content-based collaborative filtering)

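Wide&Deep Advantage: Tightly combines memorization and generalization. Key word: DNN, Generalization, Regressor Usage: Used for recommendation; it combines the memorization ability of a linear model with a DNN, which generalizes strongly without requiring heavy feature engineering, to achieve more accurate predictions. Motivation Categorical data in a one-hot encoding representation lets a model memorize a feature pair that correlates with the target label, but this requires feature engineering. DNNs could learn low-di...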

Read more

[AI] Logistic Regression Explained

Logistic Regression Advantage: Uses the logistic function. Usage: Predicts outcomes for classification problems. Introduction Logistic regression is used to predict classification results. Details Suppose we want to predict whether an email is spam, and we have attributes $x_1, x_2, \dots, x_n$ for prediction. We define a function $z = \theta_0 + \theta_1x_1 + \th...

Read more

[AI] DSMM Item-based system filtering

DSMM Advantage: Uses DNN models for retrieval. Key word: DNN Usage: Used in text-based query-to-document retrieval. Objective Retrieve documents related to an input query. Proposed DNN architecture: Term vector A term vector uses each letter or word as a dimension, with the count of that term as its value. Word hashing In order to reduce the dimensionality of bag...

Read more

[AI] GBDT (Gradient Boosting Decision Tree) Explained

GBDT Advantage: GBDT achieves great performance. It is insensitive to feature scale, automatically fills in missing features, supports feature selection, and delivers notably strong results. Key word: Regressor Usage: A regressor to forecast different labels or values Great Explanation GBDT Algorithms: Principles - Develop Paper See chapter 3 for the formulation Adobe Acrobat Core idea Whether weak learners could be m...

Read more

[AI] ARIMA Model Explained

ARIMA Key word: Model, Time Series Usage: A statistical analysis model that uses time series data either to better understand the data set or to better predict future trends. Background Stationary and differencing A stationary time series does not depend on the time at which the series is observed. Stationary time series will not have any predict...

Read more