Jaimin Pan edited this page Nov 23, 2015 · 2 revisions

Introduction

What can MADlib do? A single diagram makes it clear. The following is taken from
http://user2014.stat.ucla.edu/files/PivotalR_user2014/userR2014_PivotalR.pdf
image

Algorithms supported by MADlib

Version 1.8

Data Types and Transformations
	Arrays and Matrices
		Array Operations
		Matrix Operations
		Matrix Factorization
			Low-rank Matrix Factorization
			Singular Value Decomposition
		Norms and Distance Functions
		Sparse Vectors
	Dimensionality Reduction
		Principal Component Analysis (PCA)
		Principal Component Projection
	Encoding Categorical Variables

Model Evaluation
	Cross Validation

Statistics
	Descriptive Statistics
		Summary
		Pearson's Correlation
	Inferential Statistics
		Hypothesis Tests
		Probability Functions

Supervised Learning
	Regression Models
		Clustered Variance
		Cox-Proportional Hazards Regression
		Elastic Net Regularization
		Generalized Linear Models
		Linear Regression
		Logistic Regression
		Marginal Effects
		Multinomial Regression
		Ordinal Regression
		Robust Variance
	Tree Methods
		Decision Tree
		Random Forest
	Conditional Random Field

Time Series Analysis
	ARIMA (Autoregressive Integrated Moving Average)

Unsupervised Learning
	Association Rules
		Apriori Algorithm
	Clustering
		k-Means Clustering
	Topic Modelling
		Latent Dirichlet Allocation (LDA)

Utility Functions
	Linear Solvers
		Dense Linear Systems
		Sparse Linear Systems
	Developer Database Functions
	PMML Export
	Text Analysis
		Term Frequency

Early Stage Development
	Cardinality Estimators
		CountMin (Cormode-Muthukrishnan)
		FM (Flajolet-Martin)
		MFV (Most Frequent Values)
	Conjugate Gradient
	Naive Bayes Classification
	Random Sampling
	Support Vector Machines