A Blog on Analytics and Marketing

SAS, Marketing, Predictive Modeling, Statistics

Archive for the ‘Modeling’ Category

Everything about modeling.

Ten Reasons Why Models May Fail

Posted by phillippeng on September 14, 2009

Kent Leahy and Nethra Sambamoorthi list the ten most common reasons why predictive models in marketing may fail. The ten reasons are:

(1) Modeling strategy design. The person who will actually be building the model is not included in the initial discussions or design of the model.
(2) Model overfitting. The model has been “overfit” to the sample at hand and, consequently, does not generalize well to the actual mailing population, or is otherwise unreliable.
(3) Population shift due to environment changes. The circumstances surrounding the actual mailing change or the mailing environment turns out to be substantially different from the one on which the model was built.
(4) Overgeneralizing the model. The model is used as though it were ‘generic’ or ‘universally applicable’.
(5) Population shift and model overfitting. Changes in the mailing environment in conjunction with the use of an ‘overfitted’ model.
(6) Post-event variables. The model contains “post-event” variable(s), that is, variables whose values occurred after the event you are trying to predict.
(7) Model validation and implementation. Not ‘test-scoring’ the model, or making an error when implementing the model.
(8) Sample selection QC. Failing to run an audit of the file as the first step in the model-building process.
(9) Unclear model expectations. A consensus on exactly what the model is expected to predict (and for which audience) is not reached and/or not well understood.
(10) Poor financial planning. The model performs well but the mailing itself is not a financial ‘success’.
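
Reason (6) is one of the easier failures to audit mechanically. The following is a minimal Python sketch, not taken from the referenced article: it flags candidate predictors whose capture date falls after the event being predicted. The variable names, dates, and the helper `find_post_event_variables` are all hypothetical.

```python
from datetime import date


def find_post_event_variables(variable_dates, event_date):
    """Return the names of variables captured strictly after the event date."""
    return sorted(
        name for name, captured_on in variable_dates.items()
        if captured_on > event_date
    )


# Hypothetical candidate predictors and the date each value was captured.
candidates = {
    "prior_purchases_12m": date(2009, 5, 31),
    "credit_score": date(2009, 6, 1),
    "refund_requested": date(2009, 7, 20),  # recorded after the response
}

# Hypothetical date of the event the model is meant to predict.
response_date = date(2009, 6, 15)

leaky = find_post_event_variables(candidates, response_date)
print(leaky)
```

Any variable the check returns should be dropped or rebuilt as of a date before the event, since its value would not be known at scoring time.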

Reference: http://www.crmportals.com/crmnews/2002123.html

Posted in Marketing, Modeling

Resources from SIAM

Posted by phillippeng on September 10, 2009

SIAM stands for the Society for Industrial and Applied Mathematics. Invited and prize lectures from the 2008 and 2009 Annual Meetings are available as slides with audio. You can access them through the following link:

http://live.blueskybroadcast.com/bsb/client/CL_DEFAULT.asp?Client=975312&title=Home

The proceedings section includes good coverage of data mining:

http://www.siam.org/proceedings/

Posted in Modeling

Time Series (1)

Posted by phillippeng on October 15, 2008

Texts and resources on time series:

1. Enders, Walter (2004). Applied Econometric Time Series, 2nd edition, New York: Wiley & Sons, Inc.
2. Wei, William W.S. (1990). Time Series Analysis: Univariate and Multivariate Methods, New York: Addison-Wesley Publishing Co., Inc.
3. Box, G.E.P., G.M. Jenkins and G.C. Reinsel (1994). Time Series Analysis: Forecasting and Control, 3rd edition, Englewood Cliffs, NJ: Prentice Hall.
4. Lutkepohl, Helmut (2005). New Introduction to Multiple Time Series Analysis, 3rd Edition, Springer Verlag.

Posted in Modeling

Scoring observations using PROC FASTCLUS

Posted by phillippeng on October 6, 2008

PROC FASTCLUS can be used to perform k-means clustering of observations. All the observations in the training dataset are assigned to clusters on the basis of the procedure's parameters and their variable values. Scoring the observations in a validation dataset with PROC FASTCLUS is less straightforward: if the procedure is simply rerun on the new data, the cluster centers are recomputed, so the assignment rules now depend on the new observations.

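Scoring new observations without changing the cluster assignment rules can be achieved by saving the cluster seeds from the original run and passing them back as a SEED dataset in PROC FASTCLUS, with MAXITER=0 so that the seeds are not recomputed.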

/* original clustering */

%let indsn = input;      /* your input (training) dataset */
%let nclus = maxclus;    /* placeholder: number of clusters to request */
%let indvars = varlist;  /* placeholder: independent variables to run proc fastclus on */
%let valid = val_data;   /* validation dataset to score */

/* fit the clusters on the training data and save the cluster seeds */
proc fastclus data=&indsn maxclusters=&nclus outseed=clusterSeeds;
    var &indvars;
run;

/* scoring new observations using the seed dataset; maxiter=0 keeps the seeds fixed */
proc fastclus data=&valid out=&valid._scored seed=clusterSeeds maxclusters=&nclus maxiter=0;
    var &indvars;
run;
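
To make the scoring step concrete, here is a minimal Python sketch of the idea behind it: with the seeds held fixed, each new observation is simply assigned to the closest seed by Euclidean distance. This illustrates the logic only and is not SAS's implementation; the seed coordinates, the validation points, and the helper `assign_to_nearest_seed` are made up for the example.

```python
import math


def assign_to_nearest_seed(observations, seeds):
    """Return a 1-based cluster number for each observation.

    Each observation goes to the seed with the smallest Euclidean
    distance; the seeds themselves are never updated.
    """
    assignments = []
    for obs in observations:
        distances = [math.dist(obs, seed) for seed in seeds]
        assignments.append(distances.index(min(distances)) + 1)
    return assignments


# Hypothetical seeds saved from the original clustering run.
cluster_seeds = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical validation observations to score.
validation = [(1.0, 2.0), (9.0, 8.5), (4.0, 4.0)]

clusters = assign_to_nearest_seed(validation, cluster_seeds)
print(clusters)
```

Because the seeds are inputs rather than something estimated from the validation data, scoring the same observation always yields the same cluster, regardless of what else is in the file being scored.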

Reference:
“Data Preparation for Analytics Using SAS” by Gerhard Svolba, Ph.D.

Posted in Modeling, SAS

PROC GAM – Detect the Seasonality

Posted by phillippeng on October 2, 2008

Fluctuations in a data series over time may be random or may reflect latent seasonality. Very often, we need to test whether the seasonality is real and, if so, what pattern it follows. The example in the following link illustrates how to examine seasonality using SAS PROC GAM.

proc gam
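
As a rough complement to the linked example, the Python sketch below runs a much simpler check than a GAM: it computes the sample autocorrelation of a monthly series at lag 12 and lag 6. It is not PROC GAM and fits no smooth function. The series is synthetic (a pure 12-month sine cycle), and the helper `autocorrelation` is written just for this illustration.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean)
        for t in range(n - lag)
    )
    return covariance_sum / variance_sum


# Synthetic monthly series: six years with a 12-month cycle (not real data).
monthly = [10 + 3 * math.sin(2 * math.pi * t / 12) for t in range(72)]

r12 = autocorrelation(monthly, 12)  # same month, one year apart
r6 = autocorrelation(monthly, 6)    # opposite points in the cycle

print(round(r12, 3), round(r6, 3))
```

A strongly positive value at lag 12 together with a strongly negative value at lag 6 is what an annual cycle looks like in monthly data; a series with only random fluctuations would show values near zero at both lags.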

Posted in Modeling, SAS