Re: [R] Labour Statistics

Max Wed, 15 Oct 2008 09:23:16 -0700

Ruben,

Thank you for the advice. I'll do what I can with it.



Ruben wrote:
If I understand your problem correctly, the magnitude of deviations
from the mean/median/mode in the volume of your requests for
background checks in month m predicts a multivariate response that
represents the macroeconomic situation in month m+1.
First, regarding your original question, a statistician judging your
product would like to see a measure of predictive success. If you have a
model to relate your predictor (the deviation in volume of requests)
with your response (several variables representing the macroeconomic
status) then you could run the model for many months (say from Jan 2000
to Sep 2008), predict the macroeconomic status with the model, and
compare it with the actual macroeconomic status observed. The comparison
can then be summarized as a measure of predictive success, such as the
predictive mean squared error.
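
As a rough illustration of the evaluation described above, here is a
minimal sketch of an expanding-window check: fit on the months seen so
far, predict the next month, and compare with what was observed. All
data are simulated, and the names volume_dev and unemp, the linear
model, and the 36-month starting window are assumptions made for the
example, not part of the original problem.

```r
# Simulated stand-in data; nothing here comes from real screening files.
set.seed(1)
n <- 105                          # Jan 2000 to Sep 2008 is 105 months
volume_dev <- rnorm(n)            # hypothetical deviation in request volume
unemp <- numeric(n)
unemp[1] <- 7
for (m in 2:n) {
  # the response in month m depends on the predictor in month m - 1
  unemp[m] <- 7 + 0.5 * volume_dev[m - 1] + rnorm(1, sd = 0.3)
}

# Pair the response in month m + 1 with the predictor in month m.
dat <- data.frame(y = unemp[-1], x = volume_dev[-n])

# Expanding window: fit on rows 1..k, then predict row k + 1.
start <- 36
pred <- rep(NA_real_, nrow(dat))
for (k in start:(nrow(dat) - 1)) {
  fit <- lm(y ~ x, data = dat[1:k, ])
  pred[k + 1] <- predict(fit, newdata = dat[k + 1, , drop = FALSE])
}

test_rows <- (start + 1):nrow(dat)
pmse <- mean((dat$y[test_rows] - pred[test_rows])^2)

# Naive baseline: predict each month with the mean of all earlier months.
naive <- sapply(test_rows, function(i) mean(dat$y[1:(i - 1)]))
pmse_naive <- mean((dat$y[test_rows] - naive)^2)

c(model = pmse, naive = pmse_naive)
```

Reporting the model's predictive mean squared error next to a naive
baseline gives a prospective buyer something concrete to judge: the
product is only interesting if it clearly beats the baseline on months
the model has not seen.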

Second, regarding what method to use to fit the relation between your
predictor and the multivariate response, you have a number of options.
One simple alternative that would reduce your problem to a simple
univariate modeling problem would be to search the economic literature
for an index of macroeconomic status that would reduce your
multivariate response to a univariate response. Alternatively, if the
variables in the multivariate response are strongly correlated, you can
define your own index by using principal component analysis on the
multivariate response, and later use the first principal component as a
univariate response. After that, many options are again available, such
as forecasting methods or regular time series analysis. A more complex
but probably more precise approach would be to model the multivariate
response as such. This depends on the nature of the variables in the
multivariate response. If they can be considered as multinomial counts
then you have a very good solution using multinomial logistic regression
with function multinom in package nnet.
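
The principal component route mentioned above could look roughly like
the sketch below. The data are simulated, the column names
(unemployment, participation, avg_hours) and the predictor volume_dev
are hypothetical, and the rows are assumed to be already aligned so that
each row pairs the predictor in month m with the response in month m+1.

```r
# Simulated stand-in data driven by one common latent factor.
set.seed(2)
n <- 104
latent <- rnorm(n)                # hypothetical "state of the economy"
macro <- data.frame(
  unemployment  = 7  + 0.8 * latent + rnorm(n, sd = 0.3),
  participation = 66 - 0.5 * latent + rnorm(n, sd = 0.3),
  avg_hours     = 34 - 0.4 * latent + rnorm(n, sd = 0.3)
)
volume_dev <- 0.6 * latent + rnorm(n, sd = 0.5)   # hypothetical predictor

# Principal components of the standardized response variables.
pca <- prcomp(macro, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component.
var_share <- pca$sdev^2 / sum(pca$sdev^2)
var_share

# Use the first component as a univariate index (its sign is arbitrary).
index <- pca$x[, 1]
fit <- lm(index ~ volume_dev)
summary(fit)$r.squared
```

The index is only a reasonable summary if the first component carries
most of the variance, so var_share is worth checking before any
modeling is done on index.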
Maybe this can get you started.
Regards
Rubén

Gad Abraham explained:
Max wrote:
Hi everyone,
This is not so much an R question as a statistics question. I currently work for the largest pre-employment screening company in Canada. Upper management has noticed that usually a month or so before any big kind of economic shock happens, our incoming files (requests for a background check) jump up or down.
As the company statistician, I've been asked to see if the relationship is strong enough to put together a product that can be sold to any kind of firm or organization (brokerages or any kind of investing firm, the federal ministry of finance, Statistics Canada (like the Bureau of Labor Statistics in the USA), universities, etc.).
In Canada, on the 10th of every month, Statistics Canada releases labour statistics for the previous month. The way the CFO sees it, *ideally* on the 1st to 10th (something like that) of every month, the firm I work for could be releasing data for the rest of the month.
What I'm trying to figure out is: if you were in the position of evaluating the final product for purchase, what kind of information would make the product credible/viable? Summary statistics? Variance-covariance matrices? Graphs of the data? Cross-correlation matrices for time series analysis?
It's frustrating because I can see a noticeable relationship between our file volume and the unemployment rate (in particular), but I'm not sure how to appropriately frame it in a way that another statistician/modeler would want the data.

Why not start with some simple plots of the relationships between your variables? Once you have a feel for the problem, you can look into modelling it more formally using a suitable regression model.

Gad, the issue I have is that I technically have one predictor for multiple responses. The data is not very clean for simple univariate models. Unfortunately, my knowledge of multivariate response models is poor, and how to set up the problem in R as a multivariate regression is a total mystery to me. (Multivariate was the one course that I wasn't able to take in my undergrad math/stats degree.)
The other issue is that if I view the problem as a time series problem, it's multiple time series analysis, which I don't have any books on.
The more I look at the data and the problem, the more I feel like I'm in way over my head.
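
On the question of how to set up one predictor with several responses
in R, one common starting point is to pass a matrix response to lm(),
which fits a multivariate linear model. The sketch below uses simulated
data and hypothetical variable names; it ignores serial correlation, so
it is a first look and not a substitute for a proper multiple time
series analysis.

```r
# Simulated stand-in data: one predictor, three response variables.
set.seed(3)
n <- 104
volume_dev    <- rnorm(n)         # hypothetical predictor
unemployment  <- 7  + 0.5 * volume_dev + rnorm(n, sd = 0.3)
participation <- 66 - 0.3 * volume_dev + rnorm(n, sd = 0.3)
avg_hours     <- 34 - 0.2 * volume_dev + rnorm(n, sd = 0.3)

# cbind() on the left-hand side gives lm() a matrix response.
fit <- lm(cbind(unemployment, participation, avg_hours) ~ volume_dev)

class(fit)   # includes "mlm"
coef(fit)    # one column (intercept and slope) per response
anova(fit)   # multivariate test of the predictor (Pillai's trace by default)
```

The coefficient estimates are the same as those from three separate
univariate regressions; what the joint fit adds is a single
multivariate test of whether the predictor matters for the responses
taken together.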


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
