Re: [R] Do I need to transform backtest returns before using pbo (probability of backtest overfitting) package functions?

Eric Berger Tue, 21 Nov 2017 13:18:19 -0800

Correct 

Sent from my iPhone
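
For illustration, a minimal sketch in R of the scale-matching point confirmed above. It uses base R only. The helper `make_sharpe()` and the 12-periods-per-year monthly convention are assumptions introduced here, not part of the pbo package or the vignette; the simulated `daily_returns` matrix is only a stand-in for real backtest output, which would be used as-is.

```r
# Hypothetical helper: build a Sharpe function whose risk-free rate is on the
# same scale as the returns (252 periods/year for daily, 12 for monthly).
make_sharpe <- function(periods_per_year, rf_annual = 0.03) {
  rf <- rf_annual / periods_per_year
  function(x) {
    apply(x, 2, function(col) {
      er <- col - rf       # excess return per period
      mean(er) / sd(er)    # per-period (non-annualized) Sharpe ratio
    })
  }
}

sharpe_daily   <- make_sharpe(252)  # same rf as the vignette's 0.03/252
sharpe_monthly <- make_sharpe(12)   # assumed convention for monthly returns

# Stand-in for real daily backtest returns, in decimal: 3 strategies, 504 days.
# Real returns would be passed through unchanged (no re-scaling, no re-centering).
set.seed(1)
daily_returns <- matrix(rnorm(504 * 3, mean = 0.0004, sd = 0.01), ncol = 3,
                        dimnames = list(NULL, c("s1", "s2", "s3")))

sr <- sharpe_daily(daily_returns)
print(sr)
```

Following the call shown in the quoted vignette, `sharpe_daily` would then be supplied as the `f` argument, e.g. `pbo(daily_returns, s=8, f=sharpe_daily, threshold=0)`; with monthly returns, `sharpe_monthly` would take its place.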


> On 21 Nov 2017, at 22:42, Joe O <joerodonn...@gmail.com> wrote:
> 
> Hi Eric,
> 
> Thank you, that helps a lot. If I'm understanding correctly, to use actual 
> returns from backtests rather than simulated returns, I need to make sure my 
> risk-adjusted return measure, the Sharpe ratio in this case, matches the 
> scale of my returns (i.e. daily returns with a daily Sharpe ratio, monthly 
> with monthly, etc.). And I wouldn't need to transform the returns the way the 
> simulated returns are transformed in the vignette, since real returns already 
> have whatever mean and std dev they happen to have. Is that correct? 
> 
> Thanks, -Joe
> 
> 
>> On Tue, Nov 21, 2017 at 5:36 AM, Eric Berger <ericjber...@gmail.com> wrote:
>> [re-sending - previous email went out by accident before complete]
>> Hi Joe,
>> The centering and re-scaling is done for the purposes of his example, and 
>> also to be consistent with his definition of the sharpe function.
>> In particular, note that the sharpe function has the rf (riskfree) parameter 
>> with a default value of .03/252 i.e. an ANNUAL 3% rate converted to a DAILY 
>> rate, expressed in decimal.
>> That means that the other argument to this function, x, should be DAILY 
>> returns, expressed in decimal.
>> 
>> Suppose he wanted to create random data from a distribution of returns with 
>> ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal. 
>> The equivalent DAILY returns would have mean MU_D = MU_A / 252 and standard 
>> deviation SIGMA_D =  SIGMA_A/SQRT(252).
>> 
>> He calls MU_D by the name mu_base and SIGMA_D by the name sigma_base. In his 
>> code MU_A is sr_base = 0 and SIGMA_A is 1.
>> 
>> His loop now converts the random numbers in his matrix so that each column 
>> has mean MU_D and std deviation SIGMA_D.
>> 
>> HTH,
>> Eric
>> 
>> 
>> 
>>> On Tue, Nov 21, 2017 at 2:33 PM, Eric Berger <ericjber...@gmail.com> wrote:
>>> Hi Joe,
>>> The centering and re-scaling is done for the purposes of his example, and 
>>> also to be consistent with his definition of the sharpe function.
>>> In particular, note that the sharpe function has the rf (riskfree) 
>>> parameter with a default value of .03/252 i.e. an ANNUAL 3% rate converted 
>>> to a DAILY rate, expressed in decimal.
>>> That means that the other argument to this function, x, should be DAILY 
>>> returns, expressed in decimal.
>>> 
>>> Suppose he wanted to create random data from a distribution of returns with 
>>> ANNUAL mean MU_A and ANNUAL std deviation SIGMA_A, both stated in decimal. 
>>> The equivalent DAILY
>>> 
>>> Then he does two steps: (1) generate a matrix of random values from the 
>>> N(0,1) distribution. (2) convert them to DAILY
>>> After initializing the matrix with random values (from N(0,1)), he now 
>>> wants to create a series of DAILY
>>> sr_base <- 0
>>> mu_base <- sr_base/(252.0)
>>> sigma_base <- 1.00/(252.0)**0.5
>>> for ( i in 1:n ) {
>>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
>>> }
>>> 
>>>> On Tue, Nov 21, 2017 at 2:10 PM, Bert Gunter <bgunter.4...@gmail.com> 
>>>> wrote:
>>>> Wrong list.
>>>> 
>>>> Post on r-sig-finance instead.
>>>> 
>>>> Cheers,
>>>> Bert
>>>> 
>>>> 
>>>> 
>>>> On Nov 20, 2017 11:25 PM, "Joe O" <joerodonn...@gmail.com> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I'm trying to understand how to use the pbo package by looking at a
>>>> vignette. I'm curious about a part of the vignette that creates simulated
>>>> returns data. The package author transforms his simulated returns in a way
>>>> that I'm unfamiliar with, and that I haven't been able to find an
>>>> explanation for after searching around. I'm curious if I need to replicate
>>>> the transformation with real returns. For context, here is the vignette
>>>> (cleaned up a bit to make it reproducible):
>>>> 
>>>> (Full vignette:
>>>> https://cran.r-project.org/web/packages/pbo/vignettes/pbo.html)
>>>> 
>>>> library(pbo)
>>>> #First, we assemble the trials into a TxN matrix where each column
>>>> #represents a trial and each trial has the same length T. This example
>>>> #is random data so the backtest should be overfit.
>>>> 
>>>> set.seed(765)
>>>> n <- 100
>>>> t <- 2400
>>>> m <- data.frame(matrix(rnorm(n*t),nrow=t,ncol=n,
>>>>                        dimnames=list(1:t,1:n)), check.names=FALSE)
>>>> 
>>>> sr_base <- 0
>>>> mu_base <- sr_base/(252.0)
>>>> sigma_base <- 1.00/(252.0)**0.5
>>>> for ( i in 1:n ) {
>>>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
>>>> }
>>>> #We can use any performance evaluation function that can work with the
>>>> #reassembled sub-matrices during the cross validation iterations.
>>>> #Following the original paper we can use the Sharpe ratio as
>>>> 
>>>> sharpe <- function(x,rf=0.03/252) {
>>>>   sr <- apply(x,2,function(col) {
>>>>     er = col - rf
>>>>     return(mean(er)/sd(er))
>>>>   })
>>>>   return(sr)
>>>> }
>>>> #Now that we have the trials matrix we can pass it to the pbo function
>>>> #for analysis.
>>>> 
>>>> my_pbo <- pbo(m,s=8,f=sharpe,threshold=0)
>>>> 
>>>> summary(my_pbo)
>>>> 
>>>> Here's the portion I'm curious about:
>>>> 
>>>> sr_base <- 0
>>>> mu_base <- sr_base/(252.0)
>>>> sigma_base <- 1.00/(252.0)**0.5
>>>> for ( i in 1:n ) {
>>>>   m[,i] = m[,i] * sigma_base / sd(m[,i]) # re-scale
>>>>   m[,i] = m[,i] + mu_base - mean(m[,i]) # re-center
>>>> }
>>>> 
>>>> Why is the data transformed within the for loop, and does this kind of
>>>> re-scaling and re-centering need to be done with real returns? Or is this
>>>> just something the author is doing to make his simulated returns look more
>>>> like the real thing?
>>>> 
>>>> Googling around turned up some articles regarding scaling volatility to the
>>>> square root of time, but the scaling in the code here doesn't look quite
>>>> like what I've seen. Re-scalings I've seen involve multiplying some short
>>>> term (i.e. daily) measure of volatility by the root of time, but this isn't
>>>> quite that. Also, the documentation for the package doesn't include this
>>>> chunk of re-scaling and re-centering code. Documentation:
>>>> https://cran.r-project.org/web/packages/pbo/pbo.pdf
>>>> 
>>>> So:
>>>> 
>>>>    - Why is the data transformed in this way, and what is the result of
>>>>      this transformation?
>>>>    - Is it only necessary for this simulated data, or do I need to
>>>>      similarly transform real returns?
>>>> 
>>>> I read in the posting guide that stats questions are acceptable under
>>>> certain conditions; I hope this counts. Thanks for reading,
>>>> 
>>>> -Joe
>>>> 
>>>>         [[alternative HTML version deleted]]
>>>> 
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
> 
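
The re-scaling and re-centering described in the quoted explanation can be checked directly. The following is a minimal sketch in base R, using a smaller matrix than the vignette's; the names `mu_annual`, `sigma_annual`, `n_obs`, `col_means` and `col_sds` are introduced here for illustration and do not come from the vignette.

```r
# Annual parameters implied by the vignette's code: mean 0, std deviation 1.
mu_annual    <- 0
sigma_annual <- 1

# Daily equivalents, assuming 252 trading days per year.
mu_base    <- mu_annual / 252
sigma_base <- sigma_annual / sqrt(252)

set.seed(765)
n     <- 5      # number of trials (columns); the vignette uses 100
n_obs <- 2400   # observations per trial (rows)
m <- matrix(rnorm(n * n_obs), nrow = n_obs, ncol = n)

for (i in 1:n) {
  m[, i] <- m[, i] * sigma_base / sd(m[, i])   # re-scale: sd becomes sigma_base
  m[, i] <- m[, i] + mu_base - mean(m[, i])    # re-center: mean becomes mu_base
}

col_means <- apply(m, 2, mean)
col_sds   <- apply(m, 2, sd)

# Largest deviation of any column from the target mean and std deviation;
# both should be zero up to floating-point error.
print(max(abs(col_means - mu_base)))
print(max(abs(col_sds - sigma_base)))
```

After the loop every column has sample mean `mu_base` and sample standard deviation `sigma_base`, whatever the random draw was. That is why the step only makes sense for simulated data: real returns already carry their own mean and standard deviation.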
