Re: [R] using spec.pgram

Anthony Mathelier Mon, 16 Jun 2008 11:39:56 -0700

OK, it seems like I do not succeed in expressing what I do, or want to do.
So, I give you the example that bring me to this kind of analysis. I wrote
the paper "Chromosomal periodicity of evolutionary conserved gene pairs"
(which you can download at http://www.pnas.org/cgi/reprint/104/25/10559). In
figure 2, they have a histogram of distances between genes on a chromosome
and they make a discrete fourier transform analysis to exhibit a period of
117kb. They explain how they did in the first paragraph of "Distributions of
distances and positions and fourier transform" (last page). I thought that
this kind of analysis was made by spec.pgram with a histogram. But perhaps I
am wrong because I really do not understand what they mean by "the histogram
was tranformed into a continuous probability density by using a Gaussian
smoothing window and normalizing the total density over the entire genome to
1. A discrete Fourier transform of the data were computed from 0 to 1,000kb
by using a Tukey window to taper the end (ratio of 0.5 for tapered to
untapered length.".
I hope it explains better what I want to obtain from my distances.
Best regards,


Anthony

On Mon, Jun 16, 2008 at 5:25 PM, stephen sefick <[EMAIL PROTECTED]> wrote:

> To get some sort of frequency which in your case seem to be cycles per
> distance?  Is a valid use of a fourier transform as long as it is a distance
> that is measured in a way that would be analogous to a time series-  In
> other words if the distance proceeds from an origin in one direction-
> geophysicists do this often with the realization of an earthquake picked up
> by sensors that are a distance away from the origin of the epicenter, but
> they are looking for coherencies in the signal from one place to the next in
> the frequency domain seperated by distance- this is called beam forming-
> They use the raw signal-  by binning (making a histogram) the data you are
> loosing the signal-  you are looking at frequency of occurance of certain
> values not for the underlying periodicities of the data (in time or
> space).   You are fitting cos and isin function to you data to see if there
> is periodicity-  the power is the integration of the convolution of this sin
> and cosine function with your data-  It seems to me meaningless to preform
> this convolution agianst something that is not a signal (the histogram).  If
> you want to use a frequency domain technique you have to have a frequency to
> investigate-  a histogram does not have this-  I is a frequency of occurance
> by bin size which is NOT what you want (your would have cycles/binlength
> that doesn't make any sense to me) to do this analysis on-  You want a
> signal-  dissolved oxygen curve, sunspot record, etc. through time, or
> distance as stated above- you are looking for the frequency of a waveform-
> Anyway,  I may be misunderstanding- supply some code and explain the data
> otherwise this line of though- in my limited expertise- is a dead end,  but
> agian I still don't know what it is that you are, exactly, trying to do- and
> what your dataset constits.  I hope these ruminations help
>
> I recommend doing this analysis on the raw data- It doesn't matter that you
> don't have the same amount of data points- as long as both sets of data have
> circa ten times the length of (cycles/distance) what you want to detect-  If
> things in your case are spaced by one meter then the lowest cycle
> perdistance that you can reliably detect if 0.5 meters,  this is all
> speculation because you don't have a problem with reproducible code, and we
> have no idea what you are measuring or what your data looks like-  without
> this information there is no way that I can say one way or the other that
> you approach (suggested non-histogram) would be right or wrong.
>
> Stephen
>
>
> On Mon, Jun 16, 2008 at 9:33 AM, Anthony Mathelier <
> [EMAIL PROTECTED]> wrote:
>
>> Perhaps I'm applying spec.pgram wrong as you said. I will explain what I
>> want, so you can tell me why I'm wrong and perhaps what I have to do to do
>> it well.
>> I have some points in a 1-D space and I want to know if they are spaced at
>> a certain periodic distance. So, I computed all the distances between points
>> in my space. Then, I would like to know if a certain distance (period), or
>> multiples of a certain distance, is preferred to space my data. I made a
>> histogram of the distances and apply the spec.pgram function to know the
>> frequence (so the period) which is the most important to space the original
>> data.
>> But, when I have to sets of data (without necessarily the same number of
>> observation in each set), I want to compare the importance of the period
>> given by spec.pgram between the sets. Could I normalize the amplitude of the
>> peaks given by spec.pgram?
>> So, am I wrong to apply this methodology to exhibit a periodic distance
>> between my data? If, true, what could you recommend me to do this?
>> Thanks in advance for your answers.
>> Best regards,
>>
>> Anthony
>>
>> On Tue, Jun 10, 2008 at 6:13 PM, stephen sefick <[EMAIL PROTECTED]>
>> wrote:
>>
>>> I from a first thought I would say that you are apply this wrong!  The
>>> fourier transform convolves a function (cos(x)+isin(x)  (this may not be the
>>> exact formula but I don't have my books near)) to the data and then
>>> integrates over -1/2 to 1/2 takes the modulus and plots this- the
>>> periodogram.  The reason you preform a fourier transform is to look at
>>> recurring frequencies in the data, which are in the time domain.  The
>>> fourier transform converts the time series into the frequency domain and
>>> viola you have a peak into the hidden/recurring parts of your signal.  From
>>> your explaination your are applying this technique wrong-  look at schumway,
>>> MASS4, et al. books to get a handle on how this technique is used.  If you
>>> are to apply a time series analysis please use it on a time series.  Maybe
>>> your logic is not flawed but I don't see how a histogram with its associated
>>> binning is a better candidate for time series analysis than the original
>>> time series if at all.
>>> good luck
>>>
>>> Stephen
>>>
>>> On Tue, Jun 10, 2008 at 8:49 AM, Matthieu Stigler <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hello
>>>>
>>>> I don't know exactly what you want to do but:
>>>>
>>>> -why do you use in your example h$counts and not h? Furthermore helpl
>>>> file says it should be a time series, why then rather not your time series?
>>>>
>>>> -usually na.action will make the "default" action, which you can see by
>>>> getOptions("na.action")
>>>>
>>>> -here in this function it is provided in the function values na.action =
>>>> na.fail so it will just remove the NA in the time series
>>>>
>>>> -if you want to study a function, I advise you to copy it entirely,
>>>> rename it and then just insert print(curiousobject...) in the function, 
>>>> this
>>>> will allow you to let the function run and grasp the interessting objects,
>>>> like:
>>>>
>>>> study<-function (x, spans = NULL, kernel = NULL, taper = 0.1, pad = 0,
>>>> fast = TRUE, demean = FALSE, detrend = TRUE, plot = TRUE,
>>>>   na.action = na.fail, ...)
>>>> {
>>>>   series <- deparse(substitute(x))
>>>>   x <- na.action(as.ts(x))
>>>>   print(x)
>>>>   xfreq <- frequency(x)
>>>> ...}
>>>> study(sunspots)
>>>>
>>>> -when you provide an example, instead of giving an external reference
>>>> for the data, try to search a convenient internal data (accessed by data()
>>>> ), so one will be able to reproduce your problems. Here you could use
>>>> sunspots
>>>>
>>>> -to obtain the commented code... I don't know it...
>>>>
>>>> -good luck
>>>>
>>>> Matthieu
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>  Hi everyone,
>>>>>
>>>>> first of all, I would like to say that I am a newbie in R, so I
>>>>> apologize in
>>>>> advance if my questions seem to be too easy for you.
>>>>>
>>>>> Well, I'm looking for periodicity in histograms. I have histograms of
>>>>> certain phenomenons and I'm asking whether a periodicity exists in
>>>>> these
>>>>> data. So, I make a periodogram with the function spec.pgram. For
>>>>> instance,
>>>>> if I have a histogram h, I call spec.pgram by spec.pgram (h, log="no",
>>>>> taper=0.5). So, I have some peaks that appear and I would like to
>>>>> interpret
>>>>> them but I do not know how they are computed and so what a peak with a
>>>>> value
>>>>> of 10000 represents in comparison with a peak of value 600 with another
>>>>> histogram.
>>>>> I looked at the source code of the function spec.pgram to better
>>>>> understand
>>>>> what is behind. But, when I apply the source code line by line, I've
>>>>> got a
>>>>> problem. For instance, I make:
>>>>>
>>>>>
>>>>>> >data = scan ("file.txt")
>>>>>> >h = hist (data, breaks=max(data)/5000)
>>>>>>
>>>>>>
>>>>> #then I apply the first two lines of the spec.pgram function
>>>>>
>>>>>
>>>>>> >series <- deparse(substitute(h$counts))
>>>>>> >x <- na.action(as.ts(h$counts))
>>>>>> >x
>>>>>>
>>>>>>
>>>>> NULL
>>>>> I do not understand why when I apply the first two lines of the
>>>>> function I
>>>>> have x which is equal to NULL (which make a mistake in the following
>>>>> lines
>>>>> of the code) but if I apply the function directly with h$counts it
>>>>> gives me
>>>>> a result.
>>>>> So, if someone can explain to me what is the problem and/or how
>>>>> spec.pgram
>>>>> exactly computes the periodogram and how to interpret it with my data,
>>>>> I
>>>>> would be so grateful.
>>>>> And subsidiary questions:
>>>>> - Is it possible to have the commented source code of the function?
>>>>> - I do not understand what is the function na.action in the second line
>>>>> of
>>>>> spec.pgram, so if you can explain it to me.
>>>>>
>>>>> Thanks in advance for your answers.
>>>>> Best regards,
>>>>>
>>>>> Anthony Mathelier
>>>>>
>>>>>        [[alternative HTML version deleted]]
>>>>>
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Let's not spend our time and resources thinking about things that are so
>>> little or so large that all they really do for us is puff us up and make us
>>> feel like gods. We are mammals, and have not exhausted the annoying little
>>> problems of being mammals.
>>>
>>> -K. Mullis
>>
>>
>>
>
>
> --
> Let's not spend our time and resources thinking about things that are so
> little or so large that all they really do for us is puff us up and make us
> feel like gods. We are mammals, and have not exhausted the annoying little
> problems of being mammals.
>
> -K. Mullis
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] using spec.pgram

Reply via email to