OK, it seems that I am not succeeding in expressing what I am doing, or
what I want to do. So, here is the example that brought me to this kind
of analysis.

I read the paper "Chromosomal periodicity of evolutionary conserved gene
pairs" (which you can download at
http://www.pnas.org/cgi/reprint/104/25/10559). In figure 2, they have a
histogram of distances between genes on a chromosome, and they use a
discrete Fourier transform analysis to exhibit a period of 117 kb. They
explain how they did it in the first paragraph of "Distributions of
distances and positions and Fourier transform" (last page).

I thought that this kind of analysis was done with spec.pgram on a
histogram. But perhaps I am wrong, because I really do not understand
what they mean by "the histogram was transformed into a continuous
probability density by using a Gaussian smoothing window and normalizing
the total density over the entire genome to 1. A discrete Fourier
transform of the data were computed from 0 to 1,000kb by using a Tukey
window to taper the end (ratio of 0.5 for tapered to untapered length)."

I hope this explains better what I want to obtain from my distances.

Best regards,
Anthony

On Mon, Jun 16, 2008 at 5:25 PM, stephen sefick <[EMAIL PROTECTED]> wrote:

> To get some sort of frequency, which in your case seems to be cycles per
> distance? That is a valid use of a Fourier transform as long as the
> distance is measured in a way that is analogous to a time series, in
> other words if the distance proceeds from an origin in one direction.
> Geophysicists do this often with the record of an earthquake picked up
> by sensors at some distance from the epicenter, but they are looking
> for coherencies in the signal from one place to the next in the
> frequency domain, separated by distance. This is called beamforming.
> They use the raw signal. By binning the data (making a histogram) you
> are losing the signal: you are looking at the frequency of occurrence
> of certain values, not at the underlying periodicities of the data (in
> time or space).
>
> You are fitting cos and i*sin functions to your data to see if there is
> periodicity; the power is the integration of the convolution of these
> sine and cosine functions with your data. It seems to me meaningless to
> perform this convolution against something that is not a signal (the
> histogram). If you want to use a frequency domain technique you have to
> have a frequency to investigate, and a histogram does not have this. It
> is a frequency of occurrence by bin size, which is NOT what you want to
> do this analysis on (you would have cycles/binlength, which doesn't
> make any sense to me). You want a signal (a dissolved oxygen curve, a
> sunspot record, etc.) through time, or through distance as stated
> above; you are looking for the frequency of a waveform.
>
> Anyway, I may be misunderstanding. Supply some code and explain the
> data, otherwise this line of thought, in my limited expertise, is a
> dead end. But again, I still don't know exactly what it is that you
> are trying to do, or what your dataset consists of.
> I hope these ruminations help.
>
> I recommend doing this analysis on the raw data. It doesn't matter that
> you don't have the same number of data points, as long as both sets of
> data span about ten times the length of the period (distance per cycle)
> that you want to detect. If things in your case are spaced by one
> meter, then the highest frequency that you can reliably detect is 0.5
> cycles per meter (a period of 2 meters). This is all speculation,
> because you have not posted your problem with reproducible code, and we
> have no idea what you are measuring or what your data look like.
> Without this information there is no way that I can say one way or the
> other whether your approach (or the suggested non-histogram one) would
> be right or wrong.
>
> Stephen
>
>
> On Mon, Jun 16, 2008 at 9:33 AM, Anthony Mathelier <
> [EMAIL PROTECTED]> wrote:
>
>> Perhaps I'm applying spec.pgram wrong, as you said. I will explain
>> what I want, so you can tell me why I'm wrong and perhaps what I have
>> to do to do it well.
>> I have some points in a 1-D space and I want to know if they are
>> spaced at a certain periodic distance. So, I computed all the
>> distances between points in my space. Then, I would like to know if a
>> certain distance (period), or multiples of a certain distance, is
>> preferred in the spacing of my data. I made a histogram of the
>> distances and applied the spec.pgram function to find the frequency
>> (and so the period) that is the most important in the spacing of the
>> original data.
>> But when I have two sets of data (without necessarily the same number
>> of observations in each set), I want to compare the importance of the
>> period given by spec.pgram between the sets. Could I normalize the
>> amplitude of the peaks given by spec.pgram?
>> So, am I wrong to apply this methodology to exhibit a periodic
>> distance in my data? If so, what would you recommend I do instead?
>> Thanks in advance for your answers.
>> Best regards,
>>
>> Anthony
>>
>> On Tue, Jun 10, 2008 at 6:13 PM, stephen sefick <[EMAIL PROTECTED]>
>> wrote:
>>
>>> At first thought I would say that you are applying this wrong! The
>>> Fourier transform convolves a function (cos(x) + i sin(x); this may
>>> not be the exact formula, but I don't have my books near) with the
>>> data, then integrates over -1/2 to 1/2, takes the modulus, and plots
>>> this: the periodogram. The reason you perform a Fourier transform is
>>> to look at recurring frequencies in the data, which are in the time
>>> domain. The Fourier transform converts the time series into the
>>> frequency domain and voilà, you have a peek into the
>>> hidden/recurring parts of your signal. From your explanation you are
>>> applying this technique wrong. Look at Shumway, MASS4, and similar
>>> books to get a handle on how this technique is used. If you are to
>>> apply a time series analysis, please use it on a time series. Maybe
>>> your logic is not flawed, but I don't see how a histogram with its
>>> associated binning is a better candidate for time series analysis
>>> than the original time series, if at all.
>>> good luck
>>>
>>> Stephen
>>>
>>> On Tue, Jun 10, 2008 at 8:49 AM, Matthieu Stigler <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hello
>>>>
>>>> I don't know exactly what you want to do, but:
>>>>
>>>> -why do you use h$counts in your example and not h? Furthermore, the
>>>> help file says it should be a time series, so why not use your time
>>>> series instead?
>>>>
>>>> -usually na.action will perform the "default" action, which you can
>>>> see with getOption("na.action")
>>>>
>>>> -here, the function's argument list sets na.action = na.fail, so it
>>>> will stop with an error if the time series contains any NA (it does
>>>> not remove them)
>>>>
>>>> -if you want to study a function, I advise you to copy it entirely,
>>>> rename it and then just insert print(curiousobject...)
>>>> in the function; this will allow you to let the function run and
>>>> inspect the interesting objects, like:
>>>>
>>>> study <- function (x, spans = NULL, kernel = NULL, taper = 0.1, pad = 0,
>>>>     fast = TRUE, demean = FALSE, detrend = TRUE, plot = TRUE,
>>>>     na.action = na.fail, ...)
>>>> {
>>>>     series <- deparse(substitute(x))
>>>>     x <- na.action(as.ts(x))
>>>>     print(x)
>>>>     xfreq <- frequency(x)
>>>>     ...}
>>>> study(sunspots)
>>>>
>>>> -when you provide an example, instead of giving an external
>>>> reference for the data, try to find a convenient built-in data set
>>>> (accessed by data()), so that others will be able to reproduce your
>>>> problem. Here you could use sunspots
>>>>
>>>> -to obtain the commented code... I don't know it...
>>>>
>>>> -good luck
>>>>
>>>> Matthieu
>>>>
>>>>
>>>> Hi everyone,
>>>>>
>>>>> first of all, I would like to say that I am a newbie in R, so I
>>>>> apologize in advance if my questions seem too easy for you.
>>>>>
>>>>> Well, I'm looking for periodicity in histograms. I have histograms
>>>>> of certain phenomena and I'm asking whether a periodicity exists
>>>>> in these data. So, I make a periodogram with the function
>>>>> spec.pgram. For instance, if I have a histogram h, I call
>>>>> spec.pgram by spec.pgram(h, log="no", taper=0.5). Some peaks
>>>>> appear and I would like to interpret them, but I do not know how
>>>>> they are computed, and so I do not know what a peak with a value
>>>>> of 10000 represents in comparison with a peak of value 600 from
>>>>> another histogram.
>>>>> I looked at the source code of the function spec.pgram to better
>>>>> understand what is behind it. But when I apply the source code
>>>>> line by line, I've got a problem.
>>>>> For instance, I do:
>>>>>
>>>>>> > data = scan("file.txt")
>>>>>> > h = hist(data, breaks=max(data)/5000)
>>>>>
>>>>> # then I apply the first two lines of the spec.pgram function
>>>>>
>>>>>> > series <- deparse(substitute(h$counts))
>>>>>> > x <- na.action(as.ts(h$counts))
>>>>>> > x
>>>>>
>>>>> NULL
>>>>> I do not understand why, when I apply the first two lines of the
>>>>> function, x is equal to NULL (which causes an error in the
>>>>> following lines of the code), whereas if I apply the function
>>>>> directly to h$counts it gives me a result.
>>>>> So, if someone can explain to me what the problem is and/or how
>>>>> exactly spec.pgram computes the periodogram and how to interpret
>>>>> it with my data, I would be so grateful.
>>>>> And subsidiary questions:
>>>>> - Is it possible to have the commented source code of the function?
>>>>> - I do not understand what the function na.action does in the
>>>>> second line of spec.pgram, so please explain it to me if you can.
>>>>>
>>>>> Thanks in advance for your answers.
>>>>> Best regards,
>>>>>
>>>>> Anthony Mathelier
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> --
>>> Let's not spend our time and resources thinking about things that
>>> are so little or so large that all they really do for us is puff us
>>> up and make us feel like gods. We are mammals, and have not
>>> exhausted the annoying little problems of being mammals.
>>>
>>> -K. Mullis
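
On the NULL result in the original question, here is a likely explanation, offered as a small sketch rather than a full reading of the spec.pgram source. Inside spec.pgram, na.action is an argument whose default value is the function na.fail, so the line x <- na.action(as.ts(x)) calls na.fail. Typed at the top level, the same name resolves to the generic stats::na.action(), which only extracts an "na.action" attribute and returns NULL when there is none. The vector counts below is made-up data standing in for h$counts, and na_action_arg is a hypothetical name used for illustration.

```r
# Made-up bin counts standing in for h$counts (hypothetical data).
counts <- c(5, 3, 8, 2, 7, 4)

# At top level, na.action() is the stats generic that extracts an
# "na.action" attribute. A plain time series has none, so this is NULL.
top_level <- stats::na.action(as.ts(counts))

# Inside spec.pgram the argument na.action defaults to na.fail.
# To replay the function body line by line, bind that default first.
na_action_arg <- stats::na.fail

# na.fail() returns the series unchanged when it contains no NA,
# and stops with an error otherwise.
x <- na_action_arg(as.ts(counts))

print(top_level)
print(x)
```

When stepping through a function body by hand, it seems safest to assign every argument its default value first, since argument names can shadow ordinary functions of the same name.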
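
The procedure quoted from the paper at the top of this thread (histogram, Gaussian smoothing, normalization to 1, taper, discrete Fourier transform) could be sketched in R roughly as follows. This is an illustration under stated assumptions, not the authors' code: the positions are simulated with a built-in spacing of 117 units, and the bin width, the kernel width, the zero-padding length, and the searched period range are all arbitrary choices. The "ratio of 0.5" for the Tukey window is read here as tapering 25% of the series at each end, which is an interpretation; spec.taper() from the stats package is used as a stand-in for that window.

```r
set.seed(1)

# Simulated 1-D positions (hypothetical): points scattered around
# multiples of 117, so the true spacing is known in advance.
period <- 117
pos <- period * sample(0:200, 300, replace = TRUE) + rnorm(300, sd = 5)

# All pairwise distances, restricted to the range 0..1000.
d <- as.numeric(dist(pos))
d <- d[d <= 1000]

# Histogram of distances with unit-width bins (assumed bin width).
bin_width <- 1
h <- hist(d, breaks = seq(0, 1000, by = bin_width), plot = FALSE)

# Gaussian smoothing of the counts (assumed sd of 10 bins), then
# normalization so that the total density sums to 1.
kern <- dnorm(-30:30, sd = 10)
kern <- kern / sum(kern)
smoothed <- as.numeric(stats::filter(h$counts, kern, sides = 2,
                                     circular = TRUE))
dens <- smoothed / sum(smoothed)

# Remove the mean and taper 25% of the series at each end with a
# split cosine bell (assumed equivalent of a Tukey window, ratio 0.5).
x <- spec.taper(dens - mean(dens), p = 0.25)

# Zero-pad for a finer frequency grid, then take the DFT.
n_pad <- 8192
xp <- c(x, rep(0, n_pad - length(x)))
power <- Mod(fft(xp))^2

# Frequencies in cycles per distance unit; search periods of 20 to 500.
freq <- (0:(n_pad - 1)) / (n_pad * bin_width)
keep <- freq >= 1 / 500 & freq <= 1 / 20
best_period <- 1 / freq[keep][which.max(power[keep])]
print(best_period)
```

With the simulated data the strongest peak should fall close to the built-in spacing. For real distances, the bin width and kernel width would need to be chosen to match the data, and because the density is normalized to sum to 1, peak heights from data sets of different sizes become more comparable than raw spec.pgram values on counts.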