On Fri, 14-Sep-2012 at 02:03PM -0400, Earl Brown wrote:

|> Hello R-helpers.
|>
|> I've tried to recreate a parallel version of tapply() and table()
|> using a combination of the parallel functions mclapply() and pvec()
|> and papply(), but haven't been successful. In the end, I'm trying
|> to get a cross tab of two vectors. I currently (can) use
|> tapply(..., sum) and table(), and even xtabs() and ftable(), but
|> with tens of millions of words and tens of thousands of files to
|> loop over, it takes a long time, like days.
|>
|> Does anyone know of a parallel version of tapply(), table(),
|> xtabs(), or ftable()? Or has anyone created something that
|> approximates a parallel version of one of these functions?

I'm not sure I have much of an idea of what your cross tab would look
like, but from what I can ascertain, I think you could do something
along these lines:

1. Partition your data into as many pieces as you have processors
   available.

2. Specify your tapply function as the function that mclapply
   "apply"s to each tranche of the data.

3. Apply regular lapply (on one processor) to the list that results
   from step 2 to get all the bits back together again and do
   whatever summation is appropriate.

HTH

|> Thank you for your time and help. Earl Brown
|>
|> -----
|> Earl K. Brown, PhD
|> Assistant Professor of Spanish Linguistics
|> Department of Modern Languages
|> Kansas State University
|>
|> ______________________________________________
|> R-help@r-project.org mailing list
|> https://stat.ethz.ch/mailman/listinfo/r-help
|> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> and provide commented, minimal, self-contained, reproducible code.

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                   Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                      ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
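
The three steps suggested in the reply (partition, mclapply, recombine) might be sketched roughly as below. This is only an illustration under stated assumptions, not the poster's actual code: the vectors `words` and `files` are made-up stand-ins for the real data, table() is used per chunk in place of tapply(..., sum) because it returns 0 rather than NA for empty cells, and Reduce() is used in place of lapply() for the final summation. Note that mclapply() relies on forking, so on Windows it only runs with a single core.

```r
library(parallel)

# Hypothetical example data (an assumption): one entry per token,
# giving the word and the file it came from.
set.seed(1)
words <- sample(c("casa", "perro", "gato"), 1000, replace = TRUE)
files <- sample(c("f1.txt", "f2.txt", "f3.txt", "f4.txt"), 1000, replace = TRUE)

# Fix the factor levels up front so that every chunk produces a
# table with the same rows and columns, even if a level is absent.
words_f <- factor(words)
files_f <- factor(files)

# Step 1: partition the indices into one contiguous chunk per worker.
# (Forking is unavailable on Windows, so fall back to one core there.)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
idx     <- seq_along(words_f)
chunks  <- split(idx, ceiling(idx * n_cores / length(idx)))

# Step 2: cross-tabulate each chunk in parallel.
partial <- mclapply(
  chunks,
  function(i) table(words_f[i], files_f[i]),
  mc.cores = n_cores
)

# Step 3: add the per-chunk tables together on a single processor.
total <- Reduce(`+`, partial)
print(total)
```

Whether this is actually faster depends on the data: the per-chunk tables have to be sent back to the parent process, so with tens of thousands of files and a large vocabulary the full word-by-file table may be big enough that a sparse representation is worth considering.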