I am not an expert, but I believe your extrapolation idea is unsound.
Again, post on the R-Sig-HPC list to get expert feedback instead of trying
to reinvent the wheel. I will not respond further.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sun, Jan 30, 2022 at 3:02 AM akshay kulkarni <akshay...@hotmail.com> wrote:
>
> Dear Avi and Bert,
>
> I think I have my answer. I will run it on a small sample, check the 
> execution time, and extrapolate from that. By the way, LDA (I am using the 
> topicmodels package) cannot be parallelized, right? Thanks in advance.
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
> ________________________________
> From: R-help <r-help-boun...@r-project.org> on behalf of Avi Gross via R-help 
> <r-help@r-project.org>
> Sent: Sunday, January 30, 2022 4:15 AM
> Cc: r-help@r-project.org <r-help@r-project.org>
> Subject: Re: [R] progress of LDA algorithm...
>
> I agree with Bert that this is way off topic and one that few here know (or 
> care) about.
>
> Generally, if a package documents its functions in manual pages, those pages 
> may describe options, such as verbose=TRUE or several levels of output, that 
> would satisfy the request. Failing that, users can make a copy of the code 
> and add their own print or logging statements.
>
> If the request is more general such as how to run a program under some 
> debugging method and set checkpoints at which some reporting is done, that 
> too is a bit outside the normal uses of this forum.
>
> The usual suggestion here is to contact the package maintainer, with no 
> guarantee of getting any useful response, or to find a forum far more 
> specific than R-help; a package being partly written in R does not by 
> itself make a question on topic here.
>
> As it happens, the lda() function being discussed may (or may not) be in the 
> MASS package. Looking at the documentation, I saw no obvious hook for 
> reporting progress as it runs. Of course, Akshay can do some external testing 
> using standard R timing mechanisms to see how long it takes to process just 
> some of the news categories, without going into the details of the function 
> called, and that might partially answer his question. Asking how to do that 
> might fit the parameters here.
>
>
> -----Original Message-----
> From: Bert Gunter <bgunter.4...@gmail.com>
> To: akshay kulkarni <akshay...@hotmail.com>
> Cc: R help Mailing list <r-help@r-project.org>
> Sent: Sat, Jan 29, 2022 3:34 pm
> Subject: Re: [R] progress of LDA algorithm...
>
>
> I presume this is in some specialized package that you have not told
> us about -- topicmodels maybe? It is therefore off topic here. In any
> case, this is the sort of question for which you should contact the
> package maintainer (?maintainer).
>
> As your question may also intersect with high performance computing
> considerations, you might want to post it on the R-Sig-HPC list,
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Sat, Jan 29, 2022 at 8:27 AM akshay kulkarni <akshay...@hotmail.com> wrote:
> >
> > Dear members,
> >
> > I want to run LDA (latent Dirichlet allocation) on certain news articles. 
> > I have the following questions:
> >
> >
> >   1.  Is there any way to know the progress of the execution of the LDA 
> > algorithm?
> >   2.  I read on SO that the more memory you have, the faster LDA runs. 
> > I am using an AWS z1d instance with 48 cores and about 325 GB of RAM. I 
> > have multiple categories of news, but one of them is much larger than the 
> > others, containing about 25000 articles. Is it preferable to send the 
> > categories individually to different processors, and does R free up the 
> > memory after running on the smaller categories so that the largest 
> > category can run with more memory? Or is it preferable to first run the 
> > smaller sets, finish the job, and then run the largest category?
> >
> > Thanking You,
> > Yours sincerely,
> > AKSHAY M KULKARNI
> >
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

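For reference, the timing approach suggested earlier in the thread (using
standard R timing mechanisms on part of the data) might look roughly like the
sketch below. It uses only base R. fit_model() is a hypothetical stand-in for
whatever model-fitting call is actually used, and docs is a placeholder
corpus; neither comes from the thread. Timing several subset sizes, rather
than a single one, at least shows whether run time grows linearly before
anyone extrapolates from it.

```r
# Hypothetical stand-in for the real model fit; replace it with your own
# fitting call. It only does some arithmetic so that there is work to time.
fit_model <- function(docs) {
  m <- matrix(runif(length(docs) * 50), ncol = 50)
  invisible(crossprod(m))
}

# Time the fit on the first n documents for each requested subset size.
time_subsets <- function(docs, sizes, fit = fit_model) {
  elapsed <- vapply(sizes, function(n) {
    system.time(fit(docs[seq_len(n)]))[["elapsed"]]
  }, numeric(1))
  data.frame(n_docs = sizes, elapsed_sec = elapsed)
}

docs <- paste("article", seq_len(2000))  # placeholder corpus (assumption)
timings <- time_subsets(docs, sizes = c(250, 500, 1000, 2000))
print(timings)
```

Swapping the real fitting call in for fit_model() and plotting elapsed_sec
against n_docs gives a rough picture of how run time scales; it is not a
guarantee for the full data set.
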