On Sat, Jun 24, 2023 at 10:00:05PM +0200, Emanuel Berg wrote:
> tomas wrote:
> 
> >> Is there a CLI and FOSS tool that creates stats from text
> >> indata - e.g.,
> >> 
> >>   $ txt2stats path/to/indata/*.txt
> >> 
> >> I mean a general tool, but with options to tweak the report
> >> included, of course.
> >
> > If you can bear some tweaking, R is it.
> 
> Sure! Let's run R on this e-mail. Does it work and if so, what
> does it say?

To a generic question -- a generic answer. I don't even know what
you mean by "general stats" -- the sports example you put in the
other mail suggests that you want statistics gathered about a
subject from written text: this is far more than "just" stats
and involves "understanding texts written in human languages",
another big can of worms (which has become somewhat fashionable
as of late).

If it's text statistics, good statistics packages have lots of
resources. R is a good statistics package with a big community,
so it has:

  https://towardsdatascience.com/a-light-introduction-to-text-analysis-in-r-ea291a9865a8?gi=001414a39e96
  https://www.r-bloggers.com/2021/02/text-analysis-with-r/
  https://bookdown.org/jdholster1/idsr/text-analysis.html
  https://m-clark.github.io/text-analysis-with-R/intro.html
  https://towardsdatascience.com/r-packages-for-text-analysis-ad8d86684adb?gi=4a426e671fe6
  https://www.springboard.com/blog/data-science/text-mining-in-r/
  https://m-clark.github.io/text-analysis-with-R/string-theory.html

That said, there are others. In the Python galaxy, there is
the Natural Language Toolkit

  https://www.nltk.org/
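
If "stats" just means counts over the text itself (lines, words,
most frequent words), none of the above is strictly needed. A
minimal sketch using only the Python standard library, not NLTK;
the function name text_stats and the crude "letters and
apostrophes" idea of a word are my own assumptions, not an existing
tool:

```python
import re
import sys
from collections import Counter


def text_stats(text, top=5):
    """Return a few basic statistics for a string of text."""
    # Crude tokenizer (an assumption): runs of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {
        "lines": len(text.splitlines()),
        "words": len(words),
        "distinct_words": len(counts),
        "top_words": counts.most_common(top),
    }


if __name__ == "__main__":
    # Print one stats dictionary per file named on the command line.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as fh:
            print(path, text_stats(fh.read()))
```

Saved under some name of your choosing, it would be run with the
input files as arguments. Anything beyond counting (stemming,
part-of-speech tags, "who won the match") is where the packages
above come in.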

But your question was posed in such a way that I don't even know
whether I'm wasting both our time with this answer.

Cheers
-- 
t

