On Sat, Jun 24, 2023 at 10:00:05PM +0200, Emanuel Berg wrote:
> tomas wrote:
>
> >> Is there a CLI and FOSS tool that creates stats from text
> >> indata - e.g.,
> >>
> >> $ txt2stats path/to/indata/*.txt
> >>
> >> I mean a general tool, but with options to tweak the report
> >> included, of course.
> >
> > If you can bear some tweaking, R is it.
>
> Sure! Let's run R on this e-mail. Does it work and if so, what
> does it say?

To a generic question -- a generic answer.

I don't even know what you mean by "general stats" -- the sports
example you put in the other mail suggests that you want statistics
gathered about a subject from written text: this is far more than
"just" stats and involves "understanding texts written in human
languages", another big can of worms (which has become somewhat
fashionable as of late).

If it's text statistics, good statistics packages have lots of
resources. R is a good statistics package with a big community,
so it has:

  https://towardsdatascience.com/a-light-introduction-to-text-analysis-in-r-ea291a9865a8?gi=001414a39e96
  https://www.r-bloggers.com/2021/02/text-analysis-with-r/
  https://bookdown.org/jdholster1/idsr/text-analysis.html
  https://m-clark.github.io/text-analysis-with-R/intro.html
  https://towardsdatascience.com/r-packages-for-text-analysis-ad8d86684adb?gi=4a426e671fe6
  https://www.springboard.com/blog/data-science/text-mining-in-r/
  https://m-clark.github.io/text-analysis-with-R/string-theory.html

That said, there are others. In the Python galaxy, there is the
Natural Language Toolkit

  https://www.nltk.org/

But your question was posed in a way that I don't even know whether
I'm wasting both our time with this answer.

Cheers
-- 
t
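
P.S. To make "text statistics" a bit more concrete: below is a
minimal sketch, in plain Python with only the standard library, of
the kind of numbers one might mean (line count, word count,
vocabulary size, mean word length, most frequent words). The names
text_stats and main are made up for illustration, the tokenizer is
deliberately crude, and this is neither the txt2stats from the
question nor anything R or NLTK provides; it only shows the shape of
the problem under those assumptions.

```python
import re
from collections import Counter


def text_stats(text, top=5):
    """Return a few basic statistics for a string of plain text.

    Hypothetical helper for illustration only; "word" here means a
    run of ASCII letters, digits or apostrophes, lower-cased.
    """
    lines = text.splitlines()
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    counts = Counter(words)
    mean_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "lines": len(lines),
        "words": len(words),
        "unique_words": len(counts),
        "mean_word_length": mean_len,
        "most_common": counts.most_common(top),
    }


def main(paths):
    """Print the statistics for each file named in paths."""
    for path in paths:
        # errors="replace" keeps going on files that are not valid UTF-8.
        with open(path, encoding="utf-8", errors="replace") as fh:
            print(path, text_stats(fh.read()))
```

Calling main(sys.argv[1:]) from a small script would give something
in the spirit of the command line from the question. Anything beyond
counting (stemming, stop words, "what is this text about") is where
the packages linked above start to earn their keep.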