
Patrick B. answered 01/13/21
Math and computer tutor/teacher
Ssounds like you need some software that does UNSUPERVISED machine learning, like clustering..
Worst case we can write something that reads every single token in the file and returns frequency table