TextAnalysisTool.NET

104 points| gadiyar | 2 years ago |textanalysistool.github.io

43 comments

totetsu|2 years ago

https://lnav.org/

Is a good Linux command line tool in the same genre

AdieuToLogic|2 years ago

> Is a good Linux command line tool in the same genre

It is also a good OS X/FreeBSD command line tool.

akoboldfrying|2 years ago

Looks nice! Especially the SQL query feature.

dash2|2 years ago

This is grep++ right?

My guess is that it's aimed more at the humanities. Hence the GUI. My experience: in the world of humanities text analysis, there are just a ton of Java programs which were funded by some academic grant. Mostly they are closed source, not updated, might have a horrible GUI, and the website is always written in 8 point font.... Don't hate them for what they are....

internetter|2 years ago

Marginally related, but this is one of the things I'm bullish on ChatGPT for. Too frequently, I've gotten hundreds of lines of malformed textual data that I need to standardize. This is nearly impossible with regex, but I can drop it into GPT and it handles it wonderfully.

BurnerBotje|2 years ago

With ChatGPT, however, there is no indication that it failed on a line. It could give you a slightly incorrect result without you noticing.
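One way to get some of that indication back is to check the returned lines yourself. A minimal sketch, assuming a made-up target format (ISO date, comma, name); `TARGET`, `check_output`, and the sample lines are all hypothetical and not from any real tool:

```python
import re

# Hypothetical target format: ISO date, a comma, then a non-empty name.
TARGET = re.compile(r"\d{4}-\d{2}-\d{2},\S.*")


def check_output(original_lines, converted_lines):
    """Return a list of human-readable problems found in the converted output."""
    problems = []
    # A changed line count means something was dropped or invented.
    if len(original_lines) != len(converted_lines):
        problems.append(
            f"line count changed: {len(original_lines)} -> {len(converted_lines)}"
        )
    # Flag every converted line that does not match the expected shape.
    for number, line in enumerate(converted_lines, start=1):
        if not TARGET.fullmatch(line):
            problems.append(
                f"line {number} does not match target format: {line!r}"
            )
    return problems


# Invented sample data, for illustration only.
original = ["3 Jan 2024 - Alice", "2024/01/04 Bob"]
converted = ["2024-01-03,Alice", "2024-01-04 Bob"]
problems = check_output(original, converted)
```

This only catches structural failures. A line that is well-formed but has the wrong date or name would still pass, so it narrows the problem rather than removing it.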

osigurdson|2 years ago

I have no idea how regex became the standard. The syntax is impossible to remember unless you write regular expressions daily. Most people only rarely need regex, so it has to be relearned every time. It is also incredibly unsatisfying to write (and read).

dextro42|2 years ago

I tried using ChatGPT (4) for format conversion. I had a draft YAML file and needed differently structured JSON with mostly the same content.

If you just want to change the format, it works. If you need more than programming skills, it seems to fail due to the amount of text.

E.g. if you have a list of items and want ChatGPT to generate a meta field that it cannot produce with simple Python code, it stops after 10 to 20 elements.

Thus at least the cloud version doesn't work so well here.

I also wanted it to help me fill out my i18n file with translations and plural forms. Even though it got every word correct, I needed to split it into multiple requests. Not sure if the API would have worked better (I used the web frontend).

For the plural forms I finally added them myself as it was way faster for my natural language than copy pasting all the small chunks. Really hoped for more help there.

Liftyee|2 years ago

Agreed. It works especially well for formatting where semantics matter, such as separating the term and definitions of flashcards. Hard to do with code, but easy with GPT.

atesti|2 years ago

Where is the source code? It looks like they only host releases on GitHub, but the license is MIT.

bramblerose|2 years ago

The MIT license just gives you permission to use the work as published. Normally that work would be in source form, but there is nothing in the MIT license requiring that. In this case, it seems that the authors chose to release the binaries under the MIT license.

brchn|2 years ago

This tool is pretty good. I used it to find the meaningful errors in giant MSBuild logs.
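For anyone without the tool at hand, the same kind of include filter can be roughly approximated in a few lines of Python. This is only a sketch: the regex assumes the common `path(line,col): error CODE: message` diagnostic shape, `DIAGNOSTIC` and `filter_log` are names made up here, and the sample log lines are invented for illustration.

```python
import re

# Assumed shape of an MSBuild diagnostic line (illustrative, not exhaustive):
#   path(line,col): error CODE: message [project]
DIAGNOSTIC = re.compile(
    r":\s*(?P<severity>error|warning)\s+(?P<code>[A-Z]+\d+)\s*:"
)


def filter_log(lines, severity="error"):
    """Return (line_number, text) pairs for lines with the given severity."""
    hits = []
    for number, text in enumerate(lines, start=1):
        match = DIAGNOSTIC.search(text)
        if match and match.group("severity") == severity:
            hits.append((number, text.rstrip("\n")))
    return hits


# Invented sample lines, not taken from a real build.
sample_log = [
    "Build started.",
    "Program.cs(12,5): warning CS0168: The variable 'x' is declared but never used [App.csproj]",
    "Program.cs(20,17): error CS1002: ; expected [App.csproj]",
    "Done building project.",
]
errors = filter_log(sample_log)
```

Keeping the original line numbers in the result makes it easy to jump back to the surrounding context in the full log.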

sebazzz|2 years ago

Whenever convenient, the MSBuild binary log file in combination with the MSBuild Structured Log Viewer is a better fit.

andix|2 years ago

If I had to choose a name for it, it would probably be "Regex 401".