top | item 35012004

You can’t parse CSV with a regular expression

4 points| charly357 | 3 years ago |successfulsoftware.net

4 comments

WaitWaitWha|3 years ago

It is unclear whether "regular expression" in the short note refers to pattern-matching techniques in general, an application like regex, or a built-in operator in some language, like =~ and !~ in Perl.

I can use regular expressions to parse CSV; it is just not pretty. Regex solutions do not need to be a single pass.

For one-off scenarios, I frequently run regexes in multiple iterations to clean up the data, be it in code or on the command line, and then process it.

> This is because a regular expression doesn’t store state.

This depends on how much state I need to store and in what context (see first sentence).

charly357|3 years ago

Regular expressions are a very useful tool in a programmer’s toolbox. But they can’t do everything. And one of the things they can’t do is reliably parse CSV (comma-separated values) files. This is because a regular expression doesn’t store state. You need a state machine (or something equivalent) to parse a CSV file.
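
To make the "state machine" point concrete, here is a minimal sketch in Python. It is not the code from the article: the function name `parse_csv` and the state names are made up for illustration, and it assumes one common dialect (comma delimiter, double-quote quoting, a doubled quote inside a quoted field meaning a literal quote). It is deliberately lenient about malformed input and simply drops carriage returns outside quoted fields.

```python
def parse_csv(text, delimiter=",", quote='"'):
    """Parse CSV text into a list of rows with an explicit state machine.

    Hypothetical helper for illustration; assumes RFC 4180-style quoting.
    """
    # The parser is always in exactly one of these four states.
    FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED = range(4)
    rows, row, field = [], [], []
    state = FIELD_START

    def end_field():
        row.append("".join(field))
        field.clear()

    def end_row():
        end_field()
        rows.append(row.copy())
        row.clear()

    for ch in text:
        if state == QUOTED:
            # Inside quotes, delimiters and newlines are ordinary data.
            if ch == quote:
                state = QUOTE_IN_QUOTED
            else:
                field.append(ch)
        elif state == QUOTE_IN_QUOTED and ch == quote:
            # A doubled quote inside a quoted field is one literal quote.
            field.append(quote)
            state = QUOTED
        else:
            # FIELD_START, UNQUOTED, or just after a closing quote.
            if ch == delimiter:
                end_field()
                state = FIELD_START
            elif ch == "\n":
                end_row()
                state = FIELD_START
            elif ch == quote and state == FIELD_START:
                state = QUOTED
            elif ch == "\r":
                pass  # simplification: ignore CR outside quoted fields
            else:
                field.append(ch)
                state = UNQUOTED

    # Flush the last record if the input does not end with a newline.
    if state != FIELD_START or field or row:
        end_row()
    return rows


sample = 'name,quote\n"Smith, J","He said ""hi"""\n"multi\nline",x\n'
for record in parse_csv(sample):
    print(record)
```

The `state` variable is the part that matters: whether a comma or newline ends a field depends on whether the parser is currently inside quotes. For real work, a tested parser such as Python's standard `csv` module is the safer choice.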

version_five|3 years ago

But can I parse html?

James_Henry|3 years ago

No, and the article links to a StackOverflow question about this.