"revolutionary"? It just copied and pasted the decades-old R (previous "S") dataframe into Python, including all the paradigms (with worse ergonomics since it's not baked into the language).
No other modern language will compete with R on ergonomics because of how it allows functions to read the context they’re called in, and S expressions are incredibly flexible. The R manual is great.
To say pandas just copied it but worse is overly dismissive. The core of pandas has always been indexing/reindexing, split-apply-combine, and slicing views.
It’s a different approach than R’s data tables or frames.
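For anyone unfamiliar with the term, split-apply-combine can be sketched in a few lines. This is plain Python so it runs without pandas installed; the helper name `split_apply_combine` and the sample rows are made up for illustration, and it is not how pandas implements it internally.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, value, func):
    """Group rows by `key`, apply `func` to each group's `value` list, return a dict."""
    groups = defaultdict(list)
    for row in rows:
        # Split: bucket each row's value under its group key.
        groups[row[key]].append(row[value])
    # Apply `func` per group, then combine the results into one mapping.
    return {group: func(values) for group, values in groups.items()}


# Hypothetical sample data.
rows = [
    {"species": "cat", "mass": 4.0},
    {"species": "dog", "mass": 20.0},
    {"species": "cat", "mass": 6.0},
]

result = split_apply_combine(rows, "species", "mass", mean)
print(result)
```

With a pandas DataFrame holding the same columns, the rough equivalent would be `df.groupby("species")["mass"].mean()`.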
> allows functions to read the context they’re called in
Can you show an example? Seems interesting considering that code knowing about external context is not generally a good pattern when it comes to maintainability (security, readability).
I’ve lived through some horrific 10M-line ColdFusion codebases that embraced this paradigm to death - they were a whole other extreme where you could _write_ variables in the scope of where you were called from!
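For the question above, here is a rough sketch of what "a function reading the context it is called in" can look like. It is written in Python rather than R, so it only approximates R's non-standard evaluation. The names `where`, `demo`, and `threshold` are made up for illustration, and relying on `inspect.currentframe()` returning a frame is a CPython assumption.

```python
import inspect


def where(rows, expr):
    """Keep the rows (dicts) for which the expression string `expr` is true.

    Names in `expr` resolve to the row's columns first, then to variables
    visible at the call site (the caller's locals and globals).
    """
    caller = inspect.currentframe().f_back  # frame of whoever called us
    env = dict(caller.f_globals)
    env.update(caller.f_locals)  # caller's locals shadow its globals
    # eval(expr, globals, locals): row columns are looked up before `env`.
    return [row for row in rows if eval(expr, env, dict(row))]


def demo():
    threshold = 30  # never passed to where(), only read from this scope
    rows = [{"name": "a", "age": 25}, {"name": "b", "age": 41}]
    return where(rows, "age > threshold")


print(demo())
```

The maintainability worry applies here too: `where` quietly depends on whatever names happen to exist at the call site, and calling `eval` on an untrusted string is unsafe.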
Dataframes first appeared in S-PLUS in 1991-1992. Then R copied S, and from around 1995-1997 onwards R started to grow in popularity in statistics. As free and open source software, R started to take over the market among statisticians and other people who were using other statistical software, mainly SAS, SPSS and Stata.
Given that S and R existed, why were they mostly not picked up by data analysts and programmers in 1995-2008, and only Python and Pandas made dataframes popular from 2008 onwards?
Exactly. I was programming in R in 2004 and Pandas didn't exist. I remember trying Pandas once and it felt unergonomic for data analysis, and it lacked the vast collection of statistical analysis libraries.
data-ottawa|1 month ago
aidos|1 month ago
sampo|1 month ago
xtracto|1 month ago
BeetleB|1 month ago
(Yes, yes - I know some people wish that were the case!)