deshpand's comments

deshpand | 1 year ago | on: Bring Back Shortwave

Growing up in a somewhat remote part of India, I would tune in to the BBC and Radio Australia on shortwave to listen to Test cricket commentary. I have fond memories and owe a lot of my personal growth to SW.

deshpand | 1 year ago | on: Chess grandmaster Magnus Carlsen rejoins tournament he quit over wearing jeans

I don't know all the history, but Magnus comes off as above the game.

“I am playing at least one more day here in New York and, if I do well, another day after that,” Really? And I saw another quote that said "f*k you"

Whatever his frustrations are with the governing body, the above is unacceptable behavior. I don't understand why they need to bend over backwards, modify the rules, and mollify him.

He also accused another player of cheating after he lost, then settled out of court. And he doesn't want to participate in the world championship, but chooses to make comments about the quality of the games.

deshpand | 2 years ago | on: The Top Programming Languages 2023

Of late, I have been googling a lot of SAS and may have contributed to its rise in ranking! Not coding in SAS but moving SAS to Python. Speaking of enterprise Java, there's a ton of enterprise SAS and it's moving to Python.

deshpand | 2 years ago | on: The one-handed backhand is on the way to extinction

Not sure why you were downvoted. Fitness and equipment (racket and strings) are the reasons, if not height. As for height, above a certain level, say 6 ft 2 in, it starts becoming a liability in movement. Tennis used to have a lot of variety in the past: big serve-and-volleyers on grass, long baseline rallies on clay, slice backhands, flat strokes. Now it's monotonous and the surface doesn't matter. The game with the most payoff is to stand back at the baseline and hammer the ball.

On a somewhat related topic, fitness has taken over many sports. In field hockey, dribbling used to be a skill. India was unbeaten for decades in the Olympics, winning 8 gold medals. The introduction of artificial turf ushered in the era of strength and fitness, and the Western nations mostly took over.

deshpand | 3 years ago | on: Python's “disappointing” superpowers

I didn't mean to exaggerate. I think part of the improvement was the luxury of refactoring, which should generally reduce the bloat. And, as someone else said, part of the issue is C++/Java, not static typing. When I move from Java to Python, I also get the luxury of organizing code into fewer, more meaningful source files. I find this more readable than having to switch between different files constantly.

I have not had a chance to learn or use a language like Go. But production use of Python, including building large codebases, is real. We do resort to numba, cython, or Python APIs over compiled code.

I'm now involved in converting large codebases from SAS to Python. I don't think I will have the luxury of choosing another language like Go, for a number of reasons.

deshpand | 3 years ago | on: Python's “disappointing” superpowers

A ~100k LOC project in a statically typed system with type hierarchies, interfaces, contracts and boilerplate will boil down to ~10k LOC in a dynamically typed language like Python. A 10k LOC project will be more readable than a 100k LOC project.

Source: I have spent years coding in C++/Java, then Python. I have migrated Java projects to Python.

deshpand | 3 years ago | on: The forty-year programmer

Another piece of advice is to have a thick skin and ignore the jerks, for instance someone downvoting a note that was written with good intentions to share something learned the hard way.

deshpand | 3 years ago | on: The forty-year programmer

For longevity, I will offer one piece of advice to younger people: get away from the mouse as much as you can and use the keyboard. My working style is editing code in vim, dark screen, 2 buffers when necessary, going to the colon prompt to run the unit test or the script/driver, and doing this all day. No mouse needed until I need to check email or browse the web. There were times in my career when I was overusing the mouse and my wrist hurt. I have no issues now, 30+ years into programming.

No matter what your editor/language/framework choices are, try to minimize using the mouse.

deshpand | 3 years ago | on: What’s up with Austin?

Can you please elaborate on why? I worked in Manhattan for many years, lived there for a few as well, and moved out in 2009.

deshpand | 3 years ago | on: From Python to NumPy (2017)

I think it's helpful to keep in mind that Python is general purpose and used in many domains in addition to data analysis (no matter which side of the walrus operator you are on)

deshpand | 3 years ago | on: Coinbase stock lost over 75% value

A little over a year ago, there was an incident with a lot of sites being down and Cloudflare/Lumen being involved (per Hacker News chatter, I only read about the incident here). I bought a few shares of each, at around 39 and 11. While NET is down significantly from the peak, they have both worked out OK so far (LUMN has been paying a $0.25 per quarter dividend). LUMN has been paying down debt and has good cashflow. They are contrasting and yet in a similar space.

deshpand | 4 years ago | on: Dask – A flexible library for parallel computing in Python

In the end, it's not about an OPEN SOURCE tool being perfect but whether it is helping you solve a problem. If it did not help you and YOU don't consider it production ready, then that's fine. But you seem to argue that Dask should put this disclaimer out there. That would imply that many other open source tools, including Spark, would have to do it.

Dask has solved specific problems for us and we are grateful for it. I remain open minded about other choices and listed them with the understanding I have of them.

Switching to pandas when you can is going with the philosophy of keeping things simple. I like the flexibility of going back and forth between these as and when I choose.

deshpand | 4 years ago | on: Dask – A flexible library for parallel computing in Python

Do you have any citation on why "Dask doesn't consider their distributed version" to be ready? If it is your own view, then that's ok.

I think dask is in heavy usage in real production systems. Let me cite one such usage here, from Capital One (no affiliation, just referencing a big bank for 'production ready' purposes) https://www.capitalone.com/tech/machine-learning/dask-and-ra... (also not necessarily suggesting any rapids/GPU usage, you can decouple it from the article)

And note the article is from Nov 2019. Two years is a substantial amount of time for further improvements.

deshpand | 4 years ago | on: Dask – A flexible library for parallel computing in Python

We use dask heavily, along with the rest of the pydata ecosystem. I guess we are in the 'sweet spot' where the data doesn't fit in memory to begin with, but once we perform filtering and aggregations, we switch over to pandas. That's exactly what dask recommends too. Our datasets don't exceed 100GB right now.

Also note, dask clearly acknowledges challenges dealing with data in the terabyte range https://coiled.io/blog/dask-as-a-spark-replacement/

Most of our use-cases right now involve using multiple cores of a big instance, rather than resorting to cluster computing.

With Spark, there is an additional, steep learning curve and the complexities of dealing with cluster computing. And Spark-ML is not well known. With dask/pandas it's easy enough to feed scikit-learn and/or bring in dask-ml, just a pip install, and you can scale well-known sklearn modules effectively.

I think in the end, it's about keeping things simple. As others said, if you are already invested big in Spark/Scala/Hadoop, that may make sense for you. For non-CS folks, this will be a challenge.

As for vaex, it's very interesting. One issue is that it seems to want hdf5 and doesn't want to work with parquet. And its API is not fully compatible with pandas.

Ray/Modin: I played with it a bit; it may be a bit too new for enterprise use and more geared toward ML workloads. That's my take anyway, and it may have progressed substantially already.

deshpand | 4 years ago | on: Ask HN: How to start learning about investments?

Not easy, unless you have an ability to identify opportunities that others don't have (such as applying your domain knowledge to a new trend, as I mentioned). You may do better than the market (market == passive index fund == averages) sometimes, and worse at other times. And over a long time, you may just end up matching the market, minus any fees. With a passive index fund, the fees will be tiny.
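To make the "matching the market, minus any fees" point concrete, here is a rough sketch of how fees compound. All the numbers (7% gross return, 0.05% vs. 1% annual fee, 30 years, 10,000 invested) are assumptions picked for illustration, not data, and the model ignores taxes and trading costs.

```python
def final_value(initial, gross_return, annual_fee, years):
    """Grow `initial` for `years`, subtracting `annual_fee` from each year's return."""
    net_growth = 1.0 + gross_return - annual_fee
    return initial * net_growth ** years

# Hypothetical inputs: same gross market return, different fees.
index_fund = final_value(10_000, 0.07, 0.0005, 30)  # assumed 0.05% fee
active_fund = final_value(10_000, 0.07, 0.01, 30)   # assumed 1% fee

print(f"index fund:  {index_fund:,.0f}")
print(f"active fund: {active_fund:,.0f}")
print(f"fee drag:    {index_fund - active_fund:,.0f}")
```

Both funds earn the same market return here; the only difference in the final value comes from the fee compounding over time.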

deshpand | 4 years ago | on: Ask HN: How to start learning about investments?

Technically, beating the market should be like being in the top 50 percent of your class. The reason most big funds fail to do this is that they need to overcome the fees they charge just to match the market.

If you are investing yourself, you won't have the fees to overcome. You do need to be a bit careful around trading costs and taxes. Luck can be a huge factor too. If you have domain expertise in a specific sector, your chances of outperforming the market may go up a bit. You may be able to identify with greater certainty an opportunity to invest in a company that others haven't yet noticed. This can only happen with small companies. With bigger companies, it's hard for an outsider to possess information that others don't have.

It's also a fine strategy, IMO, to invest in indexes and then focus your time on what you can do best or enjoy that time the way you see fit (for a few people, the latter can indeed be investing)

deshpand | 4 years ago | on: DuckDB-Wasm: Efficient analytical SQL in the browser

I work heavily with pandas and dask (when you want to use multiple cores), using parquet files for storage. We see a lot of benefits in selectively bringing in duckdb into the mix. For instance, the joins are extremely slow with both pandas and dask and require a lot of memory. That's a situation where using duckdb reduces the memory needs and speeds things up a lot.

And we may not want to upload the data into postgres or another database. We can just work with parquet files and run in-process queries.
