crazybit | 5 years ago | on: Jepsen Disputes MongoDB's Data Consistency Claims
crazybit's comments
crazybit | 7 years ago | on: Apple introduces macOS Mojave
It almost ruined them (desperately holding on to their drastically inferior CPU). What could go wrong this time? Especially when they no longer have Steve running the show.
crazybit | 7 years ago | on: Remembering When Only Barbarians Drank Milk
crazybit | 7 years ago | on: The great video game exodus
Game developers: very sexy, easy to show off, has a (very) large supply of talent. Pay not bad, not good, work schedule very demanding.
COBOL developer: extremely unsexy, you almost don't want to admit it publicly. Tiny supply, much larger demand by comparison. Pay very good. Work schedule very predictable, not demanding. Almost a vacation.
Supply and demand. Ignore at your own peril.
crazybit | 7 years ago | on: Judge rules that Amazon isn't liable for damages caused by a hoverboard it sold
But doesn't it also make sense for Amazon to be liable for allowing 3rd parties to sell counterfeit or not-as-advertised items?
Also, how far does this go? Where is the line crossed?
crazybit | 7 years ago | on: Deep learning: a critical appraisal
crazybit | 7 years ago | on: Reinforcement Learning Jupyter Notebooks
1) 16 notebooks from the book "Python Machine Learning" by Raschka & Mirjalili https://github.com/rasbt/python-machine-learning-book-2nd-ed...
2) Linear Regression, Logistic Regression, Random Forests, and k-Means Clustering notebooks by Nitin Borwankar https://github.com/nborwankar/LearnDataScience
3) scikit-learn tutorial notebooks by Jake VanderPlas https://github.com/jakevdp/sklearn_tutorial
4) Lots of deep learning notebooks from the book "Deep Learning with Python" by François Chollet https://github.com/fchollet/deep-learning-with-python-notebo...
Bonus) Jupyter notebook on AWS tutorial (when your local computer just won't handle your notebook requirements): http://efavdb.com/deep-learning-with-jupyter-on-aws/
Please share your Jupyter notebook recommendations.
What do I use in this situation?
1) I need to store 100,000,000+ json files in a database
2) query the data in these json files
3) json files come from thousands upon thousands of different sources, each with their own drastically different "schema"
4) constantly adding more json files from constantly new sources
5) no time to figure out the schema prior to adding into the database
6) don't care if a json file is lost once in a while
7) only 1 table, no relational tables needed
8) easy replication and sharding across servers sought after
9) don't actually require json, so long as data can be easily mapped from json to database format and back
10) can self host, no cloud only lock-in
Recommendations?
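To make the shape of the problem concrete, here is a minimal sketch of points 3, 5, 7 and 9: one table, no schema declared up front, and json going in and coming back out unchanged. It assumes Python's built-in sqlite3 purely as a stand-in for whatever store gets recommended. The table name `docs` and the helpers `open_store`, `add_doc`, `get_path` and `find` are made up for illustration. The query does a full scan in application code, which obviously would not hold up at 100,000,000+ documents; it only shows the access pattern I'm after.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a single-table store; documents are kept as opaque json text."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "id INTEGER PRIMARY KEY, source TEXT, body TEXT)"
    )
    return conn


def add_doc(conn, source, doc):
    """Insert a document without inspecting or declaring its schema."""
    cur = conn.execute(
        "INSERT INTO docs (source, body) VALUES (?, ?)",
        (source, json.dumps(doc)),
    )
    conn.commit()
    return cur.lastrowid


def get_path(doc, dotted):
    """Follow a dotted key path through nested dicts; None if it is absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def find(conn, dotted, value):
    """Yield (id, document) pairs whose value at the dotted path equals value.

    Full scan in Python: illustrative only, not a scalable query strategy.
    """
    for doc_id, body in conn.execute("SELECT id, body FROM docs"):
        doc = json.loads(body)
        if get_path(doc, dotted) == value:
            yield doc_id, doc
```

Usage would look like `add_doc(conn, "sensor-feed", {"device": {"id": "abc"}})` followed by `list(find(conn, "device.id", "abc"))`, with documents from other sources that lack that path simply not matching.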