n_are_q
|
14 years ago
|
on: Amazon DynamoDB – a Fast and Scalable NoSQL Database Service from AWS
You will have to create an attribute for your json, where you'll store the json utf-8 encoded. If you want to index on parts of that json blob you'll have to pull them out into their own separate attributes and then recombine them into a single json object on read.
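A rough sketch of that split-and-recombine pattern, in plain Python with no AWS client involved. The attribute name `doc_json`, the helper names `to_item` and `from_item`, and the idea of passing a set of indexed keys are all made up for illustration; the item is just a dict standing in for whatever a real client would write.

```python
import json

# Hypothetical attribute name for the stored blob; not anything DynamoDB defines.
BLOB_ATTR = "doc_json"

def to_item(doc, indexed_keys):
    """Split a dict into top-level indexable attributes plus one JSON blob."""
    item = {k: doc[k] for k in indexed_keys if k in doc}
    rest = {k: v for k, v in doc.items() if k not in indexed_keys}
    # The remainder is stored as UTF-8 encoded JSON text.
    item[BLOB_ATTR] = json.dumps(rest, ensure_ascii=False).encode("utf-8")
    return item

def from_item(item):
    """Recombine the separate attributes and the blob into one object on read."""
    doc = json.loads(item[BLOB_ATTR].decode("utf-8"))
    doc.update({k: v for k, v in item.items() if k != BLOB_ATTR})
    return doc
```

Calling `to_item(doc, {"user_id"})` would leave `user_id` as its own attribute and fold everything else into the blob; `from_item` reverses it.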
n_are_q
|
14 years ago
|
on: Amazon DynamoDB – a Fast and Scalable NoSQL Database Service from AWS
Any word on when boto will support this?
n_are_q
|
15 years ago
|
on: How does Google Analytics measure site speed?
I really wish GA tracked all the metrics exposed by the timing spec, instead of combining them into one overall value. It's great that this reports numbers from actual users' machines instead of a headless render process on a monitoring server somewhere though. Here is hoping FF implements support for this soon.
n_are_q
|
15 years ago
|
on: AIM Google Talk Federation now live
Right, that makes sense, thank you for the explanation. I guess interoperability between the two services implied to me they could speak each other's protocols in entirety (at least ideally), so i pictured gtalk talking to the aim chat room via its protocol. Oh well.
BTW this has implications for google apps users since there are no usable persistent chat rooms in gtalk. Well i guess any users, but for businesses chat rooms are more useful than for casual users I think.
n_are_q
|
15 years ago
|
on: AIM Google Talk Federation now live
What about interoperability with aim chat rooms?
n_are_q
|
15 years ago
|
on: Google APIs Discovery Service: one API to find them all
Interesting to see json schema used. I was just using it for something similar in my own api framework (although mostly for validation and serialization, not just discovery) and the public interest in the spec seemed mild at best. Good to see the idea catch on a little, though it would have been even better if they released a full python implementation of it instead of just hard coding around the few pieces they actually use. There actually aren't any full implementations at all right now.
http://json-schema.org/
n_are_q
|
15 years ago
|
on: Why Not All Earnings Are Equal; Microsoft Has the Wal-Mart Disease
> Also, Moore's law is still relevant in the mobile market meaning that people replace their phones fairly frequently to get better hardware support. A mobile phone from 5 years ago is clearly inferior to most people whereas a computer from 5 years ago is mostly adequate for most users.
Agreed, and also the dominant software platform has not emerged yet, like it has with the PCs. The market is still highly fragmented and applications are routinely being written for many platforms. There is still time for someone other than Apple and Google to build something there.
In the meantime, MS has quite a bit of time I think. Those mobile devices will remain "niche" devices for a long time, as these platforms mature. Right now using my phone while i'm outside the house or using a tablet on my couch is great, but that doesn't mean i can dispense with my desktop, where MS is king. That will remain for probably a long time.
The holy grail in my opinion is a single device that i can use as a phone, and also hook up to monitors + peripherals and get a full unabridged desktop experience. Why have two devices when you can have one. Obviously the Atrix is already a step towards that, so i'm not pointing out anything particularly new. Of all the existing players, I think MS is potentially even better positioned than Apple to get in on that game since they already have a lock on the desktop platform.
n_are_q
|
15 years ago
|
on: Google Instant cannot be turned off..
Google instant is where google shows you search results as you type, going beyond auto completing your query. Some people like myself find that annoying, and it's doubly annoying that google actually resets your preferences back to what IT thinks is better for YOU.
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
I've used ORMs before i worked at myspace. NHibernate specifically. I've also used sql alchemy on the python side. NHibernate was in a professional environment, sql alchemy was a bunch of stuff I did for evaluation purposes, so you can discount that if you like.
And i'm not working at a data warehouse.. why do you think that?
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
It's interesting to hear that has worked well, obviously this wasn't a small project. Your point about knowing how to use your tool definitely rings true. Also interesting that you had a use case where data loss and integrity actually mattered and in real time, unlike a social network or most start ups operating today. Going with a heavy oracle system instead of trying to roll your own creative distributed architecture definitely seems to make sense in that scenario. Just out of curiosity, was this Java/Hibernate?
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
Query results you will normally adapt to specific object properties, and at that point the only thing that can bite you is if your query starts returning columns of a different sql type, which of course you can't catch at compile time anyway. If you wrap your query results with objects and maintain an interface into your update statements via method calls (which obviously have type checking for arguments), I don't see how you can run into serious trouble. In the dynamic language world you of course don't have compile time anything, but you can use pretty much the exact same techniques to ensure you don't pass something bad to your query. I guess this isn't very dynamic, but that's the idea - your data access logic lives in the database, you execute methods and get back objects. That's the no orm way.
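Roughly what that looks like in a dynamic language, as a minimal sketch using Python's built-in sqlite3. The `users` table, the `User` class and the `UserStore` methods are invented for the example. sqlite has no stored procedures, so the sql sits in the Python methods here instead of living in the database as described above; the shape of the interface is the point.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

class UserStore:
    """All sql is behind these methods; callers only see methods and User objects."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def get(self, user_id: int):
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        # Adapt the raw row to specific object properties.
        return User(id=row[0], name=row[1]) if row else None

    def rename(self, user_id: int, new_name: str) -> None:
        # Check arguments before they ever reach the update statement.
        if not isinstance(user_id, int) or not isinstance(new_name, str):
            raise TypeError("rename(user_id: int, new_name: str)")
        self.conn.execute(
            "UPDATE users SET name = ? WHERE id = ?", (new_name, user_id)
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
store = UserStore(conn)
store.rename(1, "bob")
```

Application code never builds a query string; it calls `store.get(1)` and gets a `User` back, or `None` if there is no such row.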
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
There is one thing that I think can help with the sql overhead you mention - if you have a rock star dedicated sql person that can take all this work off your hands (that's not me btw, i've just worked/am working with such people). I think it affords you easier long term growth if you have expectations of making it to the medium to large company world, while not slowing you down when you are small, so I think it's a better strategy for both small and large companies. Are you signing on for a potential bottleneck? Yea it is a trade off and it is paramount you hire well in that area, but that's the sort of problems and decisions you have to make all the time at a company.
I understand where you're coming from and what you describe may be workable in a smaller company with 7-14 devs where everyone knows what they are doing and understands well what happens under the hood. I think it's less likely to work at a company with 50+ devs though where you inevitably start trusting people less, or just at a company where you don't trust everyone. I've worked at both types. There is also the question of the complexity of your data and the way you need to query it. Right now we do essentially a ton of graph queries that we optimize highly in sql (ends up working much faster than any graph database since the schema and the queries are optimized for the exact data we are working on). Some of the functions that I write for this would not be implementable in an orm. I suppose that could be the case where you drop down into raw sql, but that happens to be a fair chunk of our code.
Maybe you can make it work better than I'm expecting, but if you were starting from scratch would you really want to go down that path anyway, all things considered? My original argument was that you are better off choosing a different way. I suppose that point of view will be difficult to change for me :).
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
The difference at a high level is that sql has a syntax and set of capabilities that is quite unique, and every single database vendor has its own extensions or differences driven by their particular approach. To really replicate all of this in code you would have to go beyond the basic data structures and syntax of that programming language. And at that point might as well just have sql. It's a paradigm and an approach expressed through its own syntax, you can't easily copy all of it in a totally different programming language..
As for checking for type safety, I think frameworks that do sql-to-object mapping (with type safety), and also handle cache for you, are a very useful thing. Making raw calls on database connections is definitely too far "in the other direction" :).
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
Things like lazy loading are a red flag to me that you are doing something wrong, so if your framework allows you to do that, that's not necessarily something to brag about :). Random IO that is triggered by merely accessing a property without knowledge of the programmer is not the best approach if you want to scale, you are better off doing deliberate fetches as a result of previously fetched data. If you are breaking and composing queries, how are they broken and composed by the orm, as joins or as sub queries? If as joins does your orm know the best columns to join on? You could replace everything with named sql functions (dropping to the lowest level of optimization as you mention above), but at that point what is your orm really doing for you. Anyway, sorry, I'm not sold :). Maybe if you effectively replicated the database engine in your front end framework I would come closer to being sold, but even then you don't have the same rapid in memory access to statistics about tables to make the right optimization decisions, etc..
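To make the lazy loading complaint concrete, here is a small sketch with Python's sqlite3 that counts queries for the two styles. The tables, the `LazyUser` class and the `stats` counter are all invented for the example, and `LazyUser` is only a stand-in for how an orm might behave, not any particular orm's implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

stats = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it, so the two styles can be compared."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()

class LazyUser:
    """Stand-in for an orm object: touching .posts silently hits the database."""

    def __init__(self, user_id):
        self.id = user_id

    @property
    def posts(self):
        return run(
            "SELECT title FROM posts WHERE user_id = ? ORDER BY id", (self.id,)
        )

def titles_lazy():
    # One query for the users, then one hidden query per user.
    users = [LazyUser(uid) for (uid,) in run("SELECT id FROM users ORDER BY id")]
    return {u.id: [t for (t,) in u.posts] for u in users}

def titles_deliberate():
    # One query for the users, then one explicit fetch driven by those ids.
    ids = [uid for (uid,) in run("SELECT id FROM users ORDER BY id")]
    marks = ",".join("?" * len(ids))
    rows = run(
        f"SELECT user_id, title FROM posts WHERE user_id IN ({marks}) ORDER BY id",
        ids,
    )
    out = {uid: [] for uid in ids}
    for uid, title in rows:
        out[uid].append(title)
    return out
```

Both functions return the same data, but the lazy version issues one query per user on top of the first one, while the deliberate version always issues two no matter how many users there are.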
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
My experience is from writing a bunch of middle tier code at MySpace in the 06-07 time frame, the myspace heyday when they were pushing more traffic than google (true story). Anyway, the user facing product might have sucked, but we did scale (that's why friendster was friendster and we were myspace :). In an environment with 450+ million users, we had extensive caching systems and still had to use every sql trick in the book to get our systems to scale well. I know because my job was working with the DBAs to bridge the sql and front end worlds together. I can say with great certainty that front end developers who did not know sql and were simply following a logical object model would not have produced code that scaled in our environment, there were way too many things that were done that were extremely non-obvious. Since myspace i've been working at a python/postgres start up where we've been applying the same principles pretty successfully, at a much different scale of course. If nothing else, i think the no orm approach will at least give you more bang for your buck.
Separating your data access code out of the application logic also allows you to change it much more easily as data conditions change, including on the fly, without an application deployment. That's often extremely useful.
MySpace scale may be at an extreme end of the spectrum, but we had formidable hardware to throw at it too (although x86, so nothing TOO crazy). So I think the ratio of hardware to scale at other sites is comparable, and so I think the same lessons apply. I have no experience working with oracle, but would you say that a 7 node oracle cluster is some pretty serious hardware? I myself really don't know, but it is a question I have :).
EDIT: I'm not discounting your experience, i just want to point out that i've experienced conditions where I think the orm approach would have broken down. If others have had different experiences, the more data points the better, but i think the scale/complexity/cost(hw) ratios play into the debate as well.
EDIT #2: Oh and I forgot to mention that the automated test suite you had is an incredible asset, and no doubt made it easier to discover problems early and deal with them effectively. But you do have to invest resources in creating one, and something like that is no small cost at a start up.
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
Sure, I'm just trying to speak up for some of the less popular technologies in today's dev community :). SQL server has the downside of costing money, but i think it should be in the equation when making technology decisions today, it's a vastly superior piece of tech compared to mysql and postgres :).
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
Wrapping both caching logic and database access in an ORM like system is no doubt the right thing to do. Letting front end developers write queries to be converted by an orm and reviewed by a DBA later - in my opinion that's not the most efficient method of development. I probably would have invested in an extra DB person or two to help write the data access logic. But hey, I can't argue with results - if it worked for you that's great. But as a general statement I think that sort of development methodology is highly conducive to errors and systematic problems that would not become evident until later, and at that point take a great deal of effort to fix.
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
Tying yourself to LINQ is one thing, tying yourself to MS SQL Server is completely fine if you're ok with the licensing fees. SQL Server is probably one of the best enterprise databases out there right now, I imagine right behind oracle.
n_are_q
|
15 years ago
|
on: Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning
If you are building anything more complex than a blog site and expect to take a decent amount of traffic, to the point that you may in fact care about optimizing at all, going with an ORM that writes sql for you is a really really bad idea. I really don't understand the fascination with ORMs today. Some sort of sql-to-object translation layer is no doubt a great thing, but any time you write "sql" in a non-sql language like python or ruby you are letting go of any ability to optimize your queries. For reasonably complicated and trafficked websites that's a disaster simply waiting to happen. This isn't just blind speculation on my part, I've heard a great many stories where very significant resources had to be dedicated to removing ORM from the architecture, and the twitter example should be familiar to most.
I would go so far as to say that sql writing ORMs are a deeply misguided engineering idea in and of itself, not just badly implemented in its current incarnations. You can't possibly write data access logic entirely in your front end and expect some system to magically create and query a data store for you in the best or even close to the best way.
I think the real reason people use ORMs is because they don't have someone at the company that can actually competently operate a sql database, and at any company of a decent size traffic-wise that's simply a fatal mistake. Unless you are going 100% nosql, at which point this discussion is irrelevant.
n_are_q
|
15 years ago
|
on: Introducing Druid: Real-Time Analytics at a Billion Rows Per Second
Aren't metrics such as "seconds on an ec2 instance" not particularly meaningful because you get highly variable performance per instance based on who else is using the actual hardware? Am I correct to assume that m2.2xlarge instances are shared like other instance types?