top | item 7799323

Give PonyORM a chance

51 points| kozlovsky | 11 years ago |jakeaustwick.me

63 comments

order
[+] datashaman|11 years ago|reply
I do not really understand the desire for ORMs to try and recreate the experience of writing SQL. Why not just use SQL, then?

I understand that the ORM is trying to smooth over differences in implementation, providing the possibility of change from one DB access layer to another.

I have yet to see anyone do that on a real project, which makes me really wonder at the point of using an ORM at all.

I cannot quantify the amount of time I've lost figuring what the hell I must do to generate a relatively simple SQL statement.

[+] cwp|11 years ago|reply
Absolutely. In my latest project, I just have a very thin facade between the application and the DB client library, just enough to simplify the process of getting a connection from the pool, fishing the required data out of a response, handling errors etc. The application doesn't build any queries dynamically; they're all string literals with parameters that get filled in by the DB client library.

I also wrote a little utility to handle migration. Each migration is a file with SQL in it; forward migrations only, no going back. The tool makes it easy to apply a given migration to a given database (dev, staging, production or whatever) and does the book-keeping to ensure they're only applied once. Couldn't be simpler.

I'm really happy with this set-up. It gives access to all the features of the database instead of just the lowest common denominator, and makes it much easier to read the code and know exactly what's going on.

The only times I've seen an ORM work well in a non-CRUD application is when it was designed and written from scratch to work with one database to supply exactly the operations and semantics that the application needs.

[+] benselme|11 years ago|reply
Well, ORMs are useless if you're of the opinion that string interpolation and concatenation are a good way of composing logical units. Many of us prefer a proper API for doing that, and good ORMs like SQLAlchemy provide us with one.
[+] watwut|11 years ago|reply
ORM is most supposed to save you from repeating the same load-db-put-data-into-objects over and over. It is also supposed to do caching and other similar optimizations for you.

If we insist on it, SQL itself was supposed to smooth over differences in db implementation. Ehm.

If all you need in one query, then figuring out how to make ORM running may be waste of time. If you need gazimilion of them and you need results cached reasonably and so on and so forth, ORM starts being very useful.

[+] stana|11 years ago|reply
Surprised by number of negative comments about ORM-s. Twice I have had to migrate projects between different sql db engines. Once SQL was embedded in the application, second time perl-s DBIx ORM was used from the beginning. Guess which project was easier. Why would you lock yourself into using a specific db engine unless you really had to. And how hard is it to embed SQL for a few specific queries even when using ORM. Lately heavily using Django's ORM and think it is great. And if your data structures so complex that ORM does not do the job maybe something wrong with your schema. I like my data layer simple if possible, and complexity in the application.
[+] wldlyinaccurate|11 years ago|reply
> And how hard is it to embed SQL for a few specific queries even when using ORM

This is the thing that many people seem to forget. Every ORM I've used has allowed you to write your own SQL for those times where the DBAL is limited or not performing well. In my experiences, those times are rare. Some ORMs (like Doctrine) even have their own DSL so that you can write complex queries and still have the benefit of a DBAL.

I'm currently working with a team on a large project where all the SQL is written by hand, and results are hydrated into plain ol' arrays. The maintenance overhead for schema changes is huge due to the number of queries which need to be updated, and there are constant bugs caused by columns being left out of SELECT statements. On top of that, the testability of the entire app suffers because most methods talk directly to the DB. It's truly a nightmare. I've never experienced these problems with an ORM.

[+] gldalmaso|11 years ago|reply
Probably most schemas out there in the wild are "wrong" or too complex. And we have to live with that sometimes, because cosmetical changes to database are more trouble than they are worth. ORMs do not help much in this regard, in fact they probably make the problem worst.

Sometimes a schema can even be correct and the ORM will have trouble with it, Hibernate has (or had, I don't follow nowadays) a problem with n-n relationship tables that have aditional information on them, which is a perfectly fine normalization thing to do.

In fact, I would go so far as to say that if my tool dictates my data modeling, than there's something wrong with my framework.

[+] sgift|11 years ago|reply
No chance. Sorry Pony, you are probably a great ORM, but I have been burned too often by ORMs and their "magic", which will "just work" (hahahahahahaha). Hibernate destroyed my trust in ORMs forever and even SQLAlchemy (which is great) wasn't able to fully restore it. I will write my queries alone. So that I know why they misbehave and I can change it instead of trying to chase through obscure layers of magic, read various loggings (if logging is available) and hope that my changes in the ORM layer will translate to the sql I want.
[+] gldalmaso|11 years ago|reply
I have the same opinion.

I don't want magic on my database layer, because when it backfires, and it will, it might ruin the most precious thing that my application relies on.

I absolutely do not want a lack of understanding on my part to cause permanent damage to my data (already got burned by that). And since ORMs are usually complex beasts, I certainly don't want the burden of learning every single use case before even deciding how to model the thing.

The level of complexity does not balance the value added. It does not hide SQL, because you WILL have to learn both the framework and SQL to debug, and debug you will, often. It does not, in my opinion, increase productivity since I will lose many an hour figuring things out when things do not work as expected, and they won't. Simple bugs and performance issues become a great burden.

Simply put, I don't won't to forfeit control of the most sensitive part of the system, the data. Everything that goes on with data persistence should be really clear and simple, no magic.

[+] mistermann|11 years ago|reply
If you can think of one of your most representative examples of a problem you had I'd be interested to hear.
[+] anton_gogolev|11 years ago|reply
So even trivial `select id, name from bar where baz = @foo`-style queries are hand-rolled?
[+] the_mitsuhiko|11 years ago|reply
Why you should not: it's inferior to SQLAlchemy and has a dual licensing model where one is the useless AGPL and the other is an expensive commercial license.

Also I find it's implementation to be very questionable.

[+] nnq|11 years ago|reply
...expensive?! Maybe when you compare it to free. But for such a crucial piece of infrastructure, this looks like very cheap, IF its friendlyness is backed by SQLAlchemy-class code quality.

The fact that there exists good free software out there doesn't mean that one should adjust his price expectations based on this. Take Linux, it's free, but this doesn't mean it's development was not fueled by up to $100M worth of contributions, so if Linux it's free, this doesn't mean that a team developing a new OS by themselves should not charge a price big enough to quickly amass a tens-to-hundreds-of-millions-$ class profit to recoup their investments.

You give something fully for free IF and AFTER you've recovered the development cost, or after you go bankrupt or pivot to another product and you no longer have any use for it. If you do it before, you at least make damn sure nobody else can make a profit from/based-on your not-yet-paid work without giving you at least some of it, because this is how it's fair to be, this is what the (A)GPL is for!

[+] reitanqild|11 years ago|reply
As much as AGPL annoys me the pony developers seems to understand it and use it in a reasonable way.

What really annoys me is

1. when people stamp a AGPL license on something (that was previously licensed under GPL or even less restrictive) and claim that it is still just GPL and everyone could still continue using their system as before.

2. when people make something brilliant and only offer it under AGPL without providing a commercial licensing option.

[+] coleifer|11 years ago|reply
I'm the author of a lightweight python ORM (peewee) and have a healthy respect for the work that goes into building this type of tool. PonyORM is quite a feat of software engineering: it's developer(s) have done a great job.

I would never use it, though, because decompiling Python generator expressions just feels so unpythonic.

[+] acdha|11 years ago|reply
> I would never use it, though, because decompiling Python generator expressions just feels so unpythonic.

Agreed but it really left me thinking that it'd be extraordinarily useful if Python gained some supported path for doing this – a standard way to get the AST for a generator expression would be really useful and a relatively general tool. A few years back I would have loved a cleaner way for a JSON decoder to know I only needed a couple of fields to avoid creating objects for everything else and it seems like there could be some very interesting optimizations for something like numpy/scipy as well.

1. I briefly liked LINQ awhile back before Microsoft's usual baroque design and poor QA convinced me that ASP.Net wasn't actually a time-savings. The LINQ-REST support seemed nice, albeit complicated, until you hit bugs where a change in one view would break unrelated views and running in the debugger "fixed" it.

[+] Jake232|11 years ago|reply
The process going on under the hood (the whole disassemble / AST thing) is pretty unpythonic, however I would argue this then gives the developer the advantage of having a more pythonic query interface to the database on a higher level.

Providing the things happening underneath don't break (and they haven't on me yet), and it generates efficient SQL (which is does), then I'll take the hidden complexity for the advantage of the higher level generator syntax.

[+] jstsch|11 years ago|reply
I still feel that RedBeanPHP (http://redbeanphp.com) has found the right balance between direct object manipulation and SQL. I find that abstracting away SQL too much gives more headaches than it's worth — SQL is a valuable skill anyway, and often diving into SQL is much faster and easier than diving into abstraction magic.
[+] nnq|11 years ago|reply
I love redbean, it saved my butt once, but it gets lots of it's power from doing things "the PHP way", it makes linting and editor/ide autocompletion impossible, there is no one place you can look to find the part of the database schema relevant to your application, you can't just go read a models.php file and get a birds eye view of the database structure that matters and all...

It's basically the opposite of "pythonic". I can't imagine something like redbean written in a language like Python, and maybe it's for the best :)

[+] mgkimsal|11 years ago|reply
Another +1 for redbean. Haven't tried v4 with the namespace support yet, and I have a feeling I will not like it as much as earlier releases, but will probably end up needing to use it on newer projects.

The dev mode approach of 'just do stuff and I'll modify the tables' has scared off many people I've showed it to, but I love the approach - sort of a moderate 'nosql' approach without giving up structured tables for more complex queries later.

[+] nnq|11 years ago|reply
> It essentially decompiles the generator into its bytecode using the dis module, and then converts the bytecode into an AST.

Uhm, anyone knows what are the performance costs of this?

[+] Jake232|11 years ago|reply
Here's a comment on the stack overflow answer I linked to in the post:

Very performant: (1) Bytecode decompiling is very fast. (2) Since each query has corresponding code object, this code object can be used as a cache key. Because of this, Pony ORM translates each query only once, whereas Django and SQLAlchemy have to translate the same query again and again. (3) As Pony ORM uses IdentityMap pattern, it caches query results within the same transaction. There is a post (in russian) where author states that Pony ORM turned out to be 1.5-3 times faster than Django and SQLAlchemy even without query result caching: http://www.google.com/translate?hl=en&ie=UTF8&sl=auto&tl=en&...

[+] mtford|11 years ago|reply
I do a lot of heavy Django work and first glance I really like the syntax. Much cleaner that what I'm used to. I will def give this a try at some point and comment further.

Why the name PonyORM btw? I know it's superficial but I much prefer the name SQLAlchemy - has more meaning.

[+] amalashkevich|11 years ago|reply
Hi, I am one of Pony ORM authors.

The idea of Pony ORM is to provide a Pythonic way to work with the database. We think that the generator syntax is very concise and convenient.

It is named Pony because a pony is a small, smart and powerful creature - these are the features which our mapper has. Our goal is to provide non-leaky abstraction and good user experience.

[+] megaman821|11 years ago|reply
As someone who has had to maintain other people's code, please use an ORM. Usually the biggest thing to fix is a person didn't select related rows and ends up doing hundreds of queries inside a for loop.

Now the hand-written SQL people leave in SQL injection possibilities. They build up complex queries with crazy string concatenation. They either have no or a shitty data mapping layer (I mean, I really enjoy having to look at the database to figure out what fields select * from articles returns).

Obviously there are going to be queries outside of what any normal ORM can do, but every ORM I have used gives you an escape hatch to just write raw SQL when needed.

[+] makmanalp|11 years ago|reply
I'm just waiting for zzzeek to publish a blog post on how to implement this on top of sqlalchemy core, like this post [1]. :) Seriously though, sqlalchemy ORM layer already does this. Why not leverage all the existing features and support it has? Also, it's an easier sell for a $100 piece of software: "If you're using sqlalchemy, you can migrate to pony immediately."

[1] http://techspot.zzzeek.org/2011/05/17/magic-a-new-orm/

[+] zzzeek|11 years ago|reply
the AST-on-top-of-SQLAlchemy idea has been discussed for many years prior to Pony's existence. Robert Brewer's Geniusql http://www.aminus.net/geniusql/chrome/common/doc/trunk/manag... does the same thing (for Geniusql, you pass it lambdas; interpreting generators directly was more of a "future" thing, but by "future" we're talking, like four years ago :) ). He presented it at Pycon maybe in 2009, 2010 and right after I had the usual flurry of questions "Can SQLAlchemy do this?" and I said "sure! just write an adapter, easy enough". I think Robert was even interested in that.

at the end of the day the AST idea looks very nifty but IMO is too rigid to translate to SQL in a flexible enough way, and also works against the main reason you use code structures to produce SQL which is composability. When I last saw the Pony creators do a talk, the approach seemed that each time you have a given SELECT, and you'd like to add some extra criterion to it, it pretty much will keep producing subqueries of the original, because each time you can only wrap the AST construct you already have. It similarly had no ability to produce a JOIN - at that time at least, the only way to join things together was by rendering inefficient correlated subqueries. This was asked explicitly.

If they've found a way to resolve these issues while keeping true to the "AST all the way" approach and not dropping into a SQLAlchemy-style approach, that would be great. There's no reason SQLA ORM or Core couldn't be behind a similar approach as well except that nobody's really had the interest in producing it.

[+] GlennS|11 years ago|reply
The lack of data migration tooling would be a critical problem for me.

The traditional problem that ORMs are supposed to solve is change. Changing your database schema means you have to change all your queries. So instead you have a a system whereby the program that executes your queries also understands your schema and can make that change for you.

Data migrations are similar to this, only much harder and more time consuming.

[+] mantrax5|11 years ago|reply
I think once a developer reaches a certain level of experience and maturity, they realize they should stop giving any sort of ORM a chance.
[+] hirre|11 years ago|reply
ORMs basically comes from "lazy" programmers not wanting to learn SQL and proper database schema design and wanting something "that just works" instantly with little thought effort... Optimizations and handling of large data sets are some of the problems with ORMs. However, in our case we decided to use an ORM anyways because each UPDATE/DELETE/INSERT had to be signaled out to some lowlevel code, communicating with hardware. If we hadn't used an ORM (with overloaded save-methods) the user would have to keep track of all changes himself and signal the hardware manually, so in our case this was fine (we also didn't manage large data volumes)...