In such a case, the very first thing you do — aside from verifying that your backups, replication, and WAL archiving are all working — before you even try diagnosing your problem any further than "something weird is happening" is make a filesystem-level copy of your PostgreSQL cluster. (If you're running LVM, ZFS, a SAN, or any other thing that lets you take an instantaneous filesystem snapshot, so much the better. Do that, and then copy it.) Then, and only then, should you even contemplate trying to un-fuck your database.
I cannot stress this point enough. That means make a tarball, or cp the directory, or rsync it, or use whatever file-level tool you prefer; specifically and emphatically not pg_dump. (It's probably not going to make things any worse, but if you do have corrupted disk pages, pg_dump isn't likely to work anyway.)
Flailing around trying to fix things can sometimes make them worse. If you're working on your already broken data, and break it further without the safety net of a fs-level backup, you're ... well, you're worse off than you were five minutes prior, aren't you?
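For the sake of illustration, something along these lines (a minimal sketch; the data directory and destination paths are placeholders, and you'd stop the cluster or snapshot it first so the copy is consistent):

    # stop the cluster (or take an LVM/ZFS snapshot) so the copy is consistent
    pg_ctl -D /var/lib/postgresql/9.2/main stop -m fast

    # file-level copy of the entire cluster directory, permissions and all
    rsync -a /var/lib/postgresql/9.2/main/ /backup/pgdata-before-surgery/

    # or a tarball, if you prefer
    tar -czf /backup/pgdata-before-surgery.tar.gz -C /var/lib/postgresql/9.2 main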
Better, if you do have backups, just drop your database, reconstitute a new one (using your schema definition file, which you should have saved), and load your most recent backup. I've dealt with disk-file-level corruption, and the best way to deal with it, IMHO, is not to deal with it at all.
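Roughly like this (the database name and file names are placeholders; the exact restore command depends on how the backup was taken):

    dropdb mydb
    createdb mydb
    psql -d mydb -f schema.sql          # your saved schema definition
    pg_restore -d mydb latest.dump      # custom-format dump
    # or, for a plain-text dump:  psql -d mydb -f latest.sql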
Since "dribbble.com" appears to be "taking a timeout for maintenance" I can't at a glance determine what, exactly, they do. So maybe they're a bunch of hobbyists. But if not ...
I don't understand how a "real" company, even in todays overheated environment of soon-to-fail hipster startups, doesn't do (at least) one of these two things:
a) have competent employees on staff that are intimately familiar with their critical infrastructure and how to support it
or b) pay, yes, gasp, pay some other company for professional support services. A glance at the Wikipedia entry for Postgres shows some possible companies that do that. Or, as already mentioned, there are mailing lists.
But twitter? Really? That's support? For someone other than a hobbyist running a website in their spare time?
Go ahead, flame me. But my first impression is "amateur hour". My apologies if dribbble.com really is an amateur effort.
Even people intimately familiar with a piece of infrastructure - hardware or software - can't possibly know all failure modes. Asking on Twitter about an obscure error you've never seen doesn't cost anything and may very well be just one way of finding a fix. It's akin to asking on IRC or writing to a mailing list - something you seem to endorse. I fail to see how it's any less professional.
Plenty of big sites use Twitter to inform users of downtime, and I prefer that. Do you expect me to go hunting down their status page, or should it just come up conveniently in my newsfeed?
Just google "reddit status" and check the first result.
Who's to say they aren't? Spending 10 seconds asking on Twitter, and getting an 'oh yeah, I had that and fixed it with x, y and z' in response sure beats staying up until 2am fixing it.
Everyone should be running pg_basebackup + pg_receivexlog on a separate machine, preferably at a different data center.
These let you go back to any point in time. If you ran 'delete from orders where id=id', you can restore to the transaction before you ran that command.
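A rough sketch of what that setup looks like (hostnames, the replication user, and paths are made up; this is 9.2-era syntax):

    # on the backup machine: periodic base backups plus continuously streamed WAL
    pg_basebackup -h db.example.com -U replicator -D /backups/base -X fetch -P
    pg_receivexlog -h db.example.com -U replicator -D /backups/wal

    # to recover to just before the bad transaction, restore the base backup and
    # point recovery.conf at the archived WAL with a stopping point:
    #   restore_command = 'cp /backups/wal/%f %p'
    #   recovery_target_time = '2013-05-15 02:10:00'
    #   (or recovery_target_xid, if you know the offending transaction id)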
Also, #postgres on Freenode. No matter when or what you ask, one of the regulars will usually respond instantly with very high-quality advice. No single product, no matter whether it was free software or not, has ever reached this kind of support quality for me.
I feel for you guys, but it sounds like a disk error. If you're not experts, you need to do one of the following: a) roll back to a snapshot, or b) restore from backup. If you have neither of those, take a SQL backup and pray you can recreate the broken data.
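If it comes to that last resort, even a plain pg_dumpall of whatever is still readable is worth having (a sketch; the output path is a placeholder):

    # dump everything that can still be read, before any repair attempts
    # note: this can still fail on corrupted pages; keep the file-level copy regardless
    pg_dumpall > /backup/last-resort.sql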
Might as well comment here, as tweets to the account seem to do nothing. Roughly every 25th click results in a Chrome "no content" error. It's a frequent problem, and it's been happening for years.
When a problem first rears its head, we usually don't know what the actual cause is. Some problems take 5 minutes to resolve, and others keep me up until 1 in the morning. So making any kind of specific statement in the middle of a crisis is often impossible, and always imprudent.
Still true today. As I say: back up, don't fuck up; plan for the worst, and let Murphy's law work in your favour.
https://twitter.com/samuelfine/status/334499315949133825
http://www.packtpub.com/how-to-postgresql-backup-and-restore... contains more information.
Also, PostgreSQL 9.3 (out in a few months) supports data page checksums, which can detect this kind of on-disk corruption as soon as a damaged page is read.
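For what it's worth, in 9.3 that's a cluster-creation-time option (a sketch; the data directory path is a placeholder):

    # checksums must be enabled when the cluster is initialized
    initdb --data-checksums -D /var/lib/postgresql/9.3/main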
My guess is hard drive failure. Hope they have good backups.
The Postgres community is quite responsive and usually gives very effective advice, as long as you use the proper channels to communicate.
In the interest of saving time: browser / version / OS? Any browser extensions installed?