top | item 13668916

Google Spreadsheets and Python

811 points| happy-go-lucky | 9 years ago |twilio.com | reply

140 comments

order
[+] Lightbody|9 years ago|reply
I love Google Sheets. It is so much more powerful than 99.9% of the users appreciate. Here is how I use it at my day job as head of PM at New Relic:

  - Fetch deals closed and lost hourly from Salesforce
  - Fetch for each of our 14k+ paid accounts usage metrics using our Insights product
  - Pull both items into a nice color-coded business dashboard that is near realtime
  - Send said dashboard out as a PDF to a bunch of stakeholders daily/weekly
  - Save PDF snapshot into Google Drive so I can easily pull up historical reports
If you haven't tried out Google App Script and supercharged your spreadsheets, give it a shot. It's also basically the only way the engineer in me gets to code much these days, so I have fun with it too :)

PS: Google Sheets is definitely much slower for pure spreadsheets than Excel, no question about that. But by using cloud-based JavaScript + custom code, I can often work around that issue by writing functions that process the data quickly and report on what I need. Ultimately I end up getting much more out of it than Excel.

[+] josephjrobison|9 years ago|reply
For all the non-programmers out there that are comfortable with spreadsheets, this site - http://codingisforlosers.com/ -is pretty epic for building out a ton of dashboards and data pipelines with Google Sheets.
[+] rebootthesystem|9 years ago|reply
I get all the good things Google Apps do. It's wonderful. Yet, until Google has a "contract with our users" that includes a level of customer service and human conflict resolution I would not touch them with a ten foot pole.

Why? Because they could take it all away --and I do mean ALL-- overnight. Email, apps, etc.

I've seen many people run into the Google ban for reasons not more complicated than not being experts and using one of Google's tools in a way that triggers an algorithmic ban. And this doesn't have to have anything to do with fraud or scams.

Had one client, probably ten years ago, who moved all of his 200+ domains to a service Google introduced at the time called "Google for Domains". A domain parking service with ads inserted into the parked domains auto-magically by Google. Prior to that all of his domains were parked at GoDaddy, who used Google to stuff ads into the domains. As soon as the service was offered he figured it was a good idea to remove GoDaddy as the middle-man in that equation.

He transferred all of his domains. There was an approval process. All domains were approved overnight. Done deal. Right?

Wrong! Two days later Google sent him an email informing him that all of his Google accounts were now permanently suspended (aka: closed) with no recourse. Reason given: Unusual click activity on some of the domains parked with Google for Domains.

This guy ran (runs) a multi-million dollar manufacturing business. The very idea that he would sit there clicking on ads to earn a quarter of a cent per click is, well, stupid.

Anyhow, this one event cost him all of his Google tools. His email, calendar, contacts, apps and whatever else he was using at the time. No recourse. Not conversation with a human being. Nothing. Done. Puff. Evaporated.

Since that one event I have been of the opinion that using any of these Google tools for a business is, to put it plainly, stupid and dangerous. You could be betting the farm on a petulant algorithm and no customer service of any kind to protect you from it.

I still have Excel files from decades ago that I can open and manipulate today. Microsoft could go out of business and I could still open these files. And the software has never lacked functionality, either directly, through add-ins or programmability.

The bottom line for me is that if something is business-critical it isn't a good idea to rely on a service that could be yanked overnight (I do mean that literally). If Google had customer service and a human (and humane) process to deal with issues it would be a very different story. I don't want free stuff. We have tens of thousands of dollars invested in all kinds of software. I want stuff I can rely on because my businesses depend on it.

The only way you are going to have to security is if you pay for well supported software that has a team behind it who understand how important it is for a business to have the ability to wake up every morning knowing that the tools they come to rely on for their daily work won't evaporate overnight.

I truly don't care how many shinny new things, candy and chocolate Google throws on the table. Until they prove they understand this one point they represent a business-killing risk nobody should be willing to accept.

BTW, this is a problem with nearly all Internet giants. Facebook, Amazon and others are horrible companies from the standpoint of how they deal with their business customers.

[+] cm2187|9 years ago|reply
...when I think that Microsoft is trying to kill scripting in office...
[+] goblin89|9 years ago|reply
There’s also a newer thing called Google Data Studio, which creates neat visualizations directly from Sheets with even less effort.

It’s a no-brainer to feed all sorts of metrics (business or otherwise) from let’s say a Django app into Sheets, and with Data Studio decision makers can have all the freedom to combine them into complicated dashboards in accordance with current business landscape, pinging engineers only if new things need to be tracked.

[+] pbreit|9 years ago|reply
Is it currently possible to authenticate to Salesforce without Oauth? Like just obtain a token?

I use App Scripts to run API calls from a Google spreadsheet which is easy when the API has simple basic auth. But I've been stumped with Salesforce.

FYI: https://developers.google.com/apps-script/guides/sheets

[+] vegabook|9 years ago|reply
Ever since I learned R and Python, I have dumped Excel. All my heavy lifting is done in a proper programming language, and for all the spreadsheet things that I need, Google Sheets is perfect. There's also something pretty magical about the sharing functionality when two people work on a document at the same time. Not to mention that all my sheets are a browser shortcut away, anywhere.
[+] johnsmith21006|9 years ago|reply
Totally agree on Google Sheets. I love the connectivity of sheets and the best application I have seen is grades at the school district when you have a lot of kids. It is so easy for me and my wife
[+] georgeaf99|9 years ago|reply
A friend who works at the Dallas Animal Shelter (a non-profit) asked me to help automate their process of generating reports from data gathered via Google Forms, which they were doing by hand every month... As a developer who is used to high technical investment into complex systems, I had to take a completely different approach to storing their information.

Google Forms + Spreadsheet + Python works perfectly for use cases like this: it is free, scaling isn't a concern, users can view and modify data at any time, and users can create Google Forms at any time. If you are trying to help a non-profit and aren't planning on maintaining their systems long-term, this is a great approach.

Code for the project: https://github.com/georgeaf99/das-care-contact-forms

[+] btown|9 years ago|reply
https://gspread.readthedocs.io/en/latest/ is great, but it's limited in what kinds of things it can set (formatting, notes, etc.). One other option is to create an endpoint on script.google.com which can access a much richer SpreadsheetApp API: https://developers.google.com/apps-script/reference/spreadsh... . You can POST to it using an auth token from the same service account oauth creds (though you need to add drive and drive.scripts to your scopes), and it can run arbitrary JS to translate reads and writes from the JSON payload/response into API calls.

As another note, we realized that far more useful than using Google Spreadsheets as the canonical backing database, was to be able to bidirectionally synchronize it with our primary database. That way, users who wanted to annotate entities in spreadsheet form could do so in GSheets, always working with up-to-date data, and keeping track of "I updated a.x in the spreadsheet, but a.y was updated upstream, so merge the two." Here were the semantics of our integration:

    Returns a list of updates between last_synced_data and sheet.
    Subsequently, if upstream_data is provided, then load it into the sheet,
    adding rows on the end as needed, or merging if there is a match in the merge_key column
    (note that any updates to the live sheet data since the last sync
    override any upstream data, and those live updates are returned without changing the live sheet).
The caller would then be responsible for taking the returned list of updates and cleaning it for the database, as well as maintaining a record of what the state of the last sync was. Essentially we maintain enough information to do a three-way merge. We've since built internal applications that allow real-time spreadsheet-like interactions in a much more domain-specific manner, but it definitely did the job for quite a while.

If there's interest in seeing open-source code for all of this, we could definitely extract from our corporate repo (we're https://www.belstone.com/ ). Let me know!

[+] arthurdenture|9 years ago|reply
One thing to note is that the new version (v4) of the Sheets API can access the same fancy functionality that Apps Script can. So if gspread moves to that (which e.g. https://github.com/burnash/gspread/issues/435 alludes to), some of those limitations will go away.

(And the sync thing is neat!)

[+] cyberpanther|9 years ago|reply
The big downside I've found using this a lot in the past is that Google has no SLA for sheets. Sure its Google, so it has high availability but things can quite often fail with no support from Google. So just be careful of building anything mission critical with this. The API can be wonky every once in a while. At one point it even took them like 6 months to fix an issue I had with the Apps Script API. If my business relied on that functionality, I would have been screwed.

The Jupyter Notebook use case sounds great but if you move to something that is more critical use a real data store.

[+] victorhooi|9 years ago|reply
Regarding your claim about SLAs - there actually is an SLA for all of the paid Google Cloud products - e.g. the following would apply for GSuite, which covers Google Sheets:

https://support.google.com/work/answer/6056635?hl=en https://gsuite.google.com/terms/reseller_sla.html

If you mean the free products e.g. Google Sheets with a consumer account - AFAIK, there isn't a contractual SLA per-se, but the uptime is pretty good =).

You can also see a status dashboard with historical status here:

https://www.google.com/appsstatus#hl=en&v=status

RSS feed link is at the bottom of that page.

(Disclaimer: I work for Google Cloud, on the Drive/Docs/Sheets side of things).

[+] rickcnagy|9 years ago|reply
Having run an (internal) app that is deeply integrated with Google Sheets for about a year, I think that Sheets is good and bad. It's _extremely_ powerful for allowing business users to easily get access to and manipulate data manually. Downside is that it really struggles past a few thousand rows when you've got more than 20 or so columns. Definitely good for a quick and dirty way to expose data to internal users, but it essentially can't scale. Engineers need to be ready to move off of it if the system works.
[+] startupdiscuss|9 years ago|reply
If anyone from Google is reading this: please add a database (yes, like a cloud version of MS ACCESS) in the Google Docs/Drive suite.

Has anyone looked at "Fusion Tables?"

A google search reveals that is what it is supposed to be.

[+] iamdave|9 years ago|reply
I've found that it also chokes when you throw Google Apps Scripts at it; or at least it did when I was developing a mail merge app for a Real Estate friend. By "chokes" I mean the application frequently locked up, cells would stop responding and I had to constantly reload the page, in a spreadsheet with only a couple of hundred rows, and maybe 12 columns.
[+] domfletcher|9 years ago|reply
I had a lot of mileage avoiding row count actually in the sheet buy using an API to retrieve data directly when a user is interested in a particular record. Then the only think you need to keep in the sheet is an index which is lighter because (as far as I can tell) it's cell count that really affects performance.
[+] audleman|9 years ago|reply
I had the same experience. I had to write a layer to slice my data up and submit no more than 50 rows, and retry on connection timeouts. Very frustrating, and so weird that a global giant like Google releases an API that's so mediocre.
[+] DontGiveTwoFlux|9 years ago|reply
One issue I ran into quite a bit was timeouts.

I had a spreadsheet with a few hundred rows I was using to create Docs and send emails. I was running from scripts.google.com. Any operation running for longer than about five minutes would time out so I had to batch my work into smaller buckets and hit run a bunch of times.

[+] vpribish|9 years ago|reply
hear hear! this was my experience too. I was ready to switch to using gsheets as my entire application UI layer when the scaling problems and desyncing started hitting me.

still fantastic for quick-and-dirty tools and prototypes though.

[+] stephen-mw|9 years ago|reply
I never published it, but I actually wrote a service that uses google sheets to manage SSH access on your servers. Users simply submit a google form with their SSH key, and it's automatically pulled by a daemon on each host. It comes complete with admin/sudo access toggling.

https://github.com/stephen-mw/gdoc_ssh_manager

[+] f_allwein|9 years ago|reply
I think there's a bit of a gap in the software ecosystem today in that there's no tool that lets semi-technical people like myself create simple applications, e.g. to read and write to databases. Bit like MS Access used to do.

Google Spreadsheets actually goes a long way (I've done some stuff using the IF function etc. in it), but obviously has its limits. So combining it with Python sounds interesting. Not sure if it's what I'm looking for, but I'll check it out.

[+] simonw|9 years ago|reply
Have you seen AirTable? It's definitely the best modern alternative to Access I've seen - easy to create tables, great web UI and a phenomenal iOS mobile app.

https://www.airtable.com

[+] buildbuildbuild|9 years ago|reply
Salesforce often ends up filling this gap in organizations that can afford it, but often results in a technical debt of layman-designed schemas persisting long into production. (cleaning up or directly working with SF data is an expensive nightmare in my experience even post-Heroku acquisition)

You are correct that a modern MS Access alternative would certainly find its market.

[+] dragonwriter|9 years ago|reply
> I think there's a bit of a gap in the software ecosystem today in that there's no tool that lets semi-technical people like myself create simple applications, e.g. to read and write to databases. Bit like MS Access used to do.

Isn't the tool that does what MS Access used to do, and that is available now, called "MS Access"?

[+] PhilipRawson|9 years ago|reply
Perhaps Bubble is close to what you may be looking for: https://bubble.is

I know several people using this as a quick backend database solution since they make it so easy to interact with data and expose a GET and POST API.

[+] tyingq|9 years ago|reply
Quickbase does that, but the pricing model doesn't work well for all use cases.

Edit: DabbleDB was awesome, and easier to learn than quickbase. But Twitter acquihired them and shut it down.

[+] ghostly_s|9 years ago|reply
Microsoft has a new suite of web-based tools called Sway, PowerApps, Flow and Dynamics365. I gather this are trying to meet this need...somewhere in there?
[+] beojan|9 years ago|reply
MS Access hasn't gone anywhere, why not use it?
[+] agildehaus|9 years ago|reply
Just because you can doesn't mean you should.

You're putting the entire Internet between you and your "database". You'll receive all the latency and general reliability problems that go with doing that.

In the v3 API, the first blank row terminated the column data set. Which meant you couldn't access rows after that blank row, which may have been accidentally inserted by a user. They may have addressed this in v4, not sure.

You have almost no ability to restrict the values the user enters. They can be literally any string and your app has to handle every possibility. Restricting type on a Sheet is not really possible.

What if a user is in the middle of editing the spreadsheet when your app attempts to access it? Since the sheet is constantly saving itself, and the user may not have completed their edits, are you getting incomplete or inaccurate data?

[+] xelathebela|9 years ago|reply
Potentially, putting the python function into a AWS Lambda function (or any other serverless function) + attaching an API gateway, you could make a RESTful endpoint for all the CRUD operators for google sheets. Could be a extremely light weight way of storing data and exposing it through REST :)
[+] stpe|9 years ago|reply
In a project I worked on the implementation of a very complex form (insurance company) with lots of conditions. Given the number of stakeholders and people involved having a say (legal, product owner, sales people, customer care, copywriter, etc) normal iteration (gathering feedback => implement => deploy => repeat) would simply take too long.

I very successfully used Google Spreadheets as a backend for this. Putting all text and the conditional logic in the spreadsheet, and allowing the form to be built as a web page based on the Google Spreadsheet data for instant preview. This allowed the different members in the project to even work in parallel (thanks to collaboration features) with stakeholders - and implementing the feedback immediately themselves DURING the meeting - to see if the results was what they expected.

I wrote a small blog post about this, unfortunately in Swedish, but hey - there's a video at least :) http://www.rebelandbird.com/hyperiterativ-prototypning-med-g...

Another powerful way I use personally is to use Google Spreadsheet as a data backend for Jekyll based static websites. Here is a Grunt plugin I did to deal with this https://github.com/stpe/grunt-gss-to-json - example usage; my retro games collection http://games.stpe.se/

[+] jonknee|9 years ago|reply
Google Forms got a big re-do in the fairly recent past and it's an excellent product. My partner uses it a lot and she calls the results her "apps" becaues they are saved to her homescreen on iOS. She has some pretty sophisticated forms going because there's a "Go to section based on answer" feature that lets you build conditions.

If you haven't checked it out in a while it's worth a shot.

[+] classybull|9 years ago|reply
I'm going to have to be a contrarian on this one.

On my team, we absolutely loathe Google Spreadsheets with a passion. Not necessarily the product itself. It actually is a truly capable and nice spreadsheet program. However, it encourages some extremely bad uses.

Because of the low friction of creating a spreadsheet versus, say, a database, business users have started using it as a substitute for something that should probably live in a database. For instance, I'm thinking of a particular portion of our sales organization that kept their business hierarchy in a Sheet, ie x person is on y team. Now, when you want to do analysis on the data you now have to use something similar to this library to retrieve the data and return it in a dataframe, which then pipes into the rest of your analysis and you end up with a finished product, be it a dashboard or some job.

What always happens is that you write your analysis, set it up to run regularly, and all of the sudden, three weeks later, you start receiving exception emails on the script. When you investigate it, inevitably some jackass business user upstream altered the "schema" of the document, breaking your downstream analysis.

In addition, once you have hundreds or thousands of these scripts running repeatedly, all reaching out to the Google API and in some cases retrieving 100s of MBs of data, Google very quickly rate limits you and you end up with dozens of scripts that broke.

Sheets is great for analysis and one offs, but you have to push back on your business users who will constantly try to use it as an operations platform because they don't know any better.

- "I need to see an analysis of xyz data."

- "Ok, where does the data live?"

- "Well, all of the facts come from this SQL table, and we pull from this web service, and finally we run it through this lookup table we maintain in Google Sheets."

- "Nope. Come back to me when you put that data somewhere else. Spreadsheets are not databases."

[+] eob|9 years ago|reply
Cloudstitch founder here (YC S15).

We wrap Google Sheets and Excel 365 in an API you can use to GET/PUT data along with a number of other goodies:

   - security policies (e.g., row-based auth)
   - file upload support
   - email triggers upon upload
   - a notion of "frozen" releases versus live dev data
     (released but yet undocumented -- email hello@ for details)
Very happy to provide support for anyone looking to use us as an API into their sheets.
[+] mixmastamyk|9 years ago|reply
Interesting to choose this over simpler storage, such as sqlite. I suppose you get offsite backups for free.
[+] gregorymichael|9 years ago|reply
(author here) Main advantage in our usecase is getting the CUD interface for free.
[+] vpribish|9 years ago|reply
multi-user UI, sharing/access control already there, versioning of data, backups. those are all really nice to have
[+] vgt|9 years ago|reply
Google Sheets fully integrates with Google BigQuery as well! [0]

You can create a BigQuery table that's powered by a Google Sheet, or export data straight to Sheets for quick and dirty data wrangling. Some folks use Google Forms or Google Sheets to update ledgers, SKUs, or what have you, with data flowing straight to BigQuery for analysis against clickstream data, server logs, Google Analytics data, weather, etc.

Love AppScript - so many possibilities for a "serverless" framework. How about ODBC?

[0] https://cloud.google.com/blog/big-data/2016/05/bigquery-inte...

(worked on BigQuery, work in Google Cloud)

[+] szopa|9 years ago|reply
Huh. I bet that in the Docs SRE team at Google there's a senior SRE who's having a "I felt a great disturbance in the Force moment." Google Sheets is awesome, people at Google are smart, but they are not capable of magic.

In the case of Sheets, the most reasonable sharding scheme for the database backing it would be based on some collection of spreadsheets (in the extreme case it would be just one spreadsheet, but that probably wouldn't be practical). The important assumption here is that all the traffic pertaining to one spreadsheet will ultimately go to one server. So, if there's too much traffic coming related to one spreadsheet, you cannot really scale horizontally by adding more servers – you have to give it more resources. A single Google Sheet is sometimes used by lots of users at the same time, but they are able to limit it from the frontend (for example by making it degrade gracefully).

If the traffic comes from the API, it's going to be a little bit more tricky – especially that IIRC the current rate limiting for the API uses a daily quota, leaving room for some really nice spikes.

So, as soon as a website built using this hack get somehow popular someone is going to have a really rotten day with a lot of pages.

(Of course, this is pure speculation on my side, I have no idea how Google Sheets is actually built – I would be very curious if there's a smart way of overcoming this sort of issue.)

[+] nl|9 years ago|reply
One of the good things to come out of Google Wave was a better understanding of how to scale using (eg) operational transform. Even without that, scaling a mostly read only spreadsheet backend isn't as hard as you are making out.
[+] rob-olmos|9 years ago|reply
Like another commenter said, the built-in Apps Script is pretty powerful as it is that it could probably handle requirements more easily and portably.

For example, I've created a simple Apps Script for a Google Spreadsheets that uses OAuth to sync the list of client accounts from QuickBooks Online allowing us to keep track of more structured notes and segmentation, which QBO is still sorely lacking. The sync allows someone to easily update the spreadsheet with any new clients not in the spreadsheet.

However, if I was going to be doing any in-depth calculations and/or a lot of data then I'd probably go the same route with using Python externally.

Forgot to mention: The only thing I'm wary of, and have seen other comments in other threads about, is using it for anything more crucial like a core Excel/Access app because of the potential future change that'll break things.

[+] dy|9 years ago|reply
I went pretty far down the path of trying to build applications with Google Spreadsheets and Apps Scripts (we still have some major business processes running on it). It's definitely made me feel the need for something between Spreadsheets and custom development or Salesforce - something like what I imagine MS Access solved in the earlier days.

Here's what it'd need:

- Spreadsheet like UI - Validations and field data types (major liability of spreadsheets) - Good forms integration - Custom code like Apps Script capability - Something around workflow and business logic

I think Airtable gets pretty close but I haven't been willing to switch cause of some limitations but I do like where they're going. I'm considering pulling the trigger for Salesforce but keep hearing that it's really easy to shoot yourself in the foot.