top | item 5221081

Sorry, Geeks, Microsoft Excel is Everywhere

172 points| csixty4 | 13 years ago |davidmichaelross.com | reply

165 comments

order
[+] ef4|13 years ago|reply
As a geek, I actually think Excel is one of the few really nice things Microsoft has created (I realize it wasn't the first spreadsheet, but MS gets credit for taking it mainstream).

It's one of the few general-purpose programs that really empowers ordinary users.

Excel is essentially a functional programming environment used by hundreds of millions of people.

[+] jgrahamc|13 years ago|reply
I agree with this. Excel is a wonderful product and I reach for it frequently when I need to get stuff done with some data. I tried to use OpenOffice and Apple Numbers and both are really poor by comparison.

Also, I've found that the combination of Perl and Unix command-line tools (cut, sort, uniq etc.) and Excel is really powerful. I can grind some data from log files or other sources into a format that Excel can read and then analyze it in Excel or get charts etc.

Excel and PowerPoint are the only Microsoft products I have on my Mac. Excel for its excellence and PowerPoint because I can exchange presentations with other people who use Powerpoint since none of the other presentation software really shares correctly with PowerPoint.

[+] Irregardless|13 years ago|reply
The simplicity and flexibility of Excel make it great for one-offs and random hack jobs that don't require a new application or full-fledged report. At the same time, it has some minor idiosyncrasies that can be infuriating and make me wary of trusting it in the hands of average end-users who need to enter/analyze important data.

Automatic type conversion is my favorite. I can't even count the number of times I've received Excel spreadsheets where data was completely lost because of it. Leading zeroes at the beginning of your account number? Excel will gladly chop those off for you. Order number looks like a date because they used the year as a prefix? No worries, Excel will change that to a standard date and completely forget the original format.

Maybe I'm just crazy, but I don't think a business-oriented application should favor convenience that much more than data integrity.

[+] jackalope|13 years ago|reply
This is why I'm continually amazed that Excel is used in business at all. I use a command line program to record my hours and can report the time I've worked on a project to the second. Unfortunately, my Excel-based timesheet sucks at math, and shaves off time when it converts my HH:MM:SS totals to the decimal figures (HH.x) required by payroll. It's so bad, sometimes the values displayed in the columns don't add up to the displayed total if you enter them in a calculator. No sweat, I only get paid for the displayed total, and at the end of the year I compute the difference and add it to my last timesheet.

I can't imagine trusting my business finances to a program that can't deliver a reasonable amount of precision. I wonder how many fortunes are made by people well-placed enough to exploit Excel's weaknesses ("Hmmm, Excel shows that we made only $10,000,000 at our bake sale. What should I do with this leftover $700,000?" or "I can use Excel to show you that I owe you less money than I actually do.").

[+] dan1234|13 years ago|reply
I've run into those type conversion problems before. The data is usually still there, it's normally that Excel has decided to use a certain display format for the cell (such as a date). Simply set the formatting back to "General" and all is fine. Usually.
[+] afterburner|13 years ago|reply
"No worries, Excel will change that to a standard date and completely forget the original format."

I hate this particular one so much.

[+] EEGuy|13 years ago|reply
>Automatic type conversion is my favorite.

Amen to that!! Yes, "Clippy" (Excel) gets a little too enthusiastic at times, but the missing piece here is metadata. So Clippy does what Clippy can without metadata, and the results can be laughable at times.

My fix for that, _if I'm sourcing the data_, is to source it as tab-delimited (not CSV-delimited) from the clipboard into an empty sheet area twice:

With the first paste, I 'visually' correct the columns' incorrectly guessed data types, but Clippy oddly doesn't attempt to reformat the data. No worries, I just select all, hit DELETE, then do my second paste using the same starting cell as my first paste. Clippy doesn't interfere this time, and my data comes up all beautifully typed as I want it. I don't know why, but the DELETE key doesn't kill the data types for cells, and I'm glad it doesn't! Not visible, not obvious, but very useful.

[+] Gustomaximus|13 years ago|reply
Depending on how you want to use the data, you may be able to get around this problem by adding a ' infront of the number. It won't show on the cell display and acts as an instruction to hold leading zeros.
[+] baak|13 years ago|reply
I absolutely hate Excel. It's not the number crunching that's the problem. Database -> Excel, Excel -> Database issues are almost unavoidable. In 2 months of working with SSIS packages, I encountered every one of these problems (I'm not the original author of this rant):

"Anyone who has worked with a database in a professional capacity for more than 20 minutes should have a list of at least 10 reasons why Excel is a monster. These probably include:

1. The way it butchers postal codes that start with a leading zero, like the town I grew up in (Granby, MA 01033 USA)

2. Dates of any kind

3. Serial numbers that have leading 0's (see #1)

4. The JET database driver for Excel. One large WTF.

5. SQL Server Integration Services Excel datasource. WTF squared.

6. The f-ing "just put an apostrophe" workaround. WTF.

6. a. The equally effective "format as text before you paste" workaround. Gives the illusion of working, only to break later.

7. Save as CSV, then reopen the CSV in Excel. Lots of magical things happen there.

8. While on the topic, CSV files, which are a whole WTF on their own.

9. The Jet database driver's "type guess rows" registry entry. WTF factorial.

The root of all this: Excel makes things that look like tables, and tables are useful for data. There is no other program that is as widespread AND makes things that look like tables, so people use Excel to make tables of data. And it's in fact really, really bad at that. It was designed for ad-hoc numerical analysis and got appropriated as a database loading and reporting tool.

I think it's actually damaged the GNP of whole nations, this Excel program. It'd be interesting to know how badly."

[+] smackfu|13 years ago|reply
Really, the main problem is that Excel (which has data types) is trying to support CSV (which has no data types) as a pseudo-native format. If Excel forced CSV files through the import wizard, and you could override a data type for each column, it would solve most of the issues. Instead, each column is implicitly treated as Auto and that fails in a lot of cases.
[+] JumpCrisscross|13 years ago|reply
"The way it butchers postal codes that start with a leading zero, like the town I grew up in (Granby, MA 01033 USA)"

Excel, by default, treats any numerical object as a number. Numbers don't have significant leading zeroes. You can change the default data type, i.e. "format", to text or even postal code to preserve leading zeroes.

[+] kyllo|13 years ago|reply
The leading zeroes problem is a constant thorn in my side. You basically have to never, ever open a CSV file in Excel.
[+] einhverfr|13 years ago|reply
> 7. Save as CSV, then reopen the CSV in Excel. Lots of magical things happen there.

Indeed. My favorite one is accounting spreadsheets tending to export accounting numbers as text columns with trailing whitespace. That was off a recent version of excel for the Mac. I understand it's a cute convention for currency formatting but it makes data transformation in a database very, very annoying.

[+] mikec3k|13 years ago|reply
Excel is not a database. The problem is people trying to use it as a database.
[+] jorgeleo|13 years ago|reply
"3. Serial numbers that have leading 0's" and "9. The Jet database driver's "type guess rows" registry entry."

One word: UPC

I still have nightmares....

[+] revelation|13 years ago|reply
At least its not parsing 01033 as octal..
[+] csharpminor|13 years ago|reply
I really like excel, but the sad truth is that outside of the tech world, many many people are incredibly un-tech-savy.

I used to be a contractor for a government agency (that will go unnamed). Granted, it's the government, not the private sector but few people realize that most computers owned by the federal government still run Windows XP. Additionally, the approval process for getting new software usually takes 1-6 months. We're talking about installing something like Google Picasa. Additionally, software updates would have to go through a clearance process, leaving my computer completely vulnerable for weeks at a time while someone (maybe) scrutinized an update to Flash.

It wasn't only the equipment – the sheer lack of ability with computers surprised me. This wasn't an isolated incident – it seemed like everyone from secretaries to managers with PhDs were barely above that scene from Zoolander. Some examples:

We had one "analyst" who had never heard of pivot tables in excel. This is someone whose job it is to analyze massive budgets. They were manually selecting cells to see the count number at the bottom of the Excel window and writing coordinates down on a piece of paper.

After having transferred to Google Applications for 9 months, there were still several people who were surprised to learn that Chrome was a web browser. One asked, "but how do you Google things?"

$1500 videochat system? Forget it, nobody knew how to use it and rarely ever tried.

I think people are starting to wake up to the importance of technology, but I really feel like employers should do more to test their problem solving ability. I am by no means an expert with VB or the more advanced aspects of excel. But my ability to research quickly and solve problems put me miles ahead of everyone else.

[+] hudibras|13 years ago|reply
>Granted, it's the government, not the private sector but few people realize that most computers owned by the federal government still run Windows XP. Additionally, the approval process for getting new software usually takes 1-6 months.

We're still using IE7 in my government office. Good times...

[+] cwbrandsma|13 years ago|reply
<off-topic>To all the Font wonks out there who tell me about the importance of the font to establish branding...blah,blah,blah. The font on this site, while cool looking is very hard to read.</off-topic>

<on-topic> Something I tell junior devs when making reports, if the data can't be exported to excel, then it isn't a report. It doesn't matter how cool your filter/sorting capabilities are, how good you make the data look, your charts could be beautiful to behold. If you can't export the data to excel then you haven't done anything of value as far as the custom is concerned, because they will ONLY look at the data if it is in Excel. In many companies, that is the only feature that is used (the export to excel).

[+] bcoates|13 years ago|reply
That's been my experience too. It actually saves a lot of time figuring out how to make fancy graphs and sort/filter/pivot functions--I just make nice semantic HTML with <table>s of values. The users can point Excel at my URL and it'll automagically turn it into a spreadsheet they already know how to use.

One thing I haven't figured out how to do is have a link on my page that triggers an Excel HTML import operation on the current page.

[+] JumpCrisscross|13 years ago|reply
Excel wasn’t the problem at JP Morgan. There was a reckless culture that thumbed its nose at rules, ignored the guidance of review committees, and tried to sweep things under the rug when they got caught. That would have happened whether the models were written in Excel or Ruby.

If anything, Excel promotes transparency in finance by allowing more people to read the "source" (even the bankers). Cutely named forgotten programmes written in J were a more fecund source of problems.

[+] doppenhe|13 years ago|reply
<disclosure>I work for Microsoft, specifically I am a Program Manager on the Windows Excel team </disclosure>

This is actually something we take very seriously and we have been building tools to improve this. Excel 2013 actually shipped with a compliance add-in. Here is some more info: http://blogs.office.com/b/microsoft-excel/archive/2012/09/13...

[+] h1srf|13 years ago|reply
Agreed. They were doing CDS deals. Every bank I've worked at has their models in COM assemblies that can be consumed by Excel. Excel lets the traders value their instruments using the models. The inputs may have been copy-paste but I'd be hard pressed to believe that the risk numbers were generated by hand in Excel unless this was some very exotic instrument which I don't think it was.

Risk numbers aren't generated by Excel but an automated process that uses the same assemblies.

[+] jgamman|13 years ago|reply
have you tried reverse engineering someone else's non-trivial spreadsheet? a programmer might build a good spreadsheet but the average person builds a nest of lookups and hacked up nested if/then statements. seriously, i think spreadsheets are un-auditable - the only way to double check something is to write your own version and check the output... now you've got 2 problems...
[+] sctgrhm|13 years ago|reply
Financial analyst here (and front end dev geek / enthusiast). It’s been interesting to see a number of recent HN articles related to Excel. I work for a Global 500 manufacturing company and can assure that Excel supports a large number of business processes and decisions.

Raw data is often stored in proprietary OLAP data stores which are provide a single version of the “truth”. The financial data is retrieved through the vendor’s Excel add-ins. Analysts can then use Excel’s basic functionality to transform and enrich the data and finally output it in a format suitable to be presented to decision makers.

Having a decent knowledge of web technologies, I’m often frustrated not to have a shiny web app that will automagically show the data in stunning tables and graphs (e.g. d3.js bliss). For me, the main reason we don’t see proper “developer made” applications in large corporations is that they do not allow for quick and fast iteration and adaptations. Here is a very typical situation in my job : A manager bursts into my office to ask the following : “Hey, I know we usually compare our XYZ monthly performance to our prior year performance and to our last forecast. Could you compare add in a comparison between the year end run rate and forecast ? Oh, and could you also a express XYZ as a percentage of ABC, it could be insightful. Thanks ! ... don’t work too late.”

After a couple of Excel ninja moves : job done, manager happy, business decisions made. If the data is wrong, I'm responsible, not the mistyped Excel formulae.

[+] mdda|13 years ago|reply
"... I’m often frustrated not to have a shiny web app that will automagically show the data in stunning tables and graphs (e.g. d3.js bliss). " : Are you looking for a web interface for manipulating OLAP data? (something like http://www.tableausoftware.com/olap)?
[+] mprovost|13 years ago|reply
I've always been a bit perplexed that there isn't a spreadsheet as good as 123 (ie, 1980s technology) as a standard part of Linux distributions. There is a massive blind spot in the open source world around spreadsheets. As far as I can tell it's down to the geeks writing open source tools not being interested in solving business problems and focusing on tools for working with code and manipulating data in text format, not cells. Which is understandable, you write tools for free for your own needs and you get paid to write tools for others. So businesses pay Microsoft for Excel.
[+] apapli|13 years ago|reply
OpenOffice?
[+] jimktrains2|13 years ago|reply
sc is a text based spreadsheet. I don't have much experience with it though.
[+] benvanderbeek|13 years ago|reply
My company's part numbers are in the form 00-0000. After enough conversations about how to convert back to this format from the date Excel changes it to, we've finally just decided to change our sku format. A rolling change though, so we'll still be dealing with it till the current 5k sku's are all EOL.

I also hate the scientific notation default, in addition to the leading zero. Guess what, UPC's exist and no one wants them in scientific notation.

[+] mpyne|13 years ago|reply
Format the cell as Text first? Enter the UPC with a leading ' to force intepretation as Text?

If you're changing SKUs anyways you may want to change it to have a letter to that Excel "guesses" correctly by default.

[+] EEGuy|13 years ago|reply
I've had success getting rid of the scientific notation by just widening the column. Maybe that doesn't work all the time, I don't know.
[+] bane|13 years ago|reply
In an old job I had to slice and dice lots and lots and lots of spreadsheets and csv dumps -- almost all one offs, or in ways that were one offs.

Sure I could hack up some scripts to do that work, but almost everytime it was quicker and easier to just use Excel as a handy-dandy swiss army knife to do all kinds of bulk data processing.

It's a stupid good tool that gets you almost dangerously far with a modicum of effort and no additional cost.

Doing the same work any other way would have meant keeping 3 or 4 engineers on staff full-time banging out code and managing databases. I or another guy on my team were able to do everything we needed in less than an hour a day, then load the results into an appropriate analysis tool.

Quite often the appropriate analysis tool was also Excel.

[+] elliotanderson|13 years ago|reply
The start of the article infers that the London Whale Trade was caused by manual handling of data between spreadsheets.

The real reason came down to a flaw in the formula they were using. From the JP Morgan report:

    ... a decision was made to stop using the Basel II.5 model and not to rely on it 
    for purposes of reporting CIO VaR in the Firm’s first-quarter Form 10-Q. 

    Following that decision, further errors were discovered in the Basel II.5 model, 
    including, most significantly, an operational error in the calculation of the 
    relative changes in hazard rates and correlation estimates. 

    *Specifically, after subtracting the old rate from the new rate, the spreadsheet 
    divided by their sum instead of their average, as the modeler had intended.* 

    This error likely had the effect of muting volatility by a factor of two and 
    of lowering the VaR.... It also remains unclear when this error was 
    introduced in the calculation.
Source: http://www.zerohedge.com/news/2013-02-12/how-rookie-excel-er...
[+] jessaustin|13 years ago|reply
The whole concept of VaR is an industry-wide consensus hallucination. Even if you do it right you're doing it wrong.

But to your point, if this formula had been in a single library procedure in version control rather than pasted and repasted into various dingy corners of various spreadsheets, this sort of error would have been less likely. Manual handling of formulas is at least as dangerous as manual handling of data.

[+] novocaine7|13 years ago|reply
I work for a mid-sized financial software company and am involved with dev on a mature, used in production, in-house Excel clone.

Initially I figured this was about the craziest thing possible, but over time I've come to realise the company derives genuine competitive advantage from this system because

1. it allows actuaries to program calculators in a language and environment that they are comfortable with. It is a lot easier to find finance guys that do excel than ones that can seriously program.

2. tbh, excel is often a really good tool for the job because it allows visualization of data as you work. If you work often with projections it kills the alternatives like numpy, matlab etc.

The system is pretty advanced and has been used in production for about 8 or so years. We have an interpretive runtime for use during dev and also a static compiler that generates c++ and creates a shared library per sheet.

Some interesting points about implementing excel:

* Most functional languages do lazy evaluation on the assumption that there's a fair amount of arguments that won't be evaluated. We find that in excel all arguments are almost always used, so lazy evaluation and thunks just add overhead if you use them in all cases. We just have special cases for IF and OR et al.

* Performance is all about cell caching - i.e. memoization - but you only really have performance problems if you want to do root finding monte carlo sims online (we do). We have a dependency tracking system so cached cells are selectively flushed only when a cell they depend on changes.

* the system generates very large amounts of static c++, sometimes hundreds of thousands of lines for one sheet - this can be necessary when the sheet has millions of cells, even though we scan for similar formulas and factor them into single functions to improve spatial locality. MSVC can compile a million line .cpp in about 5 minutes using about 1gb ram - gcc 4.6 would use all the memory on my 8gb machine and swap ad infinitum (but if you split the files it is fine).

[+] unsignedint|13 years ago|reply
I really don't mind people sending me stuff in a spreadsheet, if it's something that warrants such use.

Things like "I’ve had screenshots pasted into Excel and attached to an email. Excel is an ubiquitous file format" mentioned in the article is frustrating as well, as particularly in Japan, there's this weird practice of using it as graphing paper, by making each cell into tiny squares, and use it as free-form word processor alternative. (I'd say, PowerPoint would work better for this -- here's the thing, lowest tier of MS Office in Japan doesn't ship with PowerPoint, ugh.) Some of these misuses are actually harmful -- "graphing paper" usage of Excel causes a lot of trouble when it comes to printing, and long-term maintaining, and there's no document structures in such use.

[+] politician|13 years ago|reply
Here's one thing I don't understand. Geeks love Smalltalk image-based persistence. Geeks hate Excel documents. [1] Don't they have the same problems? How is the program state in a Smalltalk image version-controlled? Is that even a sensible question?

[1] generalization, do not take too seriously.

[+] dragonwriter|13 years ago|reply
> Here's one thing I don't understand. Geeks love Smalltalk image-based persistence. Geeks hate Excel documents. Don't they have the same problems?

They have neither the same benefits nor the same problems, though they have some overlap in each.

Also both the "Geeks love Smalltalk" and "Geeks hate Excel" generalizations are over-generalizations, and the set of geeks for whom the former is valid are not the same set of geeks for whom the latter is valid (though, again, there is some overlap.)

[+] dinkumthinkum|13 years ago|reply
So, I don't think this even in the ballpark. There are a very high percentage of "geeks" that would have no idea what Smalltalk's image based persistence even is. Excel is like the lingua franca of the financial business world. It is used an abused. The Smalltalk persistence thing is ... it's an odd comparison, let's say. :)
[+] argc|13 years ago|reply
I have worked in both a bioinformatics lab at University and a medical devices company. At both places, I have seen excel used for a HUGE amount of tasks by biologists, business people, software testers (from non-programming backgrounds) and programmers. It was used as a tool, along with perl, php, c/c++... depending on the level of complexity. Its just damn useful and everyone knows how to use it.
[+] demian|13 years ago|reply
Next time a lawyer, a doctor or an architect tells me his/her professional opinion, I'm going to answer them "sorry GEEK, that's the way it is!"
[+] Harkins|13 years ago|reply
"Sorry?" It is wonderful. People make great things and run their businesses on Excel. It has its faults and people are not programmers, but it makes an incredible percentage of businesses run.
[+] cloudout|13 years ago|reply
Excel is still used because it has High Operating Range with a Low Barrier to Entry. Someone can learn how to do simple tasks in 5 minutes, while advanced users can write macros. You can build a simple calculator or a complicated model without going through a programmers learning curve; where, you've traditionally had to learn multiple concepts before you can print "Hello World".
[+] fencepost|13 years ago|reply
Sneering at Excel is a sign of someone who doesn't understand the goals of business and in many cases why they're employed. They aren't employed to code nice apps in $language, that's HOW they do their job of making the business run more smoothly.

I've built "apps" in Excel - simple stupid crap for doctors to enter hospital charges, etc. There's no database backend, lookup is "is there already a sheet for this patient?" Creating a new sheet is clicking the shortcut to the template and entering the patient's name, the hospital and the month, saving is closing and accepting the generated file name. Training was minimal, backend is office staff, and it's lasted through 2 separate billing systems. Development was simple form layout, locking cells, adding a few dropdown lists to populate cells, and setting up a couple of button/autoclose vbscript macros.

Cheap, simple, lets doctors capture charges that are worth more in one week than I was paid for the development 5+ years ago.

[+] achy|13 years ago|reply
I came to HN as a distraction from the mind wrenching exercise of compiling 5 different copies of excel documents where everyone has butchered their version in a different, horrifying way. Luckily, this is only used to keep track of procurement data for a billion dollar engineering project... Excel is a powerful, but too easily abused, tool.