
Ship Shape

327 points | SerCe | 2 years ago | canva.dev

81 comments

[+] jmiskovic|2 years ago|reply
IMO the RNN is overkill for this problem, compared to a simple and elegant algorithm called the "$1 unistroke recognizer". That one works beautifully even when trained with just a single sample of each gesture.

I hope $1 unistroke gets more recognition because it can be integrated in an afternoon into any project to add gesture recognition and make the UI more friendly.

It works quite reliably for Palm-style "Graffiti" text entry, as long as each letter is just a single stroke. The original paper also makes a great effort to be readable and understandable.

https://depts.washington.edu/acelab/proj/dollar/index.html
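For anyone curious, the core of the $1 recognizer fits on a page. Here is a minimal Python sketch of its main steps (resampling the stroke to equidistant points, normalizing scale and position, then nearest-template matching by mean point distance); the rotation-invariance search from the paper is omitted for brevity, and all names here are illustrative rather than taken from the reference code:

```python
import math

N = 64  # number of resampled points per stroke

def path_length(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def resample(pts, n=N):
    """Resample a stroke to n roughly equidistant points."""
    pts = list(pts)
    interval = path_length(pts) / (n - 1)
    out, acc = [pts[0]], 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= interval:
            # Interpolate a new point at the target spacing.
            t = (interval - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q becomes the new previous point
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # floating-point rounding can leave us short
        out.append(pts[-1])
    return out

def normalize(pts):
    """Scale to a unit box and translate the centroid to the origin."""
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w, h = (max(xs) - min(xs)) or 1, (max(ys) - min(ys)) or 1
    pts = [(x / w, y / h) for x, y in pts]
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    return [(x - cx, y - cy) for x, y in pts]

def recognize(stroke, templates):
    """Return the template name with the smallest mean point distance."""
    c = normalize(resample(stroke))
    best, best_d = None, float("inf")
    for name, tmpl in templates.items():
        t = normalize(resample(tmpl))
        d = sum(math.dist(a, b) for a, b in zip(c, t)) / N
        if d < best_d:
            best, best_d = name, d
    return best
```

Note that without the rotation search, matching is direction- and orientation-sensitive, which is exactly the clockwise-vs-counterclockwise issue mentioned elsewhere in this thread.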

[+] ajnin|2 years ago|reply
A big issue with the $1 recognizer is that it requires strokes to be drawn in a specific way. For example, to draw a circle you need to go counterclockwise; if you go clockwise (as seems more natural to me), it gets recognized as a caret. This makes it not really usable in a context of free drawing, where the users are not aware of the details of your implementation.
[+] soamv|2 years ago|reply
People here testing out the example on this page and reporting errors seem to be missing the fact that this demo is "trained" on one example. The linked paper[0] goes into error rates, and they get better pretty quickly with a few more examples.

[0]https://faculty.washington.edu/wobbrock/pubs/uist-07.01.pdf , page 8

[+] dist-epoch|2 years ago|reply
I've just tried it, and it's pretty bad, without training at least.

My rectangle is recognized as a caret, and my zigzag as a curly bracket.

And it doesn't support drawing a shape in two strokes, like the arrow for example.

[+] DragonStrength|2 years ago|reply
I played with this for a bit and found it too simple. If you don't draw the example shapes exactly, it confuses them. I recommend playing with "delete" versus "x" from the example shapes to see just how poorly this does. I could not get it to consistently differentiate between different drawing techniques.

This would certainly get you started for gesture interfaces, where drawing a shape the same way every time is expected. It would not be a good fit for the use case here of diagramming.

[+] foobiekr|2 years ago|reply
it does not work as well.

I have this deep seated fear that NNs will be the death of the lessons learned from 1970-2010. After all, if you can use massive amounts of compute to materialize what seems to be a good enough function approximator, why do advanced algorithms at all?

Obviously the reason we should is that approximators like the NNs have explainability issues and corner case unpredictability issues plus they are bad at real world complexity (which is why self driving efforts continue to struggle even when exposed to a narrow subset of the real world).

[+] karaterobot|2 years ago|reply
> However, if you’re anything like us, even a simple straight line drawn with a mouse or a trackpad can end up looking like a path trod by a tipsy squirrel. Don’t even get us started on circles and rectangles.

But who needs to draw shapes with their mouse in Canva? Years ago, Miro had a feature that converted your flailing attempts at drawing a star with a mouse into a geometrically precise star (or circle, or triangle, or whatever). I thought it was super cool, but then I never, ever needed to use it. I never need to do line drawing with my mouse: if I'm making diagrams, I just use pre-made shapes, which are faster. If I am making icons, I use a whole different process centered around Boolean operations and nudging points and the Pen tool—and I am probably using a dedicated program, like Illustrator, to do it. And if I am actually illustrating something (rarer these days than in times past) I have a tablet I will pull out. I am sure the tech here is cool, but what's the use case?

[+] tobyjsullivan|2 years ago|reply
Canva is not a diagramming tool. It’s a visual design tool with a very different user base.

Their asset library is massive with millions, maybe tens of millions, of images including both photos and vector graphics.

One of the more annoying parts of the tool - in my limited experience - is searching through an endless library for simple shapes when I already know exactly what I want. Presumably this tool aims to solve that pain point.

Disclosure: worked there a few years ago.

Edit: I suspect (zero inside info) this use case is important because they want to be a competitive diagramming tool as well. However, they’ll be constrained in that they cannot fundamentally change the design experience for the other 99% of their current users.

[+] pc86|2 years ago|reply
> but what's the use case?

Designers/marketers who don't learn keyboard shortcuts, for whom the comparison is "drawing the shape with my mouse" (quick) vs. "going through upwards of a half dozen menus to pick the right shape, place it, then resize it" (slower). Even if the shape is available w/o going to any menus, drawing the entire thing with your mouse using a single cursor is going to be faster than placing and resizing a bunch of icons, then switching to the arrow feature and adding the arrows in.

[+] rrherr|2 years ago|reply
"We developed a variation on the Ramer-Douglas-Peucker (RDP) algorithm, which is a curve simplification algorithm that reduces the number of points in a curve while preserving its important details. It achieves this by recursively removing points that deviate insignificantly from the simplified version of the curve."

This reminded me of an old side project, which others may be interested in. I applied Douglas-Peucker to Picasso for a talk at Strange Loop 2018:

Picasso's Bulls: Deconstructing his design process with Python https://rrherr.github.io/picasso/
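For reference, the classic (unmodified) Ramer-Douglas-Peucker algorithm the quote describes can be sketched in a few lines of Python. This is the textbook recursive version, not Canva's variation:

```python
import math

def _perp_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    if a == b:
        return math.dist(p, a)
    (x, y), (x1, y1), (x2, y2) = p, a, b
    num = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    return num / math.dist(a, b)

def rdp(points, epsilon):
    """Drop points that deviate less than epsilon from the simplified curve."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the chord joining the endpoints.
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax <= epsilon:
        # Everything in between is insignificant; keep only the endpoints.
        return [points[0], points[-1]]
    # Otherwise recurse on both halves and splice out the shared point.
    left = rdp(points[:idx + 1], epsilon)
    right = rdp(points[idx:], epsilon)
    return left[:-1] + right
```

The epsilon threshold is the knob: larger values give fewer points and a coarser curve.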

[+] danproductman|2 years ago|reply
This makes me wonder how they pulled off something similar in Macromedia Flash (RIP) well over 20 years ago. I vividly remember being amazed by how it smoothed out curves when drawing freehand, with such limited processing power compared to today's CPUs.
[+] mbb70|2 years ago|reply
LeCun et al. got 99%+ handwritten digit accuracy in 1995, which is pretty analogous to shape identification.

Having it run trivially and performantly in the browser is still an accomplishment. As always, the experience for the user is what counts.

[+] wes-k|2 years ago|reply
Smoothing is a different operation where you are simplifying the bezier curve by removing redundant(ish) points. So if you draw an almost straight line, you may have created 100 control points, and then the software simplifies it down to 4 points.
[+] danjc|2 years ago|reply
Irrelevant - you must use machine learning for everything now.
[+] londons_explore|2 years ago|reply
I suspect it took mouse events and initially drew straight lines between them. That was necessary on 1990s hardware because drawing straight lines is fast, and you needed to do it fast.

Then, when you are done drawing, it redraws the line, using the same points as before, but this time as input to a spline curve algorithm.

Drawing splines isn't much harder computationally, but notably, if you add one more point to the end of a spline curve, part of the line you have already drawn changes. That in itself is very computationally heavy, since everything behind that point now needs to be redrawn - certainly not something you can be sure of doing at 60 fps!
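The second pass of that two-phase approach could look roughly like this; a Catmull-Rom spline is one common choice for interpolating a smooth curve through the captured mouse points (this is purely illustrative, not Flash's actual algorithm):

```python
def catmull_rom(points, steps=8):
    """Interpolate a smooth curve through captured mouse points.

    Replaces the straight segments drawn live with `steps` samples
    per segment once the stroke is finished.
    """
    if len(points) < 2:
        return list(points)
    # Duplicate the endpoints so the curve passes through them.
    pts = [points[0]] + list(points) + [points[-1]]
    out = []
    for i in range(1, len(pts) - 2):
        p0, p1, p2, p3 = pts[i - 1], pts[i], pts[i + 1], pts[i + 2]
        for s in range(steps):
            t = s / steps
            t2, t3 = t * t, t * t * t
            # Standard Catmull-Rom basis, evaluated per coordinate.
            out.append(tuple(
                0.5 * ((2 * p1[k])
                       + (-p0[k] + p2[k]) * t
                       + (2 * p0[k] - 5 * p1[k] + 4 * p2[k] - p3[k]) * t2
                       + (-p0[k] + 3 * p1[k] - 3 * p2[k] + p3[k]) * t3)
                for k in range(2)))
    out.append(points[-1])
    return out
```

Because each segment only depends on four neighboring points, appending a new point invalidates just the last couple of segments, which is one way to keep the redraw cost bounded.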

[+] freedomben|2 years ago|reply
Great article, and very interesting work.

I'm surely in the minority, but I oddly find myself enjoying the hand-drawn "shaky scribble" versions more than the "sleek vector graphic." I'm sure even my preference would be context dependent though, so even in my case it's a cool feature. But in a world of artificial perfection, there's something innately attractive in a genuine hand-drawn production.

[+] tlrobinson|2 years ago|reply
If you implement a feature like this, please, make it optional and obvious when it’s enabled. It’s maddening when tools try to be too smart and don’t get it perfect (I have been guilty of this too)
[+] abrookewood|2 years ago|reply
There was a game called Scribblenauts that my kids loved years before any of the recent ML/AI hype, and it was able to turn very rough scribbles into an amazing number of different objects. No idea how they did it, but even I was impressed - the kids thought it was magic.

https://store.steampowered.com/app/218680/Scribblenauts_Unli...

[+] obvi8|2 years ago|reply
I’ve played it — it truly is amazing. If I’m remembering correctly, it made it to iOS, too.
[+] infocollector|2 years ago|reply
It would be nice if this were open source :) Recently, various models have become small in size (this one is 250 KB, and other simple tasks have seen models of around 50 KB for finetuning large models). I am looking forward to when we can actually get back to small models for useful applications :)
[+] SonOfLilit|2 years ago|reply
They trained it to recognize nine predefined shapes?

Come on, if you're going to train a model, make it a generic smoother/DWIM for drawing shapes!

You will also get more "analog"/never-identical shapes, which will feel much more stylish in the way drums feel warmer than drum samples even when played by an expert at hitting the notes identically and on time.

[+] spondylosaurus|2 years ago|reply
The iPad drawing app Procreate has a smoothing tool that sounds kinda like what you're describing—you basically draw a line freehand, and then Procreate smooths it afterwards.

Most other drawing apps (like Clip Studio Paint, which is what I primarily use) have a comparable ability to smooth the lines as you're drawing by stabilizing the actual brush tool—basically slowing down the responsiveness of the brush to reduce jitter.
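That stabilizer idea is essentially a low-pass filter on cursor positions. A minimal sketch (a simple exponential moving average; not Procreate's or Clip Studio Paint's actual algorithm, where the smoothing is more sophisticated):

```python
def stabilize(points, alpha=0.3):
    """Damp cursor jitter with an exponential moving average.

    Lower alpha means stronger smoothing, at the cost of the brush
    visibly lagging behind the cursor (the "slowed responsiveness"
    users feel when they crank up a stabilizer).
    """
    if not points:
        return []
    out = [points[0]]
    for x, y in points[1:]:
        px, py = out[-1]
        out.append((px + alpha * (x - px), py + alpha * (y - py)))
    return out
```

This runs per-event, so it works while drawing, unlike an after-the-fact smoothing pass.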

[+] jcparkyn|2 years ago|reply
I agree, all the examples in TFA feel lifeless compared to the originals (except the circle). I could see the utility if they went for "proper" vector shapes, but here it feels like the worst of both worlds.
[+] lancesells|2 years ago|reply
There's an odd feeling about the writing in this article. Maybe I'm seeing things but it does not feel like it's written or composed entirely by a person.
[+] h2odragon|2 years ago|reply
followed the "Draw" link and played with the thing there, but didn't see a way to demonstrate this functionality? Is it a paid feature or something?
[+] robinhouston|2 years ago|reply
It triggers if you keep the mouse button down for a couple of seconds after finishing the stroke.
[+] singularity2001|2 years ago|reply
Same. After an hour of technical descriptions of how they did it, I didn't find out WHERE / HOW we as users can use this feature!!
[+] visrut7|2 years ago|reply
If the model runs on the client side anyway, why not make it open source?
[+] simlevesque|2 years ago|reply
To keep a competitive advantage and get value out of your investment.
[+] vipermu|2 years ago|reply
how will this change with ai though?
[+] simlevesque|2 years ago|reply
It already uses AI so I'm not sure what you mean.
[+] biosboiii|2 years ago|reply
The engineers of ASML, TSMC and others wake up every day and shoot lasers at liquid lead to generate light with extremely short wavelengths, to make smaller and more performant chips.

And web developers wake up every day so that no one notices their work.

[+] ilovecurl|2 years ago|reply
Nitpick: TSMC's EUV process uses lasers to vaporize tin, not lead, into an EUV-emitting plasma.
[+] shepherdjerred|2 years ago|reply
More performant chips mean you can have more software abstraction and build things quickly. The increase in chip speed does not correspond to faster program execution but rather faster program authorship.

It's easier to train an army of web developers to build React applications than to teach them PHP + JS, Ruby + JS, etc. Those React developers can also (on average; many people are insanely productive in "uncool" languages) write applications more quickly.

For example, a company could write their app for macOS + Windows + Linux using native frameworks, or they could write their app once in JS + Electron.

A native app would certainly be much more performant, but that comes at the cost of being much more difficult to build, and most likely, Linux would not be supported at all.

[+] dist-epoch|2 years ago|reply
Wouldn't it be hilarious if some ASML/TSMC engineers used Canva internally? I bet it happens in some corner.