Sikuli Script

148 points | llamataboot | 11 years ago | sikuli.org

31 comments

[+] krilnon|11 years ago|reply
I used Sikuli back in 2011-12 for some random automation tasks, and often wish I would remember to use it more.

- Advisor wanted a one-button way to run a convoluted research prototype I had made, and I didn't want to have to dig into Cocoa to figure out how to programmatically click/select options in a few desktop apps.

- Worked at a company and there was a silly employee training slideshow+quiz, so I had Sikuli wait for the next arrow to show up once the audio was finished and click it.

- Wanted to heat up my GPU to warm a brownie, so I opened one of those WebGL water demos and had Sikuli repeatedly pick up a ball and drop it in the water.

[+] AceJohnny2|11 years ago|reply
> - Wanted to heat up my GPU to warm a brownie

Please elaborate.

[+] yzzxy|11 years ago|reply
> - Wanted to heat up my GPU to warm a brownie, so I opened one of those WebGL water demos and had Sikuli repeatedly pick up a ball and drop it in the water.

I'm not sure why this is like nails on a chalkboard for me. Maybe it's the programmer equivalent of dog-earing book pages.

[+] Ygg2|11 years ago|reply
I'm gonna post this XKCD, because it's eerily appropriate (once WebGL manages to fix any performance bugs): http://xkcd.com/1172/
[+] cdr|11 years ago|reply
I used to use AHK a lot for interacting with a game. Sikuli seemed neat when I heard about it, but it turns out that it only supports pretty much a single method of capture and only a single method of relaying mouse events. This made it completely unusable for that purpose. AHK supports pretty much every method available to Windows. The simplicity of use was attractive, but it needs a ton of work under the hood before it'd be viable as a general purpose automation tool.
[+] bcaine|11 years ago|reply
I also experimented with it a few years ago trying to build an automated GUI testing framework, and it turns out that it's way too fragile and non-portable to be usable for that use case.

I remember running into issues as soon as anything changed regarding resolution, scaling, graphic settings, color scheme etc.
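
To illustrate why image matching is sensitive to color scheme changes, here is a minimal pure-Python sketch of threshold-based template matching. It is not Sikuli's actual algorithm or API: `similarity`, `find`, and the `min_similarity` parameter are hypothetical names, and screens are assumed to be small grayscale grids (lists of lists of 0-255 values).

```python
def similarity(patch, template):
    """Return 1.0 for identical grayscale patches, lower as pixels diverge."""
    total = 0
    count = 0
    for row_p, row_t in zip(patch, template):
        for p, t in zip(row_p, row_t):
            total += abs(p - t)
            count += 1
    return 1.0 - total / (255.0 * count)


def find(screen, template, min_similarity):
    """Slide the template over the screen and return the best (row, col).

    Returns None when even the best position scores below min_similarity.
    """
    th, tw = len(template), len(template[0])
    best_score, best_pos = -1.0, None
    for r in range(len(screen) - th + 1):
        for c in range(len(screen[0]) - tw + 1):
            patch = [row[c:c + tw] for row in screen[r:r + th]]
            score = similarity(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos if best_score >= min_similarity else None
```

With a strict threshold, a captured image stops matching as soon as the background shade changes; loosening the threshold tolerates the change but makes false positives more likely, which is the tuning problem described above.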

Pretty fun to make toy programs in to automate stuff with though.

[+] Alex_MJ|11 years ago|reply
Sikuli is freaking great, though not sure how this is news, it's been around.

Super useful for automating things that are easy to handle by looking for patterns/things on screen and hard to handle with APIs (or lack thereof)

[+] YZF|11 years ago|reply
We used Sikuli for test automation in a pretty large project with a Windows UI. Got kicked off by a TeamCity agent for every build and worked really nicely.

Thumbs up.

You do need to be careful about timing and getting the right images so tuning things to work under all conditions is a bit of an art. Also being able to recover from a failure so you can continue testing is another bit of art.
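
A minimal sketch of the two ideas above, polling with a timeout and recovering after a failed step, in plain Python. None of this is Sikuli's API: `wait_for`, `run_steps`, `NotFound`, and the `find` and `recover` callables are hypothetical names chosen for illustration.

```python
import time


class NotFound(Exception):
    """Raised when the target never appears within the timeout."""


def wait_for(find, timeout=10.0, interval=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Poll find() until it returns a non-None match or the timeout expires."""
    deadline = clock() + timeout
    while True:
        match = find()
        if match is not None:
            return match
        if clock() >= deadline:
            raise NotFound("target did not appear within %.1fs" % timeout)
        sleep(interval)


def run_steps(steps, recover):
    """Run each (name, step) pair; after a failure call recover() and go on."""
    results = {}
    for name, step in steps:
        try:
            step()
            results[name] = "passed"
        except Exception as exc:
            results[name] = "failed: %s" % exc
            recover()  # e.g. close stray dialogs, return to a known screen
    return results
```

Polling against a deadline avoids fixed sleeps that are either too short on a slow build agent or wastefully long on a fast one, and the recovery hook keeps one failed step from invalidating the rest of the run.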

As to why this is news there seems to be a new release out (or soon?) 1.1.0 ... Sikuli development seems to have almost died a few years back but it's made a comeback over the last ~2.5 years which is nice.

[+] drothlis|11 years ago|reply
I do a similar style of UI test automation for set-top boxes / smart TVs, with stb-tester[1].

We've found that the "Got kicked off for every build" continuous integration process you mention is the crucial part to achieving success with this type of test automation -- if you're going to invest the effort in writing reliable tests, you want to be getting value out of them by running them as often as possible and as early as possible.

[1] http://stb-tester.com/

[+] gowan|11 years ago|reply
Nice to see sikuli on the front page. I'm currently using it to test a legacy application.

Sikuli is good at image matching. For me sikuli broke when I started to take images of text. The font would render differently in gnome and the vm (vncserver/twm) jenkins ran the tests on. I ended up creating docker images of the test environment so the docker image would be the same on jenkins and the testers' machines.

Debian has a sikuli package libsikuli-script-java and sikuli-ide. I've also written a docker file for sikuli on debian wheezy [1].

[1] https://github.com/jesg/sikuli

[+] jamesgagan|11 years ago|reply
Wrote a WoW fishing bot with Sikuli a few years back - it worked pretty well.
[+] nogridbag|11 years ago|reply
I was a big fan of Sikuli back in the day, but I found it a bit unreliable for automation. No matter how much I tweaked it, it seemed to be a bit unpredictable.

I did find some use for it. My girlfriend got addicted to some online flash Mahjong game and no matter how hard I tried, I could not post a better score than her. With a bit of Sikuli scripting, I was posting top scores in no time!

[+] woutervdb|11 years ago|reply
Reminds me a bit of Scratch[1], a tool that became pretty popular when the Raspberry Pi came out. Very simple programming interfaces that work with simple graphics, but can do a lot of things.

[1]: http://scratch.mit.edu/

[+] whitten|11 years ago|reply
Sikuli is being used to create test plans for the VistA system (documented at http://www.osehra.org ). It works with GUI stand-alone executables and with web pages from a browser.
[+] cwt|11 years ago|reply
Does anyone use this for web scraping dynamic links created by javascript, pulled from dev tools "network" tab?
[+] fiatjaf|11 years ago|reply
Works with the browser, right? So this is the ultimate visual scraping tool, import.io and ParseHub are useless now?
[+] tsergiu|11 years ago|reply
One of the founders of ParseHub here.

Not quite. Sikuli tries to figure out where things are by doing a visual match. This works very well for things like automating applications or sites where page elements are fixed (e.g. finding an option in a menu or using a search engine). But it works terribly when trying to overlay semantic structure on dynamically-generated data. For example, it has no way of knowing that a list of movies is split up on multiple pages, with each movie having multiple genres, a cast, and multiple reviews, each of which has a rating and an author.
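
As a rough illustration of the kind of semantic structure meant here, the movie example could be modeled as nested records like the following. This is only a sketch: the `Movie` and `Review` classes and the `merge_pages` and `average_rating` helpers are hypothetical and do not come from Sikuli or ParseHub.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Review:
    author: str
    rating: float


@dataclass
class Movie:
    title: str
    genres: List[str] = field(default_factory=list)
    cast: List[str] = field(default_factory=list)
    reviews: List[Review] = field(default_factory=list)


def merge_pages(pages: List[List[Movie]]) -> List[Movie]:
    """Flatten per-page lists of movies into one list, preserving order."""
    return [movie for page in pages for movie in page]


def average_rating(movie: Movie) -> Optional[float]:
    """Mean review rating, or None when the movie has no reviews."""
    if not movie.reviews:
        return None
    return sum(r.rating for r in movie.reviews) / len(movie.reviews)
```

A purely visual matcher sees pixels on one screen at a time; nothing in a screenshot says that a rating belongs to a particular review, or that page two continues the same list as page one.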

There's also the additional drawback that it is hard to parallelize things in Sikuli (you would need heavyweight vms, and there are no obvious "breaks" in the flow). So doing something at scale is not feasible.

With ParseHub, one of the goals is to make it easy to express relationships (and we think we've done a really good job). We also automatically figure out how to split a job up across an entire fleet of servers.

Hope that offers some insight. Email me at [email protected] if you have any other questions.

[+] bart3r|11 years ago|reply
What are its capabilities with OCR?
[+] YZF|11 years ago|reply
It has OCR but wasn't working so great. It uses Tesseract. I'm not absolutely sure why it wasn't working well in the past, possibly something to do with different fonts/display rendering (e.g. ClearType and such). It "almost" worked so maybe it got better or maybe there's some tuning you can do. Didn't spend too much time on it.
[+] qwerta|11 years ago|reply
Great project, used with great success to automate testing of legacy app across multiple virtual machines.