whodunser's comments

whodunser | 9 years ago | on: Announcing AudioSet: A Dataset for Audio Event Research

"The ontology is made available by Google Inc. under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license." -- Github page[0]

Looks like the... tag names? and example URLs have been released, but the videos and audio remain under their respective licenses -- i.e., mostly the standard YouTube license.

This is neat. Can an ML model developed on this dataset be used for commercial purposes? I guess, at minimum, the paper and tag list are provided as help for those corporations that would wish to build or use a private dataset for similar purposes?

[0] https://github.com/audioset/ontology

whodunser | 9 years ago | on: Introducing Keras 2.0

Note that there are breaking changes, and it seems the docs online already point to the new, 2.0 version.

So if you are relying on the docs to edit old code, you may become a teensy bit frustrated!
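For example, one rename I remember (verify against the official Keras 2.0 release notes) is `nb_epoch` -> `epochs` in `fit()`. A throwaway shim for old call sites might look like this sketch:

```python
# Hypothetical shim: translate a Keras 1.x keyword argument to its
# Keras 2.0 name before calling model.fit(). The rename table is from
# memory -- and note that some 2.0 changes are semantic, not just
# renames (e.g. fit_generator counting batches instead of samples),
# so a pure rename is not always enough.
KERAS2_RENAMES = {
    "nb_epoch": "epochs",  # fit() in Keras 1.x -> 2.0
}

def upgrade_kwargs(kwargs):
    """Return a copy of kwargs with Keras 1.x names mapped to 2.0 names."""
    return {KERAS2_RENAMES.get(k, k): v for k, v in kwargs.items()}
```

Then something like `model.fit(x, y, **upgrade_kwargs({"nb_epoch": 10, "batch_size": 32}))` keeps an old call site working against the 2.0 API.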

whodunser | 9 years ago | on: Baidu Deep Voice Explained: Part 1 – the Inference Pipeline

This post on Deep Voice seems a little off-the-mark. In fact, I would say it is completely misleading about the technical accomplishments here.

From my perspective, Baidu's approach is a little embarrassing, with its many separate modeling stages for training and TTS production. When the rest of the community is moving toward end-to-end training, chaining this many stages sounds excruciating. Merlin[0], which was a pretty good standard for 2016, has the same painful feel, with two DL stages (duration, acoustic) followed by some conditioning and then a synthesis step.

The more important technical contribution seems to be the hand-tuned synthesis code that makes their generation faster; cool, but not particularly sexy (and there are few details). The details on training hyperparameters are nice to have too, of course.

Contrary to the post, I would be very surprised if the voice sample included in the post was actually generated by Deep Voice -- it has none of the robotic qualities pointed out by the researchers themselves in their blog post[1]. More likely it is a demonstration of the loss in their last, WaveNet-like stage. This was also pointed out in the previous HN discussion[2].

Lastly, Andrew Ng is neither thanked in the paper nor mentioned on any webpage -- are we sure this was work he supervised?

[0] https://github.com/CSTR-Edinburgh/merlin

[1] http://research.baidu.com/deep-voice-production-quality-text...

[2] https://news.ycombinator.com/item?id=13756489

whodunser | 9 years ago | on: Launch HN: FloydHub (YC W17) – Heroku for Deep Learning

I have been using your docker container for 6 months or so now, thanks for putting it together :)

The Jupyter jobs look neat, but I assume they are billed for continuous wall-clock time? It would be cool if they were somehow billed only for actual compute time, but I understand that would be difficult.

Are these instances guaranteed to be in a given region, in case I wanted to route more complex debug output / intermediate files to S3?

whodunser | 9 years ago | on: The jobs that really smart people avoid

(I've never had this thought before.) What if basic research is currently encouraged at the proper level, given its present value -- that is, after exponentially discounting the future revenue it generates?

Throw in uncertainty, and the inability to judge the quality of work so far out in the future, and maybe that justifies the low pay of many basic research jobs? It would be interesting to see an analysis of this across fields, countries, years, etc.
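The discounting argument is easy to sketch; the payoff, horizon, and rate below are made up purely for illustration:

```python
def present_value(payoff, annual_rate, years):
    """Discount a single future payoff back to today, exponentially."""
    return payoff / (1.0 + annual_rate) ** years

# A hypothetical $1B payoff from basic research landing 30 years out,
# at a 7% annual discount rate, is worth only on the order of $130M
# today -- so even a modest research budget can be "properly" sized.
pv = present_value(1_000_000_000, 0.07, 30)
```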
