hamner's comments

hamner | 14 years ago | on: Steve Perlman unveils white paper explaining “impossible” wireless data rates

I imagine there would be some overhead (e.g. each user transmitting a certain user-specific pattern every 1/2 second, and each AP transmitting a certain AP-specific pattern every 1/2 second), to calibrate the location of each user in AP-space. Fundamentally I think all the calculations involved would be fast linear algebra operations that could be done in hardware on the order of microseconds.

hamner | 14 years ago | on: Steve Perlman unveils white paper explaining “impossible” wireless data rates

The white paper is non-technical, but I think this is the gist:

Currently, if you have multiple users and 1 access point (AP), the users split the bandwidth. Multiple APs and multiple users on the same channel results in split bandwidth as well, since the APs operate independently and interfere with each other.

This proposal uses N APs for N users on the same channel, allowing each user full bidirectional use of the channel bandwidth. To send data to N users simultaneously, a central server receives the data and calculates the signal to send to each AP such that, after the transmissions interfere, each user receives only the clean signal meant for them. This requires precise localization of each user in AP-space, presumably done by having the user transmit a known pattern at the frequency in question and measuring the result at each AP.

For the N users to transmit simultaneously to the N APs, the data center can take each of the incoming signals from the N APs along with the localization of the users in AP space, and apply linear algebra to unmix the signals into a signal from each user.
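The linear algebra at the heart of both directions can be sketched in a few lines. This is a toy model, not Perlman's actual system: it assumes a known, invertible N×N channel matrix H (which the calibration patterns described above would estimate) and ignores noise entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4  # N APs serving N users on one channel

# Hypothetical channel matrix: H[i, j] = complex gain from AP j to user i,
# estimated via the per-user calibration patterns.
H = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

# Downlink: precode so each user receives only its own symbol.
data = rng.normal(size=N) + 1j * rng.normal(size=N)
tx = np.linalg.solve(H, data)  # signals the data center sends to the N APs
rx = H @ tx                    # what the users actually receive post-interference
assert np.allclose(rx, data)

# Uplink: unmix the superimposed signals the APs hear (assuming
# channel reciprocity, so the uplink matrix is the transpose).
sent = rng.normal(size=N) + 1j * rng.normal(size=N)
heard = H.T @ sent             # each AP hears a mix of all N users
recovered = np.linalg.solve(H.T, heard)
assert np.allclose(recovered, sent)
```

Both steps are a single N×N linear solve, which supports the point above that the per-symbol computation could plausibly run in hardware in microseconds.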

I imagine this adds some overhead to each channel in order to maintain precise localizations of each user in AP space.

hamner | 14 years ago | on: IOS 5 To Have Powerful Face Detection

Hopefully this will improve on the algorithms they have in Lion (which seemed to work ever-so-slightly better than OpenCV's face detection, and poorly on profile views).

hamner | 14 years ago | on: Google Tried To Buy Color For $200 Million. Color Said No.

A lot of the comments below are criticizing "irrational investors" that were "duped" or the product as "vaporware."

This is not the case. Color had a very talented team attacking multiple technically challenging problems that remain unsolved today.

The first is the discovery of your implicit social network, as defined by your virtual and real-world interactions with others. Facebook currently uses this to determine what appears in your News Feed and to make friend recommendations, but is not using it to its full potential. Google Buzz tried to do this directly via your emails and flopped, partly because it did not account for the privacy implications. The ability to transform people's natural interactions into strong recommendations of what they should pay attention to and who should meet each other is still an open problem.

The second is the mapping of real-world events (initially defined by the pictures and people) onto the virtual world. There is potentially a lot of value, both to participants and outsiders, in knowing (1) who came to real-world events, (2) how they interacted, and (3) what happened, while properly dealing with the corresponding ethical implications.

For both of these to work, Color needed a viral social product to gain data and users. They failed on the product/market side, especially because they did not focus enough on "what is the experience we want our users to have the first time they launch the application?" The opportunity remains open, for Color to redeem itself, for the big players to improve their products, or for a new startup to come along and show the world how it's done.

hamner | 14 years ago | on: Andrew Ng: Machine Learning in Robotics

The implementation of these algorithms is relatively straightforward. The challenge is that the state of the art in computer vision cannot yet reliably detect tens to hundreds of object categories in real time on current systems. It is currently possible to build systems that get decent real-time performance detecting a few categories concurrently, or offline systems that get around 60% accuracy across hundreds of well-defined categories. Thus we need (1) faster general-purpose hardware, (2) better algorithms, or (3) the best algorithms running on ASICs designed for CV. The top labs now typically use GPU clusters to train and run their algorithms, with the computationally expensive stages usually being feature extraction and/or classifier training.

Google Predict (http://code.google.com/apis/predict/) offers a general machine learning API geared towards those who want to apply machine learning to their applications without subject-specific knowledge. I've not used it, so I can't speak to its accuracy, but it is not geared towards computer vision, and I imagine it would fail miserably at such tasks (since computer vision depends heavily on domain-specific feature extraction techniques); I imagine it performs better at NLP tasks. The primary limitation of such a system is that it acts as a black box: you throw data in and get answers out without any knowledge of the process behind it.

This black-box model is limiting for three major reasons. First, depending on the domain, incorporating domain-specific knowledge can greatly improve performance. Second, it is hard to understand the limitations of such a system: many ML algorithms fail catastrophically when the input differs substantially from the training data, and the black box makes it hard to know when the system is likely to fail and to adjust accordingly. Third, in many cases you face a tradeoff between speed, memory, and classification/regression performance; this tradeoff is determined for you automatically and is not transparent.

I've been considering a general ML system that offers an API similar to Google Predict's, yet is transparent in the feature extraction / model selection stages for those who would benefit from digging deeper into the system. Is this something that you would pay for?

Specifically for computer vision, there are a variety of startups and companies working on systems for object recognition and classification. One example is http://www.numenta.com/, though when I tried their software about a year ago it did not seem to function very well compared to the state of the art. Others making visual-search-type applications include http://www.tineye.com/ and http://www.kooaba.com

hamner | 14 years ago | on: Why decision trees is the best data mining algorithm

There is no such thing as a "best" data mining algorithm. Almost all the advantages you mention for decision trees (a form of recursive binary partitioning) apply to an even greater extent to Random Forests, which are ensembles of bootstrapped decision trees that consider only a random subset of features at each node.

Examples of domains where decision trees perform poorly include:

-Low amounts of data

-Domains where you have extra knowledge about the data (such as some features coming from certain probability distributions) that you can incorporate into classifiers

Decision trees work well in a variety of applications, but that does not make them the "best" algorithm, and it is rare that a classical decision tree provides state-of-the-art performance on a given data set. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.122...
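The gap is easy to see empirically. A small sketch using scikit-learn on a standard benchmark, comparing a single decision tree to a Random Forest under default settings; the dataset choice is illustrative, not a claim about any particular domain:

```python
# Compare a lone decision tree to a Random Forest (bagged trees with a
# random feature subset considered at each split) via 5-fold cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

tree_acc = cross_val_score(tree, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()
print(f"decision tree: {tree_acc:.3f}, random forest: {forest_acc:.3f}")
```

On datasets like this, the ensemble typically beats the single tree by several points of accuracy, which is the point: the tree's advantages carry over, and the ensemble fixes its variance problem.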

hamner | 15 years ago | on: Risk, probability, and how our brains are easily misled

The first paragraph doesn't make sense: with 5 flips there are 2^5 = 32 possible outcomes. If the "odds are low that even one person in the audience guessed it," I'd expect fewer than 16 people in the audience. However, "about a dozen people" did guess it, implying the audience is in the hundreds (there are >10 "random"-looking sequences among the 32).
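A back-of-the-envelope check of that reasoning, with illustrative audience sizes. This assumes guesses are uniform over the 32 sequences; people actually favor "random"-looking ones, which only strengthens the conclusion.

```python
outcomes = 2 ** 5  # 32 possible 5-flip sequences

def p_at_least_one_match(audience):
    """P(at least one person guesses the exact sequence)."""
    return 1 - (1 - 1 / outcomes) ** audience

def expected_matches(audience):
    """Expected number of correct guesses in the audience."""
    return audience / outcomes

# "Odds are low that even one person guessed it" suggests a small room:
print(p_at_least_one_match(16))  # ~0.40 with 16 people

# But "about a dozen" correct guesses implies hundreds of people:
print(expected_matches(384))     # 384 / 32 = 12 expected matches
```

So the two claims in the article can't describe the same audience: a dozen matches needs roughly 400 people, at which point a match is all but guaranteed.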

hamner | 15 years ago | on: Is Amazon’s Kindle Destroying the Publishing Industry?

I love this, a beautiful example of disruptive innovation:

-Authors get up to 70%, as opposed to 5-15%

-Far fewer trees are killed

-Eliminates the role of the publisher / physical distribution channel, which claimed the lion's share of the profits without adding creative value

-Marginal cost of distributing an extra book is on the order of cents

I've not read a hardcover book since getting a Kindle.

hamner | 15 years ago | on: Google, Facebook Lose Social Network Patent Ruling

This is yet another example of an absolutely ridiculous patent lawsuit, where companies that lose in the marketplace sue successful ones, costing both the companies and general public money and providing a disincentive for innovation.

What can we do to prevent this from happening in the future? (preferably by making software patents go away).

If nothing else works, how about reductio ad absurdum. Let's go through recent sci-fi/academic literature, file patents for anything technologically feasible that has a high probability of hitting the marketplace in the next 5-10 years, and then troll away until Congress acts.

hamner | 15 years ago | on: Inspired by XKCD:903, Wikipedia steps to philosophy

By extension, you will also always end up at ... Existence → Sense → Organism → Biology → Natural_science → Science → Knowledge → Fact → Information → Finite_set → Mathematics → Quantity → Property_(philosophy) → Modern_philosophy

hamner | 15 years ago | on: Human Brain Project: $1.61 billion to achieve human brain emulation by 2024

That is important, but extremely challenging to execute for two reasons.

1. Historically, technology has outpaced the legal framework. Look at copyright law, software patents, and internet commerce for some examples.

2. It is next to impossible to predict the impact that "Strong AI" or a "Singularity" would have on society. Science fiction literature is filled with thousands of different scenarios. Do we expect Congress or another legislative body to create a framework based on each potential manifestation of strong AI, on the 0.01% chance of it occurring in the next 10 years?

Food for thought - there's a chance that the first implementations of Strong AI occur as a result of a public and government-sponsored research program. There's also a chance that they will come about by a small team of dedicated researchers who will use the technology to their (or its) own ends, legalities and ethics be damned.

hamner | 15 years ago | on: Research Directions for Machine Learning and Algorithms

The argument that important ML algorithms should be highly scalable (O(log N), O(N), O(N log N)) holds in fields rich in "big data," with millions to trillions of data points.

However, there are also many fields where acquiring a large dataset (more than hundreds to thousands of samples) is infeasible. This is especially relevant in medicine and biology: many applications are constrained by small sample sizes and may have a feature count orders of magnitude larger than the sample count. Examples include fMRI studies and gene expression studies. Don't discount research into methodologies (such as SVMs and many graphical models) with superlinear training complexity as impractical for real-world applications, because these are used heavily in certain fields.
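A toy illustration of that small-n, large-p regime; the dataset shape is a made-up stand-in for something like a gene expression study, and the labels are random:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_features = 100, 10_000  # far more features than samples
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 2, size=n_samples)

# SVM training cost scales superlinearly in n_samples, but with
# n = 100 that cost is trivial, regardless of the 10,000 features.
clf = SVC(kernel="linear").fit(X, y)
print(clf.score(X, y))
```

Superlinear in n is irrelevant when n is a hundred; in this regime the hard problems are statistical (overfitting, feature selection), not computational.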

hamner | 15 years ago | on: CEO of Svpply: I have no idea what I'm doing

F--- that. He approaches his position with humility, an understanding of what he doesn't know, and a desire to learn it. Instead of reveling in whatever success his startup has had, he's candidly looking forward to learning what he needs to grow his business.