piemonkey's comments

piemonkey | 7 years ago | on: Computational Mathematics with SageMath (2018)

I made extensive use of Sage during a recent research project. The most useful thing about Sage for me is that it integrates cleanly many mathematical libraries, making it easy to export results from one tool into the other. The visualization tools are a bonus, and made debugging quite easy. In my case, Sage neatly connected graph isomorphism tools with group theory libraries, saving me a tremendous amount of work.

The main downside for me is the performance overhead of using Sage. It is great for quickly prototyping ideas, but I found that some components were not designed with performance in mind. For instance, I used the Sage interface for Gap for much of my group theory computations, and I found that this interface was incomplete (it does not natively support the product replacement algorithm, for example) and quite leaky (I needed to manually garbage collect to keep from running out of memory).

piemonkey | 7 years ago | on: Jürgen Schmidhuber says he’ll make machines smarter than us

Thanks for this, the reviews for this paper were very interesting.

I do wish that they provided the meta-review, however, which summarizes the ultimate decision. Meta-reviews are written by area chairs, which are the final arbiters for whether or not the paper is accepted. Seeing their perspective on this paper would be very interesting.

piemonkey | 8 years ago | on: UCLA introductory programming classes should focus more on Python, not C++

I would like to point out before this discussion moves too far along that UCLA does in fact have an introductory programming course which is taught using Python: http://web.cs.ucla.edu/classes/enroll/97/

Students that are unfamiliar with programming are encouraged to take this course before the C++ course.

piemonkey | 8 years ago | on: The De Bruijn-Newman constant is non-negative

Can someone with more knowledge than me contextualize this result? It appears to be making progress on the Riemann hypothesis, but why is this particular effort getting attention on HN?

piemonkey | 8 years ago | on: The destruction of graduate education in the United States

While I agree that there are a lot of structural problems with universities and the way funding is allocated, I disagree with the premise that beginning with taxing graduate students is the correct way to do it. Granted, as a current PhD. student at a University of California university, I have some skin in this game which colors my view-point, so take what I say with a grain of salt.

In particular, state schools have a lot less flexibility in terms of how we allocate funding to graduate students; our pay rates are set by California state law, and the salary of every single employee of the university is subject to public scrutiny [0]. We cannot simply dip into our endowments to correspondingly raise wages, since the amount which can be withdrawn each year is strictly capped as a portion of the total, to ensure the long-term growth of the fund (see [1], "Endowment Spending Policy").

It is true that colleges and universities have grown tremendously in their size and scope. This is a popular anecdote (and often repeated without data), but consider this: a university has become a mini-society in a lot of ways. Universities offer an unbelievable number of services and drive their own mini-economies. They are often among the largest employers in their area, even in thriving metropolitan centers [2]. They provide health insurance, mental health care, their own police forces, job-searching centers, support groups, etc. As our broader American society scales back its social services, universities are increasingly on the hook to provide these services to students.

I agree that universities have grown too large, and resent extremely large unnecessary salaries (for example for sports coaches), but I also acknowledge that such waste exists in any sufficiently large and complex system, and it is a constant and active process to limit its effects. Condemning the whole system on account of this seems like cutting off the nose to spite the face.

[0]: https://ucannualwage.ucop.edu/wage/ [1]: http://www.ucop.edu/investment-office/_files/report/UC_Annua... [2]: http://www.ucla.edu/about/facts-and-figures

piemonkey | 8 years ago | on: Macro-Lisp, a Lisp-Like DSL for Rust Language

Rust macros are extremely impressive. This reminds me of Will Crichton's Lia library[0], which embeds a convenient matrix manipulation domain specific language inside of Rust. This seems like a very promising direction for fusing the low-level performance of Rust with the high-level readability and usability of a managed language. Exciting times.

[0]:https://github.com/willcrichton/lia

piemonkey | 8 years ago | on: An Intro to Compilers

Alex Aiken has a very nice course: http://openclassroom.stanford.edu/MainFolder/CoursePage.php?...

piemonkey | 8 years ago | on: I found a bug in Intel Skylake processors

His name is italicized because he is the primary author of OCaml (and a plethora of other great tools, like CompCert, the first fully-verified compiler). Overall, an exceedingly competent and productive programmer and scientist.

The doctor metaphor isn't perfect; what I was going for is, when you are seeking out an expert's advice and you ignore it, why do you go to see the expert in the first place?

piemonkey | 8 years ago | on: I found a bug in Intel Skylake processors

One amusing thing about this epic tale is Serious Industrial OCaml User disregarded direct, relatively easy to implement, very sound advice from Xavier Leroy about how to debug their system! I would like to think that, were I in a similar situation being advised by an expert of that calibre, I would at least humor his suggestions.

Why seek the expert if not for his advice? It brings to mind people disregarding doctors who give them inconvenient medical advice.

piemonkey | 9 years ago | on: Introduction to K-means Clustering

> K-means is based on some quite wild assumptions - your data follows a specific case of the Gaussian distribution. Plus side is that the algorithm is relatively easy to understand and implement so it is a good starting point into clustering.

K-means is a non-parametric clustering algorithm, which assumes no underlying properties about your data. It is hard to evaluate the quality of a clustering algorithm without some clearly defined objective. In particular, k-mean's cousin in the classification community, k-nearest neighbors, actually is provably a a Bayes optimal decision strategy as k and the number of samples increases, regardless of the underlying distribution of your data (i.e., it is strongly consistent, see [0]).

> K-means can be levelled up with the EM algorithm

In fact, the k-means algorithm is actually a special case of fitting a mixture of k Gaussian distributions where all the Gaussians are isotropic (have identical covariance matrices with uniform elements on the diagonal, i.e. a scaled identity matrix).

[0] https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

piemonkey | 9 years ago | on: Vellvm: Verifying the LLVM (2015)

You are right, perhaps my original post was unclear. The goal is to verify the transformation between LLVM and low-level assembly. I embellished slightly by equating the semantics of low-level assembly with CPU executions.

piemonkey | 9 years ago | on: Vellvm: Verifying the LLVM (2015)

I think it's important to understand what's meant when someone says some piece of software is "verified".

In this case, when they say LLVM is verified, they mean that they designed a formal correspondence between the (1) semantics of the LLVM virtual machine language and (2) the actual execution of the underlying CPU (in this case, modeled using actual assembly language). Essentially, this amounts to writing two interpreters (one for LLVM assembly and one for low-level assembly), writing a compiler to generate low-level assembly from LLVM assembly, and proving that the interpreter for LLVM assembly does the same thing as the assembly language interpreter running on the generated code.

The tricky part here is "proving that they do the same thing"; this means that, for any program written in LLVM, there is a corresponding program in low-level assembly that has equivalent semantics. Reasoning about all possible programs requires reasoning over an infinite family of objects; there are a few tools in mathematics for doing this, but by far the most useful for reasoning about programming languages is induction. Coq, the tool used in the linked page, is a very popular tool for writing inductive proofs on heterogeneous tree structures (aka, an abstract syntax tree).

Ultimately, they use Coq to define the semantics of LLVM by writing (1) execution rules (for example, adding two integers results in a new int in a certain register) and, (2) in the case of LLVM (but not low-level assembly), typing rules which provide certain guarantees themselves. Coq is a proving environment which provides a functional dependently-typed programming language which can be used to construct and verify constructive proof arguments about programs. For example, you can prove existence (that there always exists an assembly language program for every LLVM program), or type soundness (that an LLVM program that is well-typed never performs an illegal operation according to its type semantics).

Ultimately, the value of proving this correspondence is that we know that LLVM has (1) equivalent expressive power to low-level assembly and (2) that, barring any bugs in the way that a CPU executes its assembly, the generated LLVM code will perform the same operations as the assembly language that it generates.

Edit: As some follow-up posts point out, this particular project is more concerned with verifying intra-LLVM transformations (i.e., LLVM to LLVM). This is quite different from what I described; my post describes a verified compiler, similar to CompCERT.

piemonkey | 9 years ago | on: Nvidia on new self-driving system: “basically 5 years ahead and coming in 2017”

There are a number of interesting examples in computer vision and psychology literature which document difficult cases for visual perception, see for example slides 14-19 in [0]. Ultimately, vision is a holistic process with edge cases that require knowledge of physics, human psychology, and so-called "common-sense reasoning" in order to resolve. Depending on how one phrases the objective of what a computer vision system should do, the problem can go from a tractable subset of automated reasoning to an intractable general AI task, often with very subtle changes to the problem statement.

[0] http://www.cs.toronto.edu/~urtasun/courses/CV/lecture01.pdf

piemonkey | 9 years ago | on: Probabilistic Programming

As a researcher that is currently working on implementing and improving existing PPLs, I'm very curious: what is your specific use-case that you have in mind for recursive queries?

piemonkey | 9 years ago | on: Fine-Grained Language Composition: A Case Study

My first thought is: Neat! Languages with different grammars, syntax, and libraries have different expressivities; my first thought is always compiler implementation in OCaml. I would love, for example, to (effortlessly!) dip into OCaml or Haskell while writing a high-performance C++ program, for example.

I am very intrigued by Lia [0] as an example for how languages can be embedded inside of one another for useful effects. See Will Crichton's great blog post for more [1].

However, I am wondering why Python and PHP were chosen. I can't think of a compelling use for having both languages simultaneously. They are both dynamic scripting languages with similar design objectives.

[0] https://github.com/willcrichton/lia [1] http://notes.willcrichton.net/the-coming-age-of-the-polyglot...

piemonkey | 10 years ago | on: GitHub supports Universal 2nd Factor authentication

One thing to consider is the possibility of your phone itself being compromised (stagefright et al.). Note how Duo issued a security advisory to limit access for Android devices [1].

A fully isolated component like a Yubikey has a smaller attack surface area for these kinds of things (easier to audit smaller code, no sustained Internet or cellular connectivity).

[1]https://www.duosecurity.com/blog/understanding-your-exposure...

piemonkey | 11 years ago | on: DDoS attack protection for at-risk public interest websites

I'm personally shocked by how much power a DDoS has to potentially sway public opinion and influence the world at large. A few individuals have a hugely disproportionate voice in our public media by nature of the fact that they can control what other websites say through these attacks.

Is there any progress on infrastructure improvements that could potentially improve this current state of affairs? Is our only solution for benevolent companies like Cloudflare to offer their blanket of protection? I guess I'm asking, who will guard the guards?