mgreenbe's comments

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

I'd say it's conventional to reuse an existing parser to generate facts.

The AST point is a subtle one. Classic Datalog (the thing that characterizes PTIME computation) doesn't have "constructors" like the ADTs (algebraic data types) we use in Formulog to define ASTs. Datalog doesn't even have records, like Soufflé. So instead you'll get facts like:

``` next_insn(i0, i1). insn_type(i0, alloc). alloc_size(i0, 8). insn_type(i1, move). insn_dest(i1, i0). insn_value(i1, 10). ```

to characterize instructions like:

``` %0 = alloc(8) %0 = 10 ```

I'm not sure if that's you mean by serializing relations. But having ASTs in your language is a boon: rather than having dozens of EDB relations to store information about your program, you can just say what it is:

``` next_insn(i0, i1). insn_is(i0, alloc(8)). insn_is(i1, move(i0, 10)). ```

As for your point about Prolog, it's a tricky thing: the interface between tools like compilers and the analyses they run is interesting, but not necessarily interesting enough to publish about. So folks just... don't work on that part, as far as I can tell. But I'm very curious about how to have an efficient EDB, what it looks like to send queries to an engine, and other modes of computation that might relax monotonicity (e.g., making multiple queries to a Datalog solver, where facts might start out true in one "round" of computation and then become false in a later "round"). Query-based compilers (e.g., https://ollef.github.io/blog/posts/query-based-compilers.htm...) could be a good place to connect the dots here, as could language servers.

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

I think the solvers _are_ tuned for PP. But we're comparing CSA and PP on the queries that Formulog issues... which don't really match well with the DFS discipline that the PP stack aligns with. I think CSA beats PP in our experiments because CSA is more flexible about locality.

Broadly---and I haven't looked at the memory usage to confirm this---I think CSA trades space (cache more old formulae, not just the prefix of some stack) for time (look, our answers are in the cache!).

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

Datafun is quite cool! Formulog and Datafun seem similar---both combine logic programming and pure functional programming---but they take wildly different approaches.

Datafun is a foundational re-imagining of what a Datalog could look like: start with a higher-order programming language and give it a first-class notion of least fixed points. A type system for tracking monotonicity lets you roll your own Datalog. It's impressive that you can reconstruct semi-naive evaluation (morally: in each 'round' of evaluation, only apply rules matching new results) in their setting (https://dl.acm.org/doi/abs/10.1145/3371090). Datafun is still a ways away from the performance and implementation maturity of existing Datalogs, though.

Formulog's approach is to try to let Datalog be Datalog as much as possible. We end up with restrictions around higher-order functions and other FP features in order to keep things simple on the Datalog side---quite the opposite of Datafun's fancy type system. Our Formulog interpreter does pretty well with internment, parallel execution, and magic sets, but you could easily port our design to existing Datalog compilers and get even bigger speedups. It's not clear how to do that for Datafun... yet.

(I suspect you could port our SMT interface to Datafun without too much of a problem, too.)

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

We do use incremental solving. check-sat-assuming is generally better than push/pop, though, because Datalog's bottom-up search isn't DFS.

If you're interested, check out our ICLP 2020 extended abstract: https://cs.pomona.edu/~michael/papers/iclp2020_extabs.pdf. We should have more on this in not too long.

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

Thanks! :) We should be very clear that the bulk of the work is Aaron Bembenek's.

I think Formulog would work great for analyzing the shell---as would any other Datalog, though SMT-based string reasoning will certainly come in handy. I don't think it will help you with parsing issues, though. The general approach to static analysis with Datalog avoids parsing in Datalog itself, relying on an EDB ("extensional database"---think of it as 'ground facts' about the world, which your program generalizes) to tell you things about the program. See, e.g., https://github.com/plast-lab/cclyzer/tree/master/tools/fact-... for an example of a program for generating EDB facts from LLVM. Just like real-world parsers, these are complicated artifacts.

mgreenbe | 5 years ago | on: Formulog: ML + Datalog + SMT

mgreenbe | 16 years ago | on: What to know before debating type systems

Fair enough. I would play down the significance where you played it up. Considering the equanimitous tone of the whole article, it seems unfair---or, at least, ungenerous---to take him to task for a colorful, one-off analogy.

mgreenbe | 16 years ago | on: What to know before debating type systems

  Dynamically typed languages are automobiles and statically 
  typed are bicyles?? Not exactly a neutral comparison and 
  hypocracy in light of his complaints against the strong 
  vs. weak labels.

Good thing he didn't make it. Please read the article next time. Here's the relevant quote:

  [Obsessively tracking type information in dynamic 
  languages] prevents a programmer from realizing the 
  benefits of dynamic typing. It's like buying a new car, 
  but refusing to drive any faster than a bicycle. The car 
  is horrible; you can't get up the mountain trails, and it 
  requires gasoline on top of everything else. Indeed, a 
  car is a pretty lousy excuse for a bicycle! Similarly, 
  dynamically typed languages are pretty lousy excuses for 
  statically typed languages.

The point the author is making is that some programmers may have been frustrated by dynamic languages because they didn't allow their programs to fail, i.e., with runtime type errors. I disagree with the author's point---I've never heard of anyone having that problem---but you've woefully misunderstood what the author says.

mgreenbe | 16 years ago | on: Surprises in Mathematics and Theory

First, Coq isn't an automatic theorem prover, it's a proof assistant. Adequacy is an issue, but (particularly) for computer-science applications it's sometimes possible to use the Coq artifact as the implementation directly. Adam Chilpala's recent work has made some pretty impressive advances in verifying working code.

I think you misunderstand, though: there's no obsession with absolute truth, at least not in the mathematical world I live in. I'm a PL researcher; we don't pretend to absolute truth, but formality and correctness are our cardinal virtues. (Take Coq as an apropos example.) Our proofs are (a) by induction, (b) long, and (c) can involve subtle and fussy steps. And (d), in an ideal our world, our proofs are also the bedrock of high-assurance systems. Proofs with different qualities may be better candidates for a more intuitive approach. In particular, actual mathematicians are probably more interested in the intuitive side of proof than we are. The article is in fact interested in that more intuitive mathematics; Lipton points out that this intuitive mathematics---whether as formal as 4CT or as informal as the vision conjecture he describes---is full of fun surprises. I'm not sure he's worried about the sort of nitpicking formality that I'm into. :)

mgreenbe | 16 years ago | on: A Working Programmer's Guide to Type-Indexed Values

A neat trick, but---as one of the commenters points out---manual dictionary passing is a bit much. I would love Haskell, but for laziness...

mgreenbe | 16 years ago | on: Productivity tips, tricks and hacks for academics

Good advice. Transaction costs and, in particular, startup costs are some of the most difficult things I face in my (graduate student) day-to-day. Simply getting up to the whiteboard can be difficult.

I'm definitely going to start using a reference managing website.

I've been using Ad Block Plus to keep myself out of Reddit and Google Reader at work. (Sadly, I think the time has come to add HN, as well.) The method works, and I've been tempted only once or twice to circumvent the blocks.

mgreenbe | 16 years ago | on: Reason as memetic immune disorder

You forgot to add the argument that it's (strictly speaking) more accurate, since Yehuda is just one of the tribes.

But s/Hebrew/Colored/. Same arguments apply, but---here we are.

mgreenbe | 16 years ago | on: Reason as memetic immune disorder

  There's one particular thing about Christianity though. 
  Well OK, a couple of things, one it's kind of a modern 
  less complex flavor of Judaism, no offense meant to any 
  Hebrews reading this.

First, the similarity between the practice of Judaism and Christianity is the product of the last century or so. At the time, the ideas of, e.g., Paul diverged significantly from contemporary Jewish thinking.

Second, if you don't want to offend Jews, don't call them "Hebrews". ;)

mgreenbe | 16 years ago | on: How To Appear To Be A Sophisticated Programmer

  There are no French programming languages, not even Pascal 
  or Eiffel uses French words for the syntax, and all the 
  documentation is in English.

OCaml is implemented by le French, and I've seen code that uses French variable names. I haven't yet seen the camlp4 hack that lets you say laissez instead of let, though.

mgreenbe | 16 years ago | on: Direct Evidence Of Role Of Sleep In Memory Formation Is Uncovered

Original article: http://www.nature.com/neuro/journal/vaop/ncurrent/abs/nn.238...

I was inclined to be skeptical---breaking news, shocking the brain during sleep impairs performance---but they used three different groups of rats:

  ...we tested the role of SPW-Rs on memory consolidation.
  Three groups of rats (test group, n = 7; stimulated 
  controls, n = 7; unimplanted controls, n = 12) were 
  trained to find food rewards ... . 

  During post-training rest and sleep, all of the online-
  detected ripples were suppressed by commissural 
  stimulations in test rats (average online detection rate 
  was 86.0 ± 1.3% (s.e.m.) of post hoc detected SPW-Rs;
  ...). Stimulated control rats underwent the same 
  protocol, except that a random delay (80–120 ms)
  was introduced between SPW-R detection and stimulation, 
  ensuring that the stimulations occurred mainly outside of 
  the ripple episodes.

Now check this out:

  Thus, these control rats received the same number of 
  stimulations as test rats, but their hippocampal
  ripples were left largely intact. The global architecture 
  of sleep and the local field potential power in distinct 
  sleep stages were not modified by the suppression of 
  SPW-Rs ... .  As stimulation outside SPW-Rs had
  no detectable effect on task performance ..., the two 
  control groups were pooled and compared with test rats. 
  Performance of the test rats was significantly impaired 
  ... .

Cool experiment design with an intriguing result.

mgreenbe | 16 years ago | on: Ask HN: Aspect Oriented Programming?

AOP was the next big thing five years ago. I've never heard of a serious, non-dog-food project written in any (pointcut-rich) implementation of aspects.

The fundamental problem with aspect-oriented programming is that it makes it impossible to reason compositionally: any part of your program could be affected by an aspect, so it can be quite hard to know what your code is doing.

I think instrumentation, as a technique, is vital. But having instrumentation as a language-level feature seems like a bad idea to me. As a programmer, I instrument code to understand its behavior, not to define it.

mgreenbe | 16 years ago | on: Where is the grave-yard of dead gods? (H.L Mencken, 1922)

To be fair, Astarte and Ashtoreth are probably the same---just a vocalization error. :)

mgreenbe | 16 years ago | on: Evolving the Python standard library

Python is just JS with extra syntax? The languages have very different object and meta-object/overloading models, as well as other divergent features: JavaScript can mutate closures, for example.

I agree, though, that a stdlib should be part of the core language. Not having a serious standard library has hurt the Scheme community a great deal, I think.

mgreenbe | 16 years ago | on: ICFP '09: "Get rid of cons!" Guy Steele on parallel algorithms & data structures

Did you not read the talk? You can provide the car/cdr interface on top of tree-based implementations. Yes, you pay a cost to use that interface...but you pay a different (Guy Steele would argue more expensive) cost if you write your algorithms using car/cdr instead of, say, empty/singleton/conc.

Moreover, as is pointed out on the slides, a lot of processing can happen without ever having to talk about car/cdr, e.g., using higher-order functions like map and filter.

mgreenbe | 16 years ago | on: Before you start learning Lisp...

Your fingers are going to be crossed for a long time. I think the Steering Committee has too many divergent interests on it for there ever to be agreement on a serious standard library. Schemers are very DIY; for example, see http://philosecurity.org/2009/01/12/interview-with-an-adware.... Rolling your own libraries on top of a tiny implementation is not particularly uncommon. (I just hope you can write more, uh, ethical software than that guy!)

PLT has the most batteries of any distro I've ever seen (disclaimer: one of its authors was my undergrad advisor). That is, PLT is (a) free, (b) mature, and (c) has a fair number of modules written by people other than the developers of the language.

There's no sense in waiting, just dive in. If you don't want to use PLT, Bigloo can call Java libraries---solving your battery problem in a different way.