Mildly unrelated, but I wanted to point out the awesomeness of the K-map[1] for handling simple minimization problems like these.
Unfortunately, once your dimensions get large it becomes difficult for humans to visualize the optimizations, which is where Espresso[2][3] comes in handy.
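For intuition, here is a small sketch (plain Python, with a made-up four-term function) of what this kind of minimization buys you: a brute-force truth-table check confirming that the minimized form is equivalent to the original sum-of-products.

```python
from itertools import product

def truth_table(f, nvars):
    """Evaluate f over every combination of nvars boolean inputs."""
    return [f(*bits) for bits in product([False, True], repeat=nvars)]

# Four-minterm sum-of-products: f = A'B'C + A'BC + AB'C + ABC
def original(a, b, c):
    return ((not a and not b and c) or (not a and b and c) or
            (a and not b and c) or (a and b and c))

# A K-map (or Espresso) collapses all four terms to a single literal: f = C
def minimized(a, b, c):
    return c

assert truth_table(original, 3) == truth_table(minimized, 3)
```

Tools like Espresso do the same thing heuristically for functions with far too many variables to eyeball on a K-map.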
Ooh, espresso. There was also a tool called eqntott for converting boolean logic expressions in a human-readable form into the truth tables that Espresso takes as input. The source code was written in an archaic dialect of C, but here's a resurrected version that should compile on modern compilers: http://code.google.com/p/eqntott/
I haven't touched this code since 2008, so beware of grues, but I remember it being a really slick system for minimizing boolean logic when you combine eqntott with espresso. Just the thing for an electrical engineering student who's sick of doing K-maps, which described me nicely at the time.
While I'm not sure thousands of if/then/else rules are actually the best knowledge representation for his domain (it sounds like he really wants to learn some model from data, in which case throwing some off-the-shelf machine learning at it may be a better fit), the area of "expert systems" may be relevant if the rules really are extracted directly from some source that needs to be managed, such as a human domain expert.
Expert systems were particularly big in AI in the '80s; they're considerably less hot now, but still widely used in industry. There are a lot of issues that come up, most of which aren't really purely technical, such as: 1) how you extract knowledge from the people who might have it; 2) how you validate that the knowledge you've extracted is actually what they know, for example by checking that it produces the same decisions they would make; and 3) how you allow updates/revisions to the knowledge base over time. Once you have the rules and some reason to believe that they're any good, actually managing and applying them can be done through one of several rule engines, such as Drools or Jess.
The keyword "knowledge engineering" may also turn up relevant info.
Is it just me? It's not a good question; it shows a lack of core coding experience.
For example, a video game involves millions of possibilities/results in movement based on the environment, yet it's not approached as a collection of thousands of if/then/else statements.
In fact, the movement of cows seems very much like video game coding: it needs a "cow engine".
It's a great question. It expresses a problem from the worldview of a coder with a particular, very common skillset. Which is something that other coders with the same problem and the same skillset will find in their searches. As long as the answers are good, it's a perfect use of the system.
I think it was a really good question. It showed that he had the intuitive instinct that there had to be a better way. He just didn't know how to ask for a "cow engine" by name.
I think it's a good question if only for the quality of answer it gained for the rest of us. But it's certainly not what I would expect from a very experienced developer.
Such a great question. Want to predict cow movement while taking into consideration sudden events, food and sun, using thousands of if/then rules? Sounds like the perfect problem for one of the following tree learners that not only manage themselves but build the model for you, in order of suitability to the problem*:
Genetic Programming
ADTree
Random Forest
Decision Tree
Most of the work would be in figuring out the best way to gather all the data.
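To make the "build the model for you" point concrete, here is a minimal, hypothetical decision-tree learner over made-up cow observations. It is a sketch only: real learners (like the ones listed above) pick splits by information gain and handle noisy, non-boolean data.

```python
from collections import Counter

# Hypothetical toy data: observed conditions -> observed cow behaviour.
DATA = [
    ({"hot": True,  "hungry": False}, "seek_shade"),
    ({"hot": True,  "hungry": True},  "seek_shade"),
    ({"hot": False, "hungry": True},  "graze"),
    ({"hot": False, "hungry": False}, "rest"),
]

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(data, features):
    """Recursively split on boolean features until the labels are pure."""
    labels = [label for _, label in data]
    if len(set(labels)) == 1 or not features:
        return majority(labels)  # leaf: predict the majority behaviour
    feat, rest = features[0], features[1:]  # naive split order; real
                                            # learners choose by information gain
    return {
        "split_on": feat,
        True:  build_tree([d for d in data if d[0][feat]], rest),
        False: build_tree([d for d in data if not d[0][feat]], rest),
    }

def predict(tree, conditions):
    while isinstance(tree, dict):
        tree = tree[conditions[tree["split_on"]]]
    return tree

tree = build_tree(DATA, ["hot", "hungry"])
predict(tree, {"hot": False, "hungry": True})  # "graze"
```

The learned tree is exactly a nest of if-this-then-that checks, but nobody had to write them by hand.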
All machine learning techniques build models for you.
ADTrees, Random Forests, and Decision trees are all classifiers, so they serve this guy's needs. Genetic Programming is not a machine learning technique: it is an optimization method for a very specific task and is quite unsuited for his purposes.
The solution is great and there are very nice algorithm suggestions given here as well. Now the question is how easily one can apply a growing number of rules in production without taking down the application.
I suggested using Prolog in the comments. His problem description was not quite good enough for me to assess whether or not Prolog will be a good choice, but at least it's a candidate.
A Prolog program is essentially a list of rules that are written as an implication. Something like:
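The code sample appears to have been lost from this archive; going by the commenter's own plain-language description ("a cow moves to a location if it's hungry and there's more food in that location than in the current location"), it would have looked something like this, with hypothetical predicate names:

```prolog
% A cow moves to a location if it is hungry and that location
% has more food than its current one.
moves_to(Cow, There) :-
    hungry(Cow),
    at(Cow, Here),
    food(Here, F0),
    food(There, F1),
    F1 > F0.
```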
Good answer. Nice to see the original questioner accepted it. To take it a bit further, so-called "answer set solvers" like clasp (http://potassco.sourceforge.net/) provide a fast way to run these kinds of logic programs.
You really don't want to think of this in terms of IF ... THEN ... ELSE, but as individual binary or multi-path choices. These are most efficiently handled with hash tables that choose a route through a logic diagram, an approach that is both more intuitive and more efficient than IF ... THEN ... ELSE.
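As a sketch of the hash-table idea (with hypothetical states and actions), each combination of choice inputs becomes a key in a dispatch table that routes directly to an action, replacing the nested branches:

```python
# Hypothetical actions; in a real system these would be richer behaviours.
def seek_shade(cow): return f"{cow} moves to shade"
def graze(cow):      return f"{cow} grazes"
def rest(cow):       return f"{cow} rests"

# The dispatch table: one entry per path through the logic diagram.
ROUTES = {
    ("hot",  "hungry"): seek_shade,  # heat outweighs hunger here
    ("hot",  "sated"):  seek_shade,
    ("mild", "hungry"): graze,
    ("mild", "sated"):  rest,
}

def act(cow, weather, appetite):
    # One hash lookup replaces a chain of IF ... THEN ... ELSE tests.
    return ROUTES[(weather, appetite)](cow)

act("daisy", "mild", "hungry")  # "daisy grazes"
```

Adding a rule means adding a table entry rather than threading a new branch through existing conditionals.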
You start by writing a logic flowchart that you understand, then you reduce the logic flowchart to code, usually automatically. This is how a logic flowchart looks:
Maybe I'm off the mark, but whenever I see a long run-on procedure full of if-then-else, I think "This is a job for a state machine!" With that approach, you can exhaustively provide actions for every combination of states and events. You can also immediately identify the code responsible for handling any state+event, and track automatically which state+event transitions have been tested.
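A minimal table-driven sketch of that approach, with hypothetical cow states and events; every handled (state, event) pair is explicit, which is what makes coverage easy to audit:

```python
# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("grazing",       "gets_hot"):    "seeking_shade",
    ("grazing",       "full"):        "resting",
    ("resting",       "gets_hungry"): "grazing",
    ("seeking_shade", "cools_down"):  "grazing",
}

class CowFSM:
    def __init__(self, state="grazing"):
        self.state = state

    def handle(self, event):
        # Unlisted pairs leave the state unchanged; a stricter version
        # could raise instead, flagging unhandled combinations.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state

cow = CowFSM()
cow.handle("gets_hot")  # "seeking_shade"
```

Because the table is data, you can diff it, validate it, and enumerate untested transitions mechanically.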
Even with the clue that he wants to "predict how cows move around in any landscape", his description doesn't seem like it would be enough to support any definite design advice.
What do the states of the cows and the landscape look like? What is changed by the if-then-else results? Does he already have his rules? Is this for a biology project or a video game, etc.?
You could use a finite automaton system, an array of cow-states, a cow DSL or something else depending on the answers to these and other questions.
Yes. The piece missing for me in most answers I've seen is eventing. Does one pump a series of events/state changes into the CowModel and see what actions pop-out? Does the model involve interactions with other cows where the actions of one cow cause actions of others? If so, one must have an event queue to handle chain reactions. How discrete are the events? Is there a "temp is 89F" event where, for every N minutes the cow is in that temperature in direct exposure, it becomes more likely to seek shade? If so, the "it is 89F" event must be triggered multiple times. And, such a rule requires that the past states of the cow be retained. I think the if/then/else logic is only part of the solution to a simulation problem of this sort.
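A rough sketch of the eventing idea (hypothetical events and handlers), showing how processing one event can chain into others through a queue:

```python
from collections import deque

def simulate(initial_events, handlers):
    """Drain an event queue; handlers may enqueue follow-up events."""
    queue, log = deque(initial_events), []
    while queue:
        event = queue.popleft()
        log.append(event)
        handler = handlers.get(event[0])
        if handler:
            queue.extend(handler(event))  # chain reaction
    return log

# One startled cow startles its neighbours, but each cow only once.
startled = set()
def on_startle(event):
    _, cow, neighbours = event
    startled.add(cow)
    return [("startle", n, []) for n in neighbours if n not in startled]

log = simulate([("startle", "daisy", ["bella", "rosie"])],
               {"startle": on_startle})
len(log)  # 3 events: daisy's startle plus the two it triggered
```

The "temp is 89F for N minutes" case would fall out of the same machinery by re-enqueueing a timer event, with the cow's accumulated exposure kept as state between events.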
A finite state machine is actually not a great solution for this particular problem (in fact, FSMs are often over-used), IMHO. An expert system is a far better solution (as one commenter mentioned). The problem with an FSM is that you still must explicitly specify ALL of the nodes (without some complex algorithm to automate such a task). An expert system, like Prolog, only requires you to state all of the rules, and then it will figure out any query for you (via propositional logic, backward chaining, etc.). The number of nodes typically grows exponentially with the number of rules (if the rules are inter-related), so for a problem of any significant size an expert system would probably be the method of choice.
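As a sketch of the rules-not-nodes point, here is a tiny forward-chaining engine over hypothetical facts (Prolog-style systems instead chain backward from a query, but the contrast with enumerating states is the same): you state a handful of rules, and the engine derives everything that follows.

```python
# Rules are (premises, conclusion) pairs over hypothetical fact names.
RULES = [
    ({"hot", "sunny"},                  "uncomfortable"),
    ({"uncomfortable", "shade_nearby"}, "seek_shade"),
    ({"hungry", "grass_nearby"},        "graze"),
]

def forward_chain(facts, rules):
    """Apply rules until no new facts can be derived (a fixed point)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

derived = forward_chain({"hot", "sunny", "shade_nearby"}, RULES)
```

Note that "seek_shade" is derived through an intermediate fact the user never asserted, which is exactly the combinatorial work you would otherwise encode as explicit FSM nodes.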
A decision tree is basically thousands of if-this-then-that statements, but dressed up into something you can reason about and, most importantly, train and build automagically.
This really sounds like a machine learning problem--why hand-engineer all these rules yourself (which is error-prone in any case) when a program could do it for you? If you can figure out how to get a large data set together and come up with a good general representation of the data, you're halfway there.
I'd consider taking a look at recurrent neural networks if you're looking at your data as a time series. Or, if it's more of a static problem that doesn't take into account the last N things that came before, you might consider tree-based methods, or even DBNs if you can get a lot of data together (say 50,000-100,000 samples).
Potential pitfall: if you're using a NN based approach, you will judge results based on test performance, but you won't get as much insight into the "rules" your network has learned.
I have an axiom "any sufficiently complex if-then-else ruleset can be modeled as a search problem with boosts".
I have modeled a logistics lookup system (assign "best" courier for an ecommerce setup) using a search solution. The rules were modeled as documents in a search index and looked up using weights. Fundamentally, what I did was flatten an if-then-else hierarchy and assign weights based on (roughly) level of nesting.
The con of this model is obvious - it is at best a heuristic. However, with clever overrides built in (as a separate search index?), you can get pretty close to the ideal solution.
The pro of this model is scalability - when your rules are in millions, this system scales beautifully.
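A minimal sketch of the flattening idea, with hypothetical rules and weights: each former if/then/else branch becomes a "document" whose weight reflects its original nesting depth, and lookup scores every matching rule and returns the heaviest instead of walking a branch tree.

```python
# Hypothetical flattened rule index for the courier-assignment example.
RULE_INDEX = [
    {"when": {},                                 "weight": 0, "courier": "Default"},
    {"when": {"express": True},                  "weight": 1, "courier": "FastShip"},
    {"when": {"express": True, "fragile": True}, "weight": 2, "courier": "CarefulCo"},
]

def best_courier(order):
    # Every matching rule is a candidate; the most specific (heaviest) wins.
    matches = [r for r in RULE_INDEX
               if all(order.get(k) == v for k, v in r["when"].items())]
    return max(matches, key=lambda r: r["weight"])["courier"]

best_courier({"express": True, "fragile": True})  # the deeper rule wins
```

A real deployment would put these documents in a search engine and use its scoring, but the shape of the trade-off (heuristic ranking in exchange for scale) is the same.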
The question is an interesting one, but it's futile in its context. You can't build a prediction engine for cattle movement. Adding factors like weather, food, etc. seems fine and dandy, but there are hundreds if not thousands of other factors that are bound to be missing, and they will render the predictions pointless.
Never mind the fact that you can throw black swan events into the mix, or just unknown unknowns you could never think of (maybe a cow has a frog phobia and panics when it sees a frog and starts a stampede, or whatever it is that cattle do).
You can't predict how an individual cow will move, but it's perfectly acceptable to use models to predict how cattle generally move. Similar to the way we use models to predict how humans navigate hallways and find paths across grass.[1] When you have a bulk of interacting particles, it tends to remove the "personality" of the individual, leaving you with how the group as a whole will move.
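A toy illustration of this effect, with 1-D random walkers standing in for cows: any single walker's endpoint is unpredictable, but the average over a large "herd" is not.

```python
import random

random.seed(1)  # deterministic for the example

def walk(steps=100):
    """One walker: 100 random unit steps left or right."""
    return sum(random.choice((-1, 1)) for _ in range(steps))

herd = [walk() for _ in range(10_000)]
mean = sum(herd) / len(herd)
spread = max(abs(p) for p in herd)
# Individual endpoints range over tens of steps either way, yet the
# herd's mean stays within a fraction of a step of zero.
```

The individual "personality" washes out in aggregate, which is why pedestrian and traffic models work at all.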
Critical Mass[2], by Philip Ball (a former editor of Nature), is an excellent read about the interesting effects that emerge when you look at large numbers of complex interacting actors. He discusses the effects in traffic, pedestrian models, finance, plant growth, etc. I highly recommend it.
I am late to this discussion, but I would like to add my experiences with compiling rules to a Rete network. I have done a lot of hacking on the old Lisp OPS5 code and found it easy to work with, once you've read and understood Charles Forgy's papers. I hacked OPS5 to support multiple data worlds and a few other enhancements for specific projects.
The more modern Rete based software projects are probably also easy to work with.
What, are you insinuating that asking about great big if-then-else trees means you'd never hire that person? Because, what, you've never been a complete novice programmer before?
Considering the question-asker has a specific problem to solve, I would even hazard they probably aren't a programmer by trade.
wickedchicken | 13 years ago
[1] http://en.wikipedia.org/wiki/Karnaugh_map
[2] http://embedded.eecs.berkeley.edu/pubs/downloads/espresso/in...
[3] ftp://ftp.cs.man.ac.uk/pub/amulet/balsa/other-software/espresso-ab-1.0.tar.gz
pjscott | 13 years ago
http://code.google.com/p/eqntott/
theatrus2 | 13 years ago
_delirium | 13 years ago
Dn_Ab | 13 years ago
ck2 | 13 years ago
InclinedPlane | 13 years ago
toddmorey | 13 years ago
Dn_Ab | 13 years ago
bowmessage | 13 years ago
kamaal | 13 years ago
Do you have any material one can read to learn more about this?
krichman | 13 years ago
Dn_Ab | 13 years ago
*opinion.
SeanLuke | 13 years ago
psingh | 13 years ago
bicknergseng | 13 years ago
exDM69 | 13 years ago
In human language: a cow moves to a location if it's hungry and there's more food in that location than in the current location. I should probably write a proper answer.
edit: Added a proper answer to the Stack Overflow question with an almost complete introductory Prolog example. It's waiting for your upvotes :)
bumbledraven | 13 years ago
cartosys | 13 years ago
lutusp | 13 years ago
http://en.wikipedia.org/wiki/File:LampFlowchart.svg
Here is the article it appears in:
http://en.wikipedia.org/wiki/Flowchart
The most important parts of this process are:
1. That you understand the logic diagram and see that it meets the program's requirements.
2. That there is a way to turn the logic flowchart into code, ideally without human intervention.
3. That the rebuild time be short between adding a new logical step and testing the resulting program.
JoeAltmaier | 13 years ago
joe_the_user | 13 years ago
[insert joke about "till the cows come home"]
ShabbyDoo | 13 years ago
gavanwoolery | 13 years ago
JustLikeThat | 13 years ago
"How can one manage thousands of IF…THEN…ELSE rules? By developing a drinking problem"
indiecore | 13 years ago
tylermauthe | 13 years ago
psingh | 13 years ago
Swizec | 13 years ago
http://en.wikipedia.org/wiki/Decision_tree
dave_sullivan | 13 years ago
sandGorgon | 13 years ago
instakill | 13 years ago
mmcnickle | 13 years ago
[1] http://www.youtube.com/watch?v=8SmRBTJ-jeU This is just a video I picked out quickly; there are much better resources, I just can't find them on my phone.
[2]http://www.amazon.co.uk/Critical-Mass-Thing-Leads-Another/dp...
Evbn | 13 years ago
mark_l_watson | 13 years ago
xarien | 13 years ago
ynniv | 13 years ago
sliverstorm | 13 years ago
Considering the question-asker has a specific problem to solve, I would even hazard they probably aren't a programmer by trade.
yabadab | 13 years ago
Even something as simple as a Decision Tree might be enough?
kos_mos | 13 years ago
tlogan | 13 years ago