Machine Learning

Monday, March 28, 2005

UAI paper on monotonicity constraints

Here is the abstract from the paper that Tom Dietterich, Angelo Restificar, and I have just submitted to UAI:

"When training data is sparse, more domain knowledge must be incorporated into the learning algorithm in order to reduce the effective size of the hypothesis space. This paper builds on previous work in which knowledge about qualitative monotonicities was formally represented and incorporated into learning algorithms (e.g., Clark & Matwin's work with the CN2 rule learning algorithm). We show how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms. We show that this yields improved accuracy, particularly with very small training sets."

Full text in PDF or PS.
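To make the idea of monotonicities-as-constraints concrete, here is a minimal sketch (my own illustration, not the paper's implementation): a positive qualitative influence of X on Y can be read as first-order stochastic dominance between the rows of the CPT P(Y | X), which is straightforward to check numerically.

    import numpy as np

    # A toy check, not the paper's actual code.
    def respects_monotonicity(cpt):
        """Check that a CPT P(Y | X) encodes a positive qualitative
        influence of X on Y in the first-order stochastic dominance
        sense: raising X never lowers P(Y >= t) for any threshold t.
        cpt[i, j] = P(Y = j | X = i), with rows (parent values) and
        columns (child values) both in increasing order."""
        # P(Y >= t | X = i) for every threshold t, via reversed cumsum.
        upper_tails = np.cumsum(cpt[:, ::-1], axis=1)[:, ::-1]
        # Dominance: each row's tail probabilities must be at least
        # the previous row's, at every threshold.
        return bool(np.all(np.diff(upper_tails, axis=0) >= -1e-12))

    # Increasing X shifts mass toward high Y: satisfies the constraint.
    good = np.array([[0.7, 0.2, 0.1],
                     [0.4, 0.3, 0.3],
                     [0.1, 0.3, 0.6]])
    # Swapping the first two rows violates it.
    bad = good[[1, 0, 2]]

    print(respects_monotonicity(good))  # True
    print(respects_monotonicity(bad))   # False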

Saturday, March 26, 2005

Why I decided to work for Google

Cross-posted on my other blog.

Friday, March 25, 2005

New research tool

In case you haven't already, check out http://scholar.google.com/. It looks pretty nice.

Thursday, March 10, 2005

More prior work

Potharst and Feelders have an article in SIGKDD Explorations (June 2002) on learning monotonic classification trees from both monotone and non-monotone data (i.e., data that contradicts the prior). There's no probabilistic machinery in it, but it's still interesting to see others pursuing similar routes. Read here.
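To see concretely what non-monotone (contradictory) data looks like, here is a toy check of mine (not Potharst and Feelders' algorithm): a labeled dataset violates monotonicity whenever one example dominates another on every feature yet receives a strictly lower label.

    import numpy as np

    # Toy sketch, not Potharst and Feelders' method.
    def nonmonotone_pairs(X, y):
        """Count ordered pairs (i, j) where example i dominates example j
        on every feature but gets a strictly lower label -- the kind of
        'contradictory' data a monotone tree learner must cope with."""
        return sum(1 for i in range(len(y)) for j in range(len(y))
                   if np.all(X[i] >= X[j]) and y[i] < y[j])

    X = np.array([[1, 2], [2, 3], [0, 1]])
    y = np.array([1, 0, 0])  # example 1 dominates example 0, yet has a lower label
    print(nonmonotone_pairs(X, y))  # 1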

Wednesday, March 09, 2005

Qualitative and quantitative priors

Expert domain knowledge is usually qualitative, not quantitative, which is why eliciting probability numbers from experts is so difficult. One of the core goals of the KI-Learn project is to make use of qualitative knowledge, and we attempt this with a language whose qualitative statements
  1. are easy and natural for domain experts to write, and
  2. have well-defined semantics over probability distributions that match experts' intuitions.
However, things aren't so simple. Suppose our training data contradicts an expert's qualitative statement. The proper posterior depends on how strongly we believe the domain knowledge and how much data we have. In other words: how strong is the expert's prior? This forces us right back into specifying quantitative aspects of our model (i.e., the numbers that parameterize the expert's prior).
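A toy numerical illustration of the point (hypothetical numbers, not from KI-Learn): encode an expert's belief that a coin favors heads as a Beta prior with mean 0.7, and watch the posterior shift as the prior's equivalent sample size varies against contradicting data.

    # Toy illustration with hypothetical numbers, not KI-Learn code.
    # The expert believes P(heads) = 0.7; the data disagree.
    heads, tails = 2, 8

    for strength in (1, 10, 100):  # equivalent sample size of the prior
        a, b = 0.7 * strength, 0.3 * strength  # Beta(a, b) with mean 0.7
        posterior_mean = (a + heads) / (a + b + heads + tails)
        print(f"prior strength {strength:>3}: posterior mean = {posterior_mean:.3f}")

    # prior strength   1: posterior mean = 0.245  (the data win)
    # prior strength  10: posterior mean = 0.450  (a tug of war)
    # prior strength 100: posterior mean = 0.655  (the expert wins)

The qualitative statement alone ("heads is more likely") never settles this tug of war; the strength parameter does.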

So here is my question: is it ever possible to specify purely qualitative domain knowledge? I suppose the answer is: only if you assume the knowledge is true with probability 1 (which, of course, just makes the quantitative part implicit). That's nasty, though. Nobody wants to state that something is true with probability 1, but nobody wants to specify probabilities either. Is there any alternative to picking the lesser of these two evils? It seems the answer is no...?

Thursday, March 03, 2005

KIML: future work

Examples of valuable domain knowledge other than monotonicities include synergistic influence (two things both positively influence an outcome, and their combined effect is greater than the sum of their individual effects) and relative strength-of-influence (two things both influence an outcome, but one is known to be a significantly stronger predictor than the other). We have not yet run experiments to test the value of these statements, but we have defined mappings from such statements to constraints on probability distributions, and we expect results similar to those obtained for monotonicities.
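For binary variables, one plausible reading of synergy (a sketch of mine, not necessarily the exact constraint our mapping produces) is super-additivity of the two individual lifts on P(Y=1):

    # One plausible formalization of synergy; a sketch, not our exact mapping.
    def is_synergistic(p):
        """Check super-additive interaction of binary causes A and B on a
        binary outcome Y, where p[(a, b)] = P(Y=1 | A=a, B=b): the joint
        lift must exceed the sum of the individual lifts."""
        base = p[(0, 0)]
        return (p[(1, 1)] - base) > (p[(1, 0)] - base) + (p[(0, 1)] - base)

    # Hypothetical CPT entries: each cause alone adds 0.2, together 0.6.
    p = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.3, (1, 1): 0.7}
    print(is_synergistic(p))  # True: 0.6 > 0.2 + 0.2

Relative strength-of-influence has a similarly simple reading: the lift from one cause exceeds the lift from the other at matching settings of the remaining variables.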

Longer term "knowledge-intensive machine learning" goals at Oregon State are more ambitious: automatic feature engineering, model simplification, etc.

By the way, for those of you interested in details, I can provide a current draft of my thesis (especially if you are willing to provide constructive criticism :-).

Tuesday, March 01, 2005

KIML: known prior work

I mentioned Bayes nets as the best existing example of what "knowledge-intensive machine learning" is about. There really aren't many techniques out there for learning with priors or constraints generated from high-level qualitative knowledge, and the few I can think of are all structural.

Inductive logic programming fits in here in some sense, but the intensional and extensional information must be in the same format -- logical statements -- which is a little unnatural for ML folks, and the systems that have been built don't deal well with uncertainty. Probabilistic logic programs (BLPs and SLPs) are a little closer, but they still handle only structural domain knowledge (first-order rather than propositional). Qualitative reasoning techniques give us natural and intuitive ways to express qualitative domain knowledge (in particular, knowledge other than structural knowledge), but they don't use that knowledge to learn better models from data, and they have only primitive notions of uncertainty (in particular, they don't deal with probabilities). Knowledge-based model construction offers ways of using domain knowledge to better answer queries, even probabilistic ones, but it generally assumes hand-engineered probabilities rather than probabilities learned from data.

Of course, this leaves out the maxent work, which is all about finding distributions that satisfy constraints -- clearly something I need to research more.
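As a note to self on what that looks like, here is the classic toy maxent problem (Jaynes' die, nothing specific to KI-Learn): among all distributions over the faces 1..6 with a given mean, find the one with maximum entropy.

    import numpy as np
    from scipy.optimize import minimize

    # Classic textbook maxent example (Jaynes' die), not KI-Learn code.
    faces = np.arange(1, 7)

    def neg_entropy(p):
        p = np.clip(p, 1e-12, 1)  # guard against log(0)
        return float(np.sum(p * np.log(p)))

    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},         # normalize
        {"type": "eq", "fun": lambda p: np.dot(p, faces) - 4.5},  # fix the mean
    ]
    result = minimize(neg_entropy, x0=np.full(6, 1 / 6),
                      bounds=[(0, 1)] * 6, constraints=constraints)
    print(result.x.round(4))  # probabilities rise geometrically toward face 6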