<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-11045925</id><updated>2011-07-28T19:12:49.326-07:00</updated><title type='text'>Machine Learning</title><subtitle type='html'>Questions, opinions, and ideas about machine learning, statistics, and related fields.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>10</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-11045925.post-112685062630852814</id><published>2005-09-15T23:02:00.000-07:00</published><updated>2005-09-15T23:03:46.313-07:00</updated><title type='text'>Consolidation</title><content type='html'>Hi all.  I've consolidated my computer science blog, my personal blog, and my website.  It can all now be found at &lt;a href="http://zigoku.net"&gt;http://zigoku.net&lt;/a&gt; .  
Thanks for reading, and see you there.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-112685062630852814?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/112685062630852814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=112685062630852814' title='41 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/112685062630852814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/112685062630852814'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/09/consolidation.html' title='Consolidation'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>41</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-111206546649783519</id><published>2005-03-28T18:58:00.000-08:00</published><updated>2005-03-28T19:04:26.500-08:00</updated><title type='text'>UAI paper on monotonicity constraints</title><content type='html'>Here is the abstract from the paper that Tom Dietterich, Angelo Restificar, and I have just submitted to UAI:&lt;br /&gt;&lt;br /&gt;"When training data is sparse, more domain knowledge must be incorporated into the learning algorithm in order to reduce the effective size of the hypothesis space.  This paper builds on previous work in which knowledge about qualitative monotonicities was formally represented and incorporated into learning algorithms (e.g., Clark &amp; Matwin's work with the CN2 rule learning algorithm).  
We show how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms. We show that this yields improved accuracy, particularly with very small training sets."&lt;br /&gt;&lt;br /&gt;Full text &lt;a href="http://web.engr.oregonstate.edu/%7Ealtender/pubs/uai05.pdf"&gt;in pdf&lt;/a&gt; or &lt;a href="http://web.engr.oregonstate.edu/%7Ealtender/pubs/uai05.ps"&gt;in ps&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-111206546649783519?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/111206546649783519/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=111206546649783519' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111206546649783519'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111206546649783519'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/uai-paper-on-monotonicity-constraints.html' title='UAI paper on monotonicity constraints'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-111189098389883003</id><published>2005-03-26T18:20:00.000-08:00</published><updated>2005-03-26T18:39:22.406-08:00</updated><title type='text'>Why I decided to work for Google</title><content 
type='html'>&lt;a href="http://zigoku.blogspot.com/2005/03/why-i-decided-to-work-for-google.html"&gt;Cross-post&lt;/a&gt; on my other blog.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-111189098389883003?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/111189098389883003/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=111189098389883003' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111189098389883003'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111189098389883003'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/why-i-decided-to-work-for-google.html' title='Why I decided to work for Google'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-111178007689233728</id><published>2005-03-25T11:45:00.000-08:00</published><updated>2005-03-25T11:47:56.893-08:00</updated><title type='text'>New research tool</title><content type='html'>In case you haven't already, check out  &lt;a href="http://scholar.google.com/"&gt;http://scholar.google.com/&lt;/a&gt; .  
It looks pretty nice.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-111178007689233728?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/111178007689233728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=111178007689233728' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111178007689233728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111178007689233728'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/new-research-tool.html' title='New research tool'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-111051451477902881</id><published>2005-03-10T20:14:00.000-08:00</published><updated>2005-03-10T20:57:38.690-08:00</updated><title type='text'>More prior work</title><content type='html'>Potharst and Feelders have an article in SIGKDD Explorations, June 2002, about learning monotonic classification trees (with monotonic and non-monotonic, i.e., contradictory to the prior, data).  There's no probabilistic stuff in there, but nonetheless it's interesting to see others are pursuing similar routes.  
&lt;a href="http://portal.acm.org/citation.cfm?doid=568574.568577"&gt;Read here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-111051451477902881?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/111051451477902881/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=111051451477902881' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111051451477902881'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111051451477902881'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/more-prior-work.html' title='More prior work'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-111043923972690501</id><published>2005-03-09T22:54:00.000-08:00</published><updated>2005-03-09T23:20:39.730-08:00</updated><title type='text'>Qualitative and quantitative priors</title><content type='html'>Expert domain knowledge is usually qualitative, not quantitative, and thus, eliciting probability numbers from experts is usually very difficult.  
One of the core goals of the KI-Learn project is to use &lt;span style="font-style: italic;"&gt;qualitative&lt;/span&gt; knowledge, and we attempt this with a language whose qualitative statements&lt;br /&gt;&lt;ol&gt;   &lt;li&gt;are easy and natural for domain experts to write&lt;br /&gt;  &lt;/li&gt;   &lt;li&gt;have well-defined semantics for probability distributions that correspond to experts' intuitions&lt;/li&gt; &lt;/ol&gt; However, things aren't so simple.  Suppose our training data contradicts an expert's qualitative statement.  The proper posterior depends on how much we believe in our domain knowledge and how much data we have.  In fact, the question is: how strong is our expert's prior?  This is where we are forced right back into specifying quantitative aspects of our model (i.e., the numbers that parameterize the expert's prior).&lt;br /&gt;&lt;br /&gt;So here is my question: is it ever possible to specify purely qualitative domain knowledge?  I suppose the answer is: only if you assume the knowledge is true with probability 1 (which of course is simply making the quantitative part implicit).  This is nasty, though.  Nobody wants to state that something is true with probability 1, but nobody wants to specify probabilities, either.  Is there any alternative to picking the lesser of these two evils?  
It seems the answer is no...?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-111043923972690501?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/111043923972690501/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=111043923972690501' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111043923972690501'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/111043923972690501'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/qualitative-and-quantitative-priors.html' title='Qualitative and quantitative priors'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-110969919045027242</id><published>2005-03-03T01:46:00.000-08:00</published><updated>2005-03-03T02:36:12.603-08:00</updated><title type='text'>KIML: future work</title><content type='html'>Examples of valuable domain knowledge other than monotonicities include synergistic influence (two things both positively influence an outcome, but their combined effect is greater than mere additivity) and relative strength-of-influence (two things both influence an outcome, but it is known one is a significantly stronger predictor than the other). 
We have not yet run experiments to test the value of these statements, but we have defined mappings from such statements to constraints on probability distributions, and we expect results similar to those obtained for monotonicities.&lt;br /&gt;&lt;br /&gt;Longer-term "knowledge-intensive machine learning" goals at Oregon State are more ambitious: automatic feature engineering, model simplification, etc.&lt;br /&gt;&lt;br /&gt;By the way, for those of you interested in details, I can provide a current draft of my thesis (especially if you are willing to provide constructive criticism :-).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-110969919045027242?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/110969919045027242/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=110969919045027242' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110969919045027242'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110969919045027242'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/kiml-future-work.html' title='KIML: future work'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-110938821766818526</id><published>2005-03-01T09:50:00.000-08:00</published><updated>2005-03-01T09:51:14.636-08:00</updated><title type='text'>KIML: known prior 
work</title><content type='html'>I mentioned Bayes nets as the best existing example of what "knowledge intensive machine learning" is about. There really aren't very many techniques out there for learning given priors or constraints generated from high-level qualitative knowledge, and the few that I can think of are all structural.&lt;br /&gt;&lt;br /&gt;Inductive logic programming in some sense fits in here, but the intensional and extensional information must be in the same format -- logical statements -- which is a little unnatural for ML folks, and furthermore, the systems built so far don't deal well with uncertainty. Probabilistic logic programs (BLPs and SLPs) are a little closer, but they still only deal with structural domain knowledge (first-order rather than propositional). Qualitative reasoning techniques give us natural and intuitive ways to express qualitative domain knowledge (in particular, knowledge other than structural knowledge), but they don't use that knowledge to learn better models from data, and they have only primitive notions of uncertainty (in particular, they don't deal with probabilities). 
Knowledge-based model construction offers ways of using domain knowledge to better answer queries, even ones on probabilities, but these methods generally assume hand-engineered probabilities, not ones learned from data.&lt;br /&gt;&lt;br /&gt;Of course, this leaves out the maxent work, which is all about finding distributions that satisfy constraints -- clearly something I need to research more.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-110938821766818526?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/110938821766818526/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=110938821766818526' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110938821766818526'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110938821766818526'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/03/kiml-known-prior-work.html' title='KIML: known prior work'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-110938803838125504</id><published>2005-02-25T19:15:00.000-08:00</published><updated>2005-02-25T19:20:38.383-08:00</updated><title type='text'>Knowledge Intensive Machine Learning, Part 1</title><content type='html'>Machine learning is all about learning from &lt;span style="font-style: italic;"&gt;extensional&lt;/span&gt; information (datapoints).  
The incorporation of &lt;span style="font-style: italic;"&gt;intensional&lt;/span&gt; information (domain knowledge) is generally done by hand via data encoding and feature and model selection. Doing this well is arguably the most difficult part of building a successful machine learning system. It would sure be nice if our learning algorithm could make direct use of qualitative domain knowledge, say for model or feature selection, or for reducing the parameter space.  This is the idea behind "knowledge intensive machine learning".&lt;br /&gt;&lt;br /&gt;Probably the best (though almost trivial) example of this is the simple Bayesian network. A Bayes net is a probabilistic model which can be learned from data, but for which the user can very easily specify high-level qualitative domain knowledge (causality or conditional independence) which significantly constrains the parameter space. What else do we have? Not much.&lt;br /&gt;&lt;br /&gt;Here's a motivating example: suppose you are building a simple anti-smoking propaganda Bayes net. You have domain knowledge that tells you which nodes should be connected (for example, smoking and lung cancer). But we know more than that structure: the &lt;span style="font-style: italic;"&gt;more&lt;/span&gt; a person smokes, the &lt;span style="font-style: italic;"&gt;more&lt;/span&gt; at risk they are for lung cancer. The qualitative reasoning folks would write this as "SmokingFrequency Q+ LungCancer", indicating a qualitative (perhaps stochastic) monotonic influence. 
Now, instead of using that statement for qualitative simulation, we use it for putting a prior on our CPTs that enforces stochastic monotonicity, resulting in better classifiers in low-data situations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-110938803838125504?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/110938803838125504/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=110938803838125504' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110938803838125504'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110938803838125504'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/02/knowledge-intensive-machine-learning.html' title='Knowledge Intensive Machine Learning, Part 1'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-11045925.post-110922226414969887</id><published>2005-02-23T21:13:00.000-08:00</published><updated>2005-02-23T21:35:18.826-08:00</updated><title type='text'>"Hello World"</title><content type='html'>My name is Eric Altendorf.  
I'm currently studying with &lt;a href="http://www.cs.orst.edu/%7Etgd"&gt;Tom Dietterich&lt;/a&gt; at &lt;a href="http://www.eecs.orst.edu/"&gt;Oregon State University&lt;/a&gt; (I apologize if that link takes you to a large photograph of cows with glasses; I am absolutely not responsible for the department's graphic design choices). I've been convinced to share some thoughts on machine learning by my friend &lt;a href="http://yaroslavvb.blogspot.com/"&gt;Yaroslav&lt;/a&gt;, though I am quite sure I will not have as much to share as he does.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/11045925-110922226414969887?l=machlearn.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://machlearn.blogspot.com/feeds/110922226414969887/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=11045925&amp;postID=110922226414969887' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110922226414969887'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/11045925/posts/default/110922226414969887'/><link rel='alternate' type='text/html' href='http://machlearn.blogspot.com/2005/02/hello-world.html' title='&quot;Hello World&quot;'/><author><name>E-Rock</name><uri>http://www.blogger.com/profile/05278400667141287464</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry></feed>
