CONCEPT LEARNING AS SEARCH

 

CONCEPT LEARNING AS SEARCH

·       Concept learning can be viewed as

·       the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.

·       The goal of this search is to

·       find the hypothesis that best fits the training examples.


Example:

        Consider the instances X and hypotheses H in the EnjoySport learning task.

        The attribute

        Sky has three possible values, and

        AirTemp, Humidity, Wind, Water, Forecast each have two possible values,

        the instance space X contains

        exactly 3*2*2*2*2*2 = 96 distinct instances

        5*4*4*4*4*4 = 5120 syntactically distinct hypotheses within H.

        Every hypothesis containing one or more "Φ" symbols represents the empty set of instances; that is, it classifies every instance as negative.

        1 + (4*3*3*3*3*3) = 973 Semantically distinct hypotheses.

A CONCEPT LEARNING TASK – Instance Space



Hypothesis Space

ü  Similarly there are 5 * 4 * 4 * 4 * 4 * 4 = 5120 syntactically distinct hypotheses within H.

ü  Notice, however, that every hypothesis containing one or more "ø" symbols represents the empty set of instances; that is, it classifies every instance as negative.

ü  Therefore, the number of semantically distinct hypotheses is only 1 + (4 *3 * 3 * 3 * 3 * 3) = 973.

ü  Our EnjoySport example is a very simple learning task, with a relatively small, finite hypothesis space.



General-to-Specific Ordering of Hypotheses


        To illustrate the general-to-specific ordering, consider the two hypotheses

h1 = (Sunny, ?, ?, Strong, ?, ?)

h2 = (Sunny, ?, ?, ?, ?, ?)

        Now consider the sets of instances that are classified positive by h1 and by h2. Because h2 imposes fewer constraints on the instance, it classifies more instances as positive.

        In fact, any instance classified positive by h1 will also be classified positive by h2. Therefore, we say that h2 is more general than h1.

 



More General Than hypothesis



 



·       In the figure,

ü  the box on the left represents the set X of all instances,

ü  the box on the right the set H of all hypotheses.

·       Each hypothesis corresponds to some subset of X

ü  – the subset of instances that it classifies positive.

·       The arrows connecting hypotheses represent

ü   the more - general -than relation,

ü  with the arrow pointing toward the less general hypothesis.

·       Note the subset of instances characterized by

ü     h2 subsumes the subset characterized by h1,

ü     hence  h2 is more - general– than h1.


Comments