Vapnik–Chervonenkis dimension

just storing tabs here for now:

* https://en.wikipedia.org/wiki/Shattered_set
* https://math.stackexchange.com/questions/96655/how-to-calculate-vapnik-chervonenkis-dimension
* Baum's ''What Is Thought?'' contains a discussion


I think there are three different "views" of the VC dimension:

* in terms of sets/powersets and [[shattering]]
* in terms of fitting parameters for a function
* adversarial/game: to show the VC dimension is at least ''n'', you choose ''n'' points, the adversary chooses the labels, and you must find a hypothesis from the hypothesis class that separates the labels cleanly (see the sketch after this list)
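A rough sketch of the adversarial view for one concrete hypothesis class (closed intervals on the real line, with points inside the interval labeled 1; the function names and the brute-force enumeration here are just my own illustration, not a standard recipe): enumerate every labeling the adversary could pick and check whether some interval realizes it. Two points can always be matched, but the labeling (1, 0, 1) of three points cannot, which is why the VC dimension of intervals is 2.

<syntaxhighlight lang="python">
from itertools import product

def interval_hypotheses(points):
    # candidate intervals [a, b]; endpoints between/around the sample points suffice
    xs = sorted(points)
    cuts = [xs[0] - 1] + [(xs[i] + xs[i + 1]) / 2 for i in range(len(xs) - 1)] + [xs[-1] + 1]
    return [(a, b) for a in cuts for b in cuts if a <= b]

def shattered_by_intervals(points):
    # True iff every labeling the adversary picks is realized by some interval
    hyps = interval_hypotheses(points)
    for labels in product([0, 1], repeat=len(points)):
        if not any(all((a <= x <= b) == bool(y) for x, y in zip(points, labels))
                   for a, b in hyps):
            return False
    return True

print(shattered_by_intervals([1.0, 2.0]))       # True: every labeling of 2 points works
print(shattered_by_intervals([1.0, 2.0, 3.0]))  # False: the labeling (1, 0, 1) fails
</syntaxhighlight>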
questions:

* does this work with more than two labels? (with the power-set view there are obviously only yes/no classifications, but with the other two views you can generalize to more labels; does doing this yield anything useful? a naive brute-force version of this is sketched after this list)
* in the adversarial perspective, why do you get to pick the points? (this is a question about which definition is most useful.) is there a name for the property where you can separate ''all'' points and all labels?
* where does the hypothesis class come from? it seems like "lines", "circles", "convex sets" are some examples used.
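To make the first and third questions slightly more concrete, here is a naive generalization of the brute-force check above: take the hypothesis class as an explicit list of functions and ask whether ''every'' assignment from a finite label set is realizable. This is only a sketch for playing with the question; standard multiclass analogues of the VC dimension are defined differently, and the threshold class below is just an example I picked.

<syntaxhighlight lang="python">
from itertools import product

def realizes_all_labelings(points, hypotheses, labels=(0, 1)):
    # hypotheses: functions point -> label; with labels=(0, 1) this is ordinary shattering
    for labeling in product(labels, repeat=len(points)):   # the adversary's choices
        if not any(all(h(x) == y for x, y in zip(points, labeling)) for h in hypotheses):
            return False
    return True

# example hypothesis class: one-sided thresholds on the real line
thresholds = [lambda x, t=t: int(x >= t) for t in (-10, 0.5, 10)]
print(realizes_all_labelings([0.0], thresholds))        # True:  VC dimension >= 1
print(realizes_all_labelings([0.0, 1.0], thresholds))   # False: the labeling (1, 0) fails
</syntaxhighlight>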
