Context spaces
;Definition : A ''context space'' is a triple <math>(X,I,Y)</math> of sets where <math>Y=\{y_i | i\in I\}</math>, each <math>y_i\subset X</math>, and, for every element <math>x\in X</math>, the singleton <math>\{x\}</math> is the intersection of some subset of <math>Y</math>.

== Motivation and intuition ==

We can think of two people - Alice and Bob - trying to communicate; our goal is to model their communication in a very general way. Alice and Bob both think in terms of a certain context space <math>(X, I, Y)</math>, and Alice has chosen a particular <math>x\in X</math> to convey to Bob. But she can only communicate elements of the set <math>I</math>, so she sends over a subset <math>J\subset I</math> which is meant to indicate the element <math>\cap_{j\in J}y_j</math>.

This may sound vaguely similar to information theory, but we are not focusing on probabilistic models or noise. The idea here is simpler and more idealized: we assume everything Alice says reaches Bob intact, and we do not treat likelihoods as a basic element of the study. In addition, traditional information theory implicitly assumes a sequential model of communication, whereas a context space is inspired by a setting without implicit order.

The name ''context space'' was chosen based on the very intuitive idea that
<blockquote>
meaning = data + context.
</blockquote>
As a thought experiment, imagine we have a file which was recovered but whose file format is unknown. If we open the file as an image in a particular format, we see a picture of a puppy dog. If we treat the file as a text file, it is a recipe for a grilled cheese sandwich. If we don't know the origin of the file, we really have no way to know what the intended meaning was; hence the informal idea that understanding requires both raw data (i.e. the file itself) and a ''context'' in which to interpret that data. We can imagine the same sort of ambiguity arising from a phonetic phrase which has distinct meanings in different languages. To be even more realistic, we can consider certain hand gestures known to be variously polite or not depending on the culture, or puns, which by design have meaning in multiple contexts.

;Example : Let <math>X = \{a, b, c, d, e, f\}, I = \{1, 2, 3, 4\}</math>, and <math>y_1 = \{a, b, c\},\; y_2 = \{a, d, e\},\; y_3 = \{b, d, f\},\; y_4 = \{c, e, f\}</math>. To convey that she's thinking about element <math>c\in X</math>, Alice can send the set <math>\{1, 4\}</math>, since <math>y_1\cap y_4 = \{c\}</math>. From a naive perspective, this may seem like more work, since it is 'two pieces' of information (the numbers 1 and 4) rather than one (the letter c). But the interesting thing here is that <math>|Y| < |X|</math>. If we measure ease of communication by <math>|Y|</math>, then using this context space makes the information easier to send.

It's not obvious at this point that we can always find a <math>|Y|</math> much smaller than <math>|X|</math>, or that doing so actually corresponds to any kind of real-world efficiency. The following definitions and propositions will allow us to better understand the utility of context spaces.
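To make this concrete, here is a quick Python sketch of the example (the helper names <code>elt</code> and <code>is_context_space</code> are just for illustration); it confirms that the message <math>\{1, 4\}</math> singles out <math>c</math>, and that every element of <math>X</math> can be isolated this way:

<pre>
from itertools import combinations

# The running example: X = {a, ..., f} and the four subsets y_1, ..., y_4.
Y = {
    1: {'a', 'b', 'c'},
    2: {'a', 'd', 'e'},
    3: {'b', 'd', 'f'},
    4: {'c', 'e', 'f'},
}
X = set().union(*Y.values())

def elt(iota):
    """Intersection of the sets y_i for i in iota."""
    result = set(X)
    for i in iota:
        result &= Y[i]
    return result

def is_context_space(X, Y):
    """Check that every singleton {x} is an intersection from Y."""
    index_sets = [J for r in range(1, len(Y) + 1)
                  for J in combinations(Y, r)]
    return all(any(elt(J) == {x} for J in index_sets) for x in X)

print(elt({1, 4}))             # {'c'}: Alice's message for element c
print(is_context_space(X, Y))  # True
</pre>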
== Duality ==

If we consider the abstract structure of a context space (i.e. if we temporarily forget the rule that every singleton is an intersection from <math>Y</math>), then we arrive at a very natural sense of duality.

; Definitions : We'll refer to any triple of sets <math>(X, I, Y)</math> in which <math>Y = \{y_i | i\in I\}</math> as a ''subset space''. For any element <math>x\in X</math>, define <math>\mathsf{rep}(x) := \{i\in I | x\in y_i\}</math>; this is meant to capture the best 'representation' of <math>x</math> available as a subset of <math>I</math>. Now we can define the ''dual of a subset space'' <math>(X, I, Y)</math> as <math>(X, I, Y)^* := (I, X, W)</math>, where <math>W = \{\mathsf{rep}(x) | x\in X\}</math>. From this definition, we'll use the notation <math>W(X, I, Y)</math> to indicate the set <math>W = \{\mathsf{rep}(x) | x\in X\}</math> for a given subset space <math>(X, I, Y)</math>.

; Property : <math>(X, I, Y)^{**} = (X, I, Y)</math> for any subset space.

; Proof : (Informal). We can identify every subset space with a <math>\{0, 1\}</math> matrix <math>A = (a_{ix})_{i\in I, x\in X}</math> with entries given by <math>a_{ix} = 1(x\in y_i)</math>. [The notation <math>1(a\in b)</math> denotes the function with value 1 when <math>a\in b</math>, and value 0 otherwise.] Each row (fixed <math>i</math>) gives the indicator function of the corresponding <math>y_i</math>, and each column (fixed <math>x</math>) gives <math>\mathsf{rep}(x)</math>. In this form, the dual is the matrix transpose, so it becomes clear that a double transpose returns the original matrix. It would be easy to write out a more rigorous proof, but I think this analogy gives a good understanding of why the duality works. <math>\Box</math>

== Context spaces are antichains ==

The following theorem reveals that context spaces are actually very natural objects, and ones which may admit some deep and general insights. By the way, an ''antichain'' is a set of mutually incomparable objects with respect to a certain partial ordering.

; The Antichain Theorem : A subset space <math>(X, I, Y)</math> is a context space iff <math>\cup Y = X</math>, distinct elements of <math>X</math> have distinct representations, and <math>W=W(X,I,Y)</math> is an antichain with respect to the subset relation, i.e. both <math>w_1\not\subset w_2</math> and <math>w_2\not\subset w_1</math> for any distinct subsets <math>w_1, w_2 \in W</math>. Equivalently, given a subset space <math>(X, I, W)</math> whose sets are nonempty and pairwise distinct, its dual is a context space iff <math>W</math> is an antichain. (The distinct-representation condition is needed: if two elements of <math>X</math> shared a representation, no intersection from <math>Y</math> could separate them.)

The proof (below) is straightforward, but will be a bit more elegant if we first analyze a pair of useful functions.

=== Basic properties of the elt and rep functions ===

We previously introduced the <math>\mathsf{rep}</math> function as a mapping <math>X\to 2^I</math>; from here on we'll re-interpret this as a set function <math>\mathsf{rep}:2^X\to 2^I</math> by taking the intersection of the singletons' images:

<math>\forall \chi\subset X, \mathsf{rep}(\chi) := \{i\in I | \chi\subset y_i\}.</math>

We'll now define the closely related function <math>\mathsf{elt}:2^I\to 2^X</math> as

<math>\forall \iota\subset I, \mathsf{elt}(\iota) := \cap_{i\in\iota}y_i.</math>

This pair of functions satisfies many nice properties. Without proof here (for brevity): we claim that they switch roles under duality, and that each can be computed as an intersection of its values on singletons. In addition, given the pair <math>(X,I)</math>, any one of the objects <math>Y, W, \mathsf{rep},</math> or <math>\mathsf{elt}</math> is sufficient to know everything about the context space <math>(X, I, Y)</math>. These are all straightforward to prove, as is the following

; Property :
* If <math>a\subset b\subset X</math>, then <math>\mathsf{rep}(b)\subset\mathsf{rep}(a)</math>.
* If <math>a\subset b\subset I</math>, then <math>\mathsf{elt}(b)\subset\mathsf{elt}(a)</math>.
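Continuing the sketch from the example above (same illustrative names), we can compute <math>\mathsf{rep}</math>, watch the double dual return the original family, and check the conditions of the theorem:

<pre>
def rep(chi):
    """rep of a subset chi of X: the indices i whose set y_i contains chi."""
    return {i for i, y in Y.items() if chi <= y}

# Antitone property: enlarging the subset can only shrink its representation.
assert rep({'a', 'b'}) <= rep({'a'})

def dual(family, ground):
    """Dual subset space: rows and columns of the 0/1 matrix swap roles."""
    return {g: {i for i in family if g in family[i]} for g in ground}

W = dual(Y, X)              # W[x] = rep({x}) for each element x
Ydd = dual(W, set(Y))       # the dual of the dual...
assert Ydd == Y             # ...is the original family (double transpose)

reps = [frozenset(w) for w in W.values()]
print(all(not (u < v) for u in reps for v in reps))  # True: W is an antichain
print(len(set(reps)) == len(X))  # True: representations are distinct
</pre>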
We are now ready to prove the theorem.

; Proof of the antichain theorem :
The <math>\Rightarrow</math> direction:
<blockquote>
Suppose that <math>w_1\subset w_2</math>, where <math>w_1 = \mathsf{rep}(x_1)</math> and <math>w_2 = \mathsf{rep}(x_2)</math> for distinct elements <math>x_1, x_2\in X</math>. By the above property, <math>\mathsf{elt}(w_2)\subset\mathsf{elt}(w_1)</math>, so <math>x_2\in\mathsf{elt}(w_2)\subset\mathsf{elt}(w_1)\ni x_1</math>, and hence <math>\{x_1, x_2\}\subset\mathsf{elt}[\mathsf{rep}(x_1)]</math>. So every set in <math>Y</math> containing <math>x_1</math> also contains <math>x_2</math>, and no intersection from <math>Y</math> equals the singleton <math>\{x_1\}</math>; i.e. <math>(X, I, Y)</math> cannot be a context space. (The same argument applies verbatim if <math>\mathsf{rep}(x_1) = \mathsf{rep}(x_2)</math>; and if <math>\cup Y\neq X</math>, an element lying in no <math>y_i</math> clearly cannot be isolated by an intersection from <math>Y</math>.)
</blockquote>
The <math>\Leftarrow</math> direction:
<blockquote>
Suppose <math>\cup Y = X</math>, representations are distinct, and <math>W</math> is an antichain. For any <math>x_1\in X</math>, we know that <math>x_1\in\mathsf{elt}[\mathsf{rep}(x_1)]</math>, and <math>\mathsf{rep}(x_1)\neq\emptyset</math> since <math>\cup Y = X</math>; our goal is to show that any other element <math>x_2</math> is ''not'' in this set. Since the representations are distinct and form an antichain, <math>\mathsf{rep}(x_1)\not\subset\mathsf{rep}(x_2)</math>, so there must be some index <math>i_0\in \mathsf{rep}(x_1)-\mathsf{rep}(x_2)</math>, with <math>x_1\in y_{i_0} \not\ni x_2</math>. Since <math>i_0\in\mathsf{rep}(x_1)</math>, we have <math>\mathsf{elt}[\mathsf{rep}(x_1)]\subset y_{i_0}</math>, and thus <math>x_2\not\in\mathsf{elt}[\mathsf{rep}(x_1)]</math>, which was our goal. <math>\Box</math>
</blockquote>

== A lower bound from the antichain theorem ==

With the antichain theorem, we can better understand how small <math>|Y|</math> can be compared to <math>|X|</math>.

; Property : If <math>n=|Y|</math> in a context space <math>(X,I,Y)</math>, then <math>|X|\le {n\choose \lfloor n/2 \rfloor}</math>. The bound is achieved when <math>W</math> is the set of all subsets of <math>I</math> of size <math>\lfloor n/2 \rfloor</math>; equivalently, when <math>X</math> consists of all <math>\lfloor n/2 \rfloor</math>-subsets of <math>I=\{1,\ldots,n\}</math> and <math>y_i = \{x\in X | i\in x\}</math> (so that <math>\mathsf{rep}(x)=x</math>).

This fact is basically a restatement of [http://en.wikipedia.org/wiki/Sperner_family Sperner's theorem]. Asymptotically, this means we can construct a family of context spaces <math>\bigg[C_n = (X_n, I_n, Y_n)\bigg]_{n=1}^\infty</math> where <math>n=|Y_n|</math> and <math>|X_n|\sim 2^n\sqrt{\frac{2}{\pi n}}</math> (by the above property and Stirling's formula). In other words, the set of represented elements (<math>X</math>) can be exponentially bigger than the set of available pieces of information (<math>Y</math>).

Keep in mind that this is without any reference to positional or sequential information. This is significant since our usual decimal representation fundamentally uses the position and known length of a string of digits as a major piece of information. To put this in perspective, consider how little you know about a number given only that its digits form the set <math>\{1, 2, 7, 9\}</math>. This is not to argue that our traditional number systems are bad, but rather that it would be inappropriate to directly compare the above asymptotic growth against something like <math>d^n</math>, the number of elements representable by length-<math>n</math> strings over <math>d</math> digits.
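The equality case is easy to check directly. The following sketch (standard library only; <code>middle_layer_space</code> is an invented name) builds the extremal context space for a given <math>n</math> and confirms that <math>|X|</math> meets the bound:

<pre>
from itertools import combinations
from math import comb

def middle_layer_space(n):
    """X = all floor(n/2)-subsets of I = {1..n}; y_i = those x containing i.

    Then rep(x) = x, so W is the middle layer of 2^I: the largest
    antichain, by Sperner's theorem."""
    I = range(1, n + 1)
    X = [frozenset(c) for c in combinations(I, n // 2)]
    Y = {i: {x for x in X if i in x} for i in I}
    return X, Y

n = 6
X_n, Y_n = middle_layer_space(n)
print(len(X_n), comb(n, n // 2))  # 20 20: the Sperner bound is met

# Each x is isolated by intersecting exactly the y_i with i in x:
x0 = X_n[0]
inter = set(X_n)
for i in x0:
    inter &= Y_n[i]
print(inter == {x0})              # True
</pre>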
== Possible future work ==

As an extremely general but (as far as I know) not-yet-well-studied object, context spaces offer a plethora of research opportunities. Here are a few ideas.

; Computability and complexity theory
<blockquote>
* We can think of any algorithm as essentially converting one representation of data into another. For example, a Turing machine that recognizes a language <math>L</math> can be thought of as converting an input context space (which depends on the problem) into another space with the single element <math>L\in Y</math>. From that point of view, it's easy to justify that we cannot require such a machine to guarantee termination when its input is not in <math>L</math> (since it has nothing to output in that case). Thus a context space naturally handles the distinction between Turing recognizability and decidability.
* I've already built a Turing-complete model of computation out of context spaces in a very natural way. Because this model is very general and is built from easily interchangeable, axiom-like building blocks, it may lend itself to a theorem that solidifies the currently nonrigorous folk knowledge of the Church-Turing thesis. More specifically, we may be able to codify more carefully when a system is or is not Turing complete, and to classify any theoretical systems which may have "more power" than this.
* In addition, it may be interesting to look for lower bounds on the complexity of algorithms, or of a class of related algorithms, using the new model. No progress has been made in this direction yet.
</blockquote>

; Coding theory : We might be able to generalize many ideas from traditional coding theory which assume sequential or mutually exclusive pieces of data.
<blockquote>
* Entropy and redundancy - The traditional idea of entropy is built around the thought that a question has a set of mutually exclusive answers. In reality, there are many situations in which this is false. For example, if we ask someone their profession, many people could fairly give several different descriptions of what they do. We get one answer, but the resulting subsets overlap significantly. Hence it could be useful to generalize entropy, as a measure of uncertainty, to a question whose answer is an element of <math>Y</math>, and to measure the redundancy of such a space or question.
* Compression - As we saw in the bound resulting from the antichain theorem, it seems likely that we can derive bounds on the compressibility of certain representation techniques, as well as methods by which to achieve them.
* Error correction - If we abstract the entropy and compression results, it is our duty as reliable mathematicians to extend error correction ideas as well :)
</blockquote>

; Looking for better representation systems
<blockquote>
* Ever since I learned about Roman numerals, I have wondered whether our current system of numeric representation is really optimal. The setting of context spaces could allow us to make some practical restrictions on which systems we consider, and actually prove (or disprove!) the optimality of traditional base-<math>n</math> systems within those restrictions. In addition to the results of the antichain theorem, we may be able to arrive at even better representation systems in other settings (e.g. non-sequential, or only partially sequential data).
* From a certain point of view, 3 is the best traditional base. I should make another post about that. This may partially justify why DNA "chose" to be base 4 (nearly optimal, from this perspective), or give us motivation to attempt to build more systems which work in base 3.
</blockquote>

; Posys theory
<blockquote>
* By ''posys theory'' I'm referring to a theory of generalized proofs in which a statement follows from a set of previous statements. I take this terminology from Chris Altomare, although I don't know how well known the work is. Here is one link with context spaces: suppose that, for <math>\iota\subset I</math>, we have <math>\mathsf{elt}(\iota)\subset y_i</math>. We can think of this as indicating that "knowing" <math>\iota</math> makes the information of having property <math>i</math> completely redundant; we could summarize this as <math>\iota\Rightarrow i</math>. From here we can study this type of redundancy, or links within <math>I</math>, from the perspective of proof systems (a small sketch appears after this list). For example, is it true that any proof system can be represented this way by a context space? Intuitively, I would guess that any nontrivial proof system can.
</blockquote>
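Here is the small sketch promised above: a toy check of the <math>\iota\Rightarrow i</math> relation on a made-up three-set family (chosen only so that a nontrivial implication appears):

<pre>
# A small subset family in which a nontrivial implication appears.
Y2 = {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'a', 'b', 'c'}}

def implies(family, iota, i):
    """iota => i: everything with all properties in iota has property i,
    i.e. elt(iota) is a subset of y_i."""
    inter = set.intersection(*(family[j] for j in iota))
    return inter <= family[i]

print(implies(Y2, {1}, 3))      # True: y_1 is contained in y_3
print(implies(Y2, {1, 2}, 3))   # True: elt({1,2}) = {'b'} lies in y_3
print(implies(Y2, {2}, 1))      # False: 'c' has property 2 but not 1
</pre>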
; Zero-knowledge proofs
<blockquote>
* A [http://en.wikipedia.org/wiki/Zero-knowledge_proof zero-knowledge proof] (ZKP) is an interactive protocol that allows one party to confirm, for practical purposes, that another knows <math>x</math> without the value of <math>x</math> itself being revealed. It seems that context spaces provide a very natural setting within which to arrive at general ZKP results, as well as a potential uniform language in which to present all such protocols.
</blockquote>

; Game theory
<blockquote>
* Although it doesn't usually fall under the now-classic study of ''game theory'', the game of [http://en.wikipedia.org/wiki/Mastermind_(board_game) Mastermind] was actually the setting I was originally trying to model, and it led to the definition of a context space. To generalize this, there may be many interactions (including those of classic game theory) in which a player may take many actions, and we may systematically categorize and study these actions as pertaining to a context space. From one point of view, we may assume that each player has a hidden state <math>x\in X</math>, and interpret each action as revealing another element <math>i\in I</math> which gives a clue about that hidden state. From another perspective, we may consider games as having an element of chance, which we could model as an extra player whose moves are never publicly revealed, though their consequences are. In reality, such a player could model the mood of the market for competing companies, or other external factors such as the weather. Now we can consider any set of known-player moves as resulting not in a single deterministic outcome, but in a set of possible outcomes whose final value depends on the hidden player. In this setting, each player's moves are similar to choosing elements of <math>I_1</math> from one context space; at the same time, if we temporarily assume the hidden player has a static state, each outcome of the hidden player could be another piece of information <math>i\in I_2</math> in another (related) context space.
</blockquote>

; General questions
<blockquote>
* Can we immediately achieve some other interesting facts about context spaces by applying other equivalent objects, as we did for Sperner families? From some point of view, each of the following objects is equivalent to a subset space:
** Binary matrices
** Hypergraphs
** Bipartite graphs
** Relations
* What is the natural morphism between two context spaces? Can we elegantly build a category for these objects?
* Which interactions/communications in real life are better modeled non-sequentially than sequentially?
</blockquote>

[[Category: Math]]