Set theory is regarded (rightly or wrongly) as a foundation
for mathematics, i.e. a theory in which all mathematics either takes
place, or at least can be embedded. It's not quite clear whether it
is entirely successful in this: is it true that all mathematical
work takes place in set theory? For example, some reasonable
manipulations with the collection of all sets cannot be expressed in
the most standard set theory, $\text{ZFC}$. Also, it is not
really obvious whether most mathematicians think of numbers such as
$1/2$ or $\pi$ as sets, rather than plain numbers, or whether set theory
does not already go beyond ordinary mathematics in some ways.
(We shall see examples of this later.) However, set theory, and in
particular $\text{ZFC}$, is about the best we currently have as a
theory for all mathematics.

It is easy to see why we would like a formal or axiomatic
theory of sets: this is to allow us to apply metamathematical methods
to determine the relative strength of various systems, the strength
of proposed new axioms, and so on. Gödel's theorems tell us we cannot
hope to have a single consistent system for mathematics that proves its own
consistency, but we can at least study relative consistency results.
(Again, we shall see examples of this later.)

So on what basis is such a system to be founded? One necessity is that the system of logic should be formalised too, and there is only one successful logic in which all, or almost all, mathematicians are happy to work: first order logic. So, at least as a starting point, we look for theories of sets in first order logic.

The particular language we take for set theory is one that seems at first sight to be very restrictive. We take the language $L_{\in}$ with the usual first order connectives and quantifiers, equality, and the single binary relation $\in$ for set membership. That this isn't quite as restrictive as it seems can be seen from the definitions of many familiar set theoretical notations. For example, $x\subseteq y$ stands for $\forall z\,(z\in x\to z\in y)$ (note that axiomatic set theory tends to use lower case letters for sets, and also of course the first order implication sign $\to$); $x=\{u,v\}$ stands for $\forall z\,(z\in x\leftrightarrow z=u\vee z=v)$; and $x=\{z:\varphi(z)\}$ stands for $\forall z\,(z\in x\leftrightarrow \varphi(z))$.

In axiomatic set theory, the objects of the theory are all sets. That is, they are
pure sets: their elements are sets, as are the elements of those, and so on.

The idea of an axiomatic theory of sets is that it should provide a
framework for the whole of mathematics. A second consideration is
that, as logicians, we are studying a set as the extension of a property.
We would like, it seems at first glance, to allow
every definable property to have a set of objects associated with it,
more precisely the set of all elements that satisfy this property. If
this is to make sense we should accept the principle that two sets are
equal if and only if they have the same elements. This is called the
*Extensionality axiom*.

**Axiom of Extensionality:**
$\forall x\,\forall y\,(\forall z\,(z\in x\leftrightarrow z\in y)\to x=y)$.

The implication in the other direction, $x=y\to \forall z\,(z\in x\leftrightarrow z\in y)$, follows from first order logic with equality, so we do not have to state it as a separate axiom here. All our theories of sets will include the axiom of Extensionality.
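Extensionality can be illustrated on small pure sets. The following is a minimal sketch, assuming we model hereditarily finite sets by Python's nested `frozenset` values; the helper name `same_elements` is hypothetical and simply spells out the hypothesis $\forall z\,(z\in x\leftrightarrow z\in y)$.

```python
# Assumption: hereditarily finite pure sets are modelled as nested frozensets.
empty = frozenset()          # the empty set
one = frozenset({empty})     # the set whose only element is the empty set


def same_elements(x, y):
    """Hypothesis of Extensionality: every z is in x exactly when it is in y."""
    return all(z in y for z in x) and all(z in x for z in y)


# Two descriptions that differ in order and repetition but not in elements.
a = frozenset([empty, one])
b = frozenset([one, empty, one])

print(same_elements(a, b))   # the two sets have the same elements
print(a == b)                # so they are equal as sets
```

In this model the order and multiplicity used to describe a set play no role: only membership matters, which is what the axiom asserts.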

Frege's theory of sets was to take extensionality plus the axiom scheme of comprehension,

**Axiom Scheme of Comprehension:**
$\forall \bar{a}\,\exists x\,\forall z\,(z\in x\leftrightarrow \varphi(z,\bar{a}))$
for all first order $\varphi(z,\bar{a})$.

This makes a very natural theory, exactly in accordance with our idea that sets should correspond to (first order) properties. Frege spent many years of his life developing his theory as a mathematical-cum-philosophical theory of property, and as a basis for the idea of number. For Frege, a number is the property of a set describing the size of that set: a set has 6 elements if it has the property of having size 6, and hence is a member of the set 6. This is an elegant idea to get the counting numbers off the ground. Unfortunately, Russell proved Frege's theory inconsistent. Taking $\varphi(z)$ to be $z\notin z$ in the comprehension scheme, any model of Frege's theory would have to have an element $R$ such that $\forall z\,(z\in R\leftrightarrow z\notin z)$. But then, putting $z=R$, we get $R\in R\leftrightarrow R\notin R$, which is contradictory.
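Russell's argument does not depend on what the membership relation is, and this can be checked by brute force on very small structures. The sketch below is a finite sanity check rather than a proof: it enumerates every binary relation on a universe of $n$ points, treats it as a candidate membership relation, and looks for an element $R$ with $\forall z\,(z\in R\leftrightarrow z\notin z)$. The function names `has_russell_element` and `count_counterexamples` are hypothetical.

```python
from itertools import product


def has_russell_element(universe, mem):
    """True if some R satisfies: for all z, (z mem R) <-> not (z mem z)."""
    return any(
        all(((z, R) in mem) == ((z, z) not in mem) for z in universe)
        for R in universe
    )


def count_counterexamples(n):
    """Count membership relations on {0, ..., n-1} that admit a Russell element."""
    universe = range(n)
    pairs = [(z, x) for z in universe for x in universe]
    found = 0
    # Each choice of truth values for the pairs is one candidate relation.
    for bits in product([False, True], repeat=len(pairs)):
        mem = {p for p, bit in zip(pairs, bits) if bit}
        if has_russell_element(universe, mem):
            found += 1
    return found


for n in range(1, 4):
    print(n, count_counterexamples(n))
```

Every count is zero, because the condition already fails at $z=R$, exactly as in the argument above.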

A number of alternatives to Frege's theory have been proposed, all with some weaknesses or other. The main one of these is Zermelo-Fraenkel set theory, proposed by Zermelo and developed further by Fraenkel and Skolem, which is discussed later. However, others have also suggested modifications of Frege's comprehension axiom, with a certain amount of success. Here are some of these, in no particular order.

- Russell suggested that the language of set theory should be replaced by a typed language in which $z\notin z$ could not even be stated. His theory, the theory of types, is consistent but not very strong.
- Quine suggested that Russell's idea of typing could be applied to the comprehension axiom scheme alone, so that $\{z:z\notin z\}$ is not provided by this axiom scheme. The resulting theory, NF, can prove assertions not available in Russell's type theory, such as $\exists x\,x\in x$. However, the consistency of NF (relative to some standard theory such as ZF) is a notoriously difficult problem, and highlights the fact that Quine's suggestion is a syntactic trick that doesn't provide semantic insight into sets. A variant where the axiom of extensionality is weakened, NFU, is known to be consistent.
- A number of people (Malitz, Hinnion, Forti, Weydert and Honsell) have pointed out that the problem with the Russell set could be the negation sign (the "not" in $z\notin z$); this leads to positive set theory, which is weak but consistent, with some well-understood models.
- Krajíček has suggested that modal logic can rescue Frege's comprehension scheme: $\forall \bar{a}\,\exists x\,\forall z\,(z\in x\leftrightarrow \varphi(z,\bar{a}))$. The consistency of this scheme is open and seems to depend on the choice of modal logic used.

As an alternative, Zermelo started in a more piecemeal way, adding axioms of set theory that follow from comprehension, as and when they are needed. We present a few of these now.

**Axiom of Empty Set:**
$\exists x\,\forall y\,y\notin x$.

**Axiom of Pair Set:**
$\forall x\,\forall y\,\exists z\,\forall w\,(w\in z\leftrightarrow w=x\vee w=y)$.

**Axiom of Sum Set:**
$\forall x\,\exists y\,\forall z\,(z\in y\leftrightarrow \exists w\,(z\in w\wedge w\in x))$.

**Axiom of Power Set:**
$\forall x\,\exists y\,\forall z\,(z\in y\leftrightarrow \forall w\,(w\in z\to w\in x))$.

**Axiom Scheme of Separation:**
$\forall \bar{a}\,\forall b\,\exists x\,\forall z\,(z\in x\leftrightarrow z\in b\wedge \varphi(z,\bar{a}))$.

This doesn't complete the axioms, but makes a start. The set whose existence is stated in the pair set axiom is usually written $\{x,y\}$. That in the sum set axiom is $\bigcup x$. (This is a kind of union; more precisely, it is the union of all elements of $x$.) And that in the power set axiom is $P(x)$. As before, separation is an axiom scheme, with one axiom for each first order $\varphi$. The idea is that comprehension may only be used to form subsets of sets already constructed.
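On hereditarily finite sets, each of these axioms corresponds to a simple operation. The following is a minimal sketch, assuming pure sets are modelled by nested `frozenset` values; the function names `pair`, `sum_set`, `power_set` and `separate` are hypothetical, and `phi` is an ordinary Python predicate standing in for a first order formula.

```python
from itertools import combinations


def pair(x, y):
    """Pair set {x, y}: w is a member iff w == x or w == y."""
    return frozenset({x, y})


def sum_set(x):
    """Sum set of x: z is a member iff z is in w for some w in x."""
    return frozenset(z for w in x for z in w)


def power_set(x):
    """Power set P(x): z is a member iff every element of z is in x."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )


def separate(b, phi):
    """Separation {z in b : phi(z)}: always a subset of the given set b."""
    return frozenset(z for z in b if phi(z))


empty = frozenset()        # Empty Set
one = pair(empty, empty)   # the set containing only the empty set
two = pair(empty, one)     # the set containing empty and one

print(sum_set(pair(one, two)) == two)
print(len(power_set(two)))
print(separate(two, lambda z: z not in z) == two)
```

Note the last line: applying separation to the Russell property $z\notin z$ inside a given set $b$ just returns a subset of $b$ (here all of `two`), so no contradiction arises.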

Some formulations of set theory are based on a conception of the construction
of sets in stages. The cumulative hierarchy is the main one quoted, especially
to back up Zermelo's theory. Against this it has been argued that even the axiom of separation
is too strong and difficult to justify, since in $\{x\in y:\varphi(x)\}$ the quantifiers
in $\varphi(x)$ have to range over the whole universe before that universe has been created.
Those who take this view argue for weaker *predicative* theories, in which definitions are presented in the correct order.
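The picture of sets built in stages can be made concrete for the first few finite stages. The sketch below assumes the usual definition of the finite levels of the cumulative hierarchy, $V_0=\emptyset$ and $V_{k+1}=P(V_k)$, which is not spelled out in the text above; the names `power_set` and `stages` are hypothetical, and sets are again modelled as nested `frozenset` values.

```python
from itertools import combinations


def power_set(x):
    """All subsets of x, each represented as a frozenset."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )


def stages(n):
    """Return [V_0, ..., V_n] where V_0 is empty and V_{k+1} = P(V_k)."""
    levels = [frozenset()]
    for _ in range(n):
        levels.append(power_set(levels[-1]))
    return levels


levels = stages(4)
sizes = [len(v) for v in levels]
print(sizes)  # [0, 1, 2, 4, 16]
```

Each stage contains the previous one, and the sizes grow as iterated powers of two, so it is not practical to compute many more stages this way.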