
Linear Algebra Ch1: Vectors and Vector Spaces

Author: 走歪的工程師James

Click here for the Chinese version

This series is the culmination of the last 4 months I spent studying linear algebra. I intend to make a video series on linear algebra, and this text version will serve as a stepping stone: it lets me organize the logic and content to include in the videos, so that I can later focus entirely on crafting the animations without splitting my attention.

Most of the content is extracted from Linear Algebra Done Right and rephrased in my own words. The numbering of all the definitions and theorems matches the original text so that the reader can refer back to it.

The purpose of this series is to provide an easier starting point for beginning learners, not to replace the original text or serve as a standalone resource for learning. I encourage motivated readers to read the original text for deeper understanding and insights.

The original text provides many exercises for readers to complete. Maybe too many! And I think doing exercises is only effective up to a certain point. The main goal is to understand the key concepts. You will forget a lot of the details after you are done with the book anyway. Hence, I will pick a selection of the problems for readers to complete for optimal efficiency.

Math cannot be learnt without actually writing it down on paper and thinking it through step by step. So I highly recommend doing these exercises. Refer to the original text or use help from AI to facilitate your learning. They will definitely be of great help!

1.A: Vectors

Before taking linear algebra, most of us have seen a lot of vectors. In high school math, what we regarded as vectors were arrows or points on the 2D plane, such as \left[\begin{array}{r}2 \\ 1 \end{array}\right].

We've also seen arrows and points in 3D space, like \left[\begin{array}{r}2 \\ 1 \\ -2 \end{array}\right]. But vectors are not limited to 2- or 3-dimensional spaces. We can extend this concept to n-dimensional spaces.

\left[\begin{array}{r}x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right]

Up to this point, we have denoted vectors by vertical n \times 1 matrices. Conventionally, we call vertical n \times 1 matrices vectors, not horizontal 1 \times n matrices.

Horizontal matrices such as \left[\begin{array}{rr}2 & 1\end{array}\right] are called row vectors.

Another way to denote (column) vectors is with parentheses. For example, both of these notations denote the same vector:

\left[\begin{array}{r}-1.1 \\ 0.0 \\ 3.6 \\ -7.2\end{array}\right] = (-1.1, 0.0, 3.6, -7.2)

More generally,

\left[\begin{array}{r}x_1 \\ x_2 \\ \vdots \\ x_n\end{array}\right] = (x_1, x_2, \ldots, x_n)
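Throughout this series I'll occasionally sketch ideas in code. Here's a minimal example (in Python with NumPy, purely my choice for illustration, not something the book uses) showing the two views of the same vector:

```python
import numpy as np

# The 4-vector (-1.1, 0.0, 3.6, -7.2) as a 1-D array
v = np.array([-1.1, 0.0, 3.6, -7.2])

# The same data viewed as a 4 x 1 column matrix
col = v.reshape(-1, 1)

print(v.shape)    # (4,)   -> the tuple notation (x_1, ..., x_n)
print(col.shape)  # (4, 1) -> the vertical n x 1 matrix notation
```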

Besides arrows or points in spaces, vectors can take on many physical meanings:

Color Codes

Colors can be expressed by RGB vectors, with one entry each for the red, green, and blue components. (figure: example colors and their RGB values)

Asset Allocation

An asset allocation across four assets can be represented by (100, 50, 25, 10).

Stock Returns

A vector can represent the daily returns of a stock. (-0.022, +0.014, +0.004), for example, can represent a stock that went down 2.2% on the first day, up 1.4% the next day, and up again 0.4% on the third day.

Cash Flow

Cash flow can be represented by a vector as well. Suppose each entry gives the cash flow in a quarter. Then (1000, -10, -10, -1010) represents a 1-year loan of $1000 with 1% quarterly interest payments, and the principal plus the last interest payment due at the end.

Images

(figure: a grayscale image)

Above is a 256 \times 256 image. Flattening it gives us a 65536-vector that represents the image:

(0.792, 0.788, 0.819, 0.792, 0.795, \dots, 0.803, 0.803, 0.807, 0.780, 0.835)

Videos

A grayscale video composed of K frames, each M \times N, can be represented by a KMN-vector.
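As a quick sketch of the flattening idea (a minimal NumPy illustration with made-up sizes, not taken from the book):

```python
import numpy as np

# A stand-in 256 x 256 grayscale image with random pixel intensities
image = np.random.rand(256, 256)
x = image.flatten()
print(x.shape)  # (65536,) -> the image as a 65536-vector

# A stand-in video: K frames, each M x N, flattens to a KMN-vector
K, M, N = 30, 64, 48
video = np.random.rand(K, M, N)
print(video.flatten().shape)  # (92160,) == K * M * N
```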

Other Examples

There are many other examples I can't fit into this article. Here are some of them. Try to think of a way to represent these data with vectors:

  • quantities of products held
  • values across a population (e.g. blood pressure)
  • probabilities (e.g. coin flip)
  • time series (temperature measurements)
  • customer purchases
  • features or attributes of an entity
  • frequencies (e.g. word count)

Notations

One of the biggest mistakes I've made with math is the mindset of "as long as it's understandable" and not appreciating the importance of writing math in a neat, organized fashion. I used to write down just a pile of numbers to show the process of my calculation. It was through learning how to write proofs that I understood how to use language and words to explain and illustrate what you are doing with the numbers and notation. Knowing how to do this will make your math much more readable and understandable.

Knowing the notations well is necessary in order to translate your thoughts into clear and rigorous expressions.

Think of code: using your notation well is equivalent to naming your variables well, and explaining each step in your reasoning is equivalent to writing readable code with meaningful commit messages and comments.

Now we will look at notations that you will often see and use in linear algebra. It is crucial to familiarize yourself with them.

zero vectors

For convenience, we have cleaner notations for zero vectors. For example:

0_3 = (0, 0, 0)

We often leave out the subscript as well. In this example, the dimensionality of 0 is implied by the context. So we can write:

(3,4,1) + 0 = (3,4,1) + (0, 0, 0)

unit vectors

The i-th unit vector refers to the vector that has 1 in the i-th position and 0 everywhere else.

e_1=\left[\begin{array}{l}1 \\ 0 \\ 0\end{array}\right], \quad e_2=\left[\begin{array}{l}0 \\ 1 \\ 0\end{array}\right], \quad e_3=\left[\begin{array}{l}0 \\ 0 \\ 1\end{array}\right]

In other words:

\left(e_i\right)_j= \begin{cases}1 & j=i \\ 0 & j \neq i\end{cases}

Here, e_i is a vector, and (e_i)_j is a number (the j-th element).

ones vectors

We use the notation \mathbf{1}_n for the n-vector with all its elements equal to 1. As with zero vectors, the size is usually determined from the context, in which case we just write \mathbf{1}.
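Here's how these special vectors look in code (a NumPy sketch of my own, for illustration):

```python
import numpy as np

n = 3

zero = np.zeros(n)  # the zero vector 0_3 = (0, 0, 0)
ones = np.ones(n)   # the ones vector 1_3 = (1, 1, 1)

# The unit vectors e_1, ..., e_n are the columns of the identity matrix
e = [np.eye(n)[:, i] for i in range(n)]
print(e[0])  # [1. 0. 0.] -> e_1

# (e_i)_j is 1 when j == i and 0 otherwise
# (note: Python indices start at 0, unlike the 1-indexed math notation)
print(e[0][0], e[0][2])  # 1.0 0.0
```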

Definition of Vector and Notations

Lists

Having looked at various examples, we've familiarized ourselves with vectors. However, we have not defined what a vector truly is. Before we do that, let us define lists.

Definition 1.8: list

  • Suppose n is a non-negative integer. A list of length n is an ordered collection of n elements. (An element could be a number, a list, or another object.)
  • Two lists are equal if they are equal in length and have the same elements in the same order.

Some examples of valid lists:

  • (1, 2)
  • ()
  • ((1, 2), (3), 3, i)

You may be thinking:

Lists look similar to vectors; are vectors just lists?

The answer is: it depends. Lists are indeed one of the most common forms of vectors. However, vectors are a more general concept, so besides lists, there are other instantiations of vectors. As we formally define vector spaces and see examples that are not lists, you will gradually develop an intuition for this.

\mathbf{R}, \mathbf{C}, and \mathbf{F}

You are probably already familiar with the notations \mathbf{R} and \mathbf{C}. \mathbf{R} represents the set of all the real numbers, and \mathbf{C} the set of all the complex numbers.

\mathbf{R}^n, naturally, represents the set of all n-lists of real numbers.

For example:

  • (1, 5) \in \mathbf{R}^2
  • (-1+i, 2+3i, 0) \in \mathbf{C}^3

So, are vectors just lists of length n of elements of \mathbf{R} or \mathbf{C}?

From an application perspective, this definition indeed works for most cases. But mathematicians are not satisfied with this: they want to go as abstract as possible and find a structure more general than \mathbf{R} or \mathbf{C}. You will see what this means should you stay till the end. For now, you can think of vectors as elements of \mathbf{R}^n or \mathbf{C}^n to help conceptualize them.

Many times, we don't specify whether we are working with \mathbf{R} or \mathbf{C}, so we will use \mathbf{F} to denote \mathbf{R} or \mathbf{C}.

Definition 1.6: notation: F

\mathbf{F} = \mathbf{R} or \mathbf{C}.
More specifically, if \mathbf{F}=\mathbf{R} and n=2, then \mathbf{F}^n is the real plane (\mathbf{R}^2). If n=3, then \mathbf{F}^n represents \mathbf{R}^3.

In fact, many of the theorems that work for \mathbf{R} and \mathbf{C} also work for arbitrary fields. A field is essentially a set over which addition, subtraction, multiplication, and division are defined. So \mathbf{R} and \mathbf{C} are both fields.

Even though many theorems can be applied to any field, most applications are concerned with \mathbf{R} and \mathbf{C}. In this series, we will not deal with fields other than \mathbf{R} and \mathbf{C}.

Addition and Scalar Multiplication on F^n

What linear algebra does, to put it in very simple terms, is extend addition and multiplication from one dimension to multi-dimensional spaces. So let's start with the most intuitive example: extending addition to \mathbf{F}^n.

Addition on \mathbf{F}^n

Addition on \mathbf{F}^n is identical to the familiar addition on \mathbf{R}^2 and \mathbf{R}^3: you just perform number addition on each entry of the list.

Definition 1.13: addition in F^n

(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1+y_1, x_2+y_2, \ldots, x_n+y_n)

A few examples:

  • If \mathbf{F}=\mathbf{R}, then (1,-1) + (-3, 5) = (-2, 4)
  • If \mathbf{F}=\mathbf{C}, then (1+i, 2-i) + (i, 0) = (1+2i, 2-i)

This definition of addition satisfies commutativity, as you may verify.

Theorem 1.14: commutativity in F^n

If x, y \in \mathbf{F}^n, then x+y=y+x.

Notice that we use the word "define" here. This implies we could define a non-commutative addition. For example:

(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1+2y_1, x_2+2y_2, \ldots, x_n+2y_n)

This definition of addition is not commutative.
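You can see the difference numerically. Below is a small check of my own (NumPy, for illustration only): the usual entrywise addition commutes, while the modified rule does not:

```python
import numpy as np

x = np.array([1.0, -1.0])
y = np.array([-3.0, 5.0])

# The entrywise addition of Definition 1.13 commutes
print(np.array_equal(x + y, y + x))  # True

# The modified rule (x_k + 2*y_k) is a valid operation, but not commutative
def odd_add(a, b):
    return a + 2 * b

print(odd_add(x, y))  # [-5.  9.]
print(odd_add(y, x))  # [-1.  3.] -> a different result
```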

We are not typically interested in these types of addition though, as they don't have the properties we desire.

Scalar Multiplication

Now we turn to defining scalar multiplication on \mathbf{F}^n.

As its name suggests, we multiply a vector by a scalar. We don't multiply two vectors with scalar multiplication.

Suppose we have a vector (x_1, x_2, \ldots, x_n) \in \mathbf{F}^n and a scalar \lambda \in \mathbf{F}. Then:

Definition 1.18: scalar multiplication on F^n

\lambda (x_1, x_2, \ldots, x_n) = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n)

A few examples:

  • If \mathbf{F}=\mathbf{R}, then 3(1,-1) = (3, -3)
  • If \mathbf{F}=\mathbf{R}, then 0(1,-1) = 0, the 0 on the right being a vector, not a number
  • If \mathbf{F}=\mathbf{C}, then i(1+i, 2-i) = (-1+i, 1+2i)
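The examples above translate directly into code (again a NumPy sketch of my own):

```python
import numpy as np

print(3 * np.array([1, -1]))            # [ 3 -3]

# 0 times a vector gives the zero vector, not the number 0
print(0 * np.array([1, -1]))            # [0 0]

# Complex scalars work the same way: i(1+i, 2-i) = (-1+i, 1+2i)
print(1j * np.array([1 + 1j, 2 - 1j]))  # [-1.+1.j  1.+2.j]
```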

Selected Exercises from Linear Algebra Done Right, Ch1.A

  1. Find x \in \mathbf{R}^4 such that (4,-3,1,7)+2x=(5,9,-6,8).
  2. Explain why there does not exist \lambda \in \mathbf{C} such that \lambda(2-3i, 5+4i, -6+7i)=(12-5i, 7+22i, -32-9i).
  3. Show that \lambda(x+y)=\lambda x+\lambda y for all \lambda \in \mathbf{F} and all x, y \in \mathbf{F}^n.

1.B: Vector Spaces

Defining Vector Spaces

We previously defined addition and scalar multiplication on \mathbf{F}^n to show how these operations extend from one dimension to multiple dimensions.

Now we can proceed to define a vector space.

In simple terms, a vector space is a set where we can perform addition and scalar multiplication meaningfully.

Let's break this down:

  1. For any two elements u, v \in V of a vector space V, their sum u+v is also in V.
  2. For any scalar \lambda \in \mathbf{F} and vector v \in V, their product \lambda v is also in V.

To make this intuitive concept more formal, let us look at the actual definition of a vector space.

Definition 1.20: definition of vector space

A vector space V is a set along with an addition and a scalar multiplication on V such that the following properties hold.

commutativity

u+v=v+u for all u, v \in V

associativity

(u+v)+w=u+(v+w) and (ab)v=a(bv) for all u, v, w \in V and for all a, b \in \mathbf{F}

additive identity

There exists an element 0 \in V such that v+0=v for all v \in V

additive inverse

For every v \in V, there exists w \in V such that v+w=0

multiplicative identity

1v=v for all v \in V

distributive properties

a(u+v)=au+av and (a+b)v=av+bv for all a, b \in \mathbf{F} and all u, v \in V

Wow, that's a lot of conditions all at once - looks complicated!

Actually, this concept is quite simple. Remember what we said earlier? The most intuitive way to understand a vector space is as a "multi-dimensional set with defined addition and multiplication."

We want all the familiar properties of addition and multiplication that we know from \mathbf{R} and \mathbf{C} to apply to V as well. That's really all there is to it - you'll notice that all the rules above are just properties we're familiar with from addition and multiplication in \mathbf{R} or \mathbf{C}.

We want these same properties to work in vector spaces, which is why we define vector spaces this way.

At this point, you might want to go back and verify that the addition and scalar multiplication we defined earlier for \mathbf{F}^n satisfy all these conditions - check why \mathbf{F}^n is a vector space, whether \mathbf{F}=\mathbf{R} or \mathbf{F}=\mathbf{C}. Let me demonstrate how to verify that \mathbf{F}^n satisfies the third rule.

F^n has an additive identity

Suppose V = \mathbf{F}^n and u \in V. Then there exists 0 \in V such that u+0=u.

Proof:

  • The zero list (0, \dots, 0) is an element of \mathbf{F}^n
  • By the definition of addition on \mathbf{F}^n, u + (0, \dots, 0) = (u_1 + 0, \dots, u_n + 0) = u

"Isn't this obvious just by looking at it? Why do we need to prove it?"

It might seem tedious, but this is the nature of mathematical proof - we need to ensure each step is rigorous and can't skip any steps. This example is simple, so you might feel an urge to skip the proof since the result seems obvious. However, as the things we need to prove become more complex, you'll start to appreciate the importance of becoming familiar with this step-by-step reasoning process.

You can verify in the same way that \mathbf{F}^n satisfies all the conditions for a vector space. It's precisely because \mathbf{F}^n meets these conditions that we can say \mathbf{F}^n is indeed a vector space.
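If you want a quick sanity check alongside the pencil-and-paper verification, here is a numeric spot-check of the axioms on random vectors in \mathbf{R}^5 (my own NumPy sketch; passing checks on random samples is evidence, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
u, v, w = rng.standard_normal((3, n))   # three random vectors in R^5
a, b = rng.standard_normal(2)           # two random scalars

assert np.allclose(u + v, v + u)                 # commutativity
assert np.allclose((u + v) + w, u + (v + w))     # associativity
assert np.allclose((a * b) * v, a * (b * v))     # associativity (scalars)
assert np.allclose(v + np.zeros(n), v)           # additive identity
assert np.allclose(v + (-v), np.zeros(n))        # additive inverse
assert np.allclose(1 * v, v)                     # multiplicative identity
assert np.allclose(a * (u + v), a * u + a * v)   # distributivity
assert np.allclose((a + b) * v, a * v + b * v)   # distributivity
print("all axioms spot-checked on random samples")
```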

The vector spaces \mathbf{R}^n and \mathbf{C}^n are so common they get their own names:

Definition 1.22: Real Vector Space, Complex Vector Space

  • A vector space over \mathbf{R} is called a real vector space
  • A vector space over \mathbf{C} is called a complex vector space

Now we look at an example to illustrate that vector spaces aren't limited to \mathbf{F}^n, but are actually a more general structure.

Definition 1.24: Vector Space of Functions

  • If S is a set, then \mathbf{F}^S denotes the set of all functions from the set S to \mathbf{F}
  • For f, g \in \mathbf{F}^S, their sum f+g \in \mathbf{F}^S is a function. For all x \in S, we define f+g by
(f+g)(x)=f(x)+g(x)
  • For \lambda \in \mathbf{F} and f \in \mathbf{F}^S, their product \lambda f \in \mathbf{F}^S is a function. For any x \in S, we define \lambda f by (\lambda f)(x)=\lambda f(x)

The set defined above is a set of functions; we defined addition and scalar multiplication on this set to make it a vector space. You may verify that these operations satisfy the definition of a vector space, making \mathbf{F}^S a valid vector space.

As an example, let us verify that \mathbf{F}^S has an additive identity.

In the set \mathbf{F}^S, there exists the function 0: S \rightarrow \mathbf{F} defined by

0(x)=0

for all x \in S.

This 0 function is the additive identity in \mathbf{F}^S. The other conditions can be verified in a similar fashion.

As such, \mathbf{F}^S is a valid vector space, and we can refer to the functions in this set as vectors, as with any other vector space.
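To make \mathbf{F}^S concrete, here is a small sketch (my own Python illustration with S = \mathbf{R}) where functions are plain callables and addition and scalar multiplication are defined pointwise, as in Definition 1.24:

```python
# Vectors here are functions; addition and scalar multiplication are pointwise.

def add(f, g):
    return lambda x: f(x) + g(x)   # (f+g)(x) = f(x) + g(x)

def scale(lam, f):
    return lambda x: lam * f(x)    # (lam f)(x) = lam * f(x)

zero = lambda x: 0                 # the additive identity: 0(x) = 0

f = lambda x: x ** 2
g = lambda x: 3 * x

print(add(f, g)(2.0))      # 10.0 = f(2) + g(2)
print(add(f, zero)(2.0))   # 4.0 -> adding the zero function changes nothing
print(scale(5, g)(2.0))    # 30.0
```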

At this point, you should be able to appreciate that vector spaces are not limited to the form \mathbf{F}^n, and a vector doesn't have to be a list.

Further Thoughts: Matrices

We can use \mathbf{F}^{m,n} to represent the set of all m \times n matrices.

For example:

  • \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \in \mathbf{F}^{2,2}
  • \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 4 & 4 \end{bmatrix} \in \mathbf{F}^{3,2}

So, is \mathbf{F}^{m,n} a vector space? What definitions of matrix addition and scalar multiplication would make \mathbf{F}^{m,n} satisfy the conditions of a vector space?
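As a hint, here is the natural choice sketched in NumPy (entrywise addition and scalar multiplication; verifying the axioms is the actual exercise):

```python
import numpy as np

A = np.array([[1, 0], [0, 1]])
B = np.array([[1, 2], [3, 4]])

print(A + B)             # entrywise sum: still a 2 x 2 matrix
print(2 * A)             # entrywise scalar multiple: still 2 x 2
print(np.zeros((2, 2)))  # a candidate additive identity for F^{2,2}
```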

Selected Exercises From Linear Algebra Done Right, Ch1.B

  1. The empty set is not a vector space. The empty set fails to satisfy only one of the requirements listed in the definition of a vector space (1.20). Which one?

  2. Show that in the definition of a vector space (1.20), the additive inverse condition can be replaced with the condition that

    0v=0 \text{ for all } v \in V.

    Here the 0 on the left side is the number 0, and the 0 on the right side is the additive identity of V.

    The phrase "a condition can be replaced" in a definition means that the collection of objects satisfying the definition is unchanged if the original condition is replaced with the new condition.

  3. Suppose V is a real vector space.

     • The complexification of V, denoted by V_{\mathrm{C}}, equals V \times V. An element of V_{\mathrm{C}} is an ordered pair (u, v), where u, v \in V, but we write this as u+iv.

     • Addition on V_{\mathrm{C}} is defined by

       (u_1+iv_1)+(u_2+iv_2)=(u_1+u_2)+i(v_1+v_2)

       for all u_1, v_1, u_2, v_2 \in V.

     • Complex scalar multiplication on V_{\mathrm{C}} is defined by

       (a+bi)(u+iv)=(au-bv)+i(av+bu)

       for all a, b \in \mathbf{R} and all u, v \in V.

     Prove that with the definitions of addition and scalar multiplication as above, V_{\mathrm{C}} is a complex vector space.

1.C: Subspaces

Now that we have a clear idea of what vector spaces are, we are ready to learn what a subspace is.

Definition 1.33: Subspace

Let U be a subset of V. If U satisfies the definition of a vector space under the same additive identity, vector addition, and scalar multiplication as V, then U is a subspace of V.

In simple terms, if a subset of V is itself a vector space, then it's a subspace. For example, \mathbf{R}^2 is a vector space, and \{(1, 1)\} is a subset of it, but clearly not a vector space: when you multiply (1, 1) by any scalar other than 1, the resulting vector isn't in this subset. However, \{\lambda(1, 1) : \lambda \in \mathbf{R}\} is a subspace.

So, how can we quickly check whether a subset is a valid subspace? Do we really need to verify all six rules every time?

Actually, it's simpler than you might think. We only need to check:

Theorem 1.34: quick check for subspaces

A subset U of a vector space V is a subspace of V if and only if it satisfies these three conditions:

Additive identity

0 \in U

Closed under addition

If u, w \in U, then u+w \in U

Closed under scalar multiplication

If a \in \mathbf{F} and u \in U, then au \in U

Straightforward, isn't it? A vector space is, by definition, closed under addition and multiplication after all.

Straightforward as it may seem, it's not that simple. To prove these 3 conditions are both necessary and sufficient for a subspace, we need to prove both directions: left to right and right to left.

From left to right, that is:

U is a subspace \implies the three conditions hold

This direction is straightforward: if we assume U is a subspace, then U satisfies the full definition of a vector space, so naturally it satisfies these three conditions as well.

The trickier part is proving from right to left, that is:

The three conditions hold \implies U is a subspace

To complete this proof, we need to assume those 3 conditions hold and derive from there that U satisfies the definition of a vector space.

Since this series is meant to be just an introduction, providing an entry point, we won't delve into the details of the proof. If you want to understand more deeply, you can refer to Linear Algebra Done Right.

Back to the main topic. Now that we know how to verify a subspace, let's look at a few examples to familiarize ourselves with the concept:

  1. If b \in \mathbf{F}, then

     U=\left\{\left(x_1, x_2, x_3, x_4\right) \in \mathbf{F}^4: x_3=5x_4+b\right\}

     is a subspace of \mathbf{F}^4 if and only if b=0.

  2. The set of all differentiable real-valued functions on \mathbf{R} is a subspace of \mathbf{R}^{\mathbf{R}}.

Example 1

Let's first look at the first example:

U=\left\{\left(x_1, x_2, x_3, x_4\right) \in \mathbf{F}^4: x_3=5x_4+b\right\}

If b \neq 0, then x_4=0 forces x_3 = b \neq 0. In particular, the additive identity (0, 0, 0, 0) is not in the set, so U is not a subspace.

The above characterization provides an intuition as an entry point. To complete a rigorous proof, we, again, need to prove both directions.

  1. (\implies) Suppose U is a subspace, then prove b=0.
  2. (\impliedby) Suppose b=0, then prove U is a subspace.

I will leave out the details here. Try to complete the proof following the sketch provided above.
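While only a proof settles the claim, you can gather numeric evidence with a quick script (my own NumPy sketch; random samples can't prove closure, but they can expose a failure):

```python
import numpy as np

rng = np.random.default_rng(1)

def in_U(x, b):
    """Membership test for U = {x in R^4 : x_3 = 5 x_4 + b}."""
    return np.isclose(x[2], 5 * x[3] + b)

def sample_U(b):
    x = rng.standard_normal(4)
    x[2] = 5 * x[3] + b   # force membership in U
    return x

# b = 0: the three subspace conditions hold on random samples
b = 0.0
u, w = sample_U(b), sample_U(b)
lam = rng.standard_normal()
assert in_U(np.zeros(4), b)   # contains the additive identity
assert in_U(u + w, b)         # closed under addition
assert in_U(lam * u, b)       # closed under scalar multiplication

# b = 1: the zero vector is not in U, so U cannot be a subspace
assert not in_U(np.zeros(4), 1.0)
print("numeric checks agree with the claim")
```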

Example 2

The set of all differentiable functions on \mathbf{R} can be denoted by:

U=\{f: \mathbf{R} \rightarrow \mathbf{R} \mid f \text{ is differentiable}\}

Since every f \in U is a function that maps \mathbf{R} to \mathbf{R}, we have f \in \mathbf{R}^{\mathbf{R}}, so U is a subset of \mathbf{R}^{\mathbf{R}}.

So to show U is a subspace, we only need to verify that U is (a) closed under addition, (b) closed under scalar multiplication, and (c) has an additive identity.

(a): The sum of two differentiable functions is clearly differentiable. Hence (a) holds.

(b): A scalar multiple of a differentiable function is still differentiable. Hence (b) holds.

(c): Lastly, 0(x) = 0 is trivially a differentiable function, so the additive identity exists in U.

With that, we have successfully proved that U is indeed a subspace of \mathbf{R}^{\mathbf{R}}.

Sum and Direct Sum of Vector Spaces

We have learned via some examples what a valid subspace is.

An idea that naturally follows is: how do we combine subspaces?

The union operation on sets may come to mind, as vector spaces are, in fact, sets. However, a quick check reveals that the union of two subspaces is not always a valid vector space.

For example, suppose A = \{(x, 0) : x \in \mathbf{R}\} and B = \{(0, y) : y \in \mathbf{R}\}. Even though both A and B are vector spaces, their union U = A \cup B is not, because (1, 0) \in U and (0, 1) \in U, but their sum (1, 1) \notin U.

So, union is not the way of combining subspaces that we want.

You may pause and think: what definition of a sum would keep the result a valid subspace? Below is the natural result you will arrive at:

Definition 1.36: sum of subspaces

Suppose V_1, \ldots, V_m are subspaces of V. The sum of V_1, \ldots, V_m, denoted by V_1+\cdots+V_m, is the set of all possible sums of elements of V_1, \ldots, V_m. Specifically,

V_1+\cdots+V_m=\left\{v_1+\cdots+v_m: v_1 \in V_1, \ldots, v_m \in V_m\right\}.

With this definition of addition, the sum of vector spaces will still be a vector space.

Going back to the example we just looked at, A = \{(x, 0) : x \in \mathbf{R}\} and B = \{(0, y) : y \in \mathbf{R}\}: with this definition of addition, A+B=\mathbf{R}^2.
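A quick numeric illustration of both points (my own sketch): the union misses sums like (1, 1), while the sum A + B reaches every point of \mathbf{R}^2:

```python
import numpy as np

# Membership tests for A = {(x, 0)} and B = {(0, y)}
in_A = lambda v: np.isclose(v[1], 0.0)
in_B = lambda v: np.isclose(v[0], 0.0)

p = np.array([1.0, 0.0])   # in A
q = np.array([0.0, 1.0])   # in B

s = p + q                  # (1, 1)
print(in_A(s) or in_B(s))  # False -> the union A ∪ B is not closed

# But any (x, y) in R^2 splits as (x, 0) + (0, y), so A + B = R^2
x, y = 3.0, -2.0
print(np.array([x, 0.0]) + np.array([0.0, y]))  # [ 3. -2.]
```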

Direct Sums

The sum of vector spaces involves adding vector spaces together, and a common application of this is when we want to decompose a vector space into multiple subspaces. In this case, we are particularly interested in situations where each vector can be uniquely decomposed.

This situation is called a direct sum. Let's look at the definition of a direct sum.

Definition 1.41: Direct Sum

Suppose V_1, \ldots, V_m are subspaces of V.

  • If every element in V_1+\cdots+V_m can be represented uniquely as a sum v_1+\cdots+v_m (where each v_k \in V_k), then the sum V_1+\cdots+V_m is called a direct sum.
  • If V_1+\cdots+V_m is a direct sum, we denote it as V_1 \oplus \cdots \oplus V_m, where the symbol \oplus is used to indicate that this is a direct sum.

Let's use a few examples to deepen our understanding of direct sums:

Example of a Direct Sum

Suppose U is the subspace of \mathbf{F}^3 where the last coordinate is equal to 0, and W is the subspace of \mathbf{F}^3 where the first two coordinates are equal to 0, that is:

U=\left\{(x, y, 0) \in \mathbf{F}^3: x, y \in \mathbf{F}\right\} \quad \text{and} \quad W=\left\{(0,0,z) \in \mathbf{F}^3: z \in \mathbf{F}\right\}.

Then \mathbf{F}^3=U \oplus W, which you can verify for yourselves.


Example of a Non-Direct Sum

Suppose

\begin{aligned} & V_1=\left\{(x, y, 0) \in \mathbf{F}^3: x, y \in \mathbf{F}\right\}, \\ & V_2=\left\{(0,0,z) \in \mathbf{F}^3: z \in \mathbf{F}\right\}, \\ & V_3=\left\{(0, y, y) \in \mathbf{F}^3: y \in \mathbf{F}\right\} \end{aligned}

Then \mathbf{F}^3=V_1+V_2+V_3, because every vector (x, y, z) \in \mathbf{F}^3 can be written as

(x, y, z)=(x, y, 0)+(0,0,z)+(0,0,0)

where the first vector on the right side belongs to V_1, the second to V_2, and the third to V_3.

However, \mathbf{F}^3 is not a direct sum of V_1, V_2, V_3, because the vector (0,0,0) can be written as a sum v_1+v_2+v_3 in multiple ways. Specifically, here's one way to decompose it:

(0,0,0)=(0,1,0)+(0,0,1)+(0,-1,-1)

And here's another way:

(0,0,0)=(0,0,0)+(0,0,0)+(0,0,0),

where in each of the above equations, the first vector on the right side belongs to V_1, the second to V_2, and the third to V_3. Therefore, the sum V_1+V_2+V_3 is not a direct sum.

Next, let's look at some properties of direct sums.

Theorem 1.45: Condition for Direct Sum

Suppose V_1, \ldots, V_m are subspaces of V. Then V_1+\cdots+V_m is a direct sum \Longleftrightarrow the only way to represent 0 as a sum v_1+\cdots+v_m (where each v_k \in V_k) is by having each v_k equal to 0.

First, how do we prove this theorem? Since this is an if and only if relationship, we need to prove both directions.

(\implies)

To prove the forward direction, we first assume that the sum V_1+\cdots+V_m is a direct sum. Then by the definition of a direct sum, every vector has only one way to be decomposed. So naturally, the 0 vector also has only one way to be decomposed.

(\impliedby)

To prove the reverse direction, we first assume that 0 has only one way to be decomposed. Since V_1, \ldots, V_m are each subspaces, they each contain the 0 vector (the additive identity). So, 0 can be decomposed in the following way:

0 = 0+\cdots+0

Since we've already assumed that 0 has only one way to be decomposed, this means the above decomposition is the only way.

Next, take any vector v \in V_1+\cdots+V_m, and suppose we decompose v in two ways:

v=v_1+\cdots+v_m \qquad v=u_1+\cdots+u_m

Subtracting these two equations, we get

0=\left(v_1-u_1\right)+\cdots+\left(v_m-u_m\right).

Since each V_k is a subspace, each v_k-u_k \in V_k, so this is a decomposition of 0. Because we've already shown that the only way to decompose 0 is

0 = 0 + \cdots + 0

We get

u_1=v_1, \quad u_2=v_2, \quad \ldots, \quad u_m=v_m

Therefore, any vector v has only one way to be decomposed. With this, we have proven both directions.

How should we interpret this theorem? What does it mean?

It means that we only need to confirm that the only way to decompose the 0 vector into a combination of V_1, \dots, V_m is 0+0+\dots+0 to be certain that V_1+\cdots+V_m is a direct sum. In other words, we don't need to confirm that every vector in V has a unique decomposition; we just need to confirm that 0 does.

You might have also got a vague sense that the intersection of two subspaces U and W must be \{0\} for their sum to be direct. Since any subspace must contain 0, the intersection must contain 0 as well.

Moreover, if the intersection contains a vector other than 0, the sum won't be direct, because some vector will have more than one decomposition. Specifically: if U \cap W contains a non-zero vector v, then v has two decompositions v=0+v=v+0, so U+W is not a direct sum.

In fact, U \cap W=\{0\} is not only a necessary condition; it's also sufficient. That is, as long as U \cap W=\{0\}, the sum U+W is a direct sum, i.e., U \oplus W. Let's see why below.

Theorem 1.46: Direct Sum \Longleftrightarrow Zero Intersection

Suppose U and W are subspaces of V. Then:

U+W is a direct sum \Longleftrightarrow U \cap W=\{0\}.

Proof:

This proof also requires proving both directions.

(\implies)

Assume U+W is a direct sum. Now we want to prove that the intersection contains only the 0 vector. So we assume v \in U \cap W. If we can prove that v = 0, we will have proven that U \cap W = \{0\}.

  • Since v \in U \cap W, we have v \in U and v \in W
  • Since W is a subspace, -v \in W
  • We can decompose 0 as 0 = v + (-v), where v \in U and -v \in W
  • Since U and W are both subspaces, they both contain the 0 vector, so 0 = 0 + 0 is another way to decompose 0
  • Since U+W is a direct sum, 0 has only one decomposition, so v = 0 (and -v = 0), which proves the statement

(\impliedby)

Assume U \cap W = \{0\}. According to 1.45, we only need to prove that the unique way to decompose 0 is 0 = 0+0 in order to prove that U+W is a direct sum.

  • Assume 0 = u + w, where u \in U and w \in W
  • 0 = u + w means u = -w
  • This means u \in W (because u is the additive inverse of w, and the subspace W is closed under additive inverses)
  • So u \in U \cap W = \{0\}
  • Therefore u = 0 and w = 0, which proves the statement

Selected Exercises From Linear Algebra Done Right, Ch1.C

  1. For each of the following subsets of \mathbf{F}^3, determine whether it is a subspace of \mathbf{F}^3.

    • (a) \left\{\left(x_1, x_2, x_3\right) \in \mathbf{F}^3: x_1+2x_2+3x_3=0\right\}
    • (b) \left\{\left(x_1, x_2, x_3\right) \in \mathbf{F}^3: x_1+2x_2+3x_3=4\right\}
    • (c) \left\{\left(x_1, x_2, x_3\right) \in \mathbf{F}^3: x_1 x_2 x_3=0\right\}
    • (d) \left\{\left(x_1, x_2, x_3\right) \in \mathbf{F}^3: x_1=5x_3\right\}
  2. Prove or give a counterexample: If U is a nonempty subset of \mathbf{R}^2 such that U is closed under addition and under taking additive inverses (meaning -u \in U whenever u \in U), then U is a subspace of \mathbf{R}^2.

  3. Prove or give a counterexample: If V_1, V_2, U are subspaces of V such that

     V=V_1 \oplus U \quad \text{and} \quad V=V_2 \oplus U,

     then V_1=V_2.

     Hint: When trying to discover whether a conjecture in linear algebra is true or false, it is often useful to start by experimenting in \mathbf{F}^2.
