% No 'submit' option for the problems by themselves.
\documentclass{cos302}
% Use the 'submit' option when you submit your solutions.
%\documentclass[submit]{cos302}
% Put in your full name and email address.
\name{Your Name}
\email{email@princeton.edu}
\discussants{Marie Curie \\ Leonhard Euler}
% You don't need to change these.
\course{COS302-S20}
\assignment{Assignment \#2}
\duedate{6:00pm Friday 19 February 2021}
\dropboxurl{https://www.gradescope.com/courses/214271/assignments/1003646}
\usepackage{bm} % for bold mat
\begin{document}
% IMPORTANT: Uncomment the \documentclass command above with the [submit] option and put your name where it says ``your name''.
\begin{center}
Assignments in COS 302 should be done individually. See the \href{https://www.cs.princeton.edu/courses/archive/spring21/cos302/files/syllabus.pdf}{course syllabus} for the collaboration policy.
\vspace{\baselineskip}
\textcolor{red}{%
Remember to append your Colab PDF as explained in the first homework, with all outputs visible.\\
When you print to PDF it may be helpful to scale at 95\% or so to get everything on the page.}
\end{center}
\begin{problem}[23pts]
Imagine that you have an old-fashioned balance scale as shown below.
\begin{center}
\includegraphics[width=4cm]{scale}
\end{center}
You have five objects of unknown mass~${m_1, m_2, m_3, m_4, m_5}$.
You would like to find the mass of each of these objects using the scale.
If you put one or more of the objects on each side of the scale, you will be able to measure the difference in weight.
This scale does not work, however, if one side is left empty.
\begin{enumerate}[(A)]
\item You make a measurement of objects~$\{1,2\}$ against objects~$\{3,4\}$ and get a difference of 1kg.
Write this as a linear equation for the masses.
\item What is the minimum number of such measurements necessary to recover all five masses?
\item If the answer to the above question is~$N$, come up with~${N-1}$ measurements that, when combined with the answer to (A) above would make it possible to recover the masses.
\item What would~$\bm{A}$ and~$\bm{b}$ be in the matrix form of the linear system corresponding to the~$N$ measurements?
That is, if you were to set this up as~$\bm{A}\bm{m}=\bm{b}$ where~$\bm{m}$ is the vector of unknown masses.
Denote the result of the~$i$th measurement as~$b_i$.
(Typeset these using the \href{https://www.overleaf.com/learn/latex/Matrices}{\texttt{bmatrix}} environment.)
\end{enumerate}
\end{problem}
% Example LaTeX code for the bmatrix environment:
%
% \begin{bmatrix}
% 1 & 2 & 3 \\
% 4 & 5 & 6
% \end{bmatrix}
%
\newpage
\begin{problem}[25pts]
The core library for numerical computation in Python is called \href{https://numpy.org/}{NumPy}.
NumPy provides useful abstractions to create and manipulate multidimensional arrays (e.g., vectors, matrices, tensors), and functions that are thin layers over high-performance C/C++/Fortran BLAS and LAPACK libraries.
Conventionally you would start your numeric Python code with an \texttt{import numpy as np}.
The most basic object is the NumPy array, which is a multi-dimensional array usually made up of type \texttt{double}, i.e., \href{https://en.wikipedia.org/wiki/Double-precision_floating-point_format}{64-bit floating point numbers}.
A vector is usually a one-dimensional array and a matrix is a two-dimensional array:
\begin{lstlisting}[style=python]
example_vector = np.array([50.0, 1.4, 0.4, 0.23, 1.04])
example_matrix = np.array([[ 3.4, 1.10, 0.8 ],
[ 4.2, 100.0, -1.4],
[ 1.1, 0.44, 9.4]])
\end{lstlisting}
Higher-dimensional arrays (tensors) are possible, but come up less often than vectors and matrices.
The tuple made up of the size each dimension is called the \emph{shape} and can be accessed as an attribute, so:
\begin{lstlisting}[style=python]
print(example_vector.shape, example_matrix.shape)
# Output: (5,) (3,3)
\end{lstlisting}
Create a Colab notebook following the same pattern as the previous assignment, and do the following:
\begin{enumerate}[(A)]
\item Use \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html}{\texttt{arange}} to create a variable named \texttt{foo} that stores an array of numbers from 0 to 29, inclusive. Print \texttt{foo} and its shape.
\item Use the \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html}{\texttt{reshape}} function to change \texttt{foo} to a validly-shaped two-dimensional matrix and store it in a new variable called \texttt{bar}. Print \texttt{bar} and its shape.
\item Create a third variable, \texttt{baz} that reshapes it into a valid three-dimensional shape. Print \texttt{baz} and its shape.
\item There are several different ways to \href{https://docs.scipy.org/doc/numpy/user/basics.indexing.html}{index into NumPy arrays}. Use two-dimensional array indexing to set the first value in the second row of \texttt{bar} to -1. Now look at \texttt{foo} and \texttt{baz}. Did they change? Explain what's going on. (Hint: does \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html}{\texttt{reshape}} return a \href{https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html}{view or a copy}?)
\item Another thing that comes up a lot with array shapes is thinking about how to aggregate over specific dimensions. Figure out how the NumPy \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html}{\texttt{sum}} function works (and the \texttt{axis} argument in particular) and do the following:
\begin{enumerate}[(i)]
\item Sum \texttt{baz} over its second dimension and print the result.
\item Sum \texttt{baz} over its third dimension and print the result.
\item Sum \texttt{baz} over \textbf{both} its first and third dimensions and print the result.
\end{enumerate}
\item Along with shaping and indexing, we also do a lot of \href{https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html}{slicing} which is where you index with ranges to get subvectors and sometimes submatrices. Write code to do the following:
\begin{enumerate}[(i)]
\item Slice out the second row of \texttt{bar} and print it.
\item Slice out the last column of \texttt{bar} using the -1 notation and print it.
\item Slice out the top right $2\times 2$ submatrix of \texttt{bar} and print it.
\end{enumerate}
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[25pts]
One feature of NumPy that is powerful but tricky is the ability to perform \href{https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html}{broadcasting}, which really just refers to repeatedly performing an operation over one or more dimensions.
Start a new problem in your Colab notebook and when you're done with the entire assignment, follow the same procedure for appending a PDF and inserting the URL as you did in the previous assignment.
The notebook URL is near the end of the assignment.
\begin{enumerate}[(A)]
\item The most basic kind of broadcast is with a scalar, in which you can perform a binary operation (e.g., add, multiply, ...) on an array and a scalar, the effect is to perform that operation with the scalar for every element of the array.
To try this out, create a vector~${1,2,\ldots,10}$ by adding 1 to the result of the \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html}{\texttt{arange}} function.
\item Now, create a~${10 \times 10}$ matrix~$\bm{A}$ in which~${A_{ij}=i+j}$. You'll be able to do this using the vector you just created, and adding it to a reshaped version of itself.
\item A very common use of broadcasting is to standardize data, i.e., to make it have zero mean and unit variance.
First, create a fake ``data set'' with 50 examples, each with five dimensions.
\begin{lstlisting}[style=python]
import numpy.random as npr
data = np.exp(npr.randn(50,5))
\end{lstlisting}
You don't worry too much about what this code is doing at this stage of the course, but for completeness: it imports the \href{https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html}{NumPy random number generation library}, then generates a~${50 \times 5}$ matrix of standard normal random variates and exponentiates them.
The effect of this is to have a pretend data set of 50 independent and identically-distributed vectors from a log-normal distribution.
\item Now, compute the \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html}{mean} and \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std}{standard deviation} of each column.
This should result in two vectors of length 5.
You'll need to think a little bit about how to use the \texttt{axis} argument to \texttt{mean} and \texttt{std}.
Store these vectors into variables and print both of them.
\item Now standardize the \texttt{data} matrix by 1) subtracting the mean off of each column, and 2) dividing each column by its standard deviation.
Do this via broadcasting, and store the result in a matrix called \texttt{normalized}.
To verify that you successfully did it, compute the mean and standard deviation of the columns of \texttt{normalized} and print them out.
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[25pts]
Fairly often in machine learning and other disciplines that use linear algebra to formalize computational problems, you see explicit matrix inverses like~$\bm{A}^{-1}\bm{b}$ in papers and books.
Of course, you learn in linear algebra classes how to perform such an inverse, and in the \href{https://docs.scipy.org/doc/numpy/reference/routines.linalg.html}{NumPy linear algebra library} you'll see \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.inv.html#numpy.linalg.inv}{a function that will compute the inverse}.
It seems sensible right? And it's right there on the page: \emph{Invert the matrix~$\bm{A}$ and then multiply by the vector~$\bm{b}$.} How hard can it be?
Well... don't do that. When you see~$\bm{A}^{-1}\bm{b}$ you should read that as ``the~$\bm{x}$ that is the result of solving the linear system~$\bm{A}\bm{x}=\bm{b}$''.
You should use something like \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html#numpy.linalg.solve}{the \texttt{solve} function} in the NumPy linear algebra package.
If you have to solve multiple such systems with the same matrix, you might think about performing an \href{https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.linalg.lu.html}{LU decomposition}, a topic we'll return to in a couple of weeks.
The issue is one of \href{https://en.wikipedia.org/wiki/Numerical_stability}{numerical stability}, which boils down to the fact that 1) floating point numbers on computers don't have infinite precision, and 2) some quantities (e.g., infinite sums or numerical integrals) require us to truncate and approximate if we want the computation to complete.
These issues are deep and this course won't delve into them significantly.
However, the ``explicit inverse'' issue is worth looking at just to see an example.
\begin{enumerate}[(A)]
\item A \href{https://en.wikipedia.org/wiki/Vandermonde_matrix}{Vandermonde matrix} is a matrix generated from a vector in which each column of the matrix is an integer power starting from zero.
So, if I have a column vector~$[x_1, x_2, \ldots, x_N]^{T}$, then the associated (square) Vandermonde matrix would be
\begin{align*}
\bm{V} &= \begin{bmatrix}
1 & x_1 & x_1^2 & x_1^3 & \cdots & x_1^{N-1}\\
1 & x_2 & x_2^2 & x_2^3 & \cdots & x_2^{N-1}\\
\vdots & \vdots & \vdots & \vdots &\ddots & \vdots\\
1 & x_N & x_N^2 & x_N^3 & \cdots & x_N^{N-1}
\end{bmatrix}
\end{align*}
Use what you learned about broadcasting in the previous problem to write a function that will produce a Vandermonde matrix for a vector~$[1, 2, \ldots, N]^{T}$ for any~$N$.
Do it without using a loop.
Here's a stub to get you started:
\begin{lstlisting}[style=python]
def vandermonde(N):
vec = np.arange(N)+1
# Finish me.
\end{lstlisting}
Use your function for $N=12$, store it in variable named \texttt{vander}, and print the result.
\item Now, let's make a pretend linear system problem with this matrix.
Create a \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones.html}{vector of all ones}, of length 12 and call it \texttt{x}.
Perform a \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html}{matrix-vector multiplication} of \texttt{vander} with the vector you just created and store that in a new vector and call it \texttt{b}.
Print the vector \texttt{b}.
\item First, solve the linear system the na\"ive way, pretending like you don't know \texttt{x}.
Import \href{https://docs.scipy.org/doc/numpy/reference/routines.linalg.html}{\texttt{numpy.linalg}}, invert \texttt{V} and multiply it by \texttt{b}.
Print out your result.
What should you get for your answer? If the answer is different than what you expected, write a sentence about that difference.
\item Now, solve the same linear system using \href{https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html#numpy.linalg.solve}{\texttt{solve}}.
Print out the result.
Does it seem more or less in line with what you'd expect?
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[2pts]
Approximately how many hours did this assignment take you to complete?
\end{problem}
My notebook URL:
{\small \url{https://colab.research.google.com/XXXXXXXXXXXXXXXXXXXXXXX}}
\subsection*{Changelog}
\begin{itemize}
\item 8 February 2021 -- S21 version.
\end{itemize}
% \includepdf[pages=-]{mynotebook.pdf}
\end{document}