% No 'submit' option for the problems by themselves.
\documentclass{cos302}
% Use the 'submit' option when you submit your solutions.
%\documentclass[submit]{cos302}
% Put in your full name and email address.
\name{Your Name}
\email{email@princeton.edu}
\discussants{Marie Curie \\ Leonhard Euler}
% You don't need to change these.
\course{COS302-S21}
\assignment{Assignment \#8}
\duedate{6:00pm April 9, 2021}
\dropboxurl{https://www.gradescope.com/courses/214271/assignments/1137002}
\usepackage{bm} % for bold math
\usepackage{mathtools} % colon-equals
\usepackage{multirow}
\begin{document}
% IMPORTANT: Uncomment the \documentclass command above with the [submit] option and put your name where it says ``your name''.
\begin{center}
Assignments in COS 302 should be done individually. See the \href{https://www.cs.princeton.edu/courses/archive/spring21/cos302/files/syllabus.pdf}{course syllabus} for the collaboration policy.
\vspace{\baselineskip}
\textcolor{red}{%
Remember to append your Colab PDF as explained in the first homework, with all outputs visible.\\
When you print to PDF it may be helpful to scale at 95\% or so to get everything on the page.}
\end{center}
\begin{problem}[20pts]
Consider the following cumulative distribution function for a random variable~$X$ that takes values in~$\mathbb{R}$:
\begin{align*}
F(x) &= P(X \leq x) = \frac{1}{1 + e^{-x}}
\end{align*}
\begin{enumerate}[(A)]
\item What is the probability density function for this random variable?
\item Find the inverse distribution (quantile) function~$F^{-1}(u)$ that maps from~$(0,1)$ to~$\mathbb{R}$.
\item In a Colab notebook, implement inversion sampling and draw 1000 samples from this distribution.
Make a histogram of your results.
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[20pts]
Consider two random variables~$X$ and~$Y$ that are exponential and Poisson, respectively.
The exponential random variable~$X$ has parameter~$\lambda$:
\begin{align*}
X &\sim \text{Exponential}(\lambda)\,.
\end{align*}
The notation~$\sim$ is like saying ``drawn according to the distribution ...''.
The Poisson distribution, however, has as its parameter the random variable~$X$:
\begin{align*}
Y\,|\,X\!=\!x &\sim \text{Poisson}(x)
\end{align*}
This notation is saying ``the random variable $Y$, conditioned on~$X=x$ is drawn according to a Poisson distribution where~$x$ is the parameter''.
\begin{enumerate}[(A)]
\item Write the joint probability density function~$p(x,y)$ for these random variables.
\item Compute the marginal distribution for~$Y$.
Here are the rough steps you should follow to do this, which are reflective of the kinds of computations one often does in probabilistic machine learning:
\begin{enumerate}[1.]
\item Write down the definite integral that you solve in order to get~$p(y)$ from $p(x,y)$.
\item Pull terms that don't depend on~$x$ outside the integral.
\item Inside the integral, collect exponents.
\item See if the pieces inside the integral have a form that you recognize.
If they look like an unnormalized probability density with known form, then you can sometimes solve the integral by inspection using that PDF's normalization constant.
This kind of thing comes up a lot.
(Hint: for this problem you want to have a look at the \href{https://en.wikipedia.org/wiki/Gamma_distribution}{gamma distribution}.)
\item Simplify. In this problem you'll probably want to take advantage of the relationship between the factorial and the \href{https://en.wikipedia.org/wiki/Gamma_function}{gamma function}.
\end{enumerate}
\end{enumerate}
\end{problem}
\newpage
% Independence vs Conditional Indepdence
\begin{problem}[18pts]
Recall that two random variables \(X\) and \(Y\) are independent if and only if \(P(X,Y) = P(X)P(Y)\).
Two random variables \(X\) and \(Y\) are \textbf{conditionally} independent given a third event \(Z\)
if \(P(X,Y | Z) = P(X|Z)P(Y|Z)\).
The following problem explores the relationship between independence and conditional independence, specifically
whether independence implies conditional independence or vice versa.
\begin{enumerate}[(A)]
\item
Imagine that you have two coins: one regular fair coin ($P(\text{heads}) = 0.5$) and one fake two-headed coin ($P(\text{heads})=1$).
Consider the following experiment: choose a coin at random and toss it twice. Define the following events:
\begin{itemize}
\item $A$ is the event that the first coin toss results in a heads.
\item $B$ is the event that the second coin toss results in a heads.
\item $C$ is the event that the regular fair coin has been selected.
\end{itemize}
Are the events $A$ and $B$ independent? Give a qualitative answer, i.e., either yes because... or no because...
Are they conditionally independent given $C$? Show your work, i.e., explicitly show the definition holds as given in the problem statement.
\item
Roll a single six-sided die and consider the following events:
\begin{itemize}
\item \(X\) is that you roll 1 or 2, i.e., \(X = \{1,2\}\).
\item \(Y\) is that you roll an even number, i.e., \(Y = \{2,4,6\}\).
\item \(Z\) is that you roll 1 or 4, i.e., \(Z = \{1,4\}\).
\end{itemize}
Are the events \(X\) and \(Y\) independent? Are they conditionally independent given \(Z\)? For both of these
questions show the definitions hold as in the problem statement.
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[40pts]
In 1777, \href{https://en.wikipedia.org/wiki/Georges-Louis_Leclerc,_Comte_de_Buffon}{Georges-Louis Leclerc, Comte de
Buffon} introduced a simple Monte Carlo method for estimation, now called the \href{https://en.wikipedia.org/wiki/B
uffon\%27s_needle_problem}{Buffon's needle problem}.
It's a fun way to estimate the constant~$\pi$ via physical simulation.
In this problem, we'll look at the following variant: you have a piece of paper on a tabletop and drawn on it is a square with sides of length~$w$.
Centered and entirely contained in the square is a circle with radius~$r$. (So $2r
\leq w$.) You drop $n$ points (say, grains of sand) onto the piece of paper in a spatially uniform way. You count the
grains and find that the ratio of grains inside the circle ($n_c$) to the total number grains inside the square is~$\alpha$:
\begin{align*}
\alpha &= \frac{\text{\# of grains inside circle}}{\text{\# of grains inside square}} = \frac{n_c}{n}\,.
\end{align*}
\begin{enumerate}[(A)]
\item Write an estimate for~$\pi$ in terms of~$w$, $r$, $n$ and $n_c$.
\item What kind of probability distribution will $n_c$ follow? What is the standard deviation of $n_c$ as a function of $n$ and $\alpha$?
\item Assume that the radius of the circle is 1. What box width $w$ minimizes the standard deviation of your estimate of $\pi$ when you form this Monte Carlo estimate? (Hint: use the standard deviation from your answer in in (B), and reason about the relationship between $w$ and $\alpha$. Do you want $w$ to be as small as possible, as big as possible, or somewhere in between?
\item Perform a real-life version of this experiment and estimate~$\pi$ with physical simulation. You could
do it with paper and sand or something, as described above. Or you can be more creative! Some ideas:
\begin{itemize}
\item Grate a known amount of cheese onto a square pan that has a pizza in the middle. Weigh the cheese that doesn't fall onto the pizza.
\item Put a circular cup into a box and drop coins or peas or something into the box and count the fraction that fall into the cup.
\item Make a batch of cookies that are all perfect circles. Spread sprinkles over the entire baking pan and count (or weigh) how many don't make it onto cookies. You'll need to work out the compensation for having more than one cookie.
\item Put a round bucket inside a square bucket and stick it out in the rain. Measure the difference in water collected.
\end{itemize}
You'll need to measure the dimensions of the rectangular region and the radius of the circle.
Draw a couple of dozen samples and report your estimate. Explain what you did and include a photograph of your setup!
(Use a command like \verb|\includegraphics[width=3in]{myimage.jpg}| to include an image file
within \LaTeX.)
\end{enumerate}
\end{problem}
\newpage
\begin{problem}[2pts]
Approximately how many hours did this assignment take you to complete?
\end{problem}
My notebook URL:
{\small \url{https://colab.research.google.com/XXXXXXXXXXXXXXXXXXXXXXX}}
\subsection*{Changelog}
\begin{itemize}
\item 29 March 2021 - S21 initial version
\end{itemize}
% \includepdf[pages=-]{mynotebook.pdf}
\end{document}