List of sample questions for Quant Interviews

Andy Nguyen · 9/29/11

Question #1 (statistics)
How do you test whether a data sample is normal or not?
(Comment: this tests your knowledge of basic statistics; there are several different methods to test normality.)
Question #2 (math)
Show that a set is convex if and only if its intersection with any line is convex. Show that a set is affine if and only if its intersection with any line is affine.
(Taken from chapter 2 of Boyd and Vandenberghe, Convex Optimization)
(Comment: concepts related to convex sets are important in optimization, and many quant jobs involve optimization research.)
Question #3 (finance - bond pricing)
I don't know anything about bond pricing, but I've heard people use something called the discount rate when they price bonds. Can you explain what this "discount rate" is? Why is it important? Where do I get its value? What is the current discount rate (as of today)? When I price a 30-year bond, should I use today's discount rate or should I use a different discount rate for each of the next 30 years?
(Comment: typical set of basic questions asked at a lot of quant interviews, not just those for fixed income-related positions)
Question #4 (probability theory)
Say you are on a game show [historical sidenote: this game was first played on the 1960s American game show Let's Make a Deal, hosted by Monty Hall], and there are three closed doors. Behind one door is a car, the prize you dream of, and behind the other two are goats. You pick a door. The host, who knows what's behind each door, opens another door, which reveals a goat. Now the host lets you make another choice: should you stick with your first door, or should you switch and pick a different door, in order to win the car?
Question #5 (applied math)
What's a Hermitian matrix? What important property does a Hermitian matrix's eigenvalues possess? What's the practical implication of this property in applications?
Question #6 (statistics)
Random variable X is distributed as N(a, b), and random variable Y is distributed as N(c, d). What is the distribution of (1) X+Y, (2) X-Y, (3) X*Y, (4) X/Y?
(Comment: another very popular quant interview question, regardless of whether the position itself involves statistical modeling)
Question #7 (finance - options)
What's put-call parity in option pricing? How does one derive this relationship? What crucial assumptions are necessary?
Tough case question: if you observe put-call parity not currently holding in the market, how do you make money off this observation? As you trade, what do you need to watch out for and what risks must you be aware of?
Question #8 (math - stochastics)
Show that exp{-t/2 + W(t)} is a martingale.
[Courtesy of Dr. Yun Cheng of ITG]
Question #9 (programming - C++)
Is the following valid C++ code? If so, what does it print?
cout << (int *) "Home of the jolly bytes";
[Taken from chapter 4 of Prata, C++ Primer Plus (5th ed.)]
Question #10 (econometrics - time series)
Is an AR(p) process stationary? Why or why not?
Tough question: in practice, how do you determine the order of an MA or AR model?
Question #11 (econometrics - time series)
What's a GARCH model? Why is it an important/useful model? When would you use the GARCH model?
Can you write down its general formulation? What does the GARCH model say in plain English? What does it "try" to achieve?
How do you determine the order of the model? How do you estimate the model in practice?
Tough follow-up question: how do you implement a GARCH model in Excel?
Question #12 (econometrics - time series)
People use the GARCH model to study volatility. Can you tell me if we can use the GARCH framework to study the correlation between two assets/time series? If so, what additional assumptions and/or adjustments must we make to the original GARCH model?
Question #13 (brainteaser)
With an ordinary tape measure and a watch, how would you measure the exact height of the Empire State Building (or the Sears Tower, or the Big Ben Clock Tower, or the Oriental Pearl TV Tower, or any famous tall building)?
Question #14 (brainteaser - logic - deduction)
(There are many versions of this type of question. Here are some examples.)
1. How many pizzas are consumed every day in the U.S.?
2. How many gas stations are there in the U.S.?
3. How many cars are stolen every month in the U.S.?
4. How many prostitutes do you think work the streets in New York (or London, or L.A., or Shanghai, or Tokyo, or Singapore, ...)?
5. How many quants are there in the world?
6. How many people make their livings on Wall Street?
7. How many university graduates try to find a job on Wall Street each year?
8. How many tennis balls can you fit in a Boeing 747 (or Airbus A320)?
9. How many Yankees fans go to every home game each season? [I was asked this at the last round of my McKinsey interviews]
10. How many people in China can speak English? [A little different from the previous nine...]
Question #15 (case question hard to categorize)
You work for an arbitrage desk. Your model shows that if you buy stock A and simultaneously sell stock B, you have a 51.3% chance of making a profit by today's close. Should you make this trade?
Question #16 (applied math - control theory)
The latest "hot" topic in financial research is using the Kalman filter in various applications. Can you explain the basic idea behind the Kalman filter (i.e., what does the filter try to do with the data)? Can you write the basic Kalman filter model? What are some of the applications of the Kalman filter?
Tough question: How do you estimate (or implement) the Kalman filter? For example, to study stock price movement.
Question #17 (finance - asset pricing)
Tell me the intuition behind CAPM. Can you write down the model? What does each of the variables stand for?
Two tough advanced questions: How do you test CAPM using real data? What are the major points of criticism against CAPM?
Question #18 (econometrics)
When modeling binary-choice problems, what are the advantages of using logit over probit? What are the disadvantages of logit vs. probit?
What about multiple-choice models: is logit or probit better?
Question #19 (finance)
What does VaR (value at risk) measure? What are some of the assumptions behind the VaR concept? Given two portfolios A and B, does the following relationship hold: VaR(A+B) = VaR(A) + VaR(B)? Why or why not (i.e., prove your previous answer)?
Question #20 (applied math - stochastic calculus)
What is Ito's Lemma? What is its significance in studying stochastic processes? How is it used in finance? Can you write out the equation?
When used to model financial derivatives, what assumptions must be made of the properties of the derivatives for Ito's Lemma to be applied correctly?
Question #21 (programming)
I give you a text file, x.txt, which has millions of records with three columns in each record:
ID, age, income
The records are sorted by ID, and no two IDs are the same.
Now, write a short program in each of the following languages to pull out 10,000 randomly selected records from x.txt. Put these 10,000 randomly pulled records in an output file called y.txt.
C++
Visual Basic
Matlab
Perl
Python
SAS
R or S-Plus
UNIX shell script
Question #22 (programming)
This is a tougher version of the previous question (#21).
You get the same input file x.txt with millions of records sorted by ID. However, some records are missing either age or income.
Now, your task is to write a program to pull out a random sample of 10,000 records, but only those with neither age nor income missing.
(Comment: both questions #21 and #22 test your ability to both write a working program and to produce an efficient program - but foremost you must write a program that works correctly)
Question #23 (applied math - linear algebra)
In linear algebra, why are we interested in matrix decompositions? Explain each of the following:
LU decomposition
Singular value decomposition (SVD)
Cholesky decomposition
QR decomposition
When and how is each of these decomposition techniques applied?
(Comment: matrix operations, including decompositions, are extremely important in applied quantitative finance - they are often the glue between modeling and implementation)
Question #24 (mathematical brainteaser)
Answer this as fast as you can, without writing anything down:
The perimeter of a right triangle is 5 inches. The two legs are each 2 inches long. What's the length of the hypotenuse?
(Comment: this is one of my favorite questions)
Question #25 (finance - asset pricing)
Can you show me how the APT model is derived? What's the intuition behind APT? How does it compare to CAPM? What are some of the criticisms of APT?
Question #26 (mathematics)
What is Jensen's Inequality? What are some of its applications? Can you write out the inequality and provide a sketch of a proof?
(Hint: Jensen's Inequality is an important concept in probability theory; other important inequalities include Hölder's Inequality and Minkowski's Inequality)
Question #27 (economics - game theory)
What's a Nash equilibrium? Can you write down its formal definition? Can you provide an example?
Question #28 (programming - SQL)
In SQL, what's an inner join and what's an outer join? What's the difference between a left join and a right join?
(Comment: SQL is the standard programming language in the database world, and more and more quant shops are setting up SQL-based databases and data warehouses)
Question #29 (statistics)
How do you calculate sample variance? Show me the formula and implement it in C or C++.
Question #30 (finance case question)
There are two stocks A and B. I already own A, but I'm thinking of buying B to replace A. (I can only own either A or B at the same time.) I'm a U.S.-based investor subject to all U.S. taxes. How will my tax situation affect my decision whether to keep A, or to sell A and buy B? Please explain in detail.
[Courtesy of Dr. Warren Hrung of the New York Fed]
Question #31 (probability theory)
There are 30 people in my group. What are the odds that at least two people share the same birth month and day (e.g., July 25)? What are the odds that exactly two people share the same birth month and day? Finally, what are the odds that everybody was born in the same decade (where a decade is defined as any ten-year span, not necessarily "50s" or "60s" or "70s" etc.)?
Question #32 (statistics)
What's the difference between the t-stat and R^2 in a regression? What does each measure? When you get a very large value in one but a very small value in the other, what does that tell you about the regression?
Question #33 (finance - options)
Can you plot an option's delta as a function of the underlying stock's price? What does this plot tell you?
Question #34 (finance - portfolio theory)
Consider the utility function U(W) = W^(-1/2). What are the characteristics of this function with respect to absolute and relative risk aversion?
Explain the difference between absolute and relative risk aversion.
[First question taken from chapter 10 of Elton, et al. Modern Portfolio Theory and Investment Analysis]
Question #35 (programming - Perl)
In Perl, given a hash %bonus where the key is employee ID and the value represents the employee's expected year-end bonus, sort this hash by value from highest bonus to lowest.
Bonus question: how would you do this whole ID-->bonus mapping and sorting in C++ or C#?
Question #36 (brainteaser)
(The interviewer writes down the following equation on the whiteboard...)
XI + I = X
This is an equation expressed in Roman numerals. Imagine this equation is actually written out using sticks. Without touching or adding any stick, how can you make this equation true?
Question #37 (financial economics-related case question)
When you trade stocks, what are some of the different types of cost associated with your trading? How would you mitigate each type of cost?
(Hint: a cost need not be explicit...)
Question #38 (econometrics)
What are some of the causes of heteroskedasticity? How do you test for the presence of heteroskedasticity? (Please name at least two tests.) Finally, what are some of the techniques for dealing with heteroskedasticity?
Question #39 (financial time series)
What is Principal Component Analysis? Please explain in plain English as well as write down the model.
How does PCA differ from factor analysis?
(Comment: PCA is used heavily in studying asset returns; it is, for instance, a backbone of statistical arbitrage models)
Question #40 (mathematics - number theory)
Can you show that, for any prime number p that is at least equal to 5, the value of p^2 - 1 is a multiple of 24 (i.e., wholly divisible by 24)?
Question #41 (probability theory)
You are offered the chance to play a game. A fair coin is tossed repeatedly until you get the first tails, at which point the game ends and you get the prize. The prize "pot" starts at $1 and doubles each time you get heads. So for instance, if you get heads on the first toss, the pot becomes $2. If you get heads again on the second toss, the pot becomes $4. If you get heads the third time, the pot becomes $8. If the fourth toss comes up tails, you win and take home the $8 prize.
Before you play, you must pay a fee to enter this game. The question is, what's the maximum amount you're willing to pay in order to play this game? Explain your answer carefully.
Question #42 (probability theory)
What's the expectation of a uniform(a, b) distribution? What's its variance? Please derive your answers in mathematical terms, starting with the pdf.
Question #43 (finance - derivatives)
Kindly explain the difference between a futures contract and a forward contract. How are they priced differently?
Question #44 (statistics)
Given a dataset, how do you determine its sample distribution? Please provide at least two methods.
Question #45 (mathematics - algebra)
Let n be a natural number. Give the reduced expression for the following:
(1) 1+2+3+...+n
(2) 1 + 2^2 + 3^2 + ... + n^2
(3) 1 + 2^3 + 3^3 + ... + n^3
(4) 1 + 2^k + 3^k + ... + n^k, where k is another natural number.
Question #46 (programming - C++)
What are virtual functions in C++? What are they used for? Please write down an example of a virtual function to illustrate its usage.
Question #47 (applied math - stochastics)
A random walk process starts at the point 0. What is the probability that this random walk hits -2 before it hits 3? What if the process is a Brownian motion instead?
[Courtesy of Dr. Yun Cheng of ITG]
Question #48 (finance - options)
What is the lower bound for the price of a European call option on a non-dividend-paying stock? Can you derive this lower bound in a formal fashion?
Now, what if the call option is American? What if the stock pays a dividend every quarter?
Question #49 (general computing - Excel)
There are at least two ways in Excel to perform an OLS regression. What are they? What are some of the limitations of doing OLS in Excel (as opposed to using a real statistical package like EViews, Stata, R, S-Plus, or SAS)?
Question #50 (finance - capital market)
Why do price spreads exist in asset-trading markets? Can spreads ever be negative? If so, under what conditions?
Tougher: what are some examples of markets where price spreads do not necessarily exist?

http://v2.moneyscience.com/Information_Base/List_of_Sample_Questions_for_Quant_Interviews.html

mkchan · 9/29/11

Question #1 (statistics)
Jarque-Bera test

Question #3 (finance - bond pricing)
The discount rate is what you use in the denominator when you discount the expected future cash flows from a bond. It reflects the time value of money. As of today, short-term rates are about 0-0.25% and the 10-year rate is about 2%. You can get the rates for the other maturities by looking up the term structure of interest rates.
When you price a 30-year bond, you discount each expected coupon at the rate for the date on which that coupon is paid. You can use a single rate if you have a zero-coupon bond and you hold it to maturity.
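
A minimal sketch of the per-coupon discounting described above, assuming annual coupons and annually compounded zero rates. The function name `bond_price` and the sample term structure are made up for illustration, not market data.

```python
def bond_price(face, coupon_rate, spot_rates):
    """Price an annual-pay bond by discounting each cash flow at its own rate.

    spot_rates[k] is the (hypothetical) annually compounded zero rate
    for a cash flow paid at the end of year k + 1.
    """
    n = len(spot_rates)
    coupon = face * coupon_rate
    price = 0.0
    for year, rate in enumerate(spot_rates, start=1):
        # The final cash flow includes the face value on top of the coupon.
        cash_flow = coupon + (face if year == n else 0.0)
        price += cash_flow / (1.0 + rate) ** year
    return price


# Illustrative only: a 3-year 5% coupon bond on an invented upward-sloping curve.
example_price = bond_price(100.0, 0.05, [0.01, 0.02, 0.03])
```

With a flat curve equal to the coupon rate the bond prices at par, which is a handy sanity check.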

Question #4 (probability theory)
You should switch

Question #5 (applied math)
Question #6 (statistics)
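Reading b and d as standard deviations, p as the correlation between X and Y, and assuming X and Y are jointly normal:
(1) X+Y ~ N(a+c, sqrt(b^2 + d^2 + 2pbd))
(2) X-Y ~ N(a-c, sqrt(b^2 + d^2 - 2pbd))
(3) X*Y is not normal in general. If X and Y are independent, its mean is a*c and its standard deviation is sqrt(a^2*d^2 + c^2*b^2 + b^2*d^2).
(4) X/Y is not normal either, and in general it has no finite mean or variance. If X and Y are independent with a = c = 0, X/Y is Cauchy with scale b/d.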

Question #7 (finance - options)
c + Xe^(-rt) = p + S, where X is the strike. You can derive it from a no-arbitrage argument. You need to make sure you're comparing European options on the same underlying with the same strike and expiry. You can sort of use American options, but you need to remember that an American put is worth at least as much as a European put, while an American call equals a European call on a non-dividend-paying stock.
To arbitrage, short the side of the equation that is overvalued and go long the side that is undervalued.
If you make this trade, you hope that interest rates don't change unfavorably on you.
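
As a rough sketch of that rule, assuming European options, continuous compounding, and no dividends: the helper names `parity_gap` and `parity_trade` and the `tolerance` parameter (a stand-in for transaction costs) are hypothetical.

```python
import math


def parity_gap(call, put, spot, strike, rate, maturity):
    """Return (call + PV of strike) - (put + spot); zero when parity holds."""
    return call + strike * math.exp(-rate * maturity) - (put + spot)


def parity_trade(call, put, spot, strike, rate, maturity, tolerance=0.0):
    """Say which side to short, ignoring gaps no larger than tolerance."""
    gap = parity_gap(call, put, spot, strike, rate, maturity)
    if gap > tolerance:
        # Call-plus-bond side is rich: sell it, buy the put-plus-stock side.
        return "short call and bond, long put and stock"
    if gap < -tolerance:
        # Put-plus-stock side is rich: sell it, buy the call-plus-bond side.
        return "long call and bond, short put and stock"
    return "no trade"
```

In practice the tolerance would have to cover bid-ask spreads, borrowing costs on the short stock leg, and the early-exercise risk mentioned above if the options are American.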

Question #8 (math - stochastics)
E[exp{W(t) - t/2} | Fs] = exp{W(s) - s/2} E[exp{W(t) - W(s) - (t-s)/2} | Fs] (exp{W(s) - s/2} is measurable w.r.t. Fs)
Go with Y = W(t) - W(s) - (t-s)/2. Y is normal and independent of Fs, so E[exp(Y) | Fs] = E[exp(Y)].
Then E[exp(Y)] = exp{E(Y) + 1/2 Var(Y)}
E(Y) = -(t-s)/2, Var(Y) = t-s
E[exp(Y)] = exp{-(t-s)/2 + (t-s)/2} = exp{0} = 1
Then you have E[exp{W(t) - t/2} | Fs] = exp{W(s) - s/2} * 1, which is what you need to prove to say exp{W(t) - t/2} is a martingale

Question #9 (programming - C++)
It compiles. The C-style cast turns the address of the string literal (a const char *) into an int *, so cout prints that address instead of the text.

Question #10 (econometrics - time series)
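An AR(p) process is stationary only if all roots of its characteristic polynomial 1 - phi_1*z - ... - phi_p*z^p lie outside the unit circle. For an AR(1), that means the coefficient is less than one in absolute value. If the coefficient equals one, the process is a random walk; if it's greater than one, the process is explosive. Because an AR process depends on past values of itself, shocks only die out when this condition holds.
To find the order of an AR process look at the PACF; for an MA process look at the ACF. The last lag at which the function is significant is the order of the process. A stationary AR(1) is like an MA(inf), and an invertible MA(1) is like an AR(inf).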
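
For p > 1, looking at each coefficient separately is not enough. A small sketch for the AR(2) case, x_t = phi1*x_(t-1) + phi2*x_(t-2) + e_t, checks the equivalent condition that both roots of z^2 - phi1*z - phi2 = 0 lie strictly inside the unit circle. The function name is made up for illustration.

```python
import cmath


def ar2_is_stationary(phi1, phi2):
    """True if both roots of z^2 - phi1*z - phi2 = 0 have modulus below 1."""
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)  # may be complex
    roots = ((phi1 + disc) / 2.0, (phi1 - disc) / 2.0)
    return all(abs(root) < 1.0 for root in roots)


# Both coefficients are below one, yet the process is not stationary.
both_small_but_unstable = not ar2_is_stationary(0.5, 0.6)
```

Setting phi1 = 1 and phi2 = 0 recovers the random walk, which fails the check because one root sits exactly on the unit circle.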

Question #11 (econometrics - time series)
A GARCH process is just an ARMA process for the conditional variance. You can use GARCH to calculate VaR or to forecast volatility for pricing options. The GARCH(1,1) model is sigma_t^2 = omega + alpha*u_(t-1)^2 + beta*sigma_(t-1)^2. In plain English, it means next period's variance is a weighted combination of a long-run level, last period's squared shock, and last period's variance. You can use ARMA techniques (ACF/PACF on the squared residuals) to pick the order of the GARCH model.
You usually estimate it with MLE.
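
A minimal sketch of the variance recursion, assuming the standard GARCH(1,1) form with a constant term `omega` and parameters that are already known. The function name and the choice to start at the long-run variance are assumptions for illustration; estimating the parameters is a separate step.

```python
def garch11_variance_path(shocks, omega, alpha, beta):
    """Conditional variances from sigma2_next = omega + alpha*shock^2 + beta*sigma2.

    Returns len(shocks) + 1 values; the last one is the forecast for the
    period after the final observed shock.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    long_run = omega / (1.0 - alpha - beta)
    sigma2 = [long_run]  # assumed starting value: the unconditional variance
    for shock in shocks:
        sigma2.append(omega + alpha * shock ** 2 + beta * sigma2[-1])
    return sigma2
```

A large shock lifts the next variance above its long-run level, and with no further shocks it decays back at rate beta, which is the volatility clustering the model is meant to capture.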

Question #15 (case question hard to categorize)
If I have a 51.3% chance of profiting from a trade, I would need to find out what the variance is from the analysis. Is 51.3% significant? What's the t-stat and p-value? If the p-value is low (less than 1%), I'd go for it.

Question #16 (applied math - control theory)
The Kalman filter is based on a state equation and an observation equation. The state equation describes how a hidden variable evolves over time; the observation equation links the noisy data we actually see to that hidden state. The filter recursively predicts the next state and then corrects the prediction by the Kalman gain times last period's error (actual minus predicted observation).
The Kalman filter is good for estimating unobservable variables like the spot price of oil, smoothing out random walks, and finding moving averages. It was also used for landing spacecraft on the moon, etc.
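
A minimal one-dimensional sketch of the predict/update recursion, assuming a random-walk state observed with noise (the so-called local level model). The function name, parameter names, and default starting values are illustrative assumptions.

```python
def kalman_filter_1d(observations, process_var, measurement_var,
                     initial_state=0.0, initial_var=1.0):
    """Filter a random-walk state observed with noise; returns state estimates."""
    state, var = initial_state, initial_var
    estimates = []
    for z in observations:
        # Predict: a random-walk state keeps its mean, but uncertainty grows.
        var_pred = var + process_var
        # Update: the gain weighs the prediction against the new observation.
        gain = var_pred / (var_pred + measurement_var)
        state = state + gain * (z - state)  # correct by gain * (actual - predicted)
        var = (1.0 - gain) * var_pred
        estimates.append(state)
    return estimates
```

A gain near one means the filter trusts the data; a gain near zero means it trusts its own prediction. The two variances would normally be estimated, for example by maximum likelihood.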

atreides · 9/29/11

Question #1 (statistics)
You can also do a QQ plot, where you compare the quantiles of your sample against those of a normal distribution and draw a line through the plotted points. If all your points lie on this straight line, then your distribution is normal. This gives you a visual check of normality, and you can also see whether there are fat left or right tails. Or you can compute the Jarque-Bera statistic: if JB > 6 (roughly the 5% critical value of a chi-square with 2 degrees of freedom, 5.99), reject normality; if JB < 6, you cannot reject normality.
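
For reference, a small sketch of the Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. The function name is made up, and the chi-square comparison is only asymptotic, so it is unreliable for small samples.

```python
def jarque_bera(sample):
    """Jarque-Bera statistic from the sample's central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
```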

Question #4 (probability theory)
You'll win 2/3 of the time if you switch
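
A quick Monte Carlo sketch of that 2/3, assuming the host always opens a goat door that is neither your pick nor the car. The function name, trial count, and seed are arbitrary choices for illustration.

```python
import random


def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning the car when always switching or staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a goat door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Move to the one door that is neither the pick nor the opened one.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials
```

If the host only sometimes offers the switch, the assumption behind this simulation breaks and the answer can change.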

Question #19 (finance)
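VaR measures how much you could lose over a certain period of time at a certain confidence level.

Typical assumptions:
The distribution of returns is covariance stationary, i.e., its mean and variance are constant over time
Market returns are normal and iid over time

No, not in general. VaR is a quantile, not a variance, so it does not simply add up. If returns are jointly normal with zero mean, VaR is proportional to the standard deviation, so VaR(A+B)^2 = VaR(A)^2 + VaR(B)^2 + 2*rho*VaR(A)*VaR(B), where rho is the correlation. Then VaR(A+B) = VaR(A) + VaR(B) only if rho = 1; if A and B are independent, VaR(A+B) = sqrt(VaR(A)^2 + VaR(B)^2), which is smaller than the sum.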
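
A quick numerical sketch, assuming jointly normal zero-mean returns so that VaR is a fixed multiple of the standard deviation. The function name and the sample figures are made up for illustration.

```python
import math


def combined_var(var_a, var_b, correlation):
    """VaR of A+B under the jointly normal, zero-mean assumption."""
    return math.sqrt(var_a ** 2 + var_b ** 2 + 2.0 * correlation * var_a * var_b)


# With stand-alone VaRs of 3 and 4, only perfect correlation gives the sum of 7.
independent_case = combined_var(3.0, 4.0, 0.0)
perfectly_correlated_case = combined_var(3.0, 4.0, 1.0)
```

Outside the normal case VaR need not even be sub-additive, which is one of the standard criticisms of it as a risk measure.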

Yike Lu · 9/30/11

mkchan said:
Question #15 (case question hard to categorize)
If I have a 51.3% chance of profiting from a trade, I would need to find out what the variance is from the analysis. Is 51.3% significant? What's the t-stat and p-value? If the p-value is low (less than 1%), I'd go for it.

I can do better: what is the expected value and is it significantly larger than zero (nobody said the upside and downside were symmetric)? If so, what is the standard deviation of returns? We can form some type of Sharpe ratio to measure the reward/risk ratio.

Brad Warren · 9/30/11

Yike Lu said:
I can do better: what is the expected value and is it significantly larger than zero (nobody said the upside and downside were symmetric)? If so, what is the standard deviation of returns? We can form some type of Sharpe ratio to measure the reward/risk ratio.

I think if I work on an arbitrage desk, I'm not making a trade with a 49% chance of a loss.

Brad Warren · 9/30/11

atreides said:
Question #4 (probability theory)
You'll win 2/3 of the time if you switch

If I know the host always opens another door, then I would switch. But maybe he does it only because I picked the right door.

Yike Lu · 9/30/11

Brad Warren said:
I think if I work on an arbitrage desk, I'm not making a trade with a 49% chance of a loss.

Names can be deceiving these days, what with Volcker Rule and all.

In any case... what if the upside/downside is 2:1? You are really going to tell me you would not take that trade?
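
To put numbers on it, here is a small sketch of the expected profit per trade, assuming a fixed gain when the trade wins and a fixed loss otherwise. The function name and the one-unit loss size are illustrative assumptions.

```python
def expected_pnl(p_win, gain, loss):
    """Expected profit of a trade that wins `gain` or loses `loss`."""
    return p_win * gain - (1.0 - p_win) * loss


symmetric_edge = expected_pnl(0.513, 1.0, 1.0)   # equal upside and downside
two_to_one_edge = expected_pnl(0.513, 2.0, 1.0)  # upside twice the downside
```

The symmetric case earns about 0.026 per unit risked, while the 2:1 case earns about 0.539, so the payoff shape matters far more than the 51.3% by itself.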

Brad Warren · 9/30/11

Yike Lu said:
Names can be deceiving these days, what with Volcker Rule and all.

In any case... what if the upside/downside is 2:1? You are really going to tell me you would not take that trade?

Well in that case then I would probably look at the downside risk and use some risk-adjusted measure to evaluate the trade.

D C · 10/3/11

Question #36 (brainteaser)
XI + I = X is "11 + 1 = 10" which is not true as is. (if written on a whiteboard) I would tell the interviewer to turn around, back to the white board, stick his head between his legs, and look at the board again. It should now read "X = I + IX" which is "10 = 1 + 9" which is true.

euroazn · 10/3/11

Is 21 for real!?! You have to know _all_ of those languages?

DStahl · 10/10/11

Question #47 (applied math - stochastics)
A random walk process starts at the point 0. What is the probability that this random walk hits -2 before it hits 3? What if the process is a Brownian motion instead?

I am most likely incorrect, but for Brownian motion I get p=.116642

$ P(M<m|W(t)=w) =e^{-\frac{2m(m-w)}{t}} $ Using methods outlined by Shreve.

Given m=-2, w=3,
$ P(M<-2|W(t)=3)=e^{-\frac{20}{t}} $

But we do not know when "t" is...so we find the expected value of this expression with respect to the pdf of the distribution of t hitting 3.

$ \int_0 ^ \infty e^{\frac{-20}{t}} \frac{|m|}{2t\sqrt{2\pi t}}e^{-\frac{m^2}{2t}} $
$ \int_0 ^ \infty \frac{3}{2t\sqrt{2\pi t}}e^{-\frac{9+40}{2t}} $
$ \int_0 ^ \infty \frac{3}{2t\sqrt{2\pi t}}e^{-\frac{49}{2t}} $
$ =\frac{\sqrt{2/3}}{7}=.116642 $

Brad Warren · 10/10/11

The probability should be greater than 0.5 because -2 is closer to 0 than 3.

The process is a martingale. Its expectation at any future time is 0. Let us use the expectation in terms of the probability of hitting -2 first, at the time it hits either -2 or 3.

0 = -2p + 3(1 - p)

p = 3/5

I didn't assume the process was either a random walk or Brownian motion, so it doesn't make a difference.

DStahl · 10/10/11

Brad Warren said:
The probability should be greater than 0.5 because -2 is closer to 0 than 3.

The process is a martingale. Its expectation at any future time is 0. Let us use the expectation in terms of the probability of hitting -2 first, at the time it hits either -2 or 3.

0 = -2p + 3(1 - p)

p = 3/5

I didn't assume the process was either a random walk or Brownian motion, so it doesn't make a difference.

I am not sure I agree...The problem gives that the brownian motion hits 3. Given that it hits 3, how many possible paths include -2? I would think less than 50 percent.

Edit: Upon thinking more about it, I think you are correct. The probability of hitting 3 eventually is one, so the probability of it hitting -2 or 3 is independent of the fact that we are given that it hits 3. In which case all we have to do is calculate the probability that it hits -2 or it hits 3, and it is obviously more likely to hit -2.

Tom Maloney · 10/10/11

#47 is the gambler's ruin problem. Consider starting with $2, and let p be the probability of winning $1 and q = 1 - p the probability of losing $1 on each bet. Then the question becomes: what is the probability that the gambler goes bust before he hits his goal of $5? The answer is 3/5 if p = q, or $\frac{(q/p)^2-(q/p)^5}{1-(q/p)^5}$ if $q\neq p$.
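
A short sketch of the closed form next to a simulation check, assuming independent one-dollar bets. The function names, trial count, and seed are arbitrary choices for illustration.

```python
import random


def ruin_probability(start, goal, p):
    """Chance of hitting 0 before `goal`, starting from `start`, winning each bet w.p. p."""
    q = 1.0 - p
    if p == q:
        return 1.0 - start / goal
    r = q / p
    return (r ** start - r ** goal) / (1.0 - r ** goal)


def simulate_ruin(start, goal, p, trials=20_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        wealth = start
        while 0 < wealth < goal:
            wealth += 1 if rng.random() < p else -1
        ruined += wealth == 0
    return ruined / trials
```

Starting at 2 with a goal of 5 maps to the walk that starts at 0 with barriers at -2 and 3, so `ruin_probability(2, 5, 0.5)` is the 3/5 above.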

nomadnerd · 10/28/11

Question 31.

Ok, here's my very crude way to compute it. Please correct me if you find a mistake!

So, let's assume every year has 365 days. The chance of being born one specific day is equal to the probability of being born any other day (1/365).

Then:

Possible combinations PC = 365^30

Then, the number of combinations with at least two people born on one day D is 29*30. In fact:

Array of birthdays:
D D x x x x x x x x x x x x x x x x x x x x x x x x x x x x
D x D x x x x x x x x x x x x x x x x x x x x x x x x x x x
D x x D x x x x x x x x x x x x x x x x x x x x x x x x x x
...
D x x x x x x x x x x x x x x x x x x x x x x x x x x x x D
x D D x x x x x x x x x x x x x x x x x x x x x x x x x x x
...
x x x x x x x x x x x x x x x x x x x x x x x x x x x x D D

This has to be true for every possible day of the year.
Thus the combinations with at least 2 equal birthdays are: 29*30*365.

The probability is 29*30*365/365^30

Way to make it more complicated: birthdays are unequally distributed along the year, so we weight the probability of every single day instead of considering them all 1/365.

How to consider leap years?

Brad Warren · 10/28/11

Having at least 2 birthdays the same is the complement of no one sharing a birthday:

$P=1-\frac{365}{365}\frac{364}{365}...\frac{365-n+1}{365}=1-\frac{365!}{(365-n)!365^n}$

The probability exactly two people share the same birthday is the probability only the first 2 people do multiplied by the number of combinations:

$P=\frac{365}{365}\frac{1}{365}\frac{364}{365}...\frac{365-n+2}{365}{n\choose2}=\frac{365!}{(365-n+1)!365^n}{n\choose2}$

The odds that everybody was born in the same decade would depend on the nature of the group. I mean, it would be more likely at a high school reunion than say if it was a group of neighbors.
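
Plugging in n = 30, a small sketch that evaluates both formulas, assuming 365 equally likely birthdays and reading "exactly two people share" as exactly one matching pair with everyone else distinct. The function names are made up for illustration.

```python
from math import comb


def p_no_shared_birthday(n, days=365):
    """Probability that n people all have different birthdays."""
    prob = 1.0
    for k in range(n):
        prob *= (days - k) / days
    return prob


def p_at_least_one_shared(n, days=365):
    """Complement of nobody sharing a birthday."""
    return 1.0 - p_no_shared_birthday(n, days)


def p_exactly_one_pair(n, days=365):
    """One pair shares a day; the other n - 2 people take distinct other days."""
    prob = comb(n, 2) / days  # choose the pair; second member matches the first
    for k in range(1, n - 1):
        prob *= (days - k) / days
    return prob
```

For 30 people this gives roughly 0.706 for at least one shared birthday and roughly 0.380 for exactly one shared pair.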

quotes · 12/8/11

Brad Warren said:
The probability should be greater than 0.5 because -2 is closer to 0 than 3.

The process is a martingale. Its expectation at any future time is 0. Let us use the expectation in terms of the probability of hitting -2 first, at the time it hits either -2 or 3.

0 = -2p + 3(1 - p)

p = 3/5

I didn't assume the process was either a random walk or Brownian motion, so it doesn't make a difference.

How about the process with finite time allowed?
To preserve a martingale
P(ending in upper boundary)+P(ending positive in the end but below upper boundary)=0.5
Am I correct?

Brad Warren · 12/8/11

quotes said:
How about the process with finite time allowed?
To preserve a martingale
P(ending in upper boundary)+P(ending positive in the end but below upper boundary)=0.5
Am I correct?

Since the barriers are not symmetric about zero, the distribution for the random walk won't be either, so the probability of being > 0 is not 1/2. The probability is dependent on time. Immediately after time 0, the probability will be 0.5 because there is no chance of hitting the barrier. As time increases, the probability will decrease and approach 0.4 because it is increasingly likely to hit a barrier, with a greater chance of reaching the negative one.

quotes · 12/8/11

Brad Warren said:
Since the barriers are not symmetric about zero, the distribution for the random walk won't be either, so the probability of being > 0 is not 1/2. The probability is dependent on time. Immediately after time 0, the probability will be 0.5 because there is no chance of hitting the barrier. As time increases, the probability will decrease and approach 0.4 because it is increasingly likely to hit a barrier, with a greater chance of reaching the negative one.

Unfinished Gambler's Ruin
Can you answer the question in this post?
