Date: Tue, 13 Oct 2009 19:08:45 +0100

Subject: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: c-kuhnen@kellogg.northwestern.edu, jchiao@northwestern.edu

Dear Drs. Kuhnen and Chiao,

I read with great interest your article "Genetic Determinants of

Financial Risk Taking".

I noticed that the p-values for the significance of the difference in

the means (figure 1 B and C) are quite marginal, less than 0.02 and

less than 0.04 for 5-HTTLPR and DRD4 respectively.

Was there a significant difference under a two-tailed significance test?

I would be interested to see the underlying data so that I can try my

own statistical analysis. I'm reviewing your paper for a journal club,

and I think one question for discussion will be the difference between

a one-tailed and a two-tailed significance test. I note that your

hypothesis for the direction of the difference seems valid, and that a

one-tailed test seems appropriate in this case.

One thing that I would love to know with regard to the work, which

was missing from the paper... Which group of people made the most

money during the test? I guess this result is somewhat 'sensitive'?

;-)

Thanks very much for your assistance with the above request and the questions.

I've shown this paper to about 10 people so far, and I'm really

looking forward to reviewing it. I think it touches on one of the

"defining issues of our age" (to steal a piece of hyperbole from the

global warming campaigners... actually I think you could go so far as

to call your results "an inconvenient truth"!).

Sincerely,

Dan Bolser.

Date: Tue, 13 Oct 2009 13:37:59 -0500

From: "Camelia M. Kuhnen"

To: Dan Bolser

Cc: jchiao@northwestern.edu

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

Dear Dan,

Thank you for your interest in our paper. Unfortunately, we are unable

to share the individual level data from this project at this point.

Regards,

Camelia

Camelia M. Kuhnen

Assistant Professor of Finance

Kellogg School of Management

Northwestern University

2001 Sheridan Road

Evanston, IL 60208-2001

E-mail: c-kuhnen@kellogg.northwestern.edu

Phone: 847-467-1841

Fax: 847-491-5719

Web: http://www.kellogg.northwestern.edu/faculty/kuhnen/htm/index.html

Date: Wed, 14 Oct 2009 02:19:51 +0100

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: "Camelia M. Kuhnen"

Cc: jchiao@northwestern.edu

2009/10/13 Camelia M. Kuhnen:

> Dear Dan,

> Thank you for your interest in our paper. Unfortunately, we are unable to

> share the individual level data from this project at this point.

Why is that, if you don't mind me asking?

At what point do you think you will be able to share it?

I was wondering, for example, if there were any significant

associations with gender that could be used (on average) to improve

the calculation of the residual 'excess risky investment'? If the data

is unavailable, would it be possible for you to make this test for me?

If I can't repeat the analysis directly, can I ask exactly which 'mean

comparison test' was used? i.e. was it the t-test? Can you please

describe in a bit more detail how the regression was done?

Also (sorry!), Figure 1 B/C shows 'standard errors'. Are these

standard errors of the mean? (it looks that way, but I'm just trying

to make sure I have all the details for my 'journal club').

Thanks for getting back to me so promptly, and thanks again for the

interesting work, and for considering the above questions. I'm sorry

that I can't simply work on these questions myself using the

underlying data.

Sincerely,

Dan.

From: "Camelia M. Kuhnen"

To: Dan Bolser

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

Date: Tue, 13 Oct 2009 20:39:45 -0500

Cc: "jchiao@northwestern.edu"

I am sorry, Dan, but the data set is proprietary and cost us a lot of

money to obtain. We plan to use it in other studies in the future and

at this point we will not make it publicly available. While I

understand there are interesting questions to answer that we have not

addressed in the PLoS paper, such as whether there are gender effects,

I really do not have time right now to do additional analyses to

answer these questions. Please don't take it personally, but I have

other priorities at work at the moment.

Regards,

Camelia

Date: Wed, 14 Oct 2009 03:11:47 +0100

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: "Camelia M. Kuhnen"

Cc: "jchiao@northwestern.edu"

2009/10/14 Camelia M. Kuhnen:

> I am sorry, Dan, but the data set is proprietary and cost us a lot of money

> to obtain. We plan to use it in other studies in the future and at this

> point we will not make it publicly available. While I understand there are

> interesting questions to answer that we have not addressed in the PLoS

> paper, such as whether there are gender effects, I really do not have time

> right now to do additional analyses to answer these questions. Please don't

> take it personally, but I have other priorities at work at the moment.

No, of course I don't take it personally, I'm just surprised that

there is no way for you to give me the data.

The normal practice in biology is to provide the data under an

institutional 'non disclosure' agreement that basically says that if I

want to publish any findings based on the data that I should get your

express permission to do so first, and abide by any conditions that

you require (usually co-authorship). Failure to abide by the NDA is

then a serious institutional legal issue.

Anyway, could you not provide the ~5,900 'residuals' with only a s/s

yes/no flag and a 7-repeat yes/no flag? I'm struggling to see what I

could do with that data other than simply try to repeat the analysis

that you have presented. I'm assuming this would just be 65 rows of

data with three columns?

Could you or one of your colleagues please let me know how the

difference between the mean residuals in the different groups was

tested for significance? I'm guessing t-test, but I'd like to know for

sure.

Thanks again for your help (and patience).

Sincerely,

Dan.

Date: Wed, 14 Oct 2009 07:09:53 -0500

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: joan.chiao@gmail.com

To: Dan Bolser

Cc: "Camelia M. Kuhnen"

Dear Dr. Bolser,

Thanks very much for your interest in our work and this project more

broadly.

If you would like to collaborate on this project, would you be able to send us a CV

or some kind of resume so we can better understand your scientific

background and expertise? I was unable to find your professional website

online.

Regarding distribution of primary data, I understand the issue of 'open

access' to data, but usually, at least in neuroimaging, this is done by the

authors depositing the data into a repository which then mitigates requests

for data. As I understand from Cami, in the field of Finance, distribution

of raw data is much rarer, and typically only occurs to collaborators or

known colleagues who work in the same research field. Finally, as Cami

mentioned earlier, we are still working with the data and have planned a

number of additional analyses and at this time would prefer not to make the

data 'open access'.

In any case, hope that you understand the starting assumptions underlying

our preference not to distribute raw data for the time being.

Sincere regards,

Joan Chiao

Date: Wed, 14 Oct 2009 16:10:04 +0100

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: jchiao@northwestern.edu

Cc: "Camelia M. Kuhnen"

Hi guys,

Thanks for all your kind, patient and careful replies so far.

First off, to pick up your request, I have put some information about

myself at the end of the email. I must apologise, because I appreciate

that this is a courtesy I often forgo. I hope the information won't

affect your decision about whether or not to answer my email! ;-)

OK, to clarify there are two outstanding issues:

1) request for the data, and

2) request for clarification of the methods.

Request for data:

To be clear, I'm not asking for you to make the data open access.

In the first instance, I thought you could 'safely' send me the table

of the 65 "average excess risky investment" scores per subject, along

with a flag for the two genotypes of interest. i.e. 65 rows, three

columns. I want this data so I can repeat the statistical analysis

that you presented in the paper. As a bonus, I thought you could send

me an extra column with the gender of the subjects so I could look for

correlation within those groups. This request still stands, because I

just can't see how releasing that data to me could be in any way

problematic for you.

In the second instance, as Professor Kuhnen raised the issue of

proprietary data, I suggested that I would be willing to sign an NDA,

that would legally guarantee your right to fully control the data and

any results, be they commercial or academic in nature. I am still

willing to sign an NDA for the above data if you see fit to draw one

up.

In case you are worrying, my intention is simply to better understand

the analysis that you published. i.e. I want to look at the standard

deviation of the "average excess risky investment" and also try

various different statistical tests to compare the difference in the

mean, including a comparison of a one-tailed and a two-tailed test.

This is simply for my own curiosity. I do not intend to publish any of

the results, nor do I seek to obtain financial gain from the results.

Request for clarification of the methods:

I asked several questions about the analysis presented in the

publication that I honestly feel are (more or less) reasonable. I feel

that this kind of question is exactly why there is a communicating

author. I'll reiterate my questions here because of the garbled nature

of my previous emails (sorry about that):

1) Figure 1 B and C show 'standard errors'. Are these standard errors

of the mean?

2) Which 'mean comparison test' was used in the comparison? i.e. was

it the t-test?

3) Was there a significant difference between the groups under a

two-tailed significance test?

4) Can you describe in a bit more detail how the regression was done?

5) Is there any significant association with subject gender?

6) Which group of people made the most money during the test? I

suppose this has more to do with the experimental design than the

exact strategy employed.

Please accept my apology if any of these questions are very basic

within your respective fields. As I hinted before, I'm a biologist

(bioinformatician), and I don't think there are such 'standard' tests

in the field.

About me

I'm a bioinformatician working as a post doc research assistant at the

University of Dundee. I have a PhD in the bioinformatics of

protein-protein interaction and protein structure analysis. I'm

interested in NGS, genotyping, GWAS, and the future of a

'post-genomic' society. Since you were looking, here are some links:

* http://network.nature.com/people/dan

* http://www.linkedin.com/in/bolser

* http://openwetware.org/wiki/User:Dan_Bolser

* http://www.slideshare.net/danbolser

Please let me know if anything is still unclear.

Sincerely,

Dan Bolser

Date: Wed, 14 Oct 2009 10:40:40 -0500

From: "Camelia M. Kuhnen"

To: Dan Bolser

Cc: jchiao@northwestern.edu

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

Dan,

I normally do not share data unless requested by the editor of a

journal. However, I will make an exception now just because of the

cultural difference between your field and mine.

I have attached here the data file containing subject-level residual

investment and genotype. This is absolutely the maximum amount of data

that Joan and I will share at this point.

Regarding your questions:

(1), (2) and (3) you can answer on your own based on the attached data

(4) is answerable based on the details in the paper and supplement

(5) and (6) are not the focus of the paper and we will not discuss these

additional results for now. They may be part of other publications.

Good luck with your research,

Camelia

Date: Wed, 14 Oct 2009 17:39:19 +0100

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: "Camelia M. Kuhnen"

Cc: jchiao@northwestern.edu

Awesome!

Thanks very much for providing the data!

> Regarding your questions:

>

> (1), (2) and (3) you can answer on your own based on the attached data

> (4) is answerable based on the details in the paper and supplement

> (5) and (6) are not the focus of the paper and we will not discuss these

> additional results for now. They may be part of other publications.

With regard to question 2, "Which 'mean comparison test' was used in

the comparison?", I have tried a one-tailed t-test (Welch Two Sample

t-test), a one-tailed ks-test (Two-sample Kolmogorov-Smirnov test) and

a one-tailed wilcox-test (Wilcoxon rank sum test with continuity

correction).

The wilcox-test seems to fit the given p-value for 5-HTTLPR, but not

DRD4 (nor any of the other tests).

Can you enlighten me as to which 'mean comparison test' was used to

obtain the p-value of 0.04?

The lowest p-value I can obtain using the above tests is about 0.07.

Thanks again for providing the data.

Sincerely,

Dan.

Date: Wed, 14 Oct 2009 13:00:38 -0500

From: "Camelia M. Kuhnen"

To: Dan Bolser

Cc: Joan Chiao

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

See attached pdf -- the one-tailed p-value of the mean comparison test

is 0.0364.

Camelia

Date: Thu, 15 Oct 2009 10:38:41 +0100

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

From: Dan Bolser

To: "Camelia M. Kuhnen"

Cc: Joan Chiao

2009/10/14 Camelia M. Kuhnen:

> See attached pdf -- the one-tailed p-value of the mean comparison test is

> 0.0364.

Hi Camelia,

Thanks for the information.

I have now resolved the difference that I was seeing between your

results and mine (details included below for your reference).

Would you mind if I wrote this up as a comment for the PLoS One

website? I think this is the kind of thing they would like to see

there. If you agree, I'll send the text for your approval before

posting it there.

Thanks very much for all the help.

All the best,

Dan.

Details:

I found that the problem was that the stats package I use (called 'R')

does not automatically pool the variance of the two groups when

performing the t-test. (Stata clearly does this by default.) The

result is that R uses the Welch (or Satterthwaite) approximation to

the degrees of freedom for when the variances of the two samples are

not equal. Usually the difference in the magnitude of p is not large,

but in this case it was 'important' (0.07 vs. 0.04).

I used Levene's test to check the assumption of equal variance, and

found no evidence for unequal variances (F(1,63) = 0.8758, p =

0.3529).

Changing the t-test to use equal variances now gives me the exact same

result in R as you observe in Stata.

Date: Thu, 15 Oct 2009 13:31:05 -0500

From: "Camelia M. Kuhnen"

To: Dan Bolser

Cc: Joan Chiao

Subject: Re: Request for data "Genetic Determinants of Financial Risk Taking"

Sure, Dan, you have my approval to write up the comment on the PLoS One

website.

Regards,

Camelia