Bootstrap summary



Prasanna Malaviya
05-22-1998, 05:03 AM
Hello All,

Many many thanks to all who responded to my questions regarding the bootstrap
resampling technique. I am very close to having a workable program which
will handle my data.

I found the information at http://www.statistics.com to be very useful in
bringing me up to speed as to what resampling is all about. I read through
the on-line version of Julian Simon's book "Resampling: The New Statistics"
and found it very lucid and easy to understand. Using the information in
chapter 8 of that book I am writing a program to work with my data.

Unfortunately, I could not get hold of the Efron papers that some people
recommended. However, I did get the book "Introduction to Robust
Estimation and Hypothesis Testing", (by Rand Wilcox, Academic Press, 1997)
from our library and found that useful too.

Again, many thanks to all who responded. The compiled postings are
included below.

Have a nice Memorial Day weekend!

Prasanna Malaviya, PhD
-----------------------------------------------------------------------
>From jhiggins@sfsu.edu Fri May 8 18:27 EDT 1998

You may want to contact stats@resample.com. I think they have a home page
but I do not have it bookmarked on this computer...sorry...if I remember I
will look when I am at my office on Monday...good luck...this is, I think, a
solvable problem...JRH

Joseph R. Higgins
Department of Kinesiology
San Francisco State University
415 339-1746
415 338-7566 (FAX)
------------------------------------------------------------------------
>From amrik@bme.ri.ccf.org Thu May 7 14:27 EDT 1998

Prasanna,
I may have a few suggestions but have a few questions to begin with.
1) Since Rij are different, how did you use GLM to handle the repeated
nature of the observations? A more appropriate procedure would be "Proc
Mixed" in SAS. Note: the repeated structure of data has to be adjusted
by modelling the correlation within obs from same animal. Failure to do
so may result in an underestimate of the variances leading to erroneous
conclusions in Hypothesis testing.
2) What do you mean by ranges being 8-10 etc...? Does this imply the
number of repeat measures for each subject??
3) N=3 is very small, 5 or more may be appropriate for a repeated
measures setting.
4) Bootstrap will not increase the confidence level; it will allow you
to estimate the power and Type 2 error rate. If Type 2 error rate is
high (i.e. power is low) then your results may be purely due to chance.
Two possible ways of implementing:
Method 1- You could set up a sampling scheme in which you randomly pick
a certain number of obs from each activity level (with replacement) and
keep in mind the animal to whom the picked obs belongs. Then you
basically redo the analysis and note the conclusion i.e. reject or
accept null hypothesis. Repeat the above 100-500 times and see for what
percent the null hypothesis is rejected, this is the power.

Method 2- This is more of simulation. From the parameters estimated from
data (i.e. mean for each activity level, variance and covariances),
simulate the data set with arbitrary number of repeat measures per
animal per activity level. Do the analysis on this simulated data set
and repeat the whole process many times. Again, this will provide an
estimate of power. This may be a little tricky since you have to use the
within-subject and between-subject variability estimates to generate the
data.
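Method 1 above can be sketched roughly as follows. This is only an
illustration under several assumptions that are not in the original post:
it compares just two activity levels, it uses a simple permutation test on
the difference in means as a stand-in for the real analysis, and it ignores
which animal each observation came from (a real implementation would have
to respect the repeated-measures structure). The function names and the
numbers are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def perm_test_rejects(a, b, rng, n_perm=200, alpha=0.05):
    """Two-sided permutation test on the difference in means.

    Stand-in for "redo the analysis"; returns True if the null is rejected.
    """
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    p_value = (extreme + 1) / (n_perm + 1)
    return p_value < alpha


def bootstrap_power(a, b, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_boot):
        # Resample each activity level separately, with replacement.
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        if perm_test_rejects(resampled_a, resampled_b, rng):
            rejections += 1
    return rejections / n_boot


# Made-up force values for two activity levels (illustration only).
level_1 = [10.2, 11.5, 9.8, 10.9, 11.1, 10.4, 9.9, 10.7]
level_2 = [12.0, 11.8, 12.9, 11.2, 12.4, 13.1, 11.6, 12.2]
print(bootstrap_power(level_1, level_2))
```

The 100-500 repetitions suggested above correspond to n_boot here; the
inner test can be swapped for whatever analysis is actually being used.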

Both techniques should give pretty much the same results. Let me know
if I can answer any specific questions. Cheers,
Amrik
-----------------------------------------------------------------------------

>From rajensen@nmu.edu Wed May 6 13:09 EDT 1998

You might want to have
a look at the article below. Although it is not with gait analysis, the
procedure is the same whatever you're looking at. Furthermore, you can
use most statistical techniques and not just regression analysis as we
did for this paper.


R.L. Jensen and G. Kline. (1994) The resampling cross-validation
technique in exercise science: Modeling rowing power.
Medicine and Science in Sports and Exercise, 26:929-933.


How the bootstrap helps increase the confidence in your conclusions is
that it is similar to taking several repeated samples of the population
of interest. This should give you a better idea if the reason you found
something statistically significant was just due to the peculiarities
of the initial group or if there was actually something going on.


The article gives a description of how to run the bootstrap, or
resampling procedure, as well as some limitations to consider. One
major assumption is that the group you have sampled initially is a true
representation of the population as a whole.

If you have any further questions, please contact me.


RJ
Randy Jensen
Dept. HPER
Northern Michigan University
Marquette, MI 49855
Phone: (906) 227-1184
FAX: (906) 227-2181
---------------------------------------------------------------------------

>From terry@brcinc.com Wed May 6 12:01 EDT 1998

You want to look for texts by Dr. Rand Wilcox of USC entitled:

Statistics for the Social Sciences, Academic Press, 1996.

Introduction to robust estimation and hypothesis testing, Academic Press, 1997.

The bootstrap technique basically takes the data that you have collected,
draws a sample (with replacement) from the n observations just sampled, and
computes the measure of location of interest (e.g. the Harrell-Davis
estimate). It performs this random sampling B times, where B is usually 100.
The sample standard deviation of the B values is then determined, and this
represents your standard error.
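The procedure just described can be sketched in a few lines. This is a
minimal illustration, not the S-PLUS macros from the Wilcox text: the
ordinary median is used as an assumed stand-in for the Harrell-Davis
estimate, and the function name and data values are made up.

```python
import random
import statistics


def bootstrap_se(data, estimator=statistics.median, B=100, seed=0):
    """Bootstrap standard error of `estimator` based on B resamples."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(B):
        # Draw n observations from the original sample, with replacement.
        resample = [rng.choice(data) for _ in range(n)]
        estimates.append(estimator(resample))
    # The standard deviation of the B estimates is the standard error.
    return statistics.stdev(estimates)


# Made-up measurements (illustration only).
forces = [12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 9.5, 12.7]
print(bootstrap_se(forces))
```

Any other measure of location can be passed as `estimator`, for example
statistics.mean.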

In the latter text, he provides readers with a web site to download macros that
run in SPLUS - those macros include several different bootstrap techniques that
would greatly increase the power of your statistical analysis. One of the main
premises of this type of statistical analysis is the fact that one cannot always
assume homogeneity of variance ... which is one of the fundamental assumptions
of the ANOVA analysis. The relatively low number of observations that we are
CONSTANTLY faced with in biomechanical studies also decreases the power of our
statistical analysis. A bootstrap analysis increases the number of observations
and consequently increases the statistical power. That is likely why the
reviewer suggested you turn to something like a bootstrap.

Good Luck
Terry Smith
Biomechanist
----------------------------------------------------------------------------

>From D.R.Mullineaux@tees.ac.uk Wed May 6 11:37 EDT 1998

Dear Prasanna

I will try and help you with each of your questions:

> 1. How would it be useful to apply the bootstrap technique to analyze
>our data? How will it help increase the confidence in our conclusions that
>activity has a significant effect on various measures of in vivo force?

Bootstrapping is generally helpful when the assumptions underpinning
the traditional inferential statistical tests are violated. With
regard to the ANOVA you have used, bootstrapping can help if the
data is not normally distributed. If the data is normally
distributed (and you have met any other necessary assumptions such as
homogeneity of variance) then the ANOVA results are valid and
bootstrapping is not required.

I normally use bootstrapping for correlational/regression analysis.
Firstly, I obtain the standard correlation coefficient. I then use
bootstrapping to obtain the 95% confidence interval for the
correlation coefficient. I assume it works similarly with ANOVA
although I haven't tried it.
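One common way to get such an interval is the percentile bootstrap,
sketched below. The post does not say which variant is used, so the
percentile method, the function names, and the data are all assumptions
made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns None if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci_r(x, y, B=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < B:
        # Resample (x, y) pairs together, with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            rs.append(r)
    rs.sort()
    lower = rs[round((alpha / 2) * B)]
    upper = rs[round((1 - alpha / 2) * B) - 1]
    return lower, upper


# Made-up paired measurements (illustration only).
x = [1.2, 2.3, 2.9, 4.1, 5.0, 5.8, 7.1, 8.4]
y = [2.0, 2.8, 3.9, 4.2, 5.9, 5.5, 7.8, 8.1]
print(pearson_r(x, y), bootstrap_ci_r(x, y))
```

Note that the pairs are resampled as units; resampling x and y
independently would destroy the relationship being estimated.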


> 2. Can you guide me to a publication(s) which I can read through (and
>perhaps further discuss with you) to help me formulate how to use this
>technique?
A good reference for bootstrapping is Zhu, W. (1997). Making
bootstrap statistical inferences: a tutorial. Research Quarterly for
Exercise and Sport, 68, 44-55.


I hope this is of help.

Best regards, David
Mr D.R.Mullineaux
School of Social Sciences
University of Teesside
Middlesbrough
Cleveland
TS1 3BA
UK

Tel: +44-1642-342355
Fax: +44-1642-342067
Email: D.R.Mullineaux@tees.ac.uk
--------------------------------------------------------------------------
>From clk@is.dal.ca Wed May 6 08:32 EDT 1998

Efron has written on this topic and perhaps the following three references
will help you.
1. B. Efron. The 1977 Rietz Lecture: Bootstrap methods:
another look at the jackknife. Ann Stat, 7:1-26, 1979.
2. B. Efron. Estimating the error rate of a prediction rule: improvement
on cross-validation. J Am Stat Assoc, 78:316-331, 1983.
3. B. Efron. The jackknife, the bootstrap and other resampling plans. Society
for Industrial and Applied Mathematics, Philadelphia, 1982.
I also believe he has a recent book on the topic although I do not have
the reference.
I have a paper where this approach was used for ECG signals for diagnostic
classification, it may help you with the application aspect. ref.
Hubley-Kozey et al, Spatial features in .... ventricular
tachycardia Circulation, 1995;92:1825-1838.
Hope this helps. Cheryl Kozey
-----------------------------------------------------------------------------

>From 344dapp@CMUVM.CSV.CMICH.EDU Tue May 5 21:42 EDT 1998

I recommend visiting the following URL. They have a computer program
called Resampling Statistics that you can download for a 30 day trial.

http://www.statistics.com/

Pete

Peter V. Loubert PhD, PT, ATC
Associate Professor of Physical Therapy
Central Michigan University
Mount Pleasant, MI 48859 USA

Email: Peter.Loubert@cmich.edu
-------------------------------------------------------------------------------

>From lcaillo@popalex1.linknet.net Wed May 6 09:22 EDT 1998

If the concern is a small sample, why would bootstrap or other resampling
techniques be useful? The suggestion sounds similar to trying a different
filter on a time series collected at too low a sampling frequency. If the
data is not there, it's just not there.

Isn't the real question whether the sample size was adequate? Sounds
like you've got a reviewer who wants to experiment with new statistical
techniques on someone else's time...

Leon.
***************************************************************************
Julianne D. & Leonard G. Caillouet
15617 Shenandoah Square
Baton Rouge, Louisiana 70817
504-753-7471
e-mail: lcaillo@popalex1.linknet.net
***************************************************************************
---------------------------------------------------------------------------

>From arnold@grbb.polymtl.ca Tue May 5 20:26 EDT 1998

Dear Prasanna,

My first general comment is that referees are not always right; all humans
make mistakes, referees included. I would like to be more specific,
however. As I didn't see your article I cannot judge how you performed the
ANOVA. It seems to me that the situation you described is pretty usual and
ANOVA should work.

Concerning the Bootstrap, I suggest you take a look at the article in
Scientific American, but I have unfortunately completely forgotten the
issue. It seems pretty old (maybe the end of the 70s or the beginning of the
80s) and appeared after Bradley Efron introduced this method. Please find
this article and you will understand the ideas, though I am not sure that
you MUST use it.

The idea (if you insist, or just obey the referee) is that you have to
"prepare" the random samples from your data set and repeat the calculations
numerous times (maybe 1000 or 10000), computing means, etc. It is not a
magic wand, however. You will always be working within the same data set.
An advantage of the Bootstrap is that you use "computer experiments" and
calculate empirical distributions and confidence intervals without using
parametric tests. It is, however, questionable whether or not with your
specific data you will get better results with the Bootstrap than with what
you did. The Bootstrap is not often used in your field (as I understood from
your search in Medline). Maybe somebody among the readers of Biomch-L will
give you SPECIFIC EXAMPLES.
-------------------------------------------------------------------

>From bi_942288@coco.cchs.usyd.EDU.AU Tue May 5 18:48 EDT 1998

Dear Prasanna,

David Mullineaux presented the paper "Allometric scaling of anaerobic
performance and the use of 'bootstrapping' for statistical inference from a
small sample" at the 2nd Australia and New Zealand Society of Biomechanics
Conference in Auckland, New Zealand, January this year. His e-mail address
is "D.R.Mullineaux@tees.ac.uk". I think he will be able to help you out.

Good luck.

Uangthip
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-++-+--+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Uangthip Rattanaprasert Smith e-mail: bi_942288@cchs.usyd.edu.au +
+ PhD Candidate, Home address: 16 Walters Rd, +
+ School of Exercise and Sport Science Berala 2141 NSW +
+ Faculty of Health Sciences, AUSTRALIA ,-_|\ +
+ The University of Sydney voice: +61 2 9649 5596 / \ +
+ East Street, Lidcombe, NSW 2141 fax: +61 2 9351 9204 \_,-._* +
+ AUSTRALIA v +
+ http://www.cchs.usyd.edu.au/Academic/ESS/main.html +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

--
Email:prasanna.malaviya@me.gatech.edu
------------------------------------------------------------------------------
1206 Monterey Parkway | Petit Inst for Bioengineering & Bioscience
Atlanta, GA 30350 | Georgia Institute of Technology
Ph: (770) 399-9950 | 281 Ferst Drive, Rm. 314B, SSTC-1
| Atlanta, GA 30332-0363
| Ph: 404-894-2212, Fax: 404-894-2291
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
