tpribanic57

10-25-2005, 07:22 PM

Thank you all for your input.

Best, Tomo.

Original question:

Dear All,

Very often one validates (compares) two different methods of some kind. In perhaps its simplest form, one calculates, for each method, the differences between a known ground truth and the values given by the proposed method (model). Then, by calculating certain statistical characteristics of the data, such as RMS error, mean error, etc., one concludes which method is more accurate. In addition to simply reporting mean values for each method, some people demand going at least one step further. For example, they argue that it is not sufficient to say only that one value (mean error) is smaller than the other; it is also necessary to run further statistical (probability) tests to determine whether the obtained difference is statistically significant. In other words, the stated hypotheses about the validation (comparison) of the two methods have to be tested.
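For readers who want the descriptive step spelled out, here is a minimal sketch of the mean error and RMS error computation. The function name `error_summary` and the bar-length data are made up for illustration and are not from any of the posts.

```python
import math
from statistics import mean

def error_summary(measured, truth):
    """Return (mean error, RMS error) of measured values against ground truth."""
    errors = [m - t for m, t in zip(measured, truth)]
    mean_error = mean(errors)                             # signed bias
    rms_error = math.sqrt(mean([e * e for e in errors]))  # overall magnitude
    return mean_error, rms_error

# Hypothetical reconstructed lengths (mm) of a 500 mm reference bar.
truth = [500.0, 500.0, 500.0, 500.0]
method_a = [501.0, 499.0, 502.0, 498.0]

bias_a, rms_a = error_summary(method_a, truth)
print(f"mean error = {bias_a:.3f} mm, RMS error = {rms_a:.3f} mm")
```

Note that the mean error can be zero while the RMS error is not, which is one reason a single summary number is rarely enough to rank two methods.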

What would be the correct strategy when one would like to compare several different camera calibration methods (and/or different 2D systems) in terms of 3D reconstruction accuracy? Usually, I have seen works where people simply find the (mean) differences between reconstructed and known ground-truth positions and lengths (static tests) and/or velocities and accelerations (dynamic tests). Do we also need to employ things like a t-test here to determine whether the difference between two methods (systems) is statistically significant? If so, is it enough to perform only a single calibration for each system (method), or multiple ones, and how should the results of multiple calibrations (reconstructions) then be combined? Is it possible to argue that a single calibration is representative enough and work solely with it? What would be the test for that?

Thank you all for your input, the summary will follow.

Best,

Tomislav Pribanic, M.Sc., EE

Department for Electronic Systems and Information Processing

Faculty of Electrical Engineering and Computing

3 Unska, 10000 Zagreb, Croatia

tel. ..385 1 612 98 67, fax. ..385 1 612 96 52

E-mail : tomislav.pribanic@fer.hr

Reply 1:

One existing method for comparing two measurement devices is the Bland and Altman plot, named after the researchers who proposed it. Maybe it will help you.

The bibliographic reference is:

Altman DG and Bland JM (1983), "Measurement in Medicine: the Analysis of Method Comparison Studies," The Statistician, 32, 307-317.
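As a rough illustration of the idea (not taken from the reply itself), the quantities behind a Bland and Altman plot can be sketched as follows: for each pair of measurements, the difference is plotted against the mean, together with the bias and the commonly used limits of agreement at bias ± 1.96 standard deviations. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(a, b):
    """Return per-pair means, per-pair differences, bias, and limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits

# Hypothetical paired measurements of the same targets by two systems (mm).
system_a = [10.2, 20.1, 29.8, 40.3, 50.0]
system_b = [10.0, 20.0, 30.0, 40.0, 50.0]

means, diffs, bias, (lower, upper) = bland_altman(system_a, system_b)
print(f"bias = {bias:.3f} mm, limits of agreement = [{lower:.3f}, {upper:.3f}] mm")
```

Plotting `diffs` against `means` with horizontal lines at `bias`, `lower`, and `upper` gives the usual picture; a trend in the scatter suggests the disagreement depends on the magnitude being measured.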

Nasser Rezzoug

Reply 2:

Tomislav,

I have not performed such tests as the ones you describe, but if I were to review a paper, here is what I would be looking for:

Do we also need to employ things like a t-test here to determine whether the difference between two methods (systems) is statistically significant?

In general, you do need to quantify the significance of the difference. Consequently, I would compare the methods with a t-test (statistical comparison) and, regardless of whether the difference is statistically significant, I would question whether the difference is practically significant.
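A minimal sketch of that two-step check, assuming both methods reconstructed the same set of targets so that a paired t-test applies (an assumption, not something stated in the reply). The error values and the 0.5 mm tolerance are hypothetical; the statistic is compared against a tabulated critical value rather than converted to a p-value.

```python
import math
from statistics import mean, stdev

def paired_t(errors_a, errors_b):
    """Return (t statistic, degrees of freedom, mean difference) for paired samples."""
    d = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1, mean(d)

# Hypothetical absolute reconstruction errors (mm) on the same six targets.
errors_a = [1.2, 0.9, 1.5, 1.1, 1.3, 1.0]
errors_b = [0.8, 0.7, 1.0, 0.9, 0.9, 0.8]

t, df, mean_diff = paired_t(errors_a, errors_b)

T_CRIT = 2.571       # two-sided critical value for alpha = 0.05, df = 5 (t table)
TOLERANCE_MM = 0.5   # hypothetical smallest difference that matters in practice

statistically_significant = abs(t) > T_CRIT
practically_significant = abs(mean_diff) >= TOLERANCE_MM
print(t, df, statistically_significant, practically_significant)
```

With these made-up numbers the difference is statistically significant but smaller than the chosen tolerance, which is exactly the situation where the practical question matters.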

If so, is it enough to perform only a single calibration for each system (method), or multiple ones, and how should the results of multiple calibrations (reconstructions) then be combined? Is it possible to argue that a single calibration is representative enough and work solely with it?

If you are confident that the calibration is very accurate compared to the data you are collecting (usually by a factor of 10), it is fair to assume that one calibration is sufficient. If not, I would run the test at least three times to ensure that the calibration does not affect your data.

What would be the test for that?

I think that you can then run a linear regression or ANOVA to find which parameters most influence the accuracy of your data.

Cheers!

Sylvain

sylvaincouillard@hotmail.com

Reply 3:

I read your request for information on stats for your reconstruction data. I think it is very important to find the repeatability of the reconstruction. I think you have three options. First, you could repeat the test and use descriptive stats (mean, standard deviations), which are very easy to interpret. Second, you could repeat the test and, depending on the number of systems you are testing, use a t-test if you've got 2, or ANOVA if you've got 3 or more.
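For the case of three or more systems, a one-way ANOVA F statistic can be sketched by hand as below. This is only an illustration under the assumption that each system was calibrated and tested several times independently; the function name and the RMS values are invented, and the resulting F would still need to be compared against an F table (or a statistics package) for the given degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual trials around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical RMS reconstruction errors (mm) from three repeated tests per system.
system_1 = [1.0, 1.2, 1.1]
system_2 = [1.5, 1.7, 1.6]
system_3 = [1.0, 1.1, 0.9]

f_stat, df_b, df_w = one_way_anova_f([system_1, system_2, system_3])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The per-group means and standard deviations used inside this calculation are the same descriptive statistics mentioned as the first option, so both views come from the same repeated tests.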

You should try to get hold of the book by Vincent, William J., 'Statistics in Kinesiology', from Human Kinetics - ISBN: 0736057927. The stats are based around physiological studies, but the methods are the same.

Hope this helps.

Kind regards,

Cheryl Metcalf

Postgraduate Researcher

Room 3055, Mountbatten Building 53

University of Southampton

Highfield, Southampton

SO17 1BJ

Reply 4:

Hi Tomislav,

In addition, the 3D reconstruction error has another dimension: the 3D location at which the error occurs. Typically one would expect the error to be evenly distributed within the calibrated volume and to increase outside the volume. This is not always the case.
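One simple way to start looking at this, sketched here as an assumption rather than as the method of the abstract cited below, is to split the per-point errors by whether the point lies inside the calibrated volume. The volume is taken to be an axis-aligned box, and the coordinates and error values are hypothetical.

```python
from statistics import mean

def mean_error_inside_outside(points, errors, lo, hi):
    """Mean error for points inside and outside an axis-aligned calibrated volume."""
    inside, outside = [], []
    for p, e in zip(points, errors):
        in_box = all(l <= c <= h for c, l, h in zip(p, lo, hi))
        (inside if in_box else outside).append(e)
    return (mean(inside) if inside else None,
            mean(outside) if outside else None)

# Hypothetical calibrated volume: a 2 m cube with one corner at the origin.
lo, hi = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0)
points = [(1.0, 1.0, 1.0), (0.5, 1.5, 1.0), (2.5, 1.0, 1.0), (1.0, -0.5, 1.0)]
errors = [0.8, 1.0, 2.1, 1.9]  # reconstruction error at each point (mm)

err_in, err_out = mean_error_inside_outside(points, errors, lo, hi)
print(f"inside: {err_in:.2f} mm, outside: {err_out:.2f} mm")
```

A finer grid of bins inside the volume would show whether the error is really evenly distributed there, which a single inside/outside split cannot reveal.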

See my abstract on this:

Barton, JG, Lees, A (2002) Spatial visualisation of the reconstruction error of optoelectronic 3D motion analysis systems. Abstract / Gait and Posture, 16/Suppl.1: 140-141. 11th ESMAC Meeting and Conference, Leuven, Belgium, 16-21 Sept.

Gabor

--

Dr Gabor Barton (MD)

Senior Lecturer in Biomechanics

Research Institute for Sport and Exercise Sciences

Liverpool John Moores University

Webster Street

Liverpool

L3 2ET

United Kingdom

http://www.livjm.ac.uk

tel: +44 (0)151 231 4333 / 4321

fax: +44 (0)151 231 4353
