
View Full Version : Is there a statistician in the house?



Sticks
2010-Nov-10, 02:58 PM
From the attached image you will see the minimal data I have.

The set-up is that I have a heart rate monitor my sister sent me. Along with maximum heart rate, average heart rate and calories burned, one of the things it tells me is how long I have been in "the zone" for a session of "exercise".

Walking home, I have two possible routes. After taking some readings going one way, I wondered whether going the other way would keep me in the zone for a larger proportion of the journey time.

At first it seemed that the new route via Corporation Street meant I spent longer in the zone, but a third data point seems to dispute this, so I turned to statistical tools.

I discovered that Excel can give me a t-test figure, and I have used this function on the data in the attached image. I assumed the variances are not equal, and since the sample sizes differ the test is unpaired.

My H0 is that the means are equal (μPitt St = μCorp St) and H1 is that the two means are different, so this is a two-tailed test.

Excel therefore gave me a figure for t of t=0.072191936

The problem is, I do not have access at this time to copies of statistical tables, so what does this value of t mean?

Is it saying there is or there is not a difference between the two routes?

(If I can I may gather more data later in the week)

grant hutchison
2010-Nov-10, 03:14 PM
I'm not sure how you're managing to do a t-test with those data. Are you assuming that the four percentages in the first column constitute one normally distributed group, and the three percentages in the second column are another normally distributed group? If so, these are very small groups, and even if the assumption of normality is valid, you're not going to achieve statistical significance with such small values for t and DF.
I'd have thought you'd be better (better assumptions, better stats) doing a chi-squared test. Record the total number of minutes you spend "in the zone" and "out of the zone" for one route, and likewise for the other, by adding up the totals for all the journeys along a particular route. Draw up a two-by-two table, and compute your chi-squared statistic and p-value from that. There are lots of on-line calculators that will do the job for you.

Grant Hutchison

Swift
2010-Nov-10, 03:17 PM
Given how few data points you have, it basically means nothing. With only 3 data points for Corporation Street, that difference is not statistically significant.

What I would suggest is that you check the two routes several more times, and get a bigger set of data, and then do the t-test.

mike alexander
2010-Nov-10, 05:27 PM
As Grant noted, for the t-test to be meaningful you must test the data for normality first. As Swift noted, you really need more data.

Sticks
2010-Nov-11, 11:52 AM
It was brought to my attention that what Excel gave me was a p-value, and if I assume the two samples have equal variance, rather than unequal as before, I get a p-value of 0.016478695
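To make the equal-variance versus unequal-variance choice concrete, here is a minimal sketch of the two unpaired t statistics and their degrees of freedom (in Excel's TTEST these correspond to type 2 and type 3). The readings in the attachment are not reproduced in this thread, so the percentages below are hypothetical placeholders, and the function names are illustrative only. The sketch stops at t and the degrees of freedom; turning those into a p-value still needs the t distribution, which Excel or a statistical table supplies.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's two-sample t statistic and df, assuming equal variances."""
    na, nb = len(a), len(b)
    # Pooled sample variance, weighted by each group's degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite df (unequal variances)."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical placeholder percentages, NOT the readings from the attachment.
route_a = [60.0, 65.0, 58.0, 66.0]
route_b = [80.0, 74.0, 79.0]

t_pooled, df_pooled = pooled_t(route_a, route_b)
t_welch, df_welch = welch_t(route_a, route_b)
print(f"pooled: t={t_pooled:.3f}, df={df_pooled}")
print(f"Welch:  t={t_welch:.3f}, df={df_welch:.2f}")
```

The Welch degrees of freedom never exceed the pooled value, which is one reason the two options can return different p-values from the same data.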

I appreciate that large data sets are desirable and I will endeavour to collect more data, but I thought I had better give more info on the set-up so you can appreciate the difficulty in collecting this data.

The heart monitor is a Polar FT4 watch with chest strap. The strap has to be wetted and the transmitter / monitor attached. Readings are transmitted to the watch.

So far so good

The complication arose over what, if anything, you can wear over the top of the chest strap and transmitter. The instructions did not say, and only referred to shirts of any kind if one had an allergic reaction to the metal bits on the strap. In that case you put the shirt on and ensured it was wet where the strap went over the top of it.

So on initial use of this device I did not wear a shirt over the top of it, and only had my blazer on, changing in convenient rest rooms / toilets where I could.

Now when I asked about shirts on the forum related to the manufacturer of this product, people said they did wear something over the top and so I followed their lead.

When I was on Lundy Island in October, I was using this product on one of my walks and on the first walk on the island, I managed to achieve a maximum heart rate of 219 beats per minute. The assumption is that if that were true I would be dead, so it had to be caused by interference.

Now, I was using a digital camera, and unexpectedly on that walk I received a text on one of my mobiles from the provider. I have tried to recreate the anomalous reading with those devices and have not been able to, which leaves the possibility that it was the shirt. It was pointed out on that forum that polyester shirts can cause static and interfere with the transmissions between transmitter and receiver, and the shirt I wore that day contains polyester, as do many of my other shirts.

Until I can fathom a way to reliably generate a static spark on cue (I do not have access to a Van de Graaff generator), I have to assume the fault lay with wearing the shirt over the top of the transmitter.

This means that when collecting data using the watch, to ensure there are no stray signals, I have to go without a shirt over the top of the device, under my blazer / sports jacket.

As I am measuring my journey home, I therefore have to go into a toilet just before the security barriers, put on the strap et al., remove the shirt, put the blazer back on, start the monitor / watch, briefly hide with the lapels of the blazer the fact that I do not have a shirt on underneath, and rapidly exit my works so that nobody is any the wiser; otherwise there would be words. So far I have got away with it, or at least nobody has mentioned this at work.

As soon as I get sufficiently far away from the building, although I keep the blazer buttoned up, I make sure there is as uninterrupted a line of sight as I can manage between the transmitter and receiver.

Needless to say as winter approaches carrying out further tests may be curtailed.

BTW if anyone has an idea on how to recreate the rogue reading it would be nice

Sticks
2010-Nov-11, 06:54 PM
I did another walk up Corporation Street today with the heart rate monitor and added that into the mix

The additional value was 73.01%, and the new p-value for type = 2, where the two samples are assumed to have equal variance, is 0.012425421

According to my source, if I read this number right the p-value says we can reject H0 with about 98.8% confidence

For reference the means are

Pitt Street 62.27% in Zone
Corporation Street 77.93% in the zone

Comparing calories burned though, the mean for Pitt Street is 91 kcal and the mean for Corporation Street is 94 kcal.
The p-value is 0.266521767 which, again if I have got this correct, means you only have 73% confidence in rejecting H0

So with the limited data set here, although the p-values say I am most likely more in the zone, there is no evidence I am burning more calories. If this is the case, why?

I checked the mean times of the duration of walking up

Pitt Street takes 11 minutes 6 seconds

Corporation Street takes 10 minutes 59 seconds

p-value = 0.829270108

So there is no way we can reject the null hypothesis this time. With this amount of data, pitiful as it is, there is no evidence I am taking any longer or shorter time going either route.

So am still none the wiser

In summary the data collected from the monitor

[Attachment 13927: table of monitor data]

The t-test according to Excel, if I read it right, says there is a difference in the proportion of time I am in the sacred zone (whatever that is); however, there is no evidence of a difference between the two routes in time taken or calories burned.

So apart from the obvious of collecting more data sets, which I do try to do, given my earlier caveats over methodology of use of the heart monitor, where do we go from here?


Swift
2010-Nov-11, 07:08 PM
I would agree with your conclusions. As for where you go from here, I guess that depends on what the point of these tests is. Are you trying to choose between two different routes to walk? Are you trying to get your heart rate up or burn calories?

And in either case, you are only talking about 100 kcal for either route. If the point, for example, is to try to lose weight, the bigger question I would think is not which route do I pick, but how many times a day (or a week) can you walk either of the routes. Whether you burn 97 or 99 isn't going to make much difference.

Sticks
2010-Nov-11, 07:17 PM
The idea was, since I walk to and from work every day, to find which of the walking routes would burn the most calories for losing weight.

Someone said the more you were in the "Zone" the more fat you were burning, and as a proportion of time I am in the zone more going up Corporation Street. But with there being no difference in the calorific value of either route or in time taken, I wondered what was going on.

Swift
2010-Nov-11, 08:12 PM
Maybe the best take-away is that it doesn't make a big difference, so if you get bored with one route, you can take the other.

Happy walking.

grant hutchison
2010-Nov-11, 10:24 PM
I agree with Swift: it's a short walk, it doesn't involve many calories, and there's not much difference either way. I'm always careful to teach people that, in medical statistics, there's a difference between "statistically significant" and "clinically significant". A difference can be strongly statistically significant, but too small to worry about; I think this is one of those differences.

That said, I'll protest again that you shouldn't be doing a t-test on those "in the zone" percentages. The percentage scale cuts off top and bottom, limiting the scope for normal distribution tails.
If you're interested in the proportion of the time you're in the zone, you've got nominal data, and you should use a chi-squared test. I reconstructed the raw values in seconds from your second table, and added them up. For Pitt Street, you're in the zone a total of 2072s, and out of the zone for 1258s; for Corporation Street, you're in the zone for 2041s and out for 596s. A chi-squared test shows a highly significant (p<0.001) difference, with the proportion of time in the zone being higher for Corporation Street.
But if you're interested in the actual length of time spent in the zone per journey (which seems like a more useful statistic), you should take the raw values in seconds (which are much more likely to be normally distributed than those percentages are) and do your t-test on those. Assuming equal variance, that gives p=0.003, with Corporation Street being the higher mean.
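The chi-squared comparison described above can be sketched as follows, using the four totals in seconds quoted in this post. This is a minimal sketch under stated assumptions, not necessarily the calculation actually used: it computes the plain Pearson statistic for a two-by-two table without a continuity correction, and like any chi-squared test on these totals it treats each second as an independent observation. The function name is hypothetical.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Totals in seconds as quoted in the post: (in zone, out of zone) per route.
pitt_in, pitt_out = 2072, 1258
corp_in, corp_out = 2041, 596

stat, p_value = chi_squared_2x2(pitt_in, pitt_out, corp_in, corp_out)
pitt_share = pitt_in / (pitt_in + pitt_out)
corp_share = corp_in / (corp_in + corp_out)

print(f"chi-squared = {stat:.1f}, p = {p_value:.3g}")
print(f"in-zone share: Pitt St {pitt_share:.1%}, Corporation St {corp_share:.1%}")
```

The in-zone shares it prints give the direction of the difference, and the p-value can be checked against the p<0.001 figure quoted above.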

A statistically significant difference in "zone time" without a correspondingly significant difference in calories probably indicates that you are spending most of your "zone time" only just above the threshold heart rate, and most of your "non-zone time" only just below the threshold heart rate.

Grant Hutchison

Sticks
2010-Nov-11, 11:00 PM
Thanks for that, I really need to locate my old statistics books when I get the chance.

Sticks
2010-Nov-12, 05:31 PM
With apologies to Grant re the % in Zone part

I finally managed to get an equal number of data points for each route, and added maximum heart rate

Looking at all the data so far seems to confirm that there is no difference between the two routes, and even the infamous "In-Zone %" p-value looks less significant now.

[Attachment 13931: updated data table]

orionjim
2010-Nov-12, 08:20 PM
I would do it for at least two more weeks. A major part of the variation you are seeing could be a "day of week" effect. I was taught to always plot the data in time order... Take a look:

http://www.e-huh.com/baut/dataplot.jpg

Again, like everyone else has said, "you need more data".

Jim