
View Full Version : Math help, too (contest over)

George
2005-Feb-11, 03:25 AM
[Edit - Contest is afoot! Find the Y4 value from this.... post (http://www.badastronomy.com/phpBB/viewtopic.php?p=414773&highlight=#414773) ]

I would appreciate any help on a better way to interpolate my data points.

Take a smooth curve such as y=1/x^2. If I have scattered data points on the curve and want to interpolate, what is the best way?

There is no equation obtainable, to my knowledge, of the curve in question. I have used simple linear proportions to arrive at new data points between existing data points. However, by somehow combining the next inner and outer set of points, a more accurate value should be obtained. I can sense a better way, but I sure don't remember what to do. :-?

01101001
2005-Feb-11, 09:35 AM
I'm not gonna work it out myself, but it seems that, just like 2 points can define a straight line for linear interpolation, 3 points could define a 2nd-degree polynomial in x, a parabola, that you could use for interpolation. I suspect 4 points define a 3rd-degree polynomial.

Google seems to yield plenty of hits on parabolic interpolation and cubic interpolation.
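For what it's worth, here is a minimal Python sketch of the two-point versus three-point idea, tried on George's example curve y = 1/x^2. The sample x-values (1, 2, 3), the target x = 1.5, and the function names are made up for illustration; nothing here comes from the real data.

```python
# Sketch: linear vs. parabolic interpolation on samples of y = 1/x^2.
# The sample points and the target x are hypothetical, chosen only for illustration.

def f(x):
    return 1.0 / x**2

def linear_interp(p0, p1, x):
    """Straight line through two (x, y) points, evaluated at x."""
    (x0, y0), (x1, y1) = p0, p1
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def parabolic_interp(p0, p1, p2, x):
    """Parabola through three (x, y) points, using Newton divided differences."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d01 = (y1 - y0) / (x1 - x0)        # slope between points 0 and 1
    d12 = (y2 - y1) / (x2 - x1)        # slope between points 1 and 2
    d012 = (d12 - d01) / (x2 - x0)     # how fast the slope changes
    return y0 + d01 * (x - x0) + d012 * (x - x0) * (x - x1)

pts = [(x, f(x)) for x in (1.0, 2.0, 3.0)]
target = 1.5
lin = linear_interp(pts[0], pts[1], target)
par = parabolic_interp(pts[0], pts[1], pts[2], target)

print("true     ", f(target))
print("linear   ", lin, "error", abs(lin - f(target)))
print("parabolic", par, "error", abs(par - f(target)))
```

On this particular curve the parabola lands closer than the straight line, though neither is exact because 1/x^2 is not a polynomial.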

papageno
2005-Feb-11, 11:08 AM
Maybe a linear fit to y = a + bx, where x = 1/r^2?
Or maybe a linear fit on a log-log plot (if the data span more than an order of magnitude)?

Disinfo Agent
2005-Feb-11, 12:33 PM
I would appreciate any help on a better way to interpolate my data points.

Take a smooth curve such as y=1/x^2. If I have scattered data points on the curve and want to interpolate, what is the best way?

There is no equation obtainable, to my knowledge, of the curve in question. I have used simple linear proportions to arrive at new data points between existing data points. However, by somehow combining the next inner and outer set of points, a more accurate value should be obtained. I can sense a better way, but I sure don't remember what to do. :-?
From your post, it's not clear to me whether you're dealing with a problem of numerical interpolation, or statistical modelling.

I suggest that you do some research on "numerical interpolation techniques" and "linear and non-linear regression / curve fitting", and see what makes the most sense in the problem you have (there are many ways to do this...)

jfribrg
2005-Feb-11, 03:03 PM
My first thought in reading the post was that non-linear regression is what is needed, but on closer inspection, I'm not sure. If you have the curve y=1/x^2, then you already have the equation. The various scattered points, assuming they all lie on the curve, have no bearing. If you have a set of points and want the equation of the curve that comes closest to fitting, then regression is appropriate. Linear regression will give you a straight line. Non-linear regression will give you a curved line. If you have a bunch of points and want a curve that goes through each one, then look into splines. The most useful are cubic splines. Bezier curves and some other curve-fitting techniques may also be useful.

George
2005-Feb-11, 03:30 PM
From your post, it's not clear to me whether you're dealing with a problem of numerical interpolation, or statistical modelling.

I suggest that you do some research on "numerical interpolation techniques" and "linear and non-linear regression / curve fitting", and see what makes the most sense in the problem you have (there are many ways to do this...)

Thanks. I learned...

The process of finding the coefficients for the fitting function is called curve fitting; the process of estimating the outcomes in between sampled data points is called interpolation; whereas the process of estimating the outcomes beyond the range covered by the existing data is called extrapolation.
http://www.efunda.com/math/num_interpolation/num_interpolation.cfm#Polynomial
I did not realize the difference between the two terms. It seems interpolation is for finding points between known points, extrapolation for finding points beyond the given data points.

I am needing both, actually. However, since my curve is almost a straight line, I think a linear approach will suffice (at least initially). My web search reveals cumbersome methods which allow results for dealing with much tougher curves. To incorporate them would be, at this time, "too much squeeze for the juice".

Thanks all.

Disinfo Agent
2005-Feb-11, 03:56 PM
[...] since my curve is almost a straight line, I think a linear approach will suffice (at least initially) [...]
Any spreadsheet will do a linear least squares fit to the data for you...

Kristophe
2005-Feb-11, 03:57 PM
Do you need to do it by hand, or can you use a computational aid? 'Cause there's a range of graphing calculators that will do linear regressions for you. I carry my TI-83 with me just about everywhere I go.

If the data is almost a perfect line, and the number of points is reasonably low, just use the least-squares method.

ngc3314
2005-Feb-11, 04:43 PM
Do you need to do it by hand, or can you use a computational aid? 'Cause there's a range of graphing calculators that will do linear regressions for you. I carry my TI-83 with me just about everywhere I go.

If the data is almost a perfect line, and the number of points is reasonably low, just use the least-squares method.

By the way - this doesn't make it into many textbooks (at least in the fields I run into), but linear least-squares fitting works for the weighted sum of any arbitrary set of functions (including non-analytic ones - think fitting a galaxy spectrum as the sum of star spectra). I include the setup in some course notes (PDF form) at http://www.astr.ua.edu/keel/techniques/noise.pdf

(One of these days I'll actually remember the BBCode way of aliasing a URL in the post...)

George
2005-Feb-11, 05:47 PM
It seems ironic to me that I had to always do math to learn math even though math may be man's most pure scientific thought. Must get my "hands dirty" to learn purity. :)

Here's some data...

point..... x................ y

1........ 486.1 ....... 1.63208
2........ 488.0 ....... 1.63178
3........ 496.5 ....... 1.63046
4........ 501.7 ....... ?
5........ 514.7 ....... 1.62790
6........ 532.0 ....... 1.62569
7........ 546.1 ....... 1.62408

Solve for y4. (I know the answer.)

Winner gets a line of assorted icons from me [or Candy, if she is willing and around here] :)

My goal is to obtain a y value for x = 500.

I used the y = mx + b linear approach. Actually, I used an even more simple method, but it's the same (as my wife says - exactly the same, but different)

[Edit: I've added an inner and outer data point. FWIW, this is the index of refraction of my prisms for the given wavelengths. I must "interpolate" to get this to match my irradiance data. I would tell you the project, but it's over most people's heads - at least around noon time. :wink: ]

[Edit 2: Guessing allowed but only one per person]
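The y = mx + b approach with only the two neighbouring points can be sketched in a few lines of Python. The table values are the ones posted above; the names `data`, `interp_adjacent`, `y4_linear` and `y500_linear` are just labels picked for this sketch.

```python
# George's table: wavelength x -> index of refraction y. Point 4 is the unknown.
data = [(486.1, 1.63208), (488.0, 1.63178), (496.5, 1.63046),
        (514.7, 1.62790), (532.0, 1.62569), (546.1, 1.62408)]

def interp_adjacent(points, x):
    """Linear interpolation between the two known points that bracket x."""
    pts = sorted(points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            m = (y1 - y0) / (x1 - x0)   # slope of the segment
            return y0 + m * (x - x0)
    raise ValueError("x is outside the data range; that would be extrapolation")

y4_linear = interp_adjacent(data, 501.7)    # the contest point
y500_linear = interp_adjacent(data, 500.0)  # the value George is really after
print(f"{y4_linear:.6f}  {y500_linear:.6f}")
```

Both targets fall between points 3 and 5, so only those two rows of the table actually enter the calculation.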

Candy
2005-Feb-11, 09:18 PM
Winner gets a line of assorted icons from me [or Candy, if she is willing and around here] :)
:D

Candy
2005-Feb-11, 10:25 PM
1........ 486.1 ....... 1.63208
2........ 488.0 ....... 1.63178
3........ 496.5 ....... 1.63046
4........ 501.7 ....... ?
5........ 514.7 ....... 1.62790
6........ 532.0 ....... 1.62569
7........ 546.1 ....... 1.62408

Solve for y4.
I used Excel, and I got 1.62982. 8-[

George
2005-Feb-11, 10:54 PM
I used Excel, and I got 1.62982. 8-[
That is a statistical fit, isn't it?

It is less accurate than the simple linear method I mentioned. :-?
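To see why a statistical fit and the two-point method can disagree, here is a rough Python sketch of an ordinary least-squares straight line through all six known points. It is an assumption that this resembles what the spreadsheet did; the thread does not say which points or which fit option went into it, so the number need not match the one posted above. The names `fit_line` and `y4_lsq` are made up for the sketch.

```python
# Ordinary least-squares line y = m*x + b through all six known points.
data = [(486.1, 1.63208), (488.0, 1.63178), (496.5, 1.63046),
        (514.7, 1.62790), (532.0, 1.62569), (546.1, 1.62408)]

def fit_line(points):
    """Return slope m and intercept b minimising the sum of squared y-errors."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

m, b = fit_line(data)
y4_lsq = m * 501.7 + b
print(f"slope {m:.6e}, intercept {b:.6f}, y4 estimate {y4_lsq:.6f}")
```

A single straight line has to compromise across the whole range, so on a gently curved data set it does not pass through any of the points, which is the difference from interpolating between neighbours.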

George
2005-Feb-12, 04:54 AM
If the data is almost a perfect line, and the number of points is reasonably low, just use the least-squares method.

By the way - this doesn't make it into many textbooks (at least in the fields I run into), but linear least-squares fitting works for the weighted sum of any arbitrary set of functions (including non-analytic ones - think fitting a galaxy spectrum as the sum of star spectra). I include the setup in some course notes (PDF form) at http://www.astr.ua.edu/keel/techniques/noise.pdf

Thanks to both of y'all. I should not have used the phrase "scattered data points along the curve" as I have no points which are above or below the actual curve. I simply want to find data points along a curve from the data points known, without knowing the equation for the function.
I just figured there would be some simple method for this and feared it was too dumb a question (especially in light of Mike's encouragement for a dreaded "inverse smartest persons" thread. :P )

(One of these days I'll actually remember the BBCode way of aliasing a URL in the post...)
[One of these days I'll improve my "aliasing" problem with reading and answering posts. :P :-? ]

Disinfo Agent
2005-Feb-12, 03:28 PM
Solve for y4. (I know the answer.)

Winner gets a line of assorted icons from me [or Candy, if she is willing and around here] :)
How about awarding the one who gets closer to the answer? :wink:

My goal is to obtain a y value for x = 500.

I used the y = mx + b linear approach. Actually, I used an even more simple method, but it's the same (as my wife says - exactly the same, but different)
Which points did you use? All of them?

George
2005-Feb-12, 09:15 PM
Solve for y4. (I know the answer.)

Winner gets a line of assorted icons from me [or Candy, if she is willing and around here] :)
How about awarding the one who gets closer to the answer? :wink:
Agreed. Let's make the award on Friday (if Candy agrees). However, with any duplicate answers, the award will go to the first entry.

My goal is to obtain a y value for x = 500.

I used the y = mx + b linear approach. Actually, I used an even more simple method, but it's the same (as my wife says - exactly the same, but different)
Which points did you use? All of them?
I used only the points adjacent to the point in question. [The slope will be most accurate with the closer adjacent points.]

However, the rate of slope changes around the point in question should help produce a better answer. So, some sort of differentiation should be available to help. I have assumed there is something somewhat simple that could be used to improve the simple linear approach.

Candy
2005-Feb-12, 11:38 PM
Let's make the award on Friday (if Candy agrees). However, with any duplicate answers, the award will go to the first entry.
I'll be around. I'm taking the week off starting tomorrow. :D

A Thousand Pardons
2005-Feb-13, 04:36 AM
1.629710837 is what I get.

PS: oops, missed this

However, by somehow combining the next inner and outer set of points
If you are going to limit yourself to just those two points, then the simple linear interpolation is about the best you can do.

With the given six points, you can find a fifth degree polynomial that fits the data exactly--but it might blow up outside the range, especially if the "real" curve is not a fifth degree polynomial (you mentioned 1/x^2).

One way to go about it is to guess the types of relationships that might be involved. In other words, you suppose that you might see 1/x^2 and 1/x terms as well as x and x^2--but no terms with anything higher.

Construct a matrix with your y values and a column for each of x, x^2, 1/x, and 1/x^2. Now, solve for the linear combination of those values that best fits the data. The number I gave above was just the assumption that the curve had only 1/x^2 terms.

George
2005-Feb-13, 06:07 AM
1.629710837 is what I get.

PS: oops, missed this

However, by somehow combining the next inner and outer set of points
If you are going to limit yourself to just those two points, then the simple linear interpolation is about the best you can do.

With the given six points, you can find a fifth degree polynomial that fits the data exactly--but it might blow up outside the range, especially if the "real" curve is not a fifth degree polynomial (you mentioned 1/x^2).
The 1/x^2 curve was to serve as only an example of the curve's smoothness (as opposed to a bunch of scattered points needing a "fit").
There may be some equation out there for prisms which generates a function for the variation in the index of refraction per wavelength. But, I don't know of one. So, I must interpolate with the empirical data given. :-?

One way to go about it is to guess the types of relationships that might be involved. In other words, you suppose that you might see 1/x^2 and 1/x terms as well as x and x^2--but no terms with anything higher.

Construct a matrix with your y values and a column for each of x, x^2, 1/x, and 1/x^2. Now, solve for the linear combination of those values that best fits the data. The number I gave above was just the assumption that the curve had only 1/x^2 terms.
That seems hard. The curve itself is almost linear, when plotted, except for the range near 400 nm where the index of refraction increases faster than in the 600 nm range.

BTW, my linear calculation was slightly different than yours. #-o However, yours is closer to being correct. :)

[edit: You are in the lead =D> ]

George
2005-Feb-13, 06:09 AM
Let's make the award on Friday (if Candy agrees). However, with any duplicate answers, the award will go to the first entry.
I'll be around. I'm taking the week off starting tomorrow. :D

8) Have some fun. =D>

A Thousand Pardons
2005-Feb-13, 11:30 AM
That seems hard.

It's a standard way of dealing with it, though, so there may be packages available to do it for you. Basically, if you have m data points and want to include n terms (1, x, x^2, ... x^(n-1), or you can even use 1/x^2 terms) you build an mxn matrix of values, multiply an nxm matrix times an mxn to get an nxn matrix that you invert and use to interpolate on other values. If you restrict it to x^n type terms, then this method gives you the standard polynomial regression, and if m=n then the fit is exact (two points determine a line, three a quadratic, etc).

BTW, my linear calculation was slightly different than yours.
Mine wasn't really linear though. :) I used the above method where my terms were (1,1/x^2). I misread the OP.
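The recipe above (build the m x n matrix, form the n x n product, solve) can be sketched in plain Python roughly as follows, here with the two terms (1, 1/x^2) on the six known points. Some assumptions: the sketch solves the n x n system by Gaussian elimination instead of explicitly inverting the matrix, and the helper names `solve`, `fit_basis` and `y4_basis` are invented for illustration.

```python
# Least squares with arbitrary basis functions via the normal equations
# (A^T A) c = A^T y, where A[i][k] = basis[k](x_i).
data = [(486.1, 1.63208), (488.0, 1.63178), (496.5, 1.63046),
        (514.7, 1.62790), (532.0, 1.62569), (546.1, 1.62408)]

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):           # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_basis(points, basis):
    """Coefficients c_k so that sum(c_k * basis[k](x)) best fits the points."""
    A = [[g(x) for g in basis] for x, _ in points]   # m x n matrix
    y = [yy for _, yy in points]
    n = len(basis)
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    Aty = [sum(row[i] * yi for row, yi in zip(A, y)) for i in range(n)]
    return solve(AtA, Aty)

basis = [lambda x: 1.0, lambda x: 1.0 / x**2]        # the terms (1, 1/x^2)
coeffs = fit_basis(data, basis)
y4_basis = sum(c * g(501.7) for c, g in zip(coeffs, basis))
print(f"{y4_basis:.9f}")
```

Swapping in a different `basis` list (say x, x^2, 1/x, 1/x^2) is the only change needed to try other guesses at the relationship, though the normal equations can get numerically touchy as more nearly-parallel terms are added.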

Maksutov
2005-Feb-13, 12:46 PM
Statistically speaking, what you need is the application of Chebyshev Polynomials. (http://mathworld.wolfram.com/ChebyshevPolynomialoftheFirstKind.html) Really good for something beyond power series polynomial curve-fitting.

George
2005-Feb-13, 08:38 PM
That seems hard.

It's a standard way of dealing with it, though, so there may be packages available to do it for you. Basically, if you have m data points and want to include n terms (1, x, x^2, ... x^(n-1), or you can even use 1/x^2 terms) you build an mxn matrix of values, multiply an nxm matrix times an mxn to get an nxn matrix that you invert and use to interpolate on other values. If you restrict it to x^n type terms, then this method gives you the standard polynomial regression, and if m=n then the fit is exact (two points determine a line, three a quadratic, etc).

Ok. This seems to make some sense. I suppose one should guess the best fitting term to get the best result. I still think a more simple method will suffice.

BTW, my linear calculation was slightly different than yours.
Mine wasn't really linear though. :) I used the above method where my terms were (1,1/x^2). I misread the OP.
Well, it works better than the linear approach. :)

George
2005-Feb-13, 08:41 PM
Statistically speaking, what you need is the application of Chebyshev Polynomials. (http://mathworld.wolfram.com/ChebyshevPolynomialoftheFirstKind.html) Really good for something beyond power series polynomial curve-fitting.
Impressive but burdensome for my wimpy talents. :-?

Want to try your hand at winning the prestigious award? :)

A Thousand Pardons
2005-Feb-13, 09:29 PM
Ok. This seems to make some sense. I suppose one should guess the best fitting term to get the best result. I still think a more simple method will suffice.

It's actually pretty simple :)

If you want to brute force find the fifth degree polynomial that fits those six points exactly, just define a function f(x) that consists of the sum of six terms, one for each data point. Each term has a factor of x minus an x value, for all values except the data point, times one final factor that is whatever value is necessary to make the result equal that data point's y value. Because of the way we built this function, for each data point, all the terms will be zero, except the one corresponding term. The result will be a fifth degree polynomial.

Another way to do it, is start with a*x^5 + b*x^4 ... e*x + f = f(x), and one by one plug in all six data points. You'll have six linear equations, with six unknowns (a,b,c,d,e,f), which can be solved to find the polynomial.

The relative sizes of the x and y values make me cringe, though.
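The "sum of six terms" construction described in words above is usually called Lagrange interpolation. A minimal Python sketch on the six known points might look like this; the function name `lagrange` and the variable `y4_lagrange` are just labels for the sketch.

```python
# Lagrange form of the unique 5th-degree polynomial through six points.
data = [(486.1, 1.63208), (488.0, 1.63178), (496.5, 1.63046),
        (514.7, 1.62790), (532.0, 1.62569), (546.1, 1.62408)]

def lagrange(points, x):
    """Evaluate the interpolating polynomial through `points` at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi                      # the "final factor" scaled to hit y_i
        for j, (xj, _) in enumerate(points):
            if j != i:
                # This factor is 0 at every other data point and 1 at x_i.
                term *= (x - xj) / (xi - xj)
        total += term
    return total

y4_lagrange = lagrange(data, 501.7)
print(f"{y4_lagrange:.6f}")

# Sanity check: the polynomial reproduces every known point.
for xi, yi in data:
    assert abs(lagrange(data, xi) - yi) < 1e-12
```

Working with ratios like (x - xj) / (xi - xj) sidesteps the awkward sizes of x^5 against y values near 1.6, since no large powers of x are ever formed.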

George
2005-Feb-14, 03:29 PM
Ok. This seems to make some sense. I suppose one should guess the best fitting term to get the best result. I still think a more simple method will suffice.

It's actually pretty simple :)

If you want to brute force find the fifth degree polynomial that fits those six points exactly, just define a function f(x) that consists of the sum of six terms, one for each data point. Each term has a factor of x minus an x value, for all values except the data point, times one final factor that is whatever value is necessary to make the result equal that data point's y value. Because of the way we built this function, for each data point, all the terms will be zero, except the one corresponding term. The result will be a fifth degree polynomial.
Mush. Verbalized math is an oxymoron in my brain (reason #3 why I chose engineering :) )

Another way to do it, is start with a*x^5 + b*x^4 ... e*x + f = f(x), and one by one plug in all six data points. You'll have six linear equations, with six unknowns (a,b,c,d,e,f), which can be solved to find the polynomial.
Meat for a carnivore. Thanks. I can handle this.

Nevertheless, I still figured another approach would be available. For instance, we know the curve is negative slope. It is also "smooth". Therefore, the linear interpolation, from points 3 and 5, would clearly produce a result for y4 which would be too high in value (the curve must be under the line). Also, the slope produced between points 2 and 3 must extend below the y4 point. Y4 is now, somewhat, cornered. Points 5 and 6 will also form a sloped line that would extend below y4, improving our determination of y4. Further refinement would come from the percent difference in slopes applied, in weighted form, to the percent of x4 from x3 and also x5. In other words, the slope between the two adjacent points would be a good starting point and the refinement would come from the amount of adjacent slopes (weighted pull, if you will). Sounds messy, but it would be remedial math.

I am guessing something less crude is available. Perhaps your approach is the best, however.
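The "cornering" argument can be turned into numbers with a short Python sketch. It assumes, as the post does, that the curve is smooth, falling, and bowed so that it lies under the chord between points 3 and 5; the names `line_through`, `upper`, `lower_left` and `lower_right` are invented here.

```python
# Bracketing y4: the chord 3-5 is a ceiling, and the neighbouring segments
# (2-3 extended forward, 5-6 extended backward) act as floors,
# assuming the curve bends upward (convex) over this range.
p2, p3 = (488.0, 1.63178), (496.5, 1.63046)
p5, p6 = (514.7, 1.62790), (532.0, 1.62569)
x4 = 501.7

def line_through(pa, pb, x):
    """Value at x of the straight line through points pa and pb."""
    (xa, ya), (xb, yb) = pa, pb
    return ya + (yb - ya) * (x - xa) / (xb - xa)

upper = line_through(p3, p5, x4)        # plain linear interpolation
lower_left = line_through(p2, p3, x4)   # segment 2-3 extended to x4
lower_right = line_through(p5, p6, x4)  # segment 5-6 extended back to x4
lower = max(lower_left, lower_right)    # the tighter of the two floors

print(f"y4 should lie between {lower:.6f} and {upper:.6f}")
```

The bracket only locates y4 within a band; the weighted "pull" of the adjacent slopes that the post goes on to describe is what a quadratic or higher-order interpolation does systematically.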

George
2005-Feb-14, 08:17 PM
Current Contest Standings

[Top 2 shown only (for reasons not disclosed)]

1..... A Thousand Pardons
2..... Candy

=D> =D>

8-[

[The post with the question is.... here (http://www.badastronomy.com/phpBB/viewtopic.php?p=414773&highlight=#414773) ]

Candy
2005-Feb-15, 08:21 PM
Current Contest Standings

[Top 2 shown only (for reasons not disclosed)]

1..... A Thousand Pardons 1.629710837
2..... Candy 1.62982

=D> =D>

8-[

[The post with the question is.... here (http://www.badastronomy.com/phpBB/viewtopic.php?p=414773&highlight=#414773) ]

Well, I will officially end the contest.
The answer is y4 = 1.62969.
ATP wins hands down.

To keep in the spirit of the BABB, the official prize is...
http://www.clicksmilies.com/s0105/party/party-smiley-040.gif

A Thousand Pardons
2005-Feb-15, 08:25 PM
a wobble?! thanks, it's what I always wanted...except, I already have one

Candy
2005-Feb-15, 08:32 PM
a wobble?! thanks, it's what I always wanted...except, I already have one
How about a sports car, then?

http://www.clicksmilies.com/s0105/auto/car-smiley-002.gif

George
2005-Feb-15, 08:37 PM
a wobble?! thanks, it's what I always wanted...except, I already have one
How about a sports car, then?

http://www.clicksmilies.com/s0105/auto/car-smiley-002.gif

:lol:

Let's not get carried away. If we offer too much, then the second place winner may want something, too. :)

I suppose at least this is in order....

First Place..........Thousand Pardons....... =D> =D> =D> =D>
Second Place....Candy.............................. =D> =D>

Presenter's award (Candy) ....... 8) 8) =D> =D> :o :) =D> =D>

George
2005-Feb-20, 05:58 AM
Another way to do it, is start with a*x^5 + b*x^4 ... e*x + f = f(x), and one by one plug in all six data points. You'll have six linear equations, with six unknowns (a,b,c,d,e,f), which can be solved to find the polynomial.
Meat for a carnivore. Thanks. I can handle this.

UPDATE:

I used your 5th degree polynomial with 1/x as the base. It nailed it dead on!

The numbers were all ugly. It suddenly came back to me, with a little help from Quattro Pro Help, to take the inverse of the matrix and multiply by the y-value matrix. It went slick. I hadn't done one like this before and it's been over 30 years since I did any matrix work. :)
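A rough Python sketch of the six-equations, six-unknowns route with 1/x as the variable is given here. It is not a reconstruction of the actual spreadsheet: it solves the system by Gaussian elimination instead of forming the inverse, and it rescales 1/x to the range -1 to 1 first (an assumption added here to tame the "ugly" numbers; it does not change which polynomial in 1/x is found). The names `solve`, `scaled`, `poly` and `y4_poly` are hypothetical.

```python
# 5th-degree polynomial in 1/x through six points, found by solving V c = y.
data = [(486.1, 1.63208), (488.0, 1.63178), (496.5, 1.63046),
        (514.7, 1.62790), (532.0, 1.62569), (546.1, 1.62408)]

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):           # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

# Map 1/x linearly onto [-1, 1] so powers up to 5 stay well behaved.
inv = [1.0 / x for x, _ in data]
center = (max(inv) + min(inv)) / 2
half = (max(inv) - min(inv)) / 2

def scaled(x):
    return (1.0 / x - center) / half

# One equation per data point: c0 + c1*t + ... + c5*t^5 = y, with t = scaled(x).
V = [[scaled(x) ** k for k in range(6)] for x, _ in data]
y = [yy for _, yy in data]
coeffs = solve(V, y)

def poly(x):
    t = scaled(x)
    return sum(c * t**k for k, c in enumerate(coeffs))

y4_poly = poly(501.7)
print(f"{y4_poly:.6f}")
```

Without the rescaling, the raw powers of 1/x run from about 2e-3 down to about 4e-14, which is where the ugly numbers come from.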

A Thousand Pardons
2005-Feb-20, 08:17 PM
I used your 5th degree polynomial with 1/x as the base. It nailed it dead on!
Cool.

Mathematically, that solution is equivalent to using x instead of 1/x, although there may be computational problems introduced that are not as severe in one or the other.

George
2005-Feb-20, 11:38 PM
I used your 5th degree polynomial with 1/x as the base. It nailed it dead on!
Cool.

Mathematically, that solution is equivalent to using x instead of 1/x, although there may be computational problems introduced that are not as severe in one or the other.

I knew the curve was somewhat parabolic; it looked more like 1/x than plain x. I assumed it would yield better results, but that was just a hunch.

It's a treat to sense gears engage after 30 yrs. of cryogenesis. :) That might be an encouragement to the young whipper-snappers (should they live so long). :wink:

A Thousand Pardons
2005-Feb-21, 01:42 PM
I was wrong anyway. :) Not sure what I was thinking. The polynomial with a*x^5 +...+ f has a y-intercept of f, whereas one with a*(1/x)^5 will blow up at x=0. Probably very close for intermediate values though.

George
2005-Feb-21, 02:28 PM
I was wrong anyway. :) Not sure what I was thinking. The polynomial with a*x^5 +...+ f has a y-intercept of f, whereas one with a*(1/x)^5 will blow up at x=0. Probably very close for intermediate values though.
I realized this but was not concerned. I already knew what the curve looked like at the data points known. It looks like a gentle sloping section of a parabola and the values were not close to the asymptote.

Since the result was accurate to within 0.001% on the test data (contest data) using your 5th-degree polynomial approach, my confidence is high for the other points, too. I also extrapolated using this method as I sensed the equation would not mind if I did this in lieu of interpolation. :wink: