Talk:Floating-point arithmetic


Imprecise info about the imprecision of tan

IMHO the statement 'Also, the non-representability of π (and π/2) means that an attempted computation of tan(π/2) will not yield a result of infinity' is misleading: it is more a problem of the cos approximation not yielding 0 for pi()/2. If you replace cos(x) with sin(pi()/2 - x) for that range, you get a nice #DIV/0! for tan(pi()/2).

Likewise, sin(pi()) not resulting in 0 can be corrected by replacing sin(x) with -sin(x - pi()) for that range.

I'm not sure whether it holds in general, but if you reduce all trigonometric calculations to numerical values of sin in the first quadrant - which IMHO is possible - the results may come out quite fine ... something greatly neglected by Calc, Excel and others ... — Preceding unsigned comment added by 77.0.177.112 (talk) 01:26, 11 March 2021 (UTC)
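The effect described above is easy to reproduce outside a spreadsheet. Here is a minimal C sketch (assuming a POSIX libm; M_PI is the double nearest π and is not in strict ISO C). C prints inf where Calc shows #DIV/0!:

<syntaxhighlight lang="c">
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = M_PI / 2;  /* fl(pi/2): the double nearest pi/2, not pi/2 itself */

    printf("cos(x)               = %g\n", cos(x));  /* about 6.12e-17, not 0 */
    printf("tan(x)               = %g\n", tan(x));  /* about 1.63e16, finite */

    /* The identity cos(x) = sin(pi/2 - x), evaluated at x = fl(pi/2),
       hits an exact zero because the subtraction cancels exactly ... */
    printf("sin(pi/2 - x)        = %g\n", sin(M_PI / 2 - x));
    /* ... so the quotient overflows to infinity (the #DIV/0! analogue): */
    printf("sin(x)/sin(pi/2 - x) = %g\n", sin(x) / sin(M_PI / 2 - x));
    return 0;
}
</syntaxhighlight>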

No, the tan floating-point function has nothing to do with the cos floating-point function. Vincent Lefèvre (talk) 11:32, 11 March 2021 (UTC)
Hello @Vincent, sorry for objecting ... IMHO (school math) and according to Wikipedia (https://en.wikipedia.org/wiki/Trigonometric_functions, esp. 'Summary of relationships between trigonometric functions' there), tan(x) = sin(x) / cos(x). Once you get a proper cos at pi()/2 [use sin(pi()/2 - x), same reference], you can calculate a proper tan with overflow (#DIV/0! in Calc).

Perhaps it won't work 'in IEEE' (then that's a weakness there), but developers or users can achieve proper results once they have proper sin values for the first quadrant. — Preceding unsigned comment added by 77.0.177.112 (talk) 14:03, 11 March 2021 (UTC)

"tan(x) = sin(x) / cos(x)" is a mathematical definition on the set of the real numbers. This has nothing to do with a floating-point specification. — Vincent Lefèvre (talk) 18:23, 11 March 2021 (UTC)[reply]
Hello @Vincent, what's going on? There is a possibility to get mathematically correct results, and you don't want it to be posted?

0.: Even though you call it a 'non-floating-point specification', do you agree that the formulas hold and achieve correct results?

1.: The Wikipedia article does not(!) state that there is any special 'floating-point tangent' specification (and IMHO there isn't any); it states 'that an attempted computation of tan(π/2) will not yield a result of infinity', and that is simply only true for some attempts - by calculating sin() and cos() as above you can get the correct overflow.

2.: 'Mathematical definition on the set of the real numbers' - yes, but what in that contradicts applying it to float or double values, which are a subset of the reals? Some representations and results will have small deviations; that's the trade-off for the speed of floats, but the basic rules of math should hold as long as there are no special points against them (as there are against, e.g., the associative rule; a short demonstration follows at the end of this comment). pi(), pi()/2, pi()/4, 2*pi() and so on are not exact in floats or doubles ... just as they are not(!) exact in decimals; despite that, we calculate infinity for tan(pi()/2) in decimals, and thus we can(!) do the same in doubles (and floats?).

3.: Plenty of things in this world suffer from small deviations in fp calculations ... we should start correcting them instead of getting the prayer mill of 'fp math is imprecise' going again and again.

4.: I am meanwhile slightly annoyed when 'fp math is imprecise' is pushed again and again for the wrong reasons. fp math has weaknesses, and 'you have to care what you do with it' has been true and well known since Goldberg, but this does not forbid achieving correct results with good algorithms - on the contrary, Goldberg and Kahan explicitly recommend it (because they did not see floating-point numbers as a special world in which its own laws apply, but as tools for processing real-world tasks as fast and as well as possible).

5.: The article states that a correct calculation of tan(x) at pi()/2 is impossible as a result of the representation of pi() being imprecise. I'd show that (a) it's not impossible, and (b) the representation of pi() isn't an obstacle to good results.

Agree? If not, please reply with clear definitions and sources ... — Preceding unsigned comment added by 77.3.16.116 (talk) 16:29, 12 March 2021 (UTC)
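The associativity aside in point 2 is easy to demonstrate (a one-off C sketch, double precision):

<syntaxhighlight lang="c">
#include <stdio.h>

int main(void)
{
    /* Floating-point addition is not associative: */
    printf("(0.1 + 0.2) + 0.3 = %.17g\n", (0.1 + 0.2) + 0.3);  /* 0.60000000000000009 */
    printf("0.1 + (0.2 + 0.3) = %.17g\n", 0.1 + (0.2 + 0.3));  /* 0.59999999999999998 */
    return 0;
}
</syntaxhighlight>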

Well, reading the beginning of what you said, about using sin(x-pi/2) in the implementation, yes, due to the cancellation in the subtraction, one could get a division by 0 and an infinity. I've clarified the text by saying "assuming an accurate implementation of tan". This would disallow implementations that do such ugly things. Even using sin(x)/cos(x) in the floating-point system to implement tan(x) would be a bad idea, due to the errors on sin and on cos, then on the division. And for 5, you misread the article (it implicitly assumes no contraction of the tan(pi/2) expression, but this is quite obvious to me). The article does not say that computing tan at the math value π/2 is impossible, it just says that the floating-point tan function will never give an infinity, because its input cannot be π/2 exactly (or kπ+π/2 exactly, k being an integer). — Vincent Lefèvre (talk) 03:44, 13 March 2021 (UTC)
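To put a number on "will never give an infinity" (a back-of-the-envelope check, not a claim from the article): in binary64 the nearest double to π/2 falls short of it by <math>\delta \approx 6.12 \times 10^{-17}</math>, so an accurate tan at that input returns

<math>\tan\!\left(\frac{\pi}{2} - \delta\right) = \cot(\delta) \approx \frac{1}{\delta} \approx 1.6 \times 10^{16},</math>

which is huge but finite.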
hello @Vincent,
- What do you refer to with 'the floating-point tan function'? I could only find ('C') library implementations and recommendations for FPGAs, plus 'The asinPi, acosPi and tanPi functions were not part of the IEEE 754-2008 standard because they were deemed less necessary' in https://en.wikipedia.org/wiki/IEEE_754.
- "assuming an accurate implementation of tan": that sounds misleading and imho an attempt to stick to 'fp-math is imprecise' despite there are correct solutions, duping them as 'not accurate',
- 'Due to the errors on sin and on cos': if you - or anyone - implement the trig functions as proposed, and in each case take the function / the part of the quadrant that has less error, you will get ... 'good results'.
- 'Implementations that do such ugly things': on the contrary ... IEEE, '(binary) fp math' and 'reducing accuracy by limiting to a small number of digits' do 'ugly things' to math in general, and most countermeasures rely on 'dirty tricks'. I'd suggest letting the mill of 'fp math is imprecise' phase out and saying instead 'we are intelligent beings, we can recognize difficulties and deal with them' ... or at least 'we try to'.
- 'Even using sin(x)/cos(x) in the floating-point system to implement tan(x) would be a bad idea, due to the errors on sin and on cos, then on the division.': don't think it's that simple; it is well known which trig function has weaknesses in which range(s) (when calculated by approximations, Taylor series or similar), so please consider using substitutions only for those ranges ... — Preceding unsigned comment added by 77.10.180.117 (talk) 11:09, 13 March 2021 (UTC)
The tan function (tangent) is included in the IEEE 754 and ISO C standards, for instance. The sentence "The asinPi, acosPi and tanPi functions..." is not about the tan function; moreover, this is historical information, as these functions are part of the current IEEE 754 standard as explained. My addition "assuming an accurate implementation of tan" is needed because some trig implementations are known to be inaccurate (at least for very large arguments), so who knows what one can get with such implementations... — Vincent Lefèvre (talk) 12:16, 13 March 2021 (UTC)
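For a concrete instance of "inaccurate (at least for very large arguments)" - a sketch assuming a POSIX C environment; how close the output comes to the true value depends entirely on the libm's argument reduction:

<syntaxhighlight lang="c">
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* A classic argument-reduction stress test: the correctly rounded
       binary64 value of sin(1e22) is about -0.8522008497671888.
       A libm with sloppy range reduction modulo pi can return a
       completely different number here. */
    printf("sin(1e22) = %.17g\n", sin(1e22));
    return 0;
}
</syntaxhighlight>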

Lead section edits

I edited the lead section to try to tidy it up in the following ways:

- Previously the opening sentence was "In computing, floating-point arithmetic (FP) is arithmetic using formulaic representation of real numbers as an approximation to support a trade-off between range and precision." I found this opaque (what is "arithmetic using formulaic representation"?) and oblique (it doesn't tell you what a floating-point number is; it only talks about an attempted "trade-off"). I think Wikipedia articles should open by defining the thing at hand directly, rather than talking around it. Therefore, the new opening sentence explicitly describes floating-point representation: "In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the mantissa, scaled by an integer exponent of a fixed base." (A small decomposition sketch follows after this list.)

- Both "significand" and "mantissa" are used to describe the non-exponent part of a floating-point number, but "mantissa" is far more common, so I think it's the better choice. (Google: "floating-point mantissa" yields 672,000 results; "floating-point significand" yields 136,000 results).

- Previously, the topic of the large dynamic range of floating-point numbers was mentioned twice separately; these mentions have been merged into a single paragraph.

- The links for examples of magnitude are changed to point to the actual examples mentioned (galactic distances and atomic distances).
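To illustrate the "integer scaled by an exponent" phrasing concretely (my own C sketch, not part of the article edit; frexp returns a fractional significand that is easy to rescale to the integer form):

<syntaxhighlight lang="c">
#include <math.h>
#include <stdio.h>

int main(void)
{
    int e;
    double m = frexp(12.345, &e);  /* 12.345 = m * 2^e with 0.5 <= m < 1 */

    printf("12.345 = %.17g * 2^%d\n", m, e);

    /* Rescaling by 2^53 (the binary64 significand width) yields the
       integer-mantissa-times-base-power form described above. */
    printf("       = %.0f * 2^%d\n", m * 9007199254740992.0, e - 53);
    return 0;
}
</syntaxhighlight>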

Feel free to discuss here. — Ka-Ping Yee (talk) 23:31, 12 October 2022 (UTC)

Schubfach is not WP:OR

I'm not quite sure why some of you consider Schubfach to be WP:OR. Several implementations have been around for several years already; in particular, it was adopted into Nim's standard library a year ago and is working fine. It's true that the article has not been formally reviewed, but honestly, being published in a peer-reviewed conference/journal does not necessarily lend that much credibility in this case. For example, one of the core parts (the minmax Euclid algorithm) of the paper on Ryu contains a serious error, and this has been pointed out by several people, including Nazedin (a core contributor to Schubfach) if I recall correctly.

The main reason why the Schubfach paper has not been published in a peer-reviewed journal, as far as I remember, is not that the work has not been verified, but simply that the author didn't see any benefit in going through all the paperwork of journal publishing (things like fitting into an artificial page limit). The reason why it is still not accepted into OpenJDK (is it? even if it's not merged yet, it will make it soon) is probably a lack of people who can and are willing to review the algorithm, and submitting the paper to a journal does not magically create such people. (Of course a journal would do some amount of review, but it is very far from perfect, which is why things like the errors in the Ryu paper were not caught in the review process.)

The point is, Schubfach as an algorithm was already completed a long time ago - in 2017, as far as I believe - and at least two implementations (one in Java and one in C++) have been around at least since 2019. The C++ one has been adopted into the standard library of a fairly popular language (Nim), and you can find several more places where it has been adopted (Roblox, a very popular game in the US, for example). So what really is the difference from Ryu? The only difference I can tell is that Ryu has a peer-reviewed journal paper, but as I elaborated, that isn't that big a difference as far as I can tell. You also mentioned new versions of the paper, and I felt as if you think Schubfach is sort of a WIP project. If that's the case, then no: the new versions are just minor fixes and clarifications rather than big overhauls. If the Ryu paper had not been published in a journal, its author would probably have made the same kinds of revisions (and fixed the error mentioned).

In summary, I think at this point Schubfach is definitely an established work with no less credibility than Ryu and the others. 2600:1700:7C0A:1800:24DF:1B93:6E37:99D2 (talk) 01:09, 10 November 2022 (UTC)

In the meantime, I've learned by e-mail that the paper got a (possibly informal) review by serious people. So it is OK to re-add it, but it is important to give references showing that it is used. And please give the latest version of the paper and avoid typos in the WP text. And instead of "apparently", try to give facts (i.e., what is really meant by "apparently"). Thanks. — Vincent Lefèvre (talk) 01:26, 10 November 2022 (UTC)

Digits of precision, a confusing early statement

I have removed the portion after the ellipsis from the following text formerly found in the article: "12.345 is a floating-point number in a base-ten representation with five digits of precision...However, 12.345 is not a floating-point number with five base-ten digits of precision." I recognize the distinction being made (a number with five base-ten digits of precision vs. a base-ten representation of a number with five digits of precision), and I suspect the author intended to observe that a binary representation of 12.345 would not have five base-ten digits of precision, but I can't divine what useful thing was meant to be communicated there, so I've removed it. If I'm missing something obvious in the interpretation of this line, I suspect many others could too, and I encourage a more direct explanation if it's replaced. john factorial (talk) 18:44, 24 July 2023 (UTC)

The sentence was made nonsensical by this revision by someone who mistook 12.3456 for a typo rather than a counterexample: https://en.wikipedia.org/w/index.php?title=Floating-point_arithmetic&diff=prev&oldid=1166821013
I have reverted the changes, and added a little more verbiage to emphasize that 12.3456 is a counterexample. Taylor Riastradh Campbell (talk) 20:56, 24 July 2023 (UTC)
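To see the counterexample concretely (a small sketch; printf's %.5g rounds to five significant decimal digits, mimicking a five-digit base-ten format):

<syntaxhighlight lang="c">
#include <stdio.h>

int main(void)
{
    printf("%.5g\n", 12.345);   /* "12.345" -- fits in five digits        */
    printf("%.5g\n", 12.3456);  /* "12.346" -- rounded; six digits needed */
    return 0;
}
</syntaxhighlight>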