Notes
=====

There seems to be a problem with exp(double) and our emulator.  I haven't
been able to track it down yet.  This does not occur with the emulator
supplied by Russell King.

I also found one oddity in the emulator.  I don't think it is serious, but
I will point it out.  The ARM calling conventions require floating point
registers f4-f7 to be preserved over a function call.  The compiler quite
often uses an stfe instruction to save f4 on the stack upon entry to a
function, and an ldfe instruction to restore it before returning.

I was looking at some code that calculated a double result, stored it in f4,
then made a function call.  Upon return from the function call, the number in
f4 had been converted to an extended value in the emulator.

This is a side effect of the stfe instruction.  The double in f4 had to be
converted to extended, then stored.  If an sfm/lfm combination had been used,
then no conversion would occur.  This has performance implications.  The
result from the function call and f4 were used in a multiplication.  If the
emulator sees a multiply of a double and an extended, it promotes the double
to extended, then does the multiply in extended precision.

This code will cause this problem::

    double x, y, z;
    z = log(x)/log(y);

The result of log(x) (a double) will be calculated, returned in f0, then
moved to f4 to preserve it over the log(y) call.  The division will be done
in extended precision, due to the stfe instruction used to save f4 in log(y).
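
The sequence described above can be sketched as a small model in C.  This is
only an illustration of the behaviour these notes describe, not code from the
emulator: the names ``fp_tag``, ``fp_reg``, ``save_restore_stfe_ldfe``,
``save_restore_sfm_lfm``, ``result_tag`` and ``division_tag`` are all
hypothetical, and the model assumes that each register simply carries a
precision tag and that a two-operand operation runs at the wider of the two
operand precisions.

```c
#include <assert.h>

/* Hypothetical model, not nwfpe code: a register is a value plus a tag. */
enum fp_tag { TAG_SINGLE, TAG_DOUBLE, TAG_EXTENDED };

struct fp_reg {
    enum fp_tag tag;
    long double value;
};

/*
 * stfe followed by ldfe: per the notes, the value is converted to
 * extended when it is stored, so the reloaded register is extended.
 */
static struct fp_reg save_restore_stfe_ldfe(struct fp_reg r)
{
    r.tag = TAG_EXTENDED;
    return r;
}

/* sfm followed by lfm: per the notes, no conversion takes place. */
static struct fp_reg save_restore_sfm_lfm(struct fp_reg r)
{
    return r;
}

/* Assumed rule: an operation runs at the wider of its operand tags. */
static enum fp_tag result_tag(enum fp_tag a, enum fp_tag b)
{
    return a > b ? a : b;
}

/*
 * Precision at which the division in z = log(x)/log(y) would run,
 * depending on how the callee saves and restores f4.
 */
enum fp_tag division_tag(int callee_uses_stfe)
{
    struct fp_reg f4 = { TAG_DOUBLE, 0.0L };  /* log(x), moved from f0 */
    struct fp_reg f0 = { TAG_DOUBLE, 0.0L };  /* log(y), returned in f0 */

    assert(callee_uses_stfe == 0 || callee_uses_stfe == 1);

    f4 = callee_uses_stfe ? save_restore_stfe_ldfe(f4)
                          : save_restore_sfm_lfm(f4);

    return result_tag(f4.tag, f0.tag);
}
```

Under these assumptions, ``division_tag(1)`` yields ``TAG_EXTENDED`` and
``division_tag(0)`` yields ``TAG_DOUBLE``, which mirrors the difference
between the stfe/ldfe and sfm/lfm save sequences discussed above.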