Bug #482

Some shape rotations are bad with NL_NO_ASM

Added by kervala over 10 years ago. Updated almost 9 years ago.

Status:Closed Start date:04/11/2009
Priority:Normal Due date:
Assignee:kervala % Done:

100%

Category:NeL: General
Target version:Version 0.7.0

Description

When we compile NeL with NL_NO_ASM, some shapes doesn't rotate as intended (its scale is also modified).

floortestcase.7z - Test Case for OptFastFloor (7.6 kB) sfb, 06/16/2009 08:06 pm

fast_floor.diff (1.7 kB) Magnifier kervala, 07/26/2009 10:58 pm

History

#1 Updated by kervala over 10 years ago

Bug occurs in misc/fast_floor.h OptFastFloor function. The C++ equivalent code is wrong in some rare cases.

#2 Updated by sfb over 10 years ago

Attached is a test case to prove the problem and below is the output.

This strikes another important bit of information. OptFastFloor doesn't exist on Linux (or other platforms) so these objects with bad rotations will aways rotate incorrectly on non-Windows platforms unless a non-assembly, cross-platform variation of OptFastFloor is developed and all #ifdef NL_OS_WINDOWS references surrounding OptFastFloor calls are removed.

C:\Projects\floortestcase\build\bin\Release>floorTestCase.exe
INF  6e50 main.cpp 37 main floorTestCase.exe : starting comparison loop.
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.0 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.1 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.2 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.3 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.4 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.5 OptFstFlr: 0 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.6 OptFstFlr: 1 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.7 OptFstFlr: 1 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.8 OptFstFlr: 1 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 0.9 OptFstFlr: 1 Std: 0.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.0 OptFstFlr: 1 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.1 OptFstFlr: 1 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.2 OptFstFlr: 1 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.3 OptFstFlr: 1 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.4 OptFstFlr: 1 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.5 OptFstFlr: 2 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.6 OptFstFlr: 2 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.7 OptFstFlr: 2 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.8 OptFstFlr: 2 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 1.9 OptFstFlr: 2 Std: 1.0
INF  6e50 main.cpp 46 main floorTestCase.exe : Orig: 2.0 OptFstFlr: 2 Std: 2.0

#3 Updated by kervala over 10 years ago

Thanks a lot for this test-case :)

#4 Updated by Spex about 10 years ago

Looking at the output this "floor" is more like a "round" towards even numbers; replacing it with a real floor() isn't obviously not working then.

Unfortunately, C++ doesn't define (yet?) any standard way how to handle rounding. The fast "floor" routines are based on manipulating the FP environment, which can be done using C99 functions as well.

http://msdn.microsoft.com/en-us/library/aa289157(VS.71).aspx -- Microsoft Visual C++ Floating-Point Optimization

http://www.kernel.org/doc/man-pages/online/pages/man3/fenv.3.html -- fenv based functions according to C99; no idea how much overhead the actual rounding functions have (aka if they are fast or bring problems like FP state switches on each call, which is why these OptFastXXX functions exist in the first place)

#5 Updated by kervala about 10 years ago

I succeeded to fix it using SSE intrinsics under Windows, I post it as a patch for the moment since it could exist a better way to handle it.

As these functions are inline and they are available on all CPU with amd64 instructions (all CPU with AMD64 instructions have both SSE and SSE2), I suppose we don't need to add a check for the presence of SSE instruction.

#7 Updated by kervala about 10 years ago

  • Status changed from New to Validated

#8 Updated by kervala about 10 years ago

  • Status changed from Validated to Assigned
  • Assignee set to kervala

#9 Updated by kervala about 10 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100

Applied in changeset r1681.

#10 Updated by sfb about 10 years ago

  • Status changed from Resolved to Closed

Thanks kervala, this looks fantastic! I'm closing this issue.

#11 Updated by kervala almost 9 years ago

  • Project changed from NeL to Ryzom
  • Category deleted (Misc)
  • Target version deleted (Version 0.7.0)

#12 Updated by kervala almost 9 years ago

  • Category set to NeL: General
  • Target version set to Version 0.7.0

Also available in: Atom PDF