Saturday, 20 April 2019

Faster sine

https://gamedev.stackexchange.com/questions/4779/is-there-a-faster-sine-function
https://web.archive.org/web/20100613230051/http://www.devmaster.net/forums/showthread.php?t=5784

float Globals::SinusPrecision(float x) {
    // High-precision sine approximation: a parabola plus one refinement step.
    float result;

    // Wrap the input angle into -PI..PI. Only a single period is added or
    // subtracted, so the input must lie between -3*PI and 3*PI.
    if (x < -3.14159265f) {
        x += 6.28318531f;
    }
    else if (x > 3.14159265f) {
        x -= 6.28318531f;
    }

    // First pass: parabola through the zeros and peaks of the sine
    // (1.27323954 = 4/PI, 0.405284735 = 4/PI^2).
    if (x < 0) {
        result = 1.27323954f * x + 0.405284735f * x * x;
    }
    else {
        result = 1.27323954f * x - 0.405284735f * x * x;
    }

    // Second pass: blend in the signed square of the parabola to reduce the error.
    if (result < 0) {
        result = 0.225f * (result * -result - result) + result;
    }
    else {
        result = 0.225f * (result * result - result) + result;
    }

    return result;
}
Or this one: http://web.archive.org/web/20110925033606/http://lab.polygonal.de/2007/07/18/fast-and-accurate-sinecosine-approximation/
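To get a feel for how close the approximation gets, it can be compared against std::sin over one full period. The sketch below is only an illustration: fastSine is a hypothetical stand-alone, branch-free rewrite of the same formula (valid for inputs in -PI..PI), and maxSineError is a made-up helper name. With these constants the reported error should be on the order of 0.001.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-alone version of the approximation (input in -PI..PI).
// Using fabs folds the two sign branches of the original into one expression.
float fastSine(float x) {
    const float B = 1.27323954f;   // 4 / PI
    const float C = 0.405284735f;  // 4 / PI^2
    const float P = 0.225f;        // weight of the refinement step
    float y = B * x - C * x * std::fabs(x);
    return P * (y * std::fabs(y) - y) + y;
}

// Largest absolute difference to std::sin over evenly spaced samples in -PI..PI.
float maxSineError(int steps) {
    assert(steps > 0);
    const float pi = 3.14159265f;
    float worst = 0.0f;
    for (int i = 0; i <= steps; ++i) {
        float x = -pi + 2.0f * pi * static_cast<float>(i) / static_cast<float>(steps);
        worst = std::max(worst, std::fabs(fastSine(x) - std::sin(x)));
    }
    return worst;
}
```

Calling maxSineError(1000) gives a quick sanity check before swapping the approximation into real code.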

CPU cache and speed

Approximate latencies of CPU memory access:

> CPU register: 1 clock cycle
> L1 cache: 3 clock cycles
> L2 cache: 15 clock cycles
> L3 cache: 60 clock cycles
> DRAM: 150 clock cycles

So the L1 cache is the fastest, and on some CPUs it can be as large as 320 KB. That is quite a lot.
The L1 cache works with 64-byte lines. This means that every time data is fetched from DRAM, a full 64 bytes automatically end up in the cache.
It is therefore important to make sure that the data you need is laid out sequentially in memory.
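A minimal sketch of what "sequential" means in practice, assuming a 2D grid stored row by row in one contiguous std::vector. Both function names are hypothetical. The two functions compute the same sum, but the first one walks neighbouring floats, so every 64-byte line that is fetched gets fully used, while the second one jumps a whole row ahead on each step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cache-friendly: the inner loop reads consecutive floats.
float sumRowMajor(const std::vector<float>& grid, std::size_t rows, std::size_t cols) {
    assert(grid.size() == rows * cols);
    float total = 0.0f;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += grid[r * cols + c];
    return total;
}

// Cache-unfriendly: the inner loop strides cols floats at a time, so for
// wide grids almost every read lands in a different cache line.
float sumColumnMajor(const std::vector<float>& grid, std::size_t rows, std::size_t cols) {
    assert(grid.size() == rows * cols);
    float total = 0.0f;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += grid[r * cols + c];
    return total;
}
```

On a grid that is much larger than the cache, timing both versions should show the difference; on small grids everything fits in L1 and the order hardly matters.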

Tuesday, 16 April 2019

Correcting "Pseudo 3D Planes aka MODE7"

The formula David uses in his "Programming Pseudo 3D Planes aka MODE7 (C++)" is not correct.
He uses:

   for (int y = 0; y < ScreenHeight() / 2; y++)
   {
      float fSampleDepth = (float)y / ((float)ScreenHeight() / 2.0f);
      float fStartX = (fFarX1 - fNearX1) / (fSampleDepth) + fNearX1;
      float fStartY = (fFarY1 - fNearY1) / (fSampleDepth) + fNearY1;
      float fEndX = (fFarX2 - fNearX2) / (fSampleDepth) + fNearX2;
      float fEndY = (fFarY2 - fNearY2) / (fSampleDepth) + fNearY2;

      for (int x = 0; x < ScreenWidth(); x++)
      {
      ...

In the comments, a user named JohnnyCrash pointed out that David does not draw the frustum, but the region from infinity up to the far plane. He also would not expect to see weird curvature when you change the near plane.
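The "infinity up to the far plane" remark can be checked by evaluating one coordinate of David's formula for a few sample depths. The helper below is a hypothetical sketch (davidSample is not a name from his code) with made-up near and far values of 10 and 20. A sample depth of 1 lands exactly on the far plane, smaller depths land beyond it, and the near plane is never reached.

```cpp
#include <cassert>

// One coordinate of the start or end point, exactly as in David's loop:
// (far - near) / sampleDepth + near.
// sampleDepth runs over (0, 1]; the original loop also hits 0 at y = 0,
// which is a division by zero.
float davidSample(float nearValue, float farValue, float sampleDepth) {
    assert(sampleDepth > 0.0f);
    return (farValue - nearValue) / sampleDepth + nearValue;
}
```

With nearValue = 10 and farValue = 20, depth 1 gives 20 (the far plane), depth 0.5 gives 30 and depth 0.25 gives 50, so the sampled rows run away from the far plane towards infinity as the depth approaches 0.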

A fellow Dutchman wrote a very nice explanation of Mode 7. Read it.
It leads to the correct way of drawing:

float space_y = 100.0f;  // height of the camera above the ground plane
float scale_y = 200.0f;  // vertical scale of the projection
int horizon = 40;        // number of rows that row 0 lies below the horizon
for (int y = 0; y < Graphics::ScreenHalfDIMy; y++) {
    // Distance into the world for this screen row; it shrinks as y grows.
    float distance = space_y * scale_y / (y + horizon);

    // World positions of the left and right ends of this scanline.
    float fStartX = fWorldX + (cosf(fWorldAngle + fFoVHalf) * distance);
    float fStartY = fWorldY - (sinf(fWorldAngle + fFoVHalf) * distance);
    float fEndX = fWorldX + (cosf(fWorldAngle - fFoVHalf) * distance);
    float fEndY = fWorldY - (sinf(fWorldAngle - fFoVHalf) * distance);

    for (int x = 0; x < Graphics::ScreenDIMx; x++) {
        // Interpolate linearly between the two ends of the scanline.
        float fSampleWidth = (float)x / (float)Graphics::ScreenDIMx;
        float fSampleX = fStartX + ((fEndX - fStartX) * fSampleWidth);
        float fSampleY = fStartY + ((fEndY - fStartY) * fSampleWidth);

        Uint32 pixelColor = racetrackARGB.GetPixel(fSampleX, fSampleY);
        screenARGB.SetPixel(x, y, pixelColor);
    }
}
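The key difference from David's version is the row-to-distance mapping. As a small sketch, the hypothetical helper rowDistance below isolates that one line, with the same constants as defaults. Because horizon is positive, the divisor never reaches zero, so every row maps to a finite distance.

```cpp
#include <cassert>

// Distance into the world that screen row y samples.
// The defaults are the constants from the loop above.
float rowDistance(int y, float spaceY = 100.0f, float scaleY = 200.0f, int horizon = 40) {
    assert(y + horizon > 0);  // a positive horizon offset avoids division by zero
    return spaceY * scaleY / static_cast<float>(y + horizon);
}
```

With the defaults, row 0 samples at distance 500, row 60 at distance 200 and row 160 at distance 100: the farthest row is finite, and each lower row is closer to the camera.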

Drawing the plane this way produces the following screenshot, with the 3D Mode 7 view at the top and the 2D map with the frustum at the bottom: