Embedded World 2023 - STM32 CORDIC CO-PROCESSOR

That's a better range:

Now with the CORDIC doing 6 cycles →

Iterations:   6284
elapsed_ticks:   5970

Yeah, 4294967272 is -24 read as an unsigned number, so it looks like you’re subtracting the larger number from the smaller number. It should be elapsed_ticks = stop_ticks - start_ticks.
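
The wrap-around is easy to reproduce off-target. Here is a minimal sketch, assuming a free-running 32-bit up-counter (a cycle counter or micros(), for example); the helper name elapsed_ticks is made up for illustration. For a down-counter such as SysTick->VAL the operands would be swapped.

```cpp
#include <cstdint>

// Elapsed ticks on a free-running 32-bit up-counter.
// Unsigned subtraction is modulo 2^32, so the result stays correct
// even if the counter wrapped once between the two readings.
uint32_t elapsed_ticks(uint32_t start_ticks, uint32_t stop_ticks) {
    return stop_ticks - start_ticks;
}
```

With start = 100 and stop = 124 this returns 24; with the arguments swapped it returns 4294967272, which is the value seen above.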

Ok, I just found out.

Thx. Now, let's do this right. I'll run your precision function after.

For reference:

1,000,000 microseconds in a second.

Only for objects with the same frame of reference - otherwise, according to Einstein, our clocks run divergently…

Right, time is relative. Eternity is therefore in the center of the singularity before the Big Bang?

Maybe in the Sun?

Now with SFOC →

Iterations:   6284
elapsed_ticks:   24161

So according to these calculations, with the proper timing, the CORDIC is 4 times faster using int32_t.

That would be cool!

If it turns out it is as precise (or better!) then we could really think about using it :slight_smile:

Perhaps a good approach would be to make the _sin() function weakly bound, like the hardware-dependent functions, with the default implementation being the lookup table. Then on MCUs with better options we can add an MCU-specific _sin() method to optimize it.
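
For anyone unfamiliar with weak binding, a rough sketch of the idea follows. It assumes a GCC or Clang toolchain; the name demo_sin is hypothetical (it is not the library's _sin()), and std::sin merely stands in for the lookup table.

```cpp
#include <cmath>

// Default implementation, marked weak so that another translation unit
// can supply a strong demo_sin() which the linker will pick instead.
// (GCC/Clang attribute; std::sin is a stand-in for the lookup table.)
__attribute__((weak)) float demo_sin(float angle) {
    return std::sin(angle);
}

// In an MCU-specific source file (hypothetical), a plain, non-weak
// definition with the same signature would override the default:
//   float demo_sin(float angle) { /* hardware-accelerated version */ }
```

Callers only ever see demo_sin(); which body gets linked depends on whether an MCU-specific file provides a strong definition.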

Will I then get my badge ?

@dekutree64

SimpleFOC sin RMS error x 1000000: ovf
CORDIC sin RMS error x 1000000: 707058

WTF! Did I order auto-complete quotes !?

  • 1e3f
SimpleFOC sin RMS error x 1000: 32259760.00
CORDIC sin RMS error x 1000: 707.1

Well, that would look promising :slight_smile:

Can you post the whole code, then I can test it out and try including something for the next release?

Or you can make it into an official pull-request of course, if you prefer.

I think I will clone the repo and start work on the 8pwm stepper implementation. This is a new chapter for me, I hope to learn some.

How do I point to my local GitHub\Arduino-FOC ?

You can just clone the GitHub repo into your PlatformIO project’s lib folder… you can also symlink it there…

I wrote up a test program myself, because I was intrigued by your results, but mine are different…

Starting...
Initializing CORDIC...
CORDIC initialized.


Timing CORDIC vs stdlib sin vs SimpleFOC Sine calculations...

CORDIC:
CORDIC Time (us) for 3217 steps: 5940

SimpleFOC _sin:
SimpleFOC _sin time (us) for 3217 steps: 865

stdlib sin:
stdlib sin time (us) for 3217 steps: 118

Comparing accuracy...
RMS difference between CORDIC and stdlib: 0.68513519
RMS difference between SimpleFOC and stdlib: 0.00161161
Test complete.

I’m running the input between 0 and PI, because that’s the common input range of both simplefoc _sin and CORDIC. I’m not convinced my conversion between float and Q31 is correct :frowning: it’s hard to check.
So I’m not sure my accuracy calculations are correct yet.
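
One way to check the conversion is to run it on a PC first. Below is a minimal sketch, assuming the usual Q1.31 convention (int32 value divided by 2^31, covering [-1, 1)) and angles passed as a fraction of π; the helper names are made up for illustration.

```cpp
#include <cmath>
#include <cstdint>

static const double kPi = 3.14159265358979323846;

// Convert a value in [-1, 1) to Q1.31, saturating instead of overflowing.
int32_t float_to_q31(double v) {
    double scaled = v * 2147483648.0;            // v * 2^31
    if (scaled >= 2147483647.0) return INT32_MAX;
    if (scaled <= -2147483648.0) return INT32_MIN;
    return static_cast<int32_t>(std::llround(scaled));
}

// Convert a Q1.31 value back to a fraction in [-1, 1).
double q31_to_float(int32_t q) {
    return static_cast<double>(q) / 2147483648.0;
}

// Angle in radians within [-pi, pi) expressed as a Q1.31 fraction of pi.
int32_t angle_to_q31(double radians) {
    return float_to_q31(radians / kPi);
}
```

As a quick check, float_to_q31(0.5) and angle_to_q31(π/2) should both give 0x40000000, and an input of exactly 1.0 saturates to INT32_MAX rather than wrapping to a negative value.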

What’s surprising me is that the math.h built-in sin() function is actually the fastest :exploding_head:

Probably there’s just some error in my code…

Iterations:   6284
elapsed_ticks:   9604

time per iterasion:   1

A stepper has 50 pole pairs. If we run it at 50 kHz, the CORDIC can do around 17 calculation points per 50 kHz timer period, each calculation being 6 iterations/cycles.

π = 3,1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 342117

That should make the CORDIC run at 4.8 MHz. Maybe it's running on the 48 MHz clock?

Do you think it's possible to clock it with the eternal grand clock at 24 MHz?

So I am not sure what’s different or where I’ve gone wrong… but for me the results come out like this:

Starting...
Initializing CORDIC...
CORDIC initialized.


Timing CORDIC vs stdlib sin vs SimpleFOC Sine calculations...

CORDIC:
CORDIC Time (us) for 3217 steps: 6358

SimpleFOC _sin:
SimpleFOC _sin time (us) for 3217 steps: 865

stdlib sin:
stdlib sin time (us) for 3217 steps: 118

Comparing accuracy...
RMS difference between CORDIC and stdlib: 1.51433241
RMS difference between SimpleFOC and stdlib: 0.00161161
Test complete.

CORDIC code is this:

#include "Arduino.h"
#include "stm32g4xx_hal_cordic.h"
#include "common/foc_utils.h"

CORDIC_HandleTypeDef thisCordic;

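// Float in [-1, 1) to Q1.31, rounding to nearest (argument parenthesized)
#define float2q31(v)	\
            (((v) < 0.0) ? (int32_t)(0x80000000 * (v) - 0.5) \
            : (int32_t)(0x80000000 * (v) + 0.5))

// Q1.31 back to float in [-1, 1)
#define q312float(v)	\
            ((float)(v) / 0x80000000)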



void CORDIC_Config(void) {
    __HAL_RCC_CORDIC_CLK_ENABLE();
    CORDIC_ConfigTypeDef sConfig;
    thisCordic.Instance = CORDIC;
    if (HAL_CORDIC_Init(&thisCordic) != HAL_OK) {
        // Print first: Error_Handler() typically does not return
        Serial.println("CORDIC init error");
        Error_Handler();
    }

    sConfig.Function = CORDIC_FUNCTION_SINE;  
    sConfig.Precision = CORDIC_PRECISION_6CYCLES;
    sConfig.Scale = CORDIC_SCALE_0;
    sConfig.NbWrite = CORDIC_NBWRITE_1;
    sConfig.NbRead = CORDIC_NBREAD_1;
    sConfig.InSize = CORDIC_INSIZE_32BITS;
    sConfig.OutSize = CORDIC_OUTSIZE_32BITS;
    if (HAL_CORDIC_Configure(&thisCordic, &sConfig) != HAL_OK) {
        /* Channel Configuration Error */
        // Print first: Error_Handler() typically does not return
        Serial.println("CORDIC config error");
        Error_Handler();
    }
}

float cordic_sin(float input) {
    int32_t input32;
    int32_t output32;

    input32 = float2q31(input/_PI);

    if (HAL_CORDIC_Calculate(&thisCordic, &input32, &output32, 1, 1000) != HAL_OK) {
        /* Processing Error */
        // Print first: Error_Handler() typically does not return
        Serial.println("CORDIC calc error");
        Error_Handler();
    }

    return q312float(output32)*_PI;
}

And the test-loop looks like this:

#include <Arduino.h>
#include <SimpleFOC.h>
#include "common/foc_utils.h"
#include "./hal_cordic.h"

void setup() {
  Serial.begin(115200);
  while (!Serial);
  delay(3000);
  Serial.println("Starting...");
  Serial.println("Initializing CORDIC...");
  CORDIC_Config();
  Serial.println("CORDIC initialized.");
  Serial.println();
  Serial.println();
}

void loop() {

  Serial.println("Timing CORDIC vs stdlib sin vs SimpleFOC Sine calculations...");
  Serial.println();

  Serial.println("CORDIC:");
  float step = 1/1024.0f;
  int steps = 0;
  long ts = micros();
  for (float i = 0.0; i < _PI; i+=step) {
    float res = cordic_sin(i);
    steps++;
  }
  long ts_end = micros();
  Serial.print("CORDIC Time (us) for ");
  Serial.print(steps);
  Serial.print(" steps: ");
  Serial.println(ts_end - ts);

  Serial.println();
  Serial.println("SimpleFOC _sin:");
  steps = 0;
  ts = micros();
  for (float i = 0.0; i < _PI; i+=step) {
    float res = _sin(i);
    steps++;
  }
  ts_end = micros();
  Serial.print("SimpleFOC _sin time (us) for ");
  Serial.print(steps);
  Serial.print(" steps: ");
  Serial.println(ts_end - ts);

  Serial.println();
  Serial.println("stdlib sin:");
  steps = 0;
  ts = micros();
  for (float i = 0.0; i < _PI; i+=step) {
    float res = sin(i);
    steps++;
  }
  ts_end = micros();
  Serial.print("stdlib sin time (us) for ");
  Serial.print(steps);
  Serial.print(" steps: ");
  Serial.println(ts_end - ts);

  Serial.println();
  Serial.println("Comparing accuracy...");
  float rmsdiff1 = 0.0f;
  float rmsdiff2 = 0.0f;
  steps = 0;
  for (float i = 0.0; i < _PI; i+=step) {
    float diff1 = 0.0f;
    float diff2 = 0.0f;
    float res1 = cordic_sin(i);
    float res2 = _sin(i);
    float res3 = sin(i);

    diff1 = res3 - res1;
    if (diff1>1.0) {
      Serial.print("CORDIC vs stdlib at i=");
      Serial.print(i,8);
      Serial.print(": ");
      Serial.println(diff1, 8);
    }

    diff2 = res3 - res2;
    if (diff2>1.0) {
      Serial.print("SimFOC vs stdlib at i=");
      Serial.print(i,8);
      Serial.print(": ");
      Serial.println(diff2, 8);
    }

    rmsdiff1 += diff1*diff1;
    rmsdiff2 += diff2*diff2;
    steps++;
  }
  rmsdiff1 = sqrt(rmsdiff1/steps);
  rmsdiff2 = sqrt(rmsdiff2/steps);
  Serial.print("RMS difference between CORDIC and stdlib: ");
  Serial.println(rmsdiff1, 8);
  Serial.print("RMS difference between SimpleFOC and stdlib: ");
  Serial.println(rmsdiff2, 8);

  Serial.println("Test complete.");
  while(1);
}

If you can spot any errors, I’d love to know!

But otherwise it kind of looks like the CORDIC is both less accurate and slower than the other two. And it really looks like we should be using the stdlib sin() function.

I’m running on an STM32G474RE, by the way.

The docs recommend this:


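CORDIC->WDATA = q31_value;      // write the angle (Q1.31, as a fraction of pi)
cordic_sine = CORDIC->RDATA;    // first read: primary result (sine)
cordic_cosine = CORDIC->RDATA;  // second read: secondary result (cosine)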

start_ticks = SysTick->VAL;

Iterations:   6284
elapsed_ticks:   131352
time per iterasion:   20

Let's elaborate on the timing. Say we take a counter value of 1545570 and add 20 system clocks (running at 168 MHz), which gives 1545590. Subtracting those two numbers, 1545590 - 1545570 or 1545570 - 1545590, gives either 20 or -20. That is how it works for the SysTick counter. The number above seems about right. Actually, the docs say around 70 clocks.

If you use the micros() function you just get a value in microseconds… that seems simpler to me…

Clearly, I’m still somehow passing the values incorrectly to the CORDIC… the accuracy results can’t be trusted yet. But it is doing the calculations, so the timing results should be valid.

168 MHz / 20 clocks per calculation and conversion = 8.4 MHz

Ah, I got it! :partying_face:

When the result is not an angle, you don’t scale the result by π - doh!

Now we see that CORDIC is a lot more accurate than SimpleFOC’s _sin():

Starting...
Initializing CORDIC...
CORDIC initialized.


Timing CORDIC vs stdlib sin vs SimpleFOC Sine calculations...

CORDIC:
CORDIC Time (us) for 3217 steps: 6575
Result: 2048.00

SimpleFOC _sin:
SimpleFOC _sin time (us) for 3217 steps: 942
Result: 2047.98

stdlib sin:
stdlib sin time (us) for 3217 steps: 2710
Result: 2048.00

Comparing accuracy...
RMS difference between CORDIC and stdlib: 0.00000046
RMS difference between SimpleFOC and stdlib: 0.00161161
Test complete.

That was with 6 CORDIC cycles.

If we drop it down to 3 cycles:

Timing CORDIC vs stdlib sin vs SimpleFOC Sine calculations...

CORDIC:
CORDIC Time (us) for 3217 steps: 6575
Result: 2047.08

SimpleFOC _sin:
SimpleFOC _sin time (us) for 3217 steps: 942
Result: 2047.98

stdlib sin:
stdlib sin time (us) for 3217 steps: 2710
Result: 2048.00

Comparing accuracy...
RMS difference between CORDIC and stdlib: 0.01762736
RMS difference between SimpleFOC and stdlib: 0.00161161
Test complete.

Interestingly the accuracy gets noticeably worse, but the time does not improve. That’s probably due to the polling mode we’re using the CORDIC in, I’m guessing.

Increasing the cycles to 15 (the max) also does not make the time worse, but also does not improve the accuracy over 6 cycles. Maybe the polling function just waits for the max time it can take, I’ll have to look how it works.

Note also that the stdlib sin() is now slower than the SimpleFOC _sin() - that makes a lot more sense. I assume very much this is because I’m now doing something with the result (I’m just adding them up, actually) - so before, because I wasn’t using the result, it seems the compiler was just removing that function call… now that I’m using it, the code actually calls sin(), and we see that it is slower than SimpleFOC - but not as slow as the CORDIC in polling mode.
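
One common way to keep the compiler from discarding a benchmarked call is to make the result observable. The following is a sketch of that idea only, not the actual test code from this thread; the names g_sink and sum_of_sines are made up.

```cpp
#include <cmath>

// A volatile store is an observable side effect, so the compiler
// has to keep the computation that feeds it.
volatile float g_sink = 0.0f;

// Sum sin(x) for x = 0, step, 2*step, ... while x < pi, and count the steps.
float sum_of_sines(float step, int *steps_out) {
    float sum = 0.0f;
    int steps = 0;
    for (float x = 0.0f; x < 3.14159265f; x += step) {
        sum += std::sin(x);
        steps++;
    }
    g_sink = sum;        // keep the result alive
    *steps_out = steps;
    return sum;
}
```

With a step of 1/1024 this runs 3217 steps and the sum lands close to 2048, in line with the "Result" lines printed above.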

Do you regard a fart vibrating at 8.4 MHz as slow?

CORDIC

Approximate Range (Hz)
MAN 64-23,000
DOG 67-45,000
CAT 45-64,000

365.217 times higher frequency than MAN's hearing.
#QuanTumRealmStuff