Embedded World 2023 - STM32 CORDIC CO-PROCESSOR

Just print it to the serial console and ensure the console has a high enough buffer on the PC side … then copy-paste to Excel…

Obviously you have to do the precision test separately from the speed test :slight_smile:

But your code is even better I think :slight_smile:

OK, tight loop it is :slight_smile:

This is the way!

I'm a bit stuck at this conversion to int32_t provided by ST:

value_q31 = (int31_t)(angle*0x80000000);

The (int31_t) is undefined?

I assume it’s just a typedef of a regular 32-bit integer, so casting to int or long would probably be fine.
It looks like the angle also needs to be divided by _2PI before converting.
And the error statistics may be wrecked by the final iteration of the loop if the converted angle wraps around to negative… better make the condition angle < _2PI, or even angle < _2PI - 0.001

Ok, this function from the ST video example (PART2) gives the correct input value for the CORDIC, but it's slow.

int32_t input_q31_sin = f32_to_q31(fmod(angle, 2.0f * pi) / (2.0f * pi)) << 1;

It outputs

327680000

This function on the other hand (now compiling):

int32_t input_q31_sin2 = (int)(angle*0x80000000);

Outputs:

1029437056
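The two outputs differ because the first snippet scales the angle before converting: dividing by 2π and then shifting left by one bit amounts to dividing by π, so 327680000 ≈ (0.479369 / π) · 2^31, whereas 1029437056 ≈ 0.479369 · 2^31 with no scaling at all. A minimal sketch of the same scaling without fmod follows. The names rad_to_q31 and PI_F are made up for illustration, and the final cast relies on two's-complement wrap-around (what GCC and Clang do, but implementation-defined in standard C before C23).

```c
#include <stdint.h>

#define PI_F 3.14159265358979f   /* hypothetical name, single-precision pi */

/* Convert an angle in radians to q1.31 "fraction of pi".
 * Assumes the CORDIC wants angle/pi in [-1, 1), which is what the ST
 * snippet (divide by 2*pi, then << 1) produces.
 * Angles outside [-pi, pi) wrap around through the 32-bit truncation. */
static inline int32_t rad_to_q31(float angle_rad)
{
    /* One float multiply: angle * (2^31 / pi). */
    float scaled = angle_rad * (2147483648.0f / PI_F);

    /* Go through 64 bits so that values just outside the int32 range
     * do not overflow the float-to-integer conversion. */
    int64_t wide = (int64_t)scaled;

    /* Keep the low 32 bits; this is the modulo-2*pi wrap. */
    return (int32_t)(uint32_t)wide;
}
```

With this, rad_to_q31(0.4793689847f) should land close to 327680000 (float rounding costs a few counts), and 3π/2 comes out as the same code as -π/2.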

This is timing the CORDIC and the conversion using math.h

elapsed_ticks:   1641

What should we do without AI :sparkles:

Maybe my pi value is incorrect? #define pi 3.1415926535897932384626433
You are correct. The value of pi you are using is more precise than the one I used in my previous responses. Here's the updated code for the second function:

#define pi 3.1415926535897932384626433f
float angle = 0.4793689847f;
int32_t input_q31_sin2 = (int)((angle / (2.0f * pi)) * 2147483647.0f);
if (input_q31_sin2 < 0) {
    input_q31_sin2 += 2147483648;
}
input_q31_sin2 <<= 1;
Here, we are using your more precise value of pi to scale the input angle by 1/(2*pi) to get the fractional part of the input angle, then multiplying it by the maximum 31-bit signed integer value 2147483647 to get the fixed-point representation in Q31 format. We are then checking if the resulting Q31 value is negative, and if so, adding 2147483648 (2^31) to wrap around to the correct positive value. Finally, we are shifting the resulting Q31 value left by 1 to get the fixed-point representation in Q31 format.

For an input angle of 0.4793689847 radians, the output of this function should be 327680000, which matches the output of the first function.

I hope this helps. Let me know if you have any further questions.

To compare the speed it probably doesn’t matter what values are used, but to compare the accuracy we should probably put more thought into it.

For example the step-size of 0.001 is probably not ideal, probably 1/1024 would be a better choice. Ideally it should be something both float32 and q31 (whatever that is) can natively represent without introducing additional error.
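One way to get a step that both formats represent exactly is to drive the test from an integer index and derive both inputs from it. The sketch below assumes the angle is expressed as a fraction of π (the q1.31 convention discussed above) and uses 1024 steps per half turn; the function names are hypothetical.

```c
#include <stdint.h>

/* Step k of 1024 per half turn, valid for k in [-1024, 1023]. */

/* Normalised angle k/1024 as float32: exact, because it is a small
 * integer divided by a power of two. Multiply by pi to get radians
 * (that multiplication is the only rounding step). */
static inline float step_to_norm(int32_t k)
{
    return (float)k / 1024.0f;
}

/* The same angle in q1.31: (k/1024) * 2^31 = k * 2^21, also exact.
 * Multiplication is used instead of << because left-shifting a
 * negative value is undefined behaviour in C. */
static inline int32_t step_to_q31(int32_t k)
{
    return (int32_t)((int64_t)k * (1 << 21));
}
```

Because neither value carries accumulated rounding error, any difference between the CORDIC result and the reference sine is attributable to the calculation itself rather than to the test inputs.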

This is with conversion from radian float → CORDIC calculation and back to float:

Radian:   0.4793690145
elapsed_ticks:   63
Cordic_sine:   989611008.00
Cordic_cosine:   1905873408.00
converted_Cordic_sine:   0.46082354
converted_Cordic_cosine:   0.88749146
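The "converted" values in this log are consistent with dividing the raw q1.31 result by 2^31: 989611008 / 2147483648 ≈ 0.46082354. A minimal sketch of that step, with a made-up function name:

```c
#include <stdint.h>

/* Convert a q1.31 value (range [-1, 1)) back to float.
 * float32 has a 24-bit mantissa, so the low bits of the q1.31
 * result are rounded away here. */
static inline float q31_to_float(int32_t q)
{
    return (float)q / 2147483648.0f;   /* divide by 2^31 */
}
```

Feeding it the two raw values from the log reproduces the printed sine and cosine to the digits shown.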
Thank you very much, you are amazing at numbers. Will this work inside this function: for (float angle = -_2PI; angle <= _2PI; angle += 0.001, count++)
Yes, the updated code for the second function should work inside the loop you provided:

#define pi 3.1415926535897932384626433f
for (float angle = -_2PI; angle <= _2PI; angle += 0.001, count++) {
    int32_t input_q31_sin2 = (int)((angle / (2.0f * pi)) * 2147483647.0f);
    if (input_q31_sin2 < 0) {
        input_q31_sin2 += 2147483648;
    }
    input_q31_sin2 <<= 1;
    // Use input_q31_sin2 in your calculations here
}
Here, we are using the updated code for the second function to convert each angle value in radians to the corresponding Q31 fixed-point representation. We are then checking if the resulting Q31 value is negative, and if so, adding 2147483648 (2^31) to wrap around to the correct positive value. Finally, we are shifting the resulting Q31 value left by 1 to get the fixed-point representation in Q31 format. You can then use the resulting Q31 value in your calculations inside the loop.

I hope this helps. Let me know if you have any further questions.

@dekutree64

Are you sure it should be -_2PI to _2pi ? not just -pi to pi?

or 0 to _2PI…

Are you sure -pi to pi is not more appropriate compared to -_2PI to _2PI (_2PI represents 2*pi). This is for calculations with the CORDIC of the STM32G4 MCU, testing the speed.
→ OpenAI
Yes, you are correct. Using the range of -pi to pi instead of -2pi to 2pi is more appropriate when working with the CORDIC algorithm in the STM32G4 MCU.

I wonder what the compiler makes out of this stuff… it seems to me this could be optimised a lot. A multiplication by 2^31 - 1 is like adding 31 to the exponent and subtracting the value.

Looking at this conversion, I think it might take quite a few cycles to do all that maths, quite apart from the CORDIC itself…
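To make the exponent idea concrete, here is a sketch of multiplying a float by exactly 2^31 by adding 31 to its exponent field. It assumes IEEE-754 binary32 and a normal, non-zero input that does not overflow; denormals, infinities and NaN are not handled. The function name is hypothetical. In practice the compiler will most likely turn x * 2147483648.0f into a single FPU multiply, so this is an illustration rather than a recommended optimisation.

```c
#include <stdint.h>
#include <string.h>

/* Multiply x by 2^31 by bumping the 8-bit exponent field (bits 23..30).
 * Only valid for normal, finite, non-zero inputs whose result stays finite. */
static inline float mul_pow2_31(float x)
{
    if (x == 0.0f) {
        return x;                    /* zero has no exponent to adjust */
    }
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* type-pun without aliasing issues */
    bits += (uint32_t)31 << 23;      /* exponent += 31 */
    memcpy(&x, &bits, sizeof bits);
    return x;
}
```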

@runger brings up a good point. The second function for converting a floating-point value to Q31 fixed-point format that we have been discussing may not be the most efficient implementation in terms of speed, as it involves several arithmetic operations. As @runger suggests, one potential optimization is to use a bit-shift operation to convert the float to an int, which would avoid the need for multiplication and division. Here is an updated version of the second function that uses bit-shifting:

int32_t float_to_q31(float input) {
    int32_t q31 = (int32_t) (input * 2147483648.0f);
    q31 = (q31 > 0) ? (q31 << 1) : (-q31 << 1);
    return q31;
}

In this implementation, we first multiply the input float by 2^31 (i.e., 2147483648) to obtain a 32-bit integer that is equivalent to the Q31 fixed-point representation. We then shift this integer left by 1 to obtain the final Q31 value.

To handle negative input values, we use a ternary operator to check if the integer is positive or negative. If positive, we simply shift it left by 1. If negative, we take the absolute value (i.e., negate it) and then shift it left by 1. This ensures that the resulting Q31 value is correct and that the sign bit is set appropriately.

Using this implementation may provide a small speed boost compared to the original implementation, but keep in mind that the actual performance may depend on factors such as the specific microcontroller being used, the compiler optimizations enabled, and other factors.

That’s just removed the division from the code :rofl:
The input would now already have to be scaled by 1/2PI… really, while ChatGPT is a totally cool and hugely amusing tool, it can’t be trusted to solve concrete problems, only “generalized” ones, and only when it can provide guidance based on language…
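Beyond the missing 1/2PI scaling, the float_to_q31 shown above also looks wrong for negative inputs: the (-q31 << 1) branch negates the value, so the sign is lost, and shifting a number that has already been multiplied by 2^31 overflows. A minimal sketch of a conversion that keeps the sign and saturates instead is below. It expects an input already normalised to [-1, 1), the name is made up, and NaN is not handled.

```c
#include <stdint.h>

/* Convert a normalised float in [-1, 1) to q1.31, saturating at the ends.
 * For |x| < 1 the product x * 2^31 is exact in float32 (power-of-two
 * scaling) and always fits in int32_t. */
static inline int32_t norm_to_q31_sat(float x)
{
    if (x >= 1.0f) {
        return INT32_MAX;            /* +1.0 is not representable in q1.31 */
    }
    if (x <= -1.0f) {
        return INT32_MIN;
    }
    return (int32_t)(x * 2147483648.0f);
}
```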

You are correct that my previous suggestion for the loop assumes that the input to the float_to_q31 function is already scaled by 1/2π. As @runger pointed out, my previous suggestion removed the division from the code, but it did not properly scale the input angle.

To properly scale the input angle, you can modify the loop as follows:

float angle;
int32_t q31_value;

for (angle = -M_PI; angle <= M_PI; angle += 0.001) {
    q31_value = float_to_q31(angle / (2.0f * M_PI));
    // do something with q31_value
}

q31_value = float_to_q31(angle / (2.0f * pi));

The speed did not change. It looks cleaner though…

Radian:   0.4793690145
elapsed_ticks:   63
Cordic_sine:   989611008.00
Cordic_cosine:   1905873408.00
converted_Cordic_sine:   0.46082354
converted_Cordic_cosine:   0.88749146

Do you think the angle = _normalizeAngle(angle); function messes up the test somehow?

Hmm… it's exiting the loop.

Iterations:   1
elapsed_ticks:   780

Maybe I should be timing this with millis(); not → stop_ticks = SysTick->VAL;

Yeah, you’ll need to use a temporary for the result of _normalizeAngle since it will turn -pi into +pi and immediately terminate the loop.

ok, like this:

angle2 = angle;

Ok, now testing with millis();

And with the temporary → SFOC:

Iterations:   6284
elapsed_time:   4294967272

This was done like this:

for (angle = -pi; angle <= pi; angle += 0.001) {
    angle2 = angle;
    angle2 = _normalizeAngle(angle2);
    float _ca = _cos(angle2);
    float _sa = _sin(angle2);
    count++;
}

Edit: that can't be right. There must be something wrong with millis();

Edit x2: Here is the result with SysTick();

Iterations:   6284
elapsed_ticks:   153087
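A caveat on reading SysTick->VAL directly: SysTick is a 24-bit down-counter that reloads when it reaches zero, so the difference between two readings only makes sense if the counter wrapped at most once in between. The sketch below handles that single wrap. The function name is hypothetical, and the reload value has to match whatever SysTick->LOAD holds on the target (169999 would be a 1 ms period at 170 MHz, which is an assumption and not something confirmed in this thread).

```c
#include <stdint.h>

/* Ticks elapsed between two SysTick->VAL readings.
 * The counter runs DOWN from `reload` to 0 and then starts again at
 * `reload`, so a later reading is normally the smaller one.
 * Handles at most one wrap between the two readings. */
static inline uint32_t systick_elapsed(uint32_t start, uint32_t stop,
                                       uint32_t reload)
{
    if (start >= stop) {
        return start - stop;               /* no wrap */
    }
    return start + (reload + 1u) - stop;   /* wrapped once through zero */
}
```

If the counter reloads every millisecond, any measurement that runs longer than that shows only the remainder, so a loop of several thousand iterations is better timed with a free-running counter.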

It looks like it takes 24ms but you’re doing the subtraction the wrong way around.

do you mean this →

elapsed_ticks = start_ticks-stop_ticks;

Could you plz elaborate?
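For reference, the millis() result of 4294967272 is exactly 2^32 − 24, which is what an unsigned 32-bit start − stop gives when stop is 24 ms later than start. For an up-counting timer such as millis() or micros() the order is stop − start; for the down-counting SysTick->VAL register, start − stop is the right way around. A minimal sketch with a made-up function name:

```c
#include <stdint.h>

/* Elapsed time for an UP-counting 32-bit timer (millis(), micros()).
 * Unsigned subtraction wraps modulo 2^32, so this also gives the right
 * answer when the timer overflows once between the two readings. */
static inline uint32_t elapsed_u32(uint32_t start, uint32_t stop)
{
    return stop - start;
}
```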

I'm back to SysTick();

I'm getting this result with the CORDIC now →

Iterations:   6284
elapsed_ticks:   143585

Conversion back and forth…

with just 2 cycles for the CORDIC we get:

Iterations:   6284
elapsed_ticks:   137743
 

I wonder how it will perform with q1.15 ?
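For anyone trying the q1.15 mode: as I understand the reference manual, both 16-bit arguments are packed into a single 32-bit write, with the first argument (the angle) in the low half-word and the second (the modulus) in the high half-word. Treat that layout as an assumption and check it against the CORDIC chapter before relying on it. A sketch of the conversion and packing, with hypothetical function names:

```c
#include <stdint.h>

/* Normalised float in [-1, 1) to q1.15, saturating at the ends. */
static inline int16_t norm_to_q15(float x)
{
    if (x >= 1.0f) {
        return INT16_MAX;
    }
    if (x <= -1.0f) {
        return INT16_MIN;
    }
    return (int16_t)(x * 32768.0f);   /* scale by 2^15 */
}

/* Pack two q1.15 arguments into one 32-bit word.
 * ASSUMED layout: arg1 (angle) in bits 15..0, arg2 (modulus) in bits 31..16. */
static inline uint32_t cordic_pack_q15(int16_t arg1, int16_t arg2)
{
    return ((uint32_t)(uint16_t)arg2 << 16) | (uint32_t)(uint16_t)arg1;
}
```

The appeal for a speed test is that one write and one read move both values, at the cost of roughly 16 bits of precision.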

Hey,

use the function micros: long timestamp = micros();

As the name says, that’s a microsecond accuracy timestamp, as best the MCU can manage. If you time over 1000s of iterations like you’re doing, this will give a pretty accurate measure.

[Edit] and another option is just to toggle a pin at the start of the loop: val = 1-val; digitalWrite(10, val); // toggle pin 10

and then use your oscilloscope to show the horizontal width of the resulting square wave - that will probably be most precise :slight_smile:
