PID integrator windup reset

That’s because the integral_prev term is incremented once per call to move() - so at a rate of, lets say 5kHz main loop speed, this translates to many additions over the course of an hour. Even if they are small, it can add up, and I’ll note that the I-gains used are often very much higher than 0.0000001…

The precision problems will arise not around 0 but rather when the integral_prev becomes very large. This is an inherent problem when working with floating point numbers and trying to do maths with quantities which have very different magnitude. There are situations where integral_prev could become “stuck” at a high value even if the error takes the other sign :frowning: I admit this would be a rare case, but I still don’t like it being possible…

1 Like

Is this really the case? Doesn’t the anti windup limit kick in way earlier?

// antiwindup - limit the output
integral = _constrain(integral, -limit, limit);
   
...

// saving for the next pass
integral_prev = integral;

Whoa, complicated. I haz no clue. Either I have to leave it up to smart people like you to make libraries with stuff that works or I would have to resort to hacky things like converting everything to integers and back again, just glad this isn’t my job right now ;-). From my limited experience it seems like things are a little out of hand, talented people can make things work but there aren’t very many such people and they don’t have that much time. Maybe converting to integers and back isn’t such a bad idea, if it makes things more intuitive?

Yeah, you’re right, that’s not practically ever going to happen, since this way of doing it limits the integral_prev the limit…

I confused myself with the code vs. your situation - phew!

So then anyway the windup is limited, its just a question of bringing it back down in certain situations… have to think more about it.

I tested the motor.PID_velocity.reset();, but the behavior change remains the same. I actually thought this was the solution, but it seems to be something else? Is there a direct way to obtain the integral value, to double check if it is properly reset? Or do I need to make the variable global in the library?

Anyways, thanks a lot for all the input, I really appreciate it. I will keep you guys updated on the progress, but experimenting takes some time, as it is a slow developing behavior change :wink:.

Maybe it needs some derivative factor? But I think SimpleFOC’s handling of the derivative needs improvement. I’ve always had trouble whenever I added even a tiny bit of D, and comparing SimpleFOC’s PID to this article I bookmarked a long time ago, I think I know why Improving the Beginner’s PID – Introduction « Project Blog
It’s the “derivative kick”, a spike in the output whenever you change the setpoint (i.e. constantly in angle mode). The solution is to subtract D*(input - lastInput) instead of adding D*(setpoint - input). Unfortunately not easy to fix, since SimpleFOC’s PID takes the error as its argument rather than the new input and calculating the error internally.

This is exactly why I don’t understand why the I component can grow so much that the numerical resolution matters.

The idea was to compute the I component first and if the error was zero and integral_prev is high, then reduce the I term (and thus the next integral_prev) by a small offset Ioffset. The same Offset should then be added to the error. After the I term, calculate the P term, which should consequently see a larger error and thus react stronger. The nonsense with that is that it will only work if the I term is very high and thus operating in a range which actually the P term should handle, something which should not happen with proper gains. Perhaps it is a matter of trying it out, but currently I have no hardware which I can let spin freely, it is all built into my simulator.

Back to the original question:

I could also well imagine that it is a numerical problem when the error is caclulated. Since the position is accumulated over time, the total number of sensor steps gives a very large number after 30min. This means the sum is very high compared a single step in position, which may lead to round-off errors in some places. I did not yet go through the code to check it everywhere, but it is certainly a possibility.

Hey,

So perhaps something like a P’ term, whose meaning is the inverse of P - e.g. if the error is small the P’ is large, and is applied to integral_prev?

I should probably look into the literature on this rather than dreaming about it…

I think this is a very good explanation. In fact, I recall fixing numerical precision problems a while back through the introduction of full_rotations + float angle, and float maths throughout the rest of the code.

But I also remember there were a few TODOs remaining, and if IIRC they affected position mode. The thinking at the time was that is was less likely people would run the motor for hours in the same direction in position mode than velocity mode or the other modes. Shows you how can never anticipate how things will really be used in actual applications :blush:

Yup, my memory is still working:

    case MotionControlType::angle:
      // TODO sensor precision: this calculation is not numerically precise. The target value cannot express precise positions when
      //                        the angles are large. This results in not being able to command small changes at high position values.
      //                        to solve this, the delta-angle has to be calculated in a numerically precise way.

and this one:

  // TODO sensor precision: the shaft_angle actually stores the complete position, including full rotations, as a float
  //                        For this reason it is NOT precise when the angles become large.
  //                        Additionally, the way LPF works on angle is a precision issue, and the angle-LPF is a problem
  //                        when switching to a 2-component representation.

So @Grizzly you’re absolutely right and this precision problem exists in position mode. The question is does it explain @Jornel 's behaviour? I have a hunch we’ll find it does, but I have not thought it through yet…

1 Like

I guess it is either that, but I also got a new suspicion about my encoder skipping steps or making up steps caused by noise (or a little of both, but effectively drifting in one direction), which also would create an error growing over time. The latter I will check tonight using the index pulse. I will keep you posted.

Edit: Btw forgot to mention that I use the ABI signals of the as5047p and not the SPI, as there was too much noise due to a relative long cable between encoder and shield and the fact that I use one cable containing both the motor wires and encoder signal wires. Next iteration will most probably have two cables: one for motor and one for encoder signals.

I have a case where a vehicle needs to run the wheels in a position mode for many days at a time. Position mode is a lot more accurate than velocity as it allows to very accurately position the vehicle with a lot of stop-and-go, e.g. advance to position X at A speed, stop, go to position Y at B speed, stop, then go to position Z at C speed, stop, for a number of days, may be months, in a mission-critical application where shutdown / reset is impossible (it breaks the use case). The error being integrated will become significant eventually. Also, most S-curve profiles run in a position mode, not velocity mode. An application such as a specialized electric vehicle which requires very accurate positioning runs the wheels for hours at a time in one direction only. All my cases, I’ve never ran the motor in velocity mode, always position.

Even if we use an integer to count the rotations and a float for the angle, at 1000rpm in a position mode we will flip the integer in a matter of days. There has to be a different way of ensuring the max / min are never reached.

Cheers,
Valentine

Fair enough - the assumption can be regarded as thoroughly proved wrong :smiley:

So we should prioritise fixing it. I think it is quite possible to do, but perhaps has more impact on the codebase than some of the other changes that were made in that context.

The changes so far have fixed velocity mode, and could so far be restricted mainly to the way the sensor works. To fix the position mode problem, we have to change the type of motor.shaft_angle, or else change the way the delta-angle is calculated to ensure precision.

Well, at the moment we use int32 for the full rotations, and a float for the angle within the full rotation. With a 32 bit integer, and 1000rpm, you could rotate for 1491 days before overflowing… from my point of view it’s reasonable to expect the solution to be reset once every 4 years, or for the user to deal with it in code themselves if they need >4y continuous operation.

Oh, yeah, you are right, I counted 1000rps, not rpm, you are right. For 1000rpm I’m off by a couple zeros :slight_smile:
I guess the integer will work. Also for a 64bit integer in a more-modern STM32 MCU we can use uint64_t, though I’m not sure if performance would be an issue? 18,446,744,073,709,551,615 would be enough, I think, even in the most aggressive cases.

Is it signed or unsigned? We can keep one extra variable for the direction? Or keep the float sign as a direction?

Cheers,
Valentine

Signed, as the position can be positive or negative depending on how you turn the motor. Handling the sign separately would complicate things.

Switching to a 64 bit int would be worth checking if there is much impact on performance. if it is negligible, then it might be worth just doing it at the expense of 4 bytes to bury this range problem once and for all, because even at 100000RPM any driver/motor will have crumbled to dust before the 64bit int is overrun…

64bit needs two operations instead of one (retrieve two 32bit from the memory, perform an operation twice, store two 32bits back), so I’d imagine the impact will be negligible, compared to all other things that happen in the loop.

I was just thinking that there are 2 fairly good PID controllers for arduino that I used previously, though they did have issues. If they aren’t up to snuff it would make sense to treat them as a separate project, fork or improve them, and then use that. I have made this mistake myself too many times, reinventing stuff because I figured it was fairly simple but then encountering all kinds of issues.

Both open source and technology get their power largely, imo, because of how they add up over time. Like gravity, they are very weak in the near term and it may seem like next to nothing, it is hard to get anything done especially economically. However the electrical force is 10^42 or so times stronger than gravity, yet gravity rules the universe precisely because it always adds up. Technology is a massive force in our world, but only because of things adding up over a long time and a lot of labor hours by a lot of people… same for open source tech?

The angle low-pass filter will still have precision problems, but it’s not normally used anyway is it? If we add an early-out if(alpha==0) return x; then it should be safe.

The output ramp in the PID is also subject to this issue, but also usually not used on angle I think. So mostly the error calculation in BLDCMotor P_angle( shaft_angle_sp - shaft_angle ) is the problem (yet another place where fixed-point angles would naturally eliminate the issue). Interestingly if we made the fix for derivative kick that I mentioned previously, it would move that subtraction into the PID and potentially make it more difficult to fix.

One possibility would be to provide a “safe wrap” function, where you specify a range like 10000pi and if the angle is greater, subtract range, or less than -range add range. That will wrap back to near zero so it will be a long time before wrapping again. This would have to be an optional function for the user to call, because angles passed to motor.move will also have to be aware when it happens. Some sensors would have to wrap their internal representations (encoder pulses, hall sectors, etc.) at the same time, so it would mean adding a new virtual function.

I tried to check the potential encoder drift by taking the modulus of the current position everytime the index pulse comes by, to check if the index position value shifts in time. However, I have some trouble implementing that.

I thought it was was as simple as adding an interrupt to the index pin D4 on the stm but this seems to randomly be triggered. Is this caused because in the library there is also an interrupt attached to it? So is there maybe a library function I could use to monitor the index pulse? Anyone who can help me out?

re-implementing it to operate on a 2-component representation will be a bit of a pain. I wonder if it is sufficient for it to operate on the delta-Angle.

Another question is how to handle the case when delta-Angle itself is too large, which can happen for example if the motor turns in position mode for a day, and then you set a 0 target.

This is not an issue from my point of view if we can keep the delta-Angle small. If not, then I think things get a bit complicated.