GGPO style 2p game, client gets increasing frequency of rollbacks over time

Hi! We’re building a GGPO-style 2-player game. It’s timing sensitive, which is why we chose a GGPO-style implementation. The game is deterministic, so the approach works well, but we have a question about a scenario we need some help with.

Sometimes during a session a client seems to get an increasing number of rollbacks, making the game stutter for that player to the point where it becomes unplayable. I guess the simulation frames might be starting to drift apart? Any ideas what could normally be causing this? We’ve followed this guide as a basis: Determinism, Prediction and Rollback | coherence Documentation

Any other settings apart from the ones mentioned in that guide that we should be using to make sure everything is synced correctly? Like the CoherenceBridge → Adjust Simulation Frame?

Thanks!

Hey!

If controlTime on the bridge is enabled, the client should eventually be able to catch up with frames.

Nothing obvious comes to mind.

Are you using the CoherenceInputSimulation? If so, I’d start by adding the ability to dump the debug data on all participating clients (either manually or automatically upon detecting the bad state). This will give you insight into things like the simulation frame difference between the clients at a given real-world time, and when inputs were produced versus when they were actually received, triggering mispredictions.

Ideally this data would come from the same machine, so we’re not adding error caused by the system clock, although nowadays, with automatic system clock sync, a drift of more than a few milliseconds is rare.

Thanks. Yes we’re using the CoherenceInputSimulation.
We did some tests today with setting CoherenceBridge → Adjust Simulation Frame to true, and it seemed to give us better results, but maybe that was just coincidental. What does the Adjust Simulation Frame do in our case? The documentation doesn’t mention much about it.

We also tested increasing the initial input delay from 3 to 10. That seemed to fix it as well but it made the game almost unplayable as it’s timing sensitive. Is there a way to automatically adjust the input delay/frames when detecting multiple misses/rollbacks?

We haven’t seen this behaviour (not that I remember anyway) when playing on the same LAN or on the same machine, only when connecting online (two different machines online in the Stockholm area).

We haven’t tried the debug dumping yet, but we do have our own logging of the frames where we can see which frames a client receives and which it misses. We’ll try the debug dump and see if it gives us any more info.

Great find! I completely forgot that this option exists. It was created specifically for inputs, but somehow we forgot to add it to the guide.

It basically makes sure that all clients have the same simulation frame at the same real-world time (adjusted by ping).

By default, clients regulate their frame to match the server frame as received. That means that if it took the server 1 second (or 50 frames) to deliver the packet, the client will be 50 frames behind the server in real-world time. Another client, however, could be as little as 1-2 frames behind. For GGPO this situation is disastrous: input produced by the first client will already be 50 frames behind, and it will take an additional ~52 frames to reach the second client, totaling a whopping 102-frame gap.

To illustrate, this is how clients see their frames in real-world time. Without adjusting for ping:

          RS   ClientA   ClientB   ClientC
Latency    0         0       100       200
Frame    100       100        95        90

With adjusting for ping:

          RS   ClientA   ClientB   ClientC
Latency    0         0       100       200
Frame    100       100       100       100

With “Adjust simulation frame” enabled, all clients will have exactly the same frame at the same real-world time, significantly reducing the gap mentioned earlier.
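Conceptually, the adjustment extrapolates the received server frame by the frames that elapsed while the packet was in transit. Here's a minimal Python sketch of that idea (the SDK's actual computation isn't shown in this thread; the function name and the assumed 50 Hz rate are for illustration only, chosen to reproduce the table above):

```python
# Assumed for illustration: 50 Hz simulation, as mentioned later in the thread.
FRAME_RATE = 50

def adjusted_frame(server_frame: int, one_way_latency_ms: float) -> int:
    """Estimate the server's *current* frame by extrapolating the received
    frame with the frames that elapsed while the packet was in transit."""
    frames_in_transit = round(one_way_latency_ms / 1000 * FRAME_RATE)
    return server_frame + frames_in_transit

# Reproducing the table: ClientC receives server frame 90 with 200 ms latency.
print(adjusted_frame(90, 200))   # -> 100
print(adjusted_frame(95, 100))   # -> 100
```

With this extrapolation, all three clients in the table land on frame 100 at the same real-world moment.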

It would be interesting to see the ping of both players. A 3-frame delay at 50Hz is 60ms, which means that the sum of the clients’ pings has to be under 60ms to avoid prediction. If you’re using our EU cloud server, it sits in Frankfurt, so that’s already 30ms. Add to that both clients’ processing time, the Replication Server’s processing time, and the receiving client’s average received-to-process delay (10ms), and we’re probably getting close to 50-60ms. So any ping worse than 30ms for those clients pretty much guarantees running into prediction (and thus potentially rollbacks).
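The budget arithmetic above can be sanity-checked with a small sketch. Assumptions (all illustrative, not SDK API): a 50 Hz simulation as stated in the post, latencies given one-way per client, and a single lumped processing overhead:

```python
FRAME_RATE = 50
MS_PER_FRAME = 1000 / FRAME_RATE  # 20 ms per frame at 50 Hz

def will_predict(input_delay_frames: int,
                 one_way_a_ms: float,
                 one_way_b_ms: float,
                 processing_ms: float = 0.0) -> bool:
    """True if an input cannot cross from one client to the other within
    the input-delay window, forcing the receiver to predict."""
    budget_ms = input_delay_frames * MS_PER_FRAME
    travel_ms = one_way_a_ms + one_way_b_ms + processing_ms
    return travel_ms > budget_ms

# 3-frame delay = 60 ms budget. Two clients 30 ms (one-way) from the server
# plus ~10 ms of processing overhead blows past it:
print(will_predict(3, 30, 30, processing_ms=10))  # -> True
print(will_predict(3, 15, 15, processing_ms=10))  # -> False
```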


Ok, thanks @Filip! So just to be sure: it will adjust, and sync that adjustment to the other client (in our case we always have only 2 clients)?

Any other things we should consider, like any solution to automatically adjust the input delay frames based on the network conditions or do you think that’s not needed when doing the simulation frame adjustments as above?

The client adjusts the simulation frame by ping, so inputs produced by two clients at the same real-world time will be for the same simulation frame (each input sent carries the simulation frame stamp).

Any other things we should consider, like any solution to automatically adjust the input delay frames based on the network conditions or do you think that’s not needed when doing the simulation frame adjustments as above?

Adjusting the simulation frame by ping improves the situation, but it can’t fix ping. If the time required for an input to travel from one client to the other is greater than the input delay, there’s still going to be prediction.

As described in the previous thread, the decision to automatically adjust the delay based on ping is a tough one. Some fighting games have done it, but that led to player frustration, because input timing is critical in those games, and for some players a long but predictable delay is better than a possibly shorter but unpredictable one. I recall one game that simply gave players the option to enable variable delay, so it was up to the user to decide.

There’s another caveat with variable delay: stretching and shrinking the input stream. Under normal circumstances the player produces a steady stream of inputs. However, when you increase the delay, you’re effectively injecting an input into the stream. For example: it’s frame 10, the delay is 2, and I’ve just produced an input for frame 12. If in this very frame we bump the delay to 3, then on the next frame (11) I’ll be producing an input for frame 14. That means we have to “conjure” an input for frame 13. You could do something as simple as copying the input from frame 12, but whether that’s viable depends on the game: if that input contained a “shoot” action, the player could shoot twice despite pressing the shoot button only once. A similar problem arises in the opposite direction. If we reduce the delay from 3 back to 2, on frame 12 we would again produce an input for frame 14. Should we simply ignore the second input? Merge it with the previous one? Use only the new one? Tough questions :)
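To make the stretch case concrete, here's a hypothetical Python sketch. The buffer layout and `change_delay` helper are invented for illustration and are not part of the SDK; the shrink case is deliberately left as a comment because, as noted above, it's a game-specific design decision:

```python
def change_delay(buffer: dict, current_frame: int, old_delay: int,
                 new_delay: int, filler=None) -> dict:
    """Stretch the scheduled-input stream when the input delay grows.

    `buffer` maps target frame -> input. Inputs up to
    current_frame + old_delay have already been produced.
    """
    last_scheduled = current_frame + old_delay
    if new_delay > old_delay:
        # Stretch: conjure inputs for the gap frames. Here we copy the last
        # real input; whether that's safe is game-specific (a one-shot
        # "shoot" action could fire twice).
        for f in range(last_scheduled + 1, current_frame + new_delay + 1):
            buffer[f] = filler if filler is not None else buffer[last_scheduled]
    # Shrinking is the harder case: the next produced input targets a frame
    # that already has one. Drop, merge, or overwrite - that choice is left
    # to the game.
    return buffer

# Frame 10, delay bumped from 2 to 3: frame 13 must be conjured.
buf = {10: "idle", 11: "idle", 12: "shoot"}
change_delay(buf, current_frame=10, old_delay=2, new_delay=3)
print(buf[13])  # -> shoot  (copied from frame 12)
```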

All in all, I’d say this is more of a design decision than a technical one. Keep in mind that the visuals can be detached from the underlying game logic, which means that rollback and resimulation don’t have to result in a “teleport”: for small errors you can smoothly correct the visuals over time, similar to how it’s done in FPS reconciliation. If you smooth out the resimulation and solve the variable-input-stream problem described above, then variable delay is likely to give the best visual results.

If, however, this brings too many complications, I’d probably do something in between: measure the ping at the start of the session and set the delay based on it for the whole session. It’s rather rare for ping to stay elevated for longer periods. Just make sure the ping is measured over some period of time (not just one sample) and that any spikes are discarded. Also remember to set the delay for both players based on the sum of their one-way latencies - after all, the time for an input to reach one player or the other depends on the pings of both.
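A sketch of that session-start negotiation, under the assumptions above: median over a sampling window to discard spikes, delay derived from the sum of both players' one-way latencies plus a safety margin, rounded up to whole frames at an assumed 50 Hz. All names are illustrative:

```python
import statistics

FRAME_RATE = 50  # assumed simulation rate, per the thread

def pick_input_delay(samples_a_ms, samples_b_ms, margin_ms=10.0) -> int:
    """Choose a fixed input delay from ping (RTT) samples gathered at
    session start. The median discards spikes; the result is the sum of
    one-way latencies plus a margin, rounded up to whole frames."""
    one_way_a = statistics.median(samples_a_ms) / 2
    one_way_b = statistics.median(samples_b_ms) / 2
    total_ms = one_way_a + one_way_b + margin_ms
    return int(-(-total_ms * FRAME_RATE // 1000))  # ceiling division

# The median ignores the 250 ms spike in player A's samples:
a = [30, 32, 31, 250, 29]
b = [60, 58, 62, 61, 59]
print(pick_input_delay(a, b))  # -> 3  (15.5 + 30 + 10 = 55.5 ms -> 3 frames)
```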

Hope this helps!


Thanks for the detailed answers! It sounds like an initial check/negotiation to decide the input delay is the way to go in our case then.

Hi @Filip , we have a related issue we’d like to ask about. Posting it here in this thread so you have the above context, but let me know if you’d rather like me to create a new thread for it.

We’re having an issue that occurs from time to time where one of the two clients freezes for a moment (2-3s), probably due to some external factor not related to the game. When the game resumes, the client has missed some frames and starts to spam the log with the following exception:

Exception in handler. caller=OnFixedNetworkUpdate exception=System.ArgumentOutOfRangeException: Tried to store invalid frame. Expected: 88177989509, Was: 88177989732
Parameter name: state

Full stack trace:
14:31:38.251 (coherence) NetworkTime: Exception in handler. caller=OnFixedNetworkUpdate exception=System.ArgumentOutOfRangeException: Tried to store invalid frame. Expected: 88177989509, Was: 88177989732
Parameter name: state
at Coherence.Toolkit.SimulationStateStore`1[TState].Add (TState& state, System.Int64 simulationFrame) [0x0007a] in ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/SimulationStateStore.cs:96
at Coherence.Toolkit.CoherenceInputSimulation`1[TState].SaveState (System.Int64 simulationFrame) [0x00007] in ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/CoherenceInputSimulation.cs:323
at Coherence.Toolkit.CoherenceInputSimulation`1[TState].FixedNetworkUpdate () [0x0019f] in ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/CoherenceInputSimulation.cs:306
at (wrapper delegate-invoke) <Module>.invoke_void()
at Coherence.Core.NetworkTime.Step (System.Double currentTime, System.Boolean stopApplyingServerSimFrame) [0x0019d] in ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Core/NetworkTime.cs:248
logId=CoreNetworkTimeExceptionInHandler
UnityEngine.Debug:LogError (object,UnityEngine.Object)
Coherence.Log.Targets.UnityConsoleTarget:Log (Coherence.Log.LogLevel,string,System.ValueTuple`2<string, object>,Coherence.Log.Logger) (at ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Log/LogTargets/UnityConsoleTarget.cs:86)
Coherence.Log.Logger:BuildAndPrintLog (Coherence.Log.LogLevel,string,System.ValueTuple`2<string, object>[])
Coherence.Log.Logger:LogImpl (Coherence.Log.LogLevel,string,System.ValueTuple`2<string, object>)
Coherence.Log.Logger:Error (Coherence.Log.Error,System.ValueTuple`2<string, object>[])
Coherence.Log.UnityLogger:Error (Coherence.Log.Error,System.ValueTuple`2<string, object>)
Coherence.Core.NetworkTime:Step (double,bool) (at ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Core/NetworkTime.cs:252)
Coherence.Toolkit.CoherenceBridge:ReceiveFromNetworkAndUpdateTime () (at ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/CoherenceBridge.cs:1360)
Coherence.Toolkit.CoherenceBridge:ReceiveFromNetwork () (at ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/CoherenceBridge.cs:1256)
Coherence.Toolkit.PlayerLoop.CoherenceLoop/CoherenceReceiver:ReceiveFromNetwork () (at ./Library/PackageCache/io.coherence.sdk@0c19ca391290/Coherence.Toolkit/PlayerLoop/CoherenceLoop.cs:310)

Since it’s happening inside the coherence library, we have no control over it and can’t wrap it in try-catches etc., from what we can see. So it seems the temporarily frozen client has missed storing the simulation state for a couple of frames, is therefore unable to store any more, and the game breaks. Any idea how we can avoid that, either by detecting and fixing it, or by preventing it in the first place?

Hey!

Most likely this is happening due to a time reset. Is this by any chance preceded by the following warning?

Detected unexpected time reset (client simulation frame drifted too far away from the server’s frame). Disconnecting.

Whenever we detect a large time diff, we reset the time to prevent very long catch-up periods. The problem is, this derails the input system, which in fixed mode requires consecutive frames.
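To illustrate why a time reset breaks the store, here's a toy model of the consecutive-frame invariant. This is not the SDK's actual SimulationStateStore implementation, just a sketch of the invariant that produces the "Tried to store invalid frame" error seen above:

```python
class ToyStateStore:
    """Toy model of a fixed-step state store: every stored frame must
    immediately follow the previous one."""

    def __init__(self, start_frame: int):
        self.next_frame = start_frame

    def add(self, state, frame: int) -> None:
        if frame != self.next_frame:
            # Mirrors the ArgumentOutOfRangeException from the log.
            raise ValueError(
                f"Tried to store invalid frame. "
                f"Expected: {self.next_frame}, Was: {frame}")
        self.next_frame += 1

store = ToyStateStore(100)
store.add("state", 100)      # ok
store.add("state", 101)      # ok
try:
    store.add("state", 105)  # a time reset skipped frames 102-104
except ValueError as e:
    print(e)  # -> Tried to store invalid frame. Expected: 102, Was: 105
```

Once the frame counter jumps, every subsequent store attempt fails the same way, which is why the log gets spammed until the session is torn down.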

The quickest solution we can deliver for now is, adding an option to completely disable time resets. This might have some unforeseen consequences on synced variables, but if you’re relying fully on inputs that should fix the issue.

You can test whether that helps by embedding the coherence package and adding an early return in the NetworkTime.Reset() function. If it does help, we can add the option to disable time resets in the next SDK version.
