
Unjoinable servers break game


Duion


I have a bug where some servers occasionally become unjoinable. After you try to join such a server, you cannot join any server anymore: the game is effectively frozen, you can only press menu buttons, and only restarting the game fixes it. The unjoinable server stays that way, however, and if you try to join it again the problem recurs.


What I think is happening is that sometimes, after you or someone else has joined a server, the slot is not properly cleared. The server thinks you are still on it and of course does not let you join again, but the game itself also thinks a server connection is still running and does not let you join another game. The server is then listed with +1 human player; that count never gets cleared and the server stays broken. When someone tries to join it, it adds +1 more human player, but the game breaks for the person joining, so he has to quit the game and try again, which may repeat the cycle.


What makes this so hard to test is that you need a server on the internet, a client to join it, and a lot of patience to wait until something breaks and the bug occurs. I cannot open multiple instances of the game myself to test it, because connections seem to be locked by IP, meaning you need real, different people with different IPs to test it.


My current workaround is to restart the servers every X hours so the bug never gets a chance to occur, since the longer a server runs and the more players are on it, the sooner it occurs. I have now realized that bots can trigger the bug as well: I have more bots on the servers by default now, and the bug happens much sooner. The bots simulate a real player connection, so they appear on the server list as players, etc.


So what I think is happening is that at some point connections are not getting cleared on the dedicated servers. Then either the server becomes unjoinable after you try to join it, or the master server removes it from the list entirely because it reports an invalid number of players, and it becomes unjoinable as well.


This all sounds very complicated, but it is a potentially game-breaking bug and I don't really know where to start, so maybe someone here has an idea. Here is a short console.log of what happens when I join such a bugged server and then try to join again:


 

==>trace(1);

Console trace enabled.

Leaving ConsoleEntry::eval() - return

Entering ToggleConsole(1)

Entering [CanvasCursorPackage]GuiCanvas::popDialog(Canvas, ConsoleDlg)

Entering ConsoleDlg::onSleep()

Leaving ConsoleDlg::onSleep() - return

Entering [CanvasCursorPackage]GuiCanvas::checkCursor(Canvas)

Entering showCursor()

Leaving showCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::checkCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::popDialog() - return

Leaving ToggleConsole() - return

Entering ToggleConsole(0)

Leaving ToggleConsole() - return

Entering JoinServerDlg::join(JoinServerDlg)

Server query canceled.

Adding a pending connection

Sending Connect challenge Request

Leaving JoinServerDlg::join() - return

Got Connect challenge Response

Sending Connect Request

Connection established 25566

Entering GameConnection::onConnectionAccepted(25566)

Leaving GameConnection::onConnectionAccepted() - return

Entering GameConnection::setLagIcon(25566, 1)

Leaving GameConnection::setLagIcon() - return

Mapping string: ServerMessage to index: 0

Mapping string: MsgClientJoin to index: 1

Mapping string: %1 has joined the server. to index: 2

Entering clientCmdServerMessage(23 MsgClientJoin, 70 has joined the server., , 1.00527e+06, , 0, , , , 0)

Entering defaultMessageCallback(23 MsgClientJoin, 70 has joined the server., , 1.00527e+06, , 0, , , , 0, , )

Entering onServerMessage( has joined the server.)

Entering playMessageSound( has joined the server.)

Leaving playMessageSound() - return -1

Entering ChatHud::addLine(ChatHud, has joined the server.)

Leaving ChatHud::addLine() - return

Leaving onServerMessage() - return

Leaving defaultMessageCallback() - return

Entering handleClientJoin(23 MsgClientJoin, 70 has joined the server., , 1.00527e+06, , 0, , , , 0, , )

Entering PlayerListGui::updatePlayerInfo(PlayerListGui, 25568)

Leaving PlayerListGui::updatePlayerInfo() - return

Leaving handleClientJoin() - return

Leaving clientCmdServerMessage() - return

Entering GameConnection::setLagIcon(25566, 0)

Leaving GameConnection::setLagIcon() - return

Entering ToggleConsole(1)

Entering [CanvasCursorPackage]GuiCanvas::pushDialog(Canvas, ConsoleDlg, 99)

Entering ConsoleDlg::onWake()

Leaving ConsoleDlg::onWake() - return

Entering [CanvasCursorPackage]GuiCanvas::checkCursor(Canvas)

Entering showCursor()

Leaving showCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::checkCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::pushDialog() - return

Entering updateConsoleErrorWindow()

Leaving updateConsoleErrorWindow() - return

Leaving ToggleConsole() - return

Entering ToggleConsole(0)

Leaving ToggleConsole() - return

Entering ToggleConsole(1)

Entering [CanvasCursorPackage]GuiCanvas::popDialog(Canvas, ConsoleDlg)

Entering ConsoleDlg::onSleep()

Leaving ConsoleDlg::onSleep() - return

Entering [CanvasCursorPackage]GuiCanvas::checkCursor(Canvas)

Entering showCursor()

Leaving showCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::checkCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::popDialog() - return

Leaving ToggleConsole() - return

Entering ToggleConsole(0)

Leaving ToggleConsole() - return

Entering JoinServerDlg::join(JoinServerDlg)

scripts/gui/joinServerDlg.cs (151): Cannot re-declare object [serverConnection].

scripts/gui/joinServerDlg.cs (152): Unable to find object: '0' attempting to call function 'setConnectArgs'

scripts/gui/joinServerDlg.cs (153): Unable to find object: '0' attempting to call function 'setJoinPassword'

scripts/gui/joinServerDlg.cs (154): Unable to find object: '0' attempting to call function 'connect'

Leaving JoinServerDlg::join() - return

Entering toggleJoinServerDlg()

Entering [CanvasCursorPackage]GuiCanvas::popDialog(Canvas, JoinServerDlg)

Entering [CanvasCursorPackage]GuiCanvas::checkCursor(Canvas)

Entering showCursor()

Leaving showCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::checkCursor() - return

Leaving [CanvasCursorPackage]GuiCanvas::popDialog() - return

Entering JoinServerDlg::query(JoinServerDlg)

Entering onServerQueryStatus(start, Querying master server, 0)

Leaving onServerQueryStatus() - return

No master servers found in this region, trying IP:88.198.65.149:28002.

Requesting the server list from master server IP:88.198.65.149:28002 (2 tries left)...

Leaving JoinServerDlg::query() - return

Leaving toggleJoinServerDlg() - return

Received server list packet 1 of 1 from the master server (4 servers).

Pinging Server IP:88.198.65.149:28000 (3)...

Pinging Server IP:88.198.65.149:28003 (3)...

Pinging Server IP:88.198.65.149:28004 (3)...

Pinging Server IP:88.198.65.149:28005 (3)...

Link to comment
Share on other sites

Pretty sure this was brought up a long while back. IIRC the bug somehow was with the player count of a server not being correctly decremented in the master server. Not sure if it was ever determined why, or why it even mattered, since I thought it was only a display number and that in the end the server should accept or reject the connection based on whether it is full.


So while the client should handle this gracefully (are we sure part of this issue is not the 60-second timeout when things go slightly squiffy?), it's the root cause of the problem that needs to be addressed.
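On the player-count point, here is a minimal sketch of why a count derived from the server's own live connections cannot drift the way a separately incremented and decremented number can. It is a toy Python model, not engine or script code; `SlotTable`, `try_join`, and `leave` are hypothetical names used only for illustration.

```python
class SlotTable:
    """Toy model of a server's player slots (hypothetical, not Torque code)."""

    def __init__(self, max_players):
        self.max_players = max_players
        self.connections = set()  # IDs of clients connected right now

    @property
    def player_count(self):
        # Derived from the live set instead of a separate counter,
        # so a missed decrement cannot leave a phantom "+1" behind.
        return len(self.connections)

    def try_join(self, client_id):
        # A client the server still holds does not consume a second slot.
        if client_id in self.connections:
            return True
        if self.player_count >= self.max_players:
            return False
        self.connections.add(client_id)
        return True

    def leave(self, client_id):
        # discard() ignores unknown IDs, so a duplicate or late
        # disconnect cannot corrupt the count.
        self.connections.discard(client_id)
```

In this model the number reported to a master server would simply be `player_count`, so the displayed value and the accept/reject decision always come from the same source.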

 

Entering clientCmdServerMessage(23 MsgClientJoin, 70 has joined the server., , 1.00527e+06, , 0, , , , 0)
Entering defaultMessageCallback(23 MsgClientJoin, 70 has joined the server., , 1.00527e+06, , 0, , , , 0, , )

This looks weird to me; I never like it when numbers get displayed 'incorrectly'.

Link to comment
Share on other sites

Yes, I discussed that player count issue before, but that "only" caused servers to disappear from the server list that the master server generated. Now there are also servers that are still listed by the master server but are not joinable, and that break the game once you try to join them: the serverConnection object is not deleted, so the game still thinks a game is running and does not let you join another, forcing you to exit the game to reset it.
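The client-side half of this matches the trace: the second join fails with "Cannot re-declare object [serverConnection]", and the following calls then run against object '0'. Below is a minimal Python model of that behaviour and of one possible guard, assuming the stale object can safely be deleted before a new one is created. `Registry` and `join` are hypothetical stand-ins, not the engine's API, and this has not been tested against the game.

```python
class Registry:
    """Toy stand-in for an engine's table of named objects (hypothetical)."""

    def __init__(self):
        self._objects = {}
        self._next_id = 1

    def create(self, name):
        # Mirrors the log: re-declaring an existing name fails and yields 0,
        # so every later call on the result hits "object '0'".
        if name in self._objects:
            return 0
        obj_id = self._next_id
        self._next_id += 1
        self._objects[name] = obj_id
        return obj_id

    def delete(self, name):
        self._objects.pop(name, None)

    def exists(self, name):
        return name in self._objects


def join(registry, name="serverConnection"):
    """Create the connection object, clearing a stale one first."""
    if registry.exists(name):
        registry.delete(name)  # drop the leftover from the failed attempt
    return registry.create(name)
```

This only stops the client from locking up after a bad join; it does not explain why the server refuses the connection in the first place.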


Regarding that message: yes, it looks weird, but I don't really understand what is going on there, only that it is a message function with lots of arguments and one of them is a weird number that looks like floating-point imprecision. The numbers are IDs that are assigned to certain things, a network optimization or so; one number is the client ID, in this case "70". After that come a lot of arguments that are mostly empty, and some have 0 assigned to them. I don't really know where to look to debug that or to find out what is being sent there.


The bug is also hard to replicate since it needs a server that runs for a long time and real human players from time to time. Bots seem to shorten the time until the bug happens (the more bots, the faster), but ultimately real human players are needed to cause it.


If someone wants to test the bug for himself, I could offer to host a server and let it go bad, meaning not restart it for a long time until it breaks. The problem there is that at some point it may also disappear from the list. At first I could prevent the bug by restarting the servers once or twice a day; later I went down to every 8 hours, and now I have had to go down to every 4 hours to prevent it in most cases, but there is still a slight chance it will happen.

Link to comment
Share on other sites

This is one of the reasons why I'm planning on having a bunch of analytical data in my game. You should be able to match all the numbers up, ideally with something to graph them out, and then see what is causing the issue. I mean, there's a plethora of reasons why running analytics is fun, but finding problems is a good side effect.

Link to comment
Share on other sites

Who cares about the details? I need a solution that works.


I think I need some kind of function that checks whether the connection is valid, on the server and on the client, and that resets the connection and clears the slot on the server if the connection process gets interrupted, bugged, timed out or whatever.
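One common way to build such a check is a periodic sweep that drops any connection that has not been heard from within a timeout. The following is a rough Python sketch of that idea only, not of how the engine does it; `ConnectionTable`, `touch`, and `sweep` are hypothetical names, and the timeout value is whatever you choose.

```python
import time


class ConnectionTable:
    """Tracks when each client was last heard from (hypothetical model)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock    # injectable so the logic can be tested
        self.last_seen = {}   # client_id -> time of last packet

    def touch(self, client_id):
        # Call whenever any packet arrives from this client.
        self.last_seen[client_id] = self.clock()

    def sweep(self):
        # Remove every client that has been silent longer than the timeout
        # and return their IDs so the caller can free the matching slots.
        now = self.clock()
        stale = [cid for cid, seen in self.last_seen.items()
                 if now - seen > self.timeout_s]
        for cid in stale:
            del self.last_seen[cid]
        return stale
```

The server would call `sweep()` on a timer and release the slot of every ID it returns; the client can run the same kind of check against the server and delete its own connection object when it goes stale.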

I know it is not just the master server since you can join servers directly through IP and this also fails.

The join and exit processes are pretty bulletproof, if you test them short term, but if you let the server run for a day and then come back after a few people played on it a few times the server somehow gets broken.

I already reset the servers after the last human left, but something does not seem to get cleared in that process sometimes.
