A customer called me for help, and I couldn't connect to him.
First I tried from RDM, but got a "Could not resolve" error; I tried with the Wayk Now client and got the same error.
The customer opened the Wayk Now client on his computer and checked for the green "Ready" dot - it was there.
He closed the Wayk Now client and opened it again - the same error.
Debug logs from both computers are included.
The problem was resolved by restarting the customer's computer. It would probably have started working after a Wayk Now service restart, but the customer doesn't have admin rights on the computer.
My apologies for the slow response; I wanted to involve the team responsible for the Wayk Den infrastructure to get their feedback on this.
We think that an interruption in the end user's network sometime after 9:30 (local time) caused the machine to lose its connection to Wayk Den. From our side, we can see that the machine was unreachable and, in the end, it was marked as offline - after which you would receive the "could not be resolved" error when trying to connect.
The network interruption was probably brief, and your user may not have even noticed.
However, it is apparent that on the Wayk Now side, the interruption was not detected - the service considered itself to still be connected, hence the green light and "Ready" label. As long as the service considers itself connected, it of course won't try to reconnect. Your theory that restarting the service would've resolved the problem is likely correct, and rebooting the machine had the same effect.
I'm looking at the possible reasons for this (there are a few layers that can be responsible for triggering the disconnection - TCP, TLS, WebSockets, and within the application code) to work out why this occurs. I'll post back here once I have an update on that.
Thanks for both the detailed bug report and for your patience,
Most of my customers have dual-WAN routers configured with auto-switch. I just checked the logs at the customer site and didn't notice any network issue.
Uptime on the main connection is more than 4 days.
I've seen errors like this from time to time and even tried turning off IDS at a few customers, but as you can see I still have issues.
There was a time when I thought it could be ESET Endpoint Protection, but at another customer I have Symantec Norton Antivirus and last week had the same problem... the user sees the green dot, but I can't connect to him. (That is a different customer from the one in this post.)
I'd like to know what I can check on my side.
Can you tell me what these logs mean:
2019-12-18 09:28:09 NowService::main [INFO] - Received service event: SessionLogon(9)
2019-12-18 09:28:09 NowService::main [DEBUG] - SESSION_LOGON(9)
2019-12-18 09:28:09 NowService::main [INFO] - Received service event: SessionLock(9)
2019-12-18 09:28:17 NowService::main [INFO] - Received service event: SessionUnlock(9)
I started this computer this morning and in the logs I see only these entries, even though I have debug logs enabled - only four lines for a Wayk Now service start?
cmd > systeminfo
System Boot Time: 12.12.2019, 06:50:04
So... Windows tells me that I didn't start the computer this morning, but a week ago. Hmm, I don't use sleep or hibernation here - only shutdown.
When I check the Wayk Now service logs from that date I see real debug logs like this:
2019-12-12 06:50:19 common::logging [INFO] - the client id is c1c4ec82-aec2-432e-87fa-056206b5f5af
2019-12-12 06:50:19 common::logging [DEBUG] - 0 crash reports pending
2019-12-12 06:50:19 common::logging [DEBUG] - 0 of 0 completed crash reports have been uploaded
2019-12-12 06:50:19 common::logging [DEBUG] - requested C:\Program Files\Devolutions\Wayk Now\crashpad_handler.exe to start asynchronously (automatic uploads: 1)
2019-12-12 06:50:19 NowService::main [INFO] - service started
2019-12-12 06:50:19 NowService::main [INFO] - called with args: ["WaykNowService"]
2019-12-12 06:50:19 common::logging [DEBUG] - socket listening on ipc://wayk_session_control
2019-12-12 06:50:19 common::logging [DEBUG] - socket listening on ipc://wayk_service_control
2019-12-12 06:50:19 NowService::main [INFO] - ipc services started
[...] and many more lines.
It's a Windows 10 "feature" called "Fast startup" (it's enabled by default) - I turned it off and did a shutdown.
System info tells me:
System Boot Time: 18.12.2019, 11:26:23
I tried to connect and the connection was successful.
Before the shutdown, with fast startup enabled, I couldn't connect to this computer.
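The manual check above (compare the last "service started" entry in the service log with the time of the latest session event) could be scripted roughly like this. It is only a sketch: the helper names `parse_timestamp` and `last_service_start` are made up for illustration, and it assumes the log format shown in the lines quoted in this post.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19  # length of a prefix such as "2019-12-12 06:50:19"


def parse_timestamp(line):
    """Parse the timestamp at the start of a log line."""
    return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)


def last_service_start(log_lines):
    """Return the timestamp of the most recent 'service started' entry, or None."""
    last = None
    for line in log_lines:
        if line.rstrip().endswith("[INFO] - service started"):
            last = parse_timestamp(line)
    return last


# Sample lines copied from the logs quoted in this thread.
sample_log = [
    "2019-12-12 06:50:19 NowService::main [INFO] - service started",
    "2019-12-18 09:28:09 NowService::main [INFO] - Received service event: SessionLogon(9)",
    "2019-12-18 09:28:17 NowService::main [INFO] - Received service event: SessionUnlock(9)",
]

started = last_service_start(sample_log)
latest_event = parse_timestamp(sample_log[-1])
days_since_start = (latest_event - started).days

# A gap of several days between the last real service start and today's
# logon events suggests the machine was not fully shut down in between.
print(f"service last started {days_since_start} day(s) before the latest event")
```

If the service really restarted on every power-on, the gap would be close to zero.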
Now it's your turn - you need to work out how the Wayk Now service can work with this feature enabled.
I'll be at my customer's later and I'll check whether my customer has this option enabled too.
I ran the tests from update 1 and update 2 on other computers where I had similar problems, but I don't know if it's the same problem as in the original post.
Thanks for the detailed feedback.
The session events you point to come from WTS (Windows Terminal Services) and indicate session changes (logon, logoff, connect/disconnect from session, etc.).
But it sounds like you have that figured out anyway - this "Fast Startup" feature seems to explain things.
Here's how it works from our side:
Wayk Now talks to Wayk Den using WebSockets, which is just a protocol on top of TCP / TLS. Part of the protocol are the "ping" and "pong" control messages. On the server side, we send the client a ping and expect to receive a pong in a reasonable time (I believe the timeout is 2 minutes). If we miss 4 pings, the connection is closed on the Wayk Den side, and trying to connect on that machine will change from UNREACHABLE to NOT_FOUND ("couldn't be resolved...").
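To illustrate the server-side behaviour described above, here is a minimal Python sketch. It is not the actual Wayk Den code: the class and method names are hypothetical, the 2-minute timeout is the value I believe we use, and the rule that a connection with at least one missed ping reports UNREACHABLE is an assumption made for the example.

```python
from dataclasses import dataclass

PING_TIMEOUT_SECONDS = 120  # assumed: "I believe the timeout is 2 minutes"
MAX_MISSED_PINGS = 4        # connection is closed after this many missed pings


@dataclass
class PeerConnection:
    """Hypothetical server-side bookkeeping for one client connection."""
    missed_pings: int = 0
    closed: bool = False

    def on_pong(self):
        # A pong arrived in time, so the peer is alive again.
        if not self.closed:
            self.missed_pings = 0

    def on_ping_timeout(self):
        # No pong arrived within PING_TIMEOUT_SECONDS of the last ping.
        if self.closed:
            return
        self.missed_pings += 1
        if self.missed_pings >= MAX_MISSED_PINGS:
            self.closed = True

    def connect_result(self):
        # What someone trying to reach this machine would see.
        if self.closed:
            return "NOT_FOUND"    # shown as "couldn't be resolved..."
        if self.missed_pings > 0:
            return "UNREACHABLE"  # assumption: any missed ping counts
        return "OK"


peer = PeerConnection()
for _ in range(MAX_MISSED_PINGS):
    peer.on_ping_timeout()
    print(peer.missed_pings, peer.connect_result())
```

With these values, a machine that silently disappears is only marked as offline after several ping intervals, which is why the error changes over time.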
On the Wayk Now side of things, we rely on the TCP-level disconnection to tell us that we are no longer connected. By default, TCP connections are kept "alive" for a very long time by the OS (even with keep-alives enabled, the default settings are generally very conservative because the underlying network can be lossy and the defaults allow for a *very* lossy network).
So the cause of the lost connection could be the network going down (which seems like the most common explanation, but in your case you can show it's not the issue) or the machine going into some kind of hybrid sleep-hibernation (which you've demonstrated is probably the case here!).
From our side, it clearly appears we have an issue in how we are configuring the TCP keep-alives on the WebSocket connection; it may be that we need to tune this differently for the "fast startup" shutdowns. We are working on that and I'll post back here once I have some update.
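For readers unfamiliar with TCP keep-alive tuning, here is a generic Python sketch of what it looks like on a plain socket. It is not how Wayk Now is implemented: the function name `enable_keepalive` is hypothetical and the timing values are purely illustrative, not the settings we use or plan to use.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keep-alive with shorter timings than the OS defaults.

    idle_s:     seconds of silence before the first probe is sent
    interval_s: seconds between unanswered probes
    probes:     unanswered probes before the connection is dropped
                (not configurable through the Windows ioctl used below)
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms)
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
        return

    # Linux and similar platforms expose per-socket options; each one is
    # checked because availability differs between operating systems.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    print("keep-alive enabled:", enabled)
```

With settings like these, a peer that vanishes without closing the connection would be detected after roughly the idle time plus the probes, instead of staying "connected" for hours.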
Thanks again for the detailed feedback, it was very helpful for figuring this out.
I checked the customer computer from my first post - fast startup was enabled (as mentioned in the MS docs, it's turned on by default) and systeminfo shows more than two weeks of uptime. The customer turns the computer on and off every day.
Thanks for the update. I will post back here once I have some more news from our side.
A small update; we have made some adjustments in how we handle lost connections at the transport level. I believe this should fix the responsiveness of the Wayk Den "ready" state (that is, after a fast-boot, the dropped connection will be properly noticed and the reconnection should occur). Those changes will be in the next release, which should be in January - but I'll post back here with a firm update on that.
Thanks and kind regards,
Wayk Now 2020.1.0 is now available for download. It contains the fix mentioned above and you should see the responsiveness of the Wayk Den status significantly improved, including in fast boot scenarios. Please let us know if you notice ongoing issues or problems, especially with connectivity.
Thanks and kind regards