New releases seem a lot more stable these days, and that we like for sure. Things are generally more reliable, but one thing I have seen is that from time to time the Bastion server will stop responding (via URL on SSL) while the VM OS is still available by RDP. Typically all I have to do is remote in and restart the Bastion services and it comes right back up. While it's not a difficult thing to do, it does make me nervous about larger-scale deployment scenarios.
Are there any reasons why this would be happening?
Hi Tyson
I'm happy to hear that you've seen improvements in reliability.
To be clear - are you saying the Agent (i.e. the machine you are remotely connecting to) becomes unresponsive? So you can remote in and restart the "Wayk Now Unattended Service" on the Agent machine, and connectivity is restored?
If I've misunderstood, please let me know. Otherwise, such a case would be a high priority bug for us and I'd like to understand why that's occurring.
First, please make sure you're running the latest Agent 2021.1.1. It included a fix for a similar condition.
Next, when you notice the problem, please save a diagnostic archive from the machine. Just launch Wayk Agent and choose Help > Export Diagnostics. You'll be prompted for a location to save the diagnostic file - you can send us that .zip file either by PM or to wayk@devolutions.net.
Often we can piece together the issue by examining the logs; however, what would be most helpful is a core dump of the unresponsive service.
procdump.exe nowservice
It will capture a minidump of the service and save it, printing the path in the command window. You can send us that file either by PM or to wayk@devolutions.net.
(I've assumed a basic familiarity with cmd.exe; if you need more detailed instructions or have some questions please let me know. We're working to streamline this and make it easier to capture these detailed diagnostics via the Wayk Agent GUI in the future).
Thanks and kind regards,
Richard Markievicz
Hello again Tyson
I did discuss this with my colleagues on the Wayk Bastion team; it's really not clear to us if you're describing an issue (as I thought) connecting to the Agent, which you resolve by restarting the unattended service on the Agent itself, *or* if you are describing a case where you restart Wayk Bastion itself to restore connectivity.
If it's the latter, can you confirm the version of Wayk Bastion you are running?
Thanks and kind regards,
Richard Markievicz
My fault totally. It is the Bastion server itself that becomes unreachable from time to time, not the endpoint agents. I have a monitor going that tries to hit the URL on port 443 every 5 mins or so to keep an eye on the service, so we know if it's down. When that stops responding we find that we are unable to hit the web interface of the server, but we can RDP into the server and run a Restart-WaykBastion and the server becomes reachable again.
EDIT: we have seen this happen on versions 2021.1.3, and some of the earlier ones. We just updated the server to 2021.1.4 when it became unavailable this time.
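For readers who want a similar check, here is a minimal Python sketch of the kind of external monitor described above. It is not the monitoring tool used in this thread. The URL `https://bastion.example.com`, the function names, and the timeout are assumptions made up for illustration.

```python
import time
import urllib.error
import urllib.request

# Hypothetical values: replace with your own Bastion URL and polling interval.
BASTION_URL = "https://bastion.example.com"
INTERVAL_SECONDS = 300  # "every 5 mins or so"


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, raising on network failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def is_up(url, fetch=fetch_status):
    """Return True if the URL answers with any HTTP response, False otherwise."""
    try:
        fetch(url)
        return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, TLS failure, and so on.
        return False


def monitor(url=BASTION_URL, interval=INTERVAL_SECONDS):
    """Poll forever and print a line whenever the state changes."""
    last_state = None
    while True:
        state = is_up(url)
        if state != last_state:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {url} is {'UP' if state else 'DOWN'}")
            last_state = state
        time.sleep(interval)
```

Calling `monitor()` starts the loop. Logging only state changes keeps the output short and gives a timestamp for the first failed check, which is useful when comparing against server logs.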
Hi Tyson
Thanks for the update.
"We just updated the server to 2021.1.4 when it became unavailable this time."
Do you mean you had this issue after updating to 2021.1.4?
Thanks and kind regards,
Richard Markievicz
No, we had the issue and then while the server was down, we updated to 2021.1.4
Hi Tyson
Ok, thanks for clarifying. In that case, please monitor things and let us know if you see future issues. The problem sounds like an issue that was present in 2021.1.2 and 2021.1.3, and the release notes for 2021.1.4 mention it.
Like I said, please let us know if you have any further issues.
Thanks and kind regards,
Richard Markievicz
Thanks for the quick attention to this one, I'll report back if it happens again.
Ok chiming back in. Our bastion server stopped responding this morning. I have not been in a position to attempt an RDP into it yet but if I can, what action should I attempt?
Hi Tyson,
When you say it stopped responding, do you mean you can't access the web interface at all? Or do you mean that you can't create a session via Wayk Bastion?
If you could send us the Wayk Bastion server log, I would appreciate it. You can run the command Export-WaykBastionLogs; it should create a zip file. You can send it to me via direct message or send it to ticket@devolutions.net.
I will look at it and let you know if I can find the reason for that issue.
Thank you and best regards,
François Dubois
You are correct, it's like it was previously: the web interface stopped responding altogether. We keep a persistent monitor on the Bastion URL, and when it stopped responding we knew it was down again. I was able to RDP into the server, however, and run the log export cmdlet. Now I just need to figure out the best way to get a zip file out of Windows Server Core :) Do you typically recommend installing an FTP service on there, or something else?
Logs sent! Should I keep the server in its current state, or can I restart the Wayk services to bring it back online? Right now it's not a production server, so it being unusable is not the end of the world if there are other diagnostics you want to pull from it.
Hi Tyson,
From the log that you sent, all containers were stopped. From what I understood, you didn't stop the Wayk Bastion server before you generated the log, am I right? In the containers list, I can see all containers with the status `Exited (4294967295) 21 hours ago`. Do you know if something special occurred 21 hours before you got the logs?
I'm still looking at the log, but I don't see anything special other than that the server has been stopped. You can restart it; we have all the logs from all containers.
Best regards
François Dubois
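A note on the status `Exited (4294967295)` quoted above: 4294967295 is 2^32 - 1, which is how -1 looks when a 32-bit signed value is printed as unsigned. The Python sketch below only shows that arithmetic. It does not explain why the containers exited, and the helper name `to_signed32` is made up for illustration.

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed 32-bit integer."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x100000000 if value >= 0x80000000 else value


print(to_signed32(4294967295))  # -1
```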
Ok cool, ran a Restart-WaykBastion and the web console became available again.
That is interesting about the containers stopping. I don't have a good answer as to why that would happen. We set up the server according to the documentation as a VM within Azure. Is there something as part of Azure management that would somehow cause the containers to stop?
We run a persistent external monitor that attempts to load the URL for the Bastion server every 5 mins or so. As with this instance and in the past, when it fails we know we need to RDP into the VM and restart the Bastion services. As far as anything special, not that I am aware of. We generally do not touch the Azure services for this VM and just let it run. As far as I can see, no one was in there at the time the server became unreachable.
From my monitor it looks like the server stopped responding to requests at 2021-05-11 22:48:40, does that line up with what you are seeing in the logs?
Hi Tyson,
I just remembered that we have already seen something similar when the server was running in VMware. Is that your case? Here is our small doc about that: https://docs.devolutions.net/wayk/bastion/docker-installation.html#virtual-machine
Best regards,
François Dubois
Ok cool I will read that doc now. Do you think this would apply though since our VM is a pure Azure one?
No, that issue only occurred when the server was in VMware, which is not your case from what I understand. In that case the server was also stopping every 30 minutes, whereas yours stopped after many days, right? So it is not related. In fact, in the VMware case the server was not stopping; it was crashing and restarting, and since the order is important, that was an issue. Here, the server was stopped and it didn't restart. So it looks like it didn't crash...
I'm still looking; I will keep you posted.
Best regards,
François Dubois
Yeah last connection loss I see on the server was on 4/16 so it stayed up for almost a month. Thanks for the help so far!
Hello Tyson,
My colleague gave me a new path to investigate, and it is probably the issue. Is it possible that your host was restarted? I can reproduce something similar if I restart my host, so I suspect your Azure VM restarted. You could have a look in the Event Viewer to see if that is the case. My other question would be: did you register your Wayk Bastion server as a service? If you do, the service will call `Start-WaykBastion` when it starts, so even if your host reboots, Wayk Bastion will be back as soon as your VM is back. Here is the doc on how to do that: https://docs.devolutions.net/wayk/bastion/index.html#system-service
Let me know if it makes sense in your situation and if your server was registered as a service. It could explain what you have seen.
Best regards,
François Dubois
I think this may have been the case. In the event logs on the server, I could see that a reboot did indeed happen around the time the server became unavailable. Bastion was NOT set to run as a service and has now been set up to run as one. We tried a couple of reboots and found Bastion came back up afterwards. Will continue to monitor, but this is probably it.
Thank you!
Awesome, don't hesitate to contact us again if you have any other issues or questions!
Best regards,
François Dubois