DVLS Gateway - Failed to start error

Came in this morning to find both of our gateways offline, with the following message in the logs:

2026-02-03T12:59:25.114904Z ERROR devolutions_gateway: Failed to start error="failed to bind listener: tcp://REDACTED:8181 -> tcp://0.0.0.0:8181: failed to bind TCP socket: Address already in use (os error 98)"

Any hints on where to look for the root cause?

Thanks!
Gateway Version
-------------------------
devolutions-gateway_2025.3.3-1_amd64.deb

System Info
-------------------------
OS: Ubuntu 24.04.3 LTS x86_64
Host: VMware Virtual Platform None
Kernel: 6.14.0-37-generic
Uptime: 21 days, 16 hours, 18 mins
Packages: 838 (dpkg), 5 (snap)
Shell: bash 5.2.21
Resolution: 1280x768
Terminal: /dev/pts/0
CPU: Intel Xeon Gold 5220R (4) @ 2.194GHz
GPU: 00:0f.0 VMware SVGA II Adapter
Memory: 376MiB / 7941MiB

All Comments (5)

Hi,

That error means Devolutions Gateway tried to bind to port 8181 (0.0.0.0:8181), but the port was already in use at that moment. In practice, it’s usually one of these:

  1. Another process/service is listening on 8181
  2. A previous devolutions-gateway process didn’t fully exit and is still holding the socket
  3. A restart happened quickly and the previous listener wasn’t released yet (less common, but possible)
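
For illustration, the failure mode itself is easy to reproduce in a few lines of Rust (a standalone sketch, not Gateway code): a second bind to an address that is already held by a live listener fails with `AddrInUse`, which Linux reports as os error 98.

```rust
use std::io::ErrorKind;
use std::net::{SocketAddr, TcpListener};

// Returns the error kind produced by trying to bind `addr` again.
fn rebind_kind(addr: SocketAddr) -> Option<ErrorKind> {
    TcpListener::bind(addr).err().map(|e| e.kind())
}

fn main() {
    // Let the OS pick a free port and keep the listener alive.
    let held = TcpListener::bind("127.0.0.1:0").expect("first bind should succeed");
    let addr = held.local_addr().expect("local_addr");
    // While `held` is still alive, a second bind to the same address fails
    // with ErrorKind::AddrInUse ("Address already in use", os error 98 on Linux).
    println!("{:?}", rebind_kind(addr));
}
```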


You can check who is listening on 8181 with these commands:

  • sudo ss -ltnp 'sport = :8181', or
  • sudo lsof -nP -iTCP:8181 -sTCP:LISTEN


Notes:

  • Gateway v2025.3.3 includes retry logic for transient “address already in use” bind errors. If it still ends up offline, it usually means the port stayed busy longer than the retry window or is persistently occupied.
  • If the “ss/lsof” output shows a different service owning 8181, that service will need to be moved to a different port (or Gateway’s listening port adjusted).
  • If the output shows devolutions-gateway itself owning the port while a restart is happening, it suggests the previous instance didn’t exit cleanly or restarts are overlapping.


Let us know the results.

Best regards,

Benoit Cortier

I'm assuming the issue is your number 2:

A previous devolutions-gateway process didn’t fully exit and is still holding the socket

Our gateways are purpose-built virtual servers, with no additional services installed.

sudo ss -ltnp 'sport = :8181'

State              Recv-Q             Send-Q                         Local Address:Port                         Peer Address:Port             Process
LISTEN             0                  64                                   0.0.0.0:8181                              0.0.0.0:*                 users:(("devolutions-gat",pid=3045022,fd=22))

sudo lsof -nP -iTCP:8181 -sTCP:LISTEN

COMMAND       PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
devolutio 3045022 root   22u  IPv4 10113238      0t0  TCP *:8181 (LISTEN)



Given your description, cause 2 would make sense.
The commands show that a Gateway instance is using the port, as expected.
You can run these commands again if you hit the problem in the future.

Now, if you hit either 2 (a previous devolutions-gateway process didn’t fully exit and is still holding the socket) or 3 (a restart happened quickly and the previous listener wasn’t released yet), the error is typically transient, and Devolutions Gateway is supposed to handle it by retrying. Gateway v2025.3.3 retries 10 times with an interval of 10 seconds, and logs a warning on each attempt, with a message looking like:

Failed to bind tcp://… retrying in 10s
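
The retry behavior described above could be sketched like this (a simplified standalone Rust sketch; the function name and structure are mine, not the actual Gateway source):

```rust
use std::io::ErrorKind;
use std::net::TcpListener;
use std::thread::sleep;
use std::time::Duration;

// Retry the bind only when "address already in use" is detected, up to
// `max_retries` times with a fixed interval, warning on each attempt.
fn bind_with_retry(
    addr: &str,
    max_retries: u32,
    interval: Duration,
) -> std::io::Result<TcpListener> {
    let mut count = 0;
    loop {
        match TcpListener::bind(addr) {
            Err(e) if e.kind() == ErrorKind::AddrInUse && count < max_retries => {
                count += 1;
                eprintln!("Failed to bind tcp://{addr}; retrying in {interval:?} count={count}");
                sleep(interval);
            }
            // Success, a non-AddrInUse error, or retries exhausted: return as-is.
            other => return other,
        }
    }
}

fn main() {
    // Port 0 always binds immediately, so no retries fire in this demo.
    let listener = bind_with_retry("127.0.0.1:0", 10, Duration::from_secs(10)).unwrap();
    println!("bound {}", listener.local_addr().unwrap());
}
```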


As of today, this only happens when the "address already in use" error is detected. Did you observe this warning in the logs, or did Devolutions Gateway simply stop without retrying at all?

Best regards,

Benoit Cortier

Thanks for the follow-up.

It did try, but ultimately, I had to restart the process manually.

2026-02-03T12:59:15.112674Z  WARN devolutions_gateway::service: Failed to bind tcp://0.0.0.0:8181; retrying in 10s error="failed to bind TCP socket: Address already in use (os error 98)" count=10

Thank you for the confirmation!

So I understand that 100 seconds were not enough to recover the port. I’m a bit surprised it took longer; in practice it should not take more than a minute even if the service is killed, though I’ve heard this can vary based on the actual kernel behavior and sysctl settings.

I’ll improve the retry logic with some kind of progressive backoff over a longer period.
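
As an illustration of the idea (a hypothetical sketch; the numbers and the actual change may differ), a doubling backoff capped at a few minutes would stretch the same 10 attempts over a much longer window than the current fixed 100 seconds:

```rust
use std::time::Duration;

// Hypothetical progressive backoff schedule: start at `base`, double on
// each attempt, and cap each delay at `max`.
fn backoff_delays(base: Duration, max: Duration, attempts: u32) -> Vec<Duration> {
    (0..attempts)
        .map(|i| {
            // `i.min(20)` keeps the shift from overflowing for large attempt counts.
            base.checked_mul(1u32 << i.min(20)).unwrap_or(max).min(max)
        })
        .collect()
}

fn main() {
    // With a 10s base and a 5-minute cap, 10 attempts cover roughly 30 minutes
    // (10 + 20 + 40 + 80 + 160 + 5 * 300 = 1810 seconds).
    let delays = backoff_delays(Duration::from_secs(10), Duration::from_secs(300), 10);
    let total: Duration = delays.iter().sum();
    println!("delays: {delays:?}");
    println!("total window: {total:?}");
}
```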

Benoit Cortier