This documentation helps sysadmins figure out what to do when things go wrong. If you don't have the required access and haven't been trained for such situations, you are probably better off waking up someone who can deal with them. See the how-to-get-help documentation instead.

Specific situations

Server down

If a server is non-responsive, you can first check if it is actually reachable over the network:

ping -c 10 server.torproject.org

If it does respond, you can try to diagnose the issue by looking at Nagios and/or Grafana to analyse what exactly is going on.
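The reachability check can be wrapped in a small shell function so that the result is easy to branch on. This is only a sketch: the name check_reachable is made up for illustration, and it relies on ping exiting with status 0 when at least one reply is received, which is the behaviour of the Linux iputils ping when only -c is given.

```shell
# check_reachable HOST: hypothetical helper, not part of any existing tooling.
# Prints "reachable" if ping reports success for HOST, "unreachable" otherwise.
check_reachable() {
    if ping -c 10 "$1" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}
```

Calling check_reachable server.torproject.org then tells you which of the two branches described here applies.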

If it does not respond, check whether it is a virtual machine and, if so, which server hosts it. This information is available in LDAP (or the web interface, under the physicalHost field). Then log in to that server to diagnose the issue.
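If you prefer the command line to the web interface, the physicalHost lookup might look like the sketch below. Several things here are assumptions rather than facts from this page: the helper name physical_host, the LDAP_URI and LDAP_BASE variables (placeholders you must set to the real directory server and search base), and the use of a hostname attribute in the search filter. Only the physicalHost field name comes from the text above.

```shell
# physical_host HOST: hypothetical helper that prints the physicalHost
# attribute of HOST's LDAP entry, or nothing if the attribute is unset.
# LDAP_URI and LDAP_BASE are placeholders, and the "hostname" attribute in
# the filter is an assumption about the directory schema.
physical_host() {
    ldapsearch -x -LLL -H "$LDAP_URI" -b "$LDAP_BASE" \
        "(hostname=$1)" physicalHost |
        sed -n 's/^physicalHost: //p'
}
```

An empty result suggests the machine is itself a physical host; otherwise the printed name is the server to log in to.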

If the physical host is not responding, or if the physicalHost field is empty (in which case the server is itself a physical host), you need to file a ticket with the upstream provider. To find out which provider that is, look in Nagios:

  1. search for the server name in the search box
  2. click on the server
  3. drill down the "Parents" until you find something that resembles a hosting provider (e.g. hetzner-hel1-01 is Hetzner, gw-cymru is Cymru, gw-scw-* are at Scaleway, gw-sunet is Sunet)
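The naming examples in step 3 can be written down as a lookup, which is handy as a reminder of which parent name maps to which provider. This is a sketch built only from the examples above: the function name provider_for_parent is hypothetical, and generalising hetzner-hel1-01 to the pattern hetzner-* is an assumption.

```shell
# provider_for_parent NAME: hypothetical helper mapping a Nagios parent name
# to a hosting provider, based only on the examples listed in step 3.
provider_for_parent() {
    case "$1" in
        hetzner-*) echo "Hetzner" ;;   # assumed pattern, from hetzner-hel1-01
        gw-cymru)  echo "Cymru" ;;
        gw-scw-*)  echo "Scaleway" ;;
        gw-sunet)  echo "Sunet" ;;
        *)         echo "unknown" ;;   # keep drilling down the parents
    esac
}
```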

What follows are per-provider instructions:

Hetzner robot (physical servers)

  1. Visit the Hetzner Robot server page (password in tor-passwords/hosts-extra-info)
  2. Select the right server (hostname is the second column)
  3. Select the "reset" tab
  4. Select the "Execute an automatic hardware reset" radio button and hit "Send". This is equivalent to hitting the "reset" button on a computer.
  5. Wait a "few" (2? 5? 10? 20?) minutes for the server to return, depending on how hopeful you are that this simple procedure will work.
  6. If that fails, select the "Order a manual hardware reset" option and hit "Send". This will send an actual human to attend the server and see if they can bring it back online.

If all else fails, select the "Support" tab and open a support request.
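Rather than watching the clock during the wait in step 5, you can poll the server until it answers or a deadline passes. The sketch below is illustrative only: wait_for_host and WAIT_INTERVAL are hypothetical names, the 10-minute default is an arbitrary pick from the range suggested in step 5, and the elapsed time is approximate because the time spent inside ping itself is not counted.

```shell
# wait_for_host HOST [TIMEOUT_SECONDS]: hypothetical helper.
# Sends one ping per attempt until HOST answers or the timeout elapses.
# Returns 0 if the host came back, 1 if it did not.
wait_for_host() {
    host=$1
    timeout=${2:-600}               # default of 10 minutes is arbitrary
    interval=${WAIT_INTERVAL:-10}   # seconds to sleep between attempts
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host is back after about ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "$host still down after ${timeout}s" >&2
    return 1
}
```

For example, wait_for_host server.torproject.org 300 gives the automatic reset roughly five minutes before you move on to the manual reset in step 6.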

Hetzner Cloud (virtual servers)

  1. Visit the Hetzner Cloud console (password in tor-passwords/hosts-extra-info)
  2. Select the project (usually "default")
  3. Select the affected server
  4. Open the console (the >_ sign on the top right), and see if there are any error messages and/or if you can log in there (using the root password in tor-passwords/hosts)
  5. If that fails, attempt a "Power cycle" in the "Power" tab (on the left)
  6. If that fails, you can also try to boot a rescue system by selecting "Enable Rescue & Power Cycle" in the "Rescue" tab

If all else fails, create a support request. The support menu is in the "Person" menu on the top right of the page.

Cymru

Open a ticket by writing to support@cymru.com.

Sunet

TBD

Support policies

Please see /tsa/policy/tpa-rfc-2-support/