Kb74

Emulab FAQ: Setting Up a New Emulab: How is the reboot timeout controlled?

When an experiment is swapping in and a node fails to reboot properly, you will often see messages like the following in the swapin log:

	Still waiting for pc218 - it's been 1 minute(s).
	Still waiting for pc218 - it's been 2 minute(s).
	Still waiting for pc218 - it's been 3 minute(s).
	Still waiting for pc218 - it's been 4 minute(s).
	*** Giving up on pc218 - it's been 4 minute(s).

There are two variables in the Emulab database that control how long the system will wait for a node to reboot before giving up on it and declaring failure:

  • bios_waittime: The node_types table has a slot indicating how long the BIOS typically takes to go from reset to the point where it loads the PXE kernel. This number is typically set in the range of 60-120 seconds.
  • reboot_waittime: The os_info table (where OSIDs are stored) has a slot indicating how long the operating system takes to reach multiuser mode. This number also includes the Emulab portion of the node self-configuration, which can add several minutes if tarballs or RPMs are scheduled to be loaded via the NS file. For our generic FreeBSD and Linux images, this number is typically set to 120 seconds. The right value depends on the processor speed of your nodes; slower nodes will require a longer timeout.
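To make the interaction of the two values concrete, here is a minimal Python sketch of a wait loop. It is not the Emulab implementation: it assumes the two timeouts are simply added together to form one deadline (which would be consistent with the 4-minute give-up in the log above when both are 120 seconds), and the function name wait_for_reboot, the is_up callback, and the 60-second polling interval are all hypothetical names chosen for illustration.

```python
def wait_for_reboot(node, bios_waittime, reboot_waittime, is_up, poll_seconds=60):
    """Simulate waiting for a node to come back after a reboot.

    ASSUMPTION: the total deadline is bios_waittime + reboot_waittime.
    is_up(elapsed_seconds) is a hypothetical callback that reports whether
    the node has checked in. No real sleeping is done; time is simulated.
    Returns (success, log_lines).
    """
    deadline = bios_waittime + reboot_waittime
    log = []
    elapsed = 0
    while elapsed < deadline:
        # Advance simulated time by one polling interval, capped at the deadline.
        elapsed = min(elapsed + poll_seconds, deadline)
        if is_up(elapsed):
            return True, log
        log.append(f"Still waiting for {node} - it's been {elapsed // 60} minute(s).")
    log.append(f"*** Giving up on {node} - it's been {elapsed // 60} minute(s).")
    return False, log


if __name__ == "__main__":
    # A node that never comes up, with both timeouts at 120 seconds.
    ok, lines = wait_for_reboot("pc218", 120, 120, lambda t: False)
    print("\n".join(lines))
```

Under these assumptions, raising either value extends the deadline by the same amount, so a slow BIOS and a slow boot are interchangeable as far as the give-up time is concerned.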

Please note that reboot_waittime is stored in each OSID entry in the database, and defaults to 120 seconds when a new OSID is created. If you decide to change this number on your testbed, you will need to update all of the existing entries in your database. For example, to change all of the existing Linux images from 120 to 100 seconds:

	mysql> update os_info set reboot_waittime=100 where os='Linux';