Setting up a basic Selenium Grid is pretty well documented, with plenty of examples on the internet. Once your Selenium Grid is up and you run tests against it daily, you might run into some issues, just like I have. The Grid setup is relatively stable, but I would run into one of the following problems every other month:
- Selenium Node Java processes run out of memory
- Browsers sometimes crash and are not closed correctly
- Selenium HUB Java process stops responding entirely
- Node Operating System out of memory
This led to restarting the nodes and hub whenever all the tests were failing. I was not very happy with restarting everything manually. I also wanted a good way to update the Selenium Grid version and settings from a single central location, since updating a lot of nodes manually is tiresome.
Here we have a number of systems with different roles:
- Continuous Integration Server (for example Jenkins): Starts and runs the tests
- Grid HUB: Communicates the test steps to the nodes
- Grid nodes: Perform the tests against the real browsers. All nodes have an SSH server installed. I use a Cygwin setup for the Windows nodes.
- SMB: Central file share, contains the configs, shell scripts and the Selenium software (jars and additional third party drivers)
We put the configs, the shell scripts (for starting the hub and nodes) and the Selenium software on the central file share. Mount the central file share on the hub and all the nodes, and set up the operating system to run the start-up shell scripts right after the system has booted (you might need to configure auto login first). This means that a clean reboot of every machine leads to a fresh grid situation. The nodes automatically wait for the grid hub and connect to it.
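As a rough idea of what such a start-up script for a node might look like, here is a minimal sketch. The script name (start_node.sh), the jar file name and the retry counts are assumptions for illustration only; the mount point is the one used in the hub command in this post. It waits until the share is actually mounted before launching the node, since the network mount may not be ready right after boot.

```shell
#!/bin/sh
# Hypothetical start_node.sh, run at boot on each node.
SHARE="${SHARE:-/mnt/shared/selenium}"        # mount point of the central file share
HUB_URL="${HUB_URL:-http://yourhubip:4444}"   # placeholder hub address
JAR="${JAR:-selenium-server-standalone.jar}"  # assumed jar name on the share

# Wait until a file is visible, checking up to max_tries times.
# usage: wait_for_file <file> <max_tries> <delay_seconds>
wait_for_file() {
    file=$1; max_tries=$2; delay=$3
    tries=0
    while [ ! -f "$file" ]; do
        tries=$((tries + 1))
        [ "$tries" -ge "$max_tries" ] && return 1
        sleep "$delay"
    done
    return 0
}

start_node() {
    # Give the share up to about 5 minutes to appear after boot.
    wait_for_file "$SHARE/$JAR" 30 10 || {
        echo "share not mounted: $SHARE" >&2
        return 1
    }
    cd "$SHARE" || return 1
    # Register this machine as a node with the hub.
    exec java -jar "$JAR" -role node -hub "$HUB_URL/grid/register"
}

# The real script would end with a call to: start_node
```

Because the jar is started from the share, replacing it there is enough for every node to pick up the new version on its next boot.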
The next step is to create a C.I. job which resets the Selenium Grid at certain intervals. One problem is that the Selenium Grid currently does not offer a graceful shutdown, which means that when you shut down any element of the grid, the currently running tests will fail. To tackle this we need to make sure no tests are running. For Jenkins we use the Exclusive Execution Plugin, which puts Jenkins in maintenance mode, waits for all other jobs to finish and then runs the job marked as exclusive. After the exclusive job is finished it returns Jenkins to normal mode. Our Selenium Grid restart job executes the following steps:
- Shut down the Grid HUB to prevent any new tests from starting, by hitting the shutdown url with wget: http://yourhubip:4444/lifecycle-manager?action=shutdown
- Restart the Grid Nodes. We SSH into each node and send the reboot command (for Windows it's shutdown /r )
- Sleep for 10 minutes to let the machines finish the reboot (still need to find a better way to check if the nodes are back)
- Start the Grid HUB in the background (over SSH we send this command: echo "sh -c 'cd /mnt/shared/selenium; nohup sh start_hub.sh &'" | at now +1 min )
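The steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: the node host names, the SSH options and the timings are made up for illustration, the nodes are Linux machines where the SSH user is allowed to run reboot, and the fixed 10 minute sleep is swapped for polling each node over SSH until it answers again.

```shell
#!/bin/sh
# Sketch of the grid restart job; host names and timings are placeholders.
HUB_HOST="${HUB_HOST:-yourhubip}"
NODES="${NODES:-node1 node2}"   # hypothetical node host names
SSH="${SSH:-ssh -o BatchMode=yes -o ConnectTimeout=5}"

# Poll a command until it succeeds; give up after roughly timeout seconds.
# usage: wait_for <timeout_seconds> <interval_seconds> <command...>
wait_for() {
    timeout=$1; interval=$2; shift 2
    waited=0
    until "$@"; do
        waited=$((waited + interval))
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
    done
    return 0
}

# A node counts as "back" once it accepts an SSH login again.
node_is_up() {
    $SSH "$1" true
}

restart_grid() {
    # 1. Stop the hub so no new tests can start.
    wget -q -O /dev/null "http://$HUB_HOST:4444/lifecycle-manager?action=shutdown"

    # 2. Reboot every node (Windows nodes would need: shutdown /r).
    for node in $NODES; do
        $SSH "$node" reboot
    done

    # 3. Let the reboots begin, then poll each node for up to 10 minutes.
    sleep 60
    for node in $NODES; do
        wait_for 600 10 node_is_up "$node" ||
            echo "WARNING: $node did not come back" >&2
    done

    # 4. Schedule the hub start on the hub machine via at.
    echo "sh -c 'cd /mnt/shared/selenium; nohup sh start_hub.sh &'" |
        $SSH "$HUB_HOST" 'at now +1 min'
}

# The Jenkins job would end with a call to: restart_grid
```

The short sleep before polling is there because a node can still answer SSH for a moment after the reboot command is sent. An answering SSH server only shows that the machine has booted, not that the Selenium node process is running, so a stricter check may still be needed.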
Now we schedule the job to run every night when all the developers are sleeping. Every day we have a fresh grid setup to work against, the joy!
If we want to upgrade the Selenium version, we just update the jars in the central location and run the Jenkins grid restart job.
I am not sure if this is the optimal setup, but I hope this post gives an idea of how you could create a pretty stable Selenium Grid setup.