Setting up Ray Cluster
- Install lambda-stack to get the GPU running with the same version across all the local machines.
- Install ray using pip install ray on all the local machines, with the same version on each.
- Remove any firewall rules that prevent ray from working properly.
- Generate ssh keys using ssh-keygen and copy the public key to all the local machines using the ssh-copy-id command. Passwordless ssh lets ray work properly without a config file.
- Run the ray-cluster.yaml config file using the ray up command.
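
The key distribution, the pinned ray install, and the cluster launch above can be sketched as one script. This is only a sketch under assumptions: the host names, SSH_USER, KEY, the DRY_RUN switch, and the function names are hypothetical and not part of this setup. It also assumes pip is on the remote PATH. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: SSH_USER, KEY, the host names, and the ray version you pass in
# are placeholders, not values taken from this README.
set -eu

DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = execute them
SSH_USER="${SSH_USER:-$(id -un)}"     # assumed: same user name on every node
KEY="${KEY:-$HOME/.ssh/id_ed25519}"   # assumed key location

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a key pair without a passphrase, unless one already exists.
make_key() {
  [ -f "$KEY" ] || run ssh-keygen -t ed25519 -N "" -f "$KEY"
}

# Copy the public key to one node and install a pinned ray version there.
setup_node() {
  host="$1"
  version="$2"
  run ssh-copy-id -i "$KEY.pub" "$SSH_USER@$host"
  run ssh "$SSH_USER@$host" "pip install ray==$version"
}

# Usage: setup_cluster <ray-version> <host>...
setup_cluster() {
  version="$1"
  shift
  make_key
  for host in "$@"; do
    setup_node "$host" "$version"
  done
  # Launch the cluster from the config file on the head machine.
  run ray up ray-cluster.yaml
}
```

Calling setup_cluster with a ray version and the node names prints the planned commands. Setting DRY_RUN=0 runs them for real, so it is worth reading the dry-run output first.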
Important: Lambda Stack doesn't support PowerPC machines, so on those we might need to install the nvidia-graphics driver and the nvidia-docker runtime manually.
Steps
- Download the nvidia-gpu-driver (nvidia-driver) with the following options.

- Install the nvidia-gpu-driver
sudo chmod +x <nvidia-driver>.run
sudo ./<nvidia-driver>.run
- Install the nvidia-docker container toolkit.
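
After the manual install, a quick check that the expected tools are on the PATH can catch a missed step. The sketch below is an assumption about how one might verify the result, not part of the original procedure. The function names have and check_gpu_stack are hypothetical; nvidia-smi ships with the driver and nvidia-ctk with the NVIDIA Container Toolkit.

```shell
#!/usr/bin/env bash
# Sketch: check that the manual driver and container-toolkit installs are visible.

# Report whether a command is on PATH; returns non-zero if it is missing.
have() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Run all checks; keep going after a failure so every gap is listed.
check_gpu_stack() {
  status=0
  have nvidia-smi || status=1   # installed by the driver .run package
  have docker || status=1
  have nvidia-ctk || status=1   # CLI shipped with the NVIDIA Container Toolkit
  return "$status"
}
```

Once check_gpu_stack reports every tool as ok, running nvidia-smi on the host and then sudo docker run --rm --gpus all ubuntu nvidia-smi should list the GPUs from inside a container.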