Only a few MPI implementations are CUDA-aware. I recommend using a recent release (later than 4.0) of Open MPI built with CUDA support.
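To see whether an existing Open MPI installation was actually built with CUDA support, one option is to inspect the output of `ompi_info --parsable --all` for the `mpi_built_with_cuda_support` entry. The snippet below is a minimal sketch of that check; the helper names `parse_cuda_support` and `open_mpi_is_cuda_aware` are made up for illustration, and it assumes `ompi_info` is on your `PATH`.

```python
import shutil
import subprocess

# Key that Open MPI's `ompi_info --parsable --all` uses for the build flag.
CUDA_KEY = "mpi_built_with_cuda_support:value:"


def parse_cuda_support(ompi_info_output):
    """Hypothetical helper: return True/False if the flag is reported, None if absent."""
    for line in ompi_info_output.splitlines():
        if CUDA_KEY in line:
            # The value is the last colon-separated field, e.g. "...:value:true".
            return line.strip().rsplit(":", 1)[-1].lower() == "true"
    return None


def open_mpi_is_cuda_aware():
    """Hypothetical helper: run ompi_info and parse its output.

    Returns None when ompi_info is not installed or does not report the flag.
    """
    exe = shutil.which("ompi_info")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--parsable", "--all"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_cuda_support(result.stdout)


if __name__ == "__main__":
    print("CUDA-aware Open MPI:", open_mpi_is_cuda_aware())
```

A result of `None` means the flag could not be found, which is what you would likely see with a non-Open MPI implementation or when `ompi_info` is missing. This only reports how the library was built; it does not verify that GPU buffers work at run time.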