> It is only safe to run the command if none of the nodes is recovering, you
> said you have node failure event all the time, so there isn't any window for
> you to run the command.

Your target is systems with 1000 nodes. Let's say the average lifetime of a node is about 3 years. Then you get about one node failure per day (1000 nodes / ~1095 days ≈ 0.9 failures per day). But recovery and cleanup actions can take several hours, so isn't it quite hard to find a window on such a system?
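
To make the estimate easy to redo with other numbers, here is a rough back-of-the-envelope sketch. It assumes failures arrive independently at a constant rate (a Poisson process) and that every recovery takes the same fixed time. Both are simplifications of mine, and the function names are made up for illustration. Under those assumptions the fraction of time with no recovery in progress is exp(-rate * duration).

```python
import math

DAYS_PER_YEAR = 365


def failures_per_day(nodes: int, lifetime_years: float) -> float:
    """Average node failures per day, assuming each node fails once per lifetime."""
    return nodes / (lifetime_years * DAYS_PER_YEAR)


def idle_fraction(rate_per_day: float, recovery_hours: float) -> float:
    """Fraction of time during which no node is recovering.

    Assumption (not from the thread): Poisson failure arrivals and a fixed
    recovery time, so the number of concurrent recoveries is Poisson
    distributed with mean rate * duration.
    """
    mean_concurrent = rate_per_day * recovery_hours / 24.0
    return math.exp(-mean_concurrent)


if __name__ == "__main__":
    rate = failures_per_day(nodes=1000, lifetime_years=3)
    print(f"failures per day: {rate:.2f}")
    # Recovery durations below are placeholders; plug in measured values.
    for hours in (2, 6, 12, 24):
        print(f"recovery {hours:>2} h -> idle fraction {idle_fraction(rate, hours):.2f}")
```

Note that this only says how often the system is idle at a random instant. The command also needs the system to stay idle for as long as it runs, which shrinks the usable window further.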