Water resource management is an important issue for most governmental and private agencies around the globe. Many mathematical and heuristic optimization or simulation techniques have been developed and applied to capture the complexities of the problem; however, most of them suffer from the curse of dimensionality. Q-learning, a popular simulation-based method in Reinforcement Learning (RL), may be an efficient way to cope with practical water resources problems because it is model-free and adaptive in a dynamic system. However, it can struggle in large-scale applications. In this chapter, we introduce a new type-II opposition-based Q-learning technique and apply it to single- and multiple-reservoir problems. The experimental results at the end of the chapter confirm the contribution of the opposition scheme in speeding up the learning process, especially at the early stage of learning, and in making it more robust at the end. These results are promising for future large-scale water resources applications.
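To fix ideas, the sketch below shows the standard tabular Q-learning update together with a simple opposite-action mirror in the spirit of opposition-based learning. This is an illustrative toy, not the chapter's type-II opposition scheme: the action set, reward, and the mirror rule `opposite_action` are hypothetical choices made here for the example (a type-I-style mirror over a discrete release range).

```python
# Hypothetical toy reservoir setting: states are discrete storage levels,
# actions are discrete release decisions. All names here are illustrative.
ACTIONS = [0, 1, 2, 3]  # possible release levels (assumed for this sketch)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    """
    best_next = max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def opposite_action(a):
    """Generic (type-I style) opposite of a discrete action: mirror it
    within the action range. Opposition-based variants evaluate both an
    action and its opposite per step to gather experience faster."""
    return ACTIONS[-1] - a

# Minimal usage: a two-state table initialized to zero, one update.
Q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}
q_update(Q, s=0, a=1, r=1.0, s_next=1)
# With all Q-values initially zero: Q[(0, 1)] = 0.1 * (1.0 + 0.9 * 0 - 0) = 0.1
```

An opposition-based loop would, after observing a transition, also apply an update for `opposite_action(a)` (using the model or a second simulated transition), which is one way such schemes accelerate early learning.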