We consider the adaptive stochastic problem for a system described by a controlled Markov chain with a finite number of states. The novelty of the approach is in using of adaptation technique for optimization of the system with unknown distribution of the cost function. This approach is applicable to the Internet congestion control with active users, motivating our study. In fact, we consider a controlled homogenous finite Markov chain, introduce assumptions of processes independence, stationarity of positive random losses and a bound for their second moments, and regularity. The uncertainty is that the mean loss matrix is undefined. The related convex stochastic optimization problem as well as ideas of Mirror Descent Method give us the ability to design control algorithm and to prove the upper bound for the difference between expectation of time-averaged losses and their theoretical minimum. The main term of this upper bound, i.e. $O(sqrt{(N/T)ln(NK)})$, slightly depends on the state number $K$ which is standing inside the logarithm; here $N$ and $T$ stand for the number of control actions and the time horizon, respectively.