Network analysis is becoming one of the most active research areas in statistics. Significant advances have been made recently on developing theories, methodologies and algorithms for analyzing networks. However, there has been little fundamental study on optimal estimation. In this paper, we establish the optimal rate of convergence for graphon estimation. For the stochastic block model with $k$ clusters, we show that the optimal rate under the mean squared error is $n^{-1}\log k + k^2/n^2$. The minimax upper bound improves the existing results in the literature through a technique of solving a quadratic equation. When $k \leqslant \sqrt{n\log n}$, as the number of clusters $k$ grows, the minimax rate grows slowly, with only a logarithmic order $n^{-1}\log k$. A key step in establishing the lower bound is to construct a novel subset of the parameter space and then apply Fano's lemma, from which we see a clear distinction between the nonparametric graphon estimation problem and classical nonparametric regression, due to the lack of identifiability of the order of nodes in exchangeable random graph models. As an immediate application, we consider nonparametric graphon estimation in a H\"older class with smoothness $\alpha$. When the smoothness $\alpha \geqslant 1$, the optimal rate of convergence is $n^{-1}\log n$, independent of $\alpha$, while for $\alpha \in (0,1)$, the rate is $n^{-2\alpha/(\alpha+1)}$, which is, to our surprise, identical to the classical nonparametric rate.
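
As a brief illustration of the two regimes implied by the rate above (a sketch up to constants, using only the quantities $n$, $k$ and the threshold $\sqrt{n\log n}$ stated in the abstract, not a formal result), one can compare the two terms of the rate directly:
\[
  \frac{k^2}{n^2} \;\gtrsim\; \frac{\log k}{n}
  \quad\Longleftrightarrow\quad
  k^2 \;\gtrsim\; n\log k .
\]
At $k \asymp \sqrt{n\log n}$ we have $k^2 \asymp n\log n$ while $n\log k \asymp \tfrac12\, n\log(n\log n)$, so the two terms are of the same order up to constants; for $k \lesssim \sqrt{n\log n}$ the term $n^{-1}\log k$ dominates, which is why the rate grows only logarithmically in $k$ in that regime, whereas for larger $k$ the clustering term $k^2/n^2$ takes over.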