This paper considers capacity control under customer choice behavior.
The problem is solved by least squares approximate policy iteration.
Constraints on value function approximations are proposed to facilitate learning.
Numerical results show that the proposed approach outperforms existing methods.