A Support Vector Machine (SVM) is a linear classifier that attempts to maximise the margin (i.e. the distance between the decision boundary and the nearest training datum). Although SVMs are not Bayes-efficient, in practice they often generalize well and are particularly useful in conjunction with the kernel trick, which allows the classifier to work in a very high-dimensional (even infinite-dimensional) feature space. The standard tutorial on SVMs is Burges' A Tutorial on Support Vector Machines for Pattern Recognition. Shivani Agarwal has a very good introductory presentation on SVMs.
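To make the margin described above concrete, here is a minimal sketch in Python. It assumes a linear decision boundary given by a weight vector w and bias b, with labels of +1 or -1; the function name geometric_margin and the toy data are made up for illustration and do not come from any library.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labelled point to the plane w.x + b = 0.

    A positive result means every point is on the correct side; an SVM looks
    for the (w, b) that makes this value as large as possible.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical toy data: two positive points and one negative point.
points = [(1.0, 0.0), (0.0, 1.0), (3.0, 0.0)]
labels = [1, -1, 1]

margin = geometric_margin((1.0, -1.0), 0.0, points, labels)
print(margin)  # 1/sqrt(2), set by the two points closest to the boundary
```

The third point lies further from the boundary and does not affect the result, which is why only the nearest points (the support vectors) matter to the classifier.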
Training an SVM requires solving a very large quadratic programming (QP) problem (finding a Lagrange multiplier for each data point), which is often intractable. Sequential Minimal Optimization (SMO) approaches the solution by optimizing two Lagrange multipliers at each step while keeping the others fixed, improving the objective step by step in a hill-climbing fashion. Because there are only two variables, each small QP can be solved analytically, making the inner loop of the training program very fast.
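The two-multiplier step can be sketched as follows. This is a simplified illustration of the analytic update as it is commonly presented, not Platt's reference implementation: the function name smo_pair_update and its parameters are assumptions, the error terms e_i and e_j are taken to be f(x) - y for the current classifier, and the heuristics for choosing which pair to update, as well as the threshold update, are left out.

```python
def smo_pair_update(a_i, a_j, y_i, y_j, e_i, e_j, k_ii, k_jj, k_ij, c):
    """Jointly optimise two Lagrange multipliers, all others held fixed.

    a_i, a_j : current multipliers          y_i, y_j : labels (+1 or -1)
    e_i, e_j : current errors f(x) - y      k_*      : kernel values
    c        : upper bound of the box constraint 0 <= a <= c
    """
    # The constraint sum(a * y) = const confines the pair to a line segment.
    if y_i != y_j:
        low, high = max(0.0, a_j - a_i), min(c, c + a_j - a_i)
    else:
        low, high = max(0.0, a_i + a_j - c), min(c, a_i + a_j)

    # Curvature of the objective along that segment.
    eta = k_ii + k_jj - 2.0 * k_ij
    if eta <= 0.0 or low >= high:
        return a_i, a_j  # simplification: skip degenerate pairs

    # Unconstrained optimum for a_j, then clip it back onto the segment.
    a_j_new = a_j + y_j * (e_i - e_j) / eta
    a_j_new = min(high, max(low, a_j_new))

    # Move a_i by the matching amount so the equality constraint still holds.
    a_i_new = a_i + y_i * y_j * (a_j - a_j_new)
    return a_i_new, a_j_new

# Hypothetical example: points (1, 0) and (0, 1) with a linear kernel,
# labels +1 and -1, all multipliers starting at zero (so f(x) = 0).
new_i, new_j = smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 1.0, 0.0, 10.0)
print(new_i, new_j)  # 1.0 1.0
```

No numerical QP solver appears anywhere in the step: a handful of arithmetic operations and one clip are enough, which is why the inner loop is cheap.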
SMO is competitive with other SVM training methods such as Projected Conjugate Gradient "chunking", and is also easier to implement. SMO is the work of John Platt at Microsoft Research, and he maintains a reference page on the topic here.
Finally, there is also an SVM mailing list.