Thanks to precomputation, the generic implementation is faster at fixed-base multiplication. Don't even define a .mult_base placeholder for sandy2x. Avoid the two indirections for fixed-base multiplication until another implementation possibly exists.
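
A minimal sketch of the resulting shape, not the actual code touched by this change. All names here (scalarmult_impl, active_impl, generic_mult_base, the toy integer arithmetic) are hypothetical stand-ins chosen for illustration. Multiplication of unsigned integers replaces real curve arithmetic, so only the dispatch pattern is meaningful: the variable-base operation stays behind a function pointer, while the fixed-base operation calls the generic, table-driven routine directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dispatch table. Only the variable-base operation is
 * pluggable; there is deliberately no .mult_base member. */
typedef struct scalarmult_impl {
    int (*mult)(unsigned *q, unsigned n, unsigned p);
} scalarmult_impl;

/* Toy stand-in for an optimized variable-base backend such as sandy2x. */
static int sandy2x_mult(unsigned *q, unsigned n, unsigned p)
{
    *q = n * p;
    return 0;
}

/* The backend fills in .mult only; no placeholder is needed. */
static const scalarmult_impl sandy2x_impl = { sandy2x_mult };
static const scalarmult_impl *active_impl = &sandy2x_impl;

/* Toy stand-in for the generic fixed-base routine: multiples of the
 * base are precomputed, and the scalar is consumed two bits at a time. */
enum { BASE = 9 };
static const unsigned base_table[4] = { 0 * BASE, 1 * BASE, 2 * BASE, 3 * BASE };

static int generic_mult_base(unsigned *q, unsigned n)
{
    unsigned acc = 0;
    int i;

    for (i = 30; i >= 0; i -= 2) {
        /* Shift the accumulator by one 2-bit digit, then add the
         * precomputed multiple selected by the current digit. */
        acc = acc * 4 + base_table[(n >> i) & 3];
    }
    *q = acc;
    return 0;
}

/* Variable base: pointer to the implementation, then pointer to the
 * function, i.e. two indirections. */
int scalarmult(unsigned *q, unsigned n, unsigned p)
{
    assert(active_impl->mult != NULL);
    return active_impl->mult(q, n, p);
}

/* Fixed base: a direct call to the generic routine, no indirection. */
int scalarmult_base(unsigned *q, unsigned n)
{
    return generic_mult_base(q, n);
}
```

If another backend ever provides a faster fixed-base routine, a .mult_base member and the matching indirect call could be reintroduced at that point.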