Preview scored under 10%, while Claude 3.5 scored below 25% on the ARC-AGI benchmark - the best test to determine AGI ...