It is not the case that Benchmark performance on tasks like retrieval or translation measures task-completion, not semantic understanding in Frege's sense of grasping truth-conditions.
?Set your confidence on the premises below to see your aggregate.
No one has weighed in yet. Be the first to share reasons for or against this statement.