There is no reason to have a multidimensional iota for codegen.
This should be canonicalized to a one-dimensional iota followed
by a broadcast. Reducing iota to a single dimension plus a broadcast
substantially simplifies implementing iota operations.
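
As an illustration of why the two forms are equivalent, here is a minimal
pure-Python sketch using nested lists. It is not the MLIR pattern itself; the
helper names (`iota_nd`, `broadcast_in_dim`, `canonical_iota`) are hypothetical
and exist only for this example.

```python
def iota_nd(shape, dim):
    """Reference multidimensional iota: each element equals its index along `dim`."""
    def build(axis, index):
        if axis == len(shape):
            return index[dim]
        return [build(axis + 1, index + (i,)) for i in range(shape[axis])]
    return build(0, ())


def broadcast_in_dim(line, shape, dim):
    """Broadcast the 1-D list `line` to `shape`, mapping its only axis to `dim`."""
    assert len(line) == shape[dim]

    def build(axis, picked):
        if axis == len(shape):
            return picked
        if axis == dim:
            # This axis is the one the 1-D input varies along.
            return [build(axis + 1, line[i]) for i in range(shape[axis])]
        # Every other axis just repeats the same values.
        return [build(axis + 1, picked) for _ in range(shape[axis])]
    return build(0, None)


def canonical_iota(shape, dim):
    """Canonical form: a one-dimensional iota followed by a broadcast."""
    line = list(range(shape[dim]))  # the 1-D iota
    return broadcast_in_dim(line, shape, dim)


if __name__ == "__main__":
    print(canonical_iota((2, 3), 1))  # [[0, 1, 2], [0, 1, 2]]
    print(canonical_iota((2, 3), 0))  # [[0, 0, 0], [1, 1, 1]]
```

With this split, only the one-dimensional case has to generate index values;
the broadcast step is generic and knows nothing about iota.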
PiperOrigin-RevId: 320095470
Also add a localized `mlir-hlo-opt` binary for testing
tensorflow/compiler/mlir/hlo/... ; this directory is intended to be self-contained
and to depend only on MLIR.
PiperOrigin-RevId: 319878984