We should upcast F16 to F32 to prevent precision loss.
For example, sinh(-9) previously evaluated to -4042 instead of -4052.
This allows enabling the MLIR-generated kernel for the F16 type.
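A minimal sketch of why the upcast helps, written in Python rather than as the kernel's actual lowering. The helper names `to_f16`, `to_f32`, and `sinh_via_f32` are hypothetical, and half/single precision are emulated with the standard library's `struct` formats. It does not reproduce the earlier -4042 result; it only illustrates that evaluating in a wider type and rounding once at the end gives the expected F16 value.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round x to the nearest IEEE half-precision value (emulated)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x: float) -> float:
    """Round x to the nearest IEEE single-precision value (emulated)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sinh_via_f32(x: float) -> float:
    # Upcast the F16 input to F32 before evaluating.
    wide = to_f32(to_f16(x))
    # math.sinh works in double precision; rounding its result to F32
    # stands in for an F32 kernel (an assumption of this sketch).
    wide_result = to_f32(math.sinh(wide))
    # Round back to F16 only once, at the very end.
    return to_f16(wide_result)


result = sinh_via_f32(-9.0)
print(result)
```

Representable F16 values near 4052 are 2 apart, so rounding the wide result once lands on the closest representable value instead of accumulating error across intermediate F16 steps.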
PiperOrigin-RevId: 377901896
Replace deprecated methods in lhlo_fuse_linalg.cc. The new structured op interface has been introduced in https://reviews.llvm.org/D103394.
PiperOrigin-RevId: 377875452
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49970
1. Add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim.
2. Add a flag `convert-to-lmhlo-only` to separate the following two cases:
- hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo
counterparts without applying any optimization (e.g. eliding
buffer copies). Buffer optimization is not easy in the dynamic
shape world, especially when control flow is involved, so we
leave this to another dedicated pass.
- hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo
ops (e.g. reshape) to the memref dialect directly and lowers the
others to their lmhlo counterparts.
Copybara import of the project:
--
562bd65a368f6194405c4ae6900e3b4388a5ec03 by Wenyi Zhao <reyizero@gmail.com>:
[MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim
1. Add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim.
2. Add a flag `convert-to-lmhlo-only` to separate the following two cases:
- hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo
counterparts without applying any optimization (e.g. eliding
buffer copies). Buffer optimization is not easy in the dynamic
shape world, especially when control flow is involved, so we
leave this to another dedicated pass.
- hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo
ops (e.g. reshape) to the memref dialect directly and lowers the
others to their lmhlo counterparts.
PiperOrigin-RevId: 377603395
Adds import/export/verifier support as well.
Also makes `channel_handle` uniform across mhlo.all_reduce and mhlo.all_gather.
PiperOrigin-RevId: 377323468
Fix usage of default constructor. Instead, always use the parameterized
constructor and make the maximum supported rank explicit.
PiperOrigin-RevId: 377037155
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49598
This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`.
Copybara import of the project:
--
80eb377af4e02182e1aecc943a41ca5d7d1c2100 by Wenyi Zhao <reyizero@gmail.com>:
[MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion
This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`.
--
ac452fe3dcd591211cd5c59be9189fe2f7153b41 by Wenyi Zhao <reyizero@gmail.com>:
minor fix
--
6b36017f8632a06adbc3e05a62975fa641d0260f by Wenyi Zhao <reyizero@gmail.com>:
minor refine
--
846005cc76d0033112e47825c2e9a97790b6925f by Wenyi Zhao <reyizero@gmail.com>:
minor fix
--
f6a4becaa287d5ca323b2d152a4d0ae053730fd9 by Wenyi Zhao <reyizero@gmail.com>:
fix
--
5555749f60f7fce8f57962860ef65efccf0362ba by Wenyi Zhao <reyizero@gmail.com>:
fix
--
8873b9b6d9315c1199ca9f7c133ecf377ecd2fa6 by Wenyi Zhao <reyizero@gmail.com>:
fix
PiperOrigin-RevId: 376942547
The maximum supported target rank of 5 is sufficient for all operations but
`select`. Make the maximum target rank configurable in the rank specialization.
This reduces the number of generated kernels for operations that don't require
a higher rank.
PiperOrigin-RevId: 376822496
Add the first MLIR-generated kernel that relies on an in-TF lowering. Fusion for
this kernel relies on the generalized rank specialization for operation groups.
PiperOrigin-RevId: 376805435
Replace the previously used `TransformUnrankedHloPass`, which rank-specializes
only one operation at a time. The new generalized rank specialization clusters
compatible operations and rank-specializes them collectively.
PiperOrigin-RevId: 376127752
Take advantage of the fact that scalars are already ranked and that they are
neutral elements of broadcasting. Do not reshape scalars, do not consider them
for broadcasting, and materialize ranked operations on scalars accordingly.
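The neutrality of scalars can be illustrated with a small shape computation. This is a sketch in Python under the assumption of NumPy-style broadcasting rules, not the pass's implementation; `broadcast_shape` and `broadcast_shape_skipping_scalars` are hypothetical helpers, and shapes are plain tuples with `()` standing for a scalar.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def broadcast_shape(shapes: Sequence[Shape]) -> Shape:
    """Compute the broadcast result shape (NumPy-style rules assumed)."""
    rank = max((len(s) for s in shapes), default=0)
    result = [1] * rank
    for shape in shapes:
        # Left-pad lower-rank shapes with 1s so all shapes align on the right.
        padded = (1,) * (rank - len(shape)) + tuple(shape)
        for i, dim in enumerate(padded):
            if dim == 1:
                continue  # Size-1 dimensions never constrain the result.
            if result[i] not in (1, dim):
                raise ValueError(f"incompatible shapes: {list(shapes)}")
            result[i] = dim
    return tuple(result)


def broadcast_shape_skipping_scalars(shapes: Sequence[Shape]) -> Shape:
    # Scalars (rank 0) are dropped before the shape computation.
    non_scalars = [s for s in shapes if len(s) != 0]
    return broadcast_shape(non_scalars)


operands = [(4, 1, 3), (), (2, 3), ()]
with_scalars = broadcast_shape(operands)
without_scalars = broadcast_shape_skipping_scalars(operands)
print(with_scalars, without_scalars)
```

Both calls yield the same shape, which is why scalar operands can be left out of the broadcasting logic and passed through unchanged. When every operand is a scalar, the result is the empty shape `()`.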
PiperOrigin-RevId: 375968371
Rank specialization cases can be applied to all argument tensors of smaller
ranks than the expected maximum rank. This is crucial if all operands are
effectively scalars and the maximum reduced rank is 0.
PiperOrigin-RevId: 375712020