285 Commits

Author SHA1 Message Date
A. Unique TensorFlower a6b8882739 Integrate LLVM at llvm/llvm-project@b650778dc4
Updates LLVM usage to match
[b650778dc4ac](https://github.com/llvm/llvm-project/commit/b650778dc4ac)

PiperOrigin-RevId: 380565709
2021-06-21 06:40:22 -07:00
A. Unique TensorFlower d4a7901284 Integrate LLVM at llvm/llvm-project@366df11a35
Updates LLVM usage to match
[366df11a3539](https://github.com/llvm/llvm-project/commit/366df11a3539)

PiperOrigin-RevId: 380081103
2021-06-17 17:29:07 -07:00
A. Unique TensorFlower 470ac45f45 [MLIR][HLO] Remove unused pass `TransformUnrankedHloPass`
The pass was replaced by the new generalized rank specialization and the two
passes `mhlo-rank-specialization-cluster` and `mhlo-rank-specialization-to-scf`.

PiperOrigin-RevId: 379935562
2021-06-17 05:20:49 -07:00
Adrian Kuegel 376da8592f Add MLIR generated SignOp GPU kernel for complex types.
PiperOrigin-RevId: 379924456
2021-06-17 03:56:58 -07:00
Adrian Kuegel 73ed8cbf82 Add MLIR generated NegOp GPU kernel for complex types.
PiperOrigin-RevId: 379905236
2021-06-17 01:30:51 -07:00
Mehdi Amini 8c8e81cb69 Fix pass definition to inherit from the TableGen generated base class (NFC)
PiperOrigin-RevId: 379860210
2021-06-16 19:05:11 -07:00
Wenyi Zhao 88cc0c6c46 PR #50271: [MLIR][DISC] Bufferize GatherOp and DynamicGatherOp
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50271

support hlo-to-lhlo conversion for GatherOp and DynamicGatherOp
Copybara import of the project:

--
117a1b1bcaac7ecc5224b02863eede5c1b9618fe by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Bufferize GatherOp and DynamicGatherOp

PiperOrigin-RevId: 379801972
2021-06-16 13:47:56 -07:00
Wenyi Zhao 34dc5f2a79 PR #50020: [MLIR][DISC] support fusion on buffer
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50020

This pass implements the logic to group kLoop/kInput fusion patterns at
the buffer level. Doing so avoids the headaches of handling `shape-only`
consumers specially (e.g. memref.dim, shape.shapeOf), since shapes are
already resolved in the buffer world. It may be better to move this pass
to the tensor level once more shape inference/constraint infrastructure
is ready at the mhlo level.
Copybara import of the project:

--
e31f8344b59aa9860097197585215ea1689b8ff4 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] support fusion on buffer

This pass implements the logic to group kLoop/kInput fusion patterns at
the buffer level. Doing so avoids the headaches of handling `shape-only`
consumers specially (e.g. memref.dim, shape.shapeOf), since shapes are
already resolved in the buffer world. It may be better to move this pass
to the tensor level once more shape inference/constraint infrastructure
is ready at the mhlo level.

--
35f2eb2791241b0ab5db1ddcaf1b4006278ddccf by Wenyi Zhao <reyizero@gmail.com>:

fix

--
923c8d61f7fe00a2a0df22d5be396508f0667964 by Wenyi Zhao <reyizero@gmail.com>:

fix sanity check failure

PiperOrigin-RevId: 379743424
2021-06-16 09:51:29 -07:00
A. Unique TensorFlower 82696f8598 [MLIR][HLO] Annotate `mhlo.clamp` and `mhlo.select` as element-wise broadcasting
The operations allow for a limited form of broadcasting which allows some
operands to be scalars. As such they are neither strictly `Elementwise`, nor
`Broadcasting`. They do fulfill the requirements for `BroadcastingElementwise`
though.

PiperOrigin-RevId: 379719961
2021-06-16 07:59:26 -07:00
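The limited broadcasting described for `mhlo.clamp` in the commit above can be modelled in a few lines. This is a minimal sketch over flat Python lists, not the MLIR implementation: the function name `clamp` and the helper `expand` are illustrative, and the only rule assumed is the one the message states, namely that some operands may be scalars while the rest match the operand's shape.

```python
from numbers import Number


def clamp(lo, operand, hi):
    """Clamp each element of `operand` to the range [lo, hi].

    `lo` and `hi` may each be a scalar or a list with the same length as
    `operand`. No other form of broadcasting is accepted.
    """

    def expand(bound):
        # A scalar bound is repeated for every element of the operand.
        if isinstance(bound, Number):
            return [bound] * len(operand)
        # A non-scalar bound must match the operand's shape exactly.
        if len(bound) != len(operand):
            raise ValueError("bound must be a scalar or match the operand's shape")
        return list(bound)

    lows, highs = expand(lo), expand(hi)
    return [min(max(x, l), h) for x, l, h in zip(operand, lows, highs)]
```

Because a scalar bound is accepted, the op is not strictly element-wise; because a mismatched non-scalar bound is rejected, it is not general broadcasting either, which is the gap the `BroadcastingElementwise` annotation covers.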
Feiwen 3afbe312f8 PR #49919: [MLIR][DISC] pattern conversion from tf2mhlo: ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49919

We are porting our MLIR-based dynamic shape compiler to the tf community (from op definitions and patterns to optimization passes, etc.).
This is the 5th PR about tf2mhlo pattern conversion, which includes ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic.
The remaining pattern conversions we will add:
- ConvertSqueezeOpxxx
- ConvertStridedSliceOpxxx
- ConvertPrintOp
Copybara import of the project:

--
21b3c3eb05b12956bcdb8b98cc54d9371dbf034d by azazhu <azazhu@gmail.com>:

[MLIR][DISC] pattern conversion from tf2mhlo: ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic

--
634630a4e2e426357290650bd579b35efecab5b3 by azazhu <azazhu@gmail.com>:

[MLIR][DISC] refine ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic

--
39a2bedd6dafb369ae960c5197b7a352bfdfbc80 by azazhu <azazhu@gmail.com>:

add RealDynamicSliceOp's canonicalize and fix CI

--
a1c38dd0963d602ed4812da0d77a096a95920ddb by azazhu <azazhu@gmail.com>:

fix CI for ConvertUnpackOpDynamic

--
5a8b4eb389ed6dc554104356c37f2f1550802b8c by azazhu <azazhu@gmail.com>:

fix typo in ConvertSigmoidGradOpDynamic

PiperOrigin-RevId: 379521079
2021-06-15 10:33:32 -07:00
Chris Jones 5fbdac34a9 [XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect.
PiperOrigin-RevId: 379455720
2021-06-15 03:55:19 -07:00
Adrian Kuegel 399dae666d Add MLIR generated ExpOp GPU kernel for complex types.
We lower lmhlo::ExpOp to mlir::complex::ExpOp for complex types.

PiperOrigin-RevId: 379432147
2021-06-15 00:45:45 -07:00
Wenyi Zhao 7f94bd923b PR #50236: [MLIR][DISC] Bufferize TransposeOp and ConcatenateOp
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50236

support hlo-to-lhlo conversion for TransposeOp and ConcatenateOp
Copybara import of the project:

--
62860e717f2a14fbd3ddfb634aa6ff132d245a72 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Bufferize TransposeOp and ConcatenateOp

--
ce2ff57c1edee1172cd2f36346cc0b34ec1c7467 by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 379330954
2021-06-14 12:37:45 -07:00
Wenyi Zhao 23ebbb28d1 PR #50191: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50191

DISC is an e2e flow that includes both a compiler side and a runtime side. On
the runtime side, we have different target environments (e.g. tensorflow,
pytorch, or sometimes even a standalone binary). In order to simplify
the design of the compiler side, we designed a Runtime Abstraction Layer
(RAL) to separate the compiler side from the runtime side. Thus the compiler
side only needs to target RAL itself, and it is the responsibility of RAL
to handle the differences between target environments.

One of the most important functions of RAL is to manage stateful
resources. To this end, it provides a context object and hides all
stateful operations behind this context, so the compiler side itself
doesn't need to care about resource initialization. For example, a
kernel must be loaded before it can be launched on GPU. However, the
loading operation should be performed only once during the whole lifetime
of the context in order to achieve the best performance. Based on the
initialization-free interfaces provided by RAL, the compiler side can focus
on its core optimization logic and let RAL manage the resource
status.

The context mentioned above is passed as a parameter to the entry
function, and all RAL APIs should always take the context as their first
argument. This CR also provides a pass to help ensure this property.
The pass rewrites the entry function to make sure its first argument
is the context. For the entry function, the pass also rewrites its inputs
and outputs. Concretely, all the original inputs and outputs of the
entry function are received from and sent to RAL through a sequence of
RAL API calls, respectively. The motivation behind this is to hide the
implementation details of I/O. This design may also enable
partial execution of the compiled module when only some of the inputs are
ready.
Copybara import of the project:

--
c4f20a89aed71181e75bcc5265723b88bde23240 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect

DISC is an e2e flow that includes both a compiler side and a runtime side. On
the runtime side, we have different target environments (e.g. tensorflow,
pytorch, or sometimes even a standalone binary). In order to simplify
the design of the compiler side, we designed a Runtime Abstraction Layer
(RAL) to separate the compiler side from the runtime side. Thus the compiler
side only needs to target RAL itself, and it is the responsibility of RAL
to handle the differences between target environments.

One of the most important functions of RAL is to manage stateful
resources. To this end, it provides a context object and hides all
stateful operations behind this context, so the compiler side itself
doesn't need to care about resource initialization. For example, a
kernel must be loaded before it can be launched on GPU. However, the
loading operation should be performed only once during the whole lifetime
of the context in order to achieve the best performance. Based on the
initialization-free interfaces provided by RAL, the compiler side can focus
on its core optimization logic and let RAL manage the resource
status.

The context mentioned above is passed as a parameter to the entry
function, and all RAL APIs should always take the context as their first
argument. This CR also provides a pass to help ensure this property.
The pass rewrites the entry function to make sure its first argument
is the context. For the entry function, the pass also rewrites its inputs
and outputs. Concretely, all the original inputs and outputs of the
entry function are received from and sent to RAL through a sequence of
RAL API calls, respectively. The motivation behind this is to hide the
implementation details of I/O. This design may also enable
partial execution of the compiled module when only some of the inputs are
ready.

--
1991d4f80ab6087943956e1c0fec4940a22ab08d by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 379317586
2021-06-14 11:27:43 -07:00
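The RAL design described in PR #50191 above (a context that owns stateful resources, is passed first to every API call, and mediates the entry function's inputs and outputs) can be sketched roughly as follows. Every name here (`RalContext`, `ral_launch`, `ral_recv_input`, `ral_send_output`, `load_kernel`, `entry`) is hypothetical and chosen for illustration; the real dialect and runtime API are not shown in the commit message.

```python
class RalContext:
    """Hypothetical stand-in for the RAL context: owns all stateful resources."""

    def __init__(self, inputs):
        self.inputs = list(inputs)
        self.outputs = {}
        self._loaded_kernels = {}
        self.load_count = 0  # Tracks how often the expensive load ran.

    def get_kernel(self, name, loader):
        # Load each kernel at most once per context lifetime.
        if name not in self._loaded_kernels:
            self._loaded_kernels[name] = loader(name)
            self.load_count += 1
        return self._loaded_kernels[name]


def load_kernel(name):
    # Placeholder for an expensive load (e.g. a GPU module load).
    kernels = {"add": lambda a, b: a + b}
    return kernels[name]


# Hypothetical RAL APIs: the context is always the first argument.
def ral_launch(ctx, kernel_name, *args):
    return ctx.get_kernel(kernel_name, load_kernel)(*args)


def ral_recv_input(ctx, index):
    return ctx.inputs[index]


def ral_send_output(ctx, index, value):
    ctx.outputs[index] = value


def entry(ctx):
    """Rewritten entry function: takes only the context, does I/O through RAL."""
    a = ral_recv_input(ctx, 0)
    b = ral_recv_input(ctx, 1)
    ral_send_output(ctx, 0, ral_launch(ctx, "add", a, b))
```

Calling `entry` twice on the same context launches the kernel twice but loads it once, which is the initialization-free behaviour the message describes; the compiled code never checks whether a resource is ready.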
Rahul Joshi a6011d0279 [HLO] Add AllReduceScatter to MHLO and LMHLO dialects.
PiperOrigin-RevId: 379296198
2021-06-14 09:37:07 -07:00
Wenyi Zhao 8388303fd2 PR #50211: [MLIR][DISC] Bufferize RealDynamicSliceOp and ReduceOp
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50211

support hlo-to-lhlo conversion for RealDynamicSliceOp and ReduceOp
Copybara import of the project:

--
c417b336670a1fc256f7026dfe8080e46d13d79a by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Bufferize RealDynamicSliceOp and ReduceOp

PiperOrigin-RevId: 378972113
2021-06-11 16:33:15 -07:00
Jacques Pienaar 95ba03534f Allow variadic operands/result in MHLO while
This just adds support for it in the op, but keeps the production/uses as is (e.g., single tensor or tuple), matching what XLA export requires. A follow-up would be to add a pass that retuples on export, after which the canonical form could be changed. Given control flow via regions and multi-result operations, tupling does not add representational power, and all the get_tuple_element ops obscure the computation.

The old form allowed a single tensor or tuple. The new form allows a variadic number of tensors or tuples because tuples may be nested, so the input could have (Tensor<..>, Tuple<Tensor<...>, Tuple<...>, ...>, Tensor<...>), and HLO_Tensor doesn't allow Tuples.

PiperOrigin-RevId: 378934388
2021-06-11 13:08:28 -07:00
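The relationship between the nested-tuple form and a flat variadic operand list, as discussed in the `mhlo.while` commit above, can be illustrated with a small sketch. Python tuples stand in for HLO tuples and strings for tensors; `flatten` and `retuple` are hypothetical helpers, not the export pass the message proposes.

```python
def flatten(value):
    """Return the leaves of a possibly nested tuple as a flat list."""
    if isinstance(value, tuple):
        flat = []
        for element in value:
            flat.extend(flatten(element))
        return flat
    return [value]


def retuple(flat, structure):
    """Rebuild the nesting of `structure` using the leaves in `flat`."""
    leaves = iter(flat)

    def build(node):
        if isinstance(node, tuple):
            return tuple(build(child) for child in node)
        return next(leaves)

    return build(structure)
```

Since `retuple(flatten(x), x)` reproduces `x`, the flat form loses no information, which matches the message's point that tupling adds no representational power.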
Wenyi Zhao 6660234d80 PR #50100: [MLIR][DISC] Bufferize DynamicIotaOp and DynamicPadOp
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50100

support hlo-to-lhlo conversion for DynamicIotaOp and DynamicPadOp
Copybara import of the project:

--
c3aae94954e35d3f8ad265f619ef9765665a5115 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Bufferize DynamicIotaOp and DynamicPadOp

--
adc6996d70b804d61310d56a33fac975d70c8636 by Wenyi Zhao <reyizero@gmail.com>:

minor

PiperOrigin-RevId: 378733284
2021-06-10 14:20:45 -07:00
A. Unique TensorFlower 14093b7906 [XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect.
PiperOrigin-RevId: 378681070
2021-06-10 10:27:22 -07:00
Chris Jones 968226b9d7 [XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect.
PiperOrigin-RevId: 378640706
2021-06-10 06:54:42 -07:00
Adrian Kuegel b6d8160611 Add Broadcasting and BroadcastingElementwise traits to ConstantLikeOp.
This allows such ops to be included in rank specialization clusters.

PiperOrigin-RevId: 378380915
2021-06-09 05:09:26 -07:00
A. Unique TensorFlower c47869f931 [MLIR][HLO] Rename `move-up-dynamic-broadcasts-for-fusion` to `broadcast-propagation`
PiperOrigin-RevId: 378102608
2021-06-08 01:51:10 -07:00
Wenyi Zhao ade873a5e0 PR #49970: [MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49970

1. Add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim.

2. Add a flag `convert-to-lmhlo-only` to separate the following two cases:
   - hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo
     counterparts and does not apply any optimization (e.g. eliding
     buffer copies). Buffer optimization is not easy in the dynamic
     shape world, especially when control flow is involved, so we
     leave this to another dedicated pass.

   - hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo
     ops (e.g. reshape) to the memref dialect directly and lowers others
     to their lmhlo counterparts.
Copybara import of the project:

--
562bd65a368f6194405c4ae6900e3b4388a5ec03 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim

1. Add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim.

2. Add a flag `convert-to-lmhlo-only` to separate the following two cases:
   - hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo
     counterparts and does not apply any optimization (e.g. eliding
     buffer copies). Buffer optimization is not easy in the dynamic
     shape world, especially when control flow is involved, so we
     leave this to another dedicated pass.

   - hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo
     ops (e.g. reshape) to the memref dialect directly and lowers others
     to their lmhlo counterparts.

PiperOrigin-RevId: 377603395
2021-06-04 15:36:03 -07:00
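The two modes selected by `convert-to-lmhlo-only` in PR #49970 above amount to a per-op routing decision, which might be sketched like this. The set `METADATA_ONLY_OPS` and the function `target_dialect` are assumptions made for illustration; the commit message names only reshape as an example of a metadata-only op, and the returned strings are dialect labels, not real op names.

```python
# Assumption: which ops count as "metadata-only" beyond the reshape example.
METADATA_ONLY_OPS = {"mhlo.reshape", "mhlo.dynamic_reshape"}


def target_dialect(op_name, convert_to_lmhlo_only):
    """Return the dialect an mhlo op would be lowered to under each mode."""
    dialect, _, _ = op_name.partition(".")
    if dialect != "mhlo":
        raise ValueError(f"expected an mhlo op, got {op_name!r}")
    # hlo-to-lhlo-or-memref-directly: metadata-only ops skip lmhlo.
    if not convert_to_lmhlo_only and op_name in METADATA_ONLY_OPS:
        return "memref"
    # hlo-to-lhlo only: every op goes to its lmhlo counterpart.
    return "lmhlo"
```

With the flag set, the result is always `lmhlo`, leaving any buffer optimization to a later dedicated pass as the message describes.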
A. Unique TensorFlower db05388a3c Integrate LLVM at llvm/llvm-project@da3ed58b97
Updates LLVM usage to match
[da3ed58b97c1](https://github.com/llvm/llvm-project/commit/da3ed58b97c1)

PiperOrigin-RevId: 377432380
2021-06-03 20:45:18 -07:00
A. Unique TensorFlower aba16adfa5 Add `mhlo.all_gather` op to MHLO dialect.
Adds import/export/verifier support as well.
Also makes `channel_handle` uniform across mhlo.all_reduce and mhlo.all_gather.

PiperOrigin-RevId: 377323468
2021-06-03 10:45:29 -07:00
A. Unique TensorFlower fe42a08fc9 Use channel_handle for ChannelHandles in MHLO ops. This makes the naming of these properties consistent across these ops.
PiperOrigin-RevId: 377309518
2021-06-03 09:49:47 -07:00
A. Unique TensorFlower 75a1c450ea [MLIR][KernelGen] Fix Windows build failure
Fix usage of default constructor. Instead, always use the parameterized
constructor and make the maximum supported rank explicit.

PiperOrigin-RevId: 377037155
2021-06-02 05:34:44 -07:00
wyzhao 968d4b8709 PR #49598: [MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49598

This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`
Copybara import of the project:

--
80eb377af4e02182e1aecc943a41ca5d7d1c2100 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion

This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`.

--
ac452fe3dcd591211cd5c59be9189fe2f7153b41 by Wenyi Zhao <reyizero@gmail.com>:

minor fix

--
6b36017f8632a06adbc3e05a62975fa641d0260f by Wenyi Zhao <reyizero@gmail.com>:

minor refine

--
846005cc76d0033112e47825c2e9a97790b6925f by Wenyi Zhao <reyizero@gmail.com>:

minor fix

--
f6a4becaa287d5ca323b2d152a4d0ae053730fd9 by Wenyi Zhao <reyizero@gmail.com>:

fix

--
5555749f60f7fce8f57962860ef65efccf0362ba by Wenyi Zhao <reyizero@gmail.com>:

fix

--
8873b9b6d9315c1199ca9f7c133ecf377ecd2fa6 by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 376942547
2021-06-01 16:27:56 -07:00
A. Unique TensorFlower d1828625ab [MLIR][KernelGen] Make maximum supported rank in rank specialization configurable
The maximum supported target rank of 5 is sufficient for all operations but
`select`. Make the maximum target rank configurable in the rank specialization.
This reduces the number of generated kernels for operations that don't require
it.

PiperOrigin-RevId: 376822496
2021-06-01 06:54:31 -07:00
A. Unique TensorFlower b32f885ad7 [MLIR][KernelGen] Enable cluster rank specialization
Replace the previously used `TransformUnrankedHloPass` which rank-specializes
only one operation at a time. The new generalized rank specialization clusters
compatible operations and rank-specializes them collectively.

PiperOrigin-RevId: 376127752
2021-05-27 02:44:31 -07:00
Mehdi Amini af01c08ce6 Allow tuple as results of mhlo.custom_call op (NFC)
PiperOrigin-RevId: 376009518
2021-05-26 12:56:10 -07:00
Adrian Kuegel a847109ac7 Support complex types when converting HLO multiply op.
We can lower it to the MulOp in the complex dialect.

PiperOrigin-RevId: 375675079
2021-05-25 04:35:34 -07:00
Adrian Kuegel 5816920258 Support complex types when converting HLO divide op.
We can lower it to the DivOp in the complex dialect.
Also add tests to hlo-legalize-to-linalg.mlir for CompareOp lowering of complex
types. These were forgotten in a previous commit.

PiperOrigin-RevId: 375669125
2021-05-25 03:43:46 -07:00
Adrian Kuegel 758ae7da6b Support complex types when converting HLO compare op (EQ/NE).
We can lower it to the EqualOp / NotEqualOp in the complex dialect.

PiperOrigin-RevId: 375655092
2021-05-25 01:54:27 -07:00
wyzhao b93e54d8a4 PR #49454: [MLIR][DISC] Upgrade to use the new `reifyReturnTypeShapes` interface.
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49454

The new interface is safer to use during dialect conversion
(e.g. converting from tensor world to buffer world).
Copybara import of the project:

--
a6968072d59bec3c3bbaef0121d297e807c37c91 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Upgrade to use the new `reifyReturnTypeShapes` interface.

The new interface is safer to use during dialect conversion
(e.g. converting from tensor world to buffer world).

--
55e7c6b7f2f99b99e226645a57e2433fae3e90ed by Wenyi Zhao <reyizero@gmail.com>:

minor fix

PiperOrigin-RevId: 375500273
2021-05-24 10:11:55 -07:00
Feiwen a7884196f5 PR #49228: [MLIR][DISC] porting dynamic shape related OPs to mhlo and lmhlo dialect
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49228

We are porting our MLIR-based dynamic shape compiler to the tf community (from op definitions and patterns to optimization passes, etc.).
This is the first PR, which includes some dynamic shape op definitions in the mhlo and lmhlo dialects.
For mhlo dialect, we add:
- HLO_RealDynamicSliceOp
- HLO_DynamicPadOp
- HLO_DynamicGatherOp
- HLO_DynamicConvOp

For lmhlo dialect, we add:
- LHLO_RealDynamicSliceOp
- LHLO_DynamicBroadcastInDimOp
- LHLO_DynamicGatherOp
- LHLO_DynamicPadOp
- LHLO_DynamicBitcastOp
- LHLO_DynamicConvOp
- LHLO_DynamicIotaOp
- LHLO_DynamicReshapeOp
- LHLO_DotGeneralOp
- LHLO_BitcastOp

Remaining ops to add:
* We will send a separate PR containing LHLO_DynamicWhileOp and LHLO_DynamicCaseOp for control flow.
* We will add a separate dedicated dialect like mhlo_ral, which will include D2HOp/H2DOp/DebugPrintOp/TopKOp, etc.

Previous discussions: [RFC](https://groups.google.com/a/tensorflow.org/g/mlir/c/_X48poNcbDI/m/jCC8BWIICQAJ), [discussion_1](https://llvm.discourse.group/t/updates-on-mlir-based-dynamic-shape-compiler/2384), [Recording of meeting](https://drive.google.com/file/d/1_uEISlV5MUWdG9faKAdKlCWnPtGjRC-D/view?usp=sharing).
Copybara import of the project:

--
e22d9e61106e00a1a1c6f368cc4a03e3bd1f414c by azazhu <azazhu@gmail.com>:

[DISC]fea: porting mhlo and lmhlo OPs

--
9ec3e76290da07cbd53d7da5fa86ff67179441a1 by azazhu <azazhu@gmail.com>:

[DISC][MLIR] 1. add summary and description for dynamic OPs in mhlo and lmhlo; 2. rm InferOutputTypes; 3. add verify for RealDynamicSliceOp and DynamicPadOp

--
0d68cd135555fd935991c12456b21329e628f23f by azazhu <azazhu@gmail.com>:

[DISC][MLIR] 1.remove D2H,H2D and DebugPrint Ops from mhlo/lmhlo dialect; 2. add type constraint to DynamicPadOp and RealDynamicSliceOp; 3.refine lmhlo type constraint; 4.rename RealDynamicSliceOp as name conflict.

--
698762a77d60f6a844cb1ab3f32740d4ef3c5843 by azazhu <azazhu@gmail.com>:

[DISC][MLIR] 1. replace dyn_cast to cast 2. refine code

PiperOrigin-RevId: 375022260
2021-05-20 23:16:47 -07:00
Rahul Joshi 41f663ce47 [HLO] Adopt custom syntax for convolution dimensions and window attributes (HLO)
PiperOrigin-RevId: 374923250
2021-05-20 12:13:50 -07:00
Rahul Joshi fc88cf1ff4 [HLO] Adopt custom syntax for convolution dims and window attributes for LMHLO_GPU
PiperOrigin-RevId: 374889917
2021-05-20 09:41:48 -07:00
A. Unique TensorFlower 57aeb5ab16 Integrate LLVM at llvm/llvm-project@0316f3e649
Updates LLVM usage to match
[0316f3e64972](https://github.com/llvm/llvm-project/commit/0316f3e64972)

PiperOrigin-RevId: 374855085
2021-05-20 06:09:40 -07:00
Stella Laurenzo 0fe07e3814 Separate CHLO transforms for expanding compositions and lowering broadcasts.
* The former is typically invariant regardless of backend.
* The latter may need to be done differently depending on capabilities of the lowering target.

PiperOrigin-RevId: 374492924
2021-05-18 13:33:59 -07:00
A. Unique TensorFlower ccd70d5717 [MLIR][HLO] Add `rank-specialization-to-scf` pass
Currently the lowering is only implemented for the unary case. The n-ary case
will follow.

PiperOrigin-RevId: 374162772
2021-05-17 03:56:23 -07:00
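For the unary case mentioned in the `rank-specialization-to-scf` commit above, one common way to handle an operand of unknown rank is to flatten it to rank 1, apply the clustered element-wise ops, and restore the original shape. The sketch below models that idea with nested Python lists; it assumes this flatten-and-reshape strategy rather than reproducing the pass, and `shape_of`, `flatten`, `reshape`, and `run_unary_cluster` are illustrative names.

```python
import math


def shape_of(tensor):
    """Return the shape of a nested-list tensor (a scalar has shape [])."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        if not tensor:
            break
        tensor = tensor[0]
    return shape


def flatten(tensor):
    """Collapse a tensor of any rank into a rank-1 list."""
    if not isinstance(tensor, list):
        return [tensor]
    return [x for sub in tensor for x in flatten(sub)]


def reshape(flat, shape):
    """Rebuild a nested list of the given shape from a rank-1 list."""
    if not shape:
        return flat[0]
    step = len(flat) // shape[0] if shape[0] else 0
    return [reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def run_unary_cluster(tensor, ops):
    """Apply a chain of element-wise ops to a tensor of unknown rank."""
    shape = shape_of(tensor)
    flat = flatten(tensor)  # Specialize to rank 1 once for the whole cluster.
    for op in ops:
        flat = [op(x) for x in flat]
    return reshape(flat, shape)
```

For example, `run_unary_cluster(matrix, [math.sqrt, lambda x: -x])` runs both ops on the flattened data and reshapes only once, which is the benefit of specializing a cluster collectively instead of one op at a time.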
Rahul Joshi a361253e4f [HLO] Add custom print/parse for window attributes of convolutions (in LMHLO)
PiperOrigin-RevId: 373807616
2021-05-14 09:47:25 -07:00
A. Unique TensorFlower d2cc74317c Implement constant folding for mhlo.Sign.
PiperOrigin-RevId: 373550014
2021-05-13 03:54:04 -07:00
A. Unique TensorFlower 420c42a0a1 [MLIR][HLO] Support CHLO unary operations in rank specialization clustering
PiperOrigin-RevId: 373397321
2021-05-12 10:20:43 -07:00
A. Unique TensorFlower 596918a6f1 [MLIR][HLO] Allow rank specialization clustering with `chlo.broadcast_select` op
PiperOrigin-RevId: 373379990
2021-05-12 08:56:49 -07:00
Rahul Joshi e260aa771c [HLO] Add custom print/parse for convolution dimension numbers (in LMHLO)
PiperOrigin-RevId: 373379227
2021-05-12 08:52:46 -07:00
A. Unique TensorFlower 313d24bc8f [MLIR][HLO] Add `rank-specialization-cluster` pass
Add a pass to cluster unranked C/HLO operations in one
`chlo.rank_specialization_cluster` op. The C/HLO operations are moved to the
body of the operation. Later passes can use this to rank-specialize all these
operations together.

PiperOrigin-RevId: 373336725
2021-05-12 03:46:01 -07:00
Jacques Pienaar 2ea9470515 Remove BASE_HLO_ConvOp to remove coupling between MHLO and LMHLO conv ops
PiperOrigin-RevId: 373201247
2021-05-11 11:54:44 -07:00
Itai Zukerman a4db6c57aa Removed all (most) BASE_HLO_* ops.
Moved the corresponding `summary` and `description` fields into the subclasses.
Kept BASE_HLO_ConvOp for `hasWindowReversal()`.

PiperOrigin-RevId: 373173025
2021-05-11 09:48:31 -07:00
A. Unique TensorFlower 7f7a86ad0d [MLIR][HLO] Implement `RegionBranchOpInterface` for rank specialization cluster
PiperOrigin-RevId: 373163196
2021-05-11 09:03:05 -07:00