Commit Graph

20 Commits

Author SHA1 Message Date
A. Unique TensorFlower 470ac45f45 [MLIR][HLO] Remove unused pass `TransformUnrankedHloPass`
The pass was replaced by the new generalized rank specialization and the two
passes `mhlo-rank-specialization-cluster` and `mhlo-rank-specialization-to-scf`.

PiperOrigin-RevId: 379935562
2021-06-17 05:20:49 -07:00
Wenyi Zhao 34dc5f2a79 PR #50020: [MLIR][DISC] support fusion on buffer
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50020

This pass implements the logic to group kLoop/kInput fusion patterns on
buffer level. The reason for this is that we can avoid a lot of
headaches to handle `shape-only` consumers specially (e.g. memref.dim,
shape.shapeOf) since shapes are already resolved in buffer world. It may
be better to move this pass to tensor level after more shape
inference/constraint infras are ready on mhlo level.
Copybara import of the project:

--
e31f8344b59aa9860097197585215ea1689b8ff4 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] support fusion on buffer

This pass implements the logic to group kLoop/kInput fusion patterns on
buffer level. The reason for this is that we can avoid a lot of
headaches to handle `shape-only` consumers specially (e.g. memref.dim,
shape.shapeOf) since shapes are already resolved in buffer world. It may
be better to move this pass to tensor level after more shape
inference/constraint infras are ready on mhlo level.

--
35f2eb2791241b0ab5db1ddcaf1b4006278ddccf by Wenyi Zhao <reyizero@gmail.com>:

fix

--
923c8d61f7fe00a2a0df22d5be396508f0667964 by Wenyi Zhao <reyizero@gmail.com>:

fix sanity check failure

PiperOrigin-RevId: 379743424
2021-06-16 09:51:29 -07:00
Geoffrey Martin-Noble f9f7a63870 Add missing dep on RAL pass generation
Without this I see errors about being unable to find the generated header in our project's build.

PiperOrigin-RevId: 379377718
2021-06-14 17:02:26 -07:00
Wenyi Zhao 23ebbb28d1 PR #50191: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50191

DISC is a e2e flow, including both compiler side and runtime side. For
runtime side, we have different targeting environments (e.g. tensorflow,
pytorch, or sometimes even a standalone binary). In order to simplify
the design of the compiler side, we design a Runtime Abstraction Layer
(RAL) to sperate the compiler side and runtime side. Thus the compiler
side only need to target RAL itself and it is the responsibility of RAL
to handle the differences between different targeting environments.

One of the most important functions of RAL is to manage stateful
resources. To this end, it provides a context object, and hides all
stateful operations behind this context, thus the compiler side itself
doesn't need to care about the resource initialization. For example, a
kernel must be loaded before it can be launched on GPU. However, the
loading operation should only be taken once during the whole lifetime of
the context in order to achieve the best performance. Based on the
initialization-free interfaces provided by RAL, compiler side can focus
on its core optimization logic and lets the RAL to manage the resource
status.

The context mentioned above is passed as a parameter to the entry
function and all RAL APIs should always use the context as their first
argument. This CR also provides a pass to help to ensure this property.
The pass rewrites the entry function to make sure their first argument
is the context. For entry function, the pass also rewrites its inputs
and outputs. To be concrete, all the original inputs and outputs of the
entry function are received from and sent to RAL through a sequence of
RAL API calls correspondingly. The motivation behind this is to hide the
implementation details of I/Os. This design may also potentially enable
partial execution of the compiled module when some of the inputs are
ready.
Copybara import of the project:

--
c4f20a89aed71181e75bcc5265723b88bde23240 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect

DISC is a e2e flow, including both compiler side and runtime side. For
runtime side, we have different targeting environments (e.g. tensorflow,
pytorch, or sometimes even a standalone binary). In order to simplify
the design of the compiler side, we design a Runtime Abstraction Layer
(RAL) to sperate the compiler side and runtime side. Thus the compiler
side only need to target RAL itself and it is the responsibility of RAL
to handle the differences between different targeting environments.

One of the most important functions of RAL is to manage stateful
resources. To this end, it provides a context object, and hides all
stateful operations behind this context, thus the compiler side itself
doesn't need to care about the resource initialization. For example, a
kernel must be loaded before it can be launched on GPU. However, the
loading operation should only be taken once during the whole lifetime of
the context in order to achieve the best performance. Based on the
initialization-free interfaces provided by RAL, compiler side can focus
on its core optimization logic and lets the RAL to manage the resource
status.

The context mentioned above is passed as a parameter to the entry
function and all RAL APIs should always use the context as their first
argument. This CR also provides a pass to help to ensure this property.
The pass rewrites the entry function to make sure their first argument
is the context. For entry function, the pass also rewrites its inputs
and outputs. To be concrete, all the original inputs and outputs of the
entry function are received from and sent to RAL through a sequence of
RAL API calls correspondingly. The motivation behind this is to hide the
implementation details of I/Os. This design may also potentially enable
partial execution of the compiled module when some of the inputs are
ready.

--
1991d4f80ab6087943956e1c0fec4940a22ab08d by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 379317586
2021-06-14 11:27:43 -07:00
A. Unique TensorFlower c47869f931 [MLIR][HLO] Rename `move-up-dynamic-broadcasts-for-fusion` to `broadcast-propagation`
PiperOrigin-RevId: 378102608
2021-06-08 01:51:10 -07:00
wyzhao 968d4b8709 PR #49598: [MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49598

This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`
Copybara import of the project:

--
80eb377af4e02182e1aecc943a41ca5d7d1c2100 by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion

This PR implements logic for lowering memref.tensor_load ops that are
inserted during `mhlo-legalize-to-lmhlo`.

--
ac452fe3dcd591211cd5c59be9189fe2f7153b41 by Wenyi Zhao <reyizero@gmail.com>:

minor fix

--
6b36017f8632a06adbc3e05a62975fa641d0260f by Wenyi Zhao <reyizero@gmail.com>:

minor refine

--
846005cc76d0033112e47825c2e9a97790b6925f by Wenyi Zhao <reyizero@gmail.com>:

minor fix

--
f6a4becaa287d5ca323b2d152a4d0ae053730fd9 by Wenyi Zhao <reyizero@gmail.com>:

fix

--
5555749f60f7fce8f57962860ef65efccf0362ba by Wenyi Zhao <reyizero@gmail.com>:

fix

--
8873b9b6d9315c1199ca9f7c133ecf377ecd2fa6 by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 376942547
2021-06-01 16:27:56 -07:00
A. Unique TensorFlower 313d24bc8f [MLIR][HLO] Add `rank-specialization-cluster` pass
Add a pass to cluster unranked C/HLO operations in one
`chlo.rank_specialization_cluster` op. The C/HLO operations are moved to the
body of the operation. Later passes can use this to rank-specialize all these
operations together.

PiperOrigin-RevId: 373336725
2021-05-12 03:46:01 -07:00
A. Unique TensorFlower c217a6ef61 [MHLO] Add pass to move up dynamic broadcasts for fusion
For now, the pass only reifies the required shape computations. Moving
broadcasts will follow to allow for fusion across them.

PiperOrigin-RevId: 362033715
2021-03-10 06:21:57 -08:00
A. Unique TensorFlower 4060a86fe2 Integrate LLVM at llvm/llvm-project@2bfe27da17
Updates LLVM usage to match
[2bfe27da171e](https://github.com/llvm/llvm-project/commit/2bfe27da171e)

PiperOrigin-RevId: 357196336
2021-02-12 08:32:03 -08:00
Hanhan Wang e2d60e01ba Fix CMakeLists.txt
This is the followup of `7aa64ee0b791` The dep was added in BUILD, but not CMakeLists.txt

PiperOrigin-RevId: 353078811
2021-01-21 12:42:35 -08:00
Christian Sigg 099c130daf Fix MLIR include paths.
PiperOrigin-RevId: 347976151
2020-12-17 00:56:04 -08:00
Alexander Belyaev 3d930d08c2 [HLO] Delete LHLO memref cast ops and migrate to STD ones.
PiperOrigin-RevId: 340663578
2020-11-04 09:26:34 -08:00
Marius Brehler ff9b8c6f65 PR #44499: Add missing dep on MLIRMhloPassIncGen target
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/44499

The file `sink_constants_to_control_flow.cc` includes the header
`PassDetail.h`, which itself includes `mhlo_passes.h.inc`. The latter is
not guaranteed to be already generated since there was no dependency set
to MLIRMhloPassIncGen.
Copybara import of the project:

--
0ff51ccc88c1ba049eb2e9555afb54079bea39c9 by Marius Brehler <marius.brehler@iml.fraunhofer.de>:

Add missing dep on MLIRMhloPassIncGen target

The file `sink_constants_to_control_flow.cc` includes the header
`PassDetail.h`, which itself includes `mhlo_passes.h.inc`. The latter is
not guaranteed to be already generated since there was no dependency set
to MLIRMhloPassIncGen.

PiperOrigin-RevId: 340485068
2020-11-03 11:18:48 -08:00
A. Unique TensorFlower 08e0d09463 [MLIR][KernelGen] Rename `legalize-tanh-to-approximation` to `legalize-trigonometric-to-approximation`
To add more approximation lowerings in the future, generalize the pass name.

PiperOrigin-RevId: 333340075
2020-09-23 11:53:45 -07:00
A. Unique TensorFlower 48022987ce [XLA][MLIR] Lower `tf.Tan` and `tf.Sin` to MLHLO
Add `tan` op and lowering to CHLO dialect, move CHLO lowerings to
`chlo_legalize_to_hlo_patterns` and extend missing patterns.

PiperOrigin-RevId: 331506094
2020-09-14 02:34:52 -07:00
Ehsan Toosi ce1c8a1ebc [MLIR][LHLO] Replace lhlo-copy-removal pass with mlir-copy-removal pass
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/43137

This PR removes lhlo-copy-removal pass entirely and replace its usages with ```mlir::createCopyRemovalPass()```.

--
7ce1a06f507c8db46c6d7b43c7870cf56002e18e by Ehsan Toosi <ehsan.nadjaran_toosi@dfki.de>:

[mlir][lhlo] Replace lhlo-copy-removal pass with mlir-copy-removal pass

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/43137 from dfki-ehna:using_mlir_copy_removal 7ce1a06f507c8db46c6d7b43c7870cf56002e18e
PiperOrigin-RevId: 331498501
2020-09-14 01:22:19 -07:00
A. Unique TensorFlower a7a7184eb6 [XLA][MLIR] Lower `tf.Tan` and `tf.Sin` to MLHLO
Add `tan` op and lowering to CHLO dialect, move CHLO lowerings to
`chlo_legalize_to_hlo_patterns` and extend missing patterns.

PiperOrigin-RevId: 331128170
2020-09-11 05:05:58 -07:00
A. Unique TensorFlower 90927f6b53 [XLA][MLIR] Lower `tf.Tan` and `tf.Sin` to MLHLO
Add `tan` op and lowering to CHLO dialect, move CHLO lowerings to
`chlo_legalize_to_hlo_patterns` and extend missing patterns.

PiperOrigin-RevId: 331125286
2020-09-11 04:39:28 -07:00
Jacques Pienaar 344c500fca [mhlo] Add legalize to SCF pass
Start of pass to legalize MHLO control flow to SCF for further optimization in common form. The current version just matches a very simple instance (which also happens to occur a few times). Exposes some further canonicalization opportunities that aren't yet addressed.

PiperOrigin-RevId: 329017723
2020-08-28 15:11:58 -07:00
Mehdi Amini 701312720c Add CMake files and lit configurations, enough for `ninja check-mlir-hlo` to pass on all the tests
PiperOrigin-RevId: 325172984
2020-08-07 22:14:34 -07:00