mlir-hlo

Commit Graph

Author	SHA1	Message	Date
A. Unique TensorFlower	470ac45f45	[MLIR][HLO] Remove unused pass `TransformUnrankedHloPass` The pass was replaced by the new generalized rank specialization and the two passes `mhlo-rank-specialization-cluster` and `mhlo-rank-specialization-to-scf`. PiperOrigin-RevId: 379935562	2021-06-17 05:20:49 -07:00
Mehdi Amini	8c8e81cb69	Fix pass definition to inherit from the TableGen generated base class (NFC) PiperOrigin-RevId: 379860210	2021-06-16 19:05:11 -07:00
Wenyi Zhao	34dc5f2a79	PR #50020 : [MLIR][DISC] support fusion on buffer Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50020 This pass implements the logic to group kLoop/kInput fusion patterns on buffer level. The reason for this is that it avoids a lot of headaches in handling `shape-only` consumers specially (e.g. memref.dim, shape.shapeOf), since shapes are already resolved in the buffer world. It may be better to move this pass to tensor level once more shape inference/constraint infrastructure is ready on mhlo level. Copybara import of the project: -- e31f8344b59aa9860097197585215ea1689b8ff4 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] support fusion on buffer -- 35f2eb2791241b0ab5db1ddcaf1b4006278ddccf by Wenyi Zhao <reyizero@gmail.com>: fix -- 923c8d61f7fe00a2a0df22d5be396508f0667964 by Wenyi Zhao <reyizero@gmail.com>: fix sanity check failure PiperOrigin-RevId: 379743424	2021-06-16 09:51:29 -07:00
A. Unique TensorFlower	82696f8598	[MLIR][HLO] Annotate `mhlo.clamp` and `mhlo.select` as element-wise broadcasting The operations allow for a limited form of broadcasting which allows some operands to be scalars. As such they are neither strictly `Elementwise`, nor `Broadcasting`. They do fulfill the requirements for `BroadcastingElementwise` though. PiperOrigin-RevId: 379719961	2021-06-16 07:59:26 -07:00
Chris Jones	5fbdac34a9	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 379455720	2021-06-15 03:55:19 -07:00
Wenyi Zhao	23ebbb28d1	PR #50191 : [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50191 DISC is an e2e flow, including both compiler side and runtime side. For the runtime side, we have different targeting environments (e.g. tensorflow, pytorch, or sometimes even a standalone binary). In order to simplify the design of the compiler side, we design a Runtime Abstraction Layer (RAL) to separate the compiler side and runtime side. Thus the compiler side only needs to target RAL itself and it is the responsibility of RAL to handle the differences between different targeting environments. One of the most important functions of RAL is to manage stateful resources. To this end, it provides a context object, and hides all stateful operations behind this context, thus the compiler side itself doesn't need to care about the resource initialization. For example, a kernel must be loaded before it can be launched on GPU. However, the loading operation should only be performed once during the whole lifetime of the context in order to achieve the best performance. Based on the initialization-free interfaces provided by RAL, the compiler side can focus on its core optimization logic and let RAL manage the resource status. The context mentioned above is passed as a parameter to the entry function and all RAL APIs should always use the context as their first argument. This CR also provides a pass to help ensure this property. The pass rewrites the entry function to make sure its first argument is the context. For the entry function, the pass also rewrites its inputs and outputs. To be concrete, all the original inputs and outputs of the entry function are received from and sent to RAL through a sequence of RAL API calls correspondingly. The motivation behind this is to hide the implementation details of I/Os. This design may also potentially enable partial execution of the compiled module when some of the inputs are ready. Copybara import of the project: -- c4f20a89aed71181e75bcc5265723b88bde23240 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect -- 1991d4f80ab6087943956e1c0fec4940a22ab08d by Wenyi Zhao <reyizero@gmail.com>: fix PiperOrigin-RevId: 379317586	2021-06-14 11:27:43 -07:00
A. Unique TensorFlower	14093b7906	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 378681070	2021-06-10 10:27:22 -07:00
Chris Jones	968226b9d7	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 378640706	2021-06-10 06:54:42 -07:00
Mehdi Amini	1770ed455f	Remove unnecessary duplicated source from "lhlo" (NFC) PiperOrigin-RevId: 378291564	2021-06-08 18:07:26 -07:00
A. Unique TensorFlower	c47869f931	[MLIR][HLO] Rename `move-up-dynamic-broadcasts-for-fusion` to `broadcast-propagation` PiperOrigin-RevId: 378102608	2021-06-08 01:51:10 -07:00
Wenyi Zhao	ade873a5e0	PR #49970 : [MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49970 1. Add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim. 2. Add a flag `convert-to-lmhlo-only` to separate the following two cases: - hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo counterparts and does not apply any optimization (e.g. eliding buffer copies). Buffer optimization is not easy in the dynamic shape world, especially when control flow is involved, thus we leave this to another dedicated pass. - hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo ops (e.g. reshape) to the memref dialect directly and lowers others to their lmhlo counterparts. Copybara import of the project: -- 562bd65a368f6194405c4ae6900e3b4388a5ec03 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim PiperOrigin-RevId: 377603395	2021-06-04 15:36:03 -07:00
Jacques Pienaar	4fc2e87a42	Add mhlo python binding generator target This just invokes the generator backend & creates a filegroup. PiperOrigin-RevId: 377318653	2021-06-03 10:26:30 -07:00
wyzhao	968d4b8709	PR #49598 : [MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49598 This PR implements logic for lowering memref.tensor_load ops that are inserted during `mhlo-legalize-to-lmhlo`. Copybara import of the project: -- 80eb377af4e02182e1aecc943a41ca5d7d1c2100 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] legalize tensor_load inserted during hlo-to-lhlo conversion -- ac452fe3dcd591211cd5c59be9189fe2f7153b41 by Wenyi Zhao <reyizero@gmail.com>: minor fix -- 6b36017f8632a06adbc3e05a62975fa641d0260f by Wenyi Zhao <reyizero@gmail.com>: minor refine -- 846005cc76d0033112e47825c2e9a97790b6925f by Wenyi Zhao <reyizero@gmail.com>: minor fix -- f6a4becaa287d5ca323b2d152a4d0ae053730fd9 by Wenyi Zhao <reyizero@gmail.com>: fix -- 5555749f60f7fce8f57962860ef65efccf0362ba by Wenyi Zhao <reyizero@gmail.com>: fix -- 8873b9b6d9315c1199ca9f7c133ecf377ecd2fa6 by Wenyi Zhao <reyizero@gmail.com>: fix PiperOrigin-RevId: 376942547	2021-06-01 16:27:56 -07:00
A. Unique TensorFlower	d1828625ab	[MLIR][KernelGen] Make maximum supported rank in rank specialization configurable The maximum supported target rank of 5 is sufficient for all operations but `select`. Make the maximum target rank configurable in the rank specialization. This reduces the number of generated kernels for operations that don't require it. PiperOrigin-RevId: 376822496	2021-06-01 06:54:31 -07:00
Rahul Joshi	fc88cf1ff4	[HLO] Adopt custom syntax for convolution dims and window attributes for LMHLO_GPU PiperOrigin-RevId: 374889917	2021-05-20 09:41:48 -07:00
Rahul Joshi	a361253e4f	[HLO] Add custom print/parse for window attributes of convolutions (in LMHLO) PiperOrigin-RevId: 373807616	2021-05-14 09:47:25 -07:00
Alex Zinenko	a2c9b3c9d7	[mlir] update gentbl() Bazel macro Rename `gentbl` to `gentbl_cc_library` to make it clearer which kind of rule is ultimately used. Update `gentbl_*` macros to take `tbl_outs` options as a list rather than a whitespace-separated string and remove the related string handling. PiperOrigin-RevId: 373406352	2021-05-12 10:58:58 -07:00
Itai Zukerman	30779f0c2f	Added build rules for HLO and LHLO ops MD. PiperOrigin-RevId: 373373920	2021-05-12 08:23:40 -07:00
A. Unique TensorFlower	313d24bc8f	[MLIR][HLO] Add `rank-specialization-cluster` pass Add a pass to cluster unranked C/HLO operations in one `chlo.rank_specialization_cluster` op. The C/HLO operations are moved to the body of the operation. Later passes can use this to rank-specialize all these operations together. PiperOrigin-RevId: 373336725	2021-05-12 03:46:01 -07:00
A. Unique TensorFlower	96a47345cc	[MLIR][HLO] Add `rank_specialization_cluster` op to CHLO The operation will be used to cluster compatible operations that can be rank-specialized collectively. PiperOrigin-RevId: 373128557	2021-05-11 05:17:42 -07:00
dfki-jugr	6bc854f5d9	PR #48667 : [mlir-hlo] Added RegionBranchOpInterfaces to lmhlo operations. Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/48667 Added RegionBranchOpInterfaces to lmhlo operations that use regions. This is needed, since the bufferization features in MLIR have to reason about the control flow within these operations. Copybara import of the project: -- 572fd7d850a46630b812da84e9094280f89f259e by Julian Gross <julian.gross@dfki.de>: Added RegionBranchOpInterfaces to lmhlo operations. PiperOrigin-RevId: 372070825	2021-05-05 00:27:56 -07:00
Geoffrey Martin-Noble	38d0f96709	Switch deps AllPassesAndDialectsNoRegistration -> AllPassesAndDialects These targets are now identical as all registration is explicit. Temporarily leaving the old target as a (deprecated) alias while changes propagate. PiperOrigin-RevId: 366513211	2021-04-02 14:26:13 -07:00
Russell Power	3be9874d82	Remove comments for license declarations. These can get out of date/stale. PiperOrigin-RevId: 365936646	2021-03-30 17:55:42 -07:00
Adrian Kuegel	6388e8d9ee	mlir-hlo-opt: set preloadDialectsInContext to false. This requires specifying dependent dialects in several passes. PiperOrigin-RevId: 365758084	2021-03-30 01:07:14 -07:00
A. Unique TensorFlower	c54527fe88	Integrate LLVM at llvm/llvm-project@678241795c Updates LLVM usage to match [678241795c95](https://github.com/llvm/llvm-project/commit/678241795c95) PiperOrigin-RevId: 363257913	2021-03-16 13:33:00 -07:00
Rahul Joshi	9902e6ee32	[HLO] Add LMHLO CollectivePermute verification. - Extract verification of source target pairs attached to collective permute into a common helper function and use that to verify both MHLO and LMHLO variants. - Change MlirGpuTestBase::ParseMlirModule to allow returning back a failure, and use that to update the mlir_gpu_compile_test to check the new behavior. PiperOrigin-RevId: 362156962	2021-03-10 15:37:12 -08:00
A. Unique TensorFlower	c217a6ef61	[MHLO] Add pass to move up dynamic broadcasts for fusion For now, the pass only reifies the required shape computations. Moving broadcasts will follow to allow for fusion across them. PiperOrigin-RevId: 362033715	2021-03-10 06:21:57 -08:00
Stephan Herhut	cabd4d9a06	Canonicalize dynamic_broadcast_in_dim to own shape with rank narrowing on the shape to a corresponding tensor.cast. PiperOrigin-RevId: 362028291	2021-03-10 05:43:54 -08:00
Geoffrey Martin-Noble	50a516fb9c	Adopt td_library This avoids needing to list all transitive include dependencies and tracks include directories. PiperOrigin-RevId: 360779798	2021-03-03 16:11:21 -08:00
Rahul Joshi	c5f5d13930	[MLIR] Add dependency from "lhlo_ops_structs_inc_gen" to "lhlo" target. PiperOrigin-RevId: 359488020	2021-02-25 03:51:36 -08:00
Geoffrey Martin-Noble	89f7f2bd65	Lower integer matmuls to linalg PiperOrigin-RevId: 359306495	2021-02-24 09:45:07 -08:00
Rahul Joshi	5adb7c6e12	[MLIR:LHLO] Add optional call target arg mapping to LMHLO CustomCall operations. - XLA:HLO -> LMHLO conversion drops all token arguments and return values, however custom calls that users write still expect to get buffer pointers for these token types. - To be able to support this, add an optional call target argument mapping attribute to LMHLO custom calls. When this attribute is present, it indicates the number of arguments and returns that the custom call expects and also indicates which LMHLO arg() or output() maps to which arg or result number of the custom call. PiperOrigin-RevId: 358826664	2021-02-22 08:43:00 -08:00
A. Unique TensorFlower	c06de24f6c	[MLIR][CHLO] Generalize lowering with upcast to n-ary operation Allows reuse for zeta lowering now and for the polygamma lowering soon. PiperOrigin-RevId: 357739910	2021-02-16 09:47:24 -08:00
A. Unique TensorFlower	4060a86fe2	Integrate LLVM at llvm/llvm-project@2bfe27da17 Updates LLVM usage to match [2bfe27da171e](https://github.com/llvm/llvm-project/commit/2bfe27da171e) PiperOrigin-RevId: 357196336	2021-02-12 08:32:03 -08:00
Alexander Belyaev	7aa64ee0b7	[MLIR] Migrate TF from STD complex ops to ComplexDialect. PiperOrigin-RevId: 352966408	2021-01-21 01:22:25 -08:00
Tres Popp	ba0346b071	Integrate LLVM at llvm/llvm-project@96ef4f307d Updates LLVM usage to match [96ef4f307df2](https://github.com/llvm/llvm-project/commit/96ef4f307df2) PiperOrigin-RevId: 352786460	2021-01-20 07:09:47 -08:00
Alexander Belyaev	180f917446	[KERNEL_GEN] Add a pattern for hlo.dyn_broadcast->linalg to enable is_inf kernel. PiperOrigin-RevId: 351179620	2021-01-11 10:13:31 -08:00
Geoffrey Martin-Noble	47848764a5	Change hardcoded external/ include to match repository name This is the preferred name for the Bazel repository when included from another project. Note: These hardcoded "external/" includes are a horrible hack that we can hopefully get rid of soon with more robust Bazel tablegen rules. PiperOrigin-RevId: 350810768	2021-01-08 12:07:53 -08:00
A. Unique TensorFlower	b0bf2ef45b	Integrate LLVM at llvm/llvm-project@c3acda0798 Updates LLVM usage to match [c3acda0798f9](https://github.com/llvm/llvm-project/commit/c3acda0798f9) PiperOrigin-RevId: 348896724	2020-12-23 23:53:54 -08:00
Tres Popp	a42213b870	Define lowering of [l]mhlo.pow. For floating point operations, this uses std.pow. For integer operations, this lowers to a loop. This adds a dependency on scf. PiperOrigin-RevId: 348537232	2020-12-21 15:27:40 -08:00
A. Unique TensorFlower	5da9190dd9	Integrate LLVM at llvm/llvm-project@0cf7e4b252 Updates LLVM usage to match [0cf7e4b252fe](https://github.com/llvm/llvm-project/commit/0cf7e4b252fe) PiperOrigin-RevId: 347948887	2020-12-16 20:30:17 -08:00
Phoenix Meadowlark	5a080ad795	Add Bazel build configuration to MLIR HLO standalone repo. Tested that this works in OSS in tree with `bazel build //...` and as a submodule in IREE. PiperOrigin-RevId: 347919106	2020-12-16 16:23:31 -08:00

42 Commits