mlir-hlo

Author	SHA1	Message	Date
A. Unique TensorFlower	82696f8598	[MLIR][HLO] Annotate `mhlo.clamp` and `mhlo.select` as element-wise broadcasting The operations allow for a limited form of broadcasting which allows some operands to be scalars. As such they are neither strictly `Elementwise`, nor `Broadcasting`. They do fulfill the requirements for `BroadcastingElementwise` though. PiperOrigin-RevId: 379719961	2021-06-16 07:59:26 -07:00
Feiwen	3afbe312f8	PR #49919 : [MLIR][DISC] pattern conversion from tf2mhlo: ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49919 We are porting our MLIR-based dynamic shape compiler to tf community (From OP def, Pattern, to Optimization pass, etc). This is the 5th PR about tf2mhlo pattern conversion, which includes ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic. The remaining pattern conversions we will add: - ConvertSqueezeOpxxx - ConvertStridedSliceOpxxx - ConvertPrintOp Copybara import of the project: -- 21b3c3eb05b12956bcdb8b98cc54d9371dbf034d by azazhu <azazhu@gmail.com>: [MLIR][DISC] pattern conversion from tf2mhlo: ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic -- 634630a4e2e426357290650bd579b35efecab5b3 by azazhu <azazhu@gmail.com>: [MLIR][DISC] refine ConvertUnpackOpDynamic, ConvertSignOpDynamic, ConvertSigmoidGradOpDynamic -- 39a2bedd6dafb369ae960c5197b7a352bfdfbc80 by azazhu <azazhu@gmail.com>: add RealDynamicSliceOp's canonicalize and fix CI -- a1c38dd0963d602ed4812da0d77a096a95920ddb by azazhu <azazhu@gmail.com>: fix CI for ConvertUnpackOpDynamic -- 5a8b4eb389ed6dc554104356c37f2f1550802b8c by azazhu <azazhu@gmail.com>: fix typo in ConvertSigmoidGradOpDynamic PiperOrigin-RevId: 379521079	2021-06-15 10:33:32 -07:00
Chris Jones	5fbdac34a9	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 379455720	2021-06-15 03:55:19 -07:00
Wenyi Zhao	7f94bd923b	PR #50236 : [MLIR][DISC] Bufferize TransposeOp and ConcatenateOp Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50236 support hlo-to-lhlo conversion for TransposeOp and ConcatenateOp Copybara import of the project: -- 62860e717f2a14fbd3ddfb634aa6ff132d245a72 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Bufferize TransposeOp and ConcatenateOp -- ce2ff57c1edee1172cd2f36346cc0b34ec1c7467 by Wenyi Zhao <reyizero@gmail.com>: fix PiperOrigin-RevId: 379330954	2021-06-14 12:37:45 -07:00
Wenyi Zhao	23ebbb28d1	PR #50191 : [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50191 DISC is an e2e flow, including both compiler side and runtime side. For the runtime side, we have different targeting environments (e.g. tensorflow, pytorch, or sometimes even a standalone binary). In order to simplify the design of the compiler side, we design a Runtime Abstraction Layer (RAL) to separate the compiler side and runtime side. Thus the compiler side only needs to target RAL itself and it is the responsibility of RAL to handle the differences between different targeting environments. One of the most important functions of RAL is to manage stateful resources. To this end, it provides a context object, and hides all stateful operations behind this context, thus the compiler side itself doesn't need to care about the resource initialization. For example, a kernel must be loaded before it can be launched on GPU. However, the loading operation should only be performed once during the whole lifetime of the context in order to achieve the best performance. Based on the initialization-free interfaces provided by RAL, the compiler side can focus on its core optimization logic and let RAL manage the resource status. The context mentioned above is passed as a parameter to the entry function and all RAL APIs should always use the context as their first argument. This CR also provides a pass to help ensure this property. The pass rewrites the entry function to make sure its first argument is the context. For the entry function, the pass also rewrites its inputs and outputs. To be concrete, all the original inputs and outputs of the entry function are received from and sent to RAL through a sequence of RAL API calls correspondingly. The motivation behind this is to hide the implementation details of I/Os. This design may also potentially enable partial execution of the compiled module when some of the inputs are ready. Copybara import of the project: -- c4f20a89aed71181e75bcc5265723b88bde23240 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect -- 1991d4f80ab6087943956e1c0fec4940a22ab08d by Wenyi Zhao <reyizero@gmail.com>: fix PiperOrigin-RevId: 379317586	2021-06-14 11:27:43 -07:00
Rahul Joshi	a6011d0279	[HLO] Add AllReduceScatter to MHLO and LMHLO dialects. PiperOrigin-RevId: 379296198	2021-06-14 09:37:07 -07:00
Wenyi Zhao	8388303fd2	PR #50211 : [MLIR][DISC] Bufferize RealDynamicSliceOp and ReduceOp Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50211 support hlo-to-lhlo conversion for RealDynamicSliceOp and ReduceOp Copybara import of the project: -- c417b336670a1fc256f7026dfe8080e46d13d79a by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Bufferize RealDynamicSliceOp and ReduceOp PiperOrigin-RevId: 378972113	2021-06-11 16:33:15 -07:00
Wenyi Zhao	6660234d80	PR #50100 : [MLIR][DISC] Bufferize DynamicIotaOp and DynamicPadOp Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50100 support hlo-to-lhlo conversion for DynamicIotaOp and DynamicPadOp Copybara import of the project: -- c3aae94954e35d3f8ad265f619ef9765665a5115 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Bufferize DynamicIotaOp and DynamicPadOp -- adc6996d70b804d61310d56a33fac975d70c8636 by Wenyi Zhao <reyizero@gmail.com>: minor PiperOrigin-RevId: 378733284	2021-06-10 14:20:45 -07:00
A. Unique TensorFlower	14093b7906	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 378681070	2021-06-10 10:27:22 -07:00
Chris Jones	968226b9d7	[XLA:GPU] Add AllReduce{Start,Done} to MLIR LHLO dialect. PiperOrigin-RevId: 378640706	2021-06-10 06:54:42 -07:00
A. Unique TensorFlower	d828b457b3	Handle empty tensors in SimplifyConcatSlice. If the result of the slice is an empty tensor, do nothing. This fixes a crash: we can't create a `concat` with an empty operand range. PiperOrigin-RevId: 378354956	2021-06-09 02:15:47 -07:00
Wenyi Zhao	ade873a5e0	PR #49970 : [MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49970 1, add hlo-to-lhlo support for DynamicReshape and DynamicBroadcastInDim 2, add a flag `convert-to-lmhlo-only` to separate the following two cases: - hlo-to-lhlo only. Simply lowers all mhlo ops to their lmhlo counterparts and does not apply any optimization (e.g. eliding buffer copies). Buffer optimization is not easy in the dynamic shape world, especially when control flow is involved, so we leave this to another dedicated pass. - hlo-to-lhlo-or-memref-directly. Lowers some metadata-only mhlo ops (e.g. reshape) to the memref dialect directly and lowers others to their lmhlo counterparts. Copybara import of the project: -- 562bd65a368f6194405c4ae6900e3b4388a5ec03 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] bufferize DynamicReshape and DynamicBroadcastInDim PiperOrigin-RevId: 377603395	2021-06-04 15:36:03 -07:00
A. Unique TensorFlower	aba16adfa5	Add `mhlo.all_gather` op to MHLO dialect. Adds import/export/verifier support as well. Also makes `channel_handle` uniform across mhlo.all_reduce and mhlo.all-gather. PiperOrigin-RevId: 377323468	2021-06-03 10:45:29 -07:00
Adrian Kuegel	a4fa6afa07	[mlir][hlo] Avoid dyn_cast_or_null when called with getDefiningOp result (NFC) PiperOrigin-RevId: 376110457	2021-05-27 00:20:42 -07:00
wyzhao	b93e54d8a4	PR #49454 : [MLIR][DISC] Upgrade to use the new `reifyReturnTypeShapes` interface. Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49454 The new interface is safer to use during dialect conversion (e.g. converting from tensor world to buffer world). Copybara import of the project: -- a6968072d59bec3c3bbaef0121d297e807c37c91 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Upgrade to use the new `reifyReturnTypeShapes` interface. The new interface is safer to use during dialect conversion (e.g. converting from tensor world to buffer world). -- 55e7c6b7f2f99b99e226645a57e2433fae3e90ed by Wenyi Zhao <reyizero@gmail.com>: minor fix PiperOrigin-RevId: 375500273	2021-05-24 10:11:55 -07:00
Feiwen	a7884196f5	PR #49228 : [MLIR][DISC] porting dynamic shape related OPs to mhlo and lmhlo dialect Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/49228 We are porting our MLIR-based dynamic shape compiler to tf community (From OP def, Pattern, to Optimization pass, etc). This is the first PR, which includes some dynamic shape OP definitions in the mhlo and lmhlo dialects. For mhlo dialect, we add: - HLO_RealDynamicSliceOp - HLO_DynamicPadOp - HLO_DynamicGatherOp - HLO_DynamicConvOp For lmhlo dialect, we add: - LHLO_RealDynamicSliceOp - LHLO_DynamicBroadcastInDimOp - LHLO_DynamicGatherOp - LHLO_DynamicPadOp - LHLO_DynamicBitcastOp - LHLO_DynamicConvOp - LHLO_DynamicIotaOp - LHLO_DynamicReshapeOp - LHLO_DotGeneralOp - LHLO_BitcastOp Remaining ops to add: * We will send a separate PR containing LHLO_DynamicWhileOp and LHLO_DynamicCaseOp for control flow. * We will add a separate dedicated dialect like mhlo_ral, which includes D2HOp/H2DOp/DebugPrintOp/TopKOp, etc. Previous discussions: [RFC](https://groups.google.com/a/tensorflow.org/g/mlir/c/_X48poNcbDI/m/jCC8BWIICQAJ), [discussion_1](https://llvm.discourse.group/t/updates-on-mlir-based-dynamic-shape-compiler/2384), [Recording of meeting](https://drive.google.com/file/d/1_uEISlV5MUWdG9faKAdKlCWnPtGjRC-D/view?usp=sharing). Copybara import of the project: -- e22d9e61106e00a1a1c6f368cc4a03e3bd1f414c by azazhu <azazhu@gmail.com>: [DISC]fea: porting mhlo and lmhlo OPs -- 9ec3e76290da07cbd53d7da5fa86ff67179441a1 by azazhu <azazhu@gmail.com>: [DISC][MLIR] 1. add summary and description for dynamic OPs in mhlo and lmhlo; 2. rm InferOutputTypes; 3. add verify for RealDynamicSliceOp and DynamicPadOp -- 0d68cd135555fd935991c12456b21329e628f23f by azazhu <azazhu@gmail.com>: [DISC][MLIR] 1.remove D2H,H2D and DebugPrint Ops from mhlo/lmhlo dialect; 2. add type constraint to DynamicPadOp and RealDynamicSliceOp; 3.refine lmhlo type constraint; 4.rename RealDynamicSliceOp as name conflict. -- 698762a77d60f6a844cb1ab3f32740d4ef3c5843 by azazhu <azazhu@gmail.com>: [DISC][MLIR] 1. replace dyn_cast to cast 2. refine code PiperOrigin-RevId: 375022260	2021-05-20 23:16:47 -07:00
Rahul Joshi	41f663ce47	[HLO] Adopt custom syntax for convolution dimensions and window attributes (HLO) PiperOrigin-RevId: 374923250	2021-05-20 12:13:50 -07:00
Rahul Joshi	fc88cf1ff4	[HLO] Adopt custom syntax for convolution dims and window attributes for LMHLO_GPU PiperOrigin-RevId: 374889917	2021-05-20 09:41:48 -07:00
A. Unique TensorFlower	57aeb5ab16	Integrate LLVM at llvm/llvm-project@0316f3e649 Updates LLVM usage to match [0316f3e64972](https://github.com/llvm/llvm-project/commit/0316f3e64972) PiperOrigin-RevId: 374855085	2021-05-20 06:09:40 -07:00
Rahul Joshi	a361253e4f	[HLO] Add custom print/parse for window attributes of convolutions (in LMHLO) PiperOrigin-RevId: 373807616	2021-05-14 09:47:25 -07:00
A. Unique TensorFlower	d2cc74317c	Implement constant folding for mhlo.Sign. PiperOrigin-RevId: 373550014	2021-05-13 03:54:04 -07:00
A. Unique TensorFlower	420c42a0a1	[MLIR][HLO] Support CHLO unary operations in rank specialization clustering PiperOrigin-RevId: 373397321	2021-05-12 10:20:43 -07:00
Rahul Joshi	e260aa771c	[HLO] Add custom print/parse for convolution dimension numbers (in LMHLO) PiperOrigin-RevId: 373379227	2021-05-12 08:52:46 -07:00
A. Unique TensorFlower	7f7a86ad0d	[MLIR][HLO] Implement `RegionBranchOpInterface` for rank specialization cluster PiperOrigin-RevId: 373163196	2021-05-11 09:03:05 -07:00
A. Unique TensorFlower	96a47345cc	[MLIR][HLO] Add `rank_specialization_cluster` op to CHLO The operation will be used to cluster compatible operations that can be rank-specialized collectively. PiperOrigin-RevId: 373128557	2021-05-11 05:17:42 -07:00
A. Unique TensorFlower	7f86dd9f7e	Constant fold compare EQ if one of the operands is true and compare NE if one of the operands is false. PiperOrigin-RevId: 373058030	2021-05-10 18:53:49 -07:00
dfki-jugr	6bc854f5d9	PR #48667 : [mlir-hlo] Added RegionBranchOpInterfaces to lmhlo operations. Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/48667 Added RegionBranchOpInterfaces to lmhlo operations that use regions. This is needed, since the bufferization features in MLIR have to reason about the control flow within these operations. Copybara import of the project: -- 572fd7d850a46630b812da84e9094280f89f259e by Julian Gross <julian.gross@dfki.de>: Added RegionBranchOpInterfaces to lmhlo operations. PiperOrigin-RevId: 372070825	2021-05-05 00:27:56 -07:00
A. Unique TensorFlower	e500ab37a1	Introduce constant folds for ReduceOp with single LogicalAnd or LogicalOr op. PiperOrigin-RevId: 370551483	2021-04-26 15:11:27 -07:00
Adrian Kuegel	0e2b255f01	Lower LHLO::AbsOp to complex dialect. Also fix the traits for LHLO::AbsOp to allow different types and add a verifier. PiperOrigin-RevId: 370438790	2021-04-26 05:44:03 -07:00
A. Unique TensorFlower	8db96f54d3	[mhlo] Add a folder for mhlo.map which does nothing but return one of the arguments. Add a folder for maps whose body returns only one of the arguments. When this arises the fold replaces the map output with one of the operand tensors. PiperOrigin-RevId: 369304322	2021-04-19 14:36:08 -07:00
Rahul Joshi	c75cbf4ac7	[MLIR][NFC] Rename ReduceOp operands() => inputs(). - Rename to avoid confusion as operands generally includes all operands of an operation PiperOrigin-RevId: 368479524	2021-04-14 12:08:23 -07:00
Jacques Pienaar	fdd75daed6	Add shape function for MHLO RngNormal and RngUniform PiperOrigin-RevId: 368276963	2021-04-13 12:59:42 -07:00
A. Unique TensorFlower	6d2209e301	[MLIR][HLO] Canonicalize chained broadcasts Compose two subsequent `dynamic_broadcast_in_dim` ops into one. PiperOrigin-RevId: 367630360	2021-04-09 07:35:34 -07:00
Rahul Joshi	ff2cbfa2ec	[MLIR] Add support for representing variadic reduce-window in HLO/LMHLO dialect. - Fixed a subset of transformations to handle variadic reduce-window. PiperOrigin-RevId: 366278650	2021-04-01 10:24:50 -07:00
A. Unique TensorFlower	af3bc47a8b	Integrate LLVM at llvm/llvm-project@8396aeb07c Updates LLVM usage to match [8396aeb07cdd](https://github.com/llvm/llvm-project/commit/8396aeb07cdd) PiperOrigin-RevId: 366034463	2021-03-31 08:01:34 -07:00
Geoffrey Martin-Noble	5d65758e8c	Canonicalize MHLO Case and If Ops with constant conditions ReplaceOpWithRegion was taken directly from ScfOps. We should maybe put that somewhere common in core. PiperOrigin-RevId: 365936724	2021-03-30 17:58:01 -07:00
Geoffrey Martin-Noble	2fb2a92c6e	Verify mhlo.if region return types match op This matches the behavior of mhlo.case. Additionally, fix the verification of CaseOp in the case of nested ops with mhlo.return-containing regions. PiperOrigin-RevId: 365936672	2021-03-30 17:57:20 -07:00
Geoffrey Martin-Noble	7a9394dca5	Restrict MHLO control flow ops to single-block regions PiperOrigin-RevId: 365935824	2021-03-30 17:51:03 -07:00
Geoffrey Martin-Noble	a2b6060c0c	Add folder for HLO NotOp PiperOrigin-RevId: 364989658	2021-03-25 02:08:38 -07:00
A. Unique TensorFlower	0c4a89e52c	[MLIR][MHLO] Implement shape reification for `dynamic_broadcast_in_dim` PiperOrigin-RevId: 363622714	2021-03-18 03:39:15 -07:00
Jacques Pienaar	a58e62590e	Restrict canonicalization to avoid changing type Issue #47516 PiperOrigin-RevId: 363300979	2021-03-16 16:54:05 -07:00
A. Unique TensorFlower	c54527fe88	Integrate LLVM at llvm/llvm-project@678241795c Updates LLVM usage to match [678241795c95](https://github.com/llvm/llvm-project/commit/678241795c95) PiperOrigin-RevId: 363257913	2021-03-16 13:33:00 -07:00
Jacques Pienaar	3de2024a9b	Avoid creating tuple type only for verification Make the error message a bit more verbose & it is cheaper to verify the elements rather than creating a (potentially) new type. PiperOrigin-RevId: 363073909	2021-03-15 17:58:19 -07:00
Benjamin Kramer	67a770e4e0	[HLO:MLIR] Make binary op type reification emit shape_of instead of tensor ops This gives cleaner code and allows shape optimizations to happen on the result. PiperOrigin-RevId: 362242975	2021-03-11 02:01:35 -08:00
Rahul Joshi	9902e6ee32	[HLO] Add LMHLO CollectivePermute verification. - Extract verification of source target pairs attached to collective permute into a common helper function and use that to verify both MHLO and LMHLO variants. - Change MlirGpuTestBase::ParseMlirModule to allow returning back a failure, and use that to update the mlir_gpu_compile_test to check the new behavior. PiperOrigin-RevId: 362156962	2021-03-10 15:37:12 -08:00
Stephan Herhut	cabd4d9a06	Canonicalize dynamic_broadcast_in_dim to own shape with rank narrowing on the shape to a corresponding tensor.cast. PiperOrigin-RevId: 362028291	2021-03-10 05:43:54 -08:00
A. Unique TensorFlower	55eda81407	[MLIR][HLO] Reify shape extents as `index` values PiperOrigin-RevId: 361519167	2021-03-08 02:42:47 -08:00
Marius Brehler	29f70cb892	PR #46723 : Adjust types of loop counters Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/46723 Reduces some warnings about comparison of integers of different signs. Copybara import of the project: -- 311f436f77b334f5462127d8cf179cce067969ca by Marius Brehler <marius.brehler@iml.fraunhofer.de>: Adjust types of loop counters Reduces some warnings about comparison of integers of different signs. PiperOrigin-RevId: 360912203	2021-03-04 07:36:12 -08:00
Adrian Kuegel	e6a1f5f0f9	Add MinimumBroadcastShapesOp to chlo dialect. This op is useful for rank specialization of broadcasts. Kernel Generator needs to generate one kernel for each rank, so if we can minimize the rank of the broadcast shape, we can support more cases with the same number of special-cased kernels. PiperOrigin-RevId: 360137827	2021-03-01 02:23:52 -08:00
Rahul Joshi	5adb7c6e12	[MLIR:LHLO] Add optional call target arg mapping to LMHLO CustomCall operations. - XLA:HLO -> LMHLO conversion drops all token arguments and return values, however custom calls that users write still expect to get buffer pointers for these token types. - To be able to support this, add an optional call target argument mapping attribute to LMHLO custom calls. When this attribute is present, it indicates the number of arguments and returns that the custom call expects and also indicates which LMHLO arg() or output() maps to which arg or result number of the custom call. PiperOrigin-RevId: 358826664	2021-02-22 08:43:00 -08:00

144 Commits