mlir-hlo

Commit Graph

Author	SHA1	Message	Date
Wenyi Zhao	23ebbb28d1	PR #50191: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/50191 DISC is an e2e flow, including both compiler side and runtime side. For the runtime side, we have different targeting environments (e.g. TensorFlow, PyTorch, or sometimes even a standalone binary). In order to simplify the design of the compiler side, we design a Runtime Abstraction Layer (RAL) to separate the compiler side and runtime side. Thus the compiler side only needs to target RAL itself, and it is the responsibility of RAL to handle the differences between targeting environments. One of the most important functions of RAL is to manage stateful resources. To this end, it provides a context object and hides all stateful operations behind this context, so the compiler side itself doesn't need to care about resource initialization. For example, a kernel must be loaded before it can be launched on GPU. However, the loading operation should only be performed once during the whole lifetime of the context in order to achieve the best performance. Based on the initialization-free interfaces provided by RAL, the compiler side can focus on its core optimization logic and let RAL manage the resource status. The context mentioned above is passed as a parameter to the entry function, and all RAL APIs should always use the context as their first argument. This CR also provides a pass to help ensure this property. The pass rewrites the entry function to make sure its first argument is the context. For the entry function, the pass also rewrites its inputs and outputs. To be concrete, all the original inputs and outputs of the entry function are received from and sent to RAL through a sequence of RAL API calls correspondingly. The motivation behind this is to hide the implementation details of I/Os. This design may also potentially enable partial execution of the compiled module when some of the inputs are ready. Copybara import of the project: -- c4f20a89aed71181e75bcc5265723b88bde23240 by Wenyi Zhao <reyizero@gmail.com>: [MLIR][DISC] Add RAL (Runtime abstraction layer) Dialect -- 1991d4f80ab6087943956e1c0fec4940a22ab08d by Wenyi Zhao <reyizero@gmail.com>: fix PiperOrigin-RevId: 379317586	2021-06-14 11:27:43 -07:00
Rahul Joshi	fc88cf1ff4	[HLO] Adopt custom syntax for convolution dims and window attributes for LMHLO_GPU PiperOrigin-RevId: 374889917	2021-05-20 09:41:48 -07:00
Rahul Joshi	9902e6ee32	[HLO] Add LMHLO CollectivePermute verification. - Extract verification of source target pairs attached to collective permute into a common helper function and use that to verify both MHLO and LMHLO variants. - Change MlirGpuTestBase::ParseMlirModule to allow returning back a failure, and use that to update the mlir_gpu_compile_test to check the new behavior. PiperOrigin-RevId: 362156962	2021-03-10 15:37:12 -08:00
Rahul Joshi	5adb7c6e12	[MLIR:LHLO] Add optional call target arg mapping to LMHLO CustomCall operations. - XLA:HLO -> LMHLO conversion drops all token arguments and return values, however custom calls that users write still expect to get buffer pointers for these token types. - To be able to support this, add an optional call target argument mapping attribute to LMHLO custom calls. When this attribute is present, it indicates the number of arguments and returns that the custom call expects and also indicates which LMHLO arg() or output() maps to which arg or result number of the custom call. PiperOrigin-RevId: 358826664	2021-02-22 08:43:00 -08:00
Rahul Joshi	dbbdfea95b	[MLIR:HLO] Generate enum decls for HLO and LHLO GPU dialects. - Split out enum definitions in hlo dialect into a separate .td file (similar to structs) and generate enum decl/defs for these enums. - Also split out the LHLO GPU enums into a separate .td file and generate enum decl/defs for these enums as well. - Remove unused dialect from ConvolutionAttributes and generate lhlo_gpu enums. - Add appropriate namespace for all the enums. PiperOrigin-RevId: 345277240	2020-12-02 11:39:23 -08:00
A. Unique TensorFlower	51cd4200b6	Make LMHLO's Dot have the same power as MHLO's DotGeneral. PiperOrigin-RevId: 337391565	2020-10-15 15:09:06 -07:00
Rahul Joshi	f6b4e6758a	Add GPU specific LMHLO level ops - Introduce operations in a new lmhlo_gpu dialect that map to GPU library function calls in the XLA:GPU backend. - Add basic unit tests as well. PiperOrigin-RevId: 337132166	2020-10-14 11:23:55 -07:00
Mehdi Amini	701312720c	Add CMake files and lit configurations, enough for `ninja check-mlir-hlo` to pass on all the tests PiperOrigin-RevId: 325172984	2020-08-07 22:14:34 -07:00

8 Commits