Commit Graph

55 Commits

Adrian Kuegel 4033a56750 Add special cases for SelectOp rank specialization.
We now use the same special cases for all ops with arity >= 2.
For binary ops, we now have only one special case if at least one of the
operands has exactly one element. In that case, we reshape both operands to
rank 1. Before, we had separate special cases depending on whether the left-hand
side or the right-hand side has a scalar shape.

PiperOrigin-RevId: 366005835
2021-03-31 04:28:51 -07:00
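The one-element fast path described above can be sketched in plain Python (names and structure are illustrative, not the actual pass; tensors are modeled as a shape tuple plus a flat data list):

```python
def one_element_fast_path(lhs_shape, lhs_data, rhs_shape, rhs_data, op):
    """Toy model of the binary-op special case: tensors are (shape, flat data).

    A tensor with exactly one element behaves as a scalar regardless of its
    rank, so both operands can be flattened to rank 1 and the op applied on
    the flat data, avoiding full rank-specialized broadcast code.
    """
    assert len(lhs_data) == 1 or len(rhs_data) == 1
    if len(lhs_data) == 1:
        # Broadcast the single lhs element across the flattened rhs.
        return rhs_shape, [op(lhs_data[0], x) for x in rhs_data]
    # Broadcast the single rhs element across the flattened lhs.
    return lhs_shape, [op(x, rhs_data[0]) for x in lhs_data]
```

For example, adding a `<1x1x1xf32>` constant to a `<2x3xf32>` tensor reduces to a single rank-1 loop over six elements (the toy ignores how the output rank is chosen when the one-element operand has the higher rank).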
Adrian Kuegel c1a6ae8994 Generalize the HloBinaryElementwiseAdaptor
We can also use it for ternary ops like Select if we change the signature so
that a ValueRange is passed in.
Also remove special casing for HloComplexAdaptor. It can be handled with the
generic adaptor as well.

PiperOrigin-RevId: 365777493
2021-03-30 03:53:53 -07:00
Adrian Kuegel 6388e8d9ee mlir-hlo-opt: set preloadDialectsInContext to false.
This requires specifying dependent dialects in several passes.

PiperOrigin-RevId: 365758084
2021-03-30 01:07:14 -07:00
Stella Laurenzo 7f2bf48b8b Integrate LLVM at llvm/llvm-project@b24436ac96
Updates LLVM usage to match
[b24436ac96bd](https://github.com/llvm/llvm-project/commit/b24436ac96bd)

PiperOrigin-RevId: 364615807
2021-03-23 12:20:17 -07:00
A. Unique TensorFlower 507d9fb61d [MLIR][KernelGen] Add `tf.Polygamma` kernel
PiperOrigin-RevId: 362002943
2021-03-10 02:22:01 -08:00
A. Unique TensorFlower 39650a5d5a Remove rank 1 specialization from TransformUnrankedHloPass.
For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.

PiperOrigin-RevId: 360891955
2021-03-04 05:24:53 -08:00
Adrian Kuegel 62b357b601 Remove rank 1 specialization from TransformUnrankedHloPass.
For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.

PiperOrigin-RevId: 360881387
2021-03-04 04:04:11 -08:00
Adrian Kuegel 0683db3b24 Legalize MinimumBroadcastShapes op.
Use it in TransformUnrankedHloPass, which makes it possible to reduce the
maximum rank for rank-specialized broadcast from 6 to 5.

PiperOrigin-RevId: 360415743
2021-03-02 06:39:01 -08:00
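The rank reduction above rests on collapsing consecutive dimensions that broadcast the same way. A toy Python sketch of that idea (my reading of the op's intent, assuming ranked inputs of equal rank — not the actual `MinimumBroadcastShapes` semantics):

```python
def minimum_broadcast_shapes(a, b):
    """Collapse runs of consecutive dimensions with the same broadcast
    behavior, shrinking the rank the specialized code must handle.

    Toy sketch only: assumes equal-rank, broadcast-compatible shapes.
    """
    assert len(a) == len(b)

    def kind(x, y):
        # Classify how this dimension pair broadcasts.
        if x == y:
            return "same"
        return "lhs1" if x == 1 else "rhs1"

    out_a, out_b = [], []
    prev = None
    for x, y in zip(a, b):
        k = kind(x, y)
        if k == prev:
            # Same broadcast behavior as the previous dim: merge them.
            out_a[-1] *= x
            out_b[-1] *= y
        else:
            out_a.append(x)
            out_b.append(y)
            prev = k
    return tuple(out_a), tuple(out_b)
```

For instance, `(2, 3, 4)` vs. `(2, 3, 1)` collapses to `(6, 4)` vs. `(6, 1)`: a rank-3 broadcast becomes a rank-2 one.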
Christian Sigg 2d818c4fd9 Use mlir::OpState::operator->() to get to methods of mlir::Operation.
This is a preparation step to remove those methods from OpState.

PiperOrigin-RevId: 360043992
2021-02-28 09:02:33 -08:00
A. Unique TensorFlower ac0552f127 [MLIR][HLO] Remove duplicate `PopulateTransformUnrankedHloPatterns`
PiperOrigin-RevId: 359046173
2021-02-23 07:50:47 -08:00
Benjamin Kramer a9cc1dcfa0 [mlir][hlo] Add basic rank-specialization for select
This just blows up everything to ranked (up to 6) and is probably quite slow.
This is sufficient to make kernelgen compile SelectV2.

PiperOrigin-RevId: 358777728
2021-02-22 02:41:12 -08:00
Benjamin Kramer b42def4612 [mlir][hlo] Refactor rank specialization to allow an arbitrary number of inputs
This actually simplifies the code a bit.

PiperOrigin-RevId: 358201038
2021-02-18 09:53:03 -08:00
Adrian Kuegel 824bc9c425 Improve broadcast transformation to treat dynamic shapes with 1 element as scalar.
A shape that contains exactly one element is effectively a scalar. This leads
to a speedup in cases where we have a binary op with one operand that is
effectively a scalar, because we can use the fast path.

PiperOrigin-RevId: 357515552
2021-02-14 23:25:41 -08:00
Stephan Herhut 60e1b6882c Add kernel definition for zeta operation.
PiperOrigin-RevId: 355575619
2021-02-04 01:27:43 -08:00
A. Unique TensorFlower 3b67b207c4 [MLIR][CHLO] Use CHLO lowering for `is_inf` op
PiperOrigin-RevId: 355189054
2021-02-02 09:53:13 -08:00
A. Unique TensorFlower 0458ae9a22 [MLIR][KernelGen] Add `tf.Digamma` kernels
PiperOrigin-RevId: 355129028
2021-02-02 03:07:39 -08:00
A. Unique TensorFlower 2b72ddc6b2 [MLIR][KernelGen] Add `lgamma` kernels
PiperOrigin-RevId: 354519407
2021-01-29 06:14:17 -08:00
Adrian Kuegel fa059259bc Add template for tf.Cast
Also generate the kernels for all types of casts between signed int and float types.
This requires some adaptations to our build macros so that we can also specify the
output type of a kernel.

PiperOrigin-RevId: 354067727
2021-01-27 04:49:55 -08:00
Stephan Herhut 70a351f301 Add chlo.acosh operation and associated lowerings.
PiperOrigin-RevId: 352839289
2021-01-20 11:43:44 -08:00
Tres Popp ba0346b071 Integrate LLVM at llvm/llvm-project@96ef4f307d
Updates LLVM usage to match
[96ef4f307df2](https://github.com/llvm/llvm-project/commit/96ef4f307df2)

PiperOrigin-RevId: 352786460
2021-01-20 07:09:47 -08:00
A. Unique TensorFlower ec5f5667e1 [MLIR][KernelGen] Add `tf.Asinh` kernels and complete their lowerings
PiperOrigin-RevId: 352773540
2021-01-20 05:31:15 -08:00
A. Unique TensorFlower 0e85b4d511 [MLIR][KernelGen] Add `tf.Asinh` kernels and complete their lowerings
PiperOrigin-RevId: 352604725
2021-01-19 10:51:41 -08:00
A. Unique TensorFlower c11ea4ef5a [MLIR][KernelGen] Add `tf.Atanh` kernels
PiperOrigin-RevId: 352393602
2021-01-18 05:14:09 -08:00
A. Unique TensorFlower 791d5afd28 [MLIR][KernelGen] Add `tf.Asinh` kernels and complete their lowerings
PiperOrigin-RevId: 351989552
2021-01-15 05:26:57 -08:00
A. Unique TensorFlower 316f630728 [MLIR][KernelGen] Add cosh kernels and tests
Allow for relative tolerance in unary kernel tests. In the case of the cosh kernels,
this makes it possible to accept an observed difference of 5.6e-8 between the kernel and
the `std::cosh` reference (32829984.568665262 vs. 32829984.568665318) in one of
the test cases.

PiperOrigin-RevId: 351983698
2021-01-15 04:31:30 -08:00
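The numbers quoted above show why a relative tolerance is the right check: an absolute error of 5.6e-8 is large compared with a tight absolute threshold, but tiny relative to a result of magnitude ~3.3e7. A quick illustration with Python's `math.isclose`:

```python
import math

# The two values quoted in the commit message (kernel vs. std::cosh reference).
kernel_result = 32829984.568665262
reference = 32829984.568665318

# A fixed absolute tolerance suitable for small outputs rejects this result:
assert abs(kernel_result - reference) > 1e-9  # difference is ~5.6e-8

# A relative tolerance accepts it, since the values agree to roughly 15
# significant digits:
assert math.isclose(kernel_result, reference, rel_tol=1e-9)
```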
A. Unique TensorFlower 0b85d5c510 [MLIR][KernelGen] Add asin kernels and tests
PiperOrigin-RevId: 351381423
2021-01-12 09:02:46 -08:00
A. Unique TensorFlower b0bf2ef45b Integrate LLVM at llvm/llvm-project@c3acda0798
Updates LLVM usage to match
[c3acda0798f9](https://github.com/llvm/llvm-project/commit/c3acda0798f9)

PiperOrigin-RevId: 348896724
2020-12-23 23:53:54 -08:00
River Riddle 9540e51617 [mlir][NFC] Replace usages of mlir/IR/StandardTypes.h with mlir/IR/BuiltinTypes.h
StandardTypes.h was moved to BuiltinTypes.h and is being removed.

PiperOrigin-RevId: 347559927
2020-12-15 00:59:29 -08:00
Stephan Herhut dd5895d083 Extend unranked hlo transformations to also support and, or and xor.
PiperOrigin-RevId: 346270393
2020-12-08 01:00:26 -08:00
Tres Popp d327fc5737 [kernel_gen] Lower max rank specialization from 6 to 5
We don't care much about rank 6 broadcasting operations, and this lowers compile times significantly.

PiperOrigin-RevId: 346046601
2020-12-07 02:18:38 -08:00
Tres Popp 7c3f049c8e [kernel_gen] Lower max rank specialization from 6 to 5
We don't care much about rank 6 broadcasting operations, and this lowers compile times significantly.

PiperOrigin-RevId: 345466476
2020-12-03 09:19:25 -08:00
River Riddle f89244381d [mlir][NFC] Replace usages of Function.h and Module.h with BuiltinOps.h
This is part of a larger refactoring cleaning up the BuiltinDialect of MLIR.

PiperOrigin-RevId: 345085278
2020-12-01 13:18:06 -08:00
A. Unique TensorFlower dd15c6cd84 [MLIR][KernelGen] Generate assertion message in `transform_unranked_hlo` pass
Use a constant to generate the correct assertion message. This avoids
confusion when lowering the max rank specialization for debugging.

PiperOrigin-RevId: 344769021
2020-11-30 01:41:09 -08:00
Adrian Kuegel 6a71a84302 Support different input/output type for TransformUnrankedHlo.
Also generate the tf.Equal kernel, now that it works.

PiperOrigin-RevId: 344402014
2020-11-26 04:20:34 -08:00
Alexander Belyaev 5583c63cab [KERNEL_GEN] Add unranked Conj kernel.
PiperOrigin-RevId: 344243271
2020-11-25 06:37:26 -08:00
Lucy Fox 85f92a1651 [KernelGen] Lower tf.Erf and tf.Erfc ops to CHLO.
This does not include the lowerings from CHLO to LMHLO.

PiperOrigin-RevId: 344091604
2020-11-24 10:55:43 -08:00
Tres Popp af4c9774dc Handle rank 1 broadcasts in unranked kernel lowering.
Previously this started at rank 2 after checking for scalars and equal shapes. This resulted in cases such as <1xf32> + <2xf32> being treated as impossible.

PiperOrigin-RevId: 341043965
2020-11-06 07:22:43 -08:00
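The `<1xf32> + <2xf32>` case mentioned above is an ordinary rank-1 broadcast in which a size-1 vector stretches to the other operand's size. A minimal pure-Python sketch of that behavior (illustrative only, not the kernel lowering):

```python
def add_rank1_broadcast(lhs, rhs):
    """Add two rank-1 vectors, stretching a size-1 operand to match."""
    if len(lhs) == 1:
        lhs = lhs * len(rhs)  # replicate the single element
    elif len(rhs) == 1:
        rhs = rhs * len(lhs)
    assert len(lhs) == len(rhs), "shapes are not broadcast-compatible"
    return [a + b for a, b in zip(lhs, rhs)]
```

Starting the rank check at 2 skipped exactly this case: neither operand is a scalar (both are rank 1), yet their shapes differ, so the lowering wrongly concluded the shapes could never broadcast.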
Tres Popp 81e8d778c4 Fix bug using std.rank instead of shape.rank
PiperOrigin-RevId: 339890070
2020-10-30 09:59:24 -07:00
Tres Popp 76b30fd426 Move unranked chlo lowering to transform_unranked_hlo.
Additionally:
- Forward listeners through new if/else op builders.
This corrects an error that led to incomplete legalization of broadcasted op
lowering.
- Use OpConversionPattern to ensure up to date operand values are used.
PiperOrigin-RevId: 339838833
2020-10-30 02:56:44 -07:00
Thomas Joerg 7363748bae Integrate LLVM at llvm/llvm-project@0fc1aa22ee
Updates LLVM usage to match
[0fc1aa22ee6a](https://github.com/llvm/llvm-project/commit/0fc1aa22ee6a)

PiperOrigin-RevId: 339239851
2020-10-27 06:56:16 -07:00
A. Unique TensorFlower 7f84a86cf5 [MLIR][KernelGen] Lower `tf.Atan` all the way to LLVM
PiperOrigin-RevId: 335394668
2020-10-05 05:07:13 -07:00
Chao Xie 5f303440da [MLIR][KernelGen] Lower `tf.Atan` all the way to LLVM
PiperOrigin-RevId: 334843070
2020-10-01 10:25:51 -07:00
A. Unique TensorFlower 458e861254 [MLIR][KernelGen] Lower `tf.Atan` all the way to LLVM
PiperOrigin-RevId: 334810730
2020-10-01 07:38:40 -07:00
A. Unique TensorFlower 4002077261 [MLIR][KernelGen] Lower `tf.Sinh` to MLHLO
PiperOrigin-RevId: 332425724
2020-09-18 04:27:07 -07:00
A. Unique TensorFlower 2fbbbe9cf1 [MLIR][KernelGen] Lower `tf.Acos` to LMHLO.
- Add ranked code generation for `mhlo.compare/select`
- Add bufferization for `tensor_cast`
- Add lowerings for `Atan2Op`

PiperOrigin-RevId: 332407734
2020-09-18 01:40:18 -07:00
A. Unique TensorFlower 69b80d8deb [MLIR] Extend unranked transformation to CHLO dialect
PiperOrigin-RevId: 332026604
2020-09-16 09:49:18 -07:00
A. Unique TensorFlower da43c8596b [MLIR] Simplify and generalize `transform-unranked-hlo`
This refactoring makes it possible to support a wider range of n-ary operations
in future changes.

PiperOrigin-RevId: 331953362
2020-09-16 01:13:23 -07:00
Alexander Belyaev ebc7992d31 [MLIR][KERNEL_GEN] Add a library to lower kernels with the host side.
* Unified TF->Cubin and TF->Kernel_with_host side lowering in `kernel_creator.h|cc`
* Added a pass that attaches GPU binary blob to GPUModuleOp
* Refactored most of the code.
* Added tf_to_kernel binary that emits obj file

PiperOrigin-RevId: 330494488
2020-09-08 06:06:29 -07:00
Mehdi Amini 36ddbeb6b2 Remove the dependency on global dialect registry from mlir-hlo
PiperOrigin-RevId: 328457105
2020-08-25 20:30:42 -07:00
Alexander Belyaev 843af36e05 [MLIR] Add e2e test for unranked unary TF op, lowered and run with CPU runner.
PiperOrigin-RevId: 325665428
2020-08-09 02:38:00 -07:00