This pattern only works for normal convolutions; it does not work for depthwise
convolutions. The Linalg conv ops are defined with a static rank, so the pattern
only supports the 1-D/2-D/3-D cases, which are the most typical ones.
This also refactors out the same check in the lmhlo.conv lowering.
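For reference, a minimal sketch of why the rank is static: each Linalg conv op bakes its rank into the op itself (op name and shapes below use today's Linalg naming and are illustrative):

```mlir
// 2-D convolution with the rank fixed by the op; there is no rank-generic variant.
%0 = linalg.conv_2d_nhwc_hwcf
       {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>}
       ins(%input, %filter : tensor<1x8x8x3xf32>, tensor<3x3x3x8xf32>)
       outs(%init : tensor<1x6x6x8xf32>) -> tensor<1x6x6x8xf32>
```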
PiperOrigin-RevId: 359503527
- XLA:HLO -> LMHLO conversion drops all token arguments and return values; however,
custom calls that users write still expect to get buffer pointers for these token types.
- To support this, add an optional call target argument mapping attribute to
LMHLO custom calls. When this attribute is present, it indicates the number of
arguments and results that the custom call expects, and also indicates which LMHLO
arg() or output() maps to which argument or result number of the custom call (see the sketch below).
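A hypothetical example of what such a mapped custom call could look like (attribute names and syntax here are illustrative, not the exact ODS spelling):

```mlir
// The target expects 3 args and 2 results; the positions not listed in the
// mapping were token-typed and have no backing buffers after conversion.
"lmhlo.custom_call"(%arg0, %out0) {
  call_target_name = "my_target",
  target_arg_mapping = {
    num_args = 3 : i64,
    num_results = 2 : i64,
    args_to_target_args = [2],
    results_to_target_results = [0]
  },
  operand_segment_sizes = dense<[1, 1]> : vector<2xi32>
} : (memref<8xf32>, memref<8xf32>) -> ()
```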
PiperOrigin-RevId: 358826664
This simply expands everything to ranked tensors (up to rank 6) and is probably quite slow.
It is sufficient to make kernelgen compile SelectV2.
PiperOrigin-RevId: 358777728
Verification of HLO_BroadcastInDimOp was previously failing or crashing if the
operand had a dynamic shape or was unranked. Update the verification code to
allow the operand to be unranked or have dynamic shape.
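For instance, an operand with a dynamic shape should now pass verification; a sketch:

```mlir
// Previously this could fail verification or crash; the operand shape is
// dynamic (an unranked tensor<*xf32> operand is handled the same way).
%0 = "mhlo.broadcast_in_dim"(%arg)
       {broadcast_dimensions = dense<[0, 1]> : tensor<2xi64>}
       : (tensor<?x?xf32>) -> tensor<2x3xf32>
```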
PiperOrigin-RevId: 358056793
A shape that contains exactly one element is effectively a scalar. This leads
to a speedup in cases where we have a binary op with one operand that is
effectively a scalar, because we can use the fast path.
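For example (a hypothetical chlo broadcast op; the one-element operand is treated like a scalar):

```mlir
// tensor<1x1xf32> holds exactly one element, so the scalar fast path applies.
%0 = "chlo.broadcast_add"(%lhs, %rhs)
       : (tensor<?x?xf32>, tensor<1x1xf32>) -> tensor<?x?xf32>
```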
PiperOrigin-RevId: 357515552
Also remove BitcastOp. XLA bitcast requires the input buffer to alias the output buffer, which makes bitcast always a no-op.
PiperOrigin-RevId: 356884383
This is done by removing the approximation and lowering to atan2 library calls later, making the implementation match XLA. Note that if the approximation is brought back later, it can be fixed by changing the IR check `less-than(X, 0)` to `less-than(copysign(X, 1), 0)`.
PiperOrigin-RevId: 356253941
- Use a common base class for AllReduce, AllGather, and AllToAll in the ODS spec.
- Add basic verification for replica groups attribute.
PiperOrigin-RevId: 354969654
In IREE, we use an indexed generic op to handle the initial value. Here, however,
we lower it to a generic op that carries an init_tensor, and leave the handling
of initialization to later passes.
PiperOrigin-RevId: 354294807
If mhlo.reshape is not purely collapsing some consecutive operand
dimensions into result dimensions, we generate two linalg
reshape ops for it: the first collapses all operand dimensions
into one dimension, and the second expands it to all result
dimensions. In this case, the number of collapsed/expanded dimensions
comes strictly from the operand/result. This differs from the case
where we can generate a single linalg reshape: there, the
reassociation map has rank equal to the larger of the operand
and result ranks.
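A sketch of the two-step case, written with today's op names (at the time both steps were linalg reshape ops with reassociation maps; exact syntax varies across MLIR versions):

```mlir
// mhlo.reshape tensor<2x3xf32> -> tensor<3x2xf32> is not a pure collapse,
// so it lowers to collapse-to-1D followed by an expand.
%flat = tensor.collapse_shape %arg [[0, 1]] : tensor<2x3xf32> into tensor<6xf32>
%res  = tensor.expand_shape %flat [[0, 1]] : tensor<6xf32> into tensor<3x2xf32>
```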
PiperOrigin-RevId: 354293826
This is the right behavior for TF and JAX and matches what TF does on GPU. It
doesn't match TF on CPU, but that's really a TF bug.
PiperOrigin-RevId: 353657779
This is the right behavior for TF and JAX and matches what TF does on GPU. It
doesn't match TF on CPU, but that's really a TF bug.
PiperOrigin-RevId: 353628258
This is the right behavior for TF and JAX and matches what TF does on GPU. It
doesn't match TF on CPU, but that's really a TF bug.
PiperOrigin-RevId: 353624935
We prototyped the lowering from mhlo.dot to linalg.matmul in IREE. Since Linalg
now supports matmul on tensors, we can move the lowering logic to the tensor
world and upstream it to legalize_to_linalg.cc. The patch lowers mhlo.dot to
linalg.matmul/matvec/dot on tensors.
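For the matrix-matrix case, the generated op looks roughly like this (assuming a suitably initialized %init accumulator tensor):

```mlir
// mhlo.dot with two matrix operands maps to linalg.matmul on tensors.
%0 = linalg.matmul ins(%lhs, %rhs : tensor<4x5xf32>, tensor<5x6xf32>)
                   outs(%init : tensor<4x6xf32>) -> tensor<4x6xf32>
```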
PiperOrigin-RevId: 351184911
This updates the tests to no longer rely on tensor_store. Once all users have migrated off this behavior, the tensor_store support will be removed.
PiperOrigin-RevId: 348624899
For floating point operations, this uses std.pow.
For integer operations, this lowers to a loop.
This adds a dependency on scf.
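A minimal sketch of the integer lowering, in today's arith/scf syntax (the original used the std dialect; negative exponents are ignored here):

```mlir
// x^y by repeated multiplication: fold y copies of x into an accumulator.
%one  = arith.constant 1 : i32
%lb   = arith.constant 0 : index
%step = arith.constant 1 : index
%ub   = arith.index_cast %y : i32 to index
%pow  = scf.for %i = %lb to %ub step %step iter_args(%acc = %one) -> (i32) {
  %next = arith.muli %acc, %x : i32
  scf.yield %next : i32
}
```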
PiperOrigin-RevId: 348537232
- MLIR MemRefs do not preserve layout information correctly when unit dimensions
are involved. However, operations like convolution that use cuDNN need the correct
layout to be preserved so that we do not end up creating a combination of
input/filter/output layouts that cuDNN does not support.
- Add these layouts to the convolution attributes as I32ArrayAttrs representing
the layout in "minor_to_major" form, similar to XLA.
PiperOrigin-RevId: 348034757
It didn't return 0 for 0.0 and -0.0.
Currently we emit -0.0 for -0.0, which is correct according to the HLO dialect.
For the TF_SignOp we should emit 0.0 in that case; we will leave that as a TODO.
Enable the tests which work now, and add another one for Int64.
Also improve the registration code; we should not register the Int32 kernel.
PiperOrigin-RevId: 347981124
It didn't return 0 for 0.0 and -0.0.
Currently we emit -0.0 for -0.0, which is correct according to the HLO dialect.
For the TF_SignOp we should emit 0.0 in that case; we will leave that as a TODO.
Enable the tests which work now, and add another one for Int64.
Also improve the registration code; we should not register the Int32 kernel.
PiperOrigin-RevId: 347602378
It didn't return 0 for 0.0 and -0.0.
Currently we emit -0.0 for -0.0, which is correct according to the HLO dialect.
For the TF_SignOp we should emit 0.0 in that case; we will leave that as a TODO.
Enable the tests which work now, and add another one for Int64.
Also improve the registration code; we should not register the Int32 kernel.
PiperOrigin-RevId: 347590340
- Add this attribute to match the corresponding XLA HLO attribute on convolution
operations.
- A true value indicates a reversal of the corresponding kernel spatial dimension.
- Since the XLA builder does not support this attribute, use a custom HLO converter to map
from mlir::mhlo::ConvOp to XLA.
PiperOrigin-RevId: 346891737
Two tensors sharing the same SSA value isn't sufficient to conclude equality for floating-point types, as `NaN != NaN`. As written, this causes `tf.IsNan` to [miscompile](https://github.com/google/iree/issues/4061).
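Concretely (attribute syntax from around the time of this change):

```mlir
// %x NE %x is how tf.IsNan lowers; it is true exactly when %x is NaN, so
// folding it to false just because both operands are the same SSA value
// miscompiles the op.
%p = "mhlo.compare"(%x, %x) {comparison_direction = "NE"}
       : (tensor<f32>, tensor<f32>) -> tensor<i1>
```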
PiperOrigin-RevId: 345730640
- Add a variant of the fused convolution that does not need a side input and side input scale.
- Rename the existing one to `ConvForwardFusedSideInputOp`.
- Update tests to exercise all variants of the convolution ops in the GPU dialect.
- Eliminate the unused `LHLO_ExtentBuffer` and change LHLO_Buffer to allow any integer element
type, to match what XLA can sometimes generate for scratch buffers.
PiperOrigin-RevId: 345701569
- Map Custom call for GEMM in XLA HLO to Gemm/Gemm bias operations in LHLO GPU
dialect.
- Make 'algorithm' an optional attribute to better match the XLA HLO backend config.
- Replace 'alpha' with 'alpha_real' and 'alpha_complex' to support complex GEMM correctly.
- Generate GemmThunk off of LHLO GPU Gemm operations.
PiperOrigin-RevId: 345250840
Boolean element values should be fetched as an unsigned integer, not as a signed integer, which would return -1 for true.
Added a TODO to handle unsigned types correctly as well, as we don't seem to be using unsigned types yet.
PiperOrigin-RevId: 343927564
- Restructured LHLO GPU Cholesky to better match XLA HLO by eliminating the
untyped buffer and changing the is_upper attribute to is_lower.
- Change LhloDialectEmitter to emit LHLO GPU Cholesky operation.
PiperOrigin-RevId: 343873516
- Also fixed kernel_gen embed_memref_prints to mark the inserted declarations private.
- This is in prep for proposed MLIR change to disallow public declarations.
PiperOrigin-RevId: 342081252
- Extend MHLO CustomCall to have multiple tensors as results.
- Extend LHLO CustomCall to have multiple memrefs for output operands.
- Fix HLO->LHLO and XLA HLO->LHLO mapping for CustomCall to set up the
operand_segment_sizes attribute correctly (see the sketch below).
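A hypothetical multi-output LHLO custom call (the attribute encoding of operand_segment_sizes has changed across MLIR versions):

```mlir
// Two input memrefs and two output memrefs; the segment sizes attribute
// records the split so the op can tell args from outputs.
"lmhlo.custom_call"(%in0, %in1, %out0, %out1) {
  call_target_name = "my_kernel",
  operand_segment_sizes = dense<[2, 2]> : vector<2xi32>
} : (memref<4xf32>, memref<4xf32>, memref<4xf32>, memref<4xf32>) -> ()
```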
PiperOrigin-RevId: 342067762
This specific pattern can be replaced with the shape
passed to dynamic_reshape. This is implemented as a
canonicalization on mhlo.dynamic_reshape to fit into
the existing canonicalization infrastructure.
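Presumably the pattern in question is shape-of-reshape; a sketch:

```mlir
%r = "mhlo.dynamic_reshape"(%x, %shape)
       : (tensor<*xf32>, tensor<2xindex>) -> tensor<?x?xf32>
%s = shape.shape_of %r : tensor<?x?xf32> -> tensor<2xindex>
// %s is exactly the shape passed in, so uses of %s canonicalize to %shape.
```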
PiperOrigin-RevId: 342009365
The conversion had a bug in the computation of the strides and sizes arguments for std.memref_reinterpret_cast. The previous version also relied on linalg::ReshapeOp to do broadcasting when the rank of the output was higher than the rank of the input. Now the broadcasting is done entirely via descriptor modification, and linalg::ReshapeOp has been replaced with CopyOp.
PiperOrigin-RevId: 341379871
This is to match HLO semantics and general dimension semantics in MLIR.
Also,
* Define minimal verifier for these ops.
* Add folder for SetDimensionSize op on static shaped dimension.
* Fix assumption of ranked shape in GetDimensionSize op.
PiperOrigin-RevId: 341150923
Lowerings that depended on operations between real and complex types may
not infer the correct intermediate type. Remove these operations, as they
are technically not legal to generate, and update the tests to validate
this.
PiperOrigin-RevId: 341128903
Previously this started at rank 2 after checking for scalars and equal shapes. This resulted in cases such as <1xf32> + <2xf32> being treated as impossible.
PiperOrigin-RevId: 341043965
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/44405
Splitting #43857 by top-level directories.
Copybara import of the project:
--
fa5da7d5478649d11321dcac9f867b0a57e4798a by Dmitry Volodin <mr.molkree@gmail.com>:
fix typos in compiler dir
--
4d3c9f047f7ecb8ab299f1bf28a86fd39096eee7 by Dmitry Volodin <mr.molkree@gmail.com>:
fix one test as "atleast" in it comes from Bazel
--
9440ebaaa9fc4a735f7f72f0c8f0de4ec58afbd6 by Dmitry Volodin <mr.molkree@gmail.com>:
a bit more
PiperOrigin-RevId: 340819994
The slice indices must be rank-1 and have the same number of elements as the
rank of the operand. Give reasonable error messages for violations of these
requirements instead of a misleading error message that the types of the
indices don't all match.
PiperOrigin-RevId: 340660822
Additionally:
- Forward listeners through new if/else op builders.
This corrects an error that led to incomplete legalization of broadcasted op
lowering.
- Use OpConversionPattern to ensure up-to-date operand values are used.
PiperOrigin-RevId: 339838833
Doesn't support tensors right now, as it's somewhat hairy to support both at
the same time. Since we use a generic lowering, the result is messy
and needs a mem2reg pass to eliminate extra load/store/allocas.
PiperOrigin-RevId: 339562971
If unspecified, `compare_type` is FLOAT for float element types, SIGNED for signed element types, and UNSIGNED for unsigned element types. `compare_type` can be TOTALORDER for float element types.
- Added import and export support for the attribute.
- Restricted legalization from HLO to TF to the default compare types.
- Updated existing usages of the CompareOp.
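For example (string attribute syntax from around the time of this change):

```mlir
// A float comparison with explicit total-order semantics.
%0 = "mhlo.compare"(%a, %b)
       {comparison_direction = "LT", compare_type = "TOTALORDER"}
       : (tensor<f32>, tensor<f32>) -> tensor<i1>
```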
PiperOrigin-RevId: 339099219
The lowering assumes that the 'gather' op attributes are identical in both MHLO and LMHLO. But that's not true; some time ago the MHLO version was changed to pack 4 of its attributes into a struct. By doing the same for the LMHLO version we both fix the lowering for this op and resolve a longstanding TODO.
PiperOrigin-RevId: 337943946
XLA HLO concat does not accept scalars, so fail verification if this occurs. Avoids segfault when accessing an empty output shape.
PiperOrigin-RevId: 337618167
The fusion heuristic identifies the root of a fusion by checking whether an
output of a linalg operation is a function result. It did not consider outputs
flowing through aliasing operations (like casts).
PiperOrigin-RevId: 337479910
- Introduce operations in a new lmhlo_gpu dialect that map to GPU library function calls
in the XLA:GPU backend.
- Add basic unit tests as well.
PiperOrigin-RevId: 337132166
Legalize `atan2` analogously to XLA. `atan2` is first reduced to `atan` on the
interval [-1, 1] and subsequently approximated. This CL also adds e2e tests for
trigonometric approximations.
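The standard identities behind such a reduction (the CL's exact polynomial approximation is not reproduced here):

```latex
\mathrm{atan2}(y, x) =
  \begin{cases}
    \arctan(y/x)        & x > 0 \\
    \arctan(y/x) + \pi  & x < 0,\; y \ge 0 \\
    \arctan(y/x) - \pi  & x < 0,\; y < 0 \\
    \pm \pi/2           & x = 0,\; y \gtrless 0
  \end{cases}
\qquad
\arctan(q) = \frac{\pi}{2}\,\operatorname{sign}(q) - \arctan\!\left(\tfrac{1}{q}\right)
  \quad \text{for } |q| > 1
```

with the remaining arctan evaluated on [-1, 1] by the polynomial approximation.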
PiperOrigin-RevId: 334794336
- And add conversion from MHLO CustomCall to LHLO CustomCall
- According to the XLA documentation, the called function should not be side-effecting,
so the argument MemRefs are marked as MemRead.
PiperOrigin-RevId: 334737196
- Use MLIR provided constraints for HLO_ScalarIntTensor and HLO_DimensionTensor.
- Update unit tests to expect new error messages.
PiperOrigin-RevId: 333313131
When transforming unranked binary operations from CHLO to HLO, we insert `shape.broadcast` operations. Due to context, we know that the result of the `shape.broadcast` operation has a static shape. Instead of modelling this in the type of the broadcast operation itself, which is illegal, we now use an explicit cast.
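A sketch of the idea (tensor.cast in today's syntax; the cast op has moved between dialects over time):

```mlir
%bcast = shape.broadcast %s0, %s1
           : tensor<?xindex>, tensor<?xindex> -> tensor<?xindex>
// Context guarantees the result is rank-2, so cast the extent tensor
// instead of giving shape.broadcast an illegal static result type.
%known = tensor.cast %bcast : tensor<?xindex> to tensor<2xindex>
```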
PiperOrigin-RevId: 331989879
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/43137
This PR removes the lhlo-copy-removal pass entirely and replaces its usages with `mlir::createCopyRemovalPass()`.
--
7ce1a06f507c8db46c6d7b43c7870cf56002e18e by Ehsan Toosi <ehsan.nadjaran_toosi@dfki.de>:
[mlir][lhlo] Replace lhlo-copy-removal pass with mlir-copy-removal pass
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/43137 from dfki-ehna:using_mlir_copy_removal 7ce1a06f507c8db46c6d7b43c7870cf56002e18e
PiperOrigin-RevId: 331498501
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/43069
The legalization of mhlo.ReturnOp to lhlo.TerminatorOp via BufferAssignmentReturnOpConverter fails: the memref-typed results (that is, the memref-typed operands of the return operation) are set to remain as results after legalization, but lhlo.TerminatorOp doesn't accept any operands. Therefore, BufferAssignmentReturnOpConverter must be replaced with a manual conversion that removes all operands of mhlo.ReturnOp and inserts copy operations in their place.
Copybara import of the project:
--
8be0435b0147263c3872bedec58fd215f784b450 by Ehsan Toosi <ehsan.nadjaran_toosi@dfki.de>:
[hlo] Unbreak hlo-legalize-to-lhlo test
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/43069 from dfki-ehna:fix_hlo_legalize_to_lhlo_test 8be0435b0147263c3872bedec58fd215f784b450
PiperOrigin-RevId: 330907602
Start of pass to legalize MHLO control flow to SCF for further optimization in common form. The current version just matches a very simple instance (which also happens to occur a few times). Exposes some further canonicalization opportunities that aren't yet addressed.
PiperOrigin-RevId: 329017723
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/42509
Add folder for mhlo GetDimensionSizeOp (`mhlo.get_dimension_size`).
`get_dimension_size` folds to a constant when the corresponding tensor
dimension size is statically known / constant.
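For example (the dimension attribute's exact type has varied across versions):

```mlir
%0 = "mhlo.get_dimension_size"(%arg) {dimension = 1 : i64}
       : (tensor<4x8xf32>) -> tensor<i32>
// Dimension 1 is statically 8, so this folds to:
%c = "mhlo.constant"() {value = dense<8> : tensor<i32>} : () -> tensor<i32>
```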
Copybara import of the project:
--
5994915525ec2e932125aa1f133ce2260ba100af by Uday Bondhugula <uday@polymagelabs.com>:
[MLIR] Add folder for mhlo get_dimension_size
Add folder for mhlo GetDimensionSizeOp. get_dimension_size folds to a
constant when the corresponding tensor dimension size is statically
known / constant.
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/42509 from polymage-labs:get_dimension_size_fold 5994915525ec2e932125aa1f133ce2260ba100af
PiperOrigin-RevId: 328222517
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/42508
An lmhlo.constant op on a memref that is locally allocated and has
no users other than deallocs can be deleted. Add a canonicalization
pattern for this (see the sketch below).
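A sketch of the dead pattern (memref.alloc in today's syntax; it was std.alloc at the time):

```mlir
%buf = memref.alloc() : memref<4xf32>
"lmhlo.constant"(%buf) {value = dense<0.0> : tensor<4xf32>} : (memref<4xf32>) -> ()
memref.dealloc %buf : memref<4xf32>
// %buf has no other users, so all three ops can be erased.
```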
Copybara import of the project:
--
8758c409a15f567e7cb8e1077faa020f5705c85a by Uday Bondhugula <uday@polymagelabs.com>:
[MLIR] Erase dead lmhlo.constant ops
An lmhlo.constant op on a memref that is locally allocated and has
no other users (other than deallocs) can be deleted. Add a
canonicalization pattern for this.
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/42508 from polymage-labs:lhlo_constant_erase 8758c409a15f567e7cb8e1077faa020f5705c85a
PiperOrigin-RevId: 328042416
This allows specifying a constant whose shape is only known when the operand's shape is. Also use it to update the tf.Acos legalization.
PiperOrigin-RevId: 325860604
HLO requires that the element types of all start index parameters match. Right now we don't catch this invalid case until export, so add a check in the verifier to catch it sooner.
This also requires a small tweak to the TF InplaceUpdate op lowering.
PiperOrigin-RevId: 325463796
For input and output dimensions which must match, we shouldn't fail in the case where one dim is dynamic and the other is static. This is insufficient information to conclude a dimension mismatch.
PiperOrigin-RevId: 325344738
Create safe or unsafe variants of `shape.broadcast` depending on the context.
The representation by means of an extent tensor is only legal if the operands
are known to be broadcastable. Currently, there is no use in a safe context in
the codebase but it will be used for shape inference eventually.
PiperOrigin-RevId: 325228073
Create safe or unsafe variants of `shape.broadcast` depending on the context.
The representation by means of an extent tensor is only legal if the operands
are known to be broadcastable. Currently, there is no use in a safe context in
the codebase but it will be used for shape inference eventually.
PiperOrigin-RevId: 325079842
The transformation of unranked to ranked operations no longer generates cast
operations for shapes and sizes. Instead, we use the newly introduced support
for extent tensor and index types directly.
PiperOrigin-RevId: 325057440
Create safe or unsafe variants of `shape.broadcast` depending on the context.
The representation by means of an extent tensor is only legal if the operands
are known to be broadcastable. Currently, there is no use in a safe context in
the codebase but it will be used for shape inference eventually.
PiperOrigin-RevId: 325056915
mhlo.get_tuple_element supports extracting an mhlo.token type from a tuple. This updates the creation of tuples to allow mhlo.token-typed operands.
PiperOrigin-RevId: 324628663
This is required before exporting HLO dialect ops with standard dialect constants to XLA.
Also, sink constants for sort op as well. Added a TODO to generalize this pass to handle more ops and non-const values defined outside.
PiperOrigin-RevId: 324301911
Constants of unknown shape cannot be materialized. In most cases, one likely wants to use a scalar constant and rely on broadcasting instead.
PiperOrigin-RevId: 324252475
The computation of a broadcasted shape forced the use of the shape type unnecessarily, which blocked further canonicalizations.
PiperOrigin-RevId: 323783998
This is done by reshaping the unranked tensor into a 1-D ranked tensor, which results in safe broadcast/indexing logic when the other operand is a scalar; see the sketch below.
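A sketch in today's syntax (op spellings have shifted across dialects since then):

```mlir
// Flatten the unranked operand to rank 1 using its runtime element count.
%shape = shape.shape_of %arg : tensor<*xf32> -> tensor<?xindex>
%n     = shape.num_elements %shape : tensor<?xindex> -> index
%flat_shape = tensor.from_elements %n : tensor<1xindex>
%flat  = "mhlo.dynamic_reshape"(%arg, %flat_shape)
           : (tensor<*xf32>, tensor<1xindex>) -> tensor<?xf32>
```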
PiperOrigin-RevId: 322553661
Some gathers can be interpreted as torch index selects. Transforming these
cases allows torch_index_select lowerings to be used for certain gathers; see the sketch below.
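For instance, a gather that picks whole rows by index is equivalent to (a hypothetical shape combination):

```mlir
// Select two rows of a 5x4 tensor by index along dim 0.
%0 = "mhlo.torch_index_select"(%input, %indices)
       {dim = 0 : i64, batch_dims = 0 : i64}
       : (tensor<5x4xf32>, tensor<2xi32>) -> tensor<2x4xf32>
```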
PiperOrigin-RevId: 322255835