The pattern can be generalized to also rank-specialize operations with a single
non-scalar operand. Also extract helper functions that can be reused in
subsequent specializations.
PiperOrigin-RevId: 374198381
Also cluster operations that operate on same shape operands. These implicitly
satisfy the broadcasting semantics requirement. Also, add test cases for some
cases that appear in the current MLIR-generated kernels.
PiperOrigin-RevId: 374191950
The ReduceRegion* patterns match on the same ops as the PointwiseToLinalg*
patterns, and on certain toolchains (MSVC) the order can be wrong. If the
pointwise pattern runs first, it converts the op *within* the reduction before
the reduction pattern runs, leading to nested linalg op weirdness.
PiperOrigin-RevId: 373848269
Add a pass to cluster unranked C/HLO operations in one
`chlo.rank_specialization_cluster` op. The C/HLO operations are moved to the
body of the operation. Later passes can use this to rank-specialize all these
operations together.
PiperOrigin-RevId: 373336725
This strips away the signedness with a type converter, using unrealized
conversion casts. The rest is mostly mechanically pushing the original op down
the pipeline so lowerings can see the original types.
Signed types stay signless for now. This can be changed in the HLO bridge later.
I did a pass over all ops and added unsigned lowerings where they were missing.
There may be more.
Currently the lowering will die at a later stage because it doesn't understand
the unrealized casts.
PiperOrigin-RevId: 371077494
This uses an indexed linalg.generic, which is rather awkward standalone but
allows fusing into the output of the concatenate and avoids ever materializing
it in memory. I think this is the only way to get that with the current linalg
stack; fusion across a concatenate would require more infrastructure.
PiperOrigin-RevId: 369677652
Assuming ops can only be merged if their witnesses will dominate the merged
assuming op. This is not the case if the second op's witness is a result of the
first.
PiperOrigin-RevId: 369192868
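The merge condition above can be sketched in plain Python. This is a toy model,
not the MLIR pattern: `AssumingOp` and `can_merge` are hypothetical names, and
values are represented as strings.

```python
from dataclasses import dataclass, field


@dataclass
class AssumingOp:
    """Toy stand-in for an assuming op: one witness operand, named results."""
    witness: str
    results: list = field(default_factory=list)


def can_merge(first: AssumingOp, second: AssumingOp) -> bool:
    # The merged op takes the place of `first`, so the witness of `second`
    # must already be available there. It is not if `first` produces it.
    return second.witness not in first.results


a = AssumingOp(witness="%w0", results=["%r0", "%w1"])
b = AssumingOp(witness="%w1")  # witness is a result of `a`
c = AssumingOp(witness="%w2")  # witness is defined elsewhere

print(can_merge(a, b), can_merge(a, c))
```

Here `a` and `b` must stay separate, while `a` and `c` are candidates for
merging.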
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/47315
Lowering of `concatenateOp` is added from lmhlo to Affine. The lowering
has been added as a part of `lhlo-legalize-to-affine` pass.
Signed-off-by: Prashant Kumar <prashantk@polymagelabs.com>
Copybara import of the project:
--
15314e4579f7a6901cf3475eff25962a34772eaf by Prashant Kumar <prashantk@polymagelabs.com>:
[MLIR] Add concatenateOp lowering from lmhlo to Affine.
PiperOrigin-RevId: 368465992
The pattern does not support ops with non-zero padding config. Add a check to
prevent unexpected lowering.
It is not easy to add tests because other patterns will convert the body ops,
which causes issues like invalid IR.
PiperOrigin-RevId: 367202450
We now use the same special cases for all ops with arity >= 2.
For binary ops, we now have only one special case: if at least one of the
operands has exactly one element, we reshape both operands to rank 1. Before,
we had separate special cases depending on whether the left-hand side or the
right-hand side had a scalar shape.
PiperOrigin-RevId: 366005835
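A minimal sketch of the binary special case described above, on static shapes
in plain Python. `specialize_binary` and the case labels are hypothetical; the
real pass works on unranked tensors at runtime and has further cases (such as
equal shapes) that are not modeled here.

```python
from math import prod


def specialize_binary(lhs_shape, rhs_shape):
    """Return (case, new_lhs_shape, new_rhs_shape) for a binary op."""
    lhs_n, rhs_n = prod(lhs_shape), prod(rhs_shape)
    if lhs_n == 1 or rhs_n == 1:
        # At least one operand holds exactly one element, regardless of which
        # side it is on: flatten both operands to rank 1.
        return "one_element", [lhs_n], [rhs_n]
    # Everything else; the remaining special cases are not modeled.
    return "general", list(lhs_shape), list(rhs_shape)


print(specialize_binary([], [2, 3]))
print(specialize_binary([4], [1, 1]))
```

Note that a rank-0 shape `[]` and a shape like `[1, 1]` take the same path,
since both contain exactly one element.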
When an op is moved out of an assuming region we already know statically that it
is independent of the assuming region. Hence, there is no need to yield its
results.
PiperOrigin-RevId: 366001405
Add a pattern to move operations out of an assuming op. This is only valid for
constraint-independent ops, like `cstr_broadcastable` and `shape_of`. It will
eventually allow making assuming regions' constraints independent of each
other so that they can be merged.
PiperOrigin-RevId: 365993145
We can use it also for ternary ops like Select if we change the signature so
that a ValueRange is passed in.
Also remove special casing for HloComplexAdaptor. It can be handled with the
generic adaptor as well.
PiperOrigin-RevId: 365777493
We only need the memref_reinterpret_cast if we don't know whether a dimension
gets expanded or not. With static shapes we know that a dimension can only be
expanded if it's a static 1, so lower it in the same way we lower fully
static broadcasts.
PiperOrigin-RevId: 363859181
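The static-shape reasoning above can be illustrated in plain Python. This is a
sketch, not the lowering itself: `expanded_dims` is a hypothetical helper, and
`broadcast_dimensions` is assumed to map each operand dimension to a result
dimension.

```python
def expanded_dims(operand_shape, result_shape, broadcast_dimensions):
    """Operand dims that get expanded by a fully static broadcast."""
    expanded = []
    for operand_dim, result_dim in enumerate(broadcast_dimensions):
        in_size = operand_shape[operand_dim]
        out_size = result_shape[result_dim]
        if in_size == 1 and out_size != 1:
            # Only a static 1 can be stretched to a larger extent.
            expanded.append(operand_dim)
        elif in_size != out_size:
            raise ValueError("shapes are not broadcast-compatible")
    return expanded


print(expanded_dims([1, 3], [2, 4, 3], [1, 2]))
print(expanded_dims([4, 3], [2, 4, 3], [1, 2]))
```

With all sizes known, the set of expanded dimensions is fixed at compile time,
so nothing has to be decided at runtime.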
This is the same as iota, but instead of taking the dimensions from the result
tensor we use the supplied shape extents tensor.
PiperOrigin-RevId: 362298548
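To make the relation to iota concrete, here is a small pure-Python model using
nested lists as tensors. The function names and the `iota_dimension` parameter
are chosen for illustration; this is not the lowering itself.

```python
def iota(shape, iota_dimension):
    """Nested-list tensor whose values count up along `iota_dimension`."""
    def build(dim, index):
        if dim == len(shape):
            return index
        return [build(dim + 1, i if dim == iota_dimension else index)
                for i in range(shape[dim])]
    return build(0, 0)


def dynamic_iota(shape_extents, iota_dimension):
    # Same computation; the extents arrive as a runtime value instead of
    # being read off the result type.
    return iota(list(shape_extents), iota_dimension)


print(dynamic_iota((2, 3), 1))
print(dynamic_iota((2, 3), 0))
```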
This is an annoying edge case because the collapse->expand lowering expects at
least R1 or it will produce invalid linalg reshapes. Using the direct lowering
works fine.
PiperOrigin-RevId: 362269199
The conversion from dot_general to dot fails when trying to retrieve
and use the precision config, since precision_config is optional.
PiperOrigin-RevId: 362095296
For now, the pass only reifies the required shape computations. Moving
broadcasts will follow to allow for fusion across them.
PiperOrigin-RevId: 362033715
Return nan at zeta poles or inf where the limit is defined. Also test the kernel
based on the series representation of zeta.
PiperOrigin-RevId: 361993482
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/46723
Reduces some warnings about comparison of integers of different signs.
Copybara import of the project:
--
311f436f77b334f5462127d8cf179cce067969ca by Marius Brehler <marius.brehler@iml.fraunhofer.de>:
Adjust types of loop counters
PiperOrigin-RevId: 360912203
For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.
PiperOrigin-RevId: 360891955
For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.
PiperOrigin-RevId: 360881387
The linalg named ops are now type polymorphic, so the type-monomorphic
varieties are redundant (and will be deleted soon).
PiperOrigin-RevId: 360509010
This pattern only works for normal convolutions. It does not work for depthwise
convolutions. The Linalg conv ops are defined with static rank, so it only
supports 1d/2d/3d cases, which are the most typical cases.
This also refactors out the same check in lmhlo.conv lowering.
PiperOrigin-RevId: 359503527
This just blows up everything to ranked (up to 6) and is probably quite slow.
This is sufficient to make kernelgen compile SelectV2.
PiperOrigin-RevId: 358777728
A shape that contains exactly one element is effectively a scalar. This leads
to a speedup in cases where we have a binary op with one operand that is
effectively a scalar, because we can use the fast path.
PiperOrigin-RevId: 357515552
This is being done by just removing the approximation and lowering to atan2 lib
calls later, to make the implementation the same as XLA. Note that if the
approximation is brought back later, it can be fixed by changing the IR check
`less-than(X, 0)` to `less-than(copysign(X, 1), 0)`.
PiperOrigin-RevId: 356253941
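The difference between the two checks shows up for negative zero, which is
presumably the input the approximation got wrong (an assumption; the message
does not name it). A short Python illustration follows. Note that Python's
`math.copysign` takes the magnitude first and the sign source second, and the
helper names are hypothetical.

```python
import math


def is_negative_naive(x):
    # Plain comparison: -0.0 compares equal to 0.0, so this misses it.
    return x < 0


def is_negative_sign_aware(x):
    # Transfer the sign bit of x onto 1.0, then compare.
    return math.copysign(1.0, x) < 0


x = -0.0
print(is_negative_naive(x), is_negative_sign_aware(x))
print(math.atan2(0.0, 0.0), math.atan2(0.0, -0.0))
```

The library `atan2` distinguishes the two zeros, so a hand-written version
that branches on `x < 0` alone cannot agree with it at `x = -0.0`.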
In IREE, we use an indexed generic op to handle the initial value. Here,
however, we lower it to a generic op that carries an init_tensor, and leave
handling of the initialization to later passes.
PiperOrigin-RevId: 354294807
If mhlo.reshape is not purely collapsing some consecutive operand
dimensions into result dimensions, we will generate two linalg
reshape ops for it: the first one collapses all operand dimensions
into one dimension, and the second one expands it to all result
dimensions. For this case, the number of collapsed/expanded dimensions
should be coming strictly from the operand/result. It is different
from the case where we can generate one linalg reshape. For that case,
the reassociation map should have rank equal to the largest among
operand/result shape.
PiperOrigin-RevId: 354293826
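The two cases above can be sketched on static shapes in plain Python. This is
an illustrative model only: `collapse_groups` and `lower_reshape` are
hypothetical names, reassociation maps are shown as lists of dimension-index
groups, and treating a pure expansion as the mirror image of a pure collapse
is an assumption.

```python
def collapse_groups(src, dst):
    """Group consecutive dims of `src` so that each group multiplies to one
    dim of `dst`. Returns None if `dst` is not a pure collapse of `src`."""
    groups, i = [], 0
    for d in dst:
        group, size = [], 1
        while i < len(src) and (not group or size != d):
            size *= src[i]
            group.append(i)
            i += 1
            if size > d:
                return None
        if not group or size != d:
            return None
        groups.append(group)
    while i < len(src):  # trailing unit dims join the last group
        if src[i] != 1 or not groups:
            return None
        groups[-1].append(i)
        i += 1
    return groups


def lower_reshape(src, dst):
    groups = collapse_groups(src, dst)
    if groups is not None:
        return [("collapse", groups)]  # one reshape, map over operand dims
    groups = collapse_groups(dst, src)
    if groups is not None:
        return [("expand", groups)]    # one reshape, map over result dims
    # General case: collapse everything to one dim, then expand again.
    # Each map covers exactly the operand rank or the result rank.
    return [("collapse", [list(range(len(src)))]),
            ("expand", [list(range(len(dst)))])]


print(lower_reshape([2, 3, 4], [6, 4]))
print(lower_reshape([2, 3, 4], [4, 6]))
```

In the single-reshape case the map spans the higher-rank side, while in the
two-reshape case the collapse map spans the operand rank and the expand map
spans the result rank.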