For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.
PiperOrigin-RevId: 360891955
For binary ops, we already special-case rank 0 vs rank 1, and same shape. So we
don't need to special-case a maximum rank of 1.
PiperOrigin-RevId: 360881387
The linalg named ops are now type polymorphic, so the type-monomorphic
varieties are redundant (and will be deleted soon).
PiperOrigin-RevId: 360509010
Previously this would be too strict and fail if dynamic and static dims were
compared. Dynamic/unknown are treated as "maybe equal" to a static value without further info, so at this layer don't flag as invalid unless truly are.
PiperOrigin-RevId: 360189086
This op is useful for rank specialization of broadcasts. Kernel Generator
needs to generate one kernel for each rank, so if we can minimize the rank
of the broadcast shape, we can support more cases with the same number of
special-cased kernels.
PiperOrigin-RevId: 360137827
This pattern only works for normal convolutions. It does not work for depthwise
convolutions. The Linalg conv ops are defined with static rank, so it only
supports 1d/2d/3d cases, which are the most typical cases.
This also refactors out the same check in lmhlo.conv lowering.
PiperOrigin-RevId: 359503527
- XLA:HLO -> LMHLO conversion drops all token arguments and return values, however
custom calls that users write still expect to get buffer pointers for these token types.
- To be able to support this, add an optional call target argument mapping attribute to
LMHLO custom calls. When this attribute is present, it indicates the number of
arguments and returns that the custom call expects and also indicates which LMHLO
arg() or output() maps to which arg or result number of the custom call.
PiperOrigin-RevId: 358826664
This just blows up everything to ranked (up to 6) and is probably quite slow.
This is sufficient to make kernelgen compile SelectV2.
PiperOrigin-RevId: 358777728
Verification of HLO_BroadcastInDimOp was previously failing or crashing if the
operand had a dynamic shape or was unranked. Update the verification code to
allow the operand to be unranked or have dynamic shape.
PiperOrigin-RevId: 358056793
A shape that contains exactly one element is effectively a scalar. This leads
to a speedup in cases where we have a binary op with one operand that is
effectively a scalar, because we can use the fast path.
PiperOrigin-RevId: 357515552
Also remove BitcastOp. XLA bitcast requires the input buffer to alias the output buffer, which makes bitcast always a no-op.
PiperOrigin-RevId: 356884383
This is being done by just removing the approximation and lowering to atan2 lib calls later to make the implementation the same as XLA. Note that if the approximation is brought back later, it can be fixed by changing the IR checking `less-than(X, 0)` to `less-than(copysign(X, 1), 0)`
PiperOrigin-RevId: 356253941
- Use a common base class to for AllReduce, AllGather, and AllToAll in the ODS spec.
- Add basic verification for replica groups attribute.
PiperOrigin-RevId: 354969654
In IREE, we use indexed generic op to handle the initial value. However, we
lower it to a generic op that carries an init_tensor here, and leave the handle
of initialization problem to later passes.
PiperOrigin-RevId: 354294807
If mhlo.reshape is not purely collapsing some consecutive operand
dimensions into result dimensions, we will generate two linalg
reshape op for it: the first one collapses all operand dimensions
into one dimension, and the second one expands it to all result
dimensions. For this case, the number of collapsed/expanded dimensions
should be coming strictly from the operand/result. It is different
from the case where we can generate one linalg reshape. For that case,
the reassociation map should have rank equal to the largest among
operand/result shape.
PiperOrigin-RevId: 354293826
This is the right behavior for TF and JAX and matches what TF does on GPU. It
doesn't match TF on CPU, but that's really a TF bug.
PiperOrigin-RevId: 353657779