The ‘shard’ dialect defines a set of attributes, operations, and interfaces for working with tensor sharding and device communication. It is inspired by GSPMD (General and Scalable Parallelization for ML Computation Graphs). Originally the dialect was called `mesh`, but it was renamed to better reflect what it actually does.

- Collective Communication Operations
  - Device Groups
  - In-group Devices
  - Purity and Execution Model
- Operations
  - `shard.all_gather` (shard::AllGatherOp)
  - `shard.all_reduce` (shar...