s390:

* More phys_to_virt conversions

* Improvement of AP management for VSIE (nested virtualization)

ARM64:

* Numerous fixes for the pathological lock inversion issue that
  plagued KVM/arm64 since... forever.

* New framework allowing SMCCC-compliant hypercalls to be forwarded
  to userspace, hopefully paving the way for some more features
  being moved to VMMs rather than being implemented in the kernel.

* Large rework of the timer code to allow a VM-wide offset to be
  applied to both virtual and physical counters as well as a
  per-timer, per-vcpu offset that complements the global one.
  This last part allows the NV timer code to be implemented on
  top.

* A small set of fixes to make sure that we don't change anything
  affecting the EL1&0 translation regime just after having
  taken an exception to EL2 until we have executed a DSB. This
  ensures that speculative walks started in EL1&0 have completed.

* The usual selftest fixes and improvements.
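
As a rough illustration of the timer offset item above, the two offsets
can be thought of as adding up before being subtracted from the host
counter. This is only a sketch under that assumption: the struct and
function names below are hypothetical and do not reflect the actual
KVM/arm64 data structures.

```c
#include <stdint.h>

/* Hypothetical container for the two offsets; not a kernel structure. */
struct timer_offsets {
	uint64_t vm_offset;	/* assumed: shared by every vCPU in the VM */
	uint64_t vcpu_offset;	/* assumed: private to one timer on one vCPU */
};

/* Total offset for one timer: the global and per-timer parts add up. */
static inline uint64_t timer_total_offset(const struct timer_offsets *o)
{
	return o->vm_offset + o->vcpu_offset;
}

/* Counter value seen by the guest; unsigned arithmetic wraps modulo 2^64. */
static inline uint64_t guest_counter(uint64_t host_counter,
				     const struct timer_offsets *o)
{
	return host_counter - timer_total_offset(o);
}
```

With a VM-wide offset of 100 and a per-timer offset of 20, a host counter
of 1000 would be presented to the guest as 880 in this model.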

KVM x86 changes for 6.4:

* Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
  and by giving the guest control of CR0.WP when EPT is enabled on VMX
  (VMX-only because SVM doesn't support per-bit controls)

* Add CR0/CR4 helpers to query single bits, and clean up related code
  where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
  as a bool

* Move AMD_PSFD to cpufeatures.h and purge KVM's definition

* Avoid unnecessary writes+flushes when the guest is only adding new PTEs

* Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
  when emulating invalidations

* Clean up the range-based flushing APIs

* Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
  A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
  changed SPTE" overhead associated with writing the entire entry

* Track the number of "tail" entries in a pte_list_desc to avoid having
  to walk (potentially) all descriptors during insertion and deletion,
  which gets quite expensive if the guest is spamming fork()

* Disallow virtualizing legacy LBRs if architectural LBRs are available,
  as the two are mutually exclusive in hardware

* Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
  after KVM_RUN, similar to CPUID features

* Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES

* Apply PMU filters to emulated events and add test coverage to the
  pmu_event_filter selftest
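
To illustrate why dedicated single-bit CR0/CR4 helpers are preferable to
interpreting a masked register value directly, here is a minimal sketch.
All names are hypothetical stand-ins, uint64_t is used in place of
"unsigned long" for portability, and the narrowing shown is a generic
example of a fragile implicit conversion, not a bug taken from KVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accessor: returns the masked register value, not a bool. */
static inline uint64_t read_reg_bits(uint64_t reg, uint64_t mask)
{
	return reg & mask;
}

/* Single-bit query: collapse the masked value to an explicit true/false. */
static inline bool is_reg_bit_set(uint64_t reg, uint64_t bit)
{
	return !!read_reg_bits(reg, bit);
}

/*
 * The fragile pattern: storing the masked value in a narrower integer
 * silently drops any bit above bit 31, so a set bit can read as "clear".
 */
static inline uint32_t narrowed_bit_check(uint64_t reg, uint64_t bit)
{
	return (uint32_t)read_reg_bits(reg, bit);
}
```

A helper that returns bool makes the caller's intent explicit and does
not depend on the position of the bit being queried.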

x86 AMD:

* Add support for virtual NMIs

* Fixes for edge cases related to virtual interrupts

x86 Intel:

* Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
  not being reported due to userspace not opting in via prctl()

* Fix a bug in emulation of ENCLS in compatibility mode

* Allow emulation of NOP and PAUSE for L2

* AMX selftest improvements

* Misc cleanups

MIPS:

* Constify MIPS's internal callbacks (a leftover from the hardware enabling
  rework that landed in 6.3)

Generic:

* Drop unnecessary casts from "void *" throughout kvm_main.c

* Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
  size by 8 bytes on 64-bit kernels by utilizing a padding hole
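
The padding-hole effect can be sketched with two hypothetical layouts.
The field names below are illustrative only and are not the real members
of "struct kvm_mmu_memory_cache"; the sizes in the comments assume a
typical LP64 ABI with 4-byte int and 8-byte, 8-byte-aligned pointers.

```c
#include <stddef.h>

/* An int on each side of a pointer leaves two 4-byte padding holes. */
struct layout_before {
	int nobjs;	/* 4 bytes, then 4 bytes of padding */
	void *objects;	/* 8 bytes */
	int capacity;	/* 4 bytes, then 4 bytes of tail padding */
};			/* 24 bytes on LP64 */

/* Placing the second int in the first hole removes both holes. */
struct layout_after {
	int nobjs;	/* 4 bytes */
	int capacity;	/* 4 bytes, fills the former hole */
	void *objects;	/* 8 bytes */
};			/* 16 bytes on LP64 */
```

On 32-bit ABIs with 4-byte pointers the two layouts are the same size,
which matches the saving being specific to 64-bit kernels.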

Documentation:

* Fix goof introduced by the conversion to rST

Merge tag 'kvm-x86-vmx-6.4' of https://github.com/kvm-x86/linux into HEAD

KVM VMX changes for 6.4:

 - Fix a bug in emulation of ENCLS in compatibility mode

 - Allow emulation of NOP and PAUSE for L2

 - Misc cleanups