As network data rates advance toward 1 Tb/s, hardware-based implementations of anti-replay offer desirable tradeoffs over software. However, internal logic busses in FPGAs are becoming wider (512+ bits) and segmented (more than one packet per clock cycle) to accommodate increased network data rates. Such busses are challenging for applications such as anti-replay that require read-modify-write operations to a coherent database on each packet arrival. In this paper we present an FPGA-Targeted pipelined anti-replay design capable of accommodating 1024 IPsec tunnels at 1 Tb/s data rate. The novel design is enabled by fast on-chip block RAMs in a xcvu190 Virtex Ultrascale FPGA that are used to construct a 20-port RAM memory operating at 400 MHz with over 5 Tb/s of peak bandwidth. Custom single-clock write-combining techniques are described that accommodate multiple concurrent updates to the same database address. We also investigate the limits of capacity and concurrency for the anti-replay application.