[2026-04-08 07:52:44.651374 INFO duck_llm] This is an info log
[2026-04-08 07:52:44.651404 WARN duck_llm] This is a warning log
[2026-04-08 07:52:44.651406 ERROR duck_llm] This is an error log
[2026-04-08 07:52:44.651606 INFO utils] Selected DPDK lcores: master=0, workers=[2, 4, 6, 8], all_performance_core_representatives=[0, 2, 4, 6, 8, 10, 12, 14]
EAL: Detected CPU lcores: 32
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
EAL: Using IOMMU type 1 (Type 1)
ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.36.0, ICE OS Default Package (single VLAN mode)
ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.36.0, ICE OS Default Package (single VLAN mode)
ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.36.0, ICE OS Default Package (single VLAN mode)
ICE_INIT: ice_load_pkg_type(): Active package is: 1.3.36.0, ICE OS Default Package (single VLAN mode)
[2026-04-08 07:52:46.724585 INFO dpdk_workers] DPDK initialized successfully. Found 4 ports.
[2026-04-08 07:52:46.724601 INFO dpdk_workers] Port 0 device name: 0000:01:00.0
[2026-04-08 07:52:46.724603 INFO dpdk_workers] Port 0 IP address: 10.21.1.1
[2026-04-08 07:52:46.724605 INFO dpdk_workers] Port 0 Broadcast address: 10.21.1.255
[2026-04-08 07:52:46.724607 INFO dpdk_workers] Port 1 device name: 0000:01:00.1
[2026-04-08 07:52:46.724609 INFO dpdk_workers] Port 1 IP address: 10.21.2.1
[2026-04-08 07:52:46.724611 INFO dpdk_workers] Port 1 Broadcast address: 10.21.2.255
[2026-04-08 07:52:46.724612 INFO dpdk_workers] Port 2 device name: 0000:01:00.2
[2026-04-08 07:52:46.724614 INFO dpdk_workers] Port 2 IP address: 10.21.3.1
[2026-04-08 07:52:46.724615 INFO dpdk_workers] Port 2 Broadcast address: 10.21.3.255
[2026-04-08 07:52:46.724617 INFO dpdk_workers] Port 3 device name: 0000:01:00.3
[2026-04-08 07:52:46.724619 INFO dpdk_workers] Port 3 IP address: 10.21.4.1
[2026-04-08 07:52:46.724620 INFO dpdk_workers] Port 3 Broadcast address: 10.21.4.255
[2026-04-08 07:52:46.724622 INFO dpdk_workers] Available netifs list: [(10.21.1.255, 0, 10.21.1.1), (10.21.2.255, 1, 10.21.2.1), (10.21.3.255, 2, 10.21.3.1), (10.21.4.255, 3, 10.21.4.1)]
[2026-04-08 07:52:46.724627 INFO dpdk_workers] Starting worker #0: (bcast_ip: 10.21.1.255, port_id: 0, lcore_id: 2, host_ip: 10.21.1.1)
[2026-04-08 07:52:46.724673 INFO dpdk_workers] Initializing worker port 0 on lcore 2...
[2026-04-08 07:52:46.726781 INFO dpdk_workers] Starting worker #1: (bcast_ip: 10.21.2.255, port_id: 1, lcore_id: 4, host_ip: 10.21.2.1)
[2026-04-08 07:52:46.726811 INFO dpdk_workers] Starting worker #2: (bcast_ip: 10.21.3.255, port_id: 2, lcore_id: 6, host_ip: 10.21.3.1)
[2026-04-08 07:52:46.726825 INFO dpdk_workers] Starting worker #3: (bcast_ip: 10.21.4.255, port_id: 3, lcore_id: 8, host_ip: 10.21.4.1)
[2026-04-08 07:52:46.726856 INFO dpdk_workers] Initializing worker port 1 on lcore 4...
[2026-04-08 07:52:46.728805 INFO dpdk_workers] Initializing worker port 2 on lcore 6...
[2026-04-08 07:52:46.730791 INFO dpdk_workers] Initializing worker port 3 on lcore 8... ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 0). ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 3). ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 1). ICE_DRIVER: ice_set_rx_function(): Using Vector AVX2 (port 2). [2026-04-08 07:52:50.069443 INFO dpdk_workers] Worker port 0 initialized successfully. [2026-04-08 07:52:50.070348 INFO dpdk_workers] Worker port 1 initialized successfully. [2026-04-08 07:52:50.940220 INFO dpdk_workers] Worker port 2 initialized successfully. [2026-04-08 07:52:51.750697 INFO dpdk_workers] Worker port 3 initialized successfully. [2026-04-08 07:52:51.750727 INFO dpdk_workers] Workers initialized successfully. 4 workers running. [2026-04-08 07:52:51.751017 INFO utils] Binding master thread to cores (excluding workers): [0, 1, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] [2026-04-08 07:52:51.751025 INFO utils] set_thread_affinity(tid 1357592, cores [0, 1, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]): 0 [2026-04-08 07:52:51.751771 INFO dpdk_workers] Run command Ping all time: send 1.0 us, recv 737.0 us [2026-04-08 07:52:51.801831 INFO dpdk_workers] Run command Ping all time: send 0.4 us, recv 0.5 us [2026-04-08 07:52:51.851886 INFO dpdk_workers] Run command Ping all time: send 0.2 us, recv 0.5 us [2026-04-08 07:52:51.901942 INFO dpdk_workers] Run command Ping all time: send 0.2 us, recv 0.4 us [2026-04-08 07:52:51.951997 INFO dpdk_workers] Run command Ping all time: send 0.3 us, recv 0.4 us [2026-04-08 07:52:52.002054 INFO dpdk_workers] Run command Ping all time: send 0.2 us, recv 0.4 us [2026-04-08 07:52:52.052110 INFO dpdk_workers] Run command Ping all time: send 0.3 us, recv 0.3 us [2026-04-08 07:52:52.102166 INFO dpdk_workers] Run command Ping all time: send 0.3 us, recv 0.4 us [2026-04-08 07:52:52.152222 
INFO dpdk_workers] Run command Ping all time: send 0.2 us, recv 0.5 us [2026-04-08 07:52:52.202277 INFO dpdk_workers] Run command Ping all time: send 0.3 us, recv 0.4 us [2026-04-08 07:52:52.252344 INFO dpdk_workers] Found 32 ducks in duck-ips-multi-netifs.txt [2026-04-08 07:52:52.252347 INFO dpdk_workers] Duck #0: 10.21.1.101 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252350 INFO dpdk_workers] Duck #1: 10.21.1.102 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252352 INFO dpdk_workers] Duck #2: 10.21.1.103 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252354 INFO dpdk_workers] Duck #3: 10.21.1.104 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252356 INFO dpdk_workers] Duck #4: 10.21.1.105 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252358 INFO dpdk_workers] Duck #5: 10.21.1.106 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252359 INFO dpdk_workers] Duck #6: 10.21.1.107 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252361 INFO dpdk_workers] Duck #7: 10.21.1.108 (bcast_ip: 10.21.1.255) [2026-04-08 07:52:52.252363 INFO dpdk_workers] Duck #8: 10.21.2.101 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252365 INFO dpdk_workers] Duck #9: 10.21.2.102 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252367 INFO dpdk_workers] Duck #10: 10.21.2.103 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252369 INFO dpdk_workers] Duck #11: 10.21.2.104 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252371 INFO dpdk_workers] Duck #12: 10.21.2.105 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252373 INFO dpdk_workers] Duck #13: 10.21.2.106 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252375 INFO dpdk_workers] Duck #14: 10.21.2.107 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252377 INFO dpdk_workers] Duck #15: 10.21.2.108 (bcast_ip: 10.21.2.255) [2026-04-08 07:52:52.252378 INFO dpdk_workers] Duck #16: 10.21.3.101 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252380 INFO dpdk_workers] Duck #17: 10.21.3.102 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252382 INFO dpdk_workers] Duck #18: 10.21.3.103 
(bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252384 INFO dpdk_workers] Duck #19: 10.21.3.104 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252386 INFO dpdk_workers] Duck #20: 10.21.3.105 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252388 INFO dpdk_workers] Duck #21: 10.21.3.106 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252390 INFO dpdk_workers] Duck #22: 10.21.3.107 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252392 INFO dpdk_workers] Duck #23: 10.21.3.108 (bcast_ip: 10.21.3.255) [2026-04-08 07:52:52.252394 INFO dpdk_workers] Duck #24: 10.21.4.101 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252395 INFO dpdk_workers] Duck #25: 10.21.4.102 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252397 INFO dpdk_workers] Duck #26: 10.21.4.103 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252399 INFO dpdk_workers] Duck #27: 10.21.4.104 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252401 INFO dpdk_workers] Duck #28: 10.21.4.105 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252403 INFO dpdk_workers] Duck #29: 10.21.4.106 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252405 INFO dpdk_workers] Duck #30: 10.21.4.107 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.252409 INFO dpdk_workers] Duck #31: 10.21.4.108 (bcast_ip: 10.21.4.255) [2026-04-08 07:52:52.254156 INFO dpdk_workers] [Worker 0]: 10.21.1.101 [2026-04-08 07:52:52.254158 INFO dpdk_workers] [Worker 0]: 10.21.1.102 [2026-04-08 07:52:52.254160 INFO dpdk_workers] [Worker 0]: 10.21.1.103 [2026-04-08 07:52:52.254161 INFO dpdk_workers] [Worker 0]: 10.21.1.104 [2026-04-08 07:52:52.254163 INFO dpdk_workers] [Worker 0]: 10.21.1.105 [2026-04-08 07:52:52.254164 INFO dpdk_workers] [Worker 0]: 10.21.1.106 [2026-04-08 07:52:52.254166 INFO dpdk_workers] [Worker 0]: 10.21.1.107 [2026-04-08 07:52:52.254167 INFO dpdk_workers] [Worker 0]: 10.21.1.108 [2026-04-08 07:52:52.254170 INFO dpdk_workers] [Worker 1]: 10.21.2.101 [2026-04-08 07:52:52.254172 INFO dpdk_workers] [Worker 1]: 10.21.2.102 [2026-04-08 07:52:52.254173 INFO dpdk_workers] [Worker 
1]: 10.21.2.103 [2026-04-08 07:52:52.254175 INFO dpdk_workers] [Worker 1]: 10.21.2.104 [2026-04-08 07:52:52.254176 INFO dpdk_workers] [Worker 1]: 10.21.2.105 [2026-04-08 07:52:52.254178 INFO dpdk_workers] [Worker 1]: 10.21.2.106 [2026-04-08 07:52:52.254179 INFO dpdk_workers] [Worker 1]: 10.21.2.107 [2026-04-08 07:52:52.254181 INFO dpdk_workers] [Worker 1]: 10.21.2.108 [2026-04-08 07:52:52.254197 INFO dpdk_workers] [Worker 2]: 10.21.3.101 [2026-04-08 07:52:52.254198 INFO dpdk_workers] [Worker 2]: 10.21.3.102 [2026-04-08 07:52:52.254200 INFO dpdk_workers] [Worker 2]: 10.21.3.103 [2026-04-08 07:52:52.254201 INFO dpdk_workers] [Worker 2]: 10.21.3.104 [2026-04-08 07:52:52.254203 INFO dpdk_workers] [Worker 2]: 10.21.3.105 [2026-04-08 07:52:52.254205 INFO dpdk_workers] [Worker 2]: 10.21.3.106 [2026-04-08 07:52:52.254206 INFO dpdk_workers] [Worker 2]: 10.21.3.107 [2026-04-08 07:52:52.254208 INFO dpdk_workers] [Worker 2]: 10.21.3.108 [2026-04-08 07:52:52.352701 INFO dpdk_workers] [Worker 3]: 10.21.4.101 [2026-04-08 07:52:52.352716 INFO dpdk_workers] [Worker 3]: 10.21.4.102 [2026-04-08 07:52:52.352727 INFO dpdk_workers] [Worker 3]: 10.21.4.103 [2026-04-08 07:52:52.352737 INFO dpdk_workers] [Worker 3]: 10.21.4.104 [2026-04-08 07:52:52.352748 INFO dpdk_workers] [Worker 3]: 10.21.4.105 [2026-04-08 07:52:52.352750 INFO dpdk_workers] [Worker 3]: 10.21.4.106 [2026-04-08 07:52:52.352752 INFO dpdk_workers] [Worker 3]: 10.21.4.107 [2026-04-08 07:52:52.352754 INFO dpdk_workers] [Worker 3]: 10.21.4.108 [2026-04-08 07:52:52.352757 INFO dpdk_workers] init_ducks done [2026-04-08 07:52:52.358839 INFO dpdk_ducks] Initialized 4 DPDK duck workers [2026-04-08 07:52:52.358842 INFO dpdk_ducks] DPDK duck worker 0: DpdkDuckWorker { worker_idx: 0, ducks: [DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { 
buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }], all_ranks: [0, 1, 2, 3, 4, 5, 6, 7], tp_rank_range: (0, 8) } [2026-04-08 07:52:52.358846 INFO dpdk_ducks] DPDK duck worker 1: DpdkDuckWorker { worker_idx: 1, ducks: [DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }], all_ranks: [0, 1, 2, 3, 4, 5, 6, 7], tp_rank_range: (8, 16) } [2026-04-08 07:52:52.358849 INFO dpdk_ducks] DPDK duck worker 2: DpdkDuckWorker { worker_idx: 2, ducks: [DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }], all_ranks: [0, 1, 2, 3, 4, 5, 6, 7], tp_rank_range: (16, 24) } [2026-04-08 07:52:52.358851 INFO dpdk_ducks] DPDK duck worker 3: DpdkDuckWorker { worker_idx: 3, ducks: [DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }, DpdkDuck { buffer_size: 32212254720 }], all_ranks: [0, 1, 2, 3, 4, 5, 6, 7], tp_rank_range: (24, 32) } [2026-04-08 07:52:52.358856 INFO buffer_manager] Initializing buffer manager [2026-04-08 07:52:52.358858 INFO buffer_manager] Buffer manager initialized: ELF BufferAllocator { begin: 0, end: 10485760, current: 0 }, input BufferAllocator { begin: 10485760, end: 104857600, current: 10485760 }, weights BufferAllocator { begin: 104923136, end: 32212254720, current: 104923136 } [2026-04-08 07:52:52.358862 INFO fp8_dpdk_common] 
fp9 persistent judge enabled by default; set DUCK_FP9_PERSISTENT_JUDGE=0 to disable [2026-04-08 07:52:52.359271 INFO buffer_manager] Added kernel fp9_kernels at (0, 91664) [2026-04-08 07:52:52.359305 INFO fp8_dpdk_common] fp9 persistent judge: opened 32 sessions [2026-04-08 07:52:52.359308 INFO fp8_dpdk_common] fp9 persistent judge: force-opened 32 fresh sessions for new init [2026-04-08 07:52:52.359310 INFO fp8_mlp_dpdk] fp8_mlp_dpdk: init(tp_size=32) [2026-04-08 07:52:52.359312 INFO fp8_moe_dpdk] fp8_moe_dpdk: init(tp_size=32) [2026-04-08 07:52:52.745417 INFO weight_cache] weight_cache: header hit tp_size=32 num_slots=62 finished_slots=62 [2026-04-08 07:52:53.072497 INFO buffer_manager] Allocated weights buffer at (104923136, 0) [2026-04-08 07:52:53.072528 INFO buffer_manager] Allocated weights buffer at (104923136, 4128768) [2026-04-08 07:52:53.072531 INFO buffer_manager] Allocated weights buffer at (109051904, 516096) [2026-04-08 07:52:53.072532 INFO buffer_manager] Allocated weights buffer at (109568000, 2016) [2026-04-08 07:52:53.072534 INFO buffer_manager] Allocated weights buffer at (109572096, 4128768) [2026-04-08 07:52:53.072535 INFO buffer_manager] Allocated weights buffer at (113700864, 516096) [2026-04-08 07:52:53.072537 INFO buffer_manager] Allocated weights buffer at (114216960, 2016) [2026-04-08 07:52:53.072538 INFO buffer_manager] Allocated weights buffer at (114221056, 4128768) [2026-04-08 07:52:53.072540 INFO buffer_manager] Allocated weights buffer at (118349824, 516096) [2026-04-08 07:52:53.072541 INFO buffer_manager] Allocated weights buffer at (118865920, 2016) [2026-04-08 07:52:53.072543 INFO buffer_manager] Allocated weights buffer at (118870016, 0) [2026-04-08 07:52:53.072544 INFO fp8_mlp_dpdk] fp8_mlp_dpdk: init_layer_cached(layer_idx=0, cache_slot=0) planned desc only [2026-04-08 07:52:53.164817 INFO buffer_manager] Allocated weights buffer at (118870016, 0) [2026-04-08 07:52:53.164838 INFO buffer_manager] Allocated weights buffer at 
(118870016, 4128768) [2026-04-08 07:52:53.164840 INFO buffer_manager] Allocated weights buffer at (122998784, 516096) [2026-04-08 07:52:53.164841 INFO buffer_manager] Allocated weights buffer at (123514880, 2016) [2026-04-08 07:52:53.164843 INFO buffer_manager] Allocated weights buffer at (123518976, 4128768) [2026-04-08 07:52:53.164844 INFO buffer_manager] Allocated weights buffer at (127647744, 516096) [2026-04-08 07:52:53.164846 INFO buffer_manager] Allocated weights buffer at (128163840, 2016) [2026-04-08 07:52:53.164847 INFO buffer_manager] Allocated weights buffer at (128167936, 4128768) [2026-04-08 07:52:53.164849 INFO buffer_manager] Allocated weights buffer at (132296704, 516096) [2026-04-08 07:52:53.164850 INFO buffer_manager] Allocated weights buffer at (132812800, 2016) [2026-04-08 07:52:53.164852 INFO buffer_manager] Allocated weights buffer at (132816896, 0) [2026-04-08 07:52:53.164853 INFO fp8_mlp_dpdk] fp8_mlp_dpdk: init_layer_cached(layer_idx=1, cache_slot=1) planned desc only [2026-04-08 07:52:53.251410 INFO buffer_manager] Allocated weights buffer at (132816896, 0) [2026-04-08 07:52:53.251428 INFO buffer_manager] Allocated weights buffer at (132816896, 4128768) [2026-04-08 07:52:53.251431 INFO buffer_manager] Allocated weights buffer at (136945664, 516096) [2026-04-08 07:52:53.251432 INFO buffer_manager] Allocated weights buffer at (137461760, 2016) [2026-04-08 07:52:53.251438 INFO buffer_manager] Allocated weights buffer at (137465856, 4128768) [2026-04-08 07:52:53.251440 INFO buffer_manager] Allocated weights buffer at (141594624, 516096) [2026-04-08 07:52:53.251441 INFO buffer_manager] Allocated weights buffer at (142110720, 2016) [2026-04-08 07:52:53.251443 INFO buffer_manager] Allocated weights buffer at (142114816, 4128768) [2026-04-08 07:52:53.251444 INFO buffer_manager] Allocated weights buffer at (146243584, 516096) [2026-04-08 07:52:53.251446 INFO buffer_manager] Allocated weights buffer at (146759680, 2016) [2026-04-08 07:52:53.251447 
INFO buffer_manager] Allocated weights buffer at (146763776, 0) [2026-04-08 07:52:53.251450 INFO fp8_mlp_dpdk] fp8_mlp_dpdk: init_layer_cached(layer_idx=2, cache_slot=2) planned desc only [2026-04-08 07:52:53.279924 INFO buffer_manager] Allocated weights buffer at (146763776, 0) [2026-04-08 07:52:53.279938 INFO buffer_manager] Allocated weights buffer at (146763776, 132120576) [2026-04-08 07:52:53.279941 INFO buffer_manager] Allocated weights buffer at (278884352, 57344) [2026-04-08 07:52:53.279942 INFO buffer_manager] Allocated weights buffer at (278941696, 132120576) [2026-04-08 07:52:53.279944 INFO buffer_manager] Allocated weights buffer at (411062272, 57344) [2026-04-08 07:52:53.279945 INFO buffer_manager] Allocated weights buffer at (411119616, 132120576) [2026-04-08 07:52:53.279946 INFO buffer_manager] Allocated weights buffer at (543240192, 57344) [2026-04-08 07:52:53.279948 INFO buffer_manager] Allocated weights buffer at (543297536, 0) [2026-04-08 07:52:53.279950 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=3, cache_slot=3) planned desc only [2026-04-08 07:52:53.316311 INFO buffer_manager] Allocated weights buffer at (543297536, 0) [2026-04-08 07:52:53.316324 INFO buffer_manager] Allocated weights buffer at (543297536, 132120576) [2026-04-08 07:52:53.316326 INFO buffer_manager] Allocated weights buffer at (675418112, 57344) [2026-04-08 07:52:53.316328 INFO buffer_manager] Allocated weights buffer at (675475456, 132120576) [2026-04-08 07:52:53.316330 INFO buffer_manager] Allocated weights buffer at (807596032, 57344) [2026-04-08 07:52:53.316331 INFO buffer_manager] Allocated weights buffer at (807653376, 132120576) [2026-04-08 07:52:53.316333 INFO buffer_manager] Allocated weights buffer at (939773952, 57344) [2026-04-08 07:52:53.316334 INFO buffer_manager] Allocated weights buffer at (939831296, 0) [2026-04-08 07:52:53.316336 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=4, cache_slot=4) planned desc only [2026-04-08 
07:52:53.352597 INFO buffer_manager] Allocated weights buffer at (939831296, 0) [2026-04-08 07:52:53.352611 INFO buffer_manager] Allocated weights buffer at (939831296, 132120576) [2026-04-08 07:52:53.352614 INFO buffer_manager] Allocated weights buffer at (1071951872, 57344) [2026-04-08 07:52:53.352616 INFO buffer_manager] Allocated weights buffer at (1072009216, 132120576) [2026-04-08 07:52:53.352617 INFO buffer_manager] Allocated weights buffer at (1204129792, 57344) [2026-04-08 07:52:53.352619 INFO buffer_manager] Allocated weights buffer at (1204187136, 132120576) [2026-04-08 07:52:53.352620 INFO buffer_manager] Allocated weights buffer at (1336307712, 57344) [2026-04-08 07:52:53.352622 INFO buffer_manager] Allocated weights buffer at (1336365056, 0) [2026-04-08 07:52:53.352623 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=5, cache_slot=5) planned desc only [2026-04-08 07:52:53.388811 INFO buffer_manager] Allocated weights buffer at (1336365056, 0) [2026-04-08 07:52:53.388824 INFO buffer_manager] Allocated weights buffer at (1336365056, 132120576) [2026-04-08 07:52:53.388826 INFO buffer_manager] Allocated weights buffer at (1468485632, 57344) [2026-04-08 07:52:53.388828 INFO buffer_manager] Allocated weights buffer at (1468542976, 132120576) [2026-04-08 07:52:53.388829 INFO buffer_manager] Allocated weights buffer at (1600663552, 57344) [2026-04-08 07:52:53.388831 INFO buffer_manager] Allocated weights buffer at (1600720896, 132120576) [2026-04-08 07:52:53.388836 INFO buffer_manager] Allocated weights buffer at (1732841472, 57344) [2026-04-08 07:52:53.388838 INFO buffer_manager] Allocated weights buffer at (1732898816, 0) [2026-04-08 07:52:53.388840 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=6, cache_slot=6) planned desc only [2026-04-08 07:52:53.425011 INFO buffer_manager] Allocated weights buffer at (1732898816, 0) [2026-04-08 07:52:53.425024 INFO buffer_manager] Allocated weights buffer at (1732898816, 132120576) [2026-04-08 
07:52:53.425027 INFO buffer_manager] Allocated weights buffer at (1865019392, 57344) [2026-04-08 07:52:53.425028 INFO buffer_manager] Allocated weights buffer at (1865076736, 132120576) [2026-04-08 07:52:53.425030 INFO buffer_manager] Allocated weights buffer at (1997197312, 57344) [2026-04-08 07:52:53.425031 INFO buffer_manager] Allocated weights buffer at (1997254656, 132120576) [2026-04-08 07:52:53.425033 INFO buffer_manager] Allocated weights buffer at (2129375232, 57344) [2026-04-08 07:52:53.425034 INFO buffer_manager] Allocated weights buffer at (2129432576, 0) [2026-04-08 07:52:53.425036 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=7, cache_slot=7) planned desc only [2026-04-08 07:52:53.461195 INFO buffer_manager] Allocated weights buffer at (2129432576, 0) [2026-04-08 07:52:53.461207 INFO buffer_manager] Allocated weights buffer at (2129432576, 132120576) [2026-04-08 07:52:53.461210 INFO buffer_manager] Allocated weights buffer at (2261553152, 57344) [2026-04-08 07:52:53.461211 INFO buffer_manager] Allocated weights buffer at (2261610496, 132120576) [2026-04-08 07:52:53.461213 INFO buffer_manager] Allocated weights buffer at (2393731072, 57344) [2026-04-08 07:52:53.461214 INFO buffer_manager] Allocated weights buffer at (2393788416, 132120576) [2026-04-08 07:52:53.461216 INFO buffer_manager] Allocated weights buffer at (2525908992, 57344) [2026-04-08 07:52:53.461217 INFO buffer_manager] Allocated weights buffer at (2525966336, 0) [2026-04-08 07:52:53.461219 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=8, cache_slot=8) planned desc only [2026-04-08 07:52:53.497380 INFO buffer_manager] Allocated weights buffer at (2525966336, 0) [2026-04-08 07:52:53.497393 INFO buffer_manager] Allocated weights buffer at (2525966336, 132120576) [2026-04-08 07:52:53.497395 INFO buffer_manager] Allocated weights buffer at (2658086912, 57344) [2026-04-08 07:52:53.497397 INFO buffer_manager] Allocated weights buffer at (2658144256, 132120576) 
[2026-04-08 07:52:53.497398 INFO buffer_manager] Allocated weights buffer at (2790264832, 57344) [2026-04-08 07:52:53.497400 INFO buffer_manager] Allocated weights buffer at (2790322176, 132120576) [2026-04-08 07:52:53.497401 INFO buffer_manager] Allocated weights buffer at (2922442752, 57344) [2026-04-08 07:52:53.497403 INFO buffer_manager] Allocated weights buffer at (2922500096, 0) [2026-04-08 07:52:53.497404 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=9, cache_slot=9) planned desc only [2026-04-08 07:52:53.533696 INFO buffer_manager] Allocated weights buffer at (2922500096, 0) [2026-04-08 07:52:53.533709 INFO buffer_manager] Allocated weights buffer at (2922500096, 132120576) [2026-04-08 07:52:53.533711 INFO buffer_manager] Allocated weights buffer at (3054620672, 57344) [2026-04-08 07:52:53.533713 INFO buffer_manager] Allocated weights buffer at (3054678016, 132120576) [2026-04-08 07:52:53.533714 INFO buffer_manager] Allocated weights buffer at (3186798592, 57344) [2026-04-08 07:52:53.533716 INFO buffer_manager] Allocated weights buffer at (3186855936, 132120576) [2026-04-08 07:52:53.533717 INFO buffer_manager] Allocated weights buffer at (3318976512, 57344) [2026-04-08 07:52:53.533719 INFO buffer_manager] Allocated weights buffer at (3319033856, 0) [2026-04-08 07:52:53.533720 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=10, cache_slot=10) planned desc only [2026-04-08 07:52:53.569937 INFO buffer_manager] Allocated weights buffer at (3319033856, 0) [2026-04-08 07:52:53.569953 INFO buffer_manager] Allocated weights buffer at (3319033856, 132120576) [2026-04-08 07:52:53.569959 INFO buffer_manager] Allocated weights buffer at (3451154432, 57344) [2026-04-08 07:52:53.569961 INFO buffer_manager] Allocated weights buffer at (3451211776, 132120576) [2026-04-08 07:52:53.569963 INFO buffer_manager] Allocated weights buffer at (3583332352, 57344) [2026-04-08 07:52:53.569964 INFO buffer_manager] Allocated weights buffer at (3583389696, 
132120576) [2026-04-08 07:52:53.569965 INFO buffer_manager] Allocated weights buffer at (3715510272, 57344) [2026-04-08 07:52:53.569967 INFO buffer_manager] Allocated weights buffer at (3715567616, 0) [2026-04-08 07:52:53.569968 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=11, cache_slot=11) planned desc only [2026-04-08 07:52:53.606138 INFO buffer_manager] Allocated weights buffer at (3715567616, 0) [2026-04-08 07:52:53.606152 INFO buffer_manager] Allocated weights buffer at (3715567616, 132120576) [2026-04-08 07:52:53.606154 INFO buffer_manager] Allocated weights buffer at (3847688192, 57344) [2026-04-08 07:52:53.606156 INFO buffer_manager] Allocated weights buffer at (3847745536, 132120576) [2026-04-08 07:52:53.606158 INFO buffer_manager] Allocated weights buffer at (3979866112, 57344) [2026-04-08 07:52:53.606159 INFO buffer_manager] Allocated weights buffer at (3979923456, 132120576) [2026-04-08 07:52:53.606161 INFO buffer_manager] Allocated weights buffer at (4112044032, 57344) [2026-04-08 07:52:53.606163 INFO buffer_manager] Allocated weights buffer at (4112101376, 0) [2026-04-08 07:52:53.606164 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=12, cache_slot=12) planned desc only [2026-04-08 07:52:53.642276 INFO buffer_manager] Allocated weights buffer at (4112101376, 0) [2026-04-08 07:52:53.642290 INFO buffer_manager] Allocated weights buffer at (4112101376, 132120576) [2026-04-08 07:52:53.642293 INFO buffer_manager] Allocated weights buffer at (4244221952, 57344) [2026-04-08 07:52:53.642295 INFO buffer_manager] Allocated weights buffer at (4244279296, 132120576) [2026-04-08 07:52:53.642296 INFO buffer_manager] Allocated weights buffer at (4376399872, 57344) [2026-04-08 07:52:53.642298 INFO buffer_manager] Allocated weights buffer at (4376457216, 132120576) [2026-04-08 07:52:53.642299 INFO buffer_manager] Allocated weights buffer at (4508577792, 57344) [2026-04-08 07:52:53.642301 INFO buffer_manager] Allocated weights buffer at 
(4508635136, 0) [2026-04-08 07:52:53.642302 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=13, cache_slot=13) planned desc only [2026-04-08 07:52:53.678478 INFO buffer_manager] Allocated weights buffer at (4508635136, 0) [2026-04-08 07:52:53.678491 INFO buffer_manager] Allocated weights buffer at (4508635136, 132120576) [2026-04-08 07:52:53.678493 INFO buffer_manager] Allocated weights buffer at (4640755712, 57344) [2026-04-08 07:52:53.678495 INFO buffer_manager] Allocated weights buffer at (4640813056, 132120576) [2026-04-08 07:52:53.678496 INFO buffer_manager] Allocated weights buffer at (4772933632, 57344) [2026-04-08 07:52:53.678498 INFO buffer_manager] Allocated weights buffer at (4772990976, 132120576) [2026-04-08 07:52:53.678499 INFO buffer_manager] Allocated weights buffer at (4905111552, 57344) [2026-04-08 07:52:53.678500 INFO buffer_manager] Allocated weights buffer at (4905168896, 0) [2026-04-08 07:52:53.678502 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=14, cache_slot=14) planned desc only [2026-04-08 07:52:53.714864 INFO buffer_manager] Allocated weights buffer at (4905168896, 0) [2026-04-08 07:52:53.714878 INFO buffer_manager] Allocated weights buffer at (4905168896, 132120576) [2026-04-08 07:52:53.714880 INFO buffer_manager] Allocated weights buffer at (5037289472, 57344) [2026-04-08 07:52:53.714881 INFO buffer_manager] Allocated weights buffer at (5037346816, 132120576) [2026-04-08 07:52:53.714883 INFO buffer_manager] Allocated weights buffer at (5169467392, 57344) [2026-04-08 07:52:53.714884 INFO buffer_manager] Allocated weights buffer at (5169524736, 132120576) [2026-04-08 07:52:53.714886 INFO buffer_manager] Allocated weights buffer at (5301645312, 57344) [2026-04-08 07:52:53.714892 INFO buffer_manager] Allocated weights buffer at (5301702656, 0) [2026-04-08 07:52:53.714893 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=15, cache_slot=15) planned desc only [2026-04-08 07:52:53.751172 INFO 
buffer_manager] Allocated weights buffer at (5301702656, 0) [2026-04-08 07:52:53.751186 INFO buffer_manager] Allocated weights buffer at (5301702656, 132120576) [2026-04-08 07:52:53.751188 INFO buffer_manager] Allocated weights buffer at (5433823232, 57344) [2026-04-08 07:52:53.751190 INFO buffer_manager] Allocated weights buffer at (5433880576, 132120576) [2026-04-08 07:52:53.751191 INFO buffer_manager] Allocated weights buffer at (5566001152, 57344) [2026-04-08 07:52:53.751193 INFO buffer_manager] Allocated weights buffer at (5566058496, 132120576) [2026-04-08 07:52:53.751194 INFO buffer_manager] Allocated weights buffer at (5698179072, 57344) [2026-04-08 07:52:53.751196 INFO buffer_manager] Allocated weights buffer at (5698236416, 0) [2026-04-08 07:52:53.751201 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=16, cache_slot=16) planned desc only [2026-04-08 07:52:53.787337 INFO buffer_manager] Allocated weights buffer at (5698236416, 0) [2026-04-08 07:52:53.787349 INFO buffer_manager] Allocated weights buffer at (5698236416, 132120576) [2026-04-08 07:52:53.787352 INFO buffer_manager] Allocated weights buffer at (5830356992, 57344) [2026-04-08 07:52:53.787353 INFO buffer_manager] Allocated weights buffer at (5830414336, 132120576) [2026-04-08 07:52:53.787355 INFO buffer_manager] Allocated weights buffer at (5962534912, 57344) [2026-04-08 07:52:53.787356 INFO buffer_manager] Allocated weights buffer at (5962592256, 132120576) [2026-04-08 07:52:53.787358 INFO buffer_manager] Allocated weights buffer at (6094712832, 57344) [2026-04-08 07:52:53.787359 INFO buffer_manager] Allocated weights buffer at (6094770176, 0) [2026-04-08 07:52:53.787361 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=17, cache_slot=17) planned desc only [2026-04-08 07:52:53.823438 INFO buffer_manager] Allocated weights buffer at (6094770176, 0) [2026-04-08 07:52:53.823450 INFO buffer_manager] Allocated weights buffer at (6094770176, 132120576) [2026-04-08 
07:52:53.823453 INFO buffer_manager] Allocated weights buffer at (6226890752, 57344) [2026-04-08 07:52:53.823454 INFO buffer_manager] Allocated weights buffer at (6226948096, 132120576) [2026-04-08 07:52:53.823456 INFO buffer_manager] Allocated weights buffer at (6359068672, 57344) [2026-04-08 07:52:53.823457 INFO buffer_manager] Allocated weights buffer at (6359126016, 132120576) [2026-04-08 07:52:53.823459 INFO buffer_manager] Allocated weights buffer at (6491246592, 57344) [2026-04-08 07:52:53.823460 INFO buffer_manager] Allocated weights buffer at (6491303936, 0) [2026-04-08 07:52:53.823462 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=18, cache_slot=18) planned desc only [2026-04-08 07:52:53.859625 INFO buffer_manager] Allocated weights buffer at (6491303936, 0) [2026-04-08 07:52:53.859638 INFO buffer_manager] Allocated weights buffer at (6491303936, 132120576) [2026-04-08 07:52:53.859640 INFO buffer_manager] Allocated weights buffer at (6623424512, 57344) [2026-04-08 07:52:53.859641 INFO buffer_manager] Allocated weights buffer at (6623481856, 132120576) [2026-04-08 07:52:53.859643 INFO buffer_manager] Allocated weights buffer at (6755602432, 57344) [2026-04-08 07:52:53.859644 INFO buffer_manager] Allocated weights buffer at (6755659776, 132120576) [2026-04-08 07:52:53.859646 INFO buffer_manager] Allocated weights buffer at (6887780352, 57344) [2026-04-08 07:52:53.859647 INFO buffer_manager] Allocated weights buffer at (6887837696, 0) [2026-04-08 07:52:53.859649 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=19, cache_slot=19) planned desc only [2026-04-08 07:52:53.895785 INFO buffer_manager] Allocated weights buffer at (6887837696, 0) [2026-04-08 07:52:53.895798 INFO buffer_manager] Allocated weights buffer at (6887837696, 132120576) [2026-04-08 07:52:53.895804 INFO buffer_manager] Allocated weights buffer at (7019958272, 57344) [2026-04-08 07:52:53.895806 INFO buffer_manager] Allocated weights buffer at (7020015616, 132120576) 
[2026-04-08 07:52:53.895807 INFO buffer_manager] Allocated weights buffer at (7152136192, 57344) [2026-04-08 07:52:53.895809 INFO buffer_manager] Allocated weights buffer at (7152193536, 132120576) [2026-04-08 07:52:53.895810 INFO buffer_manager] Allocated weights buffer at (7284314112, 57344) [2026-04-08 07:52:53.895811 INFO buffer_manager] Allocated weights buffer at (7284371456, 0) [2026-04-08 07:52:53.895813 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=20, cache_slot=20) planned desc only [2026-04-08 07:52:53.931994 INFO buffer_manager] Allocated weights buffer at (7284371456, 0) [2026-04-08 07:52:53.932007 INFO buffer_manager] Allocated weights buffer at (7284371456, 132120576) [2026-04-08 07:52:53.932010 INFO buffer_manager] Allocated weights buffer at (7416492032, 57344) [2026-04-08 07:52:53.932011 INFO buffer_manager] Allocated weights buffer at (7416549376, 132120576) [2026-04-08 07:52:53.932013 INFO buffer_manager] Allocated weights buffer at (7548669952, 57344) [2026-04-08 07:52:53.932014 INFO buffer_manager] Allocated weights buffer at (7548727296, 132120576) [2026-04-08 07:52:53.932016 INFO buffer_manager] Allocated weights buffer at (7680847872, 57344) [2026-04-08 07:52:53.932017 INFO buffer_manager] Allocated weights buffer at (7680905216, 0) [2026-04-08 07:52:53.932019 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=21, cache_slot=21) planned desc only [2026-04-08 07:52:53.968164 INFO buffer_manager] Allocated weights buffer at (7680905216, 0) [2026-04-08 07:52:53.968177 INFO buffer_manager] Allocated weights buffer at (7680905216, 132120576) [2026-04-08 07:52:53.968179 INFO buffer_manager] Allocated weights buffer at (7813025792, 57344) [2026-04-08 07:52:53.968181 INFO buffer_manager] Allocated weights buffer at (7813083136, 132120576) [2026-04-08 07:52:53.968182 INFO buffer_manager] Allocated weights buffer at (7945203712, 57344) [2026-04-08 07:52:53.968184 INFO buffer_manager] Allocated weights buffer at (7945261056, 
132120576) [2026-04-08 07:52:53.968185 INFO buffer_manager] Allocated weights buffer at (8077381632, 57344) [2026-04-08 07:52:53.968187 INFO buffer_manager] Allocated weights buffer at (8077438976, 0) [2026-04-08 07:52:53.968188 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=22, cache_slot=22) planned desc only [2026-04-08 07:52:54.004350 INFO buffer_manager] Allocated weights buffer at (8077438976, 0) [2026-04-08 07:52:54.004363 INFO buffer_manager] Allocated weights buffer at (8077438976, 132120576) [2026-04-08 07:52:54.004365 INFO buffer_manager] Allocated weights buffer at (8209559552, 57344) [2026-04-08 07:52:54.004367 INFO buffer_manager] Allocated weights buffer at (8209616896, 132120576) [2026-04-08 07:52:54.004369 INFO buffer_manager] Allocated weights buffer at (8341737472, 57344) [2026-04-08 07:52:54.004370 INFO buffer_manager] Allocated weights buffer at (8341794816, 132120576) [2026-04-08 07:52:54.004372 INFO buffer_manager] Allocated weights buffer at (8473915392, 57344) [2026-04-08 07:52:54.004373 INFO buffer_manager] Allocated weights buffer at (8473972736, 0) [2026-04-08 07:52:54.004375 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=23, cache_slot=23) planned desc only [2026-04-08 07:52:54.040591 INFO buffer_manager] Allocated weights buffer at (8473972736, 0) [2026-04-08 07:52:54.040604 INFO buffer_manager] Allocated weights buffer at (8473972736, 132120576) [2026-04-08 07:52:54.040606 INFO buffer_manager] Allocated weights buffer at (8606093312, 57344) [2026-04-08 07:52:54.040608 INFO buffer_manager] Allocated weights buffer at (8606150656, 132120576) [2026-04-08 07:52:54.040609 INFO buffer_manager] Allocated weights buffer at (8738271232, 57344) [2026-04-08 07:52:54.040611 INFO buffer_manager] Allocated weights buffer at (8738328576, 132120576) [2026-04-08 07:52:54.040612 INFO buffer_manager] Allocated weights buffer at (8870449152, 57344) [2026-04-08 07:52:54.040617 INFO buffer_manager] Allocated weights buffer at 
(8870506496, 0) [2026-04-08 07:52:54.040619 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=24, cache_slot=24) planned desc only [2026-04-08 07:52:54.076817 INFO buffer_manager] Allocated weights buffer at (8870506496, 0) [2026-04-08 07:52:54.076836 INFO buffer_manager] Allocated weights buffer at (8870506496, 132120576) [2026-04-08 07:52:54.076838 INFO buffer_manager] Allocated weights buffer at (9002627072, 57344) [2026-04-08 07:52:54.076840 INFO buffer_manager] Allocated weights buffer at (9002684416, 132120576) [2026-04-08 07:52:54.076841 INFO buffer_manager] Allocated weights buffer at (9134804992, 57344) [2026-04-08 07:52:54.076843 INFO buffer_manager] Allocated weights buffer at (9134862336, 132120576) [2026-04-08 07:52:54.076844 INFO buffer_manager] Allocated weights buffer at (9266982912, 57344) [2026-04-08 07:52:54.076846 INFO buffer_manager] Allocated weights buffer at (9267040256, 0) [2026-04-08 07:52:54.076848 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=25, cache_slot=25) planned desc only [2026-04-08 07:52:54.112992 INFO buffer_manager] Allocated weights buffer at (9267040256, 0) [2026-04-08 07:52:54.113007 INFO buffer_manager] Allocated weights buffer at (9267040256, 132120576) [2026-04-08 07:52:54.113009 INFO buffer_manager] Allocated weights buffer at (9399160832, 57344) [2026-04-08 07:52:54.113011 INFO buffer_manager] Allocated weights buffer at (9399218176, 132120576) [2026-04-08 07:52:54.113013 INFO buffer_manager] Allocated weights buffer at (9531338752, 57344) [2026-04-08 07:52:54.113014 INFO buffer_manager] Allocated weights buffer at (9531396096, 132120576) [2026-04-08 07:52:54.113015 INFO buffer_manager] Allocated weights buffer at (9663516672, 57344) [2026-04-08 07:52:54.113017 INFO buffer_manager] Allocated weights buffer at (9663574016, 0) [2026-04-08 07:52:54.113019 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=26, cache_slot=26) planned desc only [2026-04-08 07:52:54.149269 INFO 
buffer_manager] Allocated weights buffer at (9663574016, 0) [2026-04-08 07:52:54.149283 INFO buffer_manager] Allocated weights buffer at (9663574016, 132120576) [2026-04-08 07:52:54.149285 INFO buffer_manager] Allocated weights buffer at (9795694592, 57344) [2026-04-08 07:52:54.149286 INFO buffer_manager] Allocated weights buffer at (9795751936, 132120576) [2026-04-08 07:52:54.149288 INFO buffer_manager] Allocated weights buffer at (9927872512, 57344) [2026-04-08 07:52:54.149289 INFO buffer_manager] Allocated weights buffer at (9927929856, 132120576) [2026-04-08 07:52:54.149291 INFO buffer_manager] Allocated weights buffer at (10060050432, 57344) [2026-04-08 07:52:54.149292 INFO buffer_manager] Allocated weights buffer at (10060107776, 0) [2026-04-08 07:52:54.149294 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=27, cache_slot=27) planned desc only [2026-04-08 07:52:54.185413 INFO buffer_manager] Allocated weights buffer at (10060107776, 0) [2026-04-08 07:52:54.185427 INFO buffer_manager] Allocated weights buffer at (10060107776, 132120576) [2026-04-08 07:52:54.185429 INFO buffer_manager] Allocated weights buffer at (10192228352, 57344) [2026-04-08 07:52:54.185430 INFO buffer_manager] Allocated weights buffer at (10192285696, 132120576) [2026-04-08 07:52:54.185432 INFO buffer_manager] Allocated weights buffer at (10324406272, 57344) [2026-04-08 07:52:54.185433 INFO buffer_manager] Allocated weights buffer at (10324463616, 132120576) [2026-04-08 07:52:54.185435 INFO buffer_manager] Allocated weights buffer at (10456584192, 57344) [2026-04-08 07:52:54.185436 INFO buffer_manager] Allocated weights buffer at (10456641536, 0) [2026-04-08 07:52:54.185438 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=28, cache_slot=28) planned desc only [2026-04-08 07:52:54.221583 INFO buffer_manager] Allocated weights buffer at (10456641536, 0) [2026-04-08 07:52:54.221595 INFO buffer_manager] Allocated weights buffer at (10456641536, 132120576) [2026-04-08 
07:52:54.221602 INFO buffer_manager] Allocated weights buffer at (10588762112, 57344) [2026-04-08 07:52:54.221604 INFO buffer_manager] Allocated weights buffer at (10588819456, 132120576) [2026-04-08 07:52:54.221605 INFO buffer_manager] Allocated weights buffer at (10720940032, 57344) [2026-04-08 07:52:54.221607 INFO buffer_manager] Allocated weights buffer at (10720997376, 132120576) [2026-04-08 07:52:54.221608 INFO buffer_manager] Allocated weights buffer at (10853117952, 57344) [2026-04-08 07:52:54.221610 INFO buffer_manager] Allocated weights buffer at (10853175296, 0) [2026-04-08 07:52:54.221611 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=29, cache_slot=29) planned desc only [2026-04-08 07:52:54.257810 INFO buffer_manager] Allocated weights buffer at (10853175296, 0) [2026-04-08 07:52:54.257824 INFO buffer_manager] Allocated weights buffer at (10853175296, 132120576) [2026-04-08 07:52:54.257826 INFO buffer_manager] Allocated weights buffer at (10985295872, 57344) [2026-04-08 07:52:54.257828 INFO buffer_manager] Allocated weights buffer at (10985353216, 132120576) [2026-04-08 07:52:54.257829 INFO buffer_manager] Allocated weights buffer at (11117473792, 57344) [2026-04-08 07:52:54.257831 INFO buffer_manager] Allocated weights buffer at (11117531136, 132120576) [2026-04-08 07:52:54.257832 INFO buffer_manager] Allocated weights buffer at (11249651712, 57344) [2026-04-08 07:52:54.257834 INFO buffer_manager] Allocated weights buffer at (11249709056, 0) [2026-04-08 07:52:54.257835 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=30, cache_slot=30) planned desc only [2026-04-08 07:52:54.294094 INFO buffer_manager] Allocated weights buffer at (11249709056, 0) [2026-04-08 07:52:54.294108 INFO buffer_manager] Allocated weights buffer at (11249709056, 132120576) [2026-04-08 07:52:54.294110 INFO buffer_manager] Allocated weights buffer at (11381829632, 57344) [2026-04-08 07:52:54.294112 INFO buffer_manager] Allocated weights buffer at 
(11381886976, 132120576) [2026-04-08 07:52:54.294113 INFO buffer_manager] Allocated weights buffer at (11514007552, 57344) [2026-04-08 07:52:54.294115 INFO buffer_manager] Allocated weights buffer at (11514064896, 132120576) [2026-04-08 07:52:54.294116 INFO buffer_manager] Allocated weights buffer at (11646185472, 57344) [2026-04-08 07:52:54.294118 INFO buffer_manager] Allocated weights buffer at (11646242816, 0) [2026-04-08 07:52:54.294119 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=31, cache_slot=31) planned desc only [2026-04-08 07:52:54.330358 INFO buffer_manager] Allocated weights buffer at (11646242816, 0) [2026-04-08 07:52:54.330372 INFO buffer_manager] Allocated weights buffer at (11646242816, 132120576) [2026-04-08 07:52:54.330374 INFO buffer_manager] Allocated weights buffer at (11778363392, 57344) [2026-04-08 07:52:54.330376 INFO buffer_manager] Allocated weights buffer at (11778420736, 132120576) [2026-04-08 07:52:54.330377 INFO buffer_manager] Allocated weights buffer at (11910541312, 57344) [2026-04-08 07:52:54.330379 INFO buffer_manager] Allocated weights buffer at (11910598656, 132120576) [2026-04-08 07:52:54.330380 INFO buffer_manager] Allocated weights buffer at (12042719232, 57344) [2026-04-08 07:52:54.330382 INFO buffer_manager] Allocated weights buffer at (12042776576, 0) [2026-04-08 07:52:54.330383 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=32, cache_slot=32) planned desc only [2026-04-08 07:52:54.366633 INFO buffer_manager] Allocated weights buffer at (12042776576, 0) [2026-04-08 07:52:54.366646 INFO buffer_manager] Allocated weights buffer at (12042776576, 132120576) [2026-04-08 07:52:54.366655 INFO buffer_manager] Allocated weights buffer at (12174897152, 57344) [2026-04-08 07:52:54.366657 INFO buffer_manager] Allocated weights buffer at (12174954496, 132120576) [2026-04-08 07:52:54.366659 INFO buffer_manager] Allocated weights buffer at (12307075072, 57344) [2026-04-08 07:52:54.366660 INFO buffer_manager] 
Allocated weights buffer at (12307132416, 132120576) [2026-04-08 07:52:54.366662 INFO buffer_manager] Allocated weights buffer at (12439252992, 57344) [2026-04-08 07:52:54.366668 INFO buffer_manager] Allocated weights buffer at (12439310336, 0) [2026-04-08 07:52:54.366670 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=33, cache_slot=33) planned desc only [2026-04-08 07:52:54.402827 INFO buffer_manager] Allocated weights buffer at (12439310336, 0) [2026-04-08 07:52:54.402841 INFO buffer_manager] Allocated weights buffer at (12439310336, 132120576) [2026-04-08 07:52:54.402843 INFO buffer_manager] Allocated weights buffer at (12571430912, 57344) [2026-04-08 07:52:54.402845 INFO buffer_manager] Allocated weights buffer at (12571488256, 132120576) [2026-04-08 07:52:54.402846 INFO buffer_manager] Allocated weights buffer at (12703608832, 57344) [2026-04-08 07:52:54.402848 INFO buffer_manager] Allocated weights buffer at (12703666176, 132120576) [2026-04-08 07:52:54.402849 INFO buffer_manager] Allocated weights buffer at (12835786752, 57344) [2026-04-08 07:52:54.402851 INFO buffer_manager] Allocated weights buffer at (12835844096, 0) [2026-04-08 07:52:54.402852 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=34, cache_slot=34) planned desc only [2026-04-08 07:52:54.439026 INFO buffer_manager] Allocated weights buffer at (12835844096, 0) [2026-04-08 07:52:54.439039 INFO buffer_manager] Allocated weights buffer at (12835844096, 132120576) [2026-04-08 07:52:54.439041 INFO buffer_manager] Allocated weights buffer at (12967964672, 57344) [2026-04-08 07:52:54.439043 INFO buffer_manager] Allocated weights buffer at (12968022016, 132120576) [2026-04-08 07:52:54.439044 INFO buffer_manager] Allocated weights buffer at (13100142592, 57344) [2026-04-08 07:52:54.439046 INFO buffer_manager] Allocated weights buffer at (13100199936, 132120576) [2026-04-08 07:52:54.439048 INFO buffer_manager] Allocated weights buffer at (13232320512, 57344) [2026-04-08 
07:52:54.439049 INFO buffer_manager] Allocated weights buffer at (13232377856, 0) [2026-04-08 07:52:54.439051 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=35, cache_slot=35) planned desc only [2026-04-08 07:52:54.475265 INFO buffer_manager] Allocated weights buffer at (13232377856, 0) [2026-04-08 07:52:54.475281 INFO buffer_manager] Allocated weights buffer at (13232377856, 132120576) [2026-04-08 07:52:54.475283 INFO buffer_manager] Allocated weights buffer at (13364498432, 57344) [2026-04-08 07:52:54.475285 INFO buffer_manager] Allocated weights buffer at (13364555776, 132120576) [2026-04-08 07:52:54.475286 INFO buffer_manager] Allocated weights buffer at (13496676352, 57344) [2026-04-08 07:52:54.475288 INFO buffer_manager] Allocated weights buffer at (13496733696, 132120576) [2026-04-08 07:52:54.475289 INFO buffer_manager] Allocated weights buffer at (13628854272, 57344) [2026-04-08 07:52:54.475291 INFO buffer_manager] Allocated weights buffer at (13628911616, 0) [2026-04-08 07:52:54.475293 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=36, cache_slot=36) planned desc only [2026-04-08 07:52:54.511569 INFO buffer_manager] Allocated weights buffer at (13628911616, 0) [2026-04-08 07:52:54.511587 INFO buffer_manager] Allocated weights buffer at (13628911616, 132120576) [2026-04-08 07:52:54.511589 INFO buffer_manager] Allocated weights buffer at (13761032192, 57344) [2026-04-08 07:52:54.511591 INFO buffer_manager] Allocated weights buffer at (13761089536, 132120576) [2026-04-08 07:52:54.511592 INFO buffer_manager] Allocated weights buffer at (13893210112, 57344) [2026-04-08 07:52:54.511594 INFO buffer_manager] Allocated weights buffer at (13893267456, 132120576) [2026-04-08 07:52:54.511595 INFO buffer_manager] Allocated weights buffer at (14025388032, 57344) [2026-04-08 07:52:54.511597 INFO buffer_manager] Allocated weights buffer at (14025445376, 0) [2026-04-08 07:52:54.511598 INFO fp8_moe_dpdk] fp8_moe_dpdk: 
init_layer_cached(layer_idx=37, cache_slot=37) planned desc only [2026-04-08 07:52:54.547798 INFO buffer_manager] Allocated weights buffer at (14025445376, 0) [2026-04-08 07:52:54.547812 INFO buffer_manager] Allocated weights buffer at (14025445376, 132120576) [2026-04-08 07:52:54.547818 INFO buffer_manager] Allocated weights buffer at (14157565952, 57344) [2026-04-08 07:52:54.547820 INFO buffer_manager] Allocated weights buffer at (14157623296, 132120576) [2026-04-08 07:52:54.547822 INFO buffer_manager] Allocated weights buffer at (14289743872, 57344) [2026-04-08 07:52:54.547823 INFO buffer_manager] Allocated weights buffer at (14289801216, 132120576) [2026-04-08 07:52:54.547825 INFO buffer_manager] Allocated weights buffer at (14421921792, 57344) [2026-04-08 07:52:54.547826 INFO buffer_manager] Allocated weights buffer at (14421979136, 0) [2026-04-08 07:52:54.547828 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=38, cache_slot=38) planned desc only [2026-04-08 07:52:54.584070 INFO buffer_manager] Allocated weights buffer at (14421979136, 0) [2026-04-08 07:52:54.584084 INFO buffer_manager] Allocated weights buffer at (14421979136, 132120576) [2026-04-08 07:52:54.584086 INFO buffer_manager] Allocated weights buffer at (14554099712, 57344) [2026-04-08 07:52:54.584088 INFO buffer_manager] Allocated weights buffer at (14554157056, 132120576) [2026-04-08 07:52:54.584089 INFO buffer_manager] Allocated weights buffer at (14686277632, 57344) [2026-04-08 07:52:54.584091 INFO buffer_manager] Allocated weights buffer at (14686334976, 132120576) [2026-04-08 07:52:54.584093 INFO buffer_manager] Allocated weights buffer at (14818455552, 57344) [2026-04-08 07:52:54.584094 INFO buffer_manager] Allocated weights buffer at (14818512896, 0) [2026-04-08 07:52:54.584097 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=39, cache_slot=39) planned desc only [2026-04-08 07:52:54.620381 INFO buffer_manager] Allocated weights buffer at (14818512896, 0) [2026-04-08 
07:52:54.620396 INFO buffer_manager] Allocated weights buffer at (14818512896, 132120576) [2026-04-08 07:52:54.620398 INFO buffer_manager] Allocated weights buffer at (14950633472, 57344) [2026-04-08 07:52:54.620399 INFO buffer_manager] Allocated weights buffer at (14950690816, 132120576) [2026-04-08 07:52:54.620401 INFO buffer_manager] Allocated weights buffer at (15082811392, 57344) [2026-04-08 07:52:54.620403 INFO buffer_manager] Allocated weights buffer at (15082868736, 132120576) [2026-04-08 07:52:54.620404 INFO buffer_manager] Allocated weights buffer at (15214989312, 57344) [2026-04-08 07:52:54.620406 INFO buffer_manager] Allocated weights buffer at (15215046656, 0) [2026-04-08 07:52:54.620407 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=40, cache_slot=40) planned desc only [2026-04-08 07:52:54.656659 INFO buffer_manager] Allocated weights buffer at (15215046656, 0) [2026-04-08 07:52:54.656671 INFO buffer_manager] Allocated weights buffer at (15215046656, 132120576) [2026-04-08 07:52:54.656673 INFO buffer_manager] Allocated weights buffer at (15347167232, 57344) [2026-04-08 07:52:54.656675 INFO buffer_manager] Allocated weights buffer at (15347224576, 132120576) [2026-04-08 07:52:54.656676 INFO buffer_manager] Allocated weights buffer at (15479345152, 57344) [2026-04-08 07:52:54.656678 INFO buffer_manager] Allocated weights buffer at (15479402496, 132120576) [2026-04-08 07:52:54.656679 INFO buffer_manager] Allocated weights buffer at (15611523072, 57344) [2026-04-08 07:52:54.656681 INFO buffer_manager] Allocated weights buffer at (15611580416, 0) [2026-04-08 07:52:54.656682 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=41, cache_slot=41) planned desc only [2026-04-08 07:52:54.692836 INFO buffer_manager] Allocated weights buffer at (15611580416, 0) [2026-04-08 07:52:54.692850 INFO buffer_manager] Allocated weights buffer at (15611580416, 132120576) [2026-04-08 07:52:54.692852 INFO buffer_manager] Allocated weights buffer at 
(15743700992, 57344) [2026-04-08 07:52:54.692853 INFO buffer_manager] Allocated weights buffer at (15743758336, 132120576) [2026-04-08 07:52:54.692855 INFO buffer_manager] Allocated weights buffer at (15875878912, 57344) [2026-04-08 07:52:54.692856 INFO buffer_manager] Allocated weights buffer at (15875936256, 132120576) [2026-04-08 07:52:54.692858 INFO buffer_manager] Allocated weights buffer at (16008056832, 57344) [2026-04-08 07:52:54.692863 INFO buffer_manager] Allocated weights buffer at (16008114176, 0) [2026-04-08 07:52:54.692865 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=42, cache_slot=42) planned desc only [2026-04-08 07:52:54.729048 INFO buffer_manager] Allocated weights buffer at (16008114176, 0) [2026-04-08 07:52:54.729060 INFO buffer_manager] Allocated weights buffer at (16008114176, 132120576) [2026-04-08 07:52:54.729063 INFO buffer_manager] Allocated weights buffer at (16140234752, 57344) [2026-04-08 07:52:54.729065 INFO buffer_manager] Allocated weights buffer at (16140292096, 132120576) [2026-04-08 07:52:54.729066 INFO buffer_manager] Allocated weights buffer at (16272412672, 57344) [2026-04-08 07:52:54.729068 INFO buffer_manager] Allocated weights buffer at (16272470016, 132120576) [2026-04-08 07:52:54.729069 INFO buffer_manager] Allocated weights buffer at (16404590592, 57344) [2026-04-08 07:52:54.729071 INFO buffer_manager] Allocated weights buffer at (16404647936, 0) [2026-04-08 07:52:54.729072 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=43, cache_slot=43) planned desc only [2026-04-08 07:52:54.765256 INFO buffer_manager] Allocated weights buffer at (16404647936, 0) [2026-04-08 07:52:54.765269 INFO buffer_manager] Allocated weights buffer at (16404647936, 132120576) [2026-04-08 07:52:54.765271 INFO buffer_manager] Allocated weights buffer at (16536768512, 57344) [2026-04-08 07:52:54.765273 INFO buffer_manager] Allocated weights buffer at (16536825856, 132120576) [2026-04-08 07:52:54.765274 INFO buffer_manager] 
Allocated weights buffer at (16668946432, 57344) [2026-04-08 07:52:54.765276 INFO buffer_manager] Allocated weights buffer at (16669003776, 132120576) [2026-04-08 07:52:54.765277 INFO buffer_manager] Allocated weights buffer at (16801124352, 57344) [2026-04-08 07:52:54.765279 INFO buffer_manager] Allocated weights buffer at (16801181696, 0) [2026-04-08 07:52:54.765280 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=44, cache_slot=44) planned desc only [2026-04-08 07:52:54.801344 INFO buffer_manager] Allocated weights buffer at (16801181696, 0) [2026-04-08 07:52:54.801356 INFO buffer_manager] Allocated weights buffer at (16801181696, 132120576) [2026-04-08 07:52:54.801359 INFO buffer_manager] Allocated weights buffer at (16933302272, 57344) [2026-04-08 07:52:54.801360 INFO buffer_manager] Allocated weights buffer at (16933359616, 132120576) [2026-04-08 07:52:54.801362 INFO buffer_manager] Allocated weights buffer at (17065480192, 57344) [2026-04-08 07:52:54.801363 INFO buffer_manager] Allocated weights buffer at (17065537536, 132120576) [2026-04-08 07:52:54.801365 INFO buffer_manager] Allocated weights buffer at (17197658112, 57344) [2026-04-08 07:52:54.801366 INFO buffer_manager] Allocated weights buffer at (17197715456, 0) [2026-04-08 07:52:54.801368 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=45, cache_slot=45) planned desc only [2026-04-08 07:52:54.837410 INFO buffer_manager] Allocated weights buffer at (17197715456, 0) [2026-04-08 07:52:54.837423 INFO buffer_manager] Allocated weights buffer at (17197715456, 132120576) [2026-04-08 07:52:54.837425 INFO buffer_manager] Allocated weights buffer at (17329836032, 57344) [2026-04-08 07:52:54.837427 INFO buffer_manager] Allocated weights buffer at (17329893376, 132120576) [2026-04-08 07:52:54.837428 INFO buffer_manager] Allocated weights buffer at (17462013952, 57344) [2026-04-08 07:52:54.837430 INFO buffer_manager] Allocated weights buffer at (17462071296, 132120576) [2026-04-08 
07:52:54.837431 INFO buffer_manager] Allocated weights buffer at (17594191872, 57344) [2026-04-08 07:52:54.837433 INFO buffer_manager] Allocated weights buffer at (17594249216, 0) [2026-04-08 07:52:54.837434 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=46, cache_slot=46) planned desc only [2026-04-08 07:52:54.873539 INFO buffer_manager] Allocated weights buffer at (17594249216, 0) [2026-04-08 07:52:54.873553 INFO buffer_manager] Allocated weights buffer at (17594249216, 132120576) [2026-04-08 07:52:54.873561 INFO buffer_manager] Allocated weights buffer at (17726369792, 57344) [2026-04-08 07:52:54.873563 INFO buffer_manager] Allocated weights buffer at (17726427136, 132120576) [2026-04-08 07:52:54.873564 INFO buffer_manager] Allocated weights buffer at (17858547712, 57344) [2026-04-08 07:52:54.873566 INFO buffer_manager] Allocated weights buffer at (17858605056, 132120576) [2026-04-08 07:52:54.873568 INFO buffer_manager] Allocated weights buffer at (17990725632, 57344) [2026-04-08 07:52:54.873569 INFO buffer_manager] Allocated weights buffer at (17990782976, 0) [2026-04-08 07:52:54.873571 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=47, cache_slot=47) planned desc only [2026-04-08 07:52:54.909662 INFO buffer_manager] Allocated weights buffer at (17990782976, 0) [2026-04-08 07:52:54.909675 INFO buffer_manager] Allocated weights buffer at (17990782976, 132120576) [2026-04-08 07:52:54.909691 INFO buffer_manager] Allocated weights buffer at (18122903552, 57344) [2026-04-08 07:52:54.909692 INFO buffer_manager] Allocated weights buffer at (18122960896, 132120576) [2026-04-08 07:52:54.909694 INFO buffer_manager] Allocated weights buffer at (18255081472, 57344) [2026-04-08 07:52:54.909696 INFO buffer_manager] Allocated weights buffer at (18255138816, 132120576) [2026-04-08 07:52:54.909698 INFO buffer_manager] Allocated weights buffer at (18387259392, 57344) [2026-04-08 07:52:54.909699 INFO buffer_manager] Allocated weights buffer at 
(18387316736, 0) [2026-04-08 07:52:54.909701 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=48, cache_slot=48) planned desc only [2026-04-08 07:52:54.945768 INFO buffer_manager] Allocated weights buffer at (18387316736, 0) [2026-04-08 07:52:54.945783 INFO buffer_manager] Allocated weights buffer at (18387316736, 132120576) [2026-04-08 07:52:54.945786 INFO buffer_manager] Allocated weights buffer at (18519437312, 57344) [2026-04-08 07:52:54.945787 INFO buffer_manager] Allocated weights buffer at (18519494656, 132120576) [2026-04-08 07:52:54.945789 INFO buffer_manager] Allocated weights buffer at (18651615232, 57344) [2026-04-08 07:52:54.945790 INFO buffer_manager] Allocated weights buffer at (18651672576, 132120576) [2026-04-08 07:52:54.945792 INFO buffer_manager] Allocated weights buffer at (18783793152, 57344) [2026-04-08 07:52:54.945793 INFO buffer_manager] Allocated weights buffer at (18783850496, 0) [2026-04-08 07:52:54.945795 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=49, cache_slot=49) planned desc only [2026-04-08 07:52:54.981908 INFO buffer_manager] Allocated weights buffer at (18783850496, 0) [2026-04-08 07:52:54.981921 INFO buffer_manager] Allocated weights buffer at (18783850496, 132120576) [2026-04-08 07:52:54.981923 INFO buffer_manager] Allocated weights buffer at (18915971072, 57344) [2026-04-08 07:52:54.981925 INFO buffer_manager] Allocated weights buffer at (18916028416, 132120576) [2026-04-08 07:52:54.981926 INFO buffer_manager] Allocated weights buffer at (19048148992, 57344) [2026-04-08 07:52:54.981928 INFO buffer_manager] Allocated weights buffer at (19048206336, 132120576) [2026-04-08 07:52:54.981929 INFO buffer_manager] Allocated weights buffer at (19180326912, 57344) [2026-04-08 07:52:54.981931 INFO buffer_manager] Allocated weights buffer at (19180384256, 0) [2026-04-08 07:52:54.981932 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=50, cache_slot=50) planned desc only [2026-04-08 07:52:55.018041 
INFO buffer_manager] Allocated weights buffer at (19180384256, 0) [2026-04-08 07:52:55.018054 INFO buffer_manager] Allocated weights buffer at (19180384256, 132120576) [2026-04-08 07:52:55.018056 INFO buffer_manager] Allocated weights buffer at (19312504832, 57344) [2026-04-08 07:52:55.018057 INFO buffer_manager] Allocated weights buffer at (19312562176, 132120576) [2026-04-08 07:52:55.018059 INFO buffer_manager] Allocated weights buffer at (19444682752, 57344) [2026-04-08 07:52:55.018061 INFO buffer_manager] Allocated weights buffer at (19444740096, 132120576) [2026-04-08 07:52:55.018069 INFO buffer_manager] Allocated weights buffer at (19576860672, 57344) [2026-04-08 07:52:55.018070 INFO buffer_manager] Allocated weights buffer at (19576918016, 0) [2026-04-08 07:52:55.018072 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=51, cache_slot=51) planned desc only [2026-04-08 07:52:55.054149 INFO buffer_manager] Allocated weights buffer at (19576918016, 0) [2026-04-08 07:52:55.054162 INFO buffer_manager] Allocated weights buffer at (19576918016, 132120576) [2026-04-08 07:52:55.054164 INFO buffer_manager] Allocated weights buffer at (19709038592, 57344) [2026-04-08 07:52:55.054165 INFO buffer_manager] Allocated weights buffer at (19709095936, 132120576) [2026-04-08 07:52:55.054167 INFO buffer_manager] Allocated weights buffer at (19841216512, 57344) [2026-04-08 07:52:55.054168 INFO buffer_manager] Allocated weights buffer at (19841273856, 132120576) [2026-04-08 07:52:55.054170 INFO buffer_manager] Allocated weights buffer at (19973394432, 57344) [2026-04-08 07:52:55.054171 INFO buffer_manager] Allocated weights buffer at (19973451776, 0) [2026-04-08 07:52:55.054173 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=52, cache_slot=52) planned desc only [2026-04-08 07:52:55.090329 INFO buffer_manager] Allocated weights buffer at (19973451776, 0) [2026-04-08 07:52:55.090347 INFO buffer_manager] Allocated weights buffer at (19973451776, 132120576) 
[2026-04-08 07:52:55.090349 INFO buffer_manager] Allocated weights buffer at (20105572352, 57344) [2026-04-08 07:52:55.090351 INFO buffer_manager] Allocated weights buffer at (20105629696, 132120576) [2026-04-08 07:52:55.090352 INFO buffer_manager] Allocated weights buffer at (20237750272, 57344) [2026-04-08 07:52:55.090354 INFO buffer_manager] Allocated weights buffer at (20237807616, 132120576) [2026-04-08 07:52:55.090355 INFO buffer_manager] Allocated weights buffer at (20369928192, 57344) [2026-04-08 07:52:55.090357 INFO buffer_manager] Allocated weights buffer at (20369985536, 0) [2026-04-08 07:52:55.090358 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=53, cache_slot=53) planned desc only [2026-04-08 07:52:55.126541 INFO buffer_manager] Allocated weights buffer at (20369985536, 0) [2026-04-08 07:52:55.126555 INFO buffer_manager] Allocated weights buffer at (20369985536, 132120576) [2026-04-08 07:52:55.126557 INFO buffer_manager] Allocated weights buffer at (20502106112, 57344) [2026-04-08 07:52:55.126559 INFO buffer_manager] Allocated weights buffer at (20502163456, 132120576) [2026-04-08 07:52:55.126561 INFO buffer_manager] Allocated weights buffer at (20634284032, 57344) [2026-04-08 07:52:55.126562 INFO buffer_manager] Allocated weights buffer at (20634341376, 132120576) [2026-04-08 07:52:55.126564 INFO buffer_manager] Allocated weights buffer at (20766461952, 57344) [2026-04-08 07:52:55.126565 INFO buffer_manager] Allocated weights buffer at (20766519296, 0) [2026-04-08 07:52:55.126567 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=54, cache_slot=54) planned desc only [2026-04-08 07:52:55.162960 INFO buffer_manager] Allocated weights buffer at (20766519296, 0) [2026-04-08 07:52:55.162973 INFO buffer_manager] Allocated weights buffer at (20766519296, 132120576) [2026-04-08 07:52:55.162975 INFO buffer_manager] Allocated weights buffer at (20898639872, 57344) [2026-04-08 07:52:55.162976 INFO buffer_manager] Allocated weights buffer 
at (20898697216, 132120576) [2026-04-08 07:52:55.162978 INFO buffer_manager] Allocated weights buffer at (21030817792, 57344) [2026-04-08 07:52:55.162979 INFO buffer_manager] Allocated weights buffer at (21030875136, 132120576) [2026-04-08 07:52:55.162981 INFO buffer_manager] Allocated weights buffer at (21162995712, 57344) [2026-04-08 07:52:55.162982 INFO buffer_manager] Allocated weights buffer at (21163053056, 0) [2026-04-08 07:52:55.162984 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=55, cache_slot=55) planned desc only [2026-04-08 07:52:55.199327 INFO buffer_manager] Allocated weights buffer at (21163053056, 0) [2026-04-08 07:52:55.199344 INFO buffer_manager] Allocated weights buffer at (21163053056, 132120576) [2026-04-08 07:52:55.199346 INFO buffer_manager] Allocated weights buffer at (21295173632, 57344) [2026-04-08 07:52:55.199348 INFO buffer_manager] Allocated weights buffer at (21295230976, 132120576) [2026-04-08 07:52:55.199350 INFO buffer_manager] Allocated weights buffer at (21427351552, 57344) [2026-04-08 07:52:55.199351 INFO buffer_manager] Allocated weights buffer at (21427408896, 132120576) [2026-04-08 07:52:55.199353 INFO buffer_manager] Allocated weights buffer at (21559529472, 57344) [2026-04-08 07:52:55.199356 INFO buffer_manager] Allocated weights buffer at (21559586816, 0) [2026-04-08 07:52:55.199358 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=56, cache_slot=56) planned desc only [2026-04-08 07:52:55.235671 INFO buffer_manager] Allocated weights buffer at (21559586816, 0) [2026-04-08 07:52:55.235683 INFO buffer_manager] Allocated weights buffer at (21559586816, 132120576) [2026-04-08 07:52:55.235685 INFO buffer_manager] Allocated weights buffer at (21691707392, 57344) [2026-04-08 07:52:55.235687 INFO buffer_manager] Allocated weights buffer at (21691764736, 132120576) [2026-04-08 07:52:55.235688 INFO buffer_manager] Allocated weights buffer at (21823885312, 57344) [2026-04-08 07:52:55.235690 INFO 
buffer_manager] Allocated weights buffer at (21823942656, 132120576) [2026-04-08 07:52:55.235691 INFO buffer_manager] Allocated weights buffer at (21956063232, 57344) [2026-04-08 07:52:55.235693 INFO buffer_manager] Allocated weights buffer at (21956120576, 0) [2026-04-08 07:52:55.235694 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=57, cache_slot=57) planned desc only [2026-04-08 07:52:55.271923 INFO buffer_manager] Allocated weights buffer at (21956120576, 0) [2026-04-08 07:52:55.271937 INFO buffer_manager] Allocated weights buffer at (21956120576, 132120576) [2026-04-08 07:52:55.271940 INFO buffer_manager] Allocated weights buffer at (22088241152, 57344) [2026-04-08 07:52:55.271942 INFO buffer_manager] Allocated weights buffer at (22088298496, 132120576) [2026-04-08 07:52:55.271943 INFO buffer_manager] Allocated weights buffer at (22220419072, 57344) [2026-04-08 07:52:55.271945 INFO buffer_manager] Allocated weights buffer at (22220476416, 132120576) [2026-04-08 07:52:55.271946 INFO buffer_manager] Allocated weights buffer at (22352596992, 57344) [2026-04-08 07:52:55.271948 INFO buffer_manager] Allocated weights buffer at (22352654336, 0) [2026-04-08 07:52:55.271949 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=58, cache_slot=58) planned desc only [2026-04-08 07:52:55.308283 INFO buffer_manager] Allocated weights buffer at (22352654336, 0) [2026-04-08 07:52:55.308296 INFO buffer_manager] Allocated weights buffer at (22352654336, 132120576) [2026-04-08 07:52:55.308298 INFO buffer_manager] Allocated weights buffer at (22484774912, 57344) [2026-04-08 07:52:55.308299 INFO buffer_manager] Allocated weights buffer at (22484832256, 132120576) [2026-04-08 07:52:55.308301 INFO buffer_manager] Allocated weights buffer at (22616952832, 57344) [2026-04-08 07:52:55.308303 INFO buffer_manager] Allocated weights buffer at (22617010176, 132120576) [2026-04-08 07:52:55.308304 INFO buffer_manager] Allocated weights buffer at (22749130752, 57344) 
[2026-04-08 07:52:55.308306 INFO buffer_manager] Allocated weights buffer at (22749188096, 0) [2026-04-08 07:52:55.308307 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=59, cache_slot=59) planned desc only [2026-04-08 07:52:55.344589 INFO buffer_manager] Allocated weights buffer at (22749188096, 0) [2026-04-08 07:52:55.344602 INFO buffer_manager] Allocated weights buffer at (22749188096, 132120576) [2026-04-08 07:52:55.344604 INFO buffer_manager] Allocated weights buffer at (22881308672, 57344) [2026-04-08 07:52:55.344605 INFO buffer_manager] Allocated weights buffer at (22881366016, 132120576) [2026-04-08 07:52:55.344607 INFO buffer_manager] Allocated weights buffer at (23013486592, 57344) [2026-04-08 07:52:55.344608 INFO buffer_manager] Allocated weights buffer at (23013543936, 132120576) [2026-04-08 07:52:55.344614 INFO buffer_manager] Allocated weights buffer at (23145664512, 57344) [2026-04-08 07:52:55.344616 INFO buffer_manager] Allocated weights buffer at (23145721856, 0) [2026-04-08 07:52:55.344617 INFO fp8_moe_dpdk] fp8_moe_dpdk: init_layer_cached(layer_idx=60, cache_slot=60) planned desc only [2026-04-08 07:52:55.707499 INFO buffer_manager] Allocated weights buffer at (23145721856, 0) [2026-04-08 07:52:55.707521 INFO buffer_manager] Allocated weights buffer at (23145721856, 132120576) [2026-04-08 07:52:55.707523 INFO buffer_manager] Allocated weights buffer at (23277842432, 57344) [2026-04-08 07:52:55.707525 INFO buffer_manager] Allocated weights buffer at (23277899776, 132120576) [2026-04-08 07:52:55.707526 INFO buffer_manager] Allocated weights buffer at (23410020352, 57344) [2026-04-08 07:52:55.707528 INFO buffer_manager] Allocated weights buffer at (23410077696, 132120576) [2026-04-08 07:52:55.707530 INFO buffer_manager] Allocated weights buffer at (23542198272, 57344) [2026-04-08 07:52:55.707531 INFO buffer_manager] Allocated weights buffer at (23542255616, 0) [2026-04-08 07:52:55.707533 INFO fp8_moe_dpdk] fp8_moe_dpdk: 
init_layer_cached(layer_idx=61, cache_slot=61) planned desc only [2026-04-08 07:53:10.672840 INFO fp8_dpdk_common] fp8 fast path forced on by default in the current kernel build [2026-04-08 07:53:10.692483 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=88, expert_tiles=88, avg_tile_batch=1.27, prepare=429.044µs, send=3.164221ms, judge_wait=14.363103ms, fetch=1.161341ms, reduce=22ns; duck time-ns stats: p50=13.998113ms, p90=14.031549ms, max=14.054849ms; kernel_model: matmul=0.308281 GFLOP (21.934 GFLOP/s @ duck_max), param_stream=0.121111G (8.617 Gparam/s @ duck_max), weight_stream=129.994 MiB (9.698 GB/s @ duck_max) [2026-04-08 07:53:10.710603 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=88, expert_tiles=88, avg_tile_batch=1.27, prepare=121.871µs, send=769.28µs, judge_wait=14.394741ms, fetch=858.847µs, reduce=21ns; duck time-ns stats: p50=14.203147ms, p90=14.239515ms, max=14.261517ms; kernel_model: matmul=0.308281 GFLOP (21.616 GFLOP/s @ duck_max), param_stream=0.121111G (8.492 Gparam/s @ duck_max), weight_stream=129.994 MiB (9.558 GB/s @ duck_max) [2026-04-08 07:53:10.727971 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=88, expert_tiles=88, avg_tile_batch=1.27, prepare=22.831µs, send=772.831µs, judge_wait=13.918322ms, fetch=809.272µs, reduce=20ns; duck time-ns stats: p50=13.746686ms, p90=13.77692ms, max=13.797349ms; kernel_model: matmul=0.308281 GFLOP (22.344 GFLOP/s @ duck_max), param_stream=0.121111G (8.778 Gparam/s @ duck_max), weight_stream=129.994 MiB (9.879 GB/s @ duck_max) [2026-04-08 07:53:10.745987 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=92, expert_tiles=92, avg_tile_batch=1.22, prepare=77.556µs, send=772.582µs, judge_wait=14.54038ms, fetch=807.733µs, reduce=20ns; duck time-ns stats: p50=14.287876ms, p90=14.353918ms, max=14.403106ms; kernel_model: 
matmul=0.308281 GFLOP (21.404 GFLOP/s @ duck_max), param_stream=0.126616G (8.791 Gparam/s @ duck_max), weight_stream=135.903 MiB (9.894 GB/s @ duck_max) [2026-04-08 07:53:10.763058 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=89, expert_tiles=89, avg_tile_batch=1.26, prepare=20.535µs, send=770.362µs, judge_wait=13.630398ms, fetch=821.152µs, reduce=21ns; duck time-ns stats: p50=13.463839ms, p90=13.485709ms, max=13.506487ms; kernel_model: matmul=0.308281 GFLOP (22.825 GFLOP/s @ duck_max), param_stream=0.122487G (9.069 Gparam/s @ duck_max), weight_stream=131.471 MiB (10.207 GB/s @ duck_max) [2026-04-08 07:53:10.780155 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=76, expert_tiles=79, avg_tile_batch=1.42, prepare=20.621µs, send=769.833µs, judge_wait=13.686632ms, fetch=807.196µs, reduce=14ns; duck time-ns stats: p50=13.482482ms, p90=13.545096ms, max=13.569403ms; kernel_model: matmul=0.308281 GFLOP (22.719 GFLOP/s @ duck_max), param_stream=0.108724G (8.012 Gparam/s @ duck_max), weight_stream=116.699 MiB (9.018 GB/s @ duck_max) [2026-04-08 07:53:10.795972 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=71, expert_tiles=72, avg_tile_batch=1.56, prepare=20.317µs, send=769.996µs, judge_wait=12.431592ms, fetch=804.933µs, reduce=20ns; duck time-ns stats: p50=12.28915ms, p90=12.310893ms, max=12.316976ms; kernel_model: matmul=0.308281 GFLOP (25.029 GFLOP/s @ duck_max), param_stream=0.099090G (8.045 Gparam/s @ duck_max), weight_stream=106.359 MiB (9.055 GB/s @ duck_max) [2026-04-08 07:53:10.812682 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=75, expert_tiles=78, avg_tile_batch=1.44, prepare=20.466µs, send=772.971µs, judge_wait=13.286067ms, fetch=805.605µs, reduce=23ns; duck time-ns stats: p50=13.076624ms, p90=13.130905ms, max=13.162243ms; kernel_model: matmul=0.308281 GFLOP 
(23.422 GFLOP/s @ duck_max), param_stream=0.107348G (8.156 Gparam/s @ duck_max), weight_stream=115.222 MiB (9.179 GB/s @ duck_max) [2026-04-08 07:53:10.828815 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=73, expert_tiles=76, avg_tile_batch=1.47, prepare=20.698µs, send=774.521µs, judge_wait=12.754714ms, fetch=807.397µs, reduce=19ns; duck time-ns stats: p50=12.520174ms, p90=12.583356ms, max=12.629072ms; kernel_model: matmul=0.308281 GFLOP (24.410 GFLOP/s @ duck_max), param_stream=0.104595G (8.282 Gparam/s @ duck_max), weight_stream=112.267 MiB (9.321 GB/s @ duck_max) [2026-04-08 07:53:10.844126 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=68, expert_tiles=69, avg_tile_batch=1.62, prepare=20.345µs, send=772.826µs, judge_wait=11.910255ms, fetch=808.315µs, reduce=19ns; duck time-ns stats: p50=11.748197ms, p90=11.77451ms, max=11.798325ms; kernel_model: matmul=0.308281 GFLOP (26.129 GFLOP/s @ duck_max), param_stream=0.094962G (8.049 Gparam/s @ duck_max), weight_stream=101.927 MiB (9.059 GB/s @ duck_max) [2026-04-08 07:53:10.860003 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=62, expert_tiles=67, avg_tile_batch=1.67, prepare=20.97µs, send=768.923µs, judge_wait=12.468608ms, fetch=806.576µs, reduce=20ns; duck time-ns stats: p50=12.3043ms, p90=12.328225ms, max=12.337735ms; kernel_model: matmul=0.308281 GFLOP (24.987 GFLOP/s @ duck_max), param_stream=0.092209G (7.474 Gparam/s @ duck_max), weight_stream=98.973 MiB (8.412 GB/s @ duck_max) [2026-04-08 07:53:10.876032 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=57, expert_tiles=62, avg_tile_batch=1.81, prepare=20.028µs, send=769.516µs, judge_wait=12.632544ms, fetch=809.398µs, reduce=22ns; duck time-ns stats: p50=12.440917ms, p90=12.468891ms, max=12.513578ms; kernel_model: matmul=0.308281 GFLOP (24.636 GFLOP/s @ duck_max), 
param_stream=0.085328G (6.819 Gparam/s @ duck_max), weight_stream=91.587 MiB (7.675 GB/s @ duck_max) [2026-04-08 07:53:10.892587 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=60, expert_tiles=65, avg_tile_batch=1.72, prepare=20.881µs, send=770.309µs, judge_wait=13.163795ms, fetch=809.752µs, reduce=19ns; duck time-ns stats: p50=12.993524ms, p90=13.016837ms, max=13.03324ms; kernel_model: matmul=0.308281 GFLOP (23.653 GFLOP/s @ duck_max), param_stream=0.089457G (6.864 Gparam/s @ duck_max), weight_stream=96.018 MiB (7.725 GB/s @ duck_max) [2026-04-08 07:53:10.908453 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=56, expert_tiles=60, avg_tile_batch=1.87, prepare=19.793µs, send=768.596µs, judge_wait=12.453007ms, fetch=809.066µs, reduce=19ns; duck time-ns stats: p50=12.304547ms, p90=12.323038ms, max=12.332237ms; kernel_model: matmul=0.308281 GFLOP (24.998 GFLOP/s @ duck_max), param_stream=0.082575G (6.696 Gparam/s @ duck_max), weight_stream=88.632 MiB (7.536 GB/s @ duck_max) [2026-04-08 07:53:10.924707 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=70, expert_tiles=73, avg_tile_batch=1.53, prepare=20.325µs, send=769.693µs, judge_wait=12.840292ms, fetch=809.77µs, reduce=19ns; duck time-ns stats: p50=12.679198ms, p90=12.70409ms, max=12.724667ms; kernel_model: matmul=0.308281 GFLOP (24.227 GFLOP/s @ duck_max), param_stream=0.100467G (7.895 Gparam/s @ duck_max), weight_stream=107.836 MiB (8.886 GB/s @ duck_max) [2026-04-08 07:53:10.940992 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=62, expert_tiles=65, avg_tile_batch=1.72, prepare=20.998µs, send=770.016µs, judge_wait=12.911908ms, fetch=809.421µs, reduce=19ns; duck time-ns stats: p50=12.713126ms, p90=12.750326ms, max=12.79518ms; kernel_model: matmul=0.308281 GFLOP (24.094 GFLOP/s @ duck_max), param_stream=0.089457G (6.991 
Gparam/s @ duck_max), weight_stream=96.018 MiB (7.869 GB/s @ duck_max) [2026-04-08 07:53:10.957163 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=65, expert_tiles=69, avg_tile_batch=1.62, prepare=20.03µs, send=772.503µs, judge_wait=12.758878ms, fetch=811.032µs, reduce=14ns; duck time-ns stats: p50=12.584539ms, p90=12.602243ms, max=12.631164ms; kernel_model: matmul=0.308281 GFLOP (24.406 GFLOP/s @ duck_max), param_stream=0.094962G (7.518 Gparam/s @ duck_max), weight_stream=101.927 MiB (8.461 GB/s @ duck_max) [2026-04-08 07:53:10.972588 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=62, expert_tiles=65, avg_tile_batch=1.72, prepare=23.723µs, send=770.194µs, judge_wait=12.009732ms, fetch=807.663µs, reduce=13ns; duck time-ns stats: p50=11.814573ms, p90=11.853059ms, max=11.892328ms; kernel_model: matmul=0.308281 GFLOP (25.923 GFLOP/s @ duck_max), param_stream=0.089457G (7.522 Gparam/s @ duck_max), weight_stream=96.018 MiB (8.466 GB/s @ duck_max) [2026-04-08 07:53:10.988438 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=63, expert_tiles=66, avg_tile_batch=1.70, prepare=83.459µs, send=769.167µs, judge_wait=12.400028ms, fetch=808.049µs, reduce=14ns; duck time-ns stats: p50=12.22314ms, p90=12.269969ms, max=12.271706ms; kernel_model: matmul=0.308281 GFLOP (25.121 GFLOP/s @ duck_max), param_stream=0.090833G (7.402 Gparam/s @ duck_max), weight_stream=97.495 MiB (8.331 GB/s @ duck_max) [2026-04-08 07:53:11.005239 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=65, expert_tiles=68, avg_tile_batch=1.65, prepare=34.086µs, send=772.066µs, judge_wait=13.37964ms, fetch=808.802µs, reduce=15ns; duck time-ns stats: p50=13.162114ms, p90=13.198689ms, max=13.264398ms; kernel_model: matmul=0.308281 GFLOP (23.241 GFLOP/s @ duck_max), param_stream=0.093585G (7.055 Gparam/s @ duck_max), 
weight_stream=100.450 MiB (7.941 GB/s @ duck_max) [2026-04-08 07:53:11.020986 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=58, expert_tiles=64, avg_tile_batch=1.75, prepare=19.922µs, send=770.888µs, judge_wait=12.348667ms, fetch=809.806µs, reduce=13ns; duck time-ns stats: p50=12.164429ms, p90=12.188869ms, max=12.23338ms; kernel_model: matmul=0.308281 GFLOP (25.200 GFLOP/s @ duck_max), param_stream=0.088080G (7.200 Gparam/s @ duck_max), weight_stream=94.541 MiB (8.104 GB/s @ duck_max) [2026-04-08 07:53:11.035247 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=47, expert_tiles=55, avg_tile_batch=2.04, prepare=20.139µs, send=769.701µs, judge_wait=10.808881ms, fetch=826.124µs, reduce=164ns; duck time-ns stats: p50=10.667205ms, p90=10.686552ms, max=10.696976ms; kernel_model: matmul=0.308281 GFLOP (28.819 GFLOP/s @ duck_max), param_stream=0.075694G (7.076 Gparam/s @ duck_max), weight_stream=81.246 MiB (7.964 GB/s @ duck_max) [2026-04-08 07:53:11.049833 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=51, expert_tiles=57, avg_tile_batch=1.96, prepare=20.015µs, send=771.508µs, judge_wait=11.069471ms, fetch=809.095µs, reduce=15ns; duck time-ns stats: p50=10.901698ms, p90=10.930573ms, max=10.945554ms; kernel_model: matmul=0.308281 GFLOP (28.165 GFLOP/s @ duck_max), param_stream=0.078447G (7.167 Gparam/s @ duck_max), weight_stream=84.201 MiB (8.066 GB/s @ duck_max) [2026-04-08 07:53:11.065058 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=61, expert_tiles=66, avg_tile_batch=1.70, prepare=20.178µs, send=768.855µs, judge_wait=11.832234ms, fetch=808.485µs, reduce=14ns; duck time-ns stats: p50=11.665747ms, p90=11.69012ms, max=11.713403ms; kernel_model: matmul=0.308281 GFLOP (26.319 GFLOP/s @ duck_max), param_stream=0.090833G (7.755 Gparam/s @ duck_max), weight_stream=97.495 MiB (8.728 
GB/s @ duck_max) [2026-04-08 07:53:11.080096 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=51, expert_tiles=58, avg_tile_batch=1.93, prepare=19.548µs, send=770.339µs, judge_wait=11.644089ms, fetch=810.714µs, reduce=15ns; duck time-ns stats: p50=11.451075ms, p90=11.488957ms, max=11.518441ms; kernel_model: matmul=0.308281 GFLOP (26.764 GFLOP/s @ duck_max), param_stream=0.079823G (6.930 Gparam/s @ duck_max), weight_stream=85.678 MiB (7.800 GB/s @ duck_max) [2026-04-08 07:53:11.095704 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=54, expert_tiles=60, avg_tile_batch=1.87, prepare=19.949µs, send=773.676µs, judge_wait=12.194457ms, fetch=808.811µs, reduce=21ns; duck time-ns stats: p50=12.056481ms, p90=12.070985ms, max=12.083439ms; kernel_model: matmul=0.308281 GFLOP (25.513 GFLOP/s @ duck_max), param_stream=0.082575G (6.834 Gparam/s @ duck_max), weight_stream=88.632 MiB (7.691 GB/s @ duck_max) [2026-04-08 07:53:11.111155 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=51, expert_tiles=59, avg_tile_batch=1.90, prepare=19.622µs, send=775.07µs, judge_wait=12.049014ms, fetch=809.55µs, reduce=20ns; duck time-ns stats: p50=11.905026ms, p90=11.919581ms, max=11.936957ms; kernel_model: matmul=0.308281 GFLOP (25.826 GFLOP/s @ duck_max), param_stream=0.081199G (6.802 Gparam/s @ duck_max), weight_stream=87.155 MiB (7.656 GB/s @ duck_max) [2026-04-08 07:53:11.127479 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=48, expert_tiles=57, avg_tile_batch=1.96, prepare=19.989µs, send=769.155µs, judge_wait=12.919697ms, fetch=809.941µs, reduce=14ns; duck time-ns stats: p50=12.758007ms, p90=12.781056ms, max=12.789353ms; kernel_model: matmul=0.308281 GFLOP (24.105 GFLOP/s @ duck_max), param_stream=0.078447G (6.134 Gparam/s @ duck_max), weight_stream=84.201 MiB (6.903 GB/s @ duck_max) [2026-04-08 
07:53:11.142020 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=57, expert_tiles=60, avg_tile_batch=1.87, prepare=38.35µs, send=768.977µs, judge_wait=11.066676ms, fetch=811.092µs, reduce=16ns; duck time-ns stats: p50=10.900521ms, p90=10.932281ms, max=10.942881ms; kernel_model: matmul=0.308281 GFLOP (28.172 GFLOP/s @ duck_max), param_stream=0.082575G (7.546 Gparam/s @ duck_max), weight_stream=88.632 MiB (8.493 GB/s @ duck_max) [2026-04-08 07:53:11.157426 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=52, expert_tiles=57, avg_tile_batch=1.96, prepare=19.938µs, send=770.108µs, judge_wait=12.006645ms, fetch=809.275µs, reduce=14ns; duck time-ns stats: p50=11.825492ms, p90=11.853403ms, max=11.892902ms; kernel_model: matmul=0.308281 GFLOP (25.921 GFLOP/s @ duck_max), param_stream=0.078447G (6.596 Gparam/s @ duck_max), weight_stream=84.201 MiB (7.424 GB/s @ duck_max) [2026-04-08 07:53:11.172572 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=53, expert_tiles=58, avg_tile_batch=1.93, prepare=19.881µs, send=768.828µs, judge_wait=11.754925ms, fetch=810.011µs, reduce=16ns; duck time-ns stats: p50=11.587064ms, p90=11.6037ms, max=11.630051ms; kernel_model: matmul=0.308281 GFLOP (26.507 GFLOP/s @ duck_max), param_stream=0.079823G (6.863 Gparam/s @ duck_max), weight_stream=85.678 MiB (7.725 GB/s @ duck_max) [2026-04-08 07:53:11.187436 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=46, expert_tiles=55, avg_tile_batch=2.04, prepare=22.943µs, send=768.415µs, judge_wait=11.428516ms, fetch=808.985µs, reduce=17ns; duck time-ns stats: p50=11.269538ms, p90=11.304055ms, max=11.313615ms; kernel_model: matmul=0.308281 GFLOP (27.249 GFLOP/s @ duck_max), param_stream=0.075694G (6.691 Gparam/s @ duck_max), weight_stream=81.246 MiB (7.530 GB/s @ duck_max) [2026-04-08 07:53:11.203912 INFO fp8_moe_dpdk] 
MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=58, expert_tiles=65, avg_tile_batch=1.72, prepare=23.827µs, send=769.867µs, judge_wait=13.086495ms, fetch=807.527µs, reduce=15ns; duck time-ns stats: p50=12.924257ms, p90=12.955684ms, max=12.974438ms; kernel_model: matmul=0.308281 GFLOP (23.761 GFLOP/s @ duck_max), param_stream=0.089457G (6.895 Gparam/s @ duck_max), weight_stream=96.018 MiB (7.760 GB/s @ duck_max) [2026-04-08 07:53:11.219384 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=55, expert_tiles=61, avg_tile_batch=1.84, prepare=19.334µs, send=769.62µs, judge_wait=12.027332ms, fetch=808.452µs, reduce=20ns; duck time-ns stats: p50=11.876852ms, p90=11.892816ms, max=11.907688ms; kernel_model: matmul=0.308281 GFLOP (25.889 GFLOP/s @ duck_max), param_stream=0.083952G (7.050 Gparam/s @ duck_max), weight_stream=90.109 MiB (7.935 GB/s @ duck_max) [2026-04-08 07:53:11.235586 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=60, expert_tiles=68, avg_tile_batch=1.65, prepare=19.909µs, send=769.54µs, judge_wait=12.768897ms, fetch=806.574µs, reduce=20ns; duck time-ns stats: p50=12.605135ms, p90=12.617316ms, max=12.641321ms; kernel_model: matmul=0.308281 GFLOP (24.387 GFLOP/s @ duck_max), param_stream=0.093585G (7.403 Gparam/s @ duck_max), weight_stream=100.450 MiB (8.332 GB/s @ duck_max) [2026-04-08 07:53:11.250892 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=55, expert_tiles=63, avg_tile_batch=1.78, prepare=18.578µs, send=769.231µs, judge_wait=11.872441ms, fetch=826.802µs, reduce=21ns; duck time-ns stats: p50=11.691787ms, p90=11.709647ms, max=11.743261ms; kernel_model: matmul=0.308281 GFLOP (26.252 GFLOP/s @ duck_max), param_stream=0.086704G (7.383 Gparam/s @ duck_max), weight_stream=93.064 MiB (8.310 GB/s @ duck_max) [2026-04-08 07:53:11.265953 INFO fp8_moe_dpdk] MoE prefill forward (Rust): 
batch_size=14, top_k=8, tasks=112, unique_experts=53, expert_tiles=60, avg_tile_batch=1.87, prepare=20.196µs, send=770.371µs, judge_wait=11.667296ms, fetch=809.82µs, reduce=23ns; duck time-ns stats: p50=11.482613ms, p90=11.513611ms, max=11.552634ms; kernel_model: matmul=0.308281 GFLOP (26.685 GFLOP/s @ duck_max), param_stream=0.082575G (7.148 Gparam/s @ duck_max), weight_stream=88.632 MiB (8.045 GB/s @ duck_max) [2026-04-08 07:53:11.282666 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=59, expert_tiles=64, avg_tile_batch=1.75, prepare=19.668µs, send=769.466µs, judge_wait=13.326736ms, fetch=806.977µs, reduce=16ns; duck time-ns stats: p50=13.170403ms, p90=13.183852ms, max=13.198227ms; kernel_model: matmul=0.308281 GFLOP (23.358 GFLOP/s @ duck_max), param_stream=0.088080G (6.674 Gparam/s @ duck_max), weight_stream=94.541 MiB (7.511 GB/s @ duck_max) [2026-04-08 07:53:11.299031 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=56, expert_tiles=61, avg_tile_batch=1.84, prepare=20.247µs, send=768.916µs, judge_wait=12.970354ms, fetch=811.091µs, reduce=13ns; duck time-ns stats: p50=12.800182ms, p90=12.834393ms, max=12.839328ms; kernel_model: matmul=0.308281 GFLOP (24.011 GFLOP/s @ duck_max), param_stream=0.083952G (6.539 Gparam/s @ duck_max), weight_stream=90.109 MiB (7.359 GB/s @ duck_max) [2026-04-08 07:53:11.314833 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=60, expert_tiles=66, avg_tile_batch=1.70, prepare=19.985µs, send=769.77µs, judge_wait=12.395525ms, fetch=805.855µs, reduce=13ns; duck time-ns stats: p50=12.241795ms, p90=12.254814ms, max=12.264446ms; kernel_model: matmul=0.308281 GFLOP (25.136 GFLOP/s @ duck_max), param_stream=0.090833G (7.406 Gparam/s @ duck_max), weight_stream=97.495 MiB (8.336 GB/s @ duck_max) [2026-04-08 07:53:11.331617 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, 
tasks=112, unique_experts=59, expert_tiles=66, avg_tile_batch=1.70, prepare=20.327µs, send=768.599µs, judge_wait=13.372764ms, fetch=809.957µs, reduce=14ns; duck time-ns stats: p50=13.203038ms, p90=13.223951ms, max=13.240392ms; kernel_model: matmul=0.308281 GFLOP (23.283 GFLOP/s @ duck_max), param_stream=0.090833G (6.860 Gparam/s @ duck_max), weight_stream=97.495 MiB (7.721 GB/s @ duck_max) [2026-04-08 07:53:11.347715 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=63, expert_tiles=68, avg_tile_batch=1.65, prepare=25.128µs, send=769.505µs, judge_wait=12.700029ms, fetch=808.454µs, reduce=21ns; duck time-ns stats: p50=12.516938ms, p90=12.550035ms, max=12.585805ms; kernel_model: matmul=0.308281 GFLOP (24.494 GFLOP/s @ duck_max), param_stream=0.093585G (7.436 Gparam/s @ duck_max), weight_stream=100.450 MiB (8.369 GB/s @ duck_max) [2026-04-08 07:53:11.364356 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=55, expert_tiles=62, avg_tile_batch=1.81, prepare=22.011µs, send=772.458µs, judge_wait=13.247498ms, fetch=812.269µs, reduce=20ns; duck time-ns stats: p50=13.082281ms, p90=13.102356ms, max=13.118903ms; kernel_model: matmul=0.308281 GFLOP (23.499 GFLOP/s @ duck_max), param_stream=0.085328G (6.504 Gparam/s @ duck_max), weight_stream=91.587 MiB (7.320 GB/s @ duck_max) [2026-04-08 07:53:11.379652 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=53, expert_tiles=59, avg_tile_batch=1.90, prepare=21.47µs, send=770.996µs, judge_wait=11.884212ms, fetch=808.844µs, reduce=22ns; duck time-ns stats: p50=11.690967ms, p90=11.71701ms, max=11.764561ms; kernel_model: matmul=0.308281 GFLOP (26.204 GFLOP/s @ duck_max), param_stream=0.081199G (6.902 Gparam/s @ duck_max), weight_stream=87.155 MiB (7.768 GB/s @ duck_max) [2026-04-08 07:53:11.395335 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=52, 
expert_tiles=61, avg_tile_batch=1.84, prepare=21.125µs, send=769.364µs, judge_wait=12.296514ms, fetch=809.344µs, reduce=20ns; duck time-ns stats: p50=12.107913ms, p90=12.138466ms, max=12.173098ms; kernel_model: matmul=0.308281 GFLOP (25.325 GFLOP/s @ duck_max), param_stream=0.083952G (6.896 Gparam/s @ duck_max), weight_stream=90.109 MiB (7.762 GB/s @ duck_max) [2026-04-08 07:53:11.411658 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=59, expert_tiles=65, avg_tile_batch=1.72, prepare=21.205µs, send=772.992µs, judge_wait=12.902791ms, fetch=810.429µs, reduce=13ns; duck time-ns stats: p50=12.722284ms, p90=12.757846ms, max=12.773584ms; kernel_model: matmul=0.308281 GFLOP (24.134 GFLOP/s @ duck_max), param_stream=0.089457G (7.003 Gparam/s @ duck_max), weight_stream=96.018 MiB (7.882 GB/s @ duck_max) [2026-04-08 07:53:11.427268 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=58, expert_tiles=64, avg_tile_batch=1.75, prepare=19.951µs, send=771.134µs, judge_wait=12.227306ms, fetch=809.183µs, reduce=13ns; duck time-ns stats: p50=12.055132ms, p90=12.075311ms, max=12.099493ms; kernel_model: matmul=0.308281 GFLOP (25.479 GFLOP/s @ duck_max), param_stream=0.088080G (7.280 Gparam/s @ duck_max), weight_stream=94.541 MiB (8.193 GB/s @ duck_max) [2026-04-08 07:53:11.442004 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=58, expert_tiles=63, avg_tile_batch=1.78, prepare=19.815µs, send=772.465µs, judge_wait=11.342339ms, fetch=807.083µs, reduce=15ns; duck time-ns stats: p50=11.165389ms, p90=11.198704ms, max=11.2182ms; kernel_model: matmul=0.308281 GFLOP (27.480 GFLOP/s @ duck_max), param_stream=0.086704G (7.729 Gparam/s @ duck_max), weight_stream=93.064 MiB (8.699 GB/s @ duck_max) [2026-04-08 07:53:11.460956 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=50, expert_tiles=58, 
avg_tile_batch=1.93, prepare=20.139µs, send=770.578µs, judge_wait=15.565815ms, fetch=809.764µs, reduce=14ns; duck time-ns stats: p50=15.413195ms, p90=15.433321ms, max=15.444659ms; kernel_model: matmul=0.308281 GFLOP (19.960 GFLOP/s @ duck_max), param_stream=0.079823G (5.168 Gparam/s @ duck_max), weight_stream=85.678 MiB (5.817 GB/s @ duck_max) [2026-04-08 07:53:11.476115 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=53, expert_tiles=58, avg_tile_batch=1.93, prepare=19.349µs, send=774.16µs, judge_wait=11.753744ms, fetch=813.78µs, reduce=14ns; duck time-ns stats: p50=11.580432ms, p90=11.614002ms, max=11.638597ms; kernel_model: matmul=0.308281 GFLOP (26.488 GFLOP/s @ duck_max), param_stream=0.079823G (6.858 Gparam/s @ duck_max), weight_stream=85.678 MiB (7.719 GB/s @ duck_max) [2026-04-08 07:53:11.492370 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=67, expert_tiles=69, avg_tile_batch=1.62, prepare=19.615µs, send=769.874µs, judge_wait=12.778229ms, fetch=813.163µs, reduce=16ns; duck time-ns stats: p50=12.615504ms, p90=12.631095ms, max=12.657459ms; kernel_model: matmul=0.308281 GFLOP (24.356 GFLOP/s @ duck_max), param_stream=0.094962G (7.502 Gparam/s @ duck_max), weight_stream=101.927 MiB (8.444 GB/s @ duck_max) [2026-04-08 07:53:11.509220 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=57, expert_tiles=63, avg_tile_batch=1.78, prepare=22.156µs, send=769.429µs, judge_wait=13.435832ms, fetch=810.999µs, reduce=15ns; duck time-ns stats: p50=13.242952ms, p90=13.28036ms, max=13.306751ms; kernel_model: matmul=0.308281 GFLOP (23.167 GFLOP/s @ duck_max), param_stream=0.086704G (6.516 Gparam/s @ duck_max), weight_stream=93.064 MiB (7.333 GB/s @ duck_max) [2026-04-08 07:53:11.524411 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=61, expert_tiles=65, avg_tile_batch=1.72, 
prepare=20.108µs, send=769.809µs, judge_wait=11.772481ms, fetch=811.234µs, reduce=14ns; duck time-ns stats: p50=11.613589ms, p90=11.648756ms, max=11.656974ms; kernel_model: matmul=0.308281 GFLOP (26.446 GFLOP/s @ duck_max), param_stream=0.089457G (7.674 Gparam/s @ duck_max), weight_stream=96.018 MiB (8.637 GB/s @ duck_max) [2026-04-08 07:53:11.540501 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=57, expert_tiles=63, avg_tile_batch=1.78, prepare=20.085µs, send=773.644µs, judge_wait=12.699156ms, fetch=811.776µs, reduce=14ns; duck time-ns stats: p50=12.548622ms, p90=12.565033ms, max=12.572195ms; kernel_model: matmul=0.308281 GFLOP (24.521 GFLOP/s @ duck_max), param_stream=0.086704G (6.896 Gparam/s @ duck_max), weight_stream=93.064 MiB (7.762 GB/s @ duck_max) [2026-04-08 07:53:11.555144 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=56, expert_tiles=61, avg_tile_batch=1.84, prepare=20.642µs, send=772.976µs, judge_wait=11.240977ms, fetch=808.998µs, reduce=30ns; duck time-ns stats: p50=11.040976ms, p90=11.069266ms, max=11.116178ms; kernel_model: matmul=0.308281 GFLOP (27.733 GFLOP/s @ duck_max), param_stream=0.083952G (7.552 Gparam/s @ duck_max), weight_stream=90.109 MiB (8.500 GB/s @ duck_max) [2026-04-08 07:53:11.571071 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=64, expert_tiles=67, avg_tile_batch=1.67, prepare=20.972µs, send=773.215µs, judge_wait=12.508624ms, fetch=809.808µs, reduce=13ns; duck time-ns stats: p50=12.369712ms, p90=12.384147ms, max=12.389764ms; kernel_model: matmul=0.308281 GFLOP (24.882 GFLOP/s @ duck_max), param_stream=0.092209G (7.442 Gparam/s @ duck_max), weight_stream=98.973 MiB (8.376 GB/s @ duck_max) [2026-04-08 07:53:11.585745 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=57, expert_tiles=59, avg_tile_batch=1.90, prepare=25.229µs, 
send=772.649µs, judge_wait=11.266089ms, fetch=810.93µs, reduce=14ns; duck time-ns stats: p50=11.085277ms, p90=11.126631ms, max=11.151314ms; kernel_model: matmul=0.308281 GFLOP (27.645 GFLOP/s @ duck_max), param_stream=0.081199G (7.282 Gparam/s @ duck_max), weight_stream=87.155 MiB (8.195 GB/s @ duck_max) [2026-04-08 07:53:11.601399 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=14, top_k=8, tasks=112, unique_experts=65, expert_tiles=69, avg_tile_batch=1.62, prepare=19.749µs, send=773.92µs, judge_wait=12.257491ms, fetch=810.134µs, reduce=15ns; duck time-ns stats: p50=12.076773ms, p90=12.130765ms, max=12.144113ms; kernel_model: matmul=0.308281 GFLOP (25.385 GFLOP/s @ duck_max), param_stream=0.094962G (7.820 Gparam/s @ duck_max), weight_stream=101.927 MiB (8.801 GB/s @ duck_max) [2026-04-08 07:53:11.660086 INFO fp8_moe_dpdk] MoE prefill forward (Rust): batch_size=13, top_k=8, tasks=104, unique_experts=75, expert_tiles=76, avg_tile_batch=1.37, prepare=158.626µs, send=2.182101ms, judge_wait=13.124639ms, fetch=812.521µs, reduce=25ns; duck time-ns stats: p50=12.721709ms, p90=12.75436ms, max=12.76215ms; kernel_model: matmul=0.286261 GFLOP (22.430 GFLOP/s @ duck_max), param_stream=0.104595G (8.196 Gparam/s @ duck_max), weight_stream=112.267 MiB (9.224 GB/s @ duck_max) [2026-04-08 07:53:11.668496 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 1.037503ms; phases: prepare=4.843µs, send=63.966µs, judge_wait=830.229µs, fetch=96.821µs, reduce=21ns, writeback=804ns; duck time-ns stats: p50=728.06µs, p90=731.279µs, max=734.676µs; effective_read: activated_experts=8, params=0.011010G (14.986 Gparam/s @ duck_max), memory=11.818 MiB (16.867 GB/s @ duck_max), judge_gap=95.553µs, judge_ratio=1.130x [2026-04-08 07:53:12.391315 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 1.759877ms; phases: prepare=5.736µs, send=832.795µs, judge_wait=786.96µs, fetch=97.122µs, reduce=20ns, writeback=504ns; duck time-ns stats: p50=705.495µs, p90=709.611µs, max=711.808µs; effective_read: 
activated_experts=8, params=0.011010G (15.468 Gparam/s @ duck_max), memory=11.818 MiB (17.409 GB/s @ duck_max), judge_gap=75.152µs, judge_ratio=1.106x Token # 1: 765.421ms; value: next_token_ids=tensor([1415], device='cuda:0') mtp accept=1 prop=1415 top1=1415 accp=0.995 next=draft=3099 prop=3099 olap pair=692.5ms serial=1280.9ms gain=588.4ms ratio=0.46 s0=610.7ms s1=670.2ms wait=0.2/43.8ms pred gate=device [2026-04-08 07:53:12.395274 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 973.206µs; phases: prepare=3.508µs, send=62.682µs, judge_wait=778.914µs, fetch=91.642µs, reduce=19ns, writeback=619ns; duck time-ns stats: p50=694.392µs, p90=698.182µs, max=700.161µs; effective_read: activated_experts=8, params=0.011010G (15.725 Gparam/s @ duck_max), memory=11.818 MiB (17.698 GB/s @ duck_max), judge_gap=78.753µs, judge_ratio=1.112x Token # 2: 3.793ms; value: next_token_ids=tensor([3099], device='cuda:0') mtp accept=1 prop=3099 top1=3099 accp=1.000 next=pair draft=4036 prop=4036 pred gate=device [2026-04-08 07:53:12.507888 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 978.841µs; phases: prepare=3.933µs, send=62.283µs, judge_wait=783.437µs, fetch=91.626µs, reduce=21ns, writeback=481ns; duck time-ns stats: p50=699.72µs, p90=703.414µs, max=705.446µs; effective_read: activated_experts=8, params=0.011010G (15.607 Gparam/s @ duck_max), memory=11.818 MiB (17.566 GB/s @ duck_max), judge_gap=77.991µs, judge_ratio=1.111x Token # 3: 112.784ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 accp=0.999 next=draft=10626 prop=10626 olap pair=107.4ms serial=189.8ms gain=82.4ms ratio=0.43 s0=5.1ms s1=184.7ms wait=0.1/50.0ms pred gate=device [2026-04-08 07:53:12.511912 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 1.02668ms; phases: prepare=3.112µs, send=62.767µs, judge_wait=777.496µs, fetch=101.743µs, reduce=137ns, writeback=508ns; duck time-ns stats: p50=694.87µs, p90=700.982µs, max=702.903µs; effective_read: activated_experts=8, 
params=0.011010G (15.664 Gparam/s @ duck_max), memory=11.818 MiB (17.629 GB/s @ duck_max), judge_gap=74.593µs, judge_ratio=1.106x Token # 4: 3.856ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=1.000 next=pair draft=19 prop=1255 pred gate=device [2026-04-08 07:53:12.625000 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 978.5µs; phases: prepare=3.151µs, send=61.087µs, judge_wait=785.143µs, fetch=92.056µs, reduce=20ns, writeback=451ns; duck time-ns stats: p50=696.009µs, p90=702.913µs, max=706.983µs; effective_read: activated_experts=8, params=0.011010G (15.573 Gparam/s @ duck_max), memory=11.818 MiB (17.528 GB/s @ duck_max), judge_gap=78.16µs, judge_ratio=1.111x Token # 5: 113.300ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=1 prop=1255 top1=19 accp=0.993 next=draft=19 prop=19 olap pair=107.6ms serial=189.6ms gain=82.0ms ratio=0.43 s0=6.8ms s1=182.8ms wait=0.2/48.4ms pred gate=device [2026-04-08 07:53:12.629000 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 980.131µs; phases: prepare=4.44µs, send=63.741µs, judge_wait=783.43µs, fetch=91.186µs, reduce=21ns, writeback=504ns; duck time-ns stats: p50=699.148µs, p90=704.857µs, max=706.383µs; effective_read: activated_experts=8, params=0.011010G (15.587 Gparam/s @ duck_max), memory=11.818 MiB (17.542 GB/s @ duck_max), judge_gap=77.047µs, judge_ratio=1.109x Token # 6: 3.867ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=pair draft=621 prop=621 pred gate=device [2026-04-08 07:53:12.741944 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 985.505µs; phases: prepare=3.668µs, send=63.783µs, judge_wait=785.311µs, fetch=95.274µs, reduce=20ns, writeback=469ns; duck time-ns stats: p50=701.221µs, p90=706.283µs, max=708.461µs; effective_read: activated_experts=8, params=0.011010G (15.541 Gparam/s @ duck_max), memory=11.818 MiB (17.491 GB/s @ duck_max), judge_gap=76.85µs, judge_ratio=1.108x Token # 7: 
113.022ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=107.7ms serial=189.6ms gain=81.9ms ratio=0.43 s0=4.2ms s1=185.3ms wait=0.1/51.0ms pred gate=device [2026-04-08 07:53:12.745875 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 976.392µs; phases: prepare=3.414µs, send=64.323µs, judge_wait=780.7µs, fetch=91.194µs, reduce=19ns, writeback=579ns; duck time-ns stats: p50=696.239µs, p90=701.699µs, max=707.371µs; effective_read: activated_experts=8, params=0.011010G (15.565 Gparam/s @ duck_max), memory=11.818 MiB (17.518 GB/s @ duck_max), judge_gap=73.329µs, judge_ratio=1.104x Token # 8: 3.789ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device [2026-04-08 07:53:12.858631 INFO fp8_moe_dpdk] MoE forward e2e time (Rust): 988.277µs; phases: prepare=3.815µs, send=62.711µs, judge_wait=789.661µs, fetch=94.972µs, reduce=20ns, writeback=520ns; duck time-ns stats: p50=703.131µs, p90=708.519µs, max=713.4µs; effective_read: activated_experts=8, params=0.011010G (15.433 Gparam/s @ duck_max), memory=11.818 MiB (17.370 GB/s @ duck_max), judge_gap=76.261µs, judge_ratio=1.107x Token # 9: 112.827ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=76499 prop=76499 olap pair=107.5ms serial=190.2ms gain=82.7ms ratio=0.43 s0=4.2ms s1=186.1ms wait=0.1/50.9ms pred gate=device Token # 10: 3.819ms; value: next_token_ids=tensor([76499], device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.858 next=pair draft=303 prop=303 pred gate=device Token # 11: 113.749ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.997 next=draft=15495 prop=15495 olap pair=107.7ms serial=189.9ms gain=82.3ms ratio=0.43 s0=4.9ms s1=185.0ms wait=0.1/50.6ms pred gate=device Token # 12: 4.614ms; value: next_token_ids=tensor([15495], 
device='cuda:0') mtp accept=1 prop=15495 top1=4916 accp=0.405 next=pair draft=2935 prop=2935 pred gate=device Token # 13: 113.051ms; value: next_token_ids=tensor([32000], device='cuda:0') mtp accept=0 prop=2935 top1=32000 accp=0.241 next=draft=6064 prop=6064 olap pair=107.5ms serial=189.3ms gain=81.8ms ratio=0.43 s0=7.9ms s1=181.4ms wait=0.2/47.2ms pred gate=device Token # 14: 112.796ms; value: next_token_ids=tensor([6064], device='cuda:0') mtp accept=1 prop=6064 top1=6064 accp=0.987 next=draft=320 prop=320 olap pair=107.3ms serial=190.1ms gain=82.7ms ratio=0.44 s0=3.8ms s1=186.3ms wait=0.1/51.7ms pred gate=device Token # 15: 3.800ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=1.000 next=pair draft=52197 prop=52197 pred gate=device Token # 16: 112.473ms; value: next_token_ids=tensor([52197], device='cuda:0') mtp accept=1 prop=52197 top1=52197 accp=0.721 next=draft=39932 prop=55262 olap pair=107.2ms serial=189.2ms gain=82.0ms ratio=0.43 s0=6.1ms s1=183.1ms wait=0.2/48.9ms pred gate=device Token # 17: 3.801ms; value: next_token_ids=tensor([39932], device='cuda:0') mtp accept=0 prop=55262 top1=39932 accp=0.998 next=pair draft=52271 prop=52271 pred gate=device Token # 18: 112.921ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=0 prop=52271 top1=14171 accp=0.491 next=draft=1057 prop=1057 olap pair=107.6ms serial=190.3ms gain=82.7ms ratio=0.43 s0=4.1ms s1=186.2ms wait=0.1/51.0ms pred gate=device Token # 19: 112.724ms; value: next_token_ids=tensor([1057], device='cuda:0') mtp accept=1 prop=1057 top1=1057 accp=0.988 next=draft=18804 prop=18804 olap pair=107.3ms serial=189.0ms gain=81.7ms ratio=0.43 s0=5.2ms s1=183.8ms wait=0.1/50.1ms pred gate=device Token # 20: 3.826ms; value: next_token_ids=tensor([12519], device='cuda:0') mtp accept=0 prop=18804 top1=12519 accp=0.478 next=pair draft=19 prop=1255 pred gate=device Token # 21: 113.100ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=1 
prop=1255 top1=1255 accp=0.187 next=draft=19 prop=19 olap pair=107.7ms serial=189.7ms gain=82.0ms ratio=0.43 s0=4.1ms s1=185.6ms wait=0.1/51.3ms pred gate=device Token # 22: 3.822ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=pair draft=621 prop=621 pred gate=device Token # 23: 113.945ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=108.4ms serial=189.6ms gain=81.2ms ratio=0.43 s0=6.4ms s1=183.2ms wait=0.2/48.7ms pred gate=device Token # 24: 3.782ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device Token # 25: 112.985ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=27413 prop=27413 olap pair=107.7ms serial=189.7ms gain=82.0ms ratio=0.43 s0=7.9ms s1=181.8ms wait=0.2/47.4ms pred gate=device Token # 26: 3.833ms; value: next_token_ids=tensor([301], device='cuda:0') mtp accept=0 prop=27413 top1=27413 accp=0.492 next=pair draft=1913 prop=1913 pred gate=device Token # 27: 112.852ms; value: next_token_ids=tensor([7849], device='cuda:0') mtp accept=0 prop=1913 top1=7849 accp=0.009 next=draft=8283 prop=8283 olap pair=107.5ms serial=190.5ms gain=82.9ms ratio=0.44 s0=3.8ms s1=186.7ms wait=0.1/51.7ms pred gate=device Token # 28: 113.155ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=0.996 next=draft=301 prop=301 olap pair=107.6ms serial=190.7ms gain=83.1ms ratio=0.44 s0=3.8ms s1=186.9ms wait=0.1/51.6ms pred gate=device Token # 29: 3.802ms; value: next_token_ids=tensor([301], device='cuda:0') mtp accept=1 prop=301 top1=301 accp=1.000 next=pair draft=18804 prop=18804 pred gate=device Token # 30: 112.920ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=0.801 next=draft=303 prop=303 olap pair=107.6ms 
serial=190.3ms gain=82.7ms ratio=0.43 s0=4.2ms s1=186.1ms wait=0.1/51.2ms pred gate=device Token # 31: 3.851ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.830 next=pair draft=7849 prop=7849 pred gate=device Token # 32: 113.427ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=0 prop=7849 top1=2431 accp=0.194 next=draft=580 prop=580 olap pair=108.1ms serial=190.8ms gain=82.7ms ratio=0.43 s0=5.4ms s1=185.4ms wait=0.1/49.6ms pred gate=device Token # 33: 113.256ms; value: next_token_ids=tensor([7849], device='cuda:0') mtp accept=0 prop=580 top1=7849 accp=0.004 next=draft=8283 prop=8283 olap pair=107.8ms serial=191.1ms gain=83.3ms ratio=0.44 s0=4.2ms s1=186.9ms wait=0.1/51.1ms pred gate=device Token # 34: 113.303ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=21988 prop=21988 olap pair=107.8ms serial=190.9ms gain=83.1ms ratio=0.44 s0=4.0ms s1=186.9ms wait=0.1/51.3ms pred gate=device Token # 35: 3.855ms; value: next_token_ids=tensor([4857], device='cuda:0') mtp accept=0 prop=21988 top1=21988 accp=0.718 next=pair draft=21988 prop=21988 pred gate=device Token # 36: 113.232ms; value: next_token_ids=tensor([21988], device='cuda:0') mtp accept=1 prop=21988 top1=21988 accp=1.000 next=draft=303 prop=320 olap pair=107.9ms serial=190.8ms gain=82.9ms ratio=0.43 s0=4.2ms s1=186.6ms wait=0.1/51.0ms pred gate=device Token # 37: 3.738ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=303 accp=0.928 next=pair draft=4389 prop=4389 pred gate=device Token # 38: 112.951ms; value: next_token_ids=tensor([4389], device='cuda:0') mtp accept=1 prop=4389 top1=4389 accp=0.979 next=draft=28669 prop=28669 olap pair=107.7ms serial=190.7ms gain=83.1ms ratio=0.44 s0=3.8ms s1=186.9ms wait=0.1/51.7ms pred gate=device Token # 39: 3.733ms; value: next_token_ids=tensor([28669], device='cuda:0') mtp accept=1 prop=28669 top1=28669 accp=0.933 
next=pair draft=18804 prop=18804 pred gate=device Token # 40: 112.839ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=0.966 next=draft=10626 prop=10626 olap pair=107.6ms serial=190.7ms gain=83.1ms ratio=0.44 s0=3.8ms s1=186.9ms wait=0.1/51.6ms pred gate=device Token # 41: 3.740ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.894 next=pair draft=303 prop=303 pred gate=device Token # 42: 113.163ms; value: next_token_ids=tensor([6533], device='cuda:0') mtp accept=0 prop=303 top1=6533 accp=0.427 next=draft=303 prop=303 olap pair=107.8ms serial=191.0ms gain=83.2ms ratio=0.44 s0=3.8ms s1=187.3ms wait=0.1/51.7ms pred gate=device Token # 43: 113.345ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=39932 prop=39932 olap pair=108.0ms serial=191.5ms gain=83.4ms ratio=0.44 s0=3.8ms s1=187.7ms wait=0.1/51.7ms pred gate=device Token # 44: 3.751ms; value: next_token_ids=tensor([39932], device='cuda:0') mtp accept=1 prop=39932 top1=39932 accp=0.920 next=pair draft=14171 prop=14171 pred gate=device Token # 45: 113.101ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=1 prop=14171 top1=14171 accp=0.764 next=draft=1057 prop=1057 olap pair=107.8ms serial=191.1ms gain=83.3ms ratio=0.44 s0=3.7ms s1=187.4ms wait=0.1/51.8ms pred gate=device Token # 46: 3.737ms; value: next_token_ids=tensor([1057], device='cuda:0') mtp accept=1 prop=1057 top1=1057 accp=0.907 next=pair draft=16996 prop=12519 pred gate=device Token # 47: 113.243ms; value: next_token_ids=tensor([1168], device='cuda:0') mtp accept=0 prop=12519 top1=1168 accp=0.026 next=draft=12244 prop=12244 olap pair=107.9ms serial=190.5ms gain=82.7ms ratio=0.43 s0=5.8ms s1=184.7ms wait=0.2/49.2ms pred gate=device Token # 48: 113.390ms; value: next_token_ids=tensor([12244], device='cuda:0') mtp accept=1 prop=12244 top1=12244 accp=1.000 next=draft=18804 prop=18804 
olap pair=108.0ms serial=191.3ms gain=83.3ms ratio=0.44 s0=3.8ms s1=187.5ms wait=0.1/49.2ms pred gate=device Token # 49: 3.728ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=0.986 next=pair draft=320 prop=303 pred gate=device Token # 50: 112.681ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=303 top1=320 accp=0.864 next=draft=4389 prop=1207 olap pair=107.5ms serial=190.6ms gain=83.1ms ratio=0.44 s0=3.8ms s1=186.7ms wait=0.1/46.0ms pred gate=device Token # 51: 113.137ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.095 next=draft=4036 prop=4754 olap pair=107.9ms serial=191.3ms gain=83.4ms ratio=0.44 s0=4.0ms s1=187.3ms wait=0.1/45.8ms pred gate=device Token # 52: 3.685ms; value: next_token_ids=tensor([4754], device='cuda:0') mtp accept=1 prop=4754 top1=4754 accp=0.329 next=pair draft=303 prop=303 pred gate=device Token # 53: 113.507ms; value: next_token_ids=tensor([768], device='cuda:0') mtp accept=0 prop=303 top1=768 accp=0.238 next=draft=23153 prop=23153 olap pair=107.9ms serial=191.2ms gain=83.4ms ratio=0.44 s0=4.3ms s1=186.9ms wait=0.1/45.8ms pred gate=device Token # 54: 113.526ms; value: next_token_ids=tensor([445], device='cuda:0') mtp accept=0 prop=23153 top1=4036 accp=0.220 next=draft=27318 prop=27318 olap pair=108.3ms serial=192.1ms gain=83.8ms ratio=0.44 s0=3.8ms s1=188.4ms wait=0.1/46.5ms pred gate=device Token # 55: 113.151ms; value: next_token_ids=tensor([13653], device='cuda:0') mtp accept=0 prop=27318 top1=13653 accp=0.321 next=draft=34571 prop=34571 olap pair=107.8ms serial=191.1ms gain=83.2ms ratio=0.44 s0=5.2ms s1=185.8ms wait=0.1/44.4ms pred gate=device Token # 56: 113.807ms; value: next_token_ids=tensor([34571], device='cuda:0') mtp accept=1 prop=34571 top1=34571 accp=1.000 next=draft=303 prop=303 olap pair=108.3ms serial=191.6ms gain=83.3ms ratio=0.43 s0=5.2ms s1=186.4ms wait=0.1/44.8ms pred gate=device Token # 57: 
3.750ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.998 next=pair draft=97105 prop=97105 pred gate=device Token # 58: 113.476ms; value: next_token_ids=tensor([97105], device='cuda:0') mtp accept=1 prop=97105 top1=97105 accp=0.690 next=draft=6715 prop=6715 olap pair=108.3ms serial=191.8ms gain=83.5ms ratio=0.44 s0=3.8ms s1=188.0ms wait=0.1/46.3ms pred gate=device Token # 59: 3.682ms; value: next_token_ids=tensor([16806], device='cuda:0') mtp accept=0 prop=6715 top1=16806 accp=0.485 next=pair draft=39932 prop=39932 pred gate=device Token # 60: 113.199ms; value: next_token_ids=tensor([1300], device='cuda:0') mtp accept=0 prop=39932 top1=39932 accp=0.727 next=draft=33298 prop=33298 olap pair=108.0ms serial=191.4ms gain=83.5ms ratio=0.44 s0=4.1ms s1=187.4ms wait=0.1/45.5ms pred gate=device Token # 61: 114.084ms; value: next_token_ids=tensor([33298], device='cuda:0') mtp accept=1 prop=33298 top1=33298 accp=0.916 next=draft=10626 prop=10626 olap pair=108.0ms serial=191.1ms gain=83.1ms ratio=0.43 s0=5.6ms s1=185.5ms wait=0.2/44.0ms pred gate=device Token # 62: 4.582ms; value: next_token_ids=tensor([3796], device='cuda:0') mtp accept=0 prop=10626 top1=3796 accp=0.198 next=pair draft=8283 prop=8283 pred gate=device Token # 63: 113.702ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=8283 top1=10626 accp=0.264 next=draft=303 prop=303 olap pair=108.2ms serial=191.7ms gain=83.5ms ratio=0.44 s0=5.0ms s1=186.8ms wait=0.1/44.9ms pred gate=device Token # 64: 113.556ms; value: next_token_ids=tensor([4498], device='cuda:0') mtp accept=0 prop=303 top1=4498 accp=0.031 next=draft=303 prop=303 olap pair=108.2ms serial=192.0ms gain=83.8ms ratio=0.44 s0=4.1ms s1=188.0ms wait=0.1/45.8ms pred gate=device Token # 65: 113.655ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=303 top1=303 accp=0.690 next=draft=8353 prop=8353 olap pair=108.4ms serial=192.2ms gain=83.8ms ratio=0.44 s0=4.2ms 
s1=187.9ms wait=0.1/45.3ms pred gate=device Token # 66: 113.592ms; value: next_token_ids=tensor([8353], device='cuda:0') mtp accept=1 prop=8353 top1=8353 accp=1.000 next=draft=303 prop=303 olap pair=108.2ms serial=191.8ms gain=83.6ms ratio=0.44 s0=4.2ms s1=187.6ms wait=0.1/45.3ms pred gate=device Token # 67: 3.740ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=10626 prop=10626 pred gate=device Token # 68: 113.667ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.565 next=draft=19 prop=19 olap pair=108.4ms serial=192.3ms gain=83.9ms ratio=0.44 s0=4.2ms s1=188.1ms wait=0.1/45.4ms pred gate=device Token # 69: 3.745ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=0.934 next=pair draft=621 prop=621 pred gate=device Token # 70: 113.459ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=108.1ms serial=191.6ms gain=83.5ms ratio=0.44 s0=4.3ms s1=187.3ms wait=0.1/44.9ms pred gate=device Token # 71: 3.748ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device Token # 72: 113.249ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=27413 prop=27413 olap pair=108.0ms serial=191.5ms gain=83.5ms ratio=0.44 s0=4.2ms s1=187.3ms wait=0.1/45.1ms pred gate=device Token # 73: 3.804ms; value: next_token_ids=tensor([27413], device='cuda:0') mtp accept=1 prop=27413 top1=27413 accp=0.762 next=pair draft=8283 prop=8283 pred gate=device Token # 74: 112.827ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=1275 prop=1275 olap pair=107.6ms serial=190.6ms gain=83.0ms ratio=0.44 s0=4.3ms s1=186.3ms wait=0.1/44.8ms pred gate=device Token # 75: 3.770ms; value: 
next_token_ids=tensor([673], device='cuda:0') mtp accept=0 prop=1275 top1=673 accp=0.482 next=pair draft=3907 prop=3907 pred gate=device Token # 76: 112.981ms; value: next_token_ids=tensor([3907], device='cuda:0') mtp accept=1 prop=3907 top1=3907 accp=0.737 next=draft=1122 prop=1122 olap pair=107.7ms serial=190.8ms gain=83.2ms ratio=0.44 s0=4.3ms s1=186.6ms wait=0.1/44.9ms pred gate=device Token # 77: 3.688ms; value: next_token_ids=tensor([1122], device='cuda:0') mtp accept=1 prop=1122 top1=1122 accp=0.969 next=pair draft=303 prop=303 pred gate=device Token # 78: 112.943ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.917 next=draft=2431 prop=2431 olap pair=107.7ms serial=190.9ms gain=83.2ms ratio=0.44 s0=4.2ms s1=186.7ms wait=0.1/45.2ms pred gate=device Token # 79: 3.695ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=1 prop=2431 top1=2431 accp=0.692 next=pair draft=58788 prop=58788 pred gate=device Token # 80: 113.653ms; value: next_token_ids=tensor([422], device='cuda:0') mtp accept=0 prop=58788 top1=422 accp=0.093 next=draft=2693 prop=2693 olap pair=108.5ms serial=192.4ms gain=84.0ms ratio=0.44 s0=4.2ms s1=188.2ms wait=0.1/45.1ms pred gate=device Token # 81: 113.546ms; value: next_token_ids=tensor([2693], device='cuda:0') mtp accept=1 prop=2693 top1=2693 accp=0.697 next=draft=4899 prop=4899 olap pair=108.2ms serial=191.9ms gain=83.7ms ratio=0.44 s0=4.1ms s1=187.9ms wait=0.1/45.4ms pred gate=device Token # 82: 3.689ms; value: next_token_ids=tensor([4899], device='cuda:0') mtp accept=1 prop=4899 top1=4899 accp=1.000 next=pair draft=320 prop=320 pred gate=device Token # 83: 113.941ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=0.998 next=draft=1207 prop=1207 olap pair=108.8ms serial=193.2ms gain=84.4ms ratio=0.44 s0=3.7ms s1=189.5ms wait=0.1/46.4ms pred gate=device Token # 84: 3.701ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 
prop=1207 top1=1207 accp=0.962 next=pair draft=23153 prop=23153 pred gate=device Token # 85: 113.595ms; value: next_token_ids=tensor([13470], device='cuda:0') mtp accept=0 prop=23153 top1=13470 accp=0.133 next=draft=23153 prop=23153 olap pair=108.4ms serial=192.5ms gain=84.1ms ratio=0.44 s0=3.8ms s1=188.7ms wait=0.1/46.2ms pred gate=device Token # 86: 113.678ms; value: next_token_ids=tensor([23153], device='cuda:0') mtp accept=1 prop=23153 top1=23153 accp=0.980 next=draft=8673 prop=8673 olap pair=108.4ms serial=192.2ms gain=83.8ms ratio=0.44 s0=4.3ms s1=188.0ms wait=0.1/45.1ms pred gate=device Token # 87: 3.889ms; value: next_token_ids=tensor([389], device='cuda:0') mtp accept=0 prop=8673 top1=389 accp=0.331 next=pair draft=428 prop=428 pred gate=device Token # 88: 113.175ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=1.000 next=draft=4036 prop=4036 olap pair=107.9ms serial=191.1ms gain=83.3ms ratio=0.44 s0=4.3ms s1=186.8ms wait=0.1/45.1ms pred gate=device Token # 89: 3.738ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 accp=1.000 next=pair draft=10626 prop=10626 pred gate=device Token # 90: 113.774ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=1.000 next=draft=4231 prop=4231 olap pair=108.5ms serial=192.4ms gain=83.9ms ratio=0.44 s0=4.2ms s1=188.2ms wait=0.1/45.2ms pred gate=device Token # 91: 3.776ms; value: next_token_ids=tensor([4231], device='cuda:0') mtp accept=1 prop=4231 top1=4231 accp=0.860 next=pair draft=3996 prop=39932 pred gate=device Token # 92: 113.603ms; value: next_token_ids=tensor([39932], device='cuda:0') mtp accept=1 prop=39932 top1=39932 accp=0.197 next=draft=580 prop=3796 olap pair=108.4ms serial=192.1ms gain=83.7ms ratio=0.44 s0=4.2ms s1=187.9ms wait=0.1/44.9ms pred gate=device Token # 93: 3.697ms; value: next_token_ids=tensor([22459], device='cuda:0') mtp accept=0 prop=3796 top1=7157 accp=0.066 
next=pair draft=23153 prop=478 pred gate=device Token # 94: 113.703ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=0.396 next=draft=13103 prop=13103 olap pair=108.4ms serial=192.3ms gain=83.9ms ratio=0.44 s0=4.0ms s1=188.3ms wait=0.1/45.6ms pred gate=device Token # 95: 3.769ms; value: next_token_ids=tensor([4534], device='cuda:0') mtp accept=0 prop=13103 top1=29653 accp=0.102 next=pair draft=303 prop=303 pred gate=device Token # 96: 113.656ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=50687 prop=50687 olap pair=108.3ms serial=192.4ms gain=84.0ms ratio=0.44 s0=3.7ms s1=188.6ms wait=0.1/46.3ms pred gate=device Token # 97: 3.681ms; value: next_token_ids=tensor([445], device='cuda:0') mtp accept=0 prop=50687 top1=445 accp=0.540 next=pair draft=27318 prop=20143 pred gate=device Token # 98: 113.565ms; value: next_token_ids=tensor([13503], device='cuda:0') mtp accept=0 prop=20143 top1=79075 accp=0.478 next=draft=15087 prop=15087 olap pair=108.4ms serial=192.5ms gain=84.1ms ratio=0.44 s0=4.4ms s1=188.1ms wait=0.1/45.0ms pred gate=device Token # 99: 113.510ms; value: next_token_ids=tensor([15087], device='cuda:0') mtp accept=1 prop=15087 top1=15087 accp=0.601 next=draft=525 prop=525 olap pair=108.2ms serial=192.2ms gain=83.9ms ratio=0.44 s0=4.3ms s1=187.9ms wait=0.1/45.0ms pred gate=device Token # 100: 3.690ms; value: next_token_ids=tensor([525], device='cuda:0') mtp accept=1 prop=525 top1=525 accp=1.000 next=pair draft=303 prop=303 pred gate=device Token # 101: 113.454ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.991 next=draft=4036 prop=4036 olap pair=108.2ms serial=192.1ms gain=83.9ms ratio=0.44 s0=4.3ms s1=187.8ms wait=0.1/45.1ms pred gate=device Token # 102: 3.738ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 accp=0.848 next=pair draft=10626 prop=10626 pred gate=device 
Token # 103: 113.405ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.999 next=draft=1057 prop=1057 olap pair=108.2ms serial=192.1ms gain=83.9ms ratio=0.44 s0=4.3ms s1=187.8ms wait=0.1/44.7ms pred gate=device Token # 104: 3.758ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=0 prop=1057 top1=1457 accp=0.100 next=pair draft=18 prop=18 pred gate=device Token # 105: 114.281ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=719 prop=719 olap pair=108.6ms serial=192.8ms gain=84.1ms ratio=0.44 s0=4.4ms s1=188.3ms wait=0.1/44.8ms pred gate=device Token # 106: 4.173ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=1 prop=719 top1=719 accp=0.980 next=pair draft=8283 prop=8283 pred gate=device Token # 107: 113.470ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=0.848 next=draft=2431 prop=2431 olap pair=108.1ms serial=191.8ms gain=83.7ms ratio=0.44 s0=4.4ms s1=187.4ms wait=0.1/44.5ms pred gate=device Token # 108: 3.700ms; value: next_token_ids=tensor([13709], device='cuda:0') mtp accept=0 prop=2431 top1=13709 accp=0.213 next=pair draft=1334 prop=42829 pred gate=device Token # 109: 113.226ms; value: next_token_ids=tensor([42829], device='cuda:0') mtp accept=1 prop=42829 top1=42829 accp=0.468 next=draft=11925 prop=32576 olap pair=108.0ms serial=191.6ms gain=83.7ms ratio=0.44 s0=4.3ms s1=187.3ms wait=0.1/45.1ms pred gate=device Token # 110: 3.745ms; value: next_token_ids=tensor([11925], device='cuda:0') mtp accept=0 prop=32576 top1=11925 accp=0.848 next=pair draft=6322 prop=6322 pred gate=device Token # 111: 113.768ms; value: next_token_ids=tensor([6322], device='cuda:0') mtp accept=1 prop=6322 top1=6322 accp=1.000 next=draft=303 prop=303 olap pair=108.5ms serial=191.9ms gain=83.4ms ratio=0.43 s0=4.4ms s1=187.5ms wait=0.1/44.9ms pred gate=device Token # 112: 3.719ms; value: 
next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=4225 prop=4225 pred gate=device Token # 113: 113.328ms; value: next_token_ids=tensor([4225], device='cuda:0') mtp accept=1 prop=4225 top1=1207 accp=0.476 next=draft=2431 prop=2431 olap pair=108.1ms serial=191.3ms gain=83.2ms ratio=0.43 s0=4.2ms s1=187.1ms wait=0.1/45.3ms pred gate=device Token # 114: 3.726ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=1 prop=2431 top1=2431 accp=0.990 next=pair draft=2099 prop=2099 pred gate=device Token # 115: 113.879ms; value: next_token_ids=tensor([2099], device='cuda:0') mtp accept=1 prop=2099 top1=2099 accp=0.935 next=draft=6640 prop=6640 olap pair=108.7ms serial=193.1ms gain=84.4ms ratio=0.44 s0=4.3ms s1=188.8ms wait=0.1/44.8ms pred gate=device Token # 116: 3.762ms; value: next_token_ids=tensor([6640], device='cuda:0') mtp accept=1 prop=6640 top1=6640 accp=0.961 next=pair draft=59857 prop=59857 pred gate=device Token # 117: 113.620ms; value: next_token_ids=tensor([17061], device='cuda:0') mtp accept=0 prop=59857 top1=7493 accp=0.304 next=draft=26560 prop=26560 olap pair=108.4ms serial=192.5ms gain=84.1ms ratio=0.44 s0=4.3ms s1=188.2ms wait=0.1/45.2ms pred gate=device Token # 118: 113.981ms; value: next_token_ids=tensor([26560], device='cuda:0') mtp accept=1 prop=26560 top1=26560 accp=1.000 next=draft=320 prop=320 olap pair=108.6ms serial=192.9ms gain=84.2ms ratio=0.44 s0=4.3ms s1=188.6ms wait=0.1/45.2ms pred gate=device Token # 119: 3.766ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=1.000 next=pair draft=6640 prop=6640 pred gate=device Token # 120: 113.402ms; value: next_token_ids=tensor([6640], device='cuda:0') mtp accept=1 prop=6640 top1=6640 accp=0.996 next=draft=2431 prop=2431 olap pair=108.2ms serial=192.0ms gain=83.8ms ratio=0.44 s0=4.3ms s1=187.7ms wait=0.1/45.1ms pred gate=device Token # 121: 3.700ms; value: next_token_ids=tensor([2431], 
device='cuda:0') mtp accept=1 prop=2431 top1=2431 accp=0.991 next=pair draft=6063 prop=6063 pred gate=device Token # 122: 113.851ms; value: next_token_ids=tensor([6063], device='cuda:0') mtp accept=1 prop=6063 top1=6063 accp=0.658 next=draft=10251 prop=10251 olap pair=108.6ms serial=192.6ms gain=84.1ms ratio=0.44 s0=4.4ms s1=188.3ms wait=0.1/44.8ms pred gate=device Token # 123: 3.728ms; value: next_token_ids=tensor([10251], device='cuda:0') mtp accept=1 prop=10251 top1=10251 accp=1.000 next=pair draft=1300 prop=1316 pred gate=device Token # 124: 114.051ms; value: next_token_ids=tensor([1300], device='cuda:0') mtp accept=0 prop=1316 top1=1300 accp=0.563 next=draft=5402 prop=5402 olap pair=108.8ms serial=193.4ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.0ms wait=0.1/44.9ms pred gate=device Token # 125: 113.459ms; value: next_token_ids=tensor([5402], device='cuda:0') mtp accept=1 prop=5402 top1=5402 accp=0.979 next=draft=27095 prop=27095 olap pair=108.2ms serial=191.7ms gain=83.6ms ratio=0.44 s0=4.5ms s1=187.3ms wait=0.1/45.0ms pred gate=device Token # 126: 3.699ms; value: next_token_ids=tensor([6617], device='cuda:0') mtp accept=0 prop=27095 top1=6617 accp=0.369 next=pair draft=23153 prop=23153 pred gate=device Token # 127: 115.871ms; value: next_token_ids=tensor([23153], device='cuda:0') mtp accept=1 prop=23153 top1=23153 accp=0.988 next=draft=303 prop=303 olap pair=110.6ms serial=196.3ms gain=85.7ms ratio=0.44 s0=4.3ms s1=192.0ms wait=0.1/45.3ms pred gate=device Token # 128: 3.732ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.991 next=pair draft=4029 prop=4029 pred gate=device Token # 129: 113.842ms; value: next_token_ids=tensor([4029], device='cuda:0') mtp accept=1 prop=4029 top1=4029 accp=1.000 next=draft=5109 prop=5109 olap pair=108.6ms serial=192.4ms gain=83.9ms ratio=0.44 s0=4.2ms s1=188.3ms wait=0.1/45.3ms pred gate=device Token # 130: 3.731ms; value: next_token_ids=tensor([90779], device='cuda:0') mtp accept=0 
prop=5109 top1=2431 accp=0.276 next=pair draft=1057 prop=1057 pred gate=device Token # 131: 113.296ms; value: next_token_ids=tensor([1300], device='cuda:0') mtp accept=0 prop=1057 top1=1057 accp=0.495 next=draft=32000 prop=32000 olap pair=108.1ms serial=191.5ms gain=83.4ms ratio=0.44 s0=4.3ms s1=187.3ms wait=0.1/45.2ms pred gate=device Token # 132: 114.107ms; value: next_token_ids=tensor([82558], device='cuda:0') mtp accept=0 prop=32000 top1=580 accp=0.124 next=draft=12837 prop=12837 olap pair=108.8ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.3ms s1=188.8ms wait=0.1/45.4ms pred gate=device Token # 133: 113.944ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=0 prop=12837 top1=14171 accp=0.050 next=draft=2920 prop=8283 olap pair=108.6ms serial=192.7ms gain=84.1ms ratio=0.44 s0=4.3ms s1=188.4ms wait=0.1/45.1ms pred gate=device Token # 134: 114.000ms; value: next_token_ids=tensor([2920], device='cuda:0') mtp accept=0 prop=8283 top1=2920 accp=0.725 next=draft=8283 prop=8283 olap pair=108.6ms serial=192.7ms gain=84.1ms ratio=0.44 s0=4.1ms s1=188.6ms wait=0.1/45.6ms pred gate=device Token # 135: 113.889ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=9939 prop=9939 olap pair=108.5ms serial=192.5ms gain=83.9ms ratio=0.44 s0=4.2ms s1=188.2ms wait=0.1/45.1ms pred gate=device Token # 136: 3.720ms; value: next_token_ids=tensor([9939], device='cuda:0') mtp accept=1 prop=9939 top1=9939 accp=0.933 next=pair draft=320 prop=320 pred gate=device Token # 137: 113.629ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=0.986 next=draft=1207 prop=1207 olap pair=108.4ms serial=192.4ms gain=84.0ms ratio=0.44 s0=4.2ms s1=188.2ms wait=0.1/45.3ms pred gate=device Token # 138: 3.714ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.996 next=pair draft=23153 prop=23153 pred gate=device Token # 139: 113.931ms; value: 
next_token_ids=tensor([23153], device='cuda:0') mtp accept=1 prop=23153 top1=23153 accp=0.997 next=draft=8673 prop=8673 olap pair=108.7ms serial=192.9ms gain=84.2ms ratio=0.44 s0=4.2ms s1=188.6ms wait=0.1/45.3ms pred gate=device Token # 140: 3.748ms; value: next_token_ids=tensor([8673], device='cuda:0') mtp accept=1 prop=8673 top1=8673 accp=0.999 next=pair draft=872 prop=872 pred gate=device Token # 141: 113.551ms; value: next_token_ids=tensor([872], device='cuda:0') mtp accept=1 prop=872 top1=872 accp=1.000 next=draft=428 prop=428 olap pair=108.3ms serial=192.0ms gain=83.7ms ratio=0.44 s0=4.2ms s1=187.8ms wait=0.1/45.1ms pred gate=device Token # 142: 3.783ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=1.000 next=pair draft=4916 prop=4916 pred gate=device Token # 143: 114.318ms; value: next_token_ids=tensor([4916], device='cuda:0') mtp accept=1 prop=4916 top1=4916 accp=1.000 next=draft=2935 prop=2935 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=4.2ms s1=189.5ms wait=0.1/45.5ms pred gate=device Token # 144: 3.756ms; value: next_token_ids=tensor([2935], device='cuda:0') mtp accept=1 prop=2935 top1=2935 accp=1.000 next=pair draft=6064 prop=6064 pred gate=device Token # 145: 113.472ms; value: next_token_ids=tensor([6064], device='cuda:0') mtp accept=1 prop=6064 top1=6064 accp=1.000 next=draft=4231 prop=4231 olap pair=108.2ms serial=191.9ms gain=83.6ms ratio=0.44 s0=4.3ms s1=187.6ms wait=0.1/45.1ms pred gate=device Token # 146: 3.736ms; value: next_token_ids=tensor([4231], device='cuda:0') mtp accept=1 prop=4231 top1=4231 accp=1.000 next=pair draft=2636 prop=2636 pred gate=device Token # 147: 114.291ms; value: next_token_ids=tensor([2636], device='cuda:0') mtp accept=1 prop=2636 top1=2636 accp=0.967 next=draft=55262 prop=2916 olap pair=108.2ms serial=191.5ms gain=83.3ms ratio=0.43 s0=6.3ms s1=185.2ms wait=0.2/43.3ms pred gate=device Token # 148: 4.579ms; value: next_token_ids=tensor([49554], device='cuda:0') 
mtp accept=0 prop=2916 top1=49554 accp=0.013 next=pair draft=3796 prop=82558 pred gate=device Token # 149: 113.600ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=0 prop=82558 top1=10626 accp=0.404 next=draft=10626 prop=10626 olap pair=108.2ms serial=191.5ms gain=83.4ms ratio=0.44 s0=5.0ms s1=186.6ms wait=0.1/44.7ms pred gate=device Token # 150: 114.195ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=1.000 next=draft=8283 prop=8283 olap pair=108.9ms serial=193.2ms gain=84.3ms ratio=0.44 s0=4.2ms s1=189.0ms wait=0.1/45.4ms pred gate=device Token # 151: 3.831ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=pair draft=478 prop=478 pred gate=device Token # 152: 114.151ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=0.922 next=draft=13103 prop=29653 olap pair=108.9ms serial=193.5ms gain=84.6ms ratio=0.44 s0=4.1ms s1=189.5ms wait=0.1/45.7ms pred gate=device Token # 153: 3.676ms; value: next_token_ids=tensor([29653], device='cuda:0') mtp accept=1 prop=29653 top1=29653 accp=0.211 next=pair draft=28669 prop=28669 pred gate=device Token # 154: 113.719ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=28669 top1=10626 accp=0.162 next=draft=15991 prop=15991 olap pair=108.5ms serial=192.4ms gain=83.9ms ratio=0.44 s0=4.2ms s1=188.1ms wait=0.1/45.2ms pred gate=device Token # 155: 113.866ms; value: next_token_ids=tensor([15991], device='cuda:0') mtp accept=1 prop=15991 top1=15991 accp=0.870 next=draft=303 prop=303 olap pair=108.7ms serial=192.7ms gain=84.1ms ratio=0.44 s0=4.3ms s1=188.5ms wait=0.1/45.1ms pred gate=device Token # 156: 3.649ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=18467 prop=34071 pred gate=device Token # 157: 113.757ms; value: next_token_ids=tensor([15023], device='cuda:0') mtp accept=0 prop=34071 
top1=18467 accp=0.685 next=draft=18467 prop=18467 olap pair=108.5ms serial=192.5ms gain=84.0ms ratio=0.44 s0=4.2ms s1=188.2ms wait=0.1/45.2ms pred gate=device Token # 158: 114.215ms; value: next_token_ids=tensor([18467], device='cuda:0') mtp accept=1 prop=18467 top1=18467 accp=0.989 next=draft=3796 prop=82558 olap pair=109.0ms serial=193.4ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.2ms wait=0.1/45.2ms pred gate=device Token # 159: 3.711ms; value: next_token_ids=tensor([82558], device='cuda:0') mtp accept=1 prop=82558 top1=82558 accp=0.385 next=pair draft=3449 prop=3449 pred gate=device Token # 160: 113.981ms; value: next_token_ids=tensor([47321], device='cuda:0') mtp accept=0 prop=3449 top1=47321 accp=0.039 next=draft=101672 prop=101672 olap pair=108.8ms serial=193.0ms gain=84.2ms ratio=0.44 s0=4.3ms s1=188.7ms wait=0.1/44.7ms pred gate=device Token # 161: 114.269ms; value: next_token_ids=tensor([301], device='cuda:0') mtp accept=0 prop=101672 top1=301 accp=0.228 next=draft=17585 prop=17585 olap pair=108.9ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.2ms s1=189.3ms wait=0.1/45.1ms pred gate=device Token # 162: 114.166ms; value: next_token_ids=tensor([17585], device='cuda:0') mtp accept=1 prop=17585 top1=17585 accp=0.995 next=draft=303 prop=303 olap pair=108.8ms serial=193.4ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.5ms wait=0.1/46.2ms pred gate=device Token # 163: 3.800ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.894 next=pair draft=7882 prop=7882 pred gate=device Token # 164: 113.726ms; value: next_token_ids=tensor([7882], device='cuda:0') mtp accept=1 prop=7882 top1=7882 accp=0.990 next=draft=642 prop=642 olap pair=108.5ms serial=191.9ms gain=83.4ms ratio=0.43 s0=5.9ms s1=186.0ms wait=0.2/44.0ms pred gate=device Token # 165: 3.660ms; value: next_token_ids=tensor([642], device='cuda:0') mtp accept=1 prop=642 top1=642 accp=0.677 next=pair draft=33347 prop=8283 pred gate=device Token # 166: 114.068ms; value: 
next_token_ids=tensor([33347], device='cuda:0') mtp accept=0 prop=8283 top1=33347 accp=0.623 next=draft=2449 prop=2449 olap pair=108.8ms serial=193.2ms gain=84.4ms ratio=0.44 s0=3.8ms s1=189.4ms wait=0.1/46.3ms pred gate=device Token # 167: 115.026ms; value: next_token_ids=tensor([2449], device='cuda:0') mtp accept=1 prop=2449 top1=2449 accp=1.000 next=draft=74534 prop=74534 olap pair=108.7ms serial=191.6ms gain=82.9ms ratio=0.43 s0=8.5ms s1=183.1ms wait=0.2/41.2ms pred gate=device Token # 168: 4.580ms; value: next_token_ids=tensor([74534], device='cuda:0') mtp accept=1 prop=74534 top1=74534 accp=1.000 next=pair draft=1148 prop=1148 pred gate=device Token # 169: 114.097ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=0 prop=1148 top1=303 accp=0.069 next=draft=1207 prop=1207 olap pair=108.7ms serial=192.8ms gain=84.1ms ratio=0.44 s0=4.3ms s1=188.5ms wait=0.1/45.4ms pred gate=device Token # 170: 114.175ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.906 next=draft=14933 prop=14933 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.2ms wait=0.1/45.0ms pred gate=device Token # 171: 3.775ms; value: next_token_ids=tensor([23153], device='cuda:0') mtp accept=0 prop=14933 top1=23153 accp=0.159 next=pair draft=872 prop=872 pred gate=device Token # 172: 113.827ms; value: next_token_ids=tensor([1530], device='cuda:0') mtp accept=0 prop=872 top1=1530 accp=0.021 next=draft=16590 prop=16590 olap pair=108.6ms serial=192.4ms gain=83.8ms ratio=0.44 s0=4.4ms s1=188.0ms wait=0.1/45.0ms pred gate=device Token # 173: 113.937ms; value: next_token_ids=tensor([16590], device='cuda:0') mtp accept=1 prop=16590 top1=16590 accp=0.729 next=draft=16734 prop=16734 olap pair=108.6ms serial=192.7ms gain=84.2ms ratio=0.44 s0=4.3ms s1=188.5ms wait=0.1/44.9ms pred gate=device Token # 174: 3.759ms; value: next_token_ids=tensor([16734], device='cuda:0') mtp accept=1 prop=16734 top1=16734 accp=1.000 next=pair 
draft=320 prop=320 pred gate=device Token # 175: 114.213ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=1.000 next=draft=9168 prop=9168 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.2ms wait=0.1/45.1ms pred gate=device Token # 176: 3.691ms; value: next_token_ids=tensor([9168], device='cuda:0') mtp accept=1 prop=9168 top1=9168 accp=1.000 next=pair draft=303 prop=303 pred gate=device Token # 177: 114.415ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.672 next=draft=10626 prop=10626 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.4ms s1=189.2ms wait=0.1/44.8ms pred gate=device Token # 178: 3.775ms; value: next_token_ids=tensor([22029], device='cuda:0') mtp accept=0 prop=10626 top1=3279 accp=0.391 next=pair draft=2056 prop=2056 pred gate=device Token # 179: 113.671ms; value: next_token_ids=tensor([2056], device='cuda:0') mtp accept=1 prop=2056 top1=6533 accp=0.451 next=draft=2431 prop=2431 olap pair=108.4ms serial=192.0ms gain=83.6ms ratio=0.44 s0=4.4ms s1=187.6ms wait=0.1/44.9ms pred gate=device Token # 180: 3.801ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=1 prop=2431 top1=2431 accp=0.945 next=pair draft=25422 prop=25422 pred gate=device Token # 181: 113.813ms; value: next_token_ids=tensor([25422], device='cuda:0') mtp accept=1 prop=25422 top1=25422 accp=1.000 next=draft=7849 prop=7849 olap pair=108.6ms serial=192.6ms gain=84.0ms ratio=0.44 s0=4.3ms s1=188.3ms wait=0.1/45.3ms pred gate=device Token # 182: 3.833ms; value: next_token_ids=tensor([1833], device='cuda:0') mtp accept=0 prop=7849 top1=7849 accp=0.622 next=pair draft=719 prop=719 pred gate=device Token # 183: 113.614ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=1 prop=719 top1=719 accp=1.000 next=draft=1057 prop=1057 olap pair=108.3ms serial=192.0ms gain=83.7ms ratio=0.44 s0=4.2ms s1=187.8ms wait=0.1/45.2ms pred gate=device Token # 
184: 3.741ms; value: next_token_ids=tensor([1057], device='cuda:0') mtp accept=1 prop=1057 top1=1057 accp=1.000 next=pair draft=8283 prop=8283 pred gate=device Token # 185: 114.172ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=320 prop=478 olap pair=108.9ms serial=192.9ms gain=84.0ms ratio=0.44 s0=3.9ms s1=189.0ms wait=0.1/46.1ms pred gate=device Token # 186: 3.761ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=0.379 next=pair draft=3968 prop=3968 pred gate=device Token # 187: 113.878ms; value: next_token_ids=tensor([3968], device='cuda:0') mtp accept=1 prop=3968 top1=3968 accp=0.631 next=draft=39668 prop=39668 olap pair=108.7ms serial=192.8ms gain=84.0ms ratio=0.44 s0=3.9ms s1=188.9ms wait=0.1/46.1ms pred gate=device Token # 188: 3.700ms; value: next_token_ids=tensor([11858], device='cuda:0') mtp accept=0 prop=39668 top1=11858 accp=0.012 next=pair draft=303 prop=303 pred gate=device Token # 189: 114.190ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=34071 prop=34071 olap pair=109.0ms serial=193.0ms gain=84.0ms ratio=0.44 s0=4.2ms s1=188.8ms wait=0.1/45.8ms pred gate=device Token # 190: 3.716ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=1 prop=34071 top1=34071 accp=0.484 next=pair draft=10626 prop=10626 pred gate=device Token # 191: 113.714ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.997 next=draft=1255 prop=1255 olap pair=108.5ms serial=191.8ms gain=83.3ms ratio=0.43 s0=5.8ms s1=186.0ms wait=0.2/44.3ms pred gate=device Token # 192: 3.740ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=1 prop=1255 top1=1255 accp=0.711 next=pair draft=19 prop=19 pred gate=device Token # 193: 114.183ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=draft=621 
prop=621 olap pair=108.9ms serial=192.0ms gain=83.1ms ratio=0.43 s0=4.1ms s1=187.9ms wait=0.1/46.0ms pred gate=device Token # 194: 3.693ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=pair draft=1457 prop=1457 pred gate=device Token # 195: 114.068ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=0.999 next=draft=18 prop=18 olap pair=108.8ms serial=192.0ms gain=83.2ms ratio=0.43 s0=5.6ms s1=186.5ms wait=0.2/44.2ms pred gate=device Token # 196: 3.751ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=pair draft=76499 prop=76499 pred gate=device Token # 197: 114.125ms; value: next_token_ids=tensor([76499], device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.997 next=draft=303 prop=303 olap pair=108.8ms serial=193.5ms gain=84.7ms ratio=0.44 s0=3.8ms s1=189.7ms wait=0.1/46.2ms pred gate=device Token # 198: 3.771ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=1833 prop=1833 pred gate=device Token # 199: 113.986ms; value: next_token_ids=tensor([7849], device='cuda:0') mtp accept=0 prop=1833 top1=7849 accp=0.472 next=draft=8283 prop=8283 olap pair=108.7ms serial=192.7ms gain=83.9ms ratio=0.44 s0=5.6ms s1=187.0ms wait=0.2/44.4ms pred gate=device Token # 200: 114.552ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=21988 prop=21988 olap pair=109.3ms serial=193.6ms gain=84.3ms ratio=0.44 s0=4.1ms s1=189.4ms wait=0.1/45.9ms pred gate=device Token # 201: 3.730ms; value: next_token_ids=tensor([21988], device='cuda:0') mtp accept=1 prop=21988 top1=21988 accp=0.998 next=pair draft=303 prop=303 pred gate=device Token # 202: 114.419ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.998 next=draft=1207 prop=1207 olap pair=109.2ms serial=192.6ms gain=83.4ms 
ratio=0.43 s0=4.1ms s1=188.5ms wait=0.1/46.0ms pred gate=device Token # 203: 3.709ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=1.000 next=pair draft=2684 prop=2684 pred gate=device Token # 204: 114.362ms; value: next_token_ids=tensor([2684], device='cuda:0') mtp accept=1 prop=2684 top1=2684 accp=0.982 next=draft=673 prop=673 olap pair=109.2ms serial=192.9ms gain=83.7ms ratio=0.43 s0=4.1ms s1=188.7ms wait=0.1/45.8ms pred gate=device Token # 205: 3.762ms; value: next_token_ids=tensor([673], device='cuda:0') mtp accept=1 prop=673 top1=673 accp=0.611 next=pair draft=3907 prop=3907 pred gate=device Token # 206: 113.919ms; value: next_token_ids=tensor([3907], device='cuda:0') mtp accept=1 prop=3907 top1=3907 accp=0.868 next=draft=1122 prop=1122 olap pair=108.7ms serial=192.4ms gain=83.7ms ratio=0.44 s0=4.0ms s1=188.4ms wait=0.1/46.0ms pred gate=device Token # 207: 3.723ms; value: next_token_ids=tensor([1122], device='cuda:0') mtp accept=1 prop=1122 top1=1122 accp=0.967 next=pair draft=320 prop=320 pred gate=device Token # 208: 113.714ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=1.000 next=draft=445 prop=445 olap pair=108.5ms serial=192.0ms gain=83.5ms ratio=0.43 s0=3.9ms s1=188.1ms wait=0.1/46.3ms pred gate=device Token # 209: 3.687ms; value: next_token_ids=tensor([445], device='cuda:0') mtp accept=1 prop=445 top1=445 accp=0.433 next=pair draft=15087 prop=15087 pred gate=device Token # 210: 114.105ms; value: next_token_ids=tensor([22827], device='cuda:0') mtp accept=0 prop=15087 top1=22827 accp=0.036 next=draft=525 prop=525 olap pair=108.9ms serial=193.3ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.4ms wait=0.1/46.0ms pred gate=device Token # 211: 114.374ms; value: next_token_ids=tensor([525], device='cuda:0') mtp accept=1 prop=525 top1=525 accp=1.000 next=draft=303 prop=303 olap pair=109.1ms serial=192.0ms gain=82.9ms ratio=0.43 s0=5.4ms s1=186.6ms wait=0.1/44.8ms pred 
gate=device Token # 212: 3.707ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.999 next=pair draft=18912 prop=18912 pred gate=device Token # 213: 114.387ms; value: next_token_ids=tensor([6856], device='cuda:0') mtp accept=0 prop=18912 top1=2204 accp=0.463 next=draft=1122 prop=1122 olap pair=109.2ms serial=193.6ms gain=84.4ms ratio=0.44 s0=6.1ms s1=187.4ms wait=0.2/43.6ms pred gate=device Token # 214: 113.710ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=1122 top1=10626 accp=0.036 next=draft=13709 prop=13709 olap pair=108.3ms serial=192.3ms gain=83.9ms ratio=0.44 s0=3.8ms s1=188.5ms wait=0.1/46.2ms pred gate=device Token # 215: 115.106ms; value: next_token_ids=tensor([13709], device='cuda:0') mtp accept=1 prop=13709 top1=13709 accp=0.732 next=draft=1415 prop=1415 olap pair=109.8ms serial=194.2ms gain=84.4ms ratio=0.43 s0=4.1ms s1=190.1ms wait=0.1/46.0ms pred gate=device Token # 216: 3.726ms; value: next_token_ids=tensor([1415], device='cuda:0') mtp accept=1 prop=1415 top1=1415 accp=0.761 next=pair draft=9232 prop=9232 pred gate=device Token # 217: 114.244ms; value: next_token_ids=tensor([9232], device='cuda:0') mtp accept=1 prop=9232 top1=9232 accp=1.000 next=draft=2515 prop=2515 olap pair=108.9ms serial=192.9ms gain=83.9ms ratio=0.44 s0=4.1ms s1=188.7ms wait=0.1/45.5ms pred gate=device Token # 218: 3.707ms; value: next_token_ids=tensor([2515], device='cuda:0') mtp accept=1 prop=2515 top1=2515 accp=1.000 next=pair draft=1316 prop=1316 pred gate=device Token # 219: 113.847ms; value: next_token_ids=tensor([1316], device='cuda:0') mtp accept=1 prop=1316 top1=1316 accp=0.959 next=draft=3412 prop=3412 olap pair=108.6ms serial=192.0ms gain=83.4ms ratio=0.43 s0=5.3ms s1=186.7ms wait=0.2/44.6ms pred gate=device Token # 220: 3.700ms; value: next_token_ids=tensor([3412], device='cuda:0') mtp accept=1 prop=3412 top1=3412 accp=0.935 next=pair draft=8506 prop=8506 pred gate=device Token # 221: 114.243ms; 
value: next_token_ids=tensor([8506], device='cuda:0') mtp accept=1 prop=8506 top1=8506 accp=0.541 next=draft=10884 prop=10884 olap pair=109.0ms serial=193.7ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.5ms wait=0.1/45.1ms pred gate=device Token # 222: 3.686ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=10884 top1=320 accp=0.029 next=pair draft=15023 prop=13103 pred gate=device Token # 223: 114.903ms; value: next_token_ids=tensor([13103], device='cuda:0') mtp accept=1 prop=13103 top1=15023 accp=0.886 next=draft=34071 prop=34071 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.5ms wait=0.1/45.0ms pred gate=device Token # 224: 3.721ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=1 prop=34071 top1=34071 accp=0.576 next=pair draft=3796 prop=3796 pred gate=device Token # 225: 113.965ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=3796 top1=10626 accp=0.493 next=draft=1057 prop=1057 olap pair=108.8ms serial=193.0ms gain=84.2ms ratio=0.44 s0=4.3ms s1=188.7ms wait=0.1/45.2ms pred gate=device Token # 226: 113.978ms; value: next_token_ids=tensor([1057], device='cuda:0') mtp accept=1 prop=1057 top1=1057 accp=0.987 next=draft=6370 prop=6370 olap pair=108.7ms serial=193.1ms gain=84.4ms ratio=0.44 s0=3.8ms s1=189.3ms wait=0.1/46.4ms pred gate=device Token # 227: 3.706ms; value: next_token_ids=tensor([6370], device='cuda:0') mtp accept=1 prop=6370 top1=6370 accp=0.971 next=pair draft=303 prop=303 pred gate=device Token # 228: 114.766ms; value: next_token_ids=tensor([4211], device='cuda:0') mtp accept=0 prop=303 top1=4211 accp=0.001 next=draft=303 prop=303 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=5.9ms s1=188.4ms wait=0.2/43.9ms pred gate=device Token # 229: 114.422ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.998 next=draft=1207 prop=1207 olap pair=109.1ms serial=193.9ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.3ms 
wait=0.1/46.4ms pred gate=device Token # 230: 3.660ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.980 next=pair draft=23153 prop=23153 pred gate=device Token # 231: 114.024ms; value: next_token_ids=tensor([14933], device='cuda:0') mtp accept=0 prop=23153 top1=14933 accp=0.177 next=draft=50678 prop=50678 olap pair=108.8ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.6ms s1=188.6ms wait=0.1/45.3ms pred gate=device Token # 232: 114.113ms; value: next_token_ids=tensor([50678], device='cuda:0') mtp accept=1 prop=50678 top1=50678 accp=0.949 next=draft=428 prop=428 olap pair=108.8ms serial=193.3ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.6ms wait=0.1/46.4ms pred gate=device Token # 233: 3.750ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=0.982 next=pair draft=4036 prop=4036 pred gate=device Token # 234: 114.074ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 accp=1.000 next=draft=10626 prop=10626 olap pair=108.8ms serial=193.4ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.7ms wait=0.1/46.5ms pred gate=device Token # 235: 3.669ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=1.000 next=pair draft=430 prop=430 pred gate=device Token # 236: 113.828ms; value: next_token_ids=tensor([430], device='cuda:0') mtp accept=1 prop=430 top1=430 accp=0.999 next=draft=429 prop=429 olap pair=108.5ms serial=192.7ms gain=84.2ms ratio=0.44 s0=3.6ms s1=189.1ms wait=0.1/46.6ms pred gate=device Token # 237: 3.698ms; value: next_token_ids=tensor([429], device='cuda:0') mtp accept=1 prop=429 top1=429 accp=1.000 next=pair draft=478 prop=478 pred gate=device Token # 238: 113.500ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=1.000 next=draft=14785 prop=531 olap pair=108.3ms serial=192.2ms gain=83.9ms ratio=0.44 s0=4.4ms s1=187.8ms wait=0.1/45.1ms pred gate=device Token # 239: 
3.704ms; value: next_token_ids=tensor([1759], device='cuda:0') mtp accept=0 prop=531 top1=14785 accp=0.326 next=pair draft=3154 prop=3154 pred gate=device Token # 240: 114.164ms; value: next_token_ids=tensor([10655], device='cuda:0') mtp accept=0 prop=3154 top1=3154 accp=0.930 next=draft=768 prop=768 olap pair=108.9ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.1ms s1=189.4ms wait=0.1/45.5ms pred gate=device Token # 241: 114.466ms; value: next_token_ids=tensor([768], device='cuda:0') mtp accept=1 prop=768 top1=768 accp=0.984 next=draft=6640 prop=6640 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/46.1ms pred gate=device Token # 242: 3.649ms; value: next_token_ids=tensor([6640], device='cuda:0') mtp accept=1 prop=6640 top1=6640 accp=0.946 next=pair draft=2431 prop=2431 pred gate=device Token # 243: 114.257ms; value: next_token_ids=tensor([872], device='cuda:0') mtp accept=0 prop=2431 top1=872 accp=0.167 next=draft=428 prop=428 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.9ms s1=188.6ms wait=0.1/44.9ms pred gate=device Token # 244: 114.131ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=1.000 next=draft=4036 prop=4036 olap pair=108.9ms serial=193.3ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.5ms wait=0.1/46.1ms pred gate=device Token # 245: 3.741ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 accp=1.000 next=pair draft=10626 prop=10626 pred gate=device Token # 246: 113.561ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=1.000 next=draft=19 prop=19 olap pair=108.4ms serial=192.4ms gain=84.1ms ratio=0.44 s0=3.7ms s1=188.7ms wait=0.1/46.4ms pred gate=device Token # 247: 3.751ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=0.999 next=pair draft=621 prop=621 pred gate=device Token # 248: 113.806ms; value: next_token_ids=tensor([621], 
device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=108.6ms serial=192.8ms gain=84.2ms ratio=0.44 s0=3.7ms s1=189.1ms wait=0.1/46.5ms pred gate=device Token # 249: 3.752ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device Token # 250: 114.040ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=4231 prop=4231 olap pair=108.8ms serial=193.2ms gain=84.4ms ratio=0.44 s0=4.0ms s1=189.2ms wait=0.1/45.7ms pred gate=device Token # 251: 3.761ms; value: next_token_ids=tensor([4231], device='cuda:0') mtp accept=1 prop=4231 top1=4231 accp=0.997 next=pair draft=2431 prop=2431 pred gate=device Token # 252: 113.605ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=1 prop=2431 top1=2431 accp=0.922 next=draft=16806 prop=16806 olap pair=108.4ms serial=192.4ms gain=84.0ms ratio=0.44 s0=4.4ms s1=188.0ms wait=0.1/45.1ms pred gate=device Token # 253: 3.687ms; value: next_token_ids=tensor([16806], device='cuda:0') mtp accept=1 prop=16806 top1=16806 accp=0.768 next=pair draft=445 prop=1909 pred gate=device Token # 254: 114.074ms; value: next_token_ids=tensor([445], device='cuda:0') mtp accept=0 prop=1909 top1=445 accp=0.358 next=draft=27318 prop=15087 olap pair=108.8ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.5ms s1=188.6ms wait=0.1/44.7ms pred gate=device Token # 255: 114.332ms; value: next_token_ids=tensor([15087], device='cuda:0') mtp accept=1 prop=15087 top1=15087 accp=0.391 next=draft=525 prop=525 olap pair=109.1ms serial=193.7ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.3ms wait=0.1/44.8ms pred gate=device Token # 256: 3.727ms; value: next_token_ids=tensor([525], device='cuda:0') mtp accept=1 prop=525 top1=525 accp=1.000 next=pair draft=4036 prop=4036 pred gate=device Token # 257: 114.213ms; value: next_token_ids=tensor([4036], device='cuda:0') mtp accept=1 prop=4036 top1=4036 
accp=0.855 next=draft=19874 prop=19874 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.4ms s1=189.2ms wait=0.1/45.1ms pred gate=device Token # 258: 3.714ms; value: next_token_ids=tensor([7000], device='cuda:0') mtp accept=0 prop=19874 top1=19874 accp=0.616 next=pair draft=2920 prop=2920 pred gate=device Token # 259: 114.290ms; value: next_token_ids=tensor([2920], device='cuda:0') mtp accept=1 prop=2920 top1=2920 accp=0.998 next=draft=8283 prop=8283 olap pair=108.9ms serial=193.4ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.1ms wait=0.1/45.1ms pred gate=device Token # 260: 3.735ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=pair draft=303 prop=320 pred gate=device Token # 261: 114.496ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=1 prop=320 top1=320 accp=0.469 next=draft=13470 prop=13470 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/45.3ms pred gate=device Token # 262: 3.666ms; value: next_token_ids=tensor([13470], device='cuda:0') mtp accept=1 prop=13470 top1=13470 accp=0.775 next=pair draft=28669 prop=28669 pred gate=device Token # 263: 114.225ms; value: next_token_ids=tensor([28669], device='cuda:0') mtp accept=1 prop=28669 top1=28669 accp=0.976 next=draft=18804 prop=18804 olap pair=109.1ms serial=193.8ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.4ms wait=0.1/45.0ms pred gate=device Token # 264: 3.719ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=0.699 next=pair draft=36426 prop=36426 pred gate=device Token # 265: 113.645ms; value: next_token_ids=tensor([26353], device='cuda:0') mtp accept=0 prop=36426 top1=26353 accp=0.073 next=draft=303 prop=303 olap pair=108.4ms serial=192.4ms gain=84.0ms ratio=0.44 s0=4.3ms s1=188.1ms wait=0.1/45.1ms pred gate=device Token # 266: 114.260ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=34071 
prop=97011 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.2ms s1=189.3ms wait=0.1/45.3ms pred gate=device Token # 267: 3.714ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=0 prop=97011 top1=34071 accp=0.978 next=pair draft=10626 prop=10626 pred gate=device Token # 268: 114.248ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=0 prop=10626 top1=14171 accp=0.026 next=draft=8283 prop=8283 olap pair=109.0ms serial=193.4ms gain=84.4ms ratio=0.44 s0=4.5ms s1=188.9ms wait=0.1/45.1ms pred gate=device Token # 269: 114.187ms; value: next_token_ids=tensor([12519], device='cuda:0') mtp accept=0 prop=8283 top1=1057 accp=0.083 next=draft=3599 prop=3599 olap pair=108.8ms serial=193.0ms gain=84.2ms ratio=0.44 s0=4.5ms s1=188.5ms wait=0.1/44.5ms pred gate=device Token # 270: 114.380ms; value: next_token_ids=tensor([3599], device='cuda:0') mtp accept=1 prop=3599 top1=3599 accp=0.968 next=draft=8283 prop=8283 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.6ms s1=188.8ms wait=0.1/44.7ms pred gate=device Token # 271: 3.729ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=pair draft=301 prop=301 pred gate=device Token # 272: 113.989ms; value: next_token_ids=tensor([301], device='cuda:0') mtp accept=1 prop=301 top1=301 accp=1.000 next=draft=18804 prop=18804 olap pair=108.7ms serial=192.9ms gain=84.1ms ratio=0.44 s0=4.8ms s1=188.0ms wait=0.1/44.3ms pred gate=device Token # 273: 3.685ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=1.000 next=pair draft=478 prop=478 pred gate=device Token # 274: 114.189ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=0.880 next=draft=1207 prop=1207 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.8ms s1=188.7ms wait=0.1/44.3ms pred gate=device Token # 275: 3.656ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp 
accept=1 prop=1207 top1=1207 accp=0.552 next=pair draft=3968 prop=3592 pred gate=device Token # 276: 113.711ms; value: next_token_ids=tensor([3968], device='cuda:0') mtp accept=0 prop=3592 top1=3968 accp=0.890 next=draft=39668 prop=39668 olap pair=108.6ms serial=192.8ms gain=84.3ms ratio=0.44 s0=4.4ms s1=188.4ms wait=0.1/45.2ms pred gate=device Token # 277: 114.116ms; value: next_token_ids=tensor([21843], device='cuda:0') mtp accept=0 prop=39668 top1=21843 accp=0.000 next=draft=11846 prop=11846 olap pair=108.9ms serial=193.6ms gain=84.7ms ratio=0.44 s0=3.7ms s1=189.9ms wait=0.1/46.3ms pred gate=device Token # 278: 114.360ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=0 prop=11846 top1=303 accp=0.065 next=draft=47507 prop=47507 olap pair=109.0ms serial=193.9ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/46.2ms pred gate=device Token # 279: 114.076ms; value: next_token_ids=tensor([47507], device='cuda:0') mtp accept=1 prop=47507 top1=47507 accp=0.797 next=draft=10626 prop=10626 olap pair=108.8ms serial=193.3ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.6ms wait=0.1/46.3ms pred gate=device Token # 280: 3.770ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.992 next=pair draft=1255 prop=1255 pred gate=device Token # 281: 114.293ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=1 prop=1255 top1=1255 accp=0.996 next=draft=19 prop=19 olap pair=109.0ms serial=193.8ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/46.3ms pred gate=device Token # 282: 3.720ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=pair draft=621 prop=621 pred gate=device Token # 283: 113.898ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=108.8ms serial=193.2ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.6ms wait=0.1/46.3ms pred gate=device Token # 284: 3.690ms; value: 
next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device Token # 285: 114.320ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=draft=76499 prop=76499 olap pair=109.1ms serial=193.2ms gain=84.1ms ratio=0.44 s0=4.4ms s1=188.9ms wait=0.1/45.2ms pred gate=device Token # 286: 3.767ms; value: next_token_ids=tensor([76499], device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.998 next=pair draft=303 prop=303 pred gate=device Token # 287: 114.364ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=1833 prop=1833 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=4.3ms s1=189.6ms wait=0.1/45.1ms pred gate=device Token # 288: 3.747ms; value: next_token_ids=tensor([7849], device='cuda:0') mtp accept=0 prop=1833 top1=7849 accp=0.602 next=pair draft=8283 prop=8283 pred gate=device Token # 289: 114.543ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=21988 prop=21988 olap pair=109.3ms serial=194.1ms gain=84.9ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.0ms pred gate=device Token # 290: 3.711ms; value: next_token_ids=tensor([21988], device='cuda:0') mtp accept=1 prop=21988 top1=21988 accp=1.000 next=pair draft=303 prop=303 pred gate=device Token # 291: 114.515ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.998 next=draft=1207 prop=1207 olap pair=109.4ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.7ms wait=0.1/45.2ms pred gate=device Token # 292: 3.691ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.997 next=pair draft=1299 prop=3968 pred gate=device Token # 293: 114.883ms; value: next_token_ids=tensor([40871], device='cuda:0') mtp accept=0 prop=3968 top1=40871 accp=0.403 next=draft=78699 prop=22827 olap pair=109.7ms 
serial=194.9ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/45.0ms pred gate=device Token # 294: 114.158ms; value: next_token_ids=tensor([78699], device='cuda:0') mtp accept=0 prop=22827 top1=78699 accp=0.629 next=draft=303 prop=303 olap pair=108.8ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.3ms s1=188.9ms wait=0.1/45.1ms pred gate=device Token # 295: 114.459ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=531 prop=531 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.1ms pred gate=device Token # 296: 3.696ms; value: next_token_ids=tensor([2431], device='cuda:0') mtp accept=0 prop=531 top1=2431 accp=0.272 next=pair draft=1299 prop=1299 pred gate=device Token # 297: 114.058ms; value: next_token_ids=tensor([1299], device='cuda:0') mtp accept=1 prop=1299 top1=1299 accp=0.992 next=draft=10626 prop=10626 olap pair=108.8ms serial=192.9ms gain=84.1ms ratio=0.44 s0=6.1ms s1=186.8ms wait=0.2/43.5ms pred gate=device Token # 298: 3.737ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.989 next=pair draft=974 prop=974 pred gate=device Token # 299: 114.466ms; value: next_token_ids=tensor([23945], device='cuda:0') mtp accept=0 prop=974 top1=23945 accp=0.007 next=draft=303 prop=303 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/46.3ms pred gate=device Token # 300: 114.432ms; value: next_token_ids=tensor([1148], device='cuda:0') mtp accept=0 prop=303 top1=1148 accp=0.367 next=draft=422 prop=422 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.5ms s1=189.2ms wait=0.1/45.3ms pred gate=device Token # 301: 114.341ms; value: next_token_ids=tensor([422], device='cuda:0') mtp accept=1 prop=422 top1=422 accp=0.740 next=draft=303 prop=303 olap pair=109.1ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/46.3ms pred gate=device Token # 302: 3.695ms; value: 
next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=14933 prop=14933 pred gate=device Token # 303: 114.348ms; value: next_token_ids=tensor([14933], device='cuda:0') mtp accept=1 prop=14933 top1=14933 accp=0.864 next=draft=1738 prop=2431 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/46.4ms pred gate=device Token # 304: 3.707ms; value: next_token_ids=tensor([422], device='cuda:0') mtp accept=0 prop=2431 top1=422 accp=0.097 next=pair draft=13380 prop=13380 pred gate=device Token # 305: 114.480ms; value: next_token_ids=tensor([13380], device='cuda:0') mtp accept=1 prop=13380 top1=13380 accp=1.000 next=draft=478 prop=478 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.5ms wait=0.1/46.6ms pred gate=device Token # 306: 3.727ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=0.801 next=pair draft=15023 prop=15023 pred gate=device Token # 307: 114.700ms; value: next_token_ids=tensor([15023], device='cuda:0') mtp accept=1 prop=15023 top1=15023 accp=0.980 next=draft=34071 prop=34071 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/46.3ms pred gate=device Token # 308: 3.691ms; value: next_token_ids=tensor([6640], device='cuda:0') mtp accept=0 prop=34071 top1=6640 accp=0.344 next=pair draft=5109 prop=5109 pred gate=device Token # 309: 115.387ms; value: next_token_ids=tensor([7578], device='cuda:0') mtp accept=0 prop=5109 top1=7578 accp=0.251 next=draft=3115 prop=3115 olap pair=110.1ms serial=194.8ms gain=84.7ms ratio=0.43 s0=3.7ms s1=191.1ms wait=0.1/46.5ms pred gate=device Token # 310: 114.784ms; value: next_token_ids=tensor([3115], device='cuda:0') mtp accept=1 prop=3115 top1=3115 accp=0.530 next=draft=1275 prop=10626 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/46.5ms pred gate=device Token # 311: 3.676ms; value: next_token_ids=tensor([11525], 
device='cuda:0') mtp accept=0 prop=10626 top1=11525 accp=0.114 next=pair draft=76499 prop=76499 pred gate=device Token # 312: 114.366ms; value: next_token_ids=tensor([76499], device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.997 next=draft=23884 prop=23884 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.6ms s1=190.3ms wait=0.1/46.4ms pred gate=device Token # 313: 3.691ms; value: next_token_ids=tensor([22089], device='cuda:0') mtp accept=0 prop=23884 top1=22089 accp=0.141 next=pair draft=303 prop=303 pred gate=device Token # 314: 114.156ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.998 next=draft=883 prop=883 olap pair=108.9ms serial=193.6ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.0ms wait=0.1/46.4ms pred gate=device Token # 315: 3.677ms; value: next_token_ids=tensor([883], device='cuda:0') mtp accept=1 prop=883 top1=1530 accp=0.454 next=pair draft=428 prop=428 pred gate=device Token # 316: 114.143ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=0.995 next=draft=6895 prop=6895 olap pair=108.9ms serial=193.5ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.8ms wait=0.1/46.3ms pred gate=device Token # 317: 3.787ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=0 prop=6895 top1=19 accp=0.029 next=pair draft=14 prop=223 pred gate=device Token # 318: 114.870ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.326 next=draft=20 prop=20 olap pair=108.8ms serial=193.0ms gain=84.2ms ratio=0.44 s0=5.1ms s1=187.9ms wait=0.1/44.8ms pred gate=device Token # 319: 4.584ms; value: next_token_ids=tensor([20], device='cuda:0') mtp accept=1 prop=20 top1=20 accp=1.000 next=pair draft=223 prop=223 pred gate=device Token # 320: 114.320ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=draft=21 prop=21 olap pair=108.9ms serial=192.8ms gain=83.9ms ratio=0.44 s0=6.6ms 
s1=186.3ms wait=0.2/43.1ms pred gate=device Token # 321: 3.690ms; value: next_token_ids=tensor([21], device='cuda:0') mtp accept=1 prop=21 top1=21 accp=1.000 next=pair draft=4588 prop=4588 pred gate=device Token # 322: 114.046ms; value: next_token_ids=tensor([4588], device='cuda:0') mtp accept=1 prop=4588 top1=4588 accp=0.978 next=draft=223 prop=223 olap pair=108.7ms serial=192.9ms gain=84.1ms ratio=0.44 s0=4.7ms s1=188.1ms wait=0.1/44.6ms pred gate=device Token # 323: 3.756ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1457 prop=1457 pred gate=device Token # 324: 114.131ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=draft=18 prop=18 olap pair=108.9ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.8ms s1=188.3ms wait=0.1/44.6ms pred gate=device Token # 325: 3.778ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=pair draft=4231 prop=4231 pred gate=device Token # 326: 114.283ms; value: next_token_ids=tensor([4999], device='cuda:0') mtp accept=0 prop=4231 top1=4999 accp=0.091 next=draft=1207 prop=1207 olap pair=109.0ms serial=193.5ms gain=84.4ms ratio=0.44 s0=4.8ms s1=188.7ms wait=0.1/44.5ms pred gate=device Token # 327: 114.074ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=1.000 next=draft=23153 prop=23153 olap pair=108.8ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.5ms s1=188.6ms wait=0.1/45.0ms pred gate=device Token # 328: 3.754ms; value: next_token_ids=tensor([23153], device='cuda:0') mtp accept=1 prop=23153 top1=23153 accp=0.641 next=pair draft=1530 prop=1530 pred gate=device Token # 329: 114.165ms; value: next_token_ids=tensor([389], device='cuda:0') mtp accept=0 prop=1530 top1=389 accp=0.121 next=draft=428 prop=428 olap pair=108.8ms serial=193.0ms gain=84.3ms ratio=0.44 s0=4.2ms s1=188.9ms wait=0.1/45.4ms pred gate=device Token # 330: 
113.908ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=1.000 next=draft=4036 prop=4036 olap pair=108.6ms serial=192.8ms gain=84.2ms ratio=0.44 s0=4.1ms s1=188.8ms wait=0.1/45.7ms pred gate=device Token # 331: 3.712ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=4036 top1=10626 accp=0.336 next=pair draft=4231 prop=4231 pred gate=device Token # 332: 114.630ms; value: next_token_ids=tensor([4231], device='cuda:0') mtp accept=1 prop=4231 top1=4231 accp=1.000 next=draft=2099 prop=2099 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.1ms wait=0.1/46.0ms pred gate=device Token # 333: 3.692ms; value: next_token_ids=tensor([2099], device='cuda:0') mtp accept=1 prop=2099 top1=2099 accp=0.963 next=pair draft=428 prop=428 pred gate=device Token # 334: 114.180ms; value: next_token_ids=tensor([428], device='cuda:0') mtp accept=1 prop=428 top1=428 accp=1.000 next=draft=2935 prop=14171 olap pair=109.0ms serial=193.5ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.8ms wait=0.1/46.3ms pred gate=device Token # 335: 3.691ms; value: next_token_ids=tensor([12799], device='cuda:0') mtp accept=0 prop=14171 top1=12799 accp=0.201 next=pair draft=11177 prop=11177 pred gate=device Token # 336: 114.421ms; value: next_token_ids=tensor([11177], device='cuda:0') mtp accept=1 prop=11177 top1=11177 accp=1.000 next=draft=531 prop=531 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.4ms wait=0.1/46.4ms pred gate=device Token # 337: 3.657ms; value: next_token_ids=tensor([531], device='cuda:0') mtp accept=1 prop=531 top1=531 accp=0.570 next=pair draft=5919 prop=5919 pred gate=device Token # 338: 114.621ms; value: next_token_ids=tensor([5919], device='cuda:0') mtp accept=1 prop=5919 top1=5919 accp=1.000 next=draft=768 prop=768 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/46.4ms pred gate=device Token # 339: 3.694ms; value: next_token_ids=tensor([768], 
device='cuda:0') mtp accept=1 prop=768 top1=768 accp=0.678 next=pair draft=13470 prop=13470 pred gate=device Token # 340: 114.671ms; value: next_token_ids=tensor([13470], device='cuda:0') mtp accept=1 prop=13470 top1=13470 accp=0.944 next=draft=6640 prop=6640 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.4ms wait=0.1/45.6ms pred gate=device Token # 341: 3.669ms; value: next_token_ids=tensor([28669], device='cuda:0') mtp accept=0 prop=6640 top1=28669 accp=0.028 next=pair draft=13503 prop=13503 pred gate=device Token # 342: 114.408ms; value: next_token_ids=tensor([13503], device='cuda:0') mtp accept=1 prop=13503 top1=13503 accp=0.983 next=draft=22827 prop=303 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.5ms wait=0.1/45.0ms pred gate=device Token # 343: 3.837ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=0.645 next=pair draft=34071 prop=34071 pred gate=device Token # 344: 114.408ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=1 prop=34071 top1=34071 accp=0.968 next=draft=14171 prop=14171 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=4.4ms s1=189.3ms wait=0.1/44.7ms pred gate=device Token # 345: 3.741ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=1 prop=14171 top1=14171 accp=0.960 next=pair draft=1057 prop=1057 pred gate=device Token # 346: 113.935ms; value: next_token_ids=tensor([1057], device='cuda:0') mtp accept=1 prop=1057 top1=30869 accp=0.157 next=draft=12519 prop=12519 olap pair=108.7ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.3ms s1=188.8ms wait=0.1/45.0ms pred gate=device Token # 347: 3.749ms; value: next_token_ids=tensor([1168], device='cuda:0') mtp accept=0 prop=12519 top1=1168 accp=0.478 next=pair draft=12244 prop=12244 pred gate=device Token # 348: 114.047ms; value: next_token_ids=tensor([12244], device='cuda:0') mtp accept=1 prop=12244 top1=12244 accp=1.000 next=draft=18804 prop=18804 olap 
pair=108.8ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.4ms s1=188.7ms wait=0.1/44.9ms pred gate=device Token # 349: 3.696ms; value: next_token_ids=tensor([18804], device='cuda:0') mtp accept=1 prop=18804 top1=18804 accp=0.773 next=pair draft=303 prop=303 pred gate=device Token # 350: 113.914ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=303 top1=320 accp=0.227 next=draft=1207 prop=1207 olap pair=108.7ms serial=192.8ms gain=84.0ms ratio=0.44 s0=4.9ms s1=187.9ms wait=0.1/44.6ms pred gate=device Token # 351: 114.335ms; value: next_token_ids=tensor([47507], device='cuda:0') mtp accept=0 prop=1207 top1=47507 accp=0.183 next=draft=3257 prop=3257 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.9ms s1=188.8ms wait=0.1/44.5ms pred gate=device Token # 352: 114.675ms; value: next_token_ids=tensor([14171], device='cuda:0') mtp accept=0 prop=3257 top1=3257 accp=0.513 next=draft=1255 prop=1255 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.3ms wait=0.1/45.9ms pred gate=device Token # 353: 114.111ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=1 prop=1255 top1=1255 accp=0.849 next=draft=19 prop=19 olap pair=108.8ms serial=193.4ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.7ms wait=0.1/46.4ms pred gate=device Token # 354: 3.729ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=pair draft=621 prop=621 pred gate=device Token # 355: 113.915ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=draft=1457 prop=1457 olap pair=108.7ms serial=193.0ms gain=84.3ms ratio=0.44 s0=4.2ms s1=188.7ms wait=0.1/45.4ms pred gate=device Token # 356: 3.751ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=pair draft=18 prop=18 pred gate=device Token # 357: 114.419ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 
next=draft=76499 prop=76499 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=4.4ms s1=189.7ms wait=0.1/45.0ms pred gate=device Token # 358: 3.735ms; value: next_token_ids=tensor([76499], device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.999 next=pair draft=303 prop=303 pred gate=device Token # 359: 114.006ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=1833 prop=7849 olap pair=108.7ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.3ms s1=188.8ms wait=0.1/45.1ms pred gate=device Token # 360: 3.778ms; value: next_token_ids=tensor([7849], device='cuda:0') mtp accept=1 prop=7849 top1=1833 accp=0.730 next=pair draft=8283 prop=21988 pred gate=device Token # 361: 114.592ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=0 prop=21988 top1=8283 accp=0.634 next=draft=21988 prop=21988 olap pair=109.4ms serial=194.0ms gain=84.6ms ratio=0.44 s0=4.6ms s1=189.3ms wait=0.1/44.5ms pred gate=device Token # 362: 114.446ms; value: next_token_ids=tensor([21988], device='cuda:0') mtp accept=1 prop=21988 top1=21988 accp=0.868 next=draft=303 prop=303 olap pair=109.1ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.9ms s1=190.0ms wait=0.1/46.1ms pred gate=device Token # 363: 3.756ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=303 top1=320 accp=0.425 next=pair draft=1207 prop=1207 pred gate=device Token # 364: 114.659ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.873 next=draft=3968 prop=3968 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/46.1ms pred gate=device Token # 365: 3.707ms; value: next_token_ids=tensor([3968], device='cuda:0') mtp accept=1 prop=3968 top1=3968 accp=0.989 next=pair draft=39668 prop=39668 pred gate=device Token # 366: 114.346ms; value: next_token_ids=tensor([39668], device='cuda:0') mtp accept=1 prop=39668 top1=39668 accp=0.999 next=draft=6322 prop=6322 olap 
pair=109.1ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.3ms pred gate=device Token # 367: 3.798ms; value: next_token_ids=tensor([6322], device='cuda:0') mtp accept=1 prop=6322 top1=6322 accp=0.999 next=pair draft=303 prop=303 pred gate=device Token # 368: 114.085ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=34071 prop=34071 olap pair=108.9ms serial=193.3ms gain=84.4ms ratio=0.44 s0=3.7ms s1=189.6ms wait=0.1/46.5ms pred gate=device Token # 369: 3.744ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=1 prop=34071 top1=34071 accp=0.892 next=pair draft=1833 prop=1833 pred gate=device Token # 370: 114.306ms; value: next_token_ids=tensor([1833], device='cuda:0') mtp accept=1 prop=1833 top1=1833 accp=0.999 next=draft=553 prop=553 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.9ms wait=0.1/46.4ms pred gate=device Token # 371: 3.730ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=0 prop=553 top1=719 accp=0.064 next=pair draft=10626 prop=1729 pred gate=device Token # 372: 114.698ms; value: next_token_ids=tensor([1729], device='cuda:0') mtp accept=1 prop=1729 top1=1729 accp=0.085 next=draft=13097 prop=13097 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.5ms pred gate=device Token # 373: 3.767ms; value: next_token_ids=tensor([13097], device='cuda:0') mtp accept=1 prop=13097 top1=13097 accp=0.877 next=pair draft=8283 prop=8283 pred gate=device Token # 374: 114.781ms; value: next_token_ids=tensor([8283], device='cuda:0') mtp accept=1 prop=8283 top1=8283 accp=1.000 next=draft=303 prop=303 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.2ms wait=0.1/45.1ms pred gate=device Token # 375: 3.720ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=7882 prop=7882 pred gate=device Token # 376: 114.676ms; value: 
next_token_ids=tensor([7882], device='cuda:0') mtp accept=1 prop=7882 top1=7882 accp=0.999 next=draft=553 prop=553 olap pair=109.4ms serial=193.5ms gain=84.1ms ratio=0.43 s0=4.6ms s1=188.9ms wait=0.1/45.2ms pred gate=device Token # 377: 3.807ms; value: next_token_ids=tensor([1833], device='cuda:0') mtp accept=0 prop=553 top1=1833 accp=0.214 next=pair draft=553 prop=553 pred gate=device Token # 378: 114.079ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=0 prop=553 top1=719 accp=0.023 next=draft=553 prop=553 olap pair=108.7ms serial=192.9ms gain=84.2ms ratio=0.44 s0=4.3ms s1=188.6ms wait=0.1/45.2ms pred gate=device Token # 379: 114.733ms; value: next_token_ids=tensor([553], device='cuda:0') mtp accept=1 prop=553 top1=553 accp=1.000 next=draft=558 prop=558 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.6ms s1=189.5ms wait=0.1/44.5ms pred gate=device Token # 380: 3.748ms; value: next_token_ids=tensor([558], device='cuda:0') mtp accept=1 prop=558 top1=558 accp=1.000 next=pair draft=8283 prop=8283 pred gate=device Token # 381: 114.198ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=0 prop=8283 top1=478 accp=0.199 next=draft=1207 prop=1207 olap pair=109.0ms serial=193.3ms gain=84.3ms ratio=0.44 s0=4.8ms s1=188.4ms wait=0.1/44.4ms pred gate=device Token # 382: 114.401ms; value: next_token_ids=tensor([1207], device='cuda:0') mtp accept=1 prop=1207 top1=1207 accp=0.456 next=draft=23153 prop=23153 olap pair=109.1ms serial=193.7ms gain=84.5ms ratio=0.44 s0=4.8ms s1=188.9ms wait=0.1/44.7ms pred gate=device Token # 383: 3.710ms; value: next_token_ids=tensor([23153], device='cuda:0') mtp accept=1 prop=23153 top1=23153 accp=0.999 next=pair draft=1530 prop=1530 pred gate=device Token # 384: 114.299ms; value: next_token_ids=tensor([1530], device='cuda:0') mtp accept=1 prop=1530 top1=1530 accp=1.000 next=draft=16590 prop=16590 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.8ms s1=188.8ms wait=0.1/44.8ms pred 
gate=device Token # 385: 3.770ms; value: next_token_ids=tensor([16590], device='cuda:0') mtp accept=1 prop=16590 top1=16590 accp=1.000 next=pair draft=16734 prop=16734 pred gate=device Token # 386: 114.265ms; value: next_token_ids=tensor([16734], device='cuda:0') mtp accept=1 prop=16734 top1=16734 accp=1.000 next=draft=303 prop=303 olap pair=109.0ms serial=193.4ms gain=84.4ms ratio=0.44 s0=4.8ms s1=188.7ms wait=0.1/44.4ms pred gate=device Token # 387: 3.753ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=2636 prop=2636 pred gate=device Token # 388: 114.353ms; value: next_token_ids=tensor([2636], device='cuda:0') mtp accept=1 prop=2636 top1=2636 accp=0.847 next=draft=34071 prop=34071 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.8ms s1=188.8ms wait=0.1/44.7ms pred gate=device Token # 389: 3.719ms; value: next_token_ids=tensor([34071], device='cuda:0') mtp accept=1 prop=34071 top1=34071 accp=0.925 next=pair draft=8142 prop=8142 pred gate=device Token # 390: 114.279ms; value: next_token_ids=tensor([3461], device='cuda:0') mtp accept=0 prop=8142 top1=3461 accp=0.161 next=draft=4992 prop=4992 olap pair=109.0ms serial=193.5ms gain=84.4ms ratio=0.44 s0=4.8ms s1=188.6ms wait=0.2/44.6ms pred gate=device Token # 391: 114.505ms; value: next_token_ids=tensor([4992], device='cuda:0') mtp accept=1 prop=4992 top1=4992 accp=0.988 next=draft=74870 prop=74870 olap pair=109.3ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.8ms s1=189.1ms wait=0.1/44.5ms pred gate=device Token # 392: 3.697ms; value: next_token_ids=tensor([74870], device='cuda:0') mtp accept=1 prop=74870 top1=74870 accp=0.996 next=pair draft=301 prop=301 pred gate=device Token # 393: 114.167ms; value: next_token_ids=tensor([301], device='cuda:0') mtp accept=1 prop=301 top1=301 accp=0.988 next=draft=16734 prop=16734 olap pair=108.9ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.8ms s1=188.5ms wait=0.1/44.4ms pred gate=device Token # 394: 
3.769ms; value: next_token_ids=tensor([16734], device='cuda:0') mtp accept=1 prop=16734 top1=16734 accp=1.000 next=pair draft=320 prop=320 pred gate=device Token # 395: 114.182ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=0 prop=320 top1=478 accp=0.291 next=draft=47507 prop=47507 olap pair=109.0ms serial=193.6ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.2ms wait=0.1/45.3ms pred gate=device Token # 396: 114.553ms; value: next_token_ids=tensor([47507], device='cuda:0') mtp accept=1 prop=47507 top1=47507 accp=0.637 next=draft=10626 prop=10626 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.4ms pred gate=device Token # 397: 3.705ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=1 prop=10626 top1=10626 accp=0.997 next=pair draft=768 prop=768 pred gate=device Token # 398: 114.522ms; value: next_token_ids=tensor([1255], device='cuda:0') mtp accept=0 prop=768 top1=1255 accp=0.071 next=draft=19 prop=19 olap pair=109.2ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.0ms wait=0.1/46.0ms pred gate=device Token # 399: 114.635ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=draft=621 prop=621 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/46.4ms pred gate=device Token # 400: 3.797ms; value: next_token_ids=tensor([621], device='cuda:0') mtp accept=1 prop=621 top1=621 accp=1.000 next=pair draft=1457 prop=1457 pred gate=device Token # 401: 114.244ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=draft=18 prop=18 olap pair=108.9ms serial=193.4ms gain=84.5ms ratio=0.44 s0=4.4ms s1=189.0ms wait=0.1/45.2ms pred gate=device Token # 402: 3.723ms; value: next_token_ids=tensor([18], device='cuda:0') mtp accept=1 prop=18 top1=18 accp=1.000 next=pair draft=76499 prop=76499 pred gate=device Token # 403: 114.928ms; value: next_token_ids=tensor([76499], 
device='cuda:0') mtp accept=1 prop=76499 top1=76499 accp=0.987 next=draft=303 prop=303 olap pair=109.5ms serial=194.1ms gain=84.6ms ratio=0.44 s0=4.6ms s1=189.5ms wait=0.1/45.0ms pred gate=device Token # 404: 3.790ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=pair draft=1833 prop=1833 pred gate=device Token # 405: 114.456ms; value: next_token_ids=tensor([1833], device='cuda:0') mtp accept=1 prop=1833 top1=1833 accp=0.961 next=draft=719 prop=719 olap pair=109.1ms serial=193.8ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.4ms wait=0.1/44.9ms pred gate=device Token # 406: 3.706ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=1 prop=719 top1=719 accp=1.000 next=pair draft=553 prop=553 pred gate=device Token # 407: 114.802ms; value: next_token_ids=tensor([553], device='cuda:0') mtp accept=1 prop=553 top1=553 accp=0.963 next=draft=558 prop=558 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/44.9ms pred gate=device Token # 408: 3.700ms; value: next_token_ids=tensor([558], device='cuda:0') mtp accept=1 prop=558 top1=558 accp=1.000 next=pair draft=303 prop=303 pred gate=device Token # 409: 113.950ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=642 prop=642 olap pair=108.7ms serial=192.8ms gain=84.1ms ratio=0.44 s0=4.4ms s1=188.3ms wait=0.1/45.5ms pred gate=device Token # 410: 3.742ms; value: next_token_ids=tensor([642], device='cuda:0') mtp accept=1 prop=642 top1=642 accp=1.000 next=pair draft=70399 prop=70399 pred gate=device Token # 411: 114.644ms; value: next_token_ids=tensor([70399], device='cuda:0') mtp accept=1 prop=70399 top1=70399 accp=0.902 next=draft=74534 prop=74534 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/46.3ms pred gate=device Token # 412: 3.713ms; value: next_token_ids=tensor([74534], device='cuda:0') mtp accept=1 prop=74534 top1=74534 accp=1.000 
next=pair draft=303 prop=303 pred gate=device Token # 413: 114.183ms; value: next_token_ids=tensor([320], device='cuda:0') mtp accept=0 prop=303 top1=320 accp=0.003 next=draft=2684 prop=2684 olap pair=109.0ms serial=193.5ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.7ms wait=0.1/46.4ms pred gate=device Token # 414: 114.377ms; value: next_token_ids=tensor([2684], device='cuda:0') mtp accept=1 prop=2684 top1=2684 accp=1.000 next=draft=1860 prop=6640 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/46.4ms pred gate=device Token # 415: 3.704ms; value: next_token_ids=tensor([450], device='cuda:0') mtp accept=0 prop=6640 top1=450 accp=0.000 next=pair draft=1457 prop=1457 pred gate=device Token # 416: 114.969ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=draft=719 prop=719 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/46.1ms pred gate=device Token # 417: 3.781ms; value: next_token_ids=tensor([719], device='cuda:0') mtp accept=1 prop=719 top1=719 accp=1.000 next=pair draft=303 prop=303 pred gate=device Token # 418: 114.267ms; value: next_token_ids=tensor([303], device='cuda:0') mtp accept=1 prop=303 top1=303 accp=1.000 next=draft=8725 prop=4270 olap pair=109.0ms serial=193.7ms gain=84.7ms ratio=0.44 s0=3.9ms s1=189.8ms wait=0.1/46.0ms pred gate=device Token # 419: 3.754ms; value: next_token_ids=tensor([4270], device='cuda:0') mtp accept=1 prop=4270 top1=8725 accp=0.842 next=pair draft=74870 prop=74870 pred gate=device Token # 420: 114.745ms; value: next_token_ids=tensor([74870], device='cuda:0') mtp accept=1 prop=74870 top1=74870 accp=0.961 next=draft=478 prop=478 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/45.7ms pred gate=device Token # 421: 3.706ms; value: next_token_ids=tensor([478], device='cuda:0') mtp accept=1 prop=478 top1=478 accp=1.000 next=pair draft=2204 prop=2204 pred gate=device Token # 
422: 114.989ms; value: next_token_ids=tensor([3257], device='cuda:0') mtp accept=0 prop=2204 top1=3257 accp=0.151 next=draft=768 prop=768 olap pair=109.8ms serial=194.4ms gain=84.6ms ratio=0.44 s0=4.4ms s1=190.0ms wait=0.1/45.3ms pred gate=device Token # 423: 114.544ms; value: next_token_ids=tensor([10626], device='cuda:0') mtp accept=0 prop=768 top1=10626 accp=0.002 next=draft=768 prop=768 olap pair=109.3ms serial=193.9ms gain=84.5ms ratio=0.44 s0=4.0ms s1=189.9ms wait=0.1/46.1ms pred gate=device Token # 424: 114.538ms; value: next_token_ids=tensor([768], device='cuda:0') mtp accept=1 prop=768 top1=768 accp=0.874 next=draft=128799 prop=128799 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.1ms wait=0.1/45.9ms pred gate=device Token # 425: 3.687ms; value: next_token_ids=tensor([128799], device='cuda:0') mtp accept=1 prop=128799 top1=128799 accp=1.000 next=pair draft=19 prop=19 pred gate=device Token # 426: 114.550ms; value: next_token_ids=tensor([19], device='cuda:0') mtp accept=1 prop=19 top1=19 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/46.3ms pred gate=device Token # 427: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=20 prop=20 pred gate=device Token # 428: 114.082ms; value: next_token_ids=tensor([20], device='cuda:0') mtp accept=1 prop=20 top1=20 accp=1.000 next=draft=223 prop=223 olap pair=108.8ms serial=193.3ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.5ms wait=0.1/46.3ms pred gate=device Token # 429: 3.721ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21 prop=21 pred gate=device Token # 430: 114.190ms; value: next_token_ids=tensor([21], device='cuda:0') mtp accept=1 prop=21 top1=21 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.4ms gain=84.4ms ratio=0.44 s0=4.1ms s1=189.3ms wait=0.1/45.9ms pred gate=device 
Token # 431: 3.730ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22 prop=22 pred gate=device Token # 432: 114.567ms; value: next_token_ids=tensor([22], device='cuda:0') mtp accept=1 prop=22 top1=22 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.4ms pred gate=device Token # 433: 3.644ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23 prop=23 pred gate=device Token # 434: 114.473ms; value: next_token_ids=tensor([23], device='cuda:0') mtp accept=1 prop=23 top1=23 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/46.5ms pred gate=device Token # 435: 3.690ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24 prop=24 pred gate=device Token # 436: 115.334ms; value: next_token_ids=tensor([24], device='cuda:0') mtp accept=1 prop=24 top1=24 accp=0.997 next=draft=223 prop=223 olap pair=110.1ms serial=195.8ms gain=85.7ms ratio=0.44 s0=3.8ms s1=192.0ms wait=0.1/46.3ms pred gate=device Token # 437: 3.682ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25 prop=25 pred gate=device Token # 438: 114.758ms; value: next_token_ids=tensor([25], device='cuda:0') mtp accept=1 prop=25 top1=25 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/46.4ms pred gate=device Token # 439: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.844 next=pair draft=26 prop=26 pred gate=device Token # 440: 114.954ms; value: next_token_ids=tensor([26], device='cuda:0') mtp accept=1 prop=26 top1=26 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.2ms ratio=0.44 
s0=4.1ms s1=190.9ms wait=0.1/45.8ms pred gate=device Token # 441: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27 prop=27 pred gate=device Token # 442: 114.640ms; value: next_token_ids=tensor([27], device='cuda:0') mtp accept=1 prop=27 top1=27 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.2ms s1=190.0ms wait=0.1/45.6ms pred gate=device Token # 443: 3.766ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=553 prop=553 pred gate=device Token # 444: 114.559ms; value: next_token_ids=tensor([553], device='cuda:0') mtp accept=1 prop=553 top1=553 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/46.0ms pred gate=device Token # 445: 3.790ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.643 next=pair draft=779 prop=779 pred gate=device Token # 446: 115.160ms; value: next_token_ids=tensor([779], device='cuda:0') mtp accept=1 prop=779 top1=779 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/46.2ms pred gate=device Token # 447: 3.732ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=736 prop=736 pred gate=device Token # 448: 115.378ms; value: next_token_ids=tensor([736], device='cuda:0') mtp accept=1 prop=736 top1=736 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=194.9ms gain=84.6ms ratio=0.43 s0=4.2ms s1=190.6ms wait=0.1/45.4ms pred gate=device Token # 449: 3.720ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=907 prop=907 pred gate=device Token # 450: 115.355ms; value: next_token_ids=tensor([907], device='cuda:0') mtp accept=1 prop=907 top1=907 accp=1.000 
next=draft=223 prop=223 olap pair=109.4ms serial=193.0ms gain=83.7ms ratio=0.43 s0=8.8ms s1=184.2ms wait=0.2/41.1ms pred gate=device Token # 451: 4.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=929 prop=929 pred gate=device Token # 452: 115.293ms; value: next_token_ids=tensor([929], device='cuda:0') mtp accept=1 prop=929 top1=929 accp=1.000 next=draft=223 prop=929 olap pair=110.0ms serial=195.1ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.9ms wait=0.1/45.8ms pred gate=device Token # 453: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=929 top1=223 accp=0.840 next=pair draft=856 prop=856 pred gate=device Token # 454: 114.722ms; value: next_token_ids=tensor([856], device='cuda:0') mtp accept=1 prop=856 top1=856 accp=1.000 next=draft=856 prop=856 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.2ms wait=0.1/45.4ms pred gate=device Token # 455: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=856 top1=223 accp=0.317 next=pair draft=926 prop=926 pred gate=device Token # 456: 114.599ms; value: next_token_ids=tensor([926], device='cuda:0') mtp accept=1 prop=926 top1=926 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.4ms pred gate=device Token # 457: 3.719ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1002 prop=1002 pred gate=device Token # 458: 114.869ms; value: next_token_ids=tensor([1002], device='cuda:0') mtp accept=1 prop=1002 top1=1002 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.3ms wait=0.1/46.4ms pred gate=device Token # 459: 3.721ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.985 next=pair draft=864 prop=864 pred gate=device Token # 460: 115.019ms; value: 
next_token_ids=tensor([864], device='cuda:0') mtp accept=1 prop=864 top1=864 accp=1.000 next=draft=223 prop=864 olap pair=109.9ms serial=195.0ms gain=85.1ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.3ms pred gate=device Token # 461: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=864 top1=223 accp=0.745 next=pair draft=511 prop=511 pred gate=device Token # 462: 115.189ms; value: next_token_ids=tensor([511], device='cuda:0') mtp accept=1 prop=511 top1=511 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=193.3ms gain=83.3ms ratio=0.43 s0=4.2ms s1=189.1ms wait=0.1/46.0ms pred gate=device Token # 463: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.717 next=pair draft=397 prop=397 pred gate=device Token # 464: 115.309ms; value: next_token_ids=tensor([397], device='cuda:0') mtp accept=1 prop=397 top1=397 accp=1.000 next=draft=201 prop=201 olap pair=110.2ms serial=194.1ms gain=83.9ms ratio=0.43 s0=4.2ms s1=189.9ms wait=0.1/46.0ms pred gate=device Token # 465: 3.703ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=1602 prop=1602 pred gate=device Token # 466: 114.764ms; value: next_token_ids=tensor([1602], device='cuda:0') mtp accept=1 prop=1602 top1=1602 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.3ms pred gate=device Token # 467: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1302 prop=1302 pred gate=device Token # 468: 115.642ms; value: next_token_ids=tensor([1302], device='cuda:0') mtp accept=1 prop=1302 top1=1302 accp=1.000 next=draft=1302 prop=223 olap pair=110.5ms serial=196.0ms gain=85.5ms ratio=0.44 s0=3.9ms s1=192.0ms wait=0.1/46.1ms pred gate=device Token # 469: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 
accp=0.273 next=pair draft=1349 prop=1349 pred gate=device Token # 470: 114.780ms; value: next_token_ids=tensor([1349], device='cuda:0') mtp accept=1 prop=1349 top1=1349 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.8ms ratio=0.44 s0=5.1ms s1=189.2ms wait=0.1/44.8ms pred gate=device Token # 471: 3.701ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1173 prop=1173 pred gate=device Token # 472: 116.102ms; value: next_token_ids=tensor([1173], device='cuda:0') mtp accept=1 prop=1173 top1=1173 accp=1.000 next=draft=223 prop=1173 olap pair=109.9ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/46.3ms pred gate=device Token # 473: 3.885ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=1173 top1=223 accp=0.508 next=pair draft=1069 prop=1069 pred gate=device Token # 474: 114.819ms; value: next_token_ids=tensor([1069], device='cuda:0') mtp accept=1 prop=1069 top1=1069 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.1ms gain=84.5ms ratio=0.44 s0=4.7ms s1=189.4ms wait=0.1/44.9ms pred gate=device Token # 475: 3.749ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.910 next=pair draft=1450 prop=1450 pred gate=device Token # 476: 114.545ms; value: next_token_ids=tensor([1450], device='cuda:0') mtp accept=1 prop=1450 top1=1450 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.8ms s1=189.3ms wait=0.1/44.8ms pred gate=device Token # 477: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=1477 prop=1477 pred gate=device Token # 478: 114.822ms; value: next_token_ids=tensor([1477], device='cuda:0') mtp accept=1 prop=1477 top1=1477 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.8ms s1=189.8ms wait=0.1/44.9ms pred gate=device 
Token # 479: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.978 next=pair draft=1449 prop=1449 pred gate=device Token # 480: 115.327ms; value: next_token_ids=tensor([1449], device='cuda:0') mtp accept=1 prop=1449 top1=1449 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=196.1ms gain=85.9ms ratio=0.44 s0=4.5ms s1=191.6ms wait=0.1/45.3ms pred gate=device Token # 481: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.971 next=pair draft=1557 prop=1557 pred gate=device Token # 482: 114.269ms; value: next_token_ids=tensor([1557], device='cuda:0') mtp accept=1 prop=1557 top1=1557 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/46.3ms pred gate=device Token # 483: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1059 prop=1059 pred gate=device Token # 484: 114.999ms; value: next_token_ids=tensor([1059], device='cuda:0') mtp accept=1 prop=1059 top1=1059 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=194.6ms gain=84.8ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/45.6ms pred gate=device Token # 485: 3.756ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=2181 prop=2181 pred gate=device Token # 486: 114.473ms; value: next_token_ids=tensor([2181], device='cuda:0') mtp accept=1 prop=2181 top1=2181 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.4ms s1=189.9ms wait=0.1/45.1ms pred gate=device Token # 487: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2111 prop=2111 pred gate=device Token # 488: 114.856ms; value: next_token_ids=tensor([2111], device='cuda:0') mtp accept=1 prop=2111 top1=2111 accp=1.000 next=draft=223 prop=223 olap 
pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=4.1ms s1=190.8ms wait=0.1/45.9ms pred gate=device Token # 489: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1671 prop=1671 pred gate=device Token # 490: 114.235ms; value: next_token_ids=tensor([1671], device='cuda:0') mtp accept=1 prop=1671 top1=1671 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.4ms wait=0.1/45.4ms pred gate=device Token # 491: 3.780ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2012 prop=2012 pred gate=device Token # 492: 114.730ms; value: next_token_ids=tensor([2012], device='cuda:0') mtp accept=1 prop=2012 top1=2012 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.5ms wait=0.1/45.4ms pred gate=device Token # 493: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.988 next=pair draft=1810 prop=1810 pred gate=device Token # 494: 114.575ms; value: next_token_ids=tensor([1810], device='cuda:0') mtp accept=1 prop=1810 top1=1810 accp=1.000 next=draft=223 prop=1810 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.3ms pred gate=device Token # 495: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=1810 top1=223 accp=0.611 next=pair draft=1872 prop=1872 pred gate=device Token # 496: 114.864ms; value: next_token_ids=tensor([1872], device='cuda:0') mtp accept=1 prop=1872 top1=1872 accp=1.000 next=draft=1872 prop=1872 olap pair=109.6ms serial=194.4ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/46.3ms pred gate=device Token # 497: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=1872 top1=223 accp=0.315 next=pair draft=1942 prop=1942 pred gate=device Token # 498: 114.509ms; value: 
next_token_ids=tensor([1942], device='cuda:0') mtp accept=1 prop=1942 top1=1942 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.5ms pred gate=device Token # 499: 3.698ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2080 prop=2080 pred gate=device Token # 500: 115.125ms; value: next_token_ids=tensor([2080], device='cuda:0') mtp accept=1 prop=2080 top1=2080 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.4ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.7ms wait=0.1/46.5ms pred gate=device Token # 501: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=2116 prop=2116 pred gate=device Token # 502: 115.769ms; value: next_token_ids=tensor([2116], device='cuda:0') mtp accept=1 prop=2116 top1=2116 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.6ms gain=84.8ms ratio=0.44 s0=5.8ms s1=188.8ms wait=0.2/44.1ms pred gate=device Token # 503: 4.735ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1484 prop=1484 pred gate=device Token # 504: 115.077ms; value: next_token_ids=tensor([1484], device='cuda:0') mtp accept=1 prop=1484 top1=1484 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=193.4ms gain=84.0ms ratio=0.43 s0=8.9ms s1=184.5ms wait=0.2/40.7ms pred gate=device Token # 505: 3.726ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=3286 prop=3286 pred gate=device Token # 506: 114.792ms; value: next_token_ids=tensor([3286], device='cuda:0') mtp accept=1 prop=3286 top1=3286 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.3ms gain=84.7ms ratio=0.44 s0=6.8ms s1=187.5ms wait=0.2/42.9ms pred gate=device Token # 507: 3.741ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 
prop=223 top1=223 accp=1.000 next=pair draft=3180 prop=3180 pred gate=device Token # 508: 114.933ms; value: next_token_ids=tensor([3180], device='cuda:0') mtp accept=1 prop=3180 top1=3180 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.4ms ratio=0.44 s0=4.4ms s1=190.8ms wait=0.1/45.2ms pred gate=device Token # 509: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=3354 prop=3354 pred gate=device Token # 510: 114.552ms; value: next_token_ids=tensor([3354], device='cuda:0') mtp accept=1 prop=3354 top1=3354 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.4ms pred gate=device Token # 511: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.969 next=pair draft=2240 prop=2240 pred gate=device Token # 512: 114.429ms; value: next_token_ids=tensor([2240], device='cuda:0') mtp accept=1 prop=2240 top1=2240 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.8ms gain=84.5ms ratio=0.44 s0=5.5ms s1=188.3ms wait=0.2/44.2ms pred gate=device Token # 513: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1883 prop=1883 pred gate=device Token # 514: 114.830ms; value: next_token_ids=tensor([1883], device='cuda:0') mtp accept=1 prop=1883 top1=1883 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.3ms ratio=0.44 s0=4.2ms s1=190.8ms wait=0.1/45.7ms pred gate=device Token # 515: 3.724ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.657 next=pair draft=2372 prop=2372 pred gate=device Token # 516: 114.344ms; value: next_token_ids=tensor([2372], device='cuda:0') mtp accept=1 prop=2372 top1=2372 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=4.4ms s1=189.6ms wait=0.1/45.3ms pred 
gate=device Token # 517: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2491 prop=2491 pred gate=device Token # 518: 114.132ms; value: next_token_ids=tensor([2491], device='cuda:0') mtp accept=1 prop=2491 top1=2491 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.6ms s1=189.0ms wait=0.1/45.0ms pred gate=device Token # 519: 3.742ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=2170 prop=2170 pred gate=device Token # 520: 114.059ms; value: next_token_ids=tensor([2170], device='cuda:0') mtp accept=1 prop=2170 top1=2170 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.2ms gain=84.3ms ratio=0.44 s0=4.8ms s1=188.4ms wait=0.1/44.8ms pred gate=device Token # 521: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.946 next=pair draft=2505 prop=2505 pred gate=device Token # 522: 114.248ms; value: next_token_ids=tensor([2505], device='cuda:0') mtp accept=1 prop=2505 top1=2505 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=4.8ms s1=188.9ms wait=0.1/44.9ms pred gate=device Token # 523: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=1328 prop=1328 pred gate=device Token # 524: 114.376ms; value: next_token_ids=tensor([1328], device='cuda:0') mtp accept=1 prop=1328 top1=1328 accp=1.000 next=draft=201 prop=201 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=4.2ms s1=189.8ms wait=0.1/45.7ms pred gate=device Token # 525: 3.747ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=4287 prop=4287 pred gate=device Token # 526: 114.646ms; value: next_token_ids=tensor([4287], device='cuda:0') mtp accept=1 prop=4287 top1=4287 accp=1.000 next=draft=223 
prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/45.5ms pred gate=device Token # 527: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4157 prop=4157 pred gate=device Token # 528: 114.231ms; value: next_token_ids=tensor([4157], device='cuda:0') mtp accept=1 prop=4157 top1=4157 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/46.4ms pred gate=device Token # 529: 3.710ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4414 prop=4414 pred gate=device Token # 530: 115.175ms; value: next_token_ids=tensor([4414], device='cuda:0') mtp accept=1 prop=4414 top1=4414 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.8ms gain=85.7ms ratio=0.44 s0=3.7ms s1=192.1ms wait=0.1/46.4ms pred gate=device Token # 531: 3.720ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4364 prop=4364 pred gate=device Token # 532: 114.627ms; value: next_token_ids=tensor([4364], device='cuda:0') mtp accept=1 prop=4364 top1=4364 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.3ms pred gate=device Token # 533: 3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2315 prop=2315 pred gate=device Token # 534: 114.234ms; value: next_token_ids=tensor([2315], device='cuda:0') mtp accept=1 prop=2315 top1=2315 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/46.3ms pred gate=device Token # 535: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=3661 prop=3661 pred gate=device Token # 536: 114.376ms; value: 
next_token_ids=tensor([3661], device='cuda:0') mtp accept=1 prop=3661 top1=3661 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.5ms pred gate=device Token # 537: 3.715ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=3351 prop=3351 pred gate=device Token # 538: 114.284ms; value: next_token_ids=tensor([3351], device='cuda:0') mtp accept=1 prop=3351 top1=3351 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/46.4ms pred gate=device Token # 539: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=3175 prop=3175 pred gate=device Token # 540: 114.720ms; value: next_token_ids=tensor([3175], device='cuda:0') mtp accept=1 prop=3175 top1=3175 accp=1.000 next=draft=223 prop=3175 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/46.5ms pred gate=device Token # 541: 3.728ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=3175 top1=223 accp=0.758 next=pair draft=3318 prop=3318 pred gate=device Token # 542: 114.482ms; value: next_token_ids=tensor([3318], device='cuda:0') mtp accept=1 prop=3318 top1=3318 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/46.0ms pred gate=device Token # 543: 3.737ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1683 prop=1683 pred gate=device Token # 544: 114.539ms; value: next_token_ids=tensor([1683], device='cuda:0') mtp accept=1 prop=1683 top1=1683 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.2ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.3ms wait=0.1/46.3ms pred gate=device Token # 545: 3.754ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 
prop=201 top1=201 accp=1.000 next=pair draft=4739 prop=4739 pred gate=device Token # 546: 114.388ms; value: next_token_ids=tensor([4739], device='cuda:0') mtp accept=1 prop=4739 top1=4739 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.9ms gain=84.6ms ratio=0.44 s0=4.0ms s1=189.9ms wait=0.1/46.2ms pred gate=device Token # 547: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=4858 prop=4858 pred gate=device Token # 548: 114.978ms; value: next_token_ids=tensor([4858], device='cuda:0') mtp accept=1 prop=4858 top1=4858 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.4ms s1=189.9ms wait=0.1/45.8ms pred gate=device Token # 549: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.866 next=pair draft=4774 prop=4774 pred gate=device Token # 550: 115.007ms; value: next_token_ids=tensor([4774], device='cuda:0') mtp accept=1 prop=4774 top1=4774 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.2ms gain=84.4ms ratio=0.43 s0=4.0ms s1=190.2ms wait=0.1/46.0ms pred gate=device Token # 551: 3.724ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2892 prop=2892 pred gate=device Token # 552: 114.737ms; value: next_token_ids=tensor([2892], device='cuda:0') mtp accept=1 prop=2892 top1=2892 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.7ms gain=84.2ms ratio=0.43 s0=4.3ms s1=189.4ms wait=0.1/45.8ms pred gate=device Token # 553: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2738 prop=2738 pred gate=device Token # 554: 114.196ms; value: next_token_ids=tensor([2738], device='cuda:0') mtp accept=1 prop=2738 top1=2738 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.7ms gain=84.6ms ratio=0.44 s0=3.9ms s1=189.8ms wait=0.1/46.2ms pred 
gate=device Token # 555: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=2574 prop=2574 pred gate=device Token # 556: 114.129ms; value: next_token_ids=tensor([2574], device='cuda:0') mtp accept=1 prop=2574 top1=2574 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.7ms wait=0.1/46.4ms pred gate=device Token # 557: 3.725ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=3186 prop=3186 pred gate=device Token # 558: 114.408ms; value: next_token_ids=tensor([3186], device='cuda:0') mtp accept=1 prop=3186 top1=3186 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.2ms s1=189.8ms wait=0.1/45.6ms pred gate=device Token # 559: 3.729ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2973 prop=2973 pred gate=device Token # 560: 115.503ms; value: next_token_ids=tensor([2973], device='cuda:0') mtp accept=1 prop=2973 top1=2973 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=195.7ms gain=85.4ms ratio=0.44 s0=5.8ms s1=189.9ms wait=0.2/43.8ms pred gate=device Token # 561: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=3259 prop=3259 pred gate=device Token # 562: 114.429ms; value: next_token_ids=tensor([3259], device='cuda:0') mtp accept=1 prop=3259 top1=3259 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.5ms wait=0.1/46.5ms pred gate=device Token # 563: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2122 prop=2122 pred gate=device Token # 564: 115.370ms; value: next_token_ids=tensor([2122], device='cuda:0') mtp accept=1 prop=2122 top1=2122 accp=1.000 next=draft=201 
prop=201 olap pair=110.2ms serial=196.1ms gain=85.9ms ratio=0.44 s0=3.8ms s1=192.3ms wait=0.1/46.5ms pred gate=device Token # 565: 3.757ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=5863 prop=5863 pred gate=device Token # 566: 114.822ms; value: next_token_ids=tensor([5863], device='cuda:0') mtp accept=1 prop=5863 top1=5863 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 567: 3.725ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4610 prop=4610 pred gate=device Token # 568: 115.265ms; value: next_token_ids=tensor([4610], device='cuda:0') mtp accept=1 prop=4610 top1=4610 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.4ms gain=84.7ms ratio=0.44 s0=5.8ms s1=188.6ms wait=0.2/44.1ms pred gate=device Token # 569: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5817 prop=5817 pred gate=device Token # 570: 114.576ms; value: next_token_ids=tensor([5817], device='cuda:0') mtp accept=1 prop=5817 top1=5817 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=4.4ms s1=190.0ms wait=0.1/45.2ms pred gate=device Token # 571: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6048 prop=6048 pred gate=device Token # 572: 115.809ms; value: next_token_ids=tensor([6048], device='cuda:0') mtp accept=1 prop=6048 top1=6048 accp=1.000 next=draft=223 prop=223 olap pair=110.7ms serial=196.4ms gain=85.7ms ratio=0.44 s0=4.4ms s1=191.9ms wait=0.1/45.2ms pred gate=device Token # 573: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=2402 prop=2402 pred gate=device Token # 574: 115.293ms; value: 
next_token_ids=tensor([2402], device='cuda:0') mtp accept=1 prop=2402 top1=2402 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.0ms s1=191.8ms wait=0.1/46.3ms pred gate=device Token # 575: 3.736ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.956 next=pair draft=4307 prop=4307 pred gate=device Token # 576: 114.629ms; value: next_token_ids=tensor([4307], device='cuda:0') mtp accept=1 prop=4307 top1=4307 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/46.5ms pred gate=device Token # 577: 3.721ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=3045 prop=3045 pred gate=device Token # 578: 114.653ms; value: next_token_ids=tensor([3045], device='cuda:0') mtp accept=1 prop=3045 top1=3045 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 579: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2597 prop=2597 pred gate=device Token # 580: 114.535ms; value: next_token_ids=tensor([2597], device='cuda:0') mtp accept=1 prop=2597 top1=2597 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/46.6ms pred gate=device Token # 581: 3.741ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=3981 prop=3981 pred gate=device Token # 582: 114.683ms; value: next_token_ids=tensor([3981], device='cuda:0') mtp accept=1 prop=3981 top1=3981 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 583: 3.738ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 
prop=223 top1=223 accp=1.000 next=pair draft=1892 prop=1892 pred gate=device Token # 584: 114.682ms; value: next_token_ids=tensor([1892], device='cuda:0') mtp accept=1 prop=1892 top1=1892 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/46.5ms pred gate=device Token # 585: 3.760ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.997 next=pair draft=5929 prop=5929 pred gate=device Token # 586: 114.834ms; value: next_token_ids=tensor([5929], device='cuda:0') mtp accept=1 prop=5929 top1=5929 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/46.3ms pred gate=device Token # 587: 3.758ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6078 prop=6078 pred gate=device Token # 588: 114.582ms; value: next_token_ids=tensor([6078], device='cuda:0') mtp accept=1 prop=6078 top1=6078 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/46.4ms pred gate=device Token # 589: 3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=6131 prop=6131 pred gate=device Token # 590: 115.026ms; value: next_token_ids=tensor([6131], device='cuda:0') mtp accept=1 prop=6131 top1=6131 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.6ms wait=0.1/46.5ms pred gate=device Token # 591: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5844 prop=5844 pred gate=device Token # 592: 114.558ms; value: next_token_ids=tensor([5844], device='cuda:0') mtp accept=1 prop=5844 top1=5844 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/46.5ms pred 
gate=device Token # 593: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.797 next=pair draft=5361 prop=5361 pred gate=device Token # 594: 115.168ms; value: next_token_ids=tensor([5361], device='cuda:0') mtp accept=1 prop=5361 top1=5361 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.6ms gain=85.7ms ratio=0.44 s0=3.8ms s1=191.8ms wait=0.1/46.3ms pred gate=device Token # 595: 3.734ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5926 prop=5926 pred gate=device Token # 596: 115.088ms; value: next_token_ids=tensor([5926], device='cuda:0') mtp accept=1 prop=5926 top1=5926 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.7ms gain=85.7ms ratio=0.44 s0=3.7ms s1=192.0ms wait=0.1/46.5ms pred gate=device Token # 597: 3.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5198 prop=5198 pred gate=device Token # 598: 114.832ms; value: next_token_ids=tensor([5198], device='cuda:0') mtp accept=1 prop=5198 top1=5198 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.3ms wait=0.1/46.5ms pred gate=device Token # 599: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2851 prop=2851 pred gate=device Token # 600: 115.246ms; value: next_token_ids=tensor([2851], device='cuda:0') mtp accept=1 prop=2851 top1=2851 accp=1.000 next=draft=2851 prop=2851 olap pair=110.1ms serial=196.0ms gain=85.9ms ratio=0.44 s0=3.7ms s1=192.3ms wait=0.1/46.6ms pred gate=device Token # 601: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=2851 top1=223 accp=0.439 next=pair draft=4362 prop=4362 pred gate=device Token # 602: 115.256ms; value: next_token_ids=tensor([4362], device='cuda:0') mtp accept=1 prop=4362 top1=4362 accp=1.000 next=draft=223 
prop=223 olap pair=110.1ms serial=193.9ms gain=83.8ms ratio=0.43 s0=4.1ms s1=189.8ms wait=0.1/46.2ms pred gate=device Token # 603: 3.810ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.881 next=pair draft=2225 prop=2225 pred gate=device Token # 604: 115.398ms; value: next_token_ids=tensor([2225], device='cuda:0') mtp accept=1 prop=2225 top1=2225 accp=1.000 next=draft=201 prop=201 olap pair=110.2ms serial=194.4ms gain=84.2ms ratio=0.43 s0=4.2ms s1=190.3ms wait=0.1/45.9ms pred gate=device Token # 605: 3.764ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.975 next=pair draft=6207 prop=6207 pred gate=device Token # 606: 114.582ms; value: next_token_ids=tensor([6207], device='cuda:0') mtp accept=1 prop=6207 top1=6207 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.2ms wait=0.1/45.2ms pred gate=device Token # 607: 3.723ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6152 prop=6152 pred gate=device Token # 608: 114.850ms; value: next_token_ids=tensor([6152], device='cuda:0') mtp accept=1 prop=6152 top1=6152 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.1ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.4ms wait=0.1/46.5ms pred gate=device Token # 609: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.968 next=pair draft=6420 prop=6420 pred gate=device Token # 610: 114.573ms; value: next_token_ids=tensor([6420], device='cuda:0') mtp accept=1 prop=6420 top1=6420 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.7ms gain=85.2ms ratio=0.44 s0=4.0ms s1=190.7ms wait=0.1/45.9ms pred gate=device Token # 611: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6338 prop=6338 pred gate=device Token # 612: 114.663ms; value: 
next_token_ids=tensor([6338], device='cuda:0') mtp accept=1 prop=6338 top1=6338 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/45.4ms pred gate=device Token # 613: 3.749ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.987 next=pair draft=2875 prop=2875 pred gate=device Token # 614: 114.442ms; value: next_token_ids=tensor([2875], device='cuda:0') mtp accept=1 prop=2875 top1=2875 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.9ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.2ms pred gate=device Token # 615: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5936 prop=5936 pred gate=device Token # 616: 114.780ms; value: next_token_ids=tensor([5936], device='cuda:0') mtp accept=1 prop=5936 top1=5936 accp=1.000 next=draft=5936 prop=5936 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/45.3ms pred gate=device Token # 617: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=5936 top1=223 accp=0.438 next=pair draft=5106 prop=5106 pred gate=device Token # 618: 115.621ms; value: next_token_ids=tensor([5106], device='cuda:0') mtp accept=1 prop=5106 top1=5106 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=194.8ms gain=84.4ms ratio=0.43 s0=4.3ms s1=190.6ms wait=0.1/45.8ms pred gate=device Token # 619: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.942 next=pair draft=3565 prop=3565 pred gate=device Token # 620: 115.791ms; value: next_token_ids=tensor([3565], device='cuda:0') mtp accept=1 prop=3565 top1=3565 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=194.3ms gain=84.2ms ratio=0.43 s0=6.5ms s1=187.8ms wait=0.2/43.6ms pred gate=device Token # 621: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 
prop=223 top1=223 accp=0.983 next=pair draft=1977 prop=1977 pred gate=device Token # 622: 115.495ms; value: next_token_ids=tensor([1977], device='cuda:0') mtp accept=1 prop=1977 top1=1977 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.8ms gain=85.5ms ratio=0.44 s0=4.7ms s1=191.1ms wait=0.1/45.1ms pred gate=device Token # 623: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1457 prop=1457 pred gate=device Token # 624: 115.473ms; value: next_token_ids=tensor([1457], device='cuda:0') mtp accept=1 prop=1457 top1=1457 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=192.8ms gain=83.3ms ratio=0.43 s0=6.7ms s1=186.1ms wait=0.2/43.5ms pred gate=device Token # 625: 4.713ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=4460 prop=4460 pred gate=device Token # 626: 115.115ms; value: next_token_ids=tensor([4460], device='cuda:0') mtp accept=1 prop=4460 top1=4460 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=193.9ms gain=84.1ms ratio=0.43 s0=4.7ms s1=189.2ms wait=0.1/45.5ms pred gate=device Token # 627: 3.737ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=5769 prop=5769 pred gate=device Token # 628: 115.079ms; value: next_token_ids=tensor([5769], device='cuda:0') mtp accept=1 prop=5769 top1=5769 accp=1.000 next=draft=5769 prop=5769 olap pair=109.9ms serial=194.8ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.6ms wait=0.1/45.7ms pred gate=device Token # 629: 3.725ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=5769 top1=223 accp=0.297 next=pair draft=6650 prop=6650 pred gate=device Token # 630: 115.500ms; value: next_token_ids=tensor([6650], device='cuda:0') mtp accept=1 prop=6650 top1=6650 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.1ms gain=84.7ms ratio=0.43 s0=6.3ms s1=188.8ms wait=0.2/44.0ms 
pred gate=device Token # 631: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=7163 prop=7163 pred gate=device Token # 632: 114.896ms; value: next_token_ids=tensor([7163], device='cuda:0') mtp accept=1 prop=7163 top1=7163 accp=1.000 next=draft=7163 prop=7163 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.3ms wait=0.1/46.4ms pred gate=device Token # 633: 3.736ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=7163 top1=223 accp=0.218 next=pair draft=6992 prop=6992 pred gate=device Token # 634: 115.150ms; value: next_token_ids=tensor([6992], device='cuda:0') mtp accept=1 prop=6992 top1=6992 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.5ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.8ms wait=0.1/46.6ms pred gate=device Token # 635: 3.727ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=7336 prop=7336 pred gate=device Token # 636: 115.954ms; value: next_token_ids=tensor([7336], device='cuda:0') mtp accept=1 prop=7336 top1=7336 accp=1.000 next=draft=7336 prop=7336 olap pair=110.8ms serial=196.3ms gain=85.5ms ratio=0.44 s0=3.7ms s1=192.6ms wait=0.1/46.7ms pred gate=device Token # 637: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=7336 top1=223 accp=0.023 next=pair draft=7792 prop=7792 pred gate=device Token # 638: 114.501ms; value: next_token_ids=tensor([7792], device='cuda:0') mtp accept=1 prop=7792 top1=7792 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.6ms pred gate=device Token # 639: 3.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.966 next=pair draft=6924 prop=6924 pred gate=device Token # 640: 114.899ms; value: next_token_ids=tensor([6924], device='cuda:0') mtp accept=1 prop=6924 top1=6924 accp=1.000 
next=draft=223 prop=223 olap pair=109.7ms serial=195.2ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.3ms wait=0.1/46.2ms pred gate=device Token # 641: 3.750ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=7335 prop=7335 pred gate=device Token # 642: 114.656ms; value: next_token_ids=tensor([7335], device='cuda:0') mtp accept=1 prop=7335 top1=7335 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.2ms wait=0.1/46.1ms pred gate=device Token # 643: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.925 next=pair draft=5234 prop=5234 pred gate=device Token # 644: 114.986ms; value: next_token_ids=tensor([5234], device='cuda:0') mtp accept=1 prop=5234 top1=5234 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/46.5ms pred gate=device Token # 645: 3.736ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=5822 prop=5822 pred gate=device Token # 646: 114.803ms; value: next_token_ids=tensor([5822], device='cuda:0') mtp accept=1 prop=5822 top1=5822 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.2ms wait=0.1/45.6ms pred gate=device Token # 647: 3.781ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.944 next=pair draft=7534 prop=7534 pred gate=device Token # 648: 114.637ms; value: next_token_ids=tensor([7534], device='cuda:0') mtp accept=1 prop=7534 top1=7534 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.2ms s1=190.3ms wait=0.1/45.7ms pred gate=device Token # 649: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.937 next=pair draft=8302 prop=8302 pred gate=device Token # 650: 
115.050ms; value: next_token_ids=tensor([8302], device='cuda:0') mtp accept=1 prop=8302 top1=8302 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.8ms wait=0.1/45.3ms pred gate=device Token # 651: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=8594 prop=8594 pred gate=device Token # 652: 114.473ms; value: next_token_ids=tensor([8594], device='cuda:0') mtp accept=1 prop=8594 top1=8594 accp=1.000 next=draft=223 prop=8594 olap pair=109.2ms serial=193.6ms gain=84.4ms ratio=0.44 s0=4.2ms s1=189.4ms wait=0.1/45.9ms pred gate=device Token # 653: 3.831ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=8594 top1=223 accp=0.511 next=pair draft=8059 prop=8059 pred gate=device Token # 654: 114.612ms; value: next_token_ids=tensor([8059], device='cuda:0') mtp accept=1 prop=8059 top1=8059 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/46.3ms pred gate=device Token # 655: 3.732ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=8401 prop=8401 pred gate=device Token # 656: 115.898ms; value: next_token_ids=tensor([8401], device='cuda:0') mtp accept=1 prop=8401 top1=8401 accp=1.000 next=draft=223 prop=223 olap pair=110.7ms serial=195.7ms gain=84.9ms ratio=0.43 s0=4.3ms s1=191.3ms wait=0.1/45.6ms pred gate=device Token # 657: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.817 next=pair draft=8717 prop=8717 pred gate=device Token # 658: 115.089ms; value: next_token_ids=tensor([8717], device='cuda:0') mtp accept=1 prop=8717 top1=8717 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/46.5ms pred gate=device Token # 659: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') 
mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=8610 prop=8610 pred gate=device Token # 660: 114.756ms; value: next_token_ids=tensor([8610], device='cuda:0') mtp accept=1 prop=8610 top1=8610 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/46.6ms pred gate=device Token # 661: 3.727ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9127 prop=9127 pred gate=device Token # 662: 114.706ms; value: next_token_ids=tensor([9127], device='cuda:0') mtp accept=1 prop=9127 top1=9127 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/46.7ms pred gate=device Token # 663: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4870 prop=4870 pred gate=device Token # 664: 114.821ms; value: next_token_ids=tensor([4870], device='cuda:0') mtp accept=1 prop=4870 top1=4870 accp=1.000 next=draft=201 prop=201 olap pair=109.6ms serial=194.8ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.9ms wait=0.1/46.6ms pred gate=device Token # 665: 3.757ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=8245 prop=8245 pred gate=device Token # 666: 114.727ms; value: next_token_ids=tensor([8245], device='cuda:0') mtp accept=1 prop=8245 top1=8245 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.8ms ratio=0.44 s0=5.4ms s1=189.0ms wait=0.2/44.7ms pred gate=device Token # 667: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=8519 prop=8519 pred gate=device Token # 668: 115.325ms; value: next_token_ids=tensor([8519], device='cuda:0') mtp accept=1 prop=8519 top1=8519 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.7ms gain=85.5ms ratio=0.44 s0=4.1ms s1=191.6ms 
wait=0.1/46.0ms pred gate=device Token # 669: 3.746ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6895 prop=6895 pred gate=device Token # 670: 114.498ms; value: next_token_ids=tensor([6895], device='cuda:0') mtp accept=1 prop=6895 top1=6895 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=3.9ms s1=190.1ms wait=0.1/46.3ms pred gate=device Token # 671: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=9006 prop=9006 pred gate=device Token # 672: 114.814ms; value: next_token_ids=tensor([9006], device='cuda:0') mtp accept=1 prop=9006 top1=9006 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.9ms s1=191.0ms wait=0.1/46.3ms pred gate=device Token # 673: 3.769ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=7207 prop=7207 pred gate=device Token # 674: 114.743ms; value: next_token_ids=tensor([7207], device='cuda:0') mtp accept=1 prop=7207 top1=7207 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.1ms s1=190.6ms wait=0.1/45.9ms pred gate=device Token # 675: 3.729ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9155 prop=9155 pred gate=device Token # 676: 114.068ms; value: next_token_ids=tensor([9155], device='cuda:0') mtp accept=1 prop=9155 top1=9155 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.1ms s1=189.2ms wait=0.1/46.0ms pred gate=device Token # 677: 3.735ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=8870 prop=8870 pred gate=device Token # 678: 114.870ms; value: next_token_ids=tensor([8870], device='cuda:0') mtp accept=1 prop=8870 top1=8870 accp=1.000 
next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 679: 3.715ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=7833 prop=7833 pred gate=device Token # 680: 114.877ms; value: next_token_ids=tensor([7833], device='cuda:0') mtp accept=1 prop=7833 top1=7833 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.1ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.4ms wait=0.1/46.6ms pred gate=device Token # 681: 3.741ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9603 prop=9603 pred gate=device Token # 682: 114.568ms; value: next_token_ids=tensor([9603], device='cuda:0') mtp accept=1 prop=9603 top1=9603 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.6ms pred gate=device Token # 683: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=7013 prop=7013 pred gate=device Token # 684: 115.980ms; value: next_token_ids=tensor([7013], device='cuda:0') mtp accept=1 prop=7013 top1=7013 accp=1.000 next=draft=201 prop=201 olap pair=110.9ms serial=196.2ms gain=85.4ms ratio=0.43 s0=3.7ms s1=192.5ms wait=0.1/46.5ms pred gate=device Token # 685: 3.732ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=9049 prop=9049 pred gate=device Token # 686: 114.926ms; value: next_token_ids=tensor([9049], device='cuda:0') mtp accept=1 prop=9049 top1=9049 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.3ms wait=0.1/46.4ms pred gate=device Token # 687: 3.717ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9309 prop=9309 pred gate=device Token # 688: 
115.231ms; value: next_token_ids=tensor([9309], device='cuda:0') mtp accept=1 prop=9309 top1=9309 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.5ms gain=85.4ms ratio=0.44 s0=4.2ms s1=191.4ms wait=0.1/45.9ms pred gate=device Token # 689: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=9250 prop=9250 pred gate=device Token # 690: 114.871ms; value: next_token_ids=tensor([9250], device='cuda:0') mtp accept=1 prop=9250 top1=9250 accp=1.000 next=draft=223 prop=9250 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.5ms wait=0.1/46.5ms pred gate=device Token # 691: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=9250 top1=223 accp=0.768 next=pair draft=9451 prop=9451 pred gate=device Token # 692: 115.270ms; value: next_token_ids=tensor([9451], device='cuda:0') mtp accept=1 prop=9451 top1=9451 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.9ms gain=85.8ms ratio=0.44 s0=3.7ms s1=192.2ms wait=0.1/46.5ms pred gate=device Token # 693: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9107 prop=9107 pred gate=device Token # 694: 114.717ms; value: next_token_ids=tensor([9107], device='cuda:0') mtp accept=1 prop=9107 top1=9107 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/46.2ms pred gate=device Token # 695: 3.690ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9679 prop=9679 pred gate=device Token # 696: 114.709ms; value: next_token_ids=tensor([9679], device='cuda:0') mtp accept=1 prop=9679 top1=9679 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.5ms pred gate=device Token # 697: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') 
mtp accept=1 prop=223 top1=223 accp=0.924 next=pair draft=9559 prop=9559 pred gate=device Token # 698: 114.651ms; value: next_token_ids=tensor([9559], device='cuda:0') mtp accept=1 prop=9559 top1=9559 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.5ms wait=0.1/46.4ms pred gate=device Token # 699: 3.781ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=10363 prop=10363 pred gate=device Token # 700: 114.973ms; value: next_token_ids=tensor([10363], device='cuda:0') mtp accept=1 prop=10363 top1=10363 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.0ms gain=85.2ms ratio=0.44 s0=4.2ms s1=190.8ms wait=0.1/45.8ms pred gate=device Token # 701: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=10334 prop=10334 pred gate=device Token # 702: 115.275ms; value: next_token_ids=tensor([10334], device='cuda:0') mtp accept=1 prop=10334 top1=10334 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.6ms gain=85.4ms ratio=0.44 s0=4.3ms s1=191.3ms wait=0.1/45.6ms pred gate=device Token # 703: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.994 next=pair draft=7331 prop=7331 pred gate=device Token # 704: 115.088ms; value: next_token_ids=tensor([7331], device='cuda:0') mtp accept=1 prop=7331 top1=7331 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=195.2ms gain=85.3ms ratio=0.44 s0=4.3ms s1=190.9ms wait=0.1/45.5ms pred gate=device Token # 705: 3.740ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.972 next=pair draft=9926 prop=9926 pred gate=device Token # 706: 114.451ms; value: next_token_ids=tensor([9926], device='cuda:0') mtp accept=1 prop=9926 top1=9926 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=85.0ms ratio=0.44 s0=3.8ms 
s1=190.4ms wait=0.1/46.3ms pred gate=device Token # 707: 3.774ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10193 prop=10193 pred gate=device Token # 708: 114.975ms; value: next_token_ids=tensor([10193], device='cuda:0') mtp accept=1 prop=10193 top1=10193 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.9ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.7ms wait=0.1/45.8ms pred gate=device Token # 709: 3.766ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10543 prop=10543 pred gate=device Token # 710: 114.427ms; value: next_token_ids=tensor([10543], device='cuda:0') mtp accept=1 prop=10543 top1=10543 accp=1.000 next=draft=10543 prop=10543 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.6ms pred gate=device Token # 711: 3.746ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=10543 top1=223 accp=0.328 next=pair draft=9775 prop=9775 pred gate=device Token # 712: 114.928ms; value: next_token_ids=tensor([9775], device='cuda:0') mtp accept=1 prop=9775 top1=9775 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.6ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.4ms pred gate=device Token # 713: 3.745ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10186 prop=10186 pred gate=device Token # 714: 114.458ms; value: next_token_ids=tensor([10186], device='cuda:0') mtp accept=1 prop=10186 top1=10186 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.6ms pred gate=device Token # 715: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=10765 prop=10765 pred gate=device Token # 716: 115.162ms; value: next_token_ids=tensor([10765], device='cuda:0') mtp 
accept=1 prop=10765 top1=10765 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=193.9ms gain=83.9ms ratio=0.43 s0=4.1ms s1=189.8ms wait=0.1/46.3ms pred gate=device Token # 717: 3.717ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10594 prop=10594 pred gate=device Token # 718: 114.441ms; value: next_token_ids=tensor([10594], device='cuda:0') mtp accept=1 prop=10594 top1=10594 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.6ms gain=84.3ms ratio=0.44 s0=4.0ms s1=189.6ms wait=0.1/46.4ms pred gate=device Token # 719: 3.749ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=11065 prop=11065 pred gate=device Token # 720: 114.758ms; value: next_token_ids=tensor([11065], device='cuda:0') mtp accept=1 prop=11065 top1=11065 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/46.5ms pred gate=device Token # 721: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=10751 prop=10751 pred gate=device Token # 722: 114.486ms; value: next_token_ids=tensor([10751], device='cuda:0') mtp accept=1 prop=10751 top1=10751 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/46.4ms pred gate=device Token # 723: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4980 prop=4980 pred gate=device Token # 724: 114.454ms; value: next_token_ids=tensor([4980], device='cuda:0') mtp accept=1 prop=4980 top1=4980 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.4ms pred gate=device Token # 725: 3.758ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair 
draft=10092 prop=10092 pred gate=device Token # 726: 115.617ms; value: next_token_ids=tensor([10092], device='cuda:0') mtp accept=1 prop=10092 top1=10092 accp=1.000 next=draft=223 prop=223 olap pair=110.5ms serial=196.5ms gain=86.1ms ratio=0.44 s0=3.8ms s1=192.8ms wait=0.1/46.5ms pred gate=device Token # 727: 3.704ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10073 prop=10073 pred gate=device Token # 728: 114.738ms; value: next_token_ids=tensor([10073], device='cuda:0') mtp accept=1 prop=10073 top1=10073 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/46.3ms pred gate=device Token # 729: 3.700ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10183 prop=10183 pred gate=device Token # 730: 115.085ms; value: next_token_ids=tensor([10183], device='cuda:0') mtp accept=1 prop=10183 top1=10183 accp=1.000 next=draft=10183 prop=223 olap pair=109.9ms serial=195.5ms gain=85.6ms ratio=0.44 s0=3.8ms s1=191.7ms wait=0.1/46.5ms pred gate=device Token # 731: 3.714ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.281 next=pair draft=10761 prop=10761 pred gate=device Token # 732: 114.791ms; value: next_token_ids=tensor([10761], device='cuda:0') mtp accept=1 prop=10761 top1=10761 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.3ms wait=0.1/46.6ms pred gate=device Token # 733: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10081 prop=10081 pred gate=device Token # 734: 114.086ms; value: next_token_ids=tensor([10081], device='cuda:0') mtp accept=1 prop=10081 top1=10081 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.0ms s1=189.4ms wait=0.1/46.2ms pred gate=device 
Token # 735: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10650 prop=10650 pred gate=device Token # 736: 114.650ms; value: next_token_ids=tensor([10650], device='cuda:0') mtp accept=1 prop=10650 top1=10650 accp=1.000 next=draft=10650 prop=10650 olap pair=109.4ms serial=193.9ms gain=84.5ms ratio=0.44 s0=5.2ms s1=188.7ms wait=0.1/44.8ms pred gate=device Token # 737: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=10650 top1=223 accp=0.210 next=pair draft=10862 prop=10862 pred gate=device Token # 738: 114.877ms; value: next_token_ids=tensor([10862], device='cuda:0') mtp accept=1 prop=10862 top1=10862 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.9ms wait=0.1/46.4ms pred gate=device Token # 739: 3.727ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=11249 prop=11249 pred gate=device Token # 740: 114.661ms; value: next_token_ids=tensor([11249], device='cuda:0') mtp accept=1 prop=11249 top1=11249 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/46.1ms pred gate=device Token # 741: 3.774ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=10969 prop=10969 pred gate=device Token # 742: 114.783ms; value: next_token_ids=tensor([10969], device='cuda:0') mtp accept=1 prop=10969 top1=10969 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 743: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6970 prop=6970 pred gate=device Token # 744: 115.287ms; value: next_token_ids=tensor([6970], device='cuda:0') mtp accept=1 prop=6970 top1=6970 accp=1.000 
next=draft=201 prop=201 olap pair=110.2ms serial=195.5ms gain=85.4ms ratio=0.44 s0=4.1ms s1=191.4ms wait=0.1/45.9ms pred gate=device Token # 745: 3.834ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=10410 prop=10410 pred gate=device Token # 746: 114.234ms; value: next_token_ids=tensor([10410], device='cuda:0') mtp accept=1 prop=10410 top1=10410 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=4.3ms s1=189.4ms wait=0.1/45.6ms pred gate=device Token # 747: 3.732ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10669 prop=10669 pred gate=device Token # 748: 114.625ms; value: next_token_ids=tensor([10669], device='cuda:0') mtp accept=1 prop=10669 top1=10669 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.3ms s1=190.0ms wait=0.1/45.6ms pred gate=device Token # 749: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9265 prop=9265 pred gate=device Token # 750: 115.103ms; value: next_token_ids=tensor([9265], device='cuda:0') mtp accept=1 prop=9265 top1=9265 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.2ms gain=84.3ms ratio=0.43 s0=4.1ms s1=190.1ms wait=0.1/46.1ms pred gate=device Token # 751: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10095 prop=10095 pred gate=device Token # 752: 114.069ms; value: next_token_ids=tensor([10095], device='cuda:0') mtp accept=1 prop=10095 top1=10095 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/46.4ms pred gate=device Token # 753: 3.758ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10488 prop=10488 pred gate=device 
Token # 754: 114.447ms; value: next_token_ids=tensor([10488], device='cuda:0') mtp accept=1 prop=10488 top1=10488 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.0ms wait=0.1/46.1ms pred gate=device Token # 755: 3.709ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10739 prop=10739 pred gate=device Token # 756: 115.510ms; value: next_token_ids=tensor([10739], device='cuda:0') mtp accept=1 prop=10739 top1=10739 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=196.3ms gain=85.9ms ratio=0.44 s0=3.9ms s1=192.4ms wait=0.1/46.5ms pred gate=device Token # 757: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=11185 prop=11185 pred gate=device Token # 758: 115.158ms; value: next_token_ids=tensor([11185], device='cuda:0') mtp accept=1 prop=11185 top1=11185 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.4ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.4ms wait=0.1/46.1ms pred gate=device Token # 759: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10475 prop=10475 pred gate=device Token # 760: 114.757ms; value: next_token_ids=tensor([10475], device='cuda:0') mtp accept=1 prop=10475 top1=10475 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.4ms gain=83.8ms ratio=0.43 s0=4.0ms s1=189.4ms wait=0.1/46.5ms pred gate=device Token # 761: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.954 next=pair draft=11508 prop=11508 pred gate=device Token # 762: 114.580ms; value: next_token_ids=tensor([11508], device='cuda:0') mtp accept=1 prop=11508 top1=11508 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.6ms wait=0.1/46.3ms pred gate=device Token # 763: 3.776ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=8778 prop=8778 pred gate=device Token # 764: 114.585ms; value: next_token_ids=tensor([8778], device='cuda:0') mtp accept=1 prop=8778 top1=8778 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.5ms pred gate=device Token # 765: 3.776ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=10857 prop=10857 pred gate=device Token # 766: 114.237ms; value: next_token_ids=tensor([10857], device='cuda:0') mtp accept=1 prop=10857 top1=10857 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.8ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/46.5ms pred gate=device Token # 767: 3.795ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10925 prop=10925 pred gate=device Token # 768: 114.709ms; value: next_token_ids=tensor([10925], device='cuda:0') mtp accept=1 prop=10925 top1=10925 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/46.6ms pred gate=device Token # 769: 3.845ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.961 next=pair draft=11454 prop=11454 pred gate=device Token # 770: 114.600ms; value: next_token_ids=tensor([11454], device='cuda:0') mtp accept=1 prop=11454 top1=11454 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.5ms pred gate=device Token # 771: 3.742ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=11251 prop=11251 pred gate=device Token # 772: 114.669ms; value: next_token_ids=tensor([11251], device='cuda:0') mtp accept=1 prop=11251 top1=11251 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms 
serial=194.3ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.2ms wait=0.1/45.8ms pred gate=device Token # 773: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9700 prop=9700 pred gate=device Token # 774: 114.794ms; value: next_token_ids=tensor([9700], device='cuda:0') mtp accept=1 prop=9700 top1=9700 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/45.5ms pred gate=device Token # 775: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10437 prop=10437 pred gate=device Token # 776: 114.646ms; value: next_token_ids=tensor([10437], device='cuda:0') mtp accept=1 prop=10437 top1=10437 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.6ms wait=0.1/46.3ms pred gate=device Token # 777: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10170 prop=10170 pred gate=device Token # 778: 115.349ms; value: next_token_ids=tensor([10170], device='cuda:0') mtp accept=1 prop=10170 top1=10170 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.6ms ratio=0.44 s0=4.8ms s1=189.2ms wait=0.1/45.4ms pred gate=device Token # 779: 4.737ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9827 prop=9827 pred gate=device Token # 780: 115.122ms; value: next_token_ids=tensor([9827], device='cuda:0') mtp accept=1 prop=9827 top1=9827 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.5ms gain=84.7ms ratio=0.44 s0=6.2ms s1=188.3ms wait=0.2/44.0ms pred gate=device Token # 781: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=9598 prop=9598 pred gate=device Token # 782: 115.521ms; value: 
next_token_ids=tensor([9598], device='cuda:0') mtp accept=1 prop=9598 top1=9598 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=196.4ms gain=86.0ms ratio=0.44 s0=3.7ms s1=192.7ms wait=0.1/46.6ms pred gate=device Token # 783: 3.758ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=5895 prop=5895 pred gate=device Token # 784: 115.088ms; value: next_token_ids=tensor([5895], device='cuda:0') mtp accept=1 prop=5895 top1=5895 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=4.0ms s1=191.3ms wait=0.1/46.1ms pred gate=device Token # 785: 3.761ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=8939 prop=8939 pred gate=device Token # 786: 114.548ms; value: next_token_ids=tensor([8939], device='cuda:0') mtp accept=1 prop=8939 top1=8939 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.5ms pred gate=device Token # 787: 3.685ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=8961 prop=8961 pred gate=device Token # 788: 114.873ms; value: next_token_ids=tensor([8961], device='cuda:0') mtp accept=1 prop=8961 top1=8961 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.5ms gain=84.8ms ratio=0.44 s0=4.2ms s1=190.3ms wait=0.1/45.7ms pred gate=device Token # 789: 3.848ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=8491 prop=8491 pred gate=device Token # 790: 114.715ms; value: next_token_ids=tensor([8491], device='cuda:0') mtp accept=1 prop=8491 top1=8491 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.8ms ratio=0.44 s0=4.2ms s1=190.1ms wait=0.1/45.7ms pred gate=device Token # 791: 3.713ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 
prop=223 top1=223 accp=0.831 next=pair draft=7965 prop=7965 pred gate=device Token # 792: 115.441ms; value: next_token_ids=tensor([7965], device='cuda:0') mtp accept=1 prop=7965 top1=7965 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.1ms s1=191.7ms wait=0.1/46.1ms pred gate=device Token # 793: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.784 next=pair draft=7593 prop=7593 pred gate=device Token # 794: 114.647ms; value: next_token_ids=tensor([7593], device='cuda:0') mtp accept=1 prop=7593 top1=7593 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.2ms wait=0.1/45.9ms pred gate=device Token # 795: 3.750ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6917 prop=6917 pred gate=device Token # 796: 114.497ms; value: next_token_ids=tensor([6917], device='cuda:0') mtp accept=1 prop=6917 top1=6917 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.4ms pred gate=device Token # 797: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.879 next=pair draft=7201 prop=7201 pred gate=device Token # 798: 114.660ms; value: next_token_ids=tensor([7201], device='cuda:0') mtp accept=1 prop=7201 top1=7201 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.1ms wait=0.1/45.7ms pred gate=device Token # 799: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6860 prop=6860 pred gate=device Token # 800: 115.459ms; value: next_token_ids=tensor([6860], device='cuda:0') mtp accept=1 prop=6860 top1=6860 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.3ms s1=191.6ms wait=0.1/45.6ms pred 
gate=device Token # 801: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=6432 prop=6432 pred gate=device Token # 802: 114.951ms; value: next_token_ids=tensor([6432], device='cuda:0') mtp accept=1 prop=6432 top1=6432 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.2ms ratio=0.44 s0=4.2ms s1=190.7ms wait=0.1/45.6ms pred gate=device Token # 803: 3.780ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=5151 prop=5151 pred gate=device Token # 804: 114.398ms; value: next_token_ids=tensor([5151], device='cuda:0') mtp accept=1 prop=5151 top1=5151 accp=1.000 next=draft=201 prop=201 olap pair=109.2ms serial=193.8ms gain=84.6ms ratio=0.44 s0=4.3ms s1=189.5ms wait=0.1/45.7ms pred gate=device Token # 805: 3.727ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=4470 prop=4470 pred gate=device Token # 806: 115.368ms; value: next_token_ids=tensor([4470], device='cuda:0') mtp accept=1 prop=4470 top1=4470 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.3ms s1=191.5ms wait=0.1/45.6ms pred gate=device Token # 807: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4215 prop=4215 pred gate=device Token # 808: 115.254ms; value: next_token_ids=tensor([4215], device='cuda:0') mtp accept=1 prop=4215 top1=4215 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.6ms gain=85.5ms ratio=0.44 s0=4.6ms s1=191.0ms wait=0.1/45.5ms pred gate=device Token # 809: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.968 next=pair draft=3885 prop=3885 pred gate=device Token # 810: 114.667ms; value: next_token_ids=tensor([3885], device='cuda:0') mtp accept=1 prop=3885 top1=3885 accp=1.000 next=draft=223 
prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.2ms s1=190.3ms wait=0.1/45.8ms pred gate=device Token # 811: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=3464 prop=3464 pred gate=device Token # 812: 115.029ms; value: next_token_ids=tensor([3464], device='cuda:0') mtp accept=1 prop=3464 top1=3464 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.8ms wait=0.1/45.7ms pred gate=device Token # 813: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=3298 prop=3298 pred gate=device Token # 814: 114.972ms; value: next_token_ids=tensor([3298], device='cuda:0') mtp accept=1 prop=3298 top1=3298 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=4.1ms s1=191.0ms wait=0.1/46.1ms pred gate=device Token # 815: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.971 next=pair draft=2722 prop=2722 pred gate=device Token # 816: 115.108ms; value: next_token_ids=tensor([2722], device='cuda:0') mtp accept=1 prop=2722 top1=2722 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.1ms gain=84.5ms ratio=0.44 s0=6.1ms s1=188.0ms wait=0.2/43.9ms pred gate=device Token # 817: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=2254 prop=2254 pred gate=device Token # 818: 115.077ms; value: next_token_ids=tensor([2254], device='cuda:0') mtp accept=1 prop=2254 top1=2254 accp=1.000 next=draft=2254 prop=223 olap pair=109.4ms serial=193.7ms gain=84.3ms ratio=0.44 s0=5.4ms s1=188.3ms wait=0.1/44.6ms pred gate=device Token # 819: 3.843ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.099 next=pair draft=1809 prop=1809 pred gate=device Token # 820: 114.577ms; value: 
next_token_ids=tensor([1809], device='cuda:0') mtp accept=1 prop=1809 top1=1809 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/46.4ms pred gate=device Token # 821: 3.913ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=1357 prop=1357 pred gate=device Token # 822: 114.638ms; value: next_token_ids=tensor([1357], device='cuda:0') mtp accept=1 prop=1357 top1=1357 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.7ms gain=84.3ms ratio=0.44 s0=4.2ms s1=189.5ms wait=0.1/45.8ms pred gate=device Token # 823: 3.789ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=792 prop=792 pred gate=device Token # 824: 114.559ms; value: next_token_ids=tensor([792], device='cuda:0') mtp accept=1 prop=792 top1=792 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.2ms s1=189.9ms wait=0.1/45.8ms pred gate=device Token # 825: 3.706ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=643 prop=643 pred gate=device Token # 826: 114.707ms; value: next_token_ids=tensor([643], device='cuda:0') mtp accept=1 prop=643 top1=643 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.0ms s1=190.5ms wait=0.1/46.3ms pred gate=device Token # 827: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=939 prop=939 pred gate=device Token # 828: 114.871ms; value: next_token_ids=tensor([939], device='cuda:0') mtp accept=1 prop=939 top1=939 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.5ms s1=190.4ms wait=0.1/45.7ms pred gate=device Token # 829: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 
accp=1.000 next=pair draft=9146 prop=9146 pred gate=device Token # 830: 114.630ms; value: next_token_ids=tensor([9146], device='cuda:0') mtp accept=1 prop=9146 top1=9146 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/46.6ms pred gate=device Token # 831: 3.834ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.991 next=pair draft=11154 prop=11154 pred gate=device Token # 832: 114.749ms; value: next_token_ids=tensor([11154], device='cuda:0') mtp accept=1 prop=11154 top1=11154 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/46.3ms pred gate=device Token # 833: 3.732ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=11773 prop=11773 pred gate=device Token # 834: 114.716ms; value: next_token_ids=tensor([11773], device='cuda:0') mtp accept=1 prop=11773 top1=11773 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.6ms pred gate=device Token # 835: 3.822ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=13476 prop=13476 pred gate=device Token # 836: 114.547ms; value: next_token_ids=tensor([13476], device='cuda:0') mtp accept=1 prop=13476 top1=13476 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.7ms pred gate=device Token # 837: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.948 next=pair draft=13423 prop=13423 pred gate=device Token # 838: 114.660ms; value: next_token_ids=tensor([13423], device='cuda:0') mtp accept=1 prop=13423 top1=13423 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/46.3ms 
pred gate=device Token # 839: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=13489 prop=13489 pred gate=device Token # 840: 114.449ms; value: next_token_ids=tensor([13489], device='cuda:0') mtp accept=1 prop=13489 top1=13489 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.0ms wait=0.1/46.1ms pred gate=device Token # 841: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=13959 prop=13959 pred gate=device Token # 842: 114.728ms; value: next_token_ids=tensor([13959], device='cuda:0') mtp accept=1 prop=13959 top1=13959 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/46.9ms pred gate=device Token # 843: 3.755ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10036 prop=10036 pred gate=device Token # 844: 114.616ms; value: next_token_ids=tensor([10036], device='cuda:0') mtp accept=1 prop=10036 top1=10036 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.1ms wait=0.1/45.5ms pred gate=device Token # 845: 3.778ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=12321 prop=12321 pred gate=device Token # 846: 114.516ms; value: next_token_ids=tensor([12321], device='cuda:0') mtp accept=1 prop=12321 top1=12321 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.4ms s1=189.9ms wait=0.1/45.4ms pred gate=device Token # 847: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=12326 prop=12326 pred gate=device Token # 848: 115.103ms; value: next_token_ids=tensor([12326], device='cuda:0') mtp accept=1 prop=12326 top1=12326 
accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=4.0ms s1=191.4ms wait=0.1/46.2ms pred gate=device Token # 849: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.967 next=pair draft=13636 prop=13636 pred gate=device Token # 850: 114.701ms; value: next_token_ids=tensor([13636], device='cuda:0') mtp accept=1 prop=13636 top1=13636 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.4ms pred gate=device Token # 851: 3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=13441 prop=13441 pred gate=device Token # 852: 114.662ms; value: next_token_ids=tensor([13441], device='cuda:0') mtp accept=1 prop=13441 top1=13441 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/45.8ms pred gate=device Token # 853: 3.728ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.991 next=pair draft=13923 prop=13923 pred gate=device Token # 854: 114.645ms; value: next_token_ids=tensor([13923], device='cuda:0') mtp accept=1 prop=13923 top1=13923 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.5ms pred gate=device Token # 855: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=13822 prop=13822 pred gate=device Token # 856: 114.654ms; value: next_token_ids=tensor([13822], device='cuda:0') mtp accept=1 prop=13822 top1=13822 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/46.4ms pred gate=device Token # 857: 3.731ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.977 next=pair draft=14632 prop=14632 pred 
gate=device Token # 858: 114.764ms; value: next_token_ids=tensor([14632], device='cuda:0') mtp accept=1 prop=14632 top1=14632 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.1ms s1=190.6ms wait=0.1/45.9ms pred gate=device Token # 859: 3.719ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15375 prop=15375 pred gate=device Token # 860: 114.583ms; value: next_token_ids=tensor([15375], device='cuda:0') mtp accept=1 prop=15375 top1=15375 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.3ms gain=83.9ms ratio=0.43 s0=4.8ms s1=188.5ms wait=0.1/45.1ms pred gate=device Token # 861: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=14917 prop=14917 pred gate=device Token # 862: 117.573ms; value: next_token_ids=tensor([14917], device='cuda:0') mtp accept=1 prop=14917 top1=14917 accp=1.000 next=draft=223 prop=223 olap pair=112.4ms serial=199.4ms gain=87.0ms ratio=0.44 s0=4.3ms s1=195.1ms wait=0.1/45.5ms pred gate=device Token # 863: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=9663 prop=9663 pred gate=device Token # 864: 115.253ms; value: next_token_ids=tensor([9663], device='cuda:0') mtp accept=1 prop=9663 top1=9663 accp=1.000 next=draft=201 prop=201 olap pair=110.1ms serial=194.1ms gain=84.0ms ratio=0.43 s0=4.2ms s1=189.9ms wait=0.1/46.4ms pred gate=device Token # 865: 3.711ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=13723 prop=13723 pred gate=device Token # 866: 114.713ms; value: next_token_ids=tensor([13723], device='cuda:0') mtp accept=1 prop=13723 top1=13723 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.5ms pred gate=device Token # 867: 3.738ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=12398 prop=12398 pred gate=device Token # 868: 115.452ms; value: next_token_ids=tensor([12398], device='cuda:0') mtp accept=1 prop=12398 top1=12398 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.8ms gain=85.5ms ratio=0.44 s0=4.1ms s1=191.6ms wait=0.1/46.0ms pred gate=device Token # 869: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=13035 prop=13035 pred gate=device Token # 870: 115.045ms; value: next_token_ids=tensor([13035], device='cuda:0') mtp accept=1 prop=13035 top1=13035 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.2ms gain=85.3ms ratio=0.44 s0=4.1ms s1=191.1ms wait=0.1/46.2ms pred gate=device Token # 871: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=13635 prop=13635 pred gate=device Token # 872: 115.293ms; value: next_token_ids=tensor([13635], device='cuda:0') mtp accept=1 prop=13635 top1=13635 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.3ms gain=85.2ms ratio=0.44 s0=6.4ms s1=189.0ms wait=0.2/43.7ms pred gate=device Token # 873: 3.733ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=12825 prop=12825 pred gate=device Token # 874: 115.037ms; value: next_token_ids=tensor([12825], device='cuda:0') mtp accept=1 prop=12825 top1=12825 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.3ms gain=84.4ms ratio=0.43 s0=7.6ms s1=186.7ms wait=0.2/42.3ms pred gate=device Token # 875: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.978 next=pair draft=15573 prop=15573 pred gate=device Token # 876: 115.029ms; value: next_token_ids=tensor([15573], device='cuda:0') mtp accept=1 prop=15573 top1=15573 accp=1.000 next=draft=223 prop=223 olap 
pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 877: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.601 next=pair draft=15578 prop=15578 pred gate=device Token # 878: 114.931ms; value: next_token_ids=tensor([15578], device='cuda:0') mtp accept=1 prop=15578 top1=15578 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.0ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/46.6ms pred gate=device Token # 879: 3.719ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15539 prop=15539 pred gate=device Token # 880: 114.356ms; value: next_token_ids=tensor([15539], device='cuda:0') mtp accept=1 prop=15539 top1=15539 accp=1.000 next=draft=223 prop=15539 olap pair=109.2ms serial=194.2ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/46.6ms pred gate=device Token # 881: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=15539 top1=223 accp=0.549 next=pair draft=15731 prop=15731 pred gate=device Token # 882: 114.715ms; value: next_token_ids=tensor([15731], device='cuda:0') mtp accept=1 prop=15731 top1=15731 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=5.0ms s1=189.3ms wait=0.2/45.1ms pred gate=device Token # 883: 3.714ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=11342 prop=11342 pred gate=device Token # 884: 114.532ms; value: next_token_ids=tensor([11342], device='cuda:0') mtp accept=1 prop=11342 top1=11342 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.1ms s1=190.1ms wait=0.1/46.1ms pred gate=device Token # 885: 3.757ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=14082 prop=14082 pred gate=device Token # 886: 115.449ms; 
value: next_token_ids=tensor([14082], device='cuda:0') mtp accept=1 prop=14082 top1=14082 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.7ms gain=85.4ms ratio=0.44 s0=4.3ms s1=191.5ms wait=0.1/45.9ms pred gate=device Token # 887: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=14466 prop=14466 pred gate=device Token # 888: 114.928ms; value: next_token_ids=tensor([14466], device='cuda:0') mtp accept=1 prop=14466 top1=14466 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.4ms ratio=0.44 s0=4.1ms s1=191.1ms wait=0.1/46.0ms pred gate=device Token # 889: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15004 prop=15004 pred gate=device Token # 890: 115.519ms; value: next_token_ids=tensor([15004], device='cuda:0') mtp accept=1 prop=15004 top1=15004 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.4ms gain=84.5ms ratio=0.43 s0=6.5ms s1=188.0ms wait=0.2/43.6ms pred gate=device Token # 891: 3.837ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.974 next=pair draft=14456 prop=14456 pred gate=device Token # 892: 114.933ms; value: next_token_ids=tensor([14456], device='cuda:0') mtp accept=1 prop=14456 top1=14456 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.1ms s1=190.7ms wait=0.1/46.1ms pred gate=device Token # 893: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=14843 prop=14843 pred gate=device Token # 894: 115.269ms; value: next_token_ids=tensor([14843], device='cuda:0') mtp accept=1 prop=14843 top1=14843 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.6ms gain=85.5ms ratio=0.44 s0=4.2ms s1=191.4ms wait=0.1/45.9ms pred gate=device Token # 895: 3.764ms; value: next_token_ids=tensor([223], 
device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.938 next=pair draft=16259 prop=16259 pred gate=device Token # 896: 114.609ms; value: next_token_ids=tensor([16259], device='cuda:0') mtp accept=1 prop=16259 top1=16259 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.1ms gain=84.6ms ratio=0.44 s0=4.6ms s1=189.5ms wait=0.2/45.8ms pred gate=device Token # 897: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=16208 prop=16208 pred gate=device Token # 898: 114.434ms; value: next_token_ids=tensor([16208], device='cuda:0') mtp accept=1 prop=16208 top1=16208 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/46.7ms pred gate=device Token # 899: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=15894 prop=15894 pred gate=device Token # 900: 114.806ms; value: next_token_ids=tensor([15894], device='cuda:0') mtp accept=1 prop=15894 top1=15894 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.9ms s1=191.1ms wait=0.1/46.4ms pred gate=device Token # 901: 3.816ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=16146 prop=16146 pred gate=device Token # 902: 114.519ms; value: next_token_ids=tensor([16146], device='cuda:0') mtp accept=1 prop=16146 top1=16146 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.2ms gain=83.8ms ratio=0.43 s0=4.3ms s1=188.8ms wait=0.1/45.9ms pred gate=device Token # 903: 3.828ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.857 next=pair draft=9186 prop=9186 pred gate=device Token # 904: 116.569ms; value: next_token_ids=tensor([9186], device='cuda:0') mtp accept=1 prop=9186 top1=9186 accp=1.000 next=draft=201 prop=201 olap pair=110.6ms serial=193.6ms gain=83.0ms 
ratio=0.43 s0=7.2ms s1=186.4ms wait=0.2/43.0ms pred gate=device Token # 905: 4.758ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=15191 prop=15191 pred gate=device Token # 906: 114.887ms; value: next_token_ids=tensor([15191], device='cuda:0') mtp accept=1 prop=15191 top1=15191 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.0ms gain=83.5ms ratio=0.43 s0=7.8ms s1=185.2ms wait=0.2/42.2ms pred gate=device Token # 907: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15724 prop=15724 pred gate=device Token # 908: 115.222ms; value: next_token_ids=tensor([15724], device='cuda:0') mtp accept=1 prop=15724 top1=15724 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.2ms gain=85.1ms ratio=0.44 s0=6.3ms s1=188.8ms wait=0.2/43.2ms pred gate=device Token # 909: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15659 prop=15659 pred gate=device Token # 910: 115.052ms; value: next_token_ids=tensor([15659], device='cuda:0') mtp accept=1 prop=15659 top1=15659 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.7ms wait=0.1/46.6ms pred gate=device Token # 911: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=15676 prop=15676 pred gate=device Token # 912: 114.938ms; value: next_token_ids=tensor([15676], device='cuda:0') mtp accept=1 prop=15676 top1=15676 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.5ms wait=0.1/46.8ms pred gate=device Token # 913: 3.826ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.962 next=pair draft=14972 prop=14972 pred gate=device Token # 914: 115.319ms; value: next_token_ids=tensor([14972], 
device='cuda:0') mtp accept=1 prop=14972 top1=14972 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=196.0ms gain=85.8ms ratio=0.44 s0=3.9ms s1=192.1ms wait=0.1/46.6ms pred gate=device Token # 915: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.984 next=pair draft=16256 prop=16256 pred gate=device Token # 916: 114.551ms; value: next_token_ids=tensor([16256], device='cuda:0') mtp accept=1 prop=16256 top1=16256 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/46.6ms pred gate=device Token # 917: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15971 prop=15971 pred gate=device Token # 918: 114.676ms; value: next_token_ids=tensor([15971], device='cuda:0') mtp accept=1 prop=15971 top1=15971 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/46.6ms pred gate=device Token # 919: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=16222 prop=16222 pred gate=device Token # 920: 115.890ms; value: next_token_ids=tensor([16222], device='cuda:0') mtp accept=1 prop=16222 top1=16222 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.6ms gain=84.7ms ratio=0.44 s0=6.1ms s1=188.5ms wait=0.2/44.3ms pred gate=device Token # 921: 4.696ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.985 next=pair draft=16368 prop=16368 pred gate=device Token # 922: 115.703ms; value: next_token_ids=tensor([16368], device='cuda:0') mtp accept=1 prop=16368 top1=16368 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=195.2ms gain=84.9ms ratio=0.43 s0=7.3ms s1=187.9ms wait=0.2/42.8ms pred gate=device Token # 923: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 
top1=223 accp=1.000 next=pair draft=6793 prop=6793 pred gate=device Token # 924: 115.458ms; value: next_token_ids=tensor([6793], device='cuda:0') mtp accept=1 prop=6793 top1=6793 accp=1.000 next=draft=201 prop=201 olap pair=110.3ms serial=195.3ms gain=85.0ms ratio=0.44 s0=4.0ms s1=191.4ms wait=0.1/46.4ms pred gate=device Token # 925: 3.770ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.997 next=pair draft=15555 prop=15555 pred gate=device Token # 926: 115.765ms; value: next_token_ids=tensor([15555], device='cuda:0') mtp accept=1 prop=15555 top1=15555 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.7ms gain=84.8ms ratio=0.44 s0=6.0ms s1=188.6ms wait=0.2/44.2ms pred gate=device Token # 927: 4.632ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=14639 prop=14639 pred gate=device Token # 928: 115.278ms; value: next_token_ids=tensor([14639], device='cuda:0') mtp accept=1 prop=14639 top1=14639 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=4.7ms s1=190.4ms wait=0.1/45.5ms pred gate=device Token # 929: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=15637 prop=15637 pred gate=device Token # 930: 115.484ms; value: next_token_ids=tensor([15637], device='cuda:0') mtp accept=1 prop=15637 top1=15637 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=196.0ms gain=85.8ms ratio=0.44 s0=4.1ms s1=192.0ms wait=0.1/46.2ms pred gate=device Token # 931: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.987 next=pair draft=15796 prop=15796 pred gate=device Token # 932: 114.425ms; value: next_token_ids=tensor([15796], device='cuda:0') mtp accept=1 prop=15796 top1=15796 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.2ms 
wait=0.1/46.2ms pred gate=device Token # 933: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.854 next=pair draft=9489 prop=9489 pred gate=device Token # 934: 114.660ms; value: next_token_ids=tensor([9489], device='cuda:0') mtp accept=1 prop=9489 top1=9489 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.4ms s1=190.2ms wait=0.1/45.3ms pred gate=device Token # 935: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=9636 prop=9636 pred gate=device Token # 936: 114.602ms; value: next_token_ids=tensor([9636], device='cuda:0') mtp accept=1 prop=9636 top1=9636 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.4ms s1=190.2ms wait=0.1/45.4ms pred gate=device Token # 937: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.989 next=pair draft=17153 prop=17153 pred gate=device Token # 938: 114.847ms; value: next_token_ids=tensor([17153], device='cuda:0') mtp accept=1 prop=17153 top1=17153 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/46.6ms pred gate=device Token # 939: 3.741ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17015 prop=17015 pred gate=device Token # 940: 114.686ms; value: next_token_ids=tensor([17015], device='cuda:0') mtp accept=1 prop=17015 top1=17015 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.6ms wait=0.1/46.3ms pred gate=device Token # 941: 3.771ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17227 prop=17227 pred gate=device Token # 942: 114.520ms; value: next_token_ids=tensor([17227], device='cuda:0') mtp accept=1 prop=17227 
top1=17227 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.8ms ratio=0.44 s0=5.9ms s1=188.3ms wait=0.2/44.3ms pred gate=device Token # 943: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=13555 prop=13555 pred gate=device Token # 944: 114.715ms; value: next_token_ids=tensor([13555], device='cuda:0') mtp accept=1 prop=13555 top1=13555 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/46.5ms pred gate=device Token # 945: 3.708ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=17384 prop=17384 pred gate=device Token # 946: 114.331ms; value: next_token_ids=tensor([17384], device='cuda:0') mtp accept=1 prop=17384 top1=17384 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.0ms wait=0.1/46.5ms pred gate=device Token # 947: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=16625 prop=16625 pred gate=device Token # 948: 114.842ms; value: next_token_ids=tensor([16625], device='cuda:0') mtp accept=1 prop=16625 top1=16625 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.0ms wait=0.1/46.2ms pred gate=device Token # 949: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.859 next=pair draft=17034 prop=17034 pred gate=device Token # 950: 114.389ms; value: next_token_ids=tensor([17034], device='cuda:0') mtp accept=1 prop=17034 top1=17034 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.8ms s1=189.1ms wait=0.1/44.9ms pred gate=device Token # 951: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=16326 
prop=16326 pred gate=device Token # 952: 114.791ms; value: next_token_ids=tensor([16326], device='cuda:0') mtp accept=1 prop=16326 top1=16326 accp=1.000 next=draft=16326 prop=16326 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.5ms wait=0.1/45.8ms pred gate=device Token # 953: 3.742ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=16326 top1=223 accp=0.088 next=pair draft=16674 prop=16674 pred gate=device Token # 954: 114.973ms; value: next_token_ids=tensor([16674], device='cuda:0') mtp accept=1 prop=16674 top1=16674 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.8ms gain=85.0ms ratio=0.44 s0=5.1ms s1=189.8ms wait=0.1/45.2ms pred gate=device Token # 955: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=17831 prop=17831 pred gate=device Token # 956: 114.662ms; value: next_token_ids=tensor([17831], device='cuda:0') mtp accept=1 prop=17831 top1=17831 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.6ms pred gate=device Token # 957: 3.744ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.992 next=pair draft=17889 prop=17889 pred gate=device Token # 958: 114.701ms; value: next_token_ids=tensor([17889], device='cuda:0') mtp accept=1 prop=17889 top1=17889 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/46.7ms pred gate=device Token # 959: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.994 next=pair draft=17528 prop=17528 pred gate=device Token # 960: 115.361ms; value: next_token_ids=tensor([17528], device='cuda:0') mtp accept=1 prop=17528 top1=17528 accp=1.000 next=draft=17528 prop=17528 olap pair=110.2ms serial=196.2ms gain=86.0ms ratio=0.44 s0=3.7ms s1=192.5ms wait=0.1/46.8ms pred gate=device 
Token # 961: 3.749ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=17528 top1=223 accp=0.262 next=pair draft=17895 prop=17895 pred gate=device Token # 962: 114.552ms; value: next_token_ids=tensor([17895], device='cuda:0') mtp accept=1 prop=17895 top1=17895 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 963: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=13587 prop=13587 pred gate=device Token # 964: 115.008ms; value: next_token_ids=tensor([13587], device='cuda:0') mtp accept=1 prop=13587 top1=13587 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=3.6ms s1=191.8ms wait=0.1/46.8ms pred gate=device Token # 965: 3.736ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=17508 prop=17508 pred gate=device Token # 966: 114.733ms; value: next_token_ids=tensor([17508], device='cuda:0') mtp accept=1 prop=17508 top1=17508 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.7ms ratio=0.44 s0=3.9ms s1=190.5ms wait=0.1/46.5ms pred gate=device Token # 967: 3.718ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=16879 prop=16879 pred gate=device Token # 968: 115.614ms; value: next_token_ids=tensor([16879], device='cuda:0') mtp accept=1 prop=16879 top1=16879 accp=1.000 next=draft=16879 prop=16879 olap pair=110.5ms serial=195.0ms gain=84.5ms ratio=0.43 s0=4.1ms s1=190.8ms wait=0.1/46.4ms pred gate=device Token # 969: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=16879 top1=223 accp=0.431 next=pair draft=17078 prop=17078 pred gate=device Token # 970: 114.764ms; value: next_token_ids=tensor([17078], device='cuda:0') mtp accept=1 prop=17078 top1=17078 accp=1.000 
next=draft=223 prop=223 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.7ms pred gate=device Token # 971: 3.771ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17923 prop=17923 pred gate=device Token # 972: 115.118ms; value: next_token_ids=tensor([17923], device='cuda:0') mtp accept=1 prop=17923 top1=17923 accp=1.000 next=draft=17923 prop=17923 olap pair=110.0ms serial=195.5ms gain=85.5ms ratio=0.44 s0=3.9ms s1=191.6ms wait=0.1/46.5ms pred gate=device Token # 973: 3.771ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=17923 top1=223 accp=0.428 next=pair draft=16245 prop=16245 pred gate=device Token # 974: 114.706ms; value: next_token_ids=tensor([16245], device='cuda:0') mtp accept=1 prop=16245 top1=16245 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.6ms pred gate=device Token # 975: 3.738ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=17893 prop=17893 pred gate=device Token # 976: 115.034ms; value: next_token_ids=tensor([17893], device='cuda:0') mtp accept=1 prop=17893 top1=17893 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.7ms wait=0.1/46.7ms pred gate=device Token # 977: 3.726ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.829 next=pair draft=17255 prop=17255 pred gate=device Token # 978: 114.798ms; value: next_token_ids=tensor([17255], device='cuda:0') mtp accept=1 prop=17255 top1=17255 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/46.8ms pred gate=device Token # 979: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=18369 prop=18369 pred 
gate=device Token # 980: 115.051ms; value: next_token_ids=tensor([18369], device='cuda:0') mtp accept=1 prop=18369 top1=18369 accp=1.000 next=draft=18369 prop=18369 olap pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=4.8ms s1=190.3ms wait=0.1/45.5ms pred gate=device Token # 981: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18369 top1=223 accp=0.060 next=pair draft=18085 prop=18085 pred gate=device Token # 982: 114.945ms; value: next_token_ids=tensor([18085], device='cuda:0') mtp accept=1 prop=18085 top1=18085 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.4ms wait=0.1/46.7ms pred gate=device Token # 983: 3.741ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=13161 prop=13161 pred gate=device Token # 984: 114.699ms; value: next_token_ids=tensor([13161], device='cuda:0') mtp accept=1 prop=13161 top1=13161 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.7ms pred gate=device Token # 985: 3.768ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=17348 prop=17348 pred gate=device Token # 986: 114.692ms; value: next_token_ids=tensor([17348], device='cuda:0') mtp accept=1 prop=17348 top1=17348 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.7ms pred gate=device Token # 987: 3.830ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=17961 prop=17961 pred gate=device Token # 988: 114.603ms; value: next_token_ids=tensor([17961], device='cuda:0') mtp accept=1 prop=17961 top1=17961 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/46.4ms pred gate=device Token # 989: 3.790ms; 
value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.652 next=pair draft=17872 prop=17872 pred gate=device Token # 990: 114.857ms; value: next_token_ids=tensor([17872], device='cuda:0') mtp accept=1 prop=17872 top1=17872 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.2ms gain=84.5ms ratio=0.44 s0=4.1ms s1=190.1ms wait=0.1/46.4ms pred gate=device Token # 991: 3.834ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=18460 prop=18460 pred gate=device Token # 992: 115.175ms; value: next_token_ids=tensor([18460], device='cuda:0') mtp accept=1 prop=18460 top1=18460 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.6ms wait=0.1/46.6ms pred gate=device Token # 993: 3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.990 next=pair draft=17257 prop=17257 pred gate=device Token # 994: 115.428ms; value: next_token_ids=tensor([17257], device='cuda:0') mtp accept=1 prop=17257 top1=17257 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.2ms ratio=0.44 s0=3.9ms s1=191.0ms wait=0.1/46.5ms pred gate=device Token # 995: 4.830ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.974 next=pair draft=18614 prop=18614 pred gate=device Token # 996: 115.199ms; value: next_token_ids=tensor([18614], device='cuda:0') mtp accept=1 prop=18614 top1=18614 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.6ms gain=84.8ms ratio=0.44 s0=5.0ms s1=189.6ms wait=0.1/45.3ms pred gate=device Token # 997: 3.986ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.769 next=pair draft=18754 prop=18754 pred gate=device Token # 998: 115.261ms; value: next_token_ids=tensor([18754], device='cuda:0') mtp accept=1 prop=18754 top1=18754 accp=1.000 next=draft=223 prop=223 olap 
pair=110.1ms serial=195.7ms gain=85.7ms ratio=0.44 s0=3.9ms s1=191.9ms wait=0.1/46.6ms pred gate=device Token # 999: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=17391 prop=17391 pred gate=device Token # 1000: 115.365ms; value: next_token_ids=tensor([17391], device='cuda:0') mtp accept=1 prop=17391 top1=17391 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.5ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.6ms wait=0.1/46.5ms pred gate=device Token # 1001: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.529 next=pair draft=18767 prop=18767 pred gate=device Token # 1002: 114.810ms; value: next_token_ids=tensor([18767], device='cuda:0') mtp accept=1 prop=18767 top1=18767 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.8ms pred gate=device Token # 1003: 3.820ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=16234 prop=16234 pred gate=device Token # 1004: 114.745ms; value: next_token_ids=tensor([16234], device='cuda:0') mtp accept=1 prop=16234 top1=16234 accp=1.000 next=draft=201 prop=201 olap pair=109.6ms serial=194.3ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.3ms wait=0.1/46.5ms pred gate=device Token # 1005: 3.758ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=17979 prop=17979 pred gate=device Token # 1006: 115.293ms; value: next_token_ids=tensor([17979], device='cuda:0') mtp accept=1 prop=17979 top1=17979 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.2ms gain=85.2ms ratio=0.44 s0=4.0ms s1=191.2ms wait=0.1/46.4ms pred gate=device Token # 1007: 3.888ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18307 prop=18307 pred gate=device Token # 1008: 
114.554ms; value: next_token_ids=tensor([18307], device='cuda:0') mtp accept=1 prop=18307 top1=18307 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/46.7ms pred gate=device Token # 1009: 3.856ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.989 next=pair draft=17928 prop=17928 pred gate=device Token # 1010: 115.092ms; value: next_token_ids=tensor([17928], device='cuda:0') mtp accept=1 prop=17928 top1=17928 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.6ms gain=85.6ms ratio=0.44 s0=3.8ms s1=191.8ms wait=0.1/46.8ms pred gate=device Token # 1011: 3.766ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=18894 prop=18894 pred gate=device Token # 1012: 114.678ms; value: next_token_ids=tensor([18894], device='cuda:0') mtp accept=1 prop=18894 top1=18894 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/46.7ms pred gate=device Token # 1013: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=18528 prop=18528 pred gate=device Token # 1014: 114.892ms; value: next_token_ids=tensor([18528], device='cuda:0') mtp accept=1 prop=18528 top1=18528 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.2ms gain=84.6ms ratio=0.44 s0=4.3ms s1=190.0ms wait=0.1/45.9ms pred gate=device Token # 1015: 3.792ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.966 next=pair draft=18743 prop=18743 pred gate=device Token # 1016: 114.462ms; value: next_token_ids=tensor([18743], device='cuda:0') mtp accept=1 prop=18743 top1=18743 accp=1.000 next=draft=18743 prop=18743 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/46.9ms pred gate=device Token # 1017: 3.818ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18743 top1=223 accp=0.139 next=pair draft=18731 prop=18731 pred gate=device Token # 1018: 115.374ms; value: next_token_ids=tensor([18731], device='cuda:0') mtp accept=1 prop=18731 top1=18731 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.9ms gain=85.7ms ratio=0.44 s0=3.8ms s1=192.0ms wait=0.1/46.7ms pred gate=device Token # 1019: 3.731ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18755 prop=18755 pred gate=device Token # 1020: 114.983ms; value: next_token_ids=tensor([18755], device='cuda:0') mtp accept=1 prop=18755 top1=18755 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.4ms gain=84.6ms ratio=0.43 s0=4.0ms s1=190.5ms wait=0.1/46.5ms pred gate=device Token # 1021: 3.804ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=17685 prop=17685 pred gate=device Token # 1022: 114.805ms; value: next_token_ids=tensor([17685], device='cuda:0') mtp accept=1 prop=17685 top1=17685 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.5ms gain=84.8ms ratio=0.44 s0=4.8ms s1=189.6ms wait=0.1/45.7ms pred gate=device Token # 1023: 3.750ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=4314 prop=4314 pred gate=device Token # 1024: 114.671ms; value: next_token_ids=tensor([4314], device='cuda:0') mtp accept=1 prop=4314 top1=4314 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/46.9ms pred gate=device Token # 1025: 3.760ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.978 next=pair draft=14882 prop=14882 pred gate=device Token # 1026: 115.249ms; value: next_token_ids=tensor([14882], device='cuda:0') mtp accept=1 prop=14882 top1=14882 accp=1.000 next=draft=223 prop=223 olap 
pair=110.1ms serial=194.7ms gain=84.6ms ratio=0.43 s0=4.1ms s1=190.6ms wait=0.1/46.4ms pred gate=device Token # 1027: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15836 prop=15836 pred gate=device Token # 1028: 115.065ms; value: next_token_ids=tensor([15836], device='cuda:0') mtp accept=1 prop=15836 top1=15836 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.4ms gain=85.5ms ratio=0.44 s0=4.4ms s1=191.0ms wait=0.1/45.4ms pred gate=device Token # 1029: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15458 prop=15458 pred gate=device Token # 1030: 115.009ms; value: next_token_ids=tensor([15458], device='cuda:0') mtp accept=1 prop=15458 top1=15458 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=4.3ms s1=191.1ms wait=0.1/45.8ms pred gate=device Token # 1031: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=15525 prop=15525 pred gate=device Token # 1032: 114.794ms; value: next_token_ids=tensor([15525], device='cuda:0') mtp accept=1 prop=15525 top1=15525 accp=1.000 next=draft=15525 prop=15525 olap pair=109.7ms serial=195.0ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.8ms pred gate=device Token # 1033: 3.789ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=15525 top1=223 accp=0.061 next=pair draft=16553 prop=16553 pred gate=device Token # 1034: 115.050ms; value: next_token_ids=tensor([16553], device='cuda:0') mtp accept=1 prop=16553 top1=16553 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.3ms ratio=0.44 s0=3.9ms s1=191.3ms wait=0.1/46.6ms pred gate=device Token # 1035: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=17622 prop=17622 pred gate=device Token # 1036: 
115.209ms; value: next_token_ids=tensor([17622], device='cuda:0') mtp accept=1 prop=17622 top1=17622 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.7ms gain=85.6ms ratio=0.44 s0=3.8ms s1=191.8ms wait=0.1/46.7ms pred gate=device Token # 1037: 3.746ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.976 next=pair draft=17804 prop=17804 pred gate=device Token # 1038: 114.542ms; value: next_token_ids=tensor([17804], device='cuda:0') mtp accept=1 prop=17804 top1=17804 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.5ms wait=0.1/46.6ms pred gate=device Token # 1039: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17527 prop=17527 pred gate=device Token # 1040: 115.311ms; value: next_token_ids=tensor([17527], device='cuda:0') mtp accept=1 prop=17527 top1=17527 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=194.2ms gain=84.0ms ratio=0.43 s0=4.2ms s1=190.0ms wait=0.1/46.3ms pred gate=device Token # 1041: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.907 next=pair draft=18086 prop=18086 pred gate=device Token # 1042: 114.803ms; value: next_token_ids=tensor([18086], device='cuda:0') mtp accept=1 prop=18086 top1=18086 accp=1.000 next=draft=223 prop=18086 olap pair=109.7ms serial=194.3ms gain=84.6ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/45.8ms pred gate=device Token # 1043: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18086 top1=223 accp=0.548 next=pair draft=14666 prop=14666 pred gate=device Token # 1044: 116.529ms; value: next_token_ids=tensor([14666], device='cuda:0') mtp accept=1 prop=14666 top1=14666 accp=1.000 next=draft=201 prop=201 olap pair=110.6ms serial=194.3ms gain=83.7ms ratio=0.43 s0=8.0ms s1=186.4ms wait=0.2/42.2ms pred gate=device Token # 1045: 3.814ms; value: 
next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=16909 prop=16909 pred gate=device Token # 1046: 114.863ms; value: next_token_ids=tensor([16909], device='cuda:0') mtp accept=1 prop=16909 top1=16909 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.4ms gain=84.7ms ratio=0.44 s0=6.4ms s1=188.0ms wait=0.2/43.9ms pred gate=device Token # 1047: 3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=16020 prop=16020 pred gate=device Token # 1048: 115.181ms; value: next_token_ids=tensor([16020], device='cuda:0') mtp accept=1 prop=16020 top1=16020 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.4ms gain=85.4ms ratio=0.44 s0=4.3ms s1=191.1ms wait=0.1/45.7ms pred gate=device Token # 1049: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17298 prop=17298 pred gate=device Token # 1050: 115.038ms; value: next_token_ids=tensor([17298], device='cuda:0') mtp accept=1 prop=17298 top1=17298 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=4.2ms s1=191.2ms wait=0.1/46.0ms pred gate=device Token # 1051: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18321 prop=18321 pred gate=device Token # 1052: 114.862ms; value: next_token_ids=tensor([18321], device='cuda:0') mtp accept=1 prop=18321 top1=18321 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.7ms pred gate=device Token # 1053: 3.717ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.989 next=pair draft=17070 prop=17070 pred gate=device Token # 1054: 114.890ms; value: next_token_ids=tensor([17070], device='cuda:0') mtp accept=1 prop=17070 top1=17070 accp=1.000 next=draft=223 prop=223 
olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.9ms s1=191.0ms wait=0.1/46.7ms pred gate=device Token # 1055: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17998 prop=17998 pred gate=device Token # 1056: 114.944ms; value: next_token_ids=tensor([17998], device='cuda:0') mtp accept=1 prop=17998 top1=17998 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/46.7ms pred gate=device Token # 1057: 3.733ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=18573 prop=18573 pred gate=device Token # 1058: 115.012ms; value: next_token_ids=tensor([18573], device='cuda:0') mtp accept=1 prop=18573 top1=18573 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.6ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.5ms wait=0.1/46.3ms pred gate=device Token # 1059: 3.755ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=19083 prop=19083 pred gate=device Token # 1060: 114.782ms; value: next_token_ids=tensor([19083], device='cuda:0') mtp accept=1 prop=19083 top1=19083 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/46.9ms pred gate=device Token # 1061: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18862 prop=18862 pred gate=device Token # 1062: 114.188ms; value: next_token_ids=tensor([18862], device='cuda:0') mtp accept=1 prop=18862 top1=18862 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.7ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.0ms wait=0.1/47.0ms pred gate=device Token # 1063: 3.815ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=12715 prop=12715 pred gate=device Token # 1064: 
114.445ms; value: next_token_ids=tensor([12715], device='cuda:0') mtp accept=1 prop=12715 top1=12715 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/46.6ms pred gate=device Token # 1065: 3.756ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=17561 prop=17561 pred gate=device Token # 1066: 114.651ms; value: next_token_ids=tensor([17561], device='cuda:0') mtp accept=1 prop=17561 top1=17561 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.3ms gain=83.8ms ratio=0.43 s0=4.4ms s1=188.9ms wait=0.1/46.0ms pred gate=device Token # 1067: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.702 next=pair draft=18462 prop=18462 pred gate=device Token # 1068: 114.225ms; value: next_token_ids=tensor([18462], device='cuda:0') mtp accept=1 prop=18462 top1=18462 accp=1.000 next=draft=18462 prop=18462 olap pair=109.1ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/46.6ms pred gate=device Token # 1069: 3.839ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18462 top1=223 accp=0.018 next=pair draft=18033 prop=18033 pred gate=device Token # 1070: 114.639ms; value: next_token_ids=tensor([18033], device='cuda:0') mtp accept=1 prop=18033 top1=18033 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.7ms gain=85.3ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/46.8ms pred gate=device Token # 1071: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=18339 prop=18339 pred gate=device Token # 1072: 114.370ms; value: next_token_ids=tensor([18339], device='cuda:0') mtp accept=1 prop=18339 top1=18339 accp=1.000 next=draft=223 prop=18339 olap pair=109.2ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.7ms pred gate=device Token # 1073: 3.789ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18339 top1=223 accp=0.902 next=pair draft=17332 prop=17332 pred gate=device Token # 1074: 115.400ms; value: next_token_ids=tensor([17332], device='cuda:0') mtp accept=1 prop=17332 top1=17332 accp=1.000 next=draft=17332 prop=17332 olap pair=110.0ms serial=195.1ms gain=85.1ms ratio=0.44 s0=4.6ms s1=190.5ms wait=0.1/45.6ms pred gate=device Token # 1075: 3.768ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=17332 top1=223 accp=0.063 next=pair draft=19853 prop=19853 pred gate=device Token # 1076: 114.795ms; value: next_token_ids=tensor([19853], device='cuda:0') mtp accept=1 prop=19853 top1=19853 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.2ms ratio=0.44 s0=4.4ms s1=190.4ms wait=0.1/45.5ms pred gate=device Token # 1077: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19352 prop=19352 pred gate=device Token # 1078: 114.485ms; value: next_token_ids=tensor([19352], device='cuda:0') mtp accept=1 prop=19352 top1=19352 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.6ms gain=84.3ms ratio=0.44 s0=4.3ms s1=189.3ms wait=0.1/45.9ms pred gate=device Token # 1079: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.941 next=pair draft=19222 prop=19222 pred gate=device Token # 1080: 115.138ms; value: next_token_ids=tensor([19222], device='cuda:0') mtp accept=1 prop=19222 top1=19222 accp=1.000 next=draft=19222 prop=19222 olap pair=109.9ms serial=194.6ms gain=84.7ms ratio=0.44 s0=7.5ms s1=187.1ms wait=0.2/42.6ms pred gate=device Token # 1081: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=19222 top1=223 accp=0.175 next=pair draft=19978 prop=19978 pred gate=device Token # 1082: 115.035ms; value: next_token_ids=tensor([19978], device='cuda:0') mtp accept=1 prop=19978 top1=19978 accp=1.000 
next=draft=223 prop=223 olap pair=109.8ms serial=194.0ms gain=84.2ms ratio=0.43 s0=7.7ms s1=186.3ms wait=0.2/42.3ms pred gate=device Token # 1083: 3.820ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=14461 prop=14461 pred gate=device Token # 1084: 114.882ms; value: next_token_ids=tensor([14461], device='cuda:0') mtp accept=1 prop=14461 top1=14461 accp=1.000 next=draft=201 prop=201 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/46.9ms pred gate=device Token # 1085: 3.848ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=17922 prop=17922 pred gate=device Token # 1086: 114.547ms; value: next_token_ids=tensor([17922], device='cuda:0') mtp accept=1 prop=17922 top1=17922 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 1087: 3.812ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.767 next=pair draft=18822 prop=18822 pred gate=device Token # 1088: 114.524ms; value: next_token_ids=tensor([18822], device='cuda:0') mtp accept=1 prop=18822 top1=18822 accp=1.000 next=draft=223 prop=18822 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 1089: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18822 top1=223 accp=0.576 next=pair draft=11722 prop=11722 pred gate=device Token # 1090: 114.886ms; value: next_token_ids=tensor([11722], device='cuda:0') mtp accept=1 prop=11722 top1=11722 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=195.1ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.3ms wait=0.1/46.6ms pred gate=device Token # 1091: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19669 prop=19669 
pred gate=device Token # 1092: 114.345ms; value: next_token_ids=tensor([19669], device='cuda:0') mtp accept=1 prop=19669 top1=19669 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.2ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.7ms pred gate=device Token # 1093: 3.834ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.907 next=pair draft=19204 prop=19204 pred gate=device Token # 1094: 114.262ms; value: next_token_ids=tensor([19204], device='cuda:0') mtp accept=1 prop=19204 top1=19204 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.9ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/46.9ms pred gate=device Token # 1095: 3.817ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=18930 prop=18930 pred gate=device Token # 1096: 115.146ms; value: next_token_ids=tensor([18930], device='cuda:0') mtp accept=1 prop=18930 top1=18930 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=194.7ms gain=84.7ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/46.9ms pred gate=device Token # 1097: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.991 next=pair draft=19819 prop=19819 pred gate=device Token # 1098: 114.512ms; value: next_token_ids=tensor([19819], device='cuda:0') mtp accept=1 prop=19819 top1=19819 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.5ms wait=0.1/46.6ms pred gate=device Token # 1099: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.954 next=pair draft=18335 prop=18335 pred gate=device Token # 1100: 114.266ms; value: next_token_ids=tensor([18335], device='cuda:0') mtp accept=1 prop=18335 top1=18335 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=194.1ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.4ms wait=0.1/47.0ms pred gate=device Token # 1101: 
3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.647 next=pair draft=13506 prop=13506 pred gate=device Token # 1102: 114.403ms; value: next_token_ids=tensor([13506], device='cuda:0') mtp accept=1 prop=13506 top1=13506 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.8ms pred gate=device Token # 1103: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.974 next=pair draft=16275 prop=16275 pred gate=device Token # 1104: 114.544ms; value: next_token_ids=tensor([16275], device='cuda:0') mtp accept=1 prop=16275 top1=16275 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.7ms wait=0.1/46.2ms pred gate=device Token # 1105: 3.733ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.977 next=pair draft=19700 prop=19700 pred gate=device Token # 1106: 114.542ms; value: next_token_ids=tensor([19700], device='cuda:0') mtp accept=1 prop=19700 top1=19700 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/46.8ms pred gate=device Token # 1107: 3.755ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19642 prop=19642 pred gate=device Token # 1108: 114.273ms; value: next_token_ids=tensor([19642], device='cuda:0') mtp accept=1 prop=19642 top1=19642 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.3ms wait=0.1/46.9ms pred gate=device Token # 1109: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.853 next=pair draft=19575 prop=19575 pred gate=device Token # 1110: 114.716ms; value: next_token_ids=tensor([19575], device='cuda:0') mtp accept=1 prop=19575 top1=19575 accp=1.000 
next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/46.9ms pred gate=device Token # 1111: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19533 prop=19533 pred gate=device Token # 1112: 114.606ms; value: next_token_ids=tensor([19533], device='cuda:0') mtp accept=1 prop=19533 top1=19533 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=5.2ms s1=189.1ms wait=0.1/45.4ms pred gate=device Token # 1113: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18014 prop=18014 pred gate=device Token # 1114: 114.574ms; value: next_token_ids=tensor([18014], device='cuda:0') mtp accept=1 prop=18014 top1=18014 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/47.0ms pred gate=device Token # 1115: 3.818ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.759 next=pair draft=20583 prop=20583 pred gate=device Token # 1116: 114.768ms; value: next_token_ids=tensor([20583], device='cuda:0') mtp accept=1 prop=20583 top1=20583 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/46.9ms pred gate=device Token # 1117: 3.733ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20845 prop=20845 pred gate=device Token # 1118: 114.735ms; value: next_token_ids=tensor([20845], device='cuda:0') mtp accept=1 prop=20845 top1=20845 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/47.0ms pred gate=device Token # 1119: 3.689ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20634 prop=20634 pred 
gate=device Token # 1120: 114.573ms; value: next_token_ids=tensor([20634], device='cuda:0') mtp accept=1 prop=20634 top1=20634 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.8ms gain=84.4ms ratio=0.44 s0=5.8ms s1=188.0ms wait=0.2/44.5ms pred gate=device Token # 1121: 3.697ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20964 prop=20964 pred gate=device Token # 1122: 115.254ms; value: next_token_ids=tensor([20964], device='cuda:0') mtp accept=1 prop=20964 top1=20964 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.7ms gain=85.6ms ratio=0.44 s0=3.9ms s1=191.8ms wait=0.1/46.5ms pred gate=device Token # 1123: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=10996 prop=10996 pred gate=device Token # 1124: 114.728ms; value: next_token_ids=tensor([10996], device='cuda:0') mtp accept=1 prop=10996 top1=10996 accp=1.000 next=draft=201 prop=201 olap pair=109.6ms serial=194.2ms gain=84.6ms ratio=0.44 s0=5.5ms s1=188.6ms wait=0.2/44.8ms pred gate=device Token # 1125: 3.760ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=19615 prop=19615 pred gate=device Token # 1126: 114.616ms; value: next_token_ids=tensor([19615], device='cuda:0') mtp accept=1 prop=19615 top1=19615 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/46.7ms pred gate=device Token # 1127: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19357 prop=19357 pred gate=device Token # 1128: 115.046ms; value: next_token_ids=tensor([19357], device='cuda:0') mtp accept=1 prop=19357 top1=19357 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.4ms wait=0.1/46.9ms pred gate=device Token # 1129: 
3.753ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19966 prop=19966 pred gate=device Token # 1130: 115.001ms; value: next_token_ids=tensor([19966], device='cuda:0') mtp accept=1 prop=19966 top1=19966 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.4ms ratio=0.44 s0=4.0ms s1=191.2ms wait=0.1/46.5ms pred gate=device Token # 1131: 3.708ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20475 prop=20475 pred gate=device Token # 1132: 114.546ms; value: next_token_ids=tensor([20475], device='cuda:0') mtp accept=1 prop=20475 top1=20475 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.8ms pred gate=device Token # 1133: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19663 prop=19663 pred gate=device Token # 1134: 115.401ms; value: next_token_ids=tensor([19663], device='cuda:0') mtp accept=1 prop=19663 top1=19663 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.3ms s1=191.6ms wait=0.1/45.7ms pred gate=device Token # 1135: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20512 prop=20512 pred gate=device Token # 1136: 114.846ms; value: next_token_ids=tensor([20512], device='cuda:0') mtp accept=1 prop=20512 top1=20512 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.5ms wait=0.1/45.9ms pred gate=device Token # 1137: 3.732ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20530 prop=20530 pred gate=device Token # 1138: 114.648ms; value: next_token_ids=tensor([20530], device='cuda:0') mtp accept=1 prop=20530 top1=20530 accp=1.000 
next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.2ms wait=0.1/46.1ms pred gate=device Token # 1139: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20892 prop=20892 pred gate=device Token # 1140: 114.513ms; value: next_token_ids=tensor([20892], device='cuda:0') mtp accept=1 prop=20892 top1=20892 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/46.8ms pred gate=device Token # 1141: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21453 prop=21453 pred gate=device Token # 1142: 114.995ms; value: next_token_ids=tensor([21453], device='cuda:0') mtp accept=1 prop=21453 top1=21453 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.2ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.5ms wait=0.1/46.9ms pred gate=device Token # 1143: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.990 next=pair draft=10758 prop=10758 pred gate=device Token # 1144: 114.479ms; value: next_token_ids=tensor([10758], device='cuda:0') mtp accept=1 prop=10758 top1=10758 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/47.1ms pred gate=device Token # 1145: 3.639ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=20192 prop=20192 pred gate=device Token # 1146: 115.000ms; value: next_token_ids=tensor([20192], device='cuda:0') mtp accept=1 prop=20192 top1=20192 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.3ms gain=84.4ms ratio=0.43 s0=4.0ms s1=190.3ms wait=0.1/46.8ms pred gate=device Token # 1147: 3.731ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21010 prop=21010 pred 
gate=device Token # 1148: 115.092ms; value: next_token_ids=tensor([21010], device='cuda:0') mtp accept=1 prop=21010 top1=21010 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.4ms gain=84.5ms ratio=0.43 s0=3.9ms s1=190.5ms wait=0.1/46.6ms pred gate=device Token # 1149: 3.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=20260 prop=20260 pred gate=device Token # 1150: 114.196ms; value: next_token_ids=tensor([20260], device='cuda:0') mtp accept=1 prop=20260 top1=20260 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.7ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.0ms wait=0.1/46.9ms pred gate=device Token # 1151: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.933 next=pair draft=20021 prop=20021 pred gate=device Token # 1152: 114.497ms; value: next_token_ids=tensor([20021], device='cuda:0') mtp accept=1 prop=20021 top1=20021 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.2ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 1153: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15551 prop=15551 pred gate=device Token # 1154: 115.203ms; value: next_token_ids=tensor([15551], device='cuda:0') mtp accept=1 prop=15551 top1=15551 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.9ms gain=85.9ms ratio=0.44 s0=4.1ms s1=191.8ms wait=0.1/46.3ms pred gate=device Token # 1155: 3.784ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=21154 prop=21154 pred gate=device Token # 1156: 114.332ms; value: next_token_ids=tensor([21154], device='cuda:0') mtp accept=1 prop=21154 top1=21154 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.1ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/46.8ms pred gate=device Token # 1157: 
3.794ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.799 next=pair draft=21398 prop=21398 pred gate=device Token # 1158: 114.747ms; value: next_token_ids=tensor([21398], device='cuda:0') mtp accept=1 prop=21398 top1=21398 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/46.8ms pred gate=device Token # 1159: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=21127 prop=21127 pred gate=device Token # 1160: 114.674ms; value: next_token_ids=tensor([21127], device='cuda:0') mtp accept=1 prop=21127 top1=21127 accp=1.000 next=draft=21127 prop=21127 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=5.6ms s1=188.6ms wait=0.2/44.8ms pred gate=device Token # 1161: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21127 top1=223 accp=0.454 next=pair draft=21445 prop=21445 pred gate=device Token # 1162: 114.756ms; value: next_token_ids=tensor([21445], device='cuda:0') mtp accept=1 prop=21445 top1=21445 accp=1.000 next=draft=21445 prop=21445 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.7ms s1=190.1ms wait=0.1/45.7ms pred gate=device Token # 1163: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21445 top1=223 accp=0.119 next=pair draft=17565 prop=17565 pred gate=device Token # 1164: 114.273ms; value: next_token_ids=tensor([17565], device='cuda:0') mtp accept=1 prop=17565 top1=17565 accp=1.000 next=draft=201 prop=201 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=5.4ms s1=188.2ms wait=0.2/45.1ms pred gate=device Token # 1165: 3.661ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=20411 prop=20411 pred gate=device Token # 1166: 115.184ms; value: next_token_ids=tensor([20411], device='cuda:0') mtp accept=1 prop=20411 top1=20411 accp=1.000 
next=draft=223 prop=223 olap pair=110.0ms serial=195.8ms gain=85.8ms ratio=0.44 s0=4.1ms s1=191.7ms wait=0.1/46.5ms pred gate=device Token # 1167: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21422 prop=21422 pred gate=device Token # 1168: 114.890ms; value: next_token_ids=tensor([21422], device='cuda:0') mtp accept=1 prop=21422 top1=21422 accp=1.000 next=draft=21422 prop=21422 olap pair=109.7ms serial=195.2ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.4ms wait=0.1/46.9ms pred gate=device Token # 1169: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21422 top1=223 accp=0.345 next=pair draft=20578 prop=20578 pred gate=device Token # 1170: 114.661ms; value: next_token_ids=tensor([20578], device='cuda:0') mtp accept=1 prop=20578 top1=20578 accp=1.000 next=draft=20578 prop=20578 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/47.1ms pred gate=device Token # 1171: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=20578 top1=223 accp=0.273 next=pair draft=21237 prop=21237 pred gate=device Token # 1172: 114.434ms; value: next_token_ids=tensor([21237], device='cuda:0') mtp accept=1 prop=21237 top1=21237 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/46.9ms pred gate=device Token # 1173: 3.825ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.987 next=pair draft=16919 prop=16919 pred gate=device Token # 1174: 114.550ms; value: next_token_ids=tensor([16919], device='cuda:0') mtp accept=1 prop=16919 top1=16919 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/47.1ms pred gate=device Token # 1175: 3.796ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20980 
prop=20980 pred gate=device Token # 1176: 114.676ms; value: next_token_ids=tensor([20980], device='cuda:0') mtp accept=1 prop=20980 top1=20980 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.2ms wait=0.1/46.7ms pred gate=device Token # 1177: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20628 prop=20628 pred gate=device Token # 1178: 114.744ms; value: next_token_ids=tensor([20628], device='cuda:0') mtp accept=1 prop=20628 top1=20628 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.2ms gain=84.6ms ratio=0.44 s0=4.0ms s1=190.2ms wait=0.1/46.7ms pred gate=device Token # 1179: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.981 next=pair draft=21796 prop=21796 pred gate=device Token # 1180: 115.522ms; value: next_token_ids=tensor([21796], device='cuda:0') mtp accept=1 prop=21796 top1=21796 accp=1.000 next=draft=223 prop=223 olap pair=110.4ms serial=196.3ms gain=85.9ms ratio=0.44 s0=4.1ms s1=192.2ms wait=0.1/46.1ms pred gate=device Token # 1181: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=22329 prop=22329 pred gate=device Token # 1182: 114.655ms; value: next_token_ids=tensor([22329], device='cuda:0') mtp accept=1 prop=22329 top1=22329 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.8ms pred gate=device Token # 1183: 3.846ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=17419 prop=17419 pred gate=device Token # 1184: 114.166ms; value: next_token_ids=tensor([17419], device='cuda:0') mtp accept=1 prop=17419 top1=17419 accp=1.000 next=draft=201 prop=201 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.7ms wait=0.1/46.7ms pred gate=device 
Token # 1185: 3.750ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=21821 prop=21821 pred gate=device Token # 1186: 114.754ms; value: next_token_ids=tensor([21821], device='cuda:0') mtp accept=1 prop=21821 top1=21821 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/46.9ms pred gate=device Token # 1187: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21899 prop=21899 pred gate=device Token # 1188: 114.479ms; value: next_token_ids=tensor([21899], device='cuda:0') mtp accept=1 prop=21899 top1=21899 accp=1.000 next=draft=21899 prop=21899 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.5ms wait=0.1/46.9ms pred gate=device Token # 1189: 3.752ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21899 top1=223 accp=0.070 next=pair draft=21712 prop=21712 pred gate=device Token # 1190: 115.741ms; value: next_token_ids=tensor([21712], device='cuda:0') mtp accept=1 prop=21712 top1=21712 accp=1.000 next=draft=21712 prop=21712 olap pair=110.6ms serial=195.9ms gain=85.4ms ratio=0.44 s0=3.9ms s1=192.1ms wait=0.1/46.7ms pred gate=device Token # 1191: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21712 top1=223 accp=0.011 next=pair draft=17986 prop=17986 pred gate=device Token # 1192: 114.729ms; value: next_token_ids=tensor([17986], device='cuda:0') mtp accept=1 prop=17986 top1=17986 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/46.5ms pred gate=device Token # 1193: 3.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.792 next=pair draft=21466 prop=21466 pred gate=device Token # 1194: 114.368ms; value: next_token_ids=tensor([21466], device='cuda:0') mtp accept=1 prop=21466 
top1=21466 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.3ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.5ms wait=0.1/46.9ms pred gate=device Token # 1195: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.989 next=pair draft=21214 prop=21214 pred gate=device Token # 1196: 114.494ms; value: next_token_ids=tensor([21214], device='cuda:0') mtp accept=1 prop=21214 top1=21214 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 1197: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22476 prop=22476 pred gate=device Token # 1198: 114.908ms; value: next_token_ids=tensor([22476], device='cuda:0') mtp accept=1 prop=22476 top1=22476 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.3ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.6ms wait=0.1/47.1ms pred gate=device Token # 1199: 3.726ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21895 prop=21895 pred gate=device Token # 1200: 114.416ms; value: next_token_ids=tensor([21895], device='cuda:0') mtp accept=1 prop=21895 top1=21895 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/47.0ms pred gate=device Token # 1201: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22297 prop=22297 pred gate=device Token # 1202: 114.854ms; value: next_token_ids=tensor([22297], device='cuda:0') mtp accept=1 prop=22297 top1=22297 accp=1.000 next=draft=22297 prop=22297 olap pair=109.7ms serial=195.2ms gain=85.4ms ratio=0.44 s0=3.7ms s1=191.5ms wait=0.1/47.1ms pred gate=device Token # 1203: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=22297 top1=223 accp=0.157 next=pair 
draft=19124 prop=19124 pred gate=device Token # 1204: 114.705ms; value: next_token_ids=tensor([19124], device='cuda:0') mtp accept=1 prop=19124 top1=19124 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.7ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/47.2ms pred gate=device Token # 1205: 3.665ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23005 prop=23005 pred gate=device Token # 1206: 114.120ms; value: next_token_ids=tensor([23005], device='cuda:0') mtp accept=1 prop=23005 top1=23005 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.7ms gain=84.7ms ratio=0.44 s0=4.0ms s1=189.7ms wait=0.1/46.6ms pred gate=device Token # 1207: 3.835ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.941 next=pair draft=21817 prop=21817 pred gate=device Token # 1208: 114.822ms; value: next_token_ids=tensor([21817], device='cuda:0') mtp accept=1 prop=21817 top1=21817 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.6ms wait=0.1/46.0ms pred gate=device Token # 1209: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.661 next=pair draft=21625 prop=21625 pred gate=device Token # 1210: 115.046ms; value: next_token_ids=tensor([21625], device='cuda:0') mtp accept=1 prop=21625 top1=21625 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.9ms s1=191.3ms wait=0.1/46.7ms pred gate=device Token # 1211: 3.739ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23317 prop=23317 pred gate=device Token # 1212: 114.666ms; value: next_token_ids=tensor([23317], device='cuda:0') mtp accept=1 prop=23317 top1=23317 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/46.4ms pred 
gate=device Token # 1213: 3.733ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=21577 prop=21577 pred gate=device Token # 1214: 114.970ms; value: next_token_ids=tensor([21577], device='cuda:0') mtp accept=1 prop=21577 top1=21577 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/45.7ms pred gate=device Token # 1215: 3.670ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21834 prop=21834 pred gate=device Token # 1216: 114.614ms; value: next_token_ids=tensor([21834], device='cuda:0') mtp accept=1 prop=21834 top1=21834 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.5ms wait=0.1/46.5ms pred gate=device Token # 1217: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23123 prop=23123 pred gate=device Token # 1218: 114.585ms; value: next_token_ids=tensor([23123], device='cuda:0') mtp accept=1 prop=23123 top1=23123 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/46.9ms pred gate=device Token # 1219: 3.838ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.906 next=pair draft=23427 prop=23427 pred gate=device Token # 1220: 115.139ms; value: next_token_ids=tensor([23427], device='cuda:0') mtp accept=1 prop=23427 top1=23427 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.3ms gain=85.3ms ratio=0.44 s0=4.2ms s1=191.2ms wait=0.1/46.0ms pred gate=device Token # 1221: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=20375 prop=20375 pred gate=device Token # 1222: 115.285ms; value: next_token_ids=tensor([20375], device='cuda:0') mtp accept=1 prop=20375 
top1=20375 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.5ms gain=85.4ms ratio=0.44 s0=4.3ms s1=191.3ms wait=0.1/45.8ms pred gate=device Token # 1223: 3.824ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=5126 prop=5126 pred gate=device Token # 1224: 115.789ms; value: next_token_ids=tensor([5126], device='cuda:0') mtp accept=1 prop=5126 top1=5126 accp=1.000 next=draft=201 prop=201 olap pair=110.3ms serial=195.7ms gain=85.5ms ratio=0.44 s0=4.1ms s1=191.6ms wait=0.1/46.3ms pred gate=device Token # 1225: 3.779ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=15482 prop=15482 pred gate=device Token # 1226: 115.249ms; value: next_token_ids=tensor([15482], device='cuda:0') mtp accept=1 prop=15482 top1=15482 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.3ms gain=84.6ms ratio=0.44 s0=4.8ms s1=189.5ms wait=0.1/45.3ms pred gate=device Token # 1227: 3.824ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18572 prop=18572 pred gate=device Token # 1228: 114.388ms; value: next_token_ids=tensor([18572], device='cuda:0') mtp accept=1 prop=18572 top1=18572 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.6ms gain=84.4ms ratio=0.44 s0=4.3ms s1=189.3ms wait=0.1/45.8ms pred gate=device Token # 1229: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19355 prop=19355 pred gate=device Token # 1230: 115.100ms; value: next_token_ids=tensor([19355], device='cuda:0') mtp accept=1 prop=19355 top1=19355 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.1ms gain=85.2ms ratio=0.44 s0=4.2ms s1=190.8ms wait=0.1/45.9ms pred gate=device Token # 1231: 3.750ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=15613 
prop=15613 pred gate=device Token # 1232: 114.912ms; value: next_token_ids=tensor([15613], device='cuda:0') mtp accept=1 prop=15613 top1=15613 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.9ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/45.7ms pred gate=device Token # 1233: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19839 prop=19839 pred gate=device Token # 1234: 114.748ms; value: next_token_ids=tensor([19839], device='cuda:0') mtp accept=1 prop=19839 top1=19839 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.8ms ratio=0.44 s0=4.2ms s1=190.1ms wait=0.1/45.8ms pred gate=device Token # 1235: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20544 prop=20544 pred gate=device Token # 1236: 114.587ms; value: next_token_ids=tensor([20544], device='cuda:0') mtp accept=1 prop=20544 top1=20544 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/45.9ms pred gate=device Token # 1237: 3.753ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21189 prop=21189 pred gate=device Token # 1238: 114.590ms; value: next_token_ids=tensor([21189], device='cuda:0') mtp accept=1 prop=21189 top1=21189 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/45.7ms pred gate=device Token # 1239: 3.834ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20890 prop=20890 pred gate=device Token # 1240: 114.675ms; value: next_token_ids=tensor([20890], device='cuda:0') mtp accept=1 prop=20890 top1=20890 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.7ms wait=0.1/46.5ms pred gate=device 
Token # 1241: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20282 prop=20282 pred gate=device Token # 1242: 114.699ms; value: next_token_ids=tensor([20282], device='cuda:0') mtp accept=1 prop=20282 top1=20282 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/46.0ms pred gate=device Token # 1243: 3.817ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=17726 prop=17726 pred gate=device Token # 1244: 114.696ms; value: next_token_ids=tensor([17726], device='cuda:0') mtp accept=1 prop=17726 top1=17726 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.0ms s1=190.8ms wait=0.1/46.6ms pred gate=device Token # 1245: 3.713ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=20731 prop=20731 pred gate=device Token # 1246: 114.548ms; value: next_token_ids=tensor([20731], device='cuda:0') mtp accept=1 prop=20731 top1=20731 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/47.0ms pred gate=device Token # 1247: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.896 next=pair draft=20982 prop=20982 pred gate=device Token # 1248: 115.332ms; value: next_token_ids=tensor([20982], device='cuda:0') mtp accept=1 prop=20982 top1=20982 accp=1.000 next=draft=20982 prop=223 olap pair=110.2ms serial=195.9ms gain=85.8ms ratio=0.44 s0=3.9ms s1=192.1ms wait=0.1/46.4ms pred gate=device Token # 1249: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.345 next=pair draft=21143 prop=21143 pred gate=device Token # 1250: 116.594ms; value: next_token_ids=tensor([21143], device='cuda:0') mtp accept=1 prop=21143 top1=21143 
accp=1.000 next=draft=223 prop=223 olap pair=111.4ms serial=197.8ms gain=86.4ms ratio=0.44 s0=4.5ms s1=193.3ms wait=0.1/45.5ms pred gate=device Token # 1251: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=20763 prop=20763 pred gate=device Token # 1252: 114.790ms; value: next_token_ids=tensor([20763], device='cuda:0') mtp accept=1 prop=20763 top1=20763 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.5ms gain=84.9ms ratio=0.44 s0=5.7ms s1=188.8ms wait=0.2/44.7ms pred gate=device Token # 1253: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.973 next=pair draft=19453 prop=19453 pred gate=device Token # 1254: 114.455ms; value: next_token_ids=tensor([19453], device='cuda:0') mtp accept=1 prop=19453 top1=19453 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/47.0ms pred gate=device Token # 1255: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20447 prop=20447 pred gate=device Token # 1256: 114.714ms; value: next_token_ids=tensor([20447], device='cuda:0') mtp accept=1 prop=20447 top1=20447 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.8ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/47.1ms pred gate=device Token # 1257: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22309 prop=22309 pred gate=device Token # 1258: 115.102ms; value: next_token_ids=tensor([22309], device='cuda:0') mtp accept=1 prop=22309 top1=22309 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.5ms gain=85.5ms ratio=0.44 s0=4.1ms s1=191.4ms wait=0.1/46.1ms pred gate=device Token # 1259: 3.799ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=24024 
prop=24024 pred gate=device Token # 1260: 114.705ms; value: next_token_ids=tensor([24024], device='cuda:0') mtp accept=1 prop=24024 top1=24024 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.5ms s1=189.9ms wait=0.1/45.4ms pred gate=device Token # 1261: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.967 next=pair draft=22903 prop=22903 pred gate=device Token # 1262: 114.888ms; value: next_token_ids=tensor([22903], device='cuda:0') mtp accept=1 prop=22903 top1=22903 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.1ms wait=0.1/46.6ms pred gate=device Token # 1263: 3.746ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.990 next=pair draft=16046 prop=16046 pred gate=device Token # 1264: 114.514ms; value: next_token_ids=tensor([16046], device='cuda:0') mtp accept=1 prop=16046 top1=16046 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.1ms wait=0.1/46.4ms pred gate=device Token # 1265: 3.744ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.840 next=pair draft=21365 prop=21365 pred gate=device Token # 1266: 114.603ms; value: next_token_ids=tensor([21365], device='cuda:0') mtp accept=1 prop=21365 top1=21365 accp=1.000 next=draft=223 prop=21365 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/47.2ms pred gate=device Token # 1267: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21365 top1=223 accp=0.801 next=pair draft=21605 prop=21605 pred gate=device Token # 1268: 114.972ms; value: next_token_ids=tensor([21605], device='cuda:0') mtp accept=1 prop=21605 top1=21605 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.8ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/46.8ms pred gate=device 
Token # 1269: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.945 next=pair draft=22307 prop=22307 pred gate=device Token # 1270: 114.656ms; value: next_token_ids=tensor([22307], device='cuda:0') mtp accept=1 prop=22307 top1=22307 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.8ms wait=0.1/46.5ms pred gate=device Token # 1271: 3.771ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.932 next=pair draft=21932 prop=21932 pred gate=device Token # 1272: 114.589ms; value: next_token_ids=tensor([21932], device='cuda:0') mtp accept=1 prop=21932 top1=21932 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.4ms gain=84.0ms ratio=0.43 s0=4.3ms s1=189.0ms wait=0.1/46.5ms pred gate=device Token # 1273: 3.740ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.984 next=pair draft=20088 prop=20088 pred gate=device Token # 1274: 114.578ms; value: next_token_ids=tensor([20088], device='cuda:0') mtp accept=1 prop=20088 top1=20088 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/46.5ms pred gate=device Token # 1275: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23559 prop=23559 pred gate=device Token # 1276: 114.638ms; value: next_token_ids=tensor([23559], device='cuda:0') mtp accept=1 prop=23559 top1=23559 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/47.6ms pred gate=device Token # 1277: 3.691ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23103 prop=23103 pred gate=device Token # 1278: 114.788ms; value: next_token_ids=tensor([23103], device='cuda:0') mtp accept=1 prop=23103 top1=23103 
accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/47.5ms pred gate=device Token # 1279: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21810 prop=21810 pred gate=device Token # 1280: 115.382ms; value: next_token_ids=tensor([21810], device='cuda:0') mtp accept=1 prop=21810 top1=21810 accp=1.000 next=draft=21810 prop=21810 olap pair=110.2ms serial=196.0ms gain=85.7ms ratio=0.44 s0=3.8ms s1=192.2ms wait=0.1/47.7ms pred gate=device Token # 1281: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21810 top1=223 accp=0.072 next=pair draft=22627 prop=22627 pred gate=device Token # 1282: 115.428ms; value: next_token_ids=tensor([22627], device='cuda:0') mtp accept=1 prop=22627 top1=22627 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=3.8ms s1=192.0ms wait=0.1/47.7ms pred gate=device Token # 1283: 3.751ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=18736 prop=18736 pred gate=device Token # 1284: 115.026ms; value: next_token_ids=tensor([18736], device='cuda:0') mtp accept=1 prop=18736 top1=18736 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=193.2ms gain=83.4ms ratio=0.43 s0=4.3ms s1=188.9ms wait=0.1/47.0ms pred gate=device Token # 1285: 3.708ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.886 next=pair draft=21835 prop=21835 pred gate=device Token # 1286: 115.145ms; value: next_token_ids=tensor([21835], device='cuda:0') mtp accept=1 prop=21835 top1=21835 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=193.5ms gain=83.5ms ratio=0.43 s0=6.2ms s1=187.2ms wait=0.2/44.9ms pred gate=device Token # 1287: 3.796ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.972 next=pair draft=20208 
prop=20208 pred gate=device Token # 1288: 116.209ms; value: next_token_ids=tensor([20208], device='cuda:0') mtp accept=1 prop=20208 top1=20208 accp=1.000 next=draft=20208 prop=20208 olap pair=110.3ms serial=193.4ms gain=83.1ms ratio=0.43 s0=4.6ms s1=188.8ms wait=0.1/46.9ms pred gate=device Token # 1289: 4.488ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=20208 top1=223 accp=0.004 next=pair draft=21726 prop=21726 pred gate=device Token # 1290: 115.165ms; value: next_token_ids=tensor([21726], device='cuda:0') mtp accept=1 prop=21726 top1=21726 accp=1.000 next=draft=223 prop=21726 olap pair=109.9ms serial=194.9ms gain=85.0ms ratio=0.44 s0=5.2ms s1=189.7ms wait=0.1/45.9ms pred gate=device Token # 1291: 3.737ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21726 top1=223 accp=0.739 next=pair draft=22741 prop=22741 pred gate=device Token # 1292: 115.273ms; value: next_token_ids=tensor([22741], device='cuda:0') mtp accept=1 prop=22741 top1=22741 accp=1.000 next=draft=22741 prop=22741 olap pair=110.1ms serial=193.8ms gain=83.7ms ratio=0.43 s0=4.2ms s1=189.6ms wait=0.1/47.3ms pred gate=device Token # 1293: 3.716ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=22741 top1=223 accp=0.228 next=pair draft=21391 prop=21391 pred gate=device Token # 1294: 114.897ms; value: next_token_ids=tensor([21391], device='cuda:0') mtp accept=1 prop=21391 top1=21391 accp=1.000 next=draft=223 prop=21391 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/47.7ms pred gate=device Token # 1295: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=21391 top1=223 accp=0.979 next=pair draft=23587 prop=23587 pred gate=device Token # 1296: 114.830ms; value: next_token_ids=tensor([23587], device='cuda:0') mtp accept=1 prop=23587 top1=23587 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.7ms wait=0.1/47.4ms 
pred gate=device Token # 1297: 3.756ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=23207 prop=23207 pred gate=device Token # 1298: 114.761ms; value: next_token_ids=tensor([23207], device='cuda:0') mtp accept=1 prop=23207 top1=23207 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.0ms gain=84.5ms ratio=0.44 s0=3.9ms s1=190.1ms wait=0.1/47.6ms pred gate=device Token # 1299: 3.829ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.994 next=pair draft=24594 prop=24594 pred gate=device Token # 1300: 115.025ms; value: next_token_ids=tensor([24594], device='cuda:0') mtp accept=1 prop=24594 top1=24594 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/47.6ms pred gate=device Token # 1301: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23760 prop=23760 pred gate=device Token # 1302: 115.968ms; value: next_token_ids=tensor([23760], device='cuda:0') mtp accept=1 prop=23760 top1=23760 accp=1.000 next=draft=23760 prop=23760 olap pair=110.3ms serial=195.6ms gain=85.3ms ratio=0.44 s0=4.6ms s1=191.1ms wait=0.1/46.9ms pred gate=device Token # 1303: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=23760 top1=223 accp=0.007 next=pair draft=17847 prop=17847 pred gate=device Token # 1304: 115.026ms; value: next_token_ids=tensor([17847], device='cuda:0') mtp accept=1 prop=17847 top1=17847 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=195.0ms gain=85.2ms ratio=0.44 s0=4.7ms s1=190.3ms wait=0.1/46.7ms pred gate=device Token # 1305: 3.670ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.737 next=pair draft=21873 prop=21873 pred gate=device Token # 1306: 116.057ms; value: next_token_ids=tensor([21873], device='cuda:0') mtp accept=1 
prop=21873 top1=21873 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=194.4ms gain=84.3ms ratio=0.43 s0=7.9ms s1=186.5ms wait=0.2/43.3ms pred gate=device Token # 1307: 4.610ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23384 prop=23384 pred gate=device Token # 1308: 115.851ms; value: next_token_ids=tensor([23384], device='cuda:0') mtp accept=1 prop=23384 top1=23384 accp=1.000 next=draft=23384 prop=23384 olap pair=110.4ms serial=195.4ms gain=85.0ms ratio=0.44 s0=6.5ms s1=188.9ms wait=0.2/44.7ms pred gate=device Token # 1309: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=23384 top1=223 accp=0.270 next=pair draft=22354 prop=22354 pred gate=device Token # 1310: 115.258ms; value: next_token_ids=tensor([22354], device='cuda:0') mtp accept=1 prop=22354 top1=22354 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=194.2ms gain=84.1ms ratio=0.43 s0=5.9ms s1=188.3ms wait=0.2/45.5ms pred gate=device Token # 1311: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.722 next=pair draft=18966 prop=18966 pred gate=device Token # 1312: 114.708ms; value: next_token_ids=tensor([18966], device='cuda:0') mtp accept=1 prop=18966 top1=18966 accp=1.000 next=draft=18966 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.3ms wait=0.1/47.2ms pred gate=device Token # 1313: 3.875ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.259 next=pair draft=23121 prop=23121 pred gate=device Token # 1314: 116.281ms; value: next_token_ids=tensor([23121], device='cuda:0') mtp accept=1 prop=23121 top1=23121 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.2ms gain=84.8ms ratio=0.43 s0=5.5ms s1=189.7ms wait=0.2/46.0ms pred gate=device Token # 1315: 4.830ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 
next=pair draft=24332 prop=24332 pred gate=device Token # 1316: 116.121ms; value: next_token_ids=tensor([24332], device='cuda:0') mtp accept=1 prop=24332 top1=24332 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=194.8ms gain=84.8ms ratio=0.44 s0=5.4ms s1=189.5ms wait=0.1/46.2ms pred gate=device Token # 1317: 4.695ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24391 prop=24391 pred gate=device Token # 1318: 114.911ms; value: next_token_ids=tensor([24391], device='cuda:0') mtp accept=1 prop=24391 top1=24391 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.9ms ratio=0.44 s0=5.1ms s1=189.3ms wait=0.1/46.4ms pred gate=device Token # 1319: 3.865ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22719 prop=22719 pred gate=device Token # 1320: 115.000ms; value: next_token_ids=tensor([22719], device='cuda:0') mtp accept=1 prop=22719 top1=22719 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/47.8ms pred gate=device Token # 1321: 3.842ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.984 next=pair draft=24234 prop=24234 pred gate=device Token # 1322: 114.727ms; value: next_token_ids=tensor([24234], device='cuda:0') mtp accept=1 prop=24234 top1=24234 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/47.8ms pred gate=device Token # 1323: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=12747 prop=12747 pred gate=device Token # 1324: 115.216ms; value: next_token_ids=tensor([12747], device='cuda:0') mtp accept=1 prop=12747 top1=12747 accp=1.000 next=draft=201 prop=12747 olap pair=110.0ms serial=194.8ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.8ms 
wait=0.1/47.8ms pred gate=device Token # 1325: 3.797ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=0 prop=12747 top1=201 accp=0.519 next=pair draft=23036 prop=23036 pred gate=device Token # 1326: 114.775ms; value: next_token_ids=tensor([23036], device='cuda:0') mtp accept=1 prop=23036 top1=23036 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/47.5ms pred gate=device Token # 1327: 3.877ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23275 prop=23275 pred gate=device Token # 1328: 114.819ms; value: next_token_ids=tensor([23275], device='cuda:0') mtp accept=1 prop=23275 top1=23275 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/47.8ms pred gate=device Token # 1329: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.568 next=pair draft=23152 prop=23152 pred gate=device Token # 1330: 115.127ms; value: next_token_ids=tensor([23152], device='cuda:0') mtp accept=1 prop=23152 top1=23152 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.4ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.6ms wait=0.1/47.8ms pred gate=device Token # 1331: 3.831ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.937 next=pair draft=23130 prop=23130 pred gate=device Token # 1332: 114.621ms; value: next_token_ids=tensor([23130], device='cuda:0') mtp accept=1 prop=23130 top1=23130 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/47.9ms pred gate=device Token # 1333: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=22957 prop=22957 pred gate=device Token # 1334: 114.734ms; value: next_token_ids=tensor([22957], device='cuda:0') mtp 
accept=1 prop=22957 top1=22957 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/47.4ms pred gate=device Token # 1335: 3.821ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18009 prop=18009 pred gate=device Token # 1336: 115.054ms; value: next_token_ids=tensor([18009], device='cuda:0') mtp accept=1 prop=18009 top1=18009 accp=1.000 next=draft=18009 prop=18009 olap pair=109.9ms serial=194.7ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/48.1ms pred gate=device Token # 1337: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=18009 top1=223 accp=0.169 next=pair draft=23369 prop=23369 pred gate=device Token # 1338: 114.714ms; value: next_token_ids=tensor([23369], device='cuda:0') mtp accept=1 prop=23369 top1=23369 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.6ms gain=85.2ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/48.0ms pred gate=device Token # 1339: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24896 prop=24896 pred gate=device Token # 1340: 117.264ms; value: next_token_ids=tensor([24896], device='cuda:0') mtp accept=1 prop=24896 top1=24896 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.6ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/47.3ms pred gate=device Token # 1341: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=24717 prop=24717 pred gate=device Token # 1342: 114.757ms; value: next_token_ids=tensor([24717], device='cuda:0') mtp accept=1 prop=24717 top1=24717 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.2ms wait=0.1/47.2ms pred gate=device Token # 1343: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 
accp=1.000 next=pair draft=19864 prop=19864 pred gate=device Token # 1344: 115.957ms; value: next_token_ids=tensor([19864], device='cuda:0') mtp accept=1 prop=19864 top1=19864 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=194.3ms gain=84.4ms ratio=0.43 s0=6.0ms s1=188.4ms wait=0.2/45.6ms pred gate=device Token # 1345: 4.655ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23864 prop=23864 pred gate=device Token # 1346: 115.384ms; value: next_token_ids=tensor([23864], device='cuda:0') mtp accept=1 prop=23864 top1=23864 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=192.7ms gain=83.4ms ratio=0.43 s0=8.3ms s1=184.4ms wait=0.2/43.5ms pred gate=device Token # 1347: 4.721ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=24330 prop=24330 pred gate=device Token # 1348: 114.889ms; value: next_token_ids=tensor([24330], device='cuda:0') mtp accept=1 prop=24330 top1=24330 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.8ms gain=84.3ms ratio=0.43 s0=5.1ms s1=188.8ms wait=0.1/46.8ms pred gate=device Token # 1349: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=24316 prop=24316 pred gate=device Token # 1350: 115.045ms; value: next_token_ids=tensor([24316], device='cuda:0') mtp accept=1 prop=24316 top1=24316 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.8ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.9ms wait=0.1/48.0ms pred gate=device Token # 1351: 3.784ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.573 next=pair draft=23811 prop=23811 pred gate=device Token # 1352: 114.595ms; value: next_token_ids=tensor([23811], device='cuda:0') mtp accept=1 prop=23811 top1=23811 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.6ms gain=84.1ms ratio=0.43 s0=6.6ms s1=187.0ms 
wait=0.2/45.0ms pred gate=device Token # 1353: 3.826ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.969 next=pair draft=23516 prop=23516 pred gate=device Token # 1354: 115.095ms; value: next_token_ids=tensor([23516], device='cuda:0') mtp accept=1 prop=23516 top1=23516 accp=1.000 next=draft=23516 prop=223 olap pair=109.9ms serial=194.1ms gain=84.2ms ratio=0.43 s0=4.1ms s1=190.0ms wait=0.1/47.5ms pred gate=device Token # 1355: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.256 next=pair draft=24724 prop=24724 pred gate=device Token # 1356: 114.762ms; value: next_token_ids=tensor([24724], device='cuda:0') mtp accept=1 prop=24724 top1=24724 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.7ms gain=84.0ms ratio=0.43 s0=4.0ms s1=189.6ms wait=0.1/48.0ms pred gate=device Token # 1357: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24842 prop=24842 pred gate=device Token # 1358: 115.747ms; value: next_token_ids=tensor([24842], device='cuda:0') mtp accept=1 prop=24842 top1=24842 accp=1.000 next=draft=24842 prop=24842 olap pair=109.8ms serial=193.7ms gain=83.9ms ratio=0.43 s0=8.7ms s1=185.1ms wait=0.2/42.5ms pred gate=device Token # 1359: 4.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24842 top1=223 accp=0.006 next=pair draft=24496 prop=24496 pred gate=device Token # 1360: 114.725ms; value: next_token_ids=tensor([24496], device='cuda:0') mtp accept=1 prop=24496 top1=24496 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.1ms gain=83.7ms ratio=0.43 s0=8.5ms s1=184.6ms wait=0.2/43.0ms pred gate=device Token # 1361: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.948 next=pair draft=24962 prop=24962 pred gate=device Token # 1362: 114.480ms; value: next_token_ids=tensor([24962], device='cuda:0') 
mtp accept=1 prop=24962 top1=24962 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/48.2ms pred gate=device Token # 1363: 3.820ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.914 next=pair draft=21170 prop=21170 pred gate=device Token # 1364: 114.474ms; value: next_token_ids=tensor([21170], device='cuda:0') mtp accept=1 prop=21170 top1=21170 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.3ms wait=0.1/48.1ms pred gate=device Token # 1365: 3.725ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23568 prop=23568 pred gate=device Token # 1366: 115.376ms; value: next_token_ids=tensor([23568], device='cuda:0') mtp accept=1 prop=23568 top1=23568 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.1ms gain=84.8ms ratio=0.43 s0=3.8ms s1=191.3ms wait=0.1/48.0ms pred gate=device Token # 1367: 3.831ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.576 next=pair draft=23348 prop=23348 pred gate=device Token # 1368: 114.316ms; value: next_token_ids=tensor([23348], device='cuda:0') mtp accept=1 prop=23348 top1=23348 accp=1.000 next=draft=23348 prop=23348 olap pair=109.1ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/48.1ms pred gate=device Token # 1369: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=23348 top1=223 accp=0.112 next=pair draft=24518 prop=24518 pred gate=device Token # 1370: 114.875ms; value: next_token_ids=tensor([24518], device='cuda:0') mtp accept=1 prop=24518 top1=24518 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/48.0ms pred gate=device Token # 1371: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 
accp=1.000 next=pair draft=24751 prop=24751 pred gate=device Token # 1372: 115.347ms; value: next_token_ids=tensor([24751], device='cuda:0') mtp accept=1 prop=24751 top1=24751 accp=1.000 next=draft=223 prop=24751 olap pair=110.2ms serial=195.9ms gain=85.8ms ratio=0.44 s0=3.8ms s1=192.2ms wait=0.1/48.1ms pred gate=device Token # 1373: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24751 top1=223 accp=0.943 next=pair draft=22451 prop=22451 pred gate=device Token # 1374: 114.836ms; value: next_token_ids=tensor([22451], device='cuda:0') mtp accept=1 prop=22451 top1=22451 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/48.3ms pred gate=device Token # 1375: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24211 prop=24211 pred gate=device Token # 1376: 114.516ms; value: next_token_ids=tensor([24211], device='cuda:0') mtp accept=1 prop=24211 top1=24211 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.3ms pred gate=device Token # 1377: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=25125 prop=25125 pred gate=device Token # 1378: 114.619ms; value: next_token_ids=tensor([25125], device='cuda:0') mtp accept=1 prop=25125 top1=25125 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/48.2ms pred gate=device Token # 1379: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=26300 prop=26300 pred gate=device Token # 1380: 114.617ms; value: next_token_ids=tensor([26300], device='cuda:0') mtp accept=1 prop=26300 top1=26300 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms 
wait=0.1/48.2ms pred gate=device Token # 1381: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.962 next=pair draft=25881 prop=25881 pred gate=device Token # 1382: 114.592ms; value: next_token_ids=tensor([25881], device='cuda:0') mtp accept=1 prop=25881 top1=25881 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.5ms wait=0.1/48.2ms pred gate=device Token # 1383: 3.813ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=16704 prop=16704 pred gate=device Token # 1384: 114.696ms; value: next_token_ids=tensor([16704], device='cuda:0') mtp accept=1 prop=16704 top1=16704 accp=1.000 next=draft=201 prop=201 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/48.1ms pred gate=device Token # 1385: 3.760ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.947 next=pair draft=24243 prop=24243 pred gate=device Token # 1386: 114.542ms; value: next_token_ids=tensor([24243], device='cuda:0') mtp accept=1 prop=24243 top1=24243 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.0ms pred gate=device Token # 1387: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24980 prop=24980 pred gate=device Token # 1388: 114.782ms; value: next_token_ids=tensor([24980], device='cuda:0') mtp accept=1 prop=24980 top1=24980 accp=1.000 next=draft=24980 prop=24980 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/48.0ms pred gate=device Token # 1389: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24980 top1=223 accp=0.040 next=pair draft=24865 prop=24865 pred gate=device Token # 1390: 114.509ms; value: next_token_ids=tensor([24865], device='cuda:0') mtp 
accept=1 prop=24865 top1=24865 accp=1.000 next=draft=24865 prop=24865 olap pair=109.3ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.0ms pred gate=device Token # 1391: 3.792ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24865 top1=223 accp=0.145 next=pair draft=24944 prop=24944 pred gate=device Token # 1392: 114.950ms; value: next_token_ids=tensor([24944], device='cuda:0') mtp accept=1 prop=24944 top1=24944 accp=1.000 next=draft=24944 prop=223 olap pair=109.7ms serial=194.6ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/47.8ms pred gate=device Token # 1393: 3.734ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.398 next=pair draft=23774 prop=23774 pred gate=device Token # 1394: 114.864ms; value: next_token_ids=tensor([23774], device='cuda:0') mtp accept=1 prop=23774 top1=23774 accp=1.000 next=draft=23774 prop=23774 olap pair=109.7ms serial=194.2ms gain=84.6ms ratio=0.44 s0=6.9ms s1=187.4ms wait=0.2/44.7ms pred gate=device Token # 1395: 3.774ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=23774 top1=223 accp=0.358 next=pair draft=24893 prop=24893 pred gate=device Token # 1396: 115.276ms; value: next_token_ids=tensor([24893], device='cuda:0') mtp accept=1 prop=24893 top1=24893 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=194.4ms gain=84.3ms ratio=0.43 s0=4.0ms s1=190.4ms wait=0.1/47.8ms pred gate=device Token # 1397: 3.815ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.943 next=pair draft=25268 prop=25268 pred gate=device Token # 1398: 115.051ms; value: next_token_ids=tensor([25268], device='cuda:0') mtp accept=1 prop=25268 top1=25268 accp=1.000 next=draft=25268 prop=25268 olap pair=109.9ms serial=193.7ms gain=83.8ms ratio=0.43 s0=4.1ms s1=189.6ms wait=0.1/47.9ms pred gate=device Token # 1399: 3.818ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 
prop=25268 top1=223 accp=0.492 next=pair draft=23885 prop=23885 pred gate=device Token # 1400: 116.371ms; value: next_token_ids=tensor([23885], device='cuda:0') mtp accept=1 prop=23885 top1=23885 accp=1.000 next=draft=23885 prop=23885 olap pair=110.4ms serial=194.8ms gain=84.4ms ratio=0.43 s0=4.6ms s1=190.2ms wait=0.1/47.5ms pred gate=device Token # 1401: 4.696ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=23885 top1=223 accp=0.001 next=pair draft=25786 prop=25786 pred gate=device Token # 1402: 114.604ms; value: next_token_ids=tensor([25786], device='cuda:0') mtp accept=1 prop=25786 top1=25786 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.2ms gain=83.9ms ratio=0.43 s0=7.1ms s1=186.0ms wait=0.2/44.2ms pred gate=device Token # 1403: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22463 prop=22463 pred gate=device Token # 1404: 115.160ms; value: next_token_ids=tensor([22463], device='cuda:0') mtp accept=1 prop=22463 top1=22463 accp=1.000 next=draft=201 prop=201 olap pair=110.0ms serial=194.0ms gain=83.9ms ratio=0.43 s0=4.1ms s1=189.8ms wait=0.1/47.8ms pred gate=device Token # 1405: 3.753ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=25122 prop=25122 pred gate=device Token # 1406: 115.536ms; value: next_token_ids=tensor([25122], device='cuda:0') mtp accept=1 prop=25122 top1=25122 accp=1.000 next=draft=223 prop=25122 olap pair=109.5ms serial=194.1ms gain=84.6ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/48.1ms pred gate=device Token # 1407: 4.280ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25122 top1=223 accp=0.780 next=pair draft=25880 prop=25880 pred gate=device Token # 1408: 115.469ms; value: next_token_ids=tensor([25880], device='cuda:0') mtp accept=1 prop=25880 top1=25880 accp=1.000 next=draft=25880 prop=25880 olap pair=110.3ms serial=194.9ms gain=84.6ms 
ratio=0.43 s0=4.1ms s1=190.8ms wait=0.1/47.9ms pred gate=device Token # 1409: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25880 top1=223 accp=0.460 next=pair draft=26042 prop=26042 pred gate=device Token # 1410: 116.476ms; value: next_token_ids=tensor([26042], device='cuda:0') mtp accept=1 prop=26042 top1=26042 accp=1.000 next=draft=223 prop=223 olap pair=111.2ms serial=195.6ms gain=84.4ms ratio=0.43 s0=4.1ms s1=191.5ms wait=0.1/47.7ms pred gate=device Token # 1411: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25819 prop=25819 pred gate=device Token # 1412: 114.657ms; value: next_token_ids=tensor([25819], device='cuda:0') mtp accept=1 prop=25819 top1=25819 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/48.3ms pred gate=device Token # 1413: 3.727ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24052 prop=24052 pred gate=device Token # 1414: 114.290ms; value: next_token_ids=tensor([24052], device='cuda:0') mtp accept=1 prop=24052 top1=24052 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.7ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.8ms wait=0.1/48.1ms pred gate=device Token # 1415: 3.739ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23931 prop=23931 pred gate=device Token # 1416: 116.040ms; value: next_token_ids=tensor([23931], device='cuda:0') mtp accept=1 prop=23931 top1=23931 accp=1.000 next=draft=223 prop=223 olap pair=110.9ms serial=196.0ms gain=85.1ms ratio=0.43 s0=4.1ms s1=191.8ms wait=0.1/47.8ms pred gate=device Token # 1417: 3.769ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.981 next=pair draft=25703 prop=25703 pred gate=device Token # 1418: 114.820ms; value: 
next_token_ids=tensor([25703], device='cuda:0') mtp accept=1 prop=25703 top1=25703 accp=1.000 next=draft=25703 prop=25703 olap pair=109.7ms serial=194.1ms gain=84.4ms ratio=0.44 s0=4.0ms s1=190.1ms wait=0.1/47.9ms pred gate=device Token # 1419: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25703 top1=223 accp=0.000 next=pair draft=26192 prop=26192 pred gate=device Token # 1420: 115.936ms; value: next_token_ids=tensor([26192], device='cuda:0') mtp accept=1 prop=26192 top1=26192 accp=1.000 next=draft=26192 prop=26192 olap pair=109.9ms serial=193.6ms gain=83.7ms ratio=0.43 s0=5.8ms s1=187.9ms wait=0.2/46.1ms pred gate=device Token # 1421: 4.706ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26192 top1=223 accp=0.040 next=pair draft=21772 prop=21772 pred gate=device Token # 1422: 115.001ms; value: next_token_ids=tensor([21772], device='cuda:0') mtp accept=1 prop=21772 top1=21772 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.1ms gain=84.5ms ratio=0.44 s0=5.4ms s1=188.8ms wait=0.2/46.5ms pred gate=device Token # 1423: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.909 next=pair draft=3712 prop=3712 pred gate=device Token # 1424: 115.003ms; value: next_token_ids=tensor([3712], device='cuda:0') mtp accept=1 prop=3712 top1=3712 accp=1.000 next=draft=201 prop=201 olap pair=109.0ms serial=193.0ms gain=84.0ms ratio=0.44 s0=5.2ms s1=187.8ms wait=0.1/46.6ms pred gate=device Token # 1425: 4.062ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=19097 prop=19097 pred gate=device Token # 1426: 116.861ms; value: next_token_ids=tensor([19097], device='cuda:0') mtp accept=1 prop=19097 top1=19097 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.5ms gain=84.0ms ratio=0.43 s0=4.0ms s1=189.5ms wait=0.1/48.1ms pred gate=device Token # 1427: 3.775ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=21344 prop=21344 pred gate=device Token # 1428: 114.555ms; value: next_token_ids=tensor([21344], device='cuda:0') mtp accept=1 prop=21344 top1=21344 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.7ms gain=84.3ms ratio=0.44 s0=4.3ms s1=189.4ms wait=0.1/47.4ms pred gate=device Token # 1429: 3.839ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=21336 prop=21336 pred gate=device Token # 1430: 114.604ms; value: next_token_ids=tensor([21336], device='cuda:0') mtp accept=1 prop=21336 top1=21336 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/48.3ms pred gate=device Token # 1431: 3.856ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=20996 prop=20996 pred gate=device Token # 1432: 114.205ms; value: next_token_ids=tensor([20996], device='cuda:0') mtp accept=1 prop=20996 top1=20996 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/48.4ms pred gate=device Token # 1433: 3.846ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.917 next=pair draft=22192 prop=22192 pred gate=device Token # 1434: 114.908ms; value: next_token_ids=tensor([22192], device='cuda:0') mtp accept=1 prop=22192 top1=22192 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.4ms wait=0.1/48.3ms pred gate=device Token # 1435: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=23486 prop=23486 pred gate=device Token # 1436: 116.535ms; value: next_token_ids=tensor([23486], device='cuda:0') mtp accept=1 prop=23486 top1=23486 accp=1.000 next=draft=223 prop=223 
olap pair=110.6ms serial=196.4ms gain=85.8ms ratio=0.44 s0=4.0ms s1=192.4ms wait=0.1/48.1ms pred gate=device Token # 1437: 4.709ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23975 prop=23975 pred gate=device Token # 1438: 115.248ms; value: next_token_ids=tensor([23975], device='cuda:0') mtp accept=1 prop=23975 top1=23975 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.3ms gain=83.8ms ratio=0.43 s0=8.3ms s1=185.0ms wait=0.2/43.0ms pred gate=device Token # 1439: 3.840ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23422 prop=23422 pred gate=device Token # 1440: 114.515ms; value: next_token_ids=tensor([23422], device='cuda:0') mtp accept=1 prop=23422 top1=23422 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/48.1ms pred gate=device Token # 1441: 3.754ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.981 next=pair draft=24091 prop=24091 pred gate=device Token # 1442: 114.839ms; value: next_token_ids=tensor([24091], device='cuda:0') mtp accept=1 prop=24091 top1=24091 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.8ms wait=0.1/47.8ms pred gate=device Token # 1443: 3.818ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19037 prop=19037 pred gate=device Token # 1444: 114.791ms; value: next_token_ids=tensor([19037], device='cuda:0') mtp accept=1 prop=19037 top1=19037 accp=1.000 next=draft=201 prop=201 olap pair=109.7ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.2ms s1=190.4ms wait=0.1/47.4ms pred gate=device Token # 1445: 3.736ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23189 prop=23189 pred gate=device Token # 1446: 
114.299ms; value: next_token_ids=tensor([23189], device='cuda:0') mtp accept=1 prop=23189 top1=23189 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms gain=84.4ms ratio=0.44 s0=4.2ms s1=189.3ms wait=0.1/47.3ms pred gate=device Token # 1447: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=16006 prop=16006 pred gate=device Token # 1448: 114.540ms; value: next_token_ids=tensor([16006], device='cuda:0') mtp accept=1 prop=16006 top1=16006 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.3ms wait=0.1/47.9ms pred gate=device Token # 1449: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25051 prop=25051 pred gate=device Token # 1450: 114.789ms; value: next_token_ids=tensor([25051], device='cuda:0') mtp accept=1 prop=25051 top1=25051 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.6ms wait=0.1/47.9ms pred gate=device Token # 1451: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=24737 prop=24737 pred gate=device Token # 1452: 114.630ms; value: next_token_ids=tensor([24737], device='cuda:0') mtp accept=1 prop=24737 top1=24737 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=4.1ms s1=190.2ms wait=0.1/47.5ms pred gate=device Token # 1453: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.918 next=pair draft=23633 prop=23633 pred gate=device Token # 1454: 114.577ms; value: next_token_ids=tensor([23633], device='cuda:0') mtp accept=1 prop=23633 top1=23633 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.8ms s1=189.4ms wait=0.1/46.6ms pred gate=device Token # 1455: 3.848ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24254 prop=24254 pred gate=device Token # 1456: 114.855ms; value: next_token_ids=tensor([24254], device='cuda:0') mtp accept=1 prop=24254 top1=24254 accp=1.000 next=draft=24254 prop=223 olap pair=109.7ms serial=194.7ms gain=85.0ms ratio=0.44 s0=4.8ms s1=189.9ms wait=0.1/46.5ms pred gate=device Token # 1457: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.067 next=pair draft=25472 prop=25472 pred gate=device Token # 1458: 114.452ms; value: next_token_ids=tensor([25472], device='cuda:0') mtp accept=1 prop=25472 top1=25472 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=4.5ms s1=189.6ms wait=0.1/47.1ms pred gate=device Token # 1459: 3.792ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25089 prop=25089 pred gate=device Token # 1460: 114.679ms; value: next_token_ids=tensor([25089], device='cuda:0') mtp accept=1 prop=25089 top1=25089 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/48.5ms pred gate=device Token # 1461: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.962 next=pair draft=19758 prop=19758 pred gate=device Token # 1462: 115.798ms; value: next_token_ids=tensor([19758], device='cuda:0') mtp accept=1 prop=19758 top1=19758 accp=1.000 next=draft=19758 prop=19758 olap pair=110.7ms serial=196.3ms gain=85.6ms ratio=0.44 s0=3.8ms s1=192.5ms wait=0.1/48.6ms pred gate=device Token # 1463: 3.825ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=19758 top1=223 accp=0.290 next=pair draft=18320 prop=18320 pred gate=device Token # 1464: 114.482ms; value: next_token_ids=tensor([18320], device='cuda:0') mtp accept=1 prop=18320 top1=18320 accp=1.000 next=draft=201 
prop=201 olap pair=109.2ms serial=193.8ms gain=84.6ms ratio=0.44 s0=4.1ms s1=189.7ms wait=0.1/47.8ms pred gate=device Token # 1465: 3.799ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23911 prop=23911 pred gate=device Token # 1466: 114.711ms; value: next_token_ids=tensor([23911], device='cuda:0') mtp accept=1 prop=23911 top1=23911 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.4ms s1=190.2ms wait=0.1/47.3ms pred gate=device Token # 1467: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=25065 prop=25065 pred gate=device Token # 1468: 114.816ms; value: next_token_ids=tensor([25065], device='cuda:0') mtp accept=1 prop=25065 top1=25065 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.4ms s1=190.4ms wait=0.1/47.2ms pred gate=device Token # 1469: 3.835ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=24357 prop=24357 pred gate=device Token # 1470: 114.728ms; value: next_token_ids=tensor([24357], device='cuda:0') mtp accept=1 prop=24357 top1=24357 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/48.0ms pred gate=device Token # 1471: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.895 next=pair draft=25181 prop=25181 pred gate=device Token # 1472: 115.311ms; value: next_token_ids=tensor([25181], device='cuda:0') mtp accept=1 prop=25181 top1=25181 accp=1.000 next=draft=25181 prop=25181 olap pair=110.2ms serial=195.8ms gain=85.7ms ratio=0.44 s0=3.7ms s1=192.1ms wait=0.1/48.6ms pred gate=device Token # 1473: 3.796ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25181 top1=223 accp=0.011 next=pair draft=22580 prop=22580 pred gate=device 
Token # 1474: 114.482ms; value: next_token_ids=tensor([22580], device='cuda:0') mtp accept=1 prop=22580 top1=22580 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/48.4ms pred gate=device Token # 1475: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=25423 prop=25423 pred gate=device Token # 1476: 114.478ms; value: next_token_ids=tensor([25423], device='cuda:0') mtp accept=1 prop=25423 top1=25423 accp=1.000 next=draft=25423 prop=25423 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/48.6ms pred gate=device Token # 1477: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25423 top1=223 accp=0.355 next=pair draft=26014 prop=26014 pred gate=device Token # 1478: 115.140ms; value: next_token_ids=tensor([26014], device='cuda:0') mtp accept=1 prop=26014 top1=26014 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.6ms wait=0.1/48.5ms pred gate=device Token # 1479: 3.804ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=22797 prop=22797 pred gate=device Token # 1480: 115.107ms; value: next_token_ids=tensor([22797], device='cuda:0') mtp accept=1 prop=22797 top1=22797 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.7ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.7ms wait=0.1/48.4ms pred gate=device Token # 1481: 3.817ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=25965 prop=25965 pred gate=device Token # 1482: 114.531ms; value: next_token_ids=tensor([25965], device='cuda:0') mtp accept=1 prop=25965 top1=25965 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.1ms s1=190.0ms wait=0.1/48.0ms pred gate=device Token # 1483: 3.771ms; 
value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21767 prop=21767 pred gate=device Token # 1484: 114.939ms; value: next_token_ids=tensor([21767], device='cuda:0') mtp accept=1 prop=21767 top1=21767 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=4.4ms s1=190.7ms wait=0.1/47.3ms pred gate=device Token # 1485: 3.787ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=23858 prop=23858 pred gate=device Token # 1486: 114.906ms; value: next_token_ids=tensor([23858], device='cuda:0') mtp accept=1 prop=23858 top1=23858 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=4.1ms s1=190.8ms wait=0.1/47.9ms pred gate=device Token # 1487: 3.801ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.860 next=pair draft=24891 prop=24891 pred gate=device Token # 1488: 114.936ms; value: next_token_ids=tensor([24891], device='cuda:0') mtp accept=1 prop=24891 top1=24891 accp=1.000 next=draft=24891 prop=24891 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.1ms wait=0.1/48.2ms pred gate=device Token # 1489: 3.777ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24891 top1=223 accp=0.004 next=pair draft=24858 prop=24858 pred gate=device Token # 1490: 115.504ms; value: next_token_ids=tensor([24858], device='cuda:0') mtp accept=1 prop=24858 top1=24858 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=4.1ms s1=191.8ms wait=0.1/47.8ms pred gate=device Token # 1491: 3.739ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26760 prop=26760 pred gate=device Token # 1492: 115.490ms; value: next_token_ids=tensor([26760], device='cuda:0') mtp accept=1 prop=26760 top1=26760 accp=1.000 
next=draft=26760 prop=26760 olap pair=110.3ms serial=196.1ms gain=85.8ms ratio=0.44 s0=4.2ms s1=191.9ms wait=0.1/47.6ms pred gate=device Token # 1493: 3.760ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26760 top1=223 accp=0.012 next=pair draft=24960 prop=24960 pred gate=device Token # 1494: 114.754ms; value: next_token_ids=tensor([24960], device='cuda:0') mtp accept=1 prop=24960 top1=24960 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/48.3ms pred gate=device Token # 1495: 3.841ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24206 prop=24206 pred gate=device Token # 1496: 114.669ms; value: next_token_ids=tensor([24206], device='cuda:0') mtp accept=1 prop=24206 top1=24206 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.9ms gain=84.6ms ratio=0.44 s0=4.0ms s1=190.0ms wait=0.1/48.3ms pred gate=device Token # 1497: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25401 prop=25401 pred gate=device Token # 1498: 114.713ms; value: next_token_ids=tensor([25401], device='cuda:0') mtp accept=1 prop=25401 top1=25401 accp=1.000 next=draft=25401 prop=25401 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.8ms wait=0.1/48.5ms pred gate=device Token # 1499: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25401 top1=223 accp=0.005 next=pair draft=26461 prop=26461 pred gate=device Token # 1500: 115.830ms; value: next_token_ids=tensor([26461], device='cuda:0') mtp accept=1 prop=26461 top1=26461 accp=1.000 next=draft=26461 prop=26461 olap pair=110.6ms serial=196.7ms gain=86.1ms ratio=0.44 s0=3.8ms s1=192.9ms wait=0.1/48.6ms pred gate=device Token # 1501: 3.822ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26461 top1=223 accp=0.189 next=pair draft=27314 
prop=27314 pred gate=device Token # 1502: 114.679ms; value: next_token_ids=tensor([27314], device='cuda:0') mtp accept=1 prop=27314 top1=27314 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.4ms wait=0.1/48.0ms pred gate=device Token # 1503: 3.735ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=20323 prop=20323 pred gate=device Token # 1504: 115.431ms; value: next_token_ids=tensor([20323], device='cuda:0') mtp accept=1 prop=20323 top1=20323 accp=1.000 next=draft=201 prop=201 olap pair=110.3ms serial=195.8ms gain=85.5ms ratio=0.44 s0=4.3ms s1=191.5ms wait=0.1/47.5ms pred gate=device Token # 1505: 3.782ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=25804 prop=25804 pred gate=device Token # 1506: 114.367ms; value: next_token_ids=tensor([25804], device='cuda:0') mtp accept=1 prop=25804 top1=25804 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.5ms gain=84.4ms ratio=0.44 s0=4.2ms s1=189.3ms wait=0.1/47.9ms pred gate=device Token # 1507: 3.742ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26219 prop=26219 pred gate=device Token # 1508: 115.307ms; value: next_token_ids=tensor([26219], device='cuda:0') mtp accept=1 prop=26219 top1=26219 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.7ms gain=85.5ms ratio=0.44 s0=3.9ms s1=191.8ms wait=0.1/48.4ms pred gate=device Token # 1509: 3.769ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25498 prop=25498 pred gate=device Token # 1510: 114.895ms; value: next_token_ids=tensor([25498], device='cuda:0') mtp accept=1 prop=25498 top1=25498 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/48.7ms pred gate=device 
Token # 1511: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.811 next=pair draft=24798 prop=24798 pred gate=device Token # 1512: 116.940ms; value: next_token_ids=tensor([24798], device='cuda:0') mtp accept=1 prop=24798 top1=24798 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.0ms s1=190.1ms wait=0.1/48.4ms pred gate=device Token # 1513: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24786 prop=24786 pred gate=device Token # 1514: 114.623ms; value: next_token_ids=tensor([24786], device='cuda:0') mtp accept=1 prop=24786 top1=24786 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.1ms gain=84.6ms ratio=0.44 s0=4.1ms s1=190.0ms wait=0.1/48.2ms pred gate=device Token # 1515: 3.795ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=26035 prop=26035 pred gate=device Token # 1516: 114.743ms; value: next_token_ids=tensor([26035], device='cuda:0') mtp accept=1 prop=26035 top1=26035 accp=1.000 next=draft=26035 prop=26035 olap pair=109.6ms serial=193.7ms gain=84.1ms ratio=0.43 s0=3.9ms s1=189.8ms wait=0.1/48.5ms pred gate=device Token # 1517: 3.781ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26035 top1=223 accp=0.035 next=pair draft=26999 prop=26999 pred gate=device Token # 1518: 115.526ms; value: next_token_ids=tensor([26999], device='cuda:0') mtp accept=1 prop=26999 top1=26999 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.4ms gain=85.0ms ratio=0.44 s0=3.9ms s1=191.4ms wait=0.1/48.4ms pred gate=device Token # 1519: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26777 prop=26777 pred gate=device Token # 1520: 115.220ms; value: next_token_ids=tensor([26777], device='cuda:0') mtp accept=1 prop=26777 top1=26777 
accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.5ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.7ms wait=0.1/48.7ms pred gate=device Token # 1521: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.967 next=pair draft=26139 prop=26139 pred gate=device Token # 1522: 114.442ms; value: next_token_ids=tensor([26139], device='cuda:0') mtp accept=1 prop=26139 top1=26139 accp=1.000 next=draft=26139 prop=26139 olap pair=109.2ms serial=193.7ms gain=84.4ms ratio=0.44 s0=4.3ms s1=189.4ms wait=0.1/47.3ms pred gate=device Token # 1523: 3.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26139 top1=223 accp=0.241 next=pair draft=17154 prop=17154 pred gate=device Token # 1524: 114.650ms; value: next_token_ids=tensor([17154], device='cuda:0') mtp accept=1 prop=17154 top1=17154 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/47.3ms pred gate=device Token # 1525: 3.767ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.996 next=pair draft=25353 prop=25353 pred gate=device Token # 1526: 114.483ms; value: next_token_ids=tensor([25353], device='cuda:0') mtp accept=1 prop=25353 top1=25353 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.1ms s1=189.9ms wait=0.1/47.9ms pred gate=device Token # 1527: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25633 prop=25633 pred gate=device Token # 1528: 114.977ms; value: next_token_ids=tensor([25633], device='cuda:0') mtp accept=1 prop=25633 top1=25633 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.0ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.3ms wait=0.1/48.7ms pred gate=device Token # 1529: 3.780ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=26552 
prop=26552 pred gate=device Token # 1530: 114.887ms; value: next_token_ids=tensor([26552], device='cuda:0') mtp accept=1 prop=26552 top1=26552 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.4ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 1531: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.841 next=pair draft=27222 prop=27222 pred gate=device Token # 1532: 115.407ms; value: next_token_ids=tensor([27222], device='cuda:0') mtp accept=1 prop=27222 top1=27222 accp=1.000 next=draft=223 prop=27222 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=3.8ms s1=192.0ms wait=0.1/48.6ms pred gate=device Token # 1533: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27222 top1=223 accp=0.874 next=pair draft=20295 prop=20295 pred gate=device Token # 1534: 114.137ms; value: next_token_ids=tensor([20295], device='cuda:0') mtp accept=1 prop=20295 top1=20295 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.4ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.6ms wait=0.1/48.7ms pred gate=device Token # 1535: 3.799ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24966 prop=24966 pred gate=device Token # 1536: 114.191ms; value: next_token_ids=tensor([24966], device='cuda:0') mtp accept=1 prop=24966 top1=24966 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/48.6ms pred gate=device Token # 1537: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27278 prop=27278 pred gate=device Token # 1538: 114.245ms; value: next_token_ids=tensor([27278], device='cuda:0') mtp accept=1 prop=27278 top1=27278 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.8ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.1ms wait=0.1/48.7ms pred gate=device 
Token # 1539: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.946 next=pair draft=27277 prop=27277 pred gate=device Token # 1540: 114.600ms; value: next_token_ids=tensor([27277], device='cuda:0') mtp accept=1 prop=27277 top1=27277 accp=1.000 next=draft=27277 prop=27277 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/48.7ms pred gate=device Token # 1541: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27277 top1=223 accp=0.010 next=pair draft=26307 prop=26307 pred gate=device Token # 1542: 114.686ms; value: next_token_ids=tensor([26307], device='cuda:0') mtp accept=1 prop=26307 top1=26307 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1543: 3.781ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=19749 prop=19749 pred gate=device Token # 1544: 114.191ms; value: next_token_ids=tensor([19749], device='cuda:0') mtp accept=1 prop=19749 top1=19749 accp=1.000 next=draft=201 prop=201 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.7ms s1=189.8ms wait=0.1/48.7ms pred gate=device Token # 1545: 3.777ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.932 next=pair draft=26160 prop=26160 pred gate=device Token # 1546: 114.197ms; value: next_token_ids=tensor([26160], device='cuda:0') mtp accept=1 prop=26160 top1=26160 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.7ms wait=0.1/48.4ms pred gate=device Token # 1547: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=25355 prop=25355 pred gate=device Token # 1548: 114.377ms; value: next_token_ids=tensor([25355], device='cuda:0') mtp accept=1 prop=25355 top1=25355 
accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/48.7ms pred gate=device Token # 1549: 3.830ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.962 next=pair draft=26655 prop=26655 pred gate=device Token # 1550: 114.948ms; value: next_token_ids=tensor([26655], device='cuda:0') mtp accept=1 prop=26655 top1=26655 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.8ms gain=85.0ms ratio=0.44 s0=4.4ms s1=190.3ms wait=0.1/47.4ms pred gate=device Token # 1551: 3.795ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.688 next=pair draft=26484 prop=26484 pred gate=device Token # 1552: 115.753ms; value: next_token_ids=tensor([26484], device='cuda:0') mtp accept=1 prop=26484 top1=26484 accp=1.000 next=draft=26484 prop=26484 olap pair=109.6ms serial=194.4ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/48.3ms pred gate=device Token # 1553: 3.837ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26484 top1=223 accp=0.003 next=pair draft=25695 prop=25695 pred gate=device Token # 1554: 114.464ms; value: next_token_ids=tensor([25695], device='cuda:0') mtp accept=1 prop=25695 top1=25695 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1555: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.951 next=pair draft=27208 prop=27208 pred gate=device Token # 1556: 115.128ms; value: next_token_ids=tensor([27208], device='cuda:0') mtp accept=1 prop=27208 top1=27208 accp=1.000 next=draft=27208 prop=27208 olap pair=110.0ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.9ms s1=191.5ms wait=0.1/48.5ms pred gate=device Token # 1557: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27208 top1=223 accp=0.129 next=pair 
draft=25601 prop=25601 pred gate=device Token # 1558: 114.280ms; value: next_token_ids=tensor([25601], device='cuda:0') mtp accept=1 prop=25601 top1=25601 accp=1.000 next=draft=25601 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=3.9ms s1=189.7ms wait=0.1/48.3ms pred gate=device Token # 1559: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.178 next=pair draft=26165 prop=26165 pred gate=device Token # 1560: 114.550ms; value: next_token_ids=tensor([26165], device='cuda:0') mtp accept=1 prop=26165 top1=26165 accp=1.000 next=draft=26165 prop=26165 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.5ms pred gate=device Token # 1561: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26165 top1=223 accp=0.009 next=pair draft=27835 prop=27835 pred gate=device Token # 1562: 114.672ms; value: next_token_ids=tensor([27835], device='cuda:0') mtp accept=1 prop=27835 top1=27835 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/48.8ms pred gate=device Token # 1563: 3.789ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=23239 prop=23239 pred gate=device Token # 1564: 114.810ms; value: next_token_ids=tensor([23239], device='cuda:0') mtp accept=1 prop=23239 top1=23239 accp=1.000 next=draft=201 prop=201 olap pair=109.7ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.1ms wait=0.1/48.8ms pred gate=device Token # 1565: 3.696ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=24983 prop=24983 pred gate=device Token # 1566: 114.220ms; value: next_token_ids=tensor([24983], device='cuda:0') mtp accept=1 prop=24983 top1=24983 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.6ms wait=0.1/48.6ms 
pred gate=device Token # 1567: 3.766ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.988 next=pair draft=26670 prop=26670 pred gate=device Token # 1568: 115.013ms; value: next_token_ids=tensor([26670], device='cuda:0') mtp accept=1 prop=26670 top1=26670 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.4ms wait=0.1/48.7ms pred gate=device Token # 1569: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27343 prop=27343 pred gate=device Token # 1570: 114.321ms; value: next_token_ids=tensor([27343], device='cuda:0') mtp accept=1 prop=27343 top1=27343 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=192.9ms gain=83.9ms ratio=0.44 s0=5.8ms s1=187.1ms wait=0.2/46.4ms pred gate=device Token # 1571: 3.848ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26934 prop=26934 pred gate=device Token # 1572: 114.245ms; value: next_token_ids=tensor([26934], device='cuda:0') mtp accept=1 prop=26934 top1=26934 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/48.5ms pred gate=device Token # 1573: 3.743ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25282 prop=25282 pred gate=device Token # 1574: 113.888ms; value: next_token_ids=tensor([25282], device='cuda:0') mtp accept=1 prop=25282 top1=25282 accp=1.000 next=draft=223 prop=223 olap pair=108.7ms serial=192.9ms gain=84.2ms ratio=0.44 s0=3.8ms s1=189.1ms wait=0.1/48.6ms pred gate=device Token # 1575: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25174 prop=25174 pred gate=device Token # 1576: 114.630ms; value: next_token_ids=tensor([25174], device='cuda:0') mtp accept=1 prop=25174 
top1=25174 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 1577: 3.748ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=26362 prop=26362 pred gate=device Token # 1578: 115.093ms; value: next_token_ids=tensor([26362], device='cuda:0') mtp accept=1 prop=26362 top1=26362 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.5ms gain=84.4ms ratio=0.44 s0=4.0ms s1=189.5ms wait=0.1/48.3ms pred gate=device Token # 1579: 4.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28494 prop=28494 pred gate=device Token # 1580: 114.992ms; value: next_token_ids=tensor([28494], device='cuda:0') mtp accept=1 prop=28494 top1=28494 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.1ms gain=84.5ms ratio=0.44 s0=4.2ms s1=189.9ms wait=0.1/48.0ms pred gate=device Token # 1581: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=28305 prop=28305 pred gate=device Token # 1582: 114.228ms; value: next_token_ids=tensor([28305], device='cuda:0') mtp accept=1 prop=28305 top1=28305 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.6ms wait=0.1/48.6ms pred gate=device Token # 1583: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.869 next=pair draft=23243 prop=23243 pred gate=device Token # 1584: 114.930ms; value: next_token_ids=tensor([23243], device='cuda:0') mtp accept=1 prop=23243 top1=23243 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=195.0ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/48.5ms pred gate=device Token # 1585: 3.754ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.988 next=pair 
draft=27470 prop=27470 pred gate=device Token # 1586: 114.034ms; value: next_token_ids=tensor([27470], device='cuda:0') mtp accept=1 prop=27470 top1=27470 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.2ms s1=189.2ms wait=0.1/48.1ms pred gate=device Token # 1587: 3.845ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28691 prop=28691 pred gate=device Token # 1588: 114.533ms; value: next_token_ids=tensor([28691], device='cuda:0') mtp accept=1 prop=28691 top1=28691 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.5ms pred gate=device Token # 1589: 3.816ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.709 next=pair draft=27279 prop=27279 pred gate=device Token # 1590: 114.763ms; value: next_token_ids=tensor([27279], device='cuda:0') mtp accept=1 prop=27279 top1=27279 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.6ms gain=84.1ms ratio=0.43 s0=6.1ms s1=187.5ms wait=0.2/45.7ms pred gate=device Token # 1591: 3.817ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.768 next=pair draft=26513 prop=26513 pred gate=device Token # 1592: 114.921ms; value: next_token_ids=tensor([26513], device='cuda:0') mtp accept=1 prop=26513 top1=26513 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.8ms wait=0.1/48.4ms pred gate=device Token # 1593: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26412 prop=26412 pred gate=device Token # 1594: 115.145ms; value: next_token_ids=tensor([26412], device='cuda:0') mtp accept=1 prop=26412 top1=26412 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=193.7ms gain=83.7ms ratio=0.43 s0=4.1ms s1=189.7ms wait=0.1/48.3ms pred 
gate=device Token # 1595: 3.829ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.991 next=pair draft=27409 prop=27409 pred gate=device Token # 1596: 114.757ms; value: next_token_ids=tensor([27409], device='cuda:0') mtp accept=1 prop=27409 top1=27409 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.2ms gain=84.6ms ratio=0.44 s0=4.0ms s1=190.2ms wait=0.1/48.4ms pred gate=device Token # 1597: 3.900ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=28051 prop=28051 pred gate=device Token # 1598: 116.435ms; value: next_token_ids=tensor([28051], device='cuda:0') mtp accept=1 prop=28051 top1=28051 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.7ms wait=0.1/48.7ms pred gate=device Token # 1599: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27618 prop=27618 pred gate=device Token # 1600: 115.515ms; value: next_token_ids=tensor([27618], device='cuda:0') mtp accept=1 prop=27618 top1=27618 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=196.1ms gain=85.7ms ratio=0.44 s0=3.9ms s1=192.2ms wait=0.1/48.4ms pred gate=device Token # 1601: 3.831ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.962 next=pair draft=27471 prop=27471 pred gate=device Token # 1602: 114.851ms; value: next_token_ids=tensor([27471], device='cuda:0') mtp accept=1 prop=27471 top1=27471 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/48.6ms pred gate=device Token # 1603: 3.805ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=25431 prop=25431 pred gate=device Token # 1604: 114.433ms; value: next_token_ids=tensor([25431], device='cuda:0') mtp accept=1 prop=25431 
top1=25431 accp=1.000 next=draft=201 prop=201 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.1ms wait=0.1/48.5ms pred gate=device Token # 1605: 3.746ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=27582 prop=27582 pred gate=device Token # 1606: 114.263ms; value: next_token_ids=tensor([27582], device='cuda:0') mtp accept=1 prop=27582 top1=27582 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/48.6ms pred gate=device Token # 1607: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25403 prop=25403 pred gate=device Token # 1608: 114.735ms; value: next_token_ids=tensor([25403], device='cuda:0') mtp accept=1 prop=25403 top1=25403 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.8ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/48.5ms pred gate=device Token # 1609: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.686 next=pair draft=28389 prop=28389 pred gate=device Token # 1610: 114.814ms; value: next_token_ids=tensor([28389], device='cuda:0') mtp accept=1 prop=28389 top1=28389 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/48.5ms pred gate=device Token # 1611: 3.774ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28835 prop=28835 pred gate=device Token # 1612: 114.040ms; value: next_token_ids=tensor([28835], device='cuda:0') mtp accept=1 prop=28835 top1=28835 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.4ms gain=84.5ms ratio=0.44 s0=3.8ms s1=189.6ms wait=0.1/48.6ms pred gate=device Token # 1613: 3.852ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair 
draft=27561 prop=27561 pred gate=device Token # 1614: 114.384ms; value: next_token_ids=tensor([27561], device='cuda:0') mtp accept=1 prop=27561 top1=27561 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1615: 3.832ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29490 prop=29490 pred gate=device Token # 1616: 115.190ms; value: next_token_ids=tensor([29490], device='cuda:0') mtp accept=1 prop=29490 top1=29490 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=194.5ms gain=84.5ms ratio=0.43 s0=5.8ms s1=188.7ms wait=0.2/45.6ms pred gate=device Token # 1617: 3.845ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28602 prop=28602 pred gate=device Token # 1618: 115.721ms; value: next_token_ids=tensor([28602], device='cuda:0') mtp accept=1 prop=28602 top1=28602 accp=1.000 next=draft=223 prop=223 olap pair=110.5ms serial=196.2ms gain=85.7ms ratio=0.44 s0=4.3ms s1=191.9ms wait=0.1/47.4ms pred gate=device Token # 1619: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=28273 prop=28273 pred gate=device Token # 1620: 114.888ms; value: next_token_ids=tensor([28273], device='cuda:0') mtp accept=1 prop=28273 top1=28273 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.6ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.3ms wait=0.1/47.4ms pred gate=device Token # 1621: 3.802ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=25361 prop=25361 pred gate=device Token # 1622: 115.101ms; value: next_token_ids=tensor([25361], device='cuda:0') mtp accept=1 prop=25361 top1=25361 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=195.0ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.7ms wait=0.1/47.4ms pred 
gate=device Token # 1623: 3.758ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.923 next=pair draft=6391 prop=6391 pred gate=device Token # 1624: 115.083ms; value: next_token_ids=tensor([6391], device='cuda:0') mtp accept=1 prop=6391 top1=6391 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=195.4ms gain=85.4ms ratio=0.44 s0=4.4ms s1=191.0ms wait=0.1/47.3ms pred gate=device Token # 1625: 3.786ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.977 next=pair draft=19698 prop=19698 pred gate=device Token # 1626: 114.646ms; value: next_token_ids=tensor([19698], device='cuda:0') mtp accept=1 prop=19698 top1=19698 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/47.1ms pred gate=device Token # 1627: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22825 prop=22825 pred gate=device Token # 1628: 114.891ms; value: next_token_ids=tensor([22825], device='cuda:0') mtp accept=1 prop=22825 top1=22825 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.0ms gain=85.2ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/47.4ms pred gate=device Token # 1629: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=24470 prop=24470 pred gate=device Token # 1630: 114.454ms; value: next_token_ids=tensor([24470], device='cuda:0') mtp accept=1 prop=24470 top1=24470 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/48.5ms pred gate=device Token # 1631: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25196 prop=25196 pred gate=device Token # 1632: 114.862ms; value: next_token_ids=tensor([25196], device='cuda:0') mtp accept=1 prop=25196 top1=25196 
accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/48.5ms pred gate=device Token # 1633: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25641 prop=25641 pred gate=device Token # 1634: 114.853ms; value: next_token_ids=tensor([25641], device='cuda:0') mtp accept=1 prop=25641 top1=25641 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.9ms gain=85.1ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/47.4ms pred gate=device Token # 1635: 3.792ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25273 prop=25273 pred gate=device Token # 1636: 114.661ms; value: next_token_ids=tensor([25273], device='cuda:0') mtp accept=1 prop=25273 top1=25273 accp=1.000 next=draft=25273 prop=223 olap pair=109.5ms serial=194.4ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.4ms wait=0.1/48.0ms pred gate=device Token # 1637: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.458 next=pair draft=26105 prop=26105 pred gate=device Token # 1638: 117.017ms; value: next_token_ids=tensor([26105], device='cuda:0') mtp accept=1 prop=26105 top1=26105 accp=1.000 next=draft=223 prop=223 olap pair=111.8ms serial=197.1ms gain=85.3ms ratio=0.43 s0=4.7ms s1=192.3ms wait=0.1/46.6ms pred gate=device Token # 1639: 3.820ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24877 prop=24877 pred gate=device Token # 1640: 114.753ms; value: next_token_ids=tensor([24877], device='cuda:0') mtp accept=1 prop=24877 top1=24877 accp=1.000 next=draft=223 prop=24877 olap pair=109.6ms serial=194.7ms gain=85.0ms ratio=0.44 s0=4.6ms s1=190.1ms wait=0.2/47.0ms pred gate=device Token # 1641: 3.804ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=24877 top1=223 accp=0.523 next=pair draft=25543 
prop=25543 pred gate=device Token # 1642: 114.473ms; value: next_token_ids=tensor([25543], device='cuda:0') mtp accept=1 prop=25543 top1=25543 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.6ms wait=0.1/47.3ms pred gate=device Token # 1643: 3.822ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=21484 prop=21484 pred gate=device Token # 1644: 114.484ms; value: next_token_ids=tensor([21484], device='cuda:0') mtp accept=1 prop=21484 top1=21484 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.1ms s1=189.9ms wait=0.1/47.8ms pred gate=device Token # 1645: 3.781ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=25340 prop=25340 pred gate=device Token # 1646: 114.566ms; value: next_token_ids=tensor([25340], device='cuda:0') mtp accept=1 prop=25340 top1=25340 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/48.5ms pred gate=device Token # 1647: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.907 next=pair draft=23907 prop=23907 pred gate=device Token # 1648: 114.526ms; value: next_token_ids=tensor([23907], device='cuda:0') mtp accept=1 prop=23907 top1=23907 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/48.7ms pred gate=device Token # 1649: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27028 prop=27028 pred gate=device Token # 1650: 115.152ms; value: next_token_ids=tensor([27028], device='cuda:0') mtp accept=1 prop=27028 top1=27028 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.5ms gain=85.5ms ratio=0.44 s0=3.7ms s1=191.8ms wait=0.1/48.8ms pred gate=device 
Token # 1651: 3.832ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.942 next=pair draft=26792 prop=26792 pred gate=device Token # 1652: 114.783ms; value: next_token_ids=tensor([26792], device='cuda:0') mtp accept=1 prop=26792 top1=26792 accp=1.000 next=draft=26792 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.0ms s1=190.6ms wait=0.1/48.1ms pred gate=device Token # 1653: 3.843ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.408 next=pair draft=27042 prop=27042 pred gate=device Token # 1654: 115.130ms; value: next_token_ids=tensor([27042], device='cuda:0') mtp accept=1 prop=27042 top1=27042 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.9ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.6ms wait=0.1/47.5ms pred gate=device Token # 1655: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24517 prop=24517 pred gate=device Token # 1656: 114.581ms; value: next_token_ids=tensor([24517], device='cuda:0') mtp accept=1 prop=24517 top1=24517 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.7ms gain=84.2ms ratio=0.43 s0=4.2ms s1=189.5ms wait=0.1/47.8ms pred gate=device Token # 1657: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25945 prop=25945 pred gate=device Token # 1658: 115.090ms; value: next_token_ids=tensor([25945], device='cuda:0') mtp accept=1 prop=25945 top1=25945 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.3ms gain=85.4ms ratio=0.44 s0=4.1ms s1=191.3ms wait=0.1/47.9ms pred gate=device Token # 1659: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25994 prop=25994 pred gate=device Token # 1660: 114.896ms; value: next_token_ids=tensor([25994], device='cuda:0') mtp accept=1 prop=25994 top1=25994 
accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.7ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.6ms wait=0.1/48.3ms pred gate=device Token # 1661: 3.801ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=28516 prop=28516 pred gate=device Token # 1662: 114.439ms; value: next_token_ids=tensor([28516], device='cuda:0') mtp accept=1 prop=28516 top1=28516 accp=1.000 next=draft=28516 prop=28516 olap pair=109.3ms serial=193.9ms gain=84.6ms ratio=0.44 s0=4.4ms s1=189.5ms wait=0.1/48.1ms pred gate=device Token # 1663: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28516 top1=223 accp=0.001 next=pair draft=21814 prop=21814 pred gate=device Token # 1664: 114.919ms; value: next_token_ids=tensor([21814], device='cuda:0') mtp accept=1 prop=21814 top1=21814 accp=1.000 next=draft=201 prop=201 olap pair=109.6ms serial=194.3ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.9ms wait=0.1/47.8ms pred gate=device Token # 1665: 3.791ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.998 next=pair draft=25753 prop=25753 pred gate=device Token # 1666: 113.954ms; value: next_token_ids=tensor([25753], device='cuda:0') mtp accept=1 prop=25753 top1=25753 accp=1.000 next=draft=223 prop=223 olap pair=108.8ms serial=193.1ms gain=84.3ms ratio=0.44 s0=4.0ms s1=189.2ms wait=0.1/48.6ms pred gate=device Token # 1667: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26277 prop=26277 pred gate=device Token # 1668: 114.831ms; value: next_token_ids=tensor([26277], device='cuda:0') mtp accept=1 prop=26277 top1=26277 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.2ms gain=84.6ms ratio=0.44 s0=4.9ms s1=189.4ms wait=0.1/47.4ms pred gate=device Token # 1669: 3.818ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28297 
prop=28297 pred gate=device Token # 1670: 114.550ms; value: next_token_ids=tensor([28297], device='cuda:0') mtp accept=1 prop=28297 top1=28297 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=3.9ms s1=190.1ms wait=0.1/48.6ms pred gate=device Token # 1671: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.973 next=pair draft=26403 prop=26403 pred gate=device Token # 1672: 115.273ms; value: next_token_ids=tensor([26403], device='cuda:0') mtp accept=1 prop=26403 top1=26403 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.1ms gain=85.1ms ratio=0.44 s0=3.9ms s1=191.2ms wait=0.1/48.5ms pred gate=device Token # 1673: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.961 next=pair draft=20807 prop=20807 pred gate=device Token # 1674: 114.673ms; value: next_token_ids=tensor([20807], device='cuda:0') mtp accept=1 prop=20807 top1=20807 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/48.7ms pred gate=device Token # 1675: 3.767ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27750 prop=27750 pred gate=device Token # 1676: 114.693ms; value: next_token_ids=tensor([27750], device='cuda:0') mtp accept=1 prop=27750 top1=27750 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 1677: 3.843ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27517 prop=27517 pred gate=device Token # 1678: 114.989ms; value: next_token_ids=tensor([27517], device='cuda:0') mtp accept=1 prop=27517 top1=27517 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.4ms ratio=0.44 s0=3.9ms s1=191.2ms wait=0.1/48.6ms pred gate=device 
Token # 1679: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28709 prop=28709 pred gate=device Token # 1680: 114.547ms; value: next_token_ids=tensor([28709], device='cuda:0') mtp accept=1 prop=28709 top1=28709 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.8ms gain=84.5ms ratio=0.44 s0=4.1ms s1=189.7ms wait=0.1/48.1ms pred gate=device Token # 1681: 3.912ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.598 next=pair draft=29063 prop=29063 pred gate=device Token # 1682: 114.709ms; value: next_token_ids=tensor([29063], device='cuda:0') mtp accept=1 prop=29063 top1=29063 accp=1.000 next=draft=29063 prop=29063 olap pair=109.5ms serial=194.0ms gain=84.6ms ratio=0.44 s0=4.3ms s1=189.7ms wait=0.1/47.5ms pred gate=device Token # 1683: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=29063 top1=223 accp=0.167 next=pair draft=22212 prop=22212 pred gate=device Token # 1684: 116.877ms; value: next_token_ids=tensor([22212], device='cuda:0') mtp accept=1 prop=22212 top1=22212 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.0ms gain=84.6ms ratio=0.44 s0=4.3ms s1=189.7ms wait=0.1/47.5ms pred gate=device Token # 1685: 3.763ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.987 next=pair draft=26151 prop=26151 pred gate=device Token # 1686: 114.725ms; value: next_token_ids=tensor([26151], device='cuda:0') mtp accept=1 prop=26151 top1=26151 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.1ms s1=190.4ms wait=0.1/48.2ms pred gate=device Token # 1687: 3.787ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.939 next=pair draft=26354 prop=26354 pred gate=device Token # 1688: 115.093ms; value: next_token_ids=tensor([26354], device='cuda:0') mtp accept=1 prop=26354 top1=26354 
accp=1.000 next=draft=26354 prop=26354 olap pair=109.9ms serial=195.5ms gain=85.6ms ratio=0.44 s0=3.8ms s1=191.7ms wait=0.1/48.5ms pred gate=device Token # 1689: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26354 top1=223 accp=0.018 next=pair draft=26352 prop=26352 pred gate=device Token # 1690: 115.373ms; value: next_token_ids=tensor([26352], device='cuda:0') mtp accept=1 prop=26352 top1=26352 accp=1.000 next=draft=223 prop=26352 olap pair=110.2ms serial=195.8ms gain=85.6ms ratio=0.44 s0=3.9ms s1=191.9ms wait=0.1/48.4ms pred gate=device Token # 1691: 3.813ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26352 top1=223 accp=0.559 next=pair draft=29216 prop=29216 pred gate=device Token # 1692: 114.842ms; value: next_token_ids=tensor([29216], device='cuda:0') mtp accept=1 prop=29216 top1=29216 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.2ms wait=0.1/48.0ms pred gate=device Token # 1693: 3.842ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.983 next=pair draft=27772 prop=27772 pred gate=device Token # 1694: 114.754ms; value: next_token_ids=tensor([27772], device='cuda:0') mtp accept=1 prop=27772 top1=27772 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.2ms gain=84.6ms ratio=0.44 s0=5.8ms s1=188.3ms wait=0.2/46.0ms pred gate=device Token # 1695: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=26500 prop=26500 pred gate=device Token # 1696: 114.555ms; value: next_token_ids=tensor([26500], device='cuda:0') mtp accept=1 prop=26500 top1=26500 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/48.7ms pred gate=device Token # 1697: 3.812ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair 
draft=27308 prop=27308 pred gate=device Token # 1698: 114.656ms; value: next_token_ids=tensor([27308], device='cuda:0') mtp accept=1 prop=27308 top1=27308 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/48.6ms pred gate=device Token # 1699: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.991 next=pair draft=28455 prop=28455 pred gate=device Token # 1700: 114.985ms; value: next_token_ids=tensor([28455], device='cuda:0') mtp accept=1 prop=28455 top1=28455 accp=1.000 next=draft=28455 prop=28455 olap pair=109.8ms serial=194.7ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.5ms wait=0.1/47.8ms pred gate=device Token # 1701: 3.798ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28455 top1=223 accp=0.191 next=pair draft=26661 prop=26661 pred gate=device Token # 1702: 114.496ms; value: next_token_ids=tensor([26661], device='cuda:0') mtp accept=1 prop=26661 top1=26661 accp=1.000 next=draft=223 prop=26661 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/48.5ms pred gate=device Token # 1703: 3.738ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26661 top1=223 accp=0.769 next=pair draft=19913 prop=19913 pred gate=device Token # 1704: 114.558ms; value: next_token_ids=tensor([19913], device='cuda:0') mtp accept=1 prop=19913 top1=19913 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/48.4ms pred gate=device Token # 1705: 3.736ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.894 next=pair draft=26091 prop=26091 pred gate=device Token # 1706: 114.933ms; value: next_token_ids=tensor([26091], device='cuda:0') mtp accept=1 prop=26091 top1=26091 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/48.6ms 
pred gate=device Token # 1707: 3.738ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26827 prop=26827 pred gate=device Token # 1708: 114.825ms; value: next_token_ids=tensor([26827], device='cuda:0') mtp accept=1 prop=26827 top1=26827 accp=1.000 next=draft=26827 prop=26827 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=4.0ms s1=190.8ms wait=0.1/48.0ms pred gate=device Token # 1709: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=26827 top1=223 accp=0.001 next=pair draft=28380 prop=28380 pred gate=device Token # 1710: 114.769ms; value: next_token_ids=tensor([28380], device='cuda:0') mtp accept=1 prop=28380 top1=28380 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.8ms gain=84.3ms ratio=0.43 s0=4.1ms s1=189.7ms wait=0.1/48.2ms pred gate=device Token # 1711: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.976 next=pair draft=26407 prop=26407 pred gate=device Token # 1712: 114.421ms; value: next_token_ids=tensor([26407], device='cuda:0') mtp accept=1 prop=26407 top1=26407 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1713: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27151 prop=27151 pred gate=device Token # 1714: 114.819ms; value: next_token_ids=tensor([27151], device='cuda:0') mtp accept=1 prop=27151 top1=27151 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=193.5ms gain=83.8ms ratio=0.43 s0=4.3ms s1=189.1ms wait=0.1/47.9ms pred gate=device Token # 1715: 3.739ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28833 prop=28833 pred gate=device Token # 1716: 115.031ms; value: next_token_ids=tensor([28833], device='cuda:0') mtp accept=1 
prop=28833 top1=28833 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.7ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.4ms wait=0.1/47.5ms pred gate=device Token # 1717: 3.809ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28045 prop=28045 pred gate=device Token # 1718: 114.518ms; value: next_token_ids=tensor([28045], device='cuda:0') mtp accept=1 prop=28045 top1=28045 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.0ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.7ms wait=0.1/47.4ms pred gate=device Token # 1719: 3.806ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27708 prop=27708 pred gate=device Token # 1720: 115.540ms; value: next_token_ids=tensor([27708], device='cuda:0') mtp accept=1 prop=27708 top1=27708 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.0ms gain=83.4ms ratio=0.43 s0=8.7ms s1=184.2ms wait=0.2/42.8ms pred gate=device Token # 1721: 4.747ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.889 next=pair draft=28983 prop=28983 pred gate=device Token # 1722: 115.899ms; value: next_token_ids=tensor([28983], device='cuda:0') mtp accept=1 prop=28983 top1=28983 accp=1.000 next=draft=223 prop=28983 olap pair=109.7ms serial=193.4ms gain=83.7ms ratio=0.43 s0=8.7ms s1=184.7ms wait=0.2/42.8ms pred gate=device Token # 1723: 4.685ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28983 top1=223 accp=0.793 next=pair draft=17720 prop=17720 pred gate=device Token # 1724: 114.662ms; value: next_token_ids=tensor([17720], device='cuda:0') mtp accept=1 prop=17720 top1=17720 accp=1.000 next=draft=17720 prop=17720 olap pair=109.3ms serial=193.8ms gain=84.5ms ratio=0.44 s0=4.7ms s1=189.1ms wait=0.1/47.1ms pred gate=device Token # 1725: 3.775ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=0 prop=17720 top1=201 
accp=0.444 next=pair draft=27665 prop=27665 pred gate=device Token # 1726: 114.844ms; value: next_token_ids=tensor([27665], device='cuda:0') mtp accept=1 prop=27665 top1=27665 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.3ms gain=84.7ms ratio=0.44 s0=4.2ms s1=190.1ms wait=0.1/47.6ms pred gate=device Token # 1727: 3.834ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27189 prop=27189 pred gate=device Token # 1728: 114.866ms; value: next_token_ids=tensor([27189], device='cuda:0') mtp accept=1 prop=27189 top1=27189 accp=1.000 next=draft=27189 prop=27189 olap pair=109.7ms serial=194.7ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.6ms wait=0.1/47.9ms pred gate=device Token # 1729: 3.778ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27189 top1=223 accp=0.093 next=pair draft=26681 prop=26681 pred gate=device Token # 1730: 114.690ms; value: next_token_ids=tensor([26681], device='cuda:0') mtp accept=1 prop=26681 top1=26681 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/48.6ms pred gate=device Token # 1731: 3.789ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.944 next=pair draft=27395 prop=27395 pred gate=device Token # 1732: 114.748ms; value: next_token_ids=tensor([27395], device='cuda:0') mtp accept=1 prop=27395 top1=27395 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/48.6ms pred gate=device Token # 1733: 3.791ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24974 prop=24974 pred gate=device Token # 1734: 115.503ms; value: next_token_ids=tensor([24974], device='cuda:0') mtp accept=1 prop=24974 top1=24974 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.7ms gain=84.2ms ratio=0.43 s0=5.9ms 
s1=187.8ms wait=0.2/46.3ms pred gate=device Token # 1735: 4.694ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26010 prop=26010 pred gate=device Token # 1736: 114.843ms; value: next_token_ids=tensor([26010], device='cuda:0') mtp accept=1 prop=26010 top1=26010 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/48.6ms pred gate=device Token # 1737: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28349 prop=28349 pred gate=device Token # 1738: 114.732ms; value: next_token_ids=tensor([28349], device='cuda:0') mtp accept=1 prop=28349 top1=28349 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.8ms gain=84.3ms ratio=0.43 s0=3.9ms s1=189.9ms wait=0.1/48.6ms pred gate=device Token # 1739: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=28922 prop=28922 pred gate=device Token # 1740: 114.832ms; value: next_token_ids=tensor([28922], device='cuda:0') mtp accept=1 prop=28922 top1=28922 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/48.6ms pred gate=device Token # 1741: 3.812ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.864 next=pair draft=29666 prop=29666 pred gate=device Token # 1742: 114.582ms; value: next_token_ids=tensor([29666], device='cuda:0') mtp accept=1 prop=29666 top1=29666 accp=1.000 next=draft=223 prop=29666 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=4.1ms s1=190.2ms wait=0.1/48.1ms pred gate=device Token # 1743: 3.868ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=29666 top1=223 accp=0.520 next=pair draft=23150 prop=23150 pred gate=device Token # 1744: 114.578ms; value: next_token_ids=tensor([23150], 
device='cuda:0') mtp accept=1 prop=23150 top1=23150 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/48.6ms pred gate=device Token # 1745: 3.773ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.997 next=pair draft=27895 prop=27895 pred gate=device Token # 1746: 115.191ms; value: next_token_ids=tensor([27895], device='cuda:0') mtp accept=1 prop=27895 top1=27895 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.3ms gain=85.3ms ratio=0.44 s0=5.0ms s1=190.3ms wait=0.1/47.2ms pred gate=device Token # 1747: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=28763 prop=28763 pred gate=device Token # 1748: 115.055ms; value: next_token_ids=tensor([28763], device='cuda:0') mtp accept=1 prop=28763 top1=28763 accp=1.000 next=draft=28763 prop=28763 olap pair=109.9ms serial=195.0ms gain=85.1ms ratio=0.44 s0=4.1ms s1=190.9ms wait=0.1/48.1ms pred gate=device Token # 1749: 3.799ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28763 top1=223 accp=0.489 next=pair draft=28968 prop=28968 pred gate=device Token # 1750: 114.414ms; value: next_token_ids=tensor([28968], device='cuda:0') mtp accept=1 prop=28968 top1=28968 accp=1.000 next=draft=28968 prop=28968 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.7ms pred gate=device Token # 1751: 3.772ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28968 top1=223 accp=0.000 next=pair draft=27501 prop=27501 pred gate=device Token # 1752: 114.604ms; value: next_token_ids=tensor([27501], device='cuda:0') mtp accept=1 prop=27501 top1=27501 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.1ms s1=190.0ms wait=0.1/48.1ms pred gate=device Token # 1753: 3.762ms; value: next_token_ids=tensor([223], device='cuda:0') mtp 
accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28084 prop=28084 pred gate=device Token # 1754: 114.475ms; value: next_token_ids=tensor([28084], device='cuda:0') mtp accept=1 prop=28084 top1=28084 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.3ms wait=0.1/48.6ms pred gate=device Token # 1755: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=16813 prop=16813 pred gate=device Token # 1756: 115.168ms; value: next_token_ids=tensor([16813], device='cuda:0') mtp accept=1 prop=16813 top1=16813 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=193.7ms gain=83.7ms ratio=0.43 s0=4.1ms s1=189.6ms wait=0.1/48.3ms pred gate=device Token # 1757: 3.801ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=23529 prop=23529 pred gate=device Token # 1758: 114.796ms; value: next_token_ids=tensor([23529], device='cuda:0') mtp accept=1 prop=23529 top1=23529 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/48.6ms pred gate=device Token # 1759: 3.820ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.986 next=pair draft=28836 prop=28836 pred gate=device Token # 1760: 114.722ms; value: next_token_ids=tensor([28836], device='cuda:0') mtp accept=1 prop=28836 top1=28836 accp=1.000 next=draft=28836 prop=28836 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.5ms pred gate=device Token # 1761: 3.822ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28836 top1=223 accp=0.001 next=pair draft=29126 prop=29126 pred gate=device Token # 1762: 114.409ms; value: next_token_ids=tensor([29126], device='cuda:0') mtp accept=1 prop=29126 top1=29126 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.7ms 
ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/48.6ms pred gate=device Token # 1763: 3.739ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=24969 prop=24969 pred gate=device Token # 1764: 114.773ms; value: next_token_ids=tensor([24969], device='cuda:0') mtp accept=1 prop=24969 top1=24969 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.0ms gain=84.6ms ratio=0.44 s0=4.8ms s1=189.2ms wait=0.1/47.5ms pred gate=device Token # 1765: 3.794ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=26612 prop=26612 pred gate=device Token # 1766: 114.696ms; value: next_token_ids=tensor([26612], device='cuda:0') mtp accept=1 prop=26612 top1=26612 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.3ms wait=0.1/48.5ms pred gate=device Token # 1767: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27803 prop=27803 pred gate=device Token # 1768: 115.229ms; value: next_token_ids=tensor([27803], device='cuda:0') mtp accept=1 prop=27803 top1=27803 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.1ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.9ms wait=0.1/47.7ms pred gate=device Token # 1769: 3.765ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28686 prop=28686 pred gate=device Token # 1770: 115.337ms; value: next_token_ids=tensor([28686], device='cuda:0') mtp accept=1 prop=28686 top1=28686 accp=1.000 next=draft=223 prop=223 olap pair=110.2ms serial=194.3ms gain=84.1ms ratio=0.43 s0=4.2ms s1=190.0ms wait=0.1/48.0ms pred gate=device Token # 1771: 3.763ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29374 prop=29374 pred gate=device Token # 1772: 114.487ms; value: 
next_token_ids=tensor([29374], device='cuda:0') mtp accept=1 prop=29374 top1=29374 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1773: 3.693ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25905 prop=25905 pred gate=device Token # 1774: 114.968ms; value: next_token_ids=tensor([25905], device='cuda:0') mtp accept=1 prop=25905 top1=25905 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.4ms wait=0.1/48.6ms pred gate=device Token # 1775: 3.725ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29017 prop=29017 pred gate=device Token # 1776: 114.835ms; value: next_token_ids=tensor([29017], device='cuda:0') mtp accept=1 prop=29017 top1=29017 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.9ms s1=191.0ms wait=0.1/48.4ms pred gate=device Token # 1777: 3.761ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28840 prop=28840 pred gate=device Token # 1778: 114.666ms; value: next_token_ids=tensor([28840], device='cuda:0') mtp accept=1 prop=28840 top1=28840 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.7ms s1=190.9ms wait=0.1/48.7ms pred gate=device Token # 1779: 3.827ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27183 prop=27183 pred gate=device Token # 1780: 115.078ms; value: next_token_ids=tensor([27183], device='cuda:0') mtp accept=1 prop=27183 top1=27183 accp=1.000 next=draft=27183 prop=223 olap pair=109.9ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/48.7ms pred gate=device Token # 1781: 3.795ms; value: next_token_ids=tensor([223], 
device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.006 next=pair draft=28897 prop=28897 pred gate=device Token # 1782: 114.850ms; value: next_token_ids=tensor([28897], device='cuda:0') mtp accept=1 prop=28897 top1=28897 accp=1.000 next=draft=28897 prop=28897 olap pair=109.7ms serial=193.6ms gain=83.9ms ratio=0.43 s0=4.0ms s1=189.6ms wait=0.1/48.5ms pred gate=device Token # 1783: 3.799ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28897 top1=223 accp=0.060 next=pair draft=21677 prop=21677 pred gate=device Token # 1784: 115.257ms; value: next_token_ids=tensor([21677], device='cuda:0') mtp accept=1 prop=21677 top1=21677 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=194.8ms gain=85.0ms ratio=0.44 s0=4.6ms s1=190.2ms wait=0.1/48.0ms pred gate=device Token # 1785: 3.803ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.974 next=pair draft=29037 prop=29037 pred gate=device Token # 1786: 114.958ms; value: next_token_ids=tensor([29037], device='cuda:0') mtp accept=1 prop=29037 top1=29037 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.9ms gain=85.1ms ratio=0.44 s0=3.8ms s1=191.0ms wait=0.1/48.6ms pred gate=device Token # 1787: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27738 prop=27738 pred gate=device Token # 1788: 114.826ms; value: next_token_ids=tensor([27738], device='cuda:0') mtp accept=1 prop=27738 top1=27738 accp=1.000 next=draft=27738 prop=27738 olap pair=109.7ms serial=194.5ms gain=84.9ms ratio=0.44 s0=4.2ms s1=190.3ms wait=0.1/47.9ms pred gate=device Token # 1789: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27738 top1=223 accp=0.189 next=pair draft=30217 prop=30217 pred gate=device Token # 1790: 114.796ms; value: next_token_ids=tensor([30217], device='cuda:0') mtp accept=1 prop=30217 top1=30217 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms 
serial=194.7ms gain=85.1ms ratio=0.44 s0=4.4ms s1=190.3ms wait=0.1/47.2ms pred gate=device Token # 1791: 3.816ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29161 prop=29161 pred gate=device Token # 1792: 114.798ms; value: next_token_ids=tensor([29161], device='cuda:0') mtp accept=1 prop=29161 top1=29161 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.2ms ratio=0.44 s0=4.1ms s1=190.7ms wait=0.1/48.0ms pred gate=device Token # 1793: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.988 next=pair draft=28769 prop=28769 pred gate=device Token # 1794: 114.739ms; value: next_token_ids=tensor([28769], device='cuda:0') mtp accept=1 prop=28769 top1=28769 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.6ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/48.5ms pred gate=device Token # 1795: 3.847ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.859 next=pair draft=28506 prop=28506 pred gate=device Token # 1796: 114.787ms; value: next_token_ids=tensor([28506], device='cuda:0') mtp accept=1 prop=28506 top1=28506 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/48.6ms pred gate=device Token # 1797: 3.793ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=28701 prop=28701 pred gate=device Token # 1798: 114.255ms; value: next_token_ids=tensor([28701], device='cuda:0') mtp accept=1 prop=28701 top1=28701 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.8ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/48.5ms pred gate=device Token # 1799: 3.831ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28170 prop=28170 pred gate=device Token # 1800: 114.549ms; value: 
next_token_ids=tensor([28170], device='cuda:0') mtp accept=1 prop=28170 top1=28170 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.5ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.7ms wait=0.1/48.6ms pred gate=device Token # 1801: 3.826ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=29795 prop=29795 pred gate=device Token # 1802: 114.490ms; value: next_token_ids=tensor([29795], device='cuda:0') mtp accept=1 prop=29795 top1=29795 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.4ms wait=0.1/48.7ms pred gate=device Token # 1803: 3.845ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26602 prop=26602 pred gate=device Token # 1804: 114.551ms; value: next_token_ids=tensor([26602], device='cuda:0') mtp accept=1 prop=26602 top1=26602 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.8ms s1=190.5ms wait=0.1/48.6ms pred gate=device Token # 1805: 3.767ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.996 next=pair draft=29265 prop=29265 pred gate=device Token # 1806: 114.873ms; value: next_token_ids=tensor([29265], device='cuda:0') mtp accept=1 prop=29265 top1=29265 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=195.1ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.3ms wait=0.1/48.7ms pred gate=device Token # 1807: 3.800ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29438 prop=29438 pred gate=device Token # 1808: 114.860ms; value: next_token_ids=tensor([29438], device='cuda:0') mtp accept=1 prop=29438 top1=29438 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/48.6ms pred gate=device Token # 1809: 3.816ms; value: next_token_ids=tensor([223], 
device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29627 prop=29627 pred gate=device Token # 1810: 114.526ms; value: next_token_ids=tensor([29627], device='cuda:0') mtp accept=1 prop=29627 top1=29627 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/48.8ms pred gate=device Token # 1811: 3.814ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29503 prop=29503 pred gate=device Token # 1812: 114.808ms; value: next_token_ids=tensor([29503], device='cuda:0') mtp accept=1 prop=29503 top1=29503 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.7ms s1=191.2ms wait=0.1/48.7ms pred gate=device Token # 1813: 3.780ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30243 prop=30243 pred gate=device Token # 1814: 115.010ms; value: next_token_ids=tensor([30243], device='cuda:0') mtp accept=1 prop=30243 top1=30243 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/48.6ms pred gate=device Token # 1815: 3.804ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.992 next=pair draft=27389 prop=27389 pred gate=device Token # 1816: 114.715ms; value: next_token_ids=tensor([27389], device='cuda:0') mtp accept=1 prop=27389 top1=27389 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=193.6ms gain=84.0ms ratio=0.43 s0=4.2ms s1=189.4ms wait=0.1/48.3ms pred gate=device Token # 1817: 3.799ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29842 prop=29842 pred gate=device Token # 1818: 114.247ms; value: next_token_ids=tensor([29842], device='cuda:0') mtp accept=1 prop=29842 top1=29842 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms 
gain=84.5ms ratio=0.44 s0=4.4ms s1=189.2ms wait=0.1/47.3ms pred gate=device Token # 1819: 3.847ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31457 prop=31457 pred gate=device Token # 1820: 114.854ms; value: next_token_ids=tensor([31457], device='cuda:0') mtp accept=1 prop=31457 top1=31457 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.8ms gain=85.1ms ratio=0.44 s0=3.9ms s1=191.0ms wait=0.1/48.5ms pred gate=device Token # 1821: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27047 prop=27047 pred gate=device Token # 1822: 114.709ms; value: next_token_ids=tensor([27047], device='cuda:0') mtp accept=1 prop=27047 top1=27047 accp=1.000 next=draft=27047 prop=27047 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/48.3ms pred gate=device Token # 1823: 3.842ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=27047 top1=223 accp=0.445 next=pair draft=8996 prop=8996 pred gate=device Token # 1824: 114.929ms; value: next_token_ids=tensor([8996], device='cuda:0') mtp accept=1 prop=8996 top1=8996 accp=1.000 next=draft=201 prop=201 olap pair=109.8ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.2ms wait=0.1/48.8ms pred gate=device Token # 1825: 3.763ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=23600 prop=23600 pred gate=device Token # 1826: 114.466ms; value: next_token_ids=tensor([23600], device='cuda:0') mtp accept=1 prop=23600 top1=23600 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.9ms gain=84.6ms ratio=0.44 s0=4.4ms s1=189.5ms wait=0.1/47.2ms pred gate=device Token # 1827: 3.702ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=24158 prop=24158 pred gate=device Token # 1828: 114.760ms; value: 
next_token_ids=tensor([24158], device='cuda:0') mtp accept=1 prop=24158 top1=24158 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=4.5ms s1=190.2ms wait=0.1/47.1ms pred gate=device Token # 1829: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26739 prop=26739 pred gate=device Token # 1830: 114.530ms; value: next_token_ids=tensor([26739], device='cuda:0') mtp accept=1 prop=26739 top1=26739 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.0ms gain=84.6ms ratio=0.44 s0=4.8ms s1=189.1ms wait=0.1/46.6ms pred gate=device Token # 1831: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25743 prop=25743 pred gate=device Token # 1832: 114.516ms; value: next_token_ids=tensor([25743], device='cuda:0') mtp accept=1 prop=25743 top1=25743 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.8ms gain=84.5ms ratio=0.44 s0=4.4ms s1=189.4ms wait=0.1/47.5ms pred gate=device Token # 1833: 3.865ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27191 prop=27191 pred gate=device Token # 1834: 114.232ms; value: next_token_ids=tensor([27191], device='cuda:0') mtp accept=1 prop=27191 top1=27191 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.9ms wait=0.1/48.8ms pred gate=device Token # 1835: 3.770ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27923 prop=27923 pred gate=device Token # 1836: 114.577ms; value: next_token_ids=tensor([27923], device='cuda:0') mtp accept=1 prop=27923 top1=27923 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.9ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/48.0ms pred gate=device Token # 1837: 3.784ms; value: next_token_ids=tensor([223], 
device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25624 prop=25624 pred gate=device Token # 1838: 114.569ms; value: next_token_ids=tensor([25624], device='cuda:0') mtp accept=1 prop=25624 top1=25624 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1839: 3.790ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27142 prop=27142 pred gate=device Token # 1840: 114.643ms; value: next_token_ids=tensor([27142], device='cuda:0') mtp accept=1 prop=27142 top1=27142 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/48.7ms pred gate=device Token # 1841: 3.866ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.983 next=pair draft=28077 prop=28077 pred gate=device Token # 1842: 114.522ms; value: next_token_ids=tensor([28077], device='cuda:0') mtp accept=1 prop=28077 top1=28077 accp=1.000 next=draft=28077 prop=28077 olap pair=109.4ms serial=194.3ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/48.8ms pred gate=device Token # 1843: 3.801ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=28077 top1=223 accp=0.140 next=pair draft=23888 prop=23888 pred gate=device Token # 1844: 114.618ms; value: next_token_ids=tensor([23888], device='cuda:0') mtp accept=1 prop=23888 top1=23888 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.8ms wait=0.1/48.8ms pred gate=device Token # 1845: 3.738ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=26393 prop=26393 pred gate=device Token # 1846: 114.243ms; value: next_token_ids=tensor([26393], device='cuda:0') mtp accept=1 prop=26393 top1=26393 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms 
serial=192.9ms gain=83.8ms ratio=0.43 s0=7.5ms s1=185.4ms wait=0.2/44.1ms pred gate=device Token # 1847: 3.803ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25544 prop=25544 pred gate=device Token # 1848: 114.559ms; value: next_token_ids=tensor([25544], device='cuda:0') mtp accept=1 prop=25544 top1=25544 accp=1.000 next=draft=25544 prop=25544 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.7ms s1=190.6ms wait=0.1/48.9ms pred gate=device Token # 1849: 3.937ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25544 top1=223 accp=0.423 next=pair draft=28404 prop=28404 pred gate=device Token # 1850: 115.353ms; value: next_token_ids=tensor([28404], device='cuda:0') mtp accept=1 prop=28404 top1=28404 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.4ms gain=85.3ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/48.6ms pred gate=device Token # 1851: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.983 next=pair draft=27160 prop=27160 pred gate=device Token # 1852: 114.594ms; value: next_token_ids=tensor([27160], device='cuda:0') mtp accept=1 prop=27160 top1=27160 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=85.0ms ratio=0.44 s0=3.7ms s1=190.7ms wait=0.1/48.8ms pred gate=device Token # 1853: 3.810ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=27648 prop=27648 pred gate=device Token # 1854: 114.150ms; value: next_token_ids=tensor([27648], device='cuda:0') mtp accept=1 prop=27648 top1=27648 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.7ms s1=190.0ms wait=0.1/48.6ms pred gate=device Token # 1855: 3.757ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27853 prop=27853 pred gate=device Token # 1856: 114.312ms; 
value: next_token_ids=tensor([27853], device='cuda:0') mtp accept=1 prop=27853 top1=27853 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.3ms wait=0.1/48.7ms pred gate=device Token # 1857: 3.759ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28405 prop=28405 pred gate=device Token # 1858: 114.028ms; value: next_token_ids=tensor([28405], device='cuda:0') mtp accept=1 prop=28405 top1=28405 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.4ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.7ms wait=0.1/48.8ms pred gate=device Token # 1859: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28787 prop=28787 pred gate=device Token # 1860: 113.922ms; value: next_token_ids=tensor([28787], device='cuda:0') mtp accept=1 prop=28787 top1=28787 accp=1.000 next=draft=223 prop=223 olap pair=108.8ms serial=193.1ms gain=84.4ms ratio=0.44 s0=3.7ms s1=189.4ms wait=0.1/48.8ms pred gate=device Token # 1861: 3.779ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=28800 prop=28800 pred gate=device Token # 1862: 114.157ms; value: next_token_ids=tensor([28800], device='cuda:0') mtp accept=1 prop=28800 top1=28800 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.5ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.8ms wait=0.1/48.6ms pred gate=device Token # 1863: 3.788ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=18214 prop=18214 pred gate=device Token # 1864: 114.411ms; value: next_token_ids=tensor([18214], device='cuda:0') mtp accept=1 prop=18214 top1=18214 accp=1.000 next=draft=201 prop=201 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.3ms wait=0.1/48.7ms pred gate=device Token # 1865: 3.765ms; value: 
next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=27836 prop=27836 pred gate=device Token # 1866: 114.175ms; value: next_token_ids=tensor([27836], device='cuda:0') mtp accept=1 prop=27836 top1=27836 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.4ms gain=84.4ms ratio=0.44 s0=3.9ms s1=189.5ms wait=0.1/48.6ms pred gate=device Token # 1867: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26727 prop=26727 pred gate=device Token # 1868: 114.231ms; value: next_token_ids=tensor([26727], device='cuda:0') mtp accept=1 prop=26727 top1=26727 accp=1.000 next=draft=26727 prop=223 olap pair=108.8ms serial=192.0ms gain=83.2ms ratio=0.43 s0=8.7ms s1=183.2ms wait=0.2/42.5ms pred gate=device Token # 1869: 3.823ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.272 next=pair draft=28667 prop=28667 pred gate=device Token # 1870: 114.423ms; value: next_token_ids=tensor([28667], device='cuda:0') mtp accept=1 prop=28667 top1=28667 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.0ms gain=83.7ms ratio=0.43 s0=8.2ms s1=184.8ms wait=0.2/43.3ms pred gate=device Token # 1871: 3.815ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29415 prop=29415 pred gate=device Token # 1872: 114.236ms; value: next_token_ids=tensor([29415], device='cuda:0') mtp accept=1 prop=29415 top1=29415 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=3.7ms s1=190.0ms wait=0.1/48.7ms pred gate=device Token # 1873: 3.806ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27897 prop=27897 pred gate=device Token # 1874: 114.046ms; value: next_token_ids=tensor([27897], device='cuda:0') mtp accept=1 prop=27897 top1=27897 accp=1.000 next=draft=223 prop=223 
olap pair=108.9ms serial=193.5ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.7ms wait=0.1/48.6ms pred gate=device Token # 1875: 3.926ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29775 prop=29775 pred gate=device Token # 1876: 114.443ms; value: next_token_ids=tensor([29775], device='cuda:0') mtp accept=1 prop=29775 top1=29775 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.6ms wait=0.1/47.5ms pred gate=device Token # 1877: 3.781ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28415 prop=28415 pred gate=device Token # 1878: 114.482ms; value: next_token_ids=tensor([28415], device='cuda:0') mtp accept=1 prop=28415 top1=28415 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.9ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/48.3ms pred gate=device Token # 1879: 3.746ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27766 prop=27766 pred gate=device Token # 1880: 114.325ms; value: next_token_ids=tensor([27766], device='cuda:0') mtp accept=1 prop=27766 top1=27766 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1881: 3.880ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.945 next=pair draft=29094 prop=29094 pred gate=device Token # 1882: 115.259ms; value: next_token_ids=tensor([29094], device='cuda:0') mtp accept=1 prop=29094 top1=29094 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=194.6ms gain=84.5ms ratio=0.43 s0=3.9ms s1=190.7ms wait=0.1/48.3ms pred gate=device Token # 1883: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.982 next=pair draft=25168 prop=25168 pred gate=device Token # 1884: 
114.223ms; value: next_token_ids=tensor([25168], device='cuda:0') mtp accept=1 prop=25168 top1=25168 accp=1.000 next=draft=201 prop=201 olap pair=109.1ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.9ms s1=189.7ms wait=0.1/48.4ms pred gate=device Token # 1885: 3.750ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.998 next=pair draft=29107 prop=29107 pred gate=device Token # 1886: 113.849ms; value: next_token_ids=tensor([29107], device='cuda:0') mtp accept=1 prop=29107 top1=29107 accp=1.000 next=draft=223 prop=223 olap pair=108.7ms serial=193.1ms gain=84.4ms ratio=0.44 s0=3.8ms s1=189.4ms wait=0.1/48.6ms pred gate=device Token # 1887: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27523 prop=27523 pred gate=device Token # 1888: 114.136ms; value: next_token_ids=tensor([27523], device='cuda:0') mtp accept=1 prop=27523 top1=27523 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.6ms gain=84.6ms ratio=0.44 s0=3.8ms s1=189.8ms wait=0.1/48.6ms pred gate=device Token # 1889: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.976 next=pair draft=29032 prop=29032 pred gate=device Token # 1890: 115.133ms; value: next_token_ids=tensor([29032], device='cuda:0') mtp accept=1 prop=29032 top1=29032 accp=1.000 next=draft=223 prop=223 olap pair=110.0ms serial=195.3ms gain=85.4ms ratio=0.44 s0=3.8ms s1=191.5ms wait=0.1/48.7ms pred gate=device Token # 1891: 3.806ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30371 prop=30371 pred gate=device Token # 1892: 114.257ms; value: next_token_ids=tensor([30371], device='cuda:0') mtp accept=1 prop=30371 top1=30371 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.7ms gain=84.6ms ratio=0.44 s0=4.0ms s1=189.7ms wait=0.1/48.2ms pred gate=device Token # 1893: 3.780ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29264 prop=29264 pred gate=device Token # 1894: 114.099ms; value: next_token_ids=tensor([29264], device='cuda:0') mtp accept=1 prop=29264 top1=29264 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.2ms gain=84.3ms ratio=0.44 s0=4.4ms s1=188.8ms wait=0.1/47.3ms pred gate=device Token # 1895: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28940 prop=28940 pred gate=device Token # 1896: 114.031ms; value: next_token_ids=tensor([28940], device='cuda:0') mtp accept=1 prop=28940 top1=28940 accp=1.000 next=draft=223 prop=223 olap pair=108.9ms serial=193.3ms gain=84.3ms ratio=0.44 s0=4.4ms s1=188.9ms wait=0.1/47.6ms pred gate=device Token # 1897: 3.708ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=25784 prop=25784 pred gate=device Token # 1898: 114.187ms; value: next_token_ids=tensor([25784], device='cuda:0') mtp accept=1 prop=25784 top1=25784 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.5ms gain=84.5ms ratio=0.44 s0=4.4ms s1=189.1ms wait=0.1/47.4ms pred gate=device Token # 1899: 3.797ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30143 prop=30143 pred gate=device Token # 1900: 114.454ms; value: next_token_ids=tensor([30143], device='cuda:0') mtp accept=1 prop=30143 top1=30143 accp=1.000 next=draft=30143 prop=30143 olap pair=109.3ms serial=194.0ms gain=84.8ms ratio=0.44 s0=4.3ms s1=189.7ms wait=0.1/47.4ms pred gate=device Token # 1901: 3.789ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=30143 top1=223 accp=0.004 next=pair draft=30793 prop=30793 pred gate=device Token # 1902: 114.150ms; value: next_token_ids=tensor([30793], device='cuda:0') mtp accept=1 prop=30793 top1=30793 accp=1.000 next=draft=223 
prop=223 olap pair=108.9ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.3ms s1=189.0ms wait=0.1/47.3ms pred gate=device Token # 1903: 3.801ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.938 next=pair draft=24438 prop=24438 pred gate=device Token # 1904: 114.490ms; value: next_token_ids=tensor([24438], device='cuda:0') mtp accept=1 prop=24438 top1=24438 accp=1.000 next=draft=201 prop=201 olap pair=109.3ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.7ms wait=0.1/47.4ms pred gate=device Token # 1905: 3.789ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=29573 prop=29573 pred gate=device Token # 1906: 114.724ms; value: next_token_ids=tensor([29573], device='cuda:0') mtp accept=1 prop=29573 top1=29573 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=3.9ms s1=190.7ms wait=0.1/48.5ms pred gate=device Token # 1907: 3.816ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30377 prop=30377 pred gate=device Token # 1908: 114.552ms; value: next_token_ids=tensor([30377], device='cuda:0') mtp accept=1 prop=30377 top1=30377 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1909: 3.843ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29341 prop=29341 pred gate=device Token # 1910: 114.826ms; value: next_token_ids=tensor([29341], device='cuda:0') mtp accept=1 prop=29341 top1=29341 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.9ms gain=85.2ms ratio=0.44 s0=3.8ms s1=191.1ms wait=0.1/48.8ms pred gate=device Token # 1911: 3.805ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=28509 prop=28509 pred gate=device Token 
# 1912: 114.361ms; value: next_token_ids=tensor([28509], device='cuda:0') mtp accept=1 prop=28509 top1=28509 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=4.0ms s1=189.9ms wait=0.1/48.3ms pred gate=device Token # 1913: 3.742ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29444 prop=29444 pred gate=device Token # 1914: 114.398ms; value: next_token_ids=tensor([29444], device='cuda:0') mtp accept=1 prop=29444 top1=29444 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.8ms gain=84.6ms ratio=0.44 s0=4.5ms s1=189.3ms wait=0.1/47.3ms pred gate=device Token # 1915: 3.817ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30787 prop=30787 pred gate=device Token # 1916: 114.240ms; value: next_token_ids=tensor([30787], device='cuda:0') mtp accept=1 prop=30787 top1=30787 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.3ms gain=84.4ms ratio=0.44 s0=4.1ms s1=189.2ms wait=0.1/47.8ms pred gate=device Token # 1917: 3.830ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29138 prop=29138 pred gate=device Token # 1918: 114.293ms; value: next_token_ids=tensor([29138], device='cuda:0') mtp accept=1 prop=29138 top1=29138 accp=1.000 next=draft=223 prop=223 olap pair=109.1ms serial=193.8ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/48.7ms pred gate=device Token # 1919: 3.833ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28266 prop=28266 pred gate=device Token # 1920: 114.158ms; value: next_token_ids=tensor([28266], device='cuda:0') mtp accept=1 prop=28266 top1=28266 accp=1.000 next=draft=223 prop=223 olap pair=109.0ms serial=193.4ms gain=84.5ms ratio=0.44 s0=3.7ms s1=189.7ms wait=0.1/48.7ms pred gate=device Token # 1921: 3.804ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30423 prop=30423 pred gate=device Token # 1922: 114.769ms; value: next_token_ids=tensor([30423], device='cuda:0') mtp accept=1 prop=30423 top1=30423 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.7ms s1=191.0ms wait=0.1/48.7ms pred gate=device Token # 1923: 3.806ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=15098 prop=15098 pred gate=device Token # 1924: 114.409ms; value: next_token_ids=tensor([15098], device='cuda:0') mtp accept=1 prop=15098 top1=15098 accp=1.000 next=draft=201 prop=201 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.8ms s1=190.1ms wait=0.1/48.7ms pred gate=device Token # 1925: 3.788ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.980 next=pair draft=29197 prop=29197 pred gate=device Token # 1926: 114.474ms; value: next_token_ids=tensor([29197], device='cuda:0') mtp accept=1 prop=29197 top1=29197 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1927: 3.726ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28723 prop=28723 pred gate=device Token # 1928: 114.449ms; value: next_token_ids=tensor([28723], device='cuda:0') mtp accept=1 prop=28723 top1=28723 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.6ms ratio=0.44 s0=3.8ms s1=190.0ms wait=0.1/48.6ms pred gate=device Token # 1929: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29658 prop=29658 pred gate=device Token # 1930: 114.930ms; value: next_token_ids=tensor([29658], device='cuda:0') mtp accept=1 prop=29658 top1=29658 accp=1.000 next=draft=223 prop=223 
olap pair=109.7ms serial=195.0ms gain=85.3ms ratio=0.44 s0=3.7ms s1=191.3ms wait=0.1/48.8ms pred gate=device Token # 1931: 3.755ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29556 prop=29556 pred gate=device Token # 1932: 114.678ms; value: next_token_ids=tensor([29556], device='cuda:0') mtp accept=1 prop=29556 top1=29556 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=193.9ms gain=84.5ms ratio=0.44 s0=5.0ms s1=188.9ms wait=0.1/47.2ms pred gate=device Token # 1933: 3.804ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27481 prop=27481 pred gate=device Token # 1934: 114.549ms; value: next_token_ids=tensor([27481], device='cuda:0') mtp accept=1 prop=27481 top1=27481 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.0ms gain=83.7ms ratio=0.43 s0=4.9ms s1=188.2ms wait=0.1/46.9ms pred gate=device Token # 1935: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29291 prop=29291 pred gate=device Token # 1936: 114.491ms; value: next_token_ids=tensor([29291], device='cuda:0') mtp accept=1 prop=29291 top1=29291 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.8ms gain=84.5ms ratio=0.44 s0=3.9ms s1=189.9ms wait=0.1/48.3ms pred gate=device Token # 1937: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=28997 prop=28997 pred gate=device Token # 1938: 114.509ms; value: next_token_ids=tensor([28997], device='cuda:0') mtp accept=1 prop=28997 top1=28997 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.4ms wait=0.1/48.7ms pred gate=device Token # 1939: 3.829ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30942 prop=30942 pred gate=device Token # 1940: 
114.388ms; value: next_token_ids=tensor([30942], device='cuda:0') mtp accept=1 prop=30942 top1=30942 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=194.1ms gain=84.8ms ratio=0.44 s0=3.7ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1941: 3.813ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=31257 prop=31257 pred gate=device Token # 1942: 114.695ms; value: next_token_ids=tensor([31257], device='cuda:0') mtp accept=1 prop=31257 top1=31257 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.0ms s1=190.4ms wait=0.1/48.2ms pred gate=device Token # 1943: 3.737ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=22994 prop=22994 pred gate=device Token # 1944: 115.037ms; value: next_token_ids=tensor([22994], device='cuda:0') mtp accept=1 prop=22994 top1=22994 accp=1.000 next=draft=201 prop=201 olap pair=109.9ms serial=194.2ms gain=84.4ms ratio=0.43 s0=4.1ms s1=190.2ms wait=0.1/48.1ms pred gate=device Token # 1945: 3.729ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.975 next=pair draft=28892 prop=28892 pred gate=device Token # 1946: 115.222ms; value: next_token_ids=tensor([28892], device='cuda:0') mtp accept=1 prop=28892 top1=28892 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.1ms gain=85.1ms ratio=0.44 s0=4.2ms s1=190.9ms wait=0.1/48.3ms pred gate=device Token # 1947: 3.785ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30160 prop=30160 pred gate=device Token # 1948: 115.665ms; value: next_token_ids=tensor([30160], device='cuda:0') mtp accept=1 prop=30160 top1=30160 accp=1.000 next=draft=223 prop=223 olap pair=110.5ms serial=195.9ms gain=85.3ms ratio=0.44 s0=4.0ms s1=191.8ms wait=0.1/48.3ms pred gate=device Token # 1949: 3.775ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.561 next=pair draft=29229 prop=29229 pred gate=device Token # 1950: 115.101ms; value: next_token_ids=tensor([29229], device='cuda:0') mtp accept=1 prop=29229 top1=29229 accp=1.000 next=draft=29229 prop=29229 olap pair=109.1ms serial=193.5ms gain=84.3ms ratio=0.44 s0=4.2ms s1=189.3ms wait=0.1/48.2ms pred gate=device Token # 1951: 4.704ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=29229 top1=223 accp=0.049 next=pair draft=30251 prop=30251 pred gate=device Token # 1952: 116.417ms; value: next_token_ids=tensor([30251], device='cuda:0') mtp accept=1 prop=30251 top1=30251 accp=1.000 next=draft=223 prop=223 olap pair=110.3ms serial=195.2ms gain=84.9ms ratio=0.43 s0=5.5ms s1=189.7ms wait=0.2/46.7ms pred gate=device Token # 1953: 4.702ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.965 next=pair draft=28782 prop=28782 pred gate=device Token # 1954: 114.872ms; value: next_token_ids=tensor([28782], device='cuda:0') mtp accept=1 prop=28782 top1=28782 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.2ms gain=84.6ms ratio=0.44 s0=4.6ms s1=189.5ms wait=0.1/47.7ms pred gate=device Token # 1955: 3.807ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.952 next=pair draft=30400 prop=30400 pred gate=device Token # 1956: 115.285ms; value: next_token_ids=tensor([30400], device='cuda:0') mtp accept=1 prop=30400 top1=30400 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.6ms gain=84.0ms ratio=0.43 s0=7.0ms s1=186.7ms wait=0.2/44.9ms pred gate=device Token # 1957: 3.878ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.993 next=pair draft=29597 prop=29597 pred gate=device Token # 1958: 115.850ms; value: next_token_ids=tensor([29597], device='cuda:0') mtp accept=1 prop=29597 top1=29597 accp=1.000 next=draft=223 
prop=223 olap pair=109.9ms serial=194.6ms gain=84.7ms ratio=0.44 s0=5.8ms s1=188.9ms wait=0.2/46.1ms pred gate=device Token # 1959: 4.705ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.998 next=pair draft=26754 prop=26754 pred gate=device Token # 1960: 115.412ms; value: next_token_ids=tensor([26754], device='cuda:0') mtp accept=1 prop=26754 top1=26754 accp=1.000 next=draft=223 prop=223 olap pair=109.7ms serial=194.2ms gain=84.5ms ratio=0.44 s0=6.0ms s1=188.2ms wait=0.2/46.0ms pred gate=device Token # 1961: 3.895ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.981 next=pair draft=31142 prop=31142 pred gate=device Token # 1962: 114.698ms; value: next_token_ids=tensor([31142], device='cuda:0') mtp accept=1 prop=31142 top1=31142 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.1ms s1=190.3ms wait=0.1/47.8ms pred gate=device Token # 1963: 3.783ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.995 next=pair draft=26631 prop=26631 pred gate=device Token # 1964: 114.706ms; value: next_token_ids=tensor([26631], device='cuda:0') mtp accept=1 prop=26631 top1=26631 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=194.5ms gain=85.0ms ratio=0.44 s0=4.1ms s1=190.4ms wait=0.1/48.4ms pred gate=device Token # 1965: 3.750ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=29345 prop=29345 pred gate=device Token # 1966: 114.741ms; value: next_token_ids=tensor([29345], device='cuda:0') mtp accept=1 prop=29345 top1=29345 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.7ms gain=85.1ms ratio=0.44 s0=3.8ms s1=190.9ms wait=0.1/48.5ms pred gate=device Token # 1967: 3.782ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29571 prop=29571 pred gate=device Token 
# 1968: 114.580ms; value: next_token_ids=tensor([29571], device='cuda:0') mtp accept=1 prop=29571 top1=29571 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.3ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.7ms pred gate=device Token # 1969: 3.822ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=30704 prop=30704 pred gate=device Token # 1970: 114.392ms; value: next_token_ids=tensor([30704], device='cuda:0') mtp accept=1 prop=30704 top1=30704 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.5ms pred gate=device Token # 1971: 3.753ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=30591 prop=30591 pred gate=device Token # 1972: 115.659ms; value: next_token_ids=tensor([30591], device='cuda:0') mtp accept=1 prop=30591 top1=30591 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.0ms gain=84.3ms ratio=0.43 s0=6.2ms s1=187.7ms wait=0.2/46.0ms pred gate=device Token # 1973: 4.694ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=28409 prop=28409 pred gate=device Token # 1974: 115.004ms; value: next_token_ids=tensor([28409], device='cuda:0') mtp accept=1 prop=28409 top1=28409 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=193.3ms gain=83.7ms ratio=0.43 s0=8.8ms s1=184.6ms wait=0.2/43.0ms pred gate=device Token # 1975: 3.769ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29979 prop=29979 pred gate=device Token # 1976: 114.359ms; value: next_token_ids=tensor([29979], device='cuda:0') mtp accept=1 prop=29979 top1=29979 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=193.9ms gain=84.7ms ratio=0.44 s0=3.7ms s1=190.2ms wait=0.1/48.9ms pred gate=device Token # 1977: 3.790ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=22199 prop=22199 pred gate=device Token # 1978: 114.962ms; value: next_token_ids=tensor([22199], device='cuda:0') mtp accept=1 prop=22199 top1=22199 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.6ms gain=84.2ms ratio=0.44 s0=5.8ms s1=187.7ms wait=0.2/46.4ms pred gate=device Token # 1979: 3.870ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.972 next=pair draft=29485 prop=29485 pred gate=device Token # 1980: 115.278ms; value: next_token_ids=tensor([29485], device='cuda:0') mtp accept=1 prop=29485 top1=29485 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.6ms gain=85.5ms ratio=0.44 s0=3.8ms s1=191.8ms wait=0.1/48.6ms pred gate=device Token # 1981: 3.833ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.973 next=pair draft=29899 prop=29899 pred gate=device Token # 1982: 114.537ms; value: next_token_ids=tensor([29899], device='cuda:0') mtp accept=1 prop=29899 top1=29899 accp=1.000 next=draft=29899 prop=223 olap pair=109.3ms serial=194.2ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 1983: 3.849ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.184 next=pair draft=24537 prop=24537 pred gate=device Token # 1984: 114.673ms; value: next_token_ids=tensor([24537], device='cuda:0') mtp accept=1 prop=24537 top1=24537 accp=1.000 next=draft=201 prop=201 olap pair=109.5ms serial=192.9ms gain=83.4ms ratio=0.43 s0=4.3ms s1=188.5ms wait=0.1/47.9ms pred gate=device Token # 1985: 3.795ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.681 next=pair draft=29723 prop=29723 pred gate=device Token # 1986: 114.991ms; value: next_token_ids=tensor([29723], device='cuda:0') mtp accept=1 prop=29723 top1=29723 accp=1.000 next=draft=223 prop=223 
olap pair=109.8ms serial=193.3ms gain=83.5ms ratio=0.43 s0=4.4ms s1=188.9ms wait=0.1/47.8ms pred gate=device Token # 1987: 3.813ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31479 prop=31479 pred gate=device Token # 1988: 115.667ms; value: next_token_ids=tensor([31479], device='cuda:0') mtp accept=1 prop=31479 top1=31479 accp=1.000 next=draft=223 prop=223 olap pair=110.5ms serial=194.4ms gain=83.9ms ratio=0.43 s0=4.3ms s1=190.1ms wait=0.1/48.3ms pred gate=device Token # 1989: 3.792ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31098 prop=31098 pred gate=device Token # 1990: 114.994ms; value: next_token_ids=tensor([31098], device='cuda:0') mtp accept=1 prop=31098 top1=31098 accp=1.000 next=draft=223 prop=223 olap pair=109.8ms serial=194.3ms gain=84.5ms ratio=0.43 s0=4.0ms s1=190.3ms wait=0.1/48.4ms pred gate=device Token # 1991: 3.821ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.984 next=pair draft=29496 prop=29496 pred gate=device Token # 1992: 114.653ms; value: next_token_ids=tensor([29496], device='cuda:0') mtp accept=1 prop=29496 top1=29496 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 1993: 3.786ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.996 next=pair draft=29565 prop=29565 pred gate=device Token # 1994: 114.613ms; value: next_token_ids=tensor([29565], device='cuda:0') mtp accept=1 prop=29565 top1=29565 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=3.9ms s1=190.4ms wait=0.1/48.6ms pred gate=device Token # 1995: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.999 next=pair draft=30445 prop=30445 pred gate=device Token # 1996: 
114.626ms; value: next_token_ids=tensor([30445], device='cuda:0') mtp accept=1 prop=30445 top1=30445 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.4ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 1997: 3.775ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.988 next=pair draft=28556 prop=28556 pred gate=device Token # 1998: 114.499ms; value: next_token_ids=tensor([28556], device='cuda:0') mtp accept=1 prop=28556 top1=28556 accp=1.000 next=draft=223 prop=223 olap pair=109.4ms serial=194.2ms gain=84.8ms ratio=0.44 s0=4.0ms s1=190.2ms wait=0.1/48.3ms pred gate=device Token # 1999: 3.773ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.959 next=pair draft=31594 prop=31594 pred gate=device Token # 2000: 114.665ms; value: next_token_ids=tensor([31594], device='cuda:0') mtp accept=1 prop=31594 top1=31594 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.3ms gain=84.8ms ratio=0.44 s0=4.5ms s1=189.7ms wait=0.1/47.1ms pred gate=device Token # 2001: 3.874ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.965 next=pair draft=25744 prop=25744 pred gate=device Token # 2002: 114.562ms; value: next_token_ids=tensor([25744], device='cuda:0') mtp accept=1 prop=25744 top1=25744 accp=1.000 next=draft=25744 prop=25744 olap pair=109.3ms serial=194.1ms gain=84.7ms ratio=0.44 s0=4.4ms s1=189.6ms wait=0.1/47.1ms pred gate=device Token # 2003: 3.808ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=25744 top1=223 accp=0.003 next=pair draft=27247 prop=27247 pred gate=device Token # 2004: 114.273ms; value: next_token_ids=tensor([27247], device='cuda:0') mtp accept=1 prop=27247 top1=27247 accp=1.000 next=draft=201 prop=201 olap pair=109.1ms serial=193.6ms gain=84.5ms ratio=0.44 s0=4.3ms s1=189.3ms wait=0.1/47.3ms pred gate=device Token # 2005: 3.766ms; value: 
next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=1.000 next=pair draft=30915 prop=30915 pred gate=device Token # 2006: 114.642ms; value: next_token_ids=tensor([30915], device='cuda:0') mtp accept=1 prop=30915 top1=30915 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=4.3ms s1=190.1ms wait=0.1/47.7ms pred gate=device Token # 2007: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=29607 prop=29607 pred gate=device Token # 2008: 114.761ms; value: next_token_ids=tensor([29607], device='cuda:0') mtp accept=1 prop=29607 top1=29607 accp=1.000 next=draft=223 prop=223 olap pair=109.6ms serial=194.6ms gain=85.0ms ratio=0.44 s0=4.3ms s1=190.2ms wait=0.1/47.3ms pred gate=device Token # 2009: 3.776ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31601 prop=31601 pred gate=device Token # 2010: 115.512ms; value: next_token_ids=tensor([31601], device='cuda:0') mtp accept=1 prop=31601 top1=31601 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=192.6ms gain=83.3ms ratio=0.43 s0=9.0ms s1=183.6ms wait=0.2/42.6ms pred gate=device Token # 2011: 4.725ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31416 prop=31416 pred gate=device Token # 2012: 114.809ms; value: next_token_ids=tensor([31416], device='cuda:0') mtp accept=1 prop=31416 top1=31416 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.2ms gain=84.7ms ratio=0.44 s0=4.3ms s1=189.9ms wait=0.1/47.7ms pred gate=device Token # 2013: 3.784ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31554 prop=31554 pred gate=device Token # 2014: 114.226ms; value: next_token_ids=tensor([31554], device='cuda:0') mtp accept=1 prop=31554 top1=31554 accp=1.000 next=draft=223 prop=223 
olap pair=109.0ms serial=193.2ms gain=84.2ms ratio=0.44 s0=4.2ms s1=189.1ms wait=0.1/47.9ms pred gate=device Token # 2015: 3.764ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31774 prop=31774 pred gate=device Token # 2016: 114.748ms; value: next_token_ids=tensor([31774], device='cuda:0') mtp accept=1 prop=31774 top1=31774 accp=1.000 next=draft=223 prop=31774 olap pair=109.5ms serial=194.4ms gain=84.8ms ratio=0.44 s0=4.5ms s1=189.9ms wait=0.1/47.8ms pred gate=device Token # 2017: 3.813ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=31774 top1=223 accp=0.756 next=pair draft=31273 prop=31273 pred gate=device Token # 2018: 115.089ms; value: next_token_ids=tensor([31273], device='cuda:0') mtp accept=1 prop=31273 top1=31273 accp=1.000 next=draft=223 prop=223 olap pair=109.9ms serial=194.9ms gain=85.1ms ratio=0.44 s0=4.1ms s1=190.8ms wait=0.1/48.0ms pred gate=device Token # 2019: 3.826ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=31676 prop=31676 pred gate=device Token # 2020: 114.467ms; value: next_token_ids=tensor([31676], device='cuda:0') mtp accept=1 prop=31676 top1=31676 accp=1.000 next=draft=31676 prop=31676 olap pair=109.3ms serial=193.7ms gain=84.4ms ratio=0.44 s0=4.3ms s1=189.4ms wait=0.1/47.7ms pred gate=device Token # 2021: 3.811ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=0 prop=31676 top1=223 accp=0.014 next=pair draft=28002 prop=28002 pred gate=device Token # 2022: 117.098ms; value: next_token_ids=tensor([28002], device='cuda:0') mtp accept=1 prop=28002 top1=28002 accp=1.000 next=draft=223 prop=223 olap pair=109.5ms serial=194.4ms gain=84.9ms ratio=0.44 s0=3.8ms s1=190.6ms wait=0.1/48.6ms pred gate=device Token # 2023: 3.812ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.997 next=pair draft=6884 prop=6884 pred gate=device Token 
# 2024: 114.560ms; value: next_token_ids=tensor([6884], device='cuda:0') mtp accept=1 prop=6884 top1=6884 accp=1.000 next=draft=201 prop=201 olap pair=109.4ms serial=193.2ms gain=83.8ms ratio=0.43 s0=7.3ms s1=185.8ms wait=0.2/44.7ms pred gate=device Token # 2025: 3.763ms; value: next_token_ids=tensor([201], device='cuda:0') mtp accept=1 prop=201 top1=201 accp=0.999 next=pair draft=22066 prop=22066 pred gate=device Token # 2026: 116.447ms; value: next_token_ids=tensor([22066], device='cuda:0') mtp accept=1 prop=22066 top1=22066 accp=1.000 next=draft=223 prop=223 olap pair=111.3ms serial=197.5ms gain=86.2ms ratio=0.44 s0=4.1ms s1=193.4ms wait=0.1/48.0ms pred gate=device Token # 2027: 3.824ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=21097 prop=21097 pred gate=device Token # 2028: 115.256ms; value: next_token_ids=tensor([21097], device='cuda:0') mtp accept=1 prop=21097 top1=21097 accp=1.000 next=draft=223 prop=223 olap pair=110.1ms serial=195.4ms gain=85.3ms ratio=0.44 s0=5.0ms s1=190.4ms wait=0.1/47.0ms pred gate=device Token # 2029: 3.819ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=26838 prop=26838 pred gate=device Token # 2030: 114.462ms; value: next_token_ids=tensor([26838], device='cuda:0') mtp accept=1 prop=26838 top1=26838 accp=1.000 next=draft=223 prop=223 olap pair=109.3ms serial=193.7ms gain=84.4ms ratio=0.44 s0=4.4ms s1=189.3ms wait=0.1/48.1ms pred gate=device Token # 2031: 3.873ms; value: next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=0.971 next=pair draft=27643 prop=27643 pred gate=device Token # 2032: 114.427ms; value: next_token_ids=tensor([27643], device='cuda:0') mtp accept=1 prop=27643 top1=27643 accp=1.000 next=draft=223 prop=223 olap pair=109.2ms serial=194.0ms gain=84.8ms ratio=0.44 s0=3.8ms s1=190.2ms wait=0.1/48.6ms pred gate=device Token # 2033: 3.849ms; value: 
next_token_ids=tensor([223], device='cuda:0') mtp accept=1 prop=223 top1=223 accp=1.000 next=pair draft=27121 prop=27121 pred gate=device