Annotation of wikisrc/users/jym/benchmarks.mdwn, revision 1.1
1.1 ! wiki 1: # PAE and Xen balloon benchmarks #
! 2:
! 3: ## Protocol ##
! 4:
! 5: Three tests were performed to benchmark the kernel:
! 6:
! 7: 1. build.sh runs. The results are those returned by [[!template id=man name="time" section="1"]].
! 8: 1. hackbench, a popular tool used on Linux to benchmark thread/process creation time.
! 9: 1. sysbench, which can benchmark multiple aspects of a system. For now, the memory bandwidth, thread creation, and OLTP (online transaction processing) tests were used.
! 10:
! 11: All tests were run three times, with a reboot between each run.
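
With three runs per test, one simple way to report a result is the mean together with the spread between the fastest and slowest run. The following is only a minimal sketch of that bookkeeping; the function name `summarize_runs` and the sample timings are hypothetical placeholders, not measurements from this page.

```python
from statistics import mean


def summarize_runs(times_s):
    """Summarize repeated timings (in seconds) of one benchmark.

    Returns the mean, the extremes, and the relative spread
    (max - min) / mean, which gives a rough idea of run-to-run noise.
    """
    if not times_s:
        raise ValueError("at least one timing is required")
    avg = mean(times_s)
    lo, hi = min(times_s), max(times_s)
    return {"mean": avg, "min": lo, "max": hi, "spread": (hi - lo) / avg}


# Placeholder timings for three runs (NOT real results).
example = summarize_runs([100.0, 102.0, 98.0])
print(example)
```

A large spread relative to the difference being measured suggests that more runs are needed before drawing conclusions.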
! 12:
! 13: The machine used:
! 14:
! 15: [[!template id=programlisting text="""
! 16: # cpuctl list
! 17: Num HwId Unbound LWPs Interrupts Last change
! 18: ---- ---- ------------ -------------- ----------------------------
! 19: 0 0 online intr Sun Jul 11 00:25:31 2010
! 20: 1 1 online intr Sun Jul 11 00:25:31 2010
! 21: # cpuctl identify 0
! 22: cpu0: Intel Pentium 4 (686-class), 2798.78 MHz, id 0xf29
! 23: cpu0: features 0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
! 24: cpu0: features 0xbfebfbff<PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX>
! 25: cpu0: features 0xbfebfbff<FXSR,SSE,SSE2,SS,HTT,TM,SBF>
! 26: cpu0: features2 0x4400<CID,xTPR>
! 27: cpu0: "Intel(R) Pentium(R) 4 CPU 2.80GHz"
! 28: cpu0: I-cache 12K uOp cache 8-way, D-cache 8KB 64B/line 4-way
! 29: cpu0: L2 cache 512KB 64B/line 8-way
! 30: cpu0: ITLB 4K/4M: 64 entries
! 31: cpu0: DTLB 4K/4M: 64 entries
! 32: cpu0: Initial APIC ID 0
! 33: cpu0: Cluster/Package ID 0
! 34: cpu0: SMT ID 0
! 35: cpu0: family 0f model 02 extfamily 00 extmodel 00
! 36: # cpuctl identify 1
! 37: cpu1: Intel Pentium 4 (686-class), 2798.78 MHz, id 0xf29
! 38: cpu1: features 0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
! 39: cpu1: features 0xbfebfbff<PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX>
! 40: cpu1: features 0xbfebfbff<FXSR,SSE,SSE2,SS,HTT,TM,SBF>
! 41: cpu1: features2 0x4400<CID,xTPR>
! 42: cpu1: "Intel(R) Pentium(R) 4 CPU 2.80GHz"
! 43: cpu1: I-cache 12K uOp cache 8-way, D-cache 8KB 64B/line 4-way
! 44: cpu1: L2 cache 512KB 64B/line 8-way
! 45: cpu1: ITLB 4K/4M: 64 entries
! 46: cpu1: DTLB 4K/4M: 64 entries
! 47: cpu1: Initial APIC ID 0
! 48: cpu1: Cluster/Package ID 0
! 49: cpu1: SMT ID 0
! 50: cpu1: family 0f model 02 extfamily 00 extmodel 00
! 51: """]]
! 52:
! 53: This machine uses Hyper-Threading (HT), so technically speaking it is not a true dual-CPU host.
! 54:
! 55: ## PAE ##
! 56:
! 57: Overall, PAE degrades memory performance by 15-20%; this is particularly noticeable with sysbench and hackbench, where memory bandwidth is lower and thread/process creation takes longer.
! 58:
! 59: Userland remains largely unaffected, with differences in the 5% range; build.sh -j4 runs approximately 5% slower under PAE, in both the native and Xen cases.
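
The percentages quoted in this section are relative differences against the non-PAE baseline. As a minimal sketch of how such a figure can be derived, the hypothetical helper below handles both "lower is better" metrics (build time, creation time) and "higher is better" metrics (bandwidth). The numbers in the usage lines are placeholders chosen for easy arithmetic, not the measured results.

```python
def slowdown_percent(baseline, candidate, higher_is_better=False):
    """Return how much worse `candidate` is than `baseline`, in percent.

    For timings (lower is better) a positive value means the candidate
    took longer. For throughput (higher is better) a positive value
    means the candidate achieved less. Negative values mean improvement.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if higher_is_better:
        return (baseline - candidate) / baseline * 100.0
    return (candidate - baseline) / baseline * 100.0


# Placeholder values (NOT real results): a build taking 105 s instead of 100 s,
# and a bandwidth of 800 units instead of 1000 units.
build_slowdown = slowdown_percent(100.0, 105.0)
bandwidth_loss = slowdown_percent(1000.0, 800.0, higher_is_better=True)
print(build_slowdown, bandwidth_loss)
```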
! 60:
! 61: Do not be surprised by the large "user" result for the build.sh benchmark in the native case compared with Xen. Because the build is performed with -j4 (4 make sub-jobs in parallel), many processes may run concurrently under native i386, which credits more time to userland, whereas under Xen the kernel is not SMP capable.
! 62:
! 63: Notice that, in an MP context, Xen trails native by a 40% margin for the parallel build. Given that Xen overhead is considered negligible, this shows that the NetBSD build system gets a significant boost when parallelized, at least on dual-CPU setups. In other words, the benefit of a concurrent build is not purely rhetorical :)
! 64:
! 65: ## Xen ballooning ##
! 66:
! 67: In essence, there is not much to say. Differences are all below the 5% margin: adding the balloon thread did not drastically affect performance or process creation/scheduling. It is all noise. The timeout delay added by cherry@ seems reasonable (it can be revisited later, but does not seem critical).
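
The "it is all noise" judgment amounts to checking that each result with the balloon thread stays within 5% of the result without it. A minimal sketch of that check follows; the helper name `within_noise` is hypothetical, the 5% default mirrors the margin mentioned above, and the values are placeholders rather than real measurements.

```python
def within_noise(baseline, candidate, margin=0.05):
    """Return True if `candidate` differs from `baseline` by at most `margin`.

    `margin` is a fraction of the baseline (0.05 means 5%).
    """
    return abs(candidate - baseline) <= margin * abs(baseline)


# Placeholder values (NOT real results).
small_change = within_noise(100.0, 103.0)   # 3% difference
large_change = within_noise(100.0, 110.0)   # 10% difference
print(small_change, large_change)
```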